Substrai — Open-source infrastructure for autonomous systems

Automated LLM evaluation pipeline generator. Describe your use case and get complete evaluation infrastructure with metrics, synthetic test data, and drift detection.

→ CI/CD with pytest and coverage

→ Parallel metric execution (ThreadPool)

→ Custom metric plugin system (@metric)

→ HTML report generation with SVG charts

→ Dataset versioning with content-hash

→ CI/CD quality gate (exit code 0/1)

→ Evaluation result comparison between runs

→ LLM-powered synthetic test data generator

→ Factual consistency metric (NLI-based scoring)

→ Response latency metric with percentile tracking

→ Cost-per-quality metric (cost efficiency scoring)

→ Bias detection metric across demographic dimensions

Python · Step Functions
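To make a few of the features listed above concrete, here is a minimal sketch of how a `@metric` plugin registry, thread-pool metric execution, content-hash dataset versioning, and a 0/1 quality gate could fit together. It is an illustration only, not the project's actual API: the `METRICS` registry, the function names (`run_metrics`, `dataset_version`, `quality_gate`), the row format, and the thresholds are all assumptions made for this example.

```python
import hashlib
import json
from concurrent.futures import ThreadPoolExecutor

# Hypothetical registry; the real plugin API may look different.
METRICS = {}

def metric(name):
    """Register a scoring function under `name` (illustrative decorator)."""
    def register(fn):
        METRICS[name] = fn
        return fn
    return register

@metric("exact_match")
def exact_match(rows):
    # Fraction of rows whose output equals the expected answer.
    return sum(r["output"] == r["expected"] for r in rows) / len(rows)

@metric("non_empty")
def non_empty(rows):
    # Fraction of rows with a non-blank output.
    return sum(bool(r["output"].strip()) for r in rows) / len(rows)

def dataset_version(rows):
    """Content hash: identical rows always map to the same version id."""
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]

def run_metrics(rows):
    """Evaluate every registered metric concurrently in a thread pool."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, rows) for name, fn in METRICS.items()}
        return {name: f.result() for name, f in futures.items()}

def quality_gate(scores, thresholds):
    """Return 0 if every thresholded metric meets its minimum, else 1."""
    ok = all(scores[name] >= minimum for name, minimum in thresholds.items())
    return 0 if ok else 1

# Made-up sample data, for illustration only.
rows = [
    {"output": "Paris", "expected": "Paris"},
    {"output": "Lyon", "expected": "Marseille"},
]
scores = run_metrics(rows)
exit_code = quality_gate(scores, {"exact_match": 0.5, "non_empty": 1.0})
# A CI entry point would finish with sys.exit(exit_code).
```

Because the gate returns a plain integer, a CI step can pass it straight to the process exit status, and the content hash gives each run a stable dataset identifier to compare results against.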

Open Source

Community contributions.

Active contributions to industry-defining AI and cloud-native projects.

LlamaIndex ✓ Merged

Fix AgentWorkflow state mutation leaks

★ 49.7k

LangChain ● Open

Forward token details in streaming events

★ 137.7k

HuggingFace ● Open

Fix consecutive system messages crash

★ 27.6k

MCP SDK ● Open

Respect server capabilities in client

★ 23.1k

Apache Airflow ● Open

Add BedrockRerankOperator for RAG

★ 45.6k

OpenTelemetry ● Open

Fix Kafka partition recording bug

★ 780+

OpenSSF ● Open

Model Signing spec governance update

★ 171

Kubernetes ● Open

Add sort indicators to table headers

★ 5.2k

AWS CDK ● Open

Configurable OpenSearch vector index params

★ 536

Technical Leadership

Architecture as the foundation of impact.

fig.01 — guardrailgraph topology

In the next decade, software will not be written manually — it will be orchestrated. Substrai was founded on the premise that if LLMs are the engine, the industry still lacks the chassis and the transmission.

The work here focuses on building the deterministic layers that allow probabilistic models to function safely in critical systems — formal verification, type-safe policy graphs, and reproducible runtime environments.

Open source is not just a distribution model. It is a governance strategy. By building Substrai in the open, we establish a standard for how autonomous systems should be deployed, audited, and secured.

Local-first

Develop offline. Deploy anywhere.

Verifiable

Every output is auditable.

Composable

Each primitive ships standalone.

Writing

Technical insights.

Deep dives into the engineering challenges behind agentic infrastructure.

2026-05-30 Securing the AI Supply Chain: Model Signing and MCP Safety (10 min)

2026-05-28 Contributing to LlamaIndex: Fixing State Leaks in Agent Workflows (5 min)

2026-05-26 Building a CDK Construct for Bedrock AI Agents (8 min)

2026-05-20 Why LangChain Fails in Lambda (And What Does) (8 min)

2026-05-18 Cost-Aware GenAI: Model Routing for Serverless (6 min)

2026-05-15 From Use Case to Evaluation Pipeline in 10 Minutes (7 min)

Join the ecosystem

Build the substrate for the agentic future.

Star the repositories, open an issue, ship a PR — or just lurk in the Discord.


Infrastructure for autonomous systems.

Core Frameworks

lambdallm

guardrailgraph

costsentinel

promptops

agentdeploy

evalforge
