System Architecture

Overview

Varpulis Runtime Engine overview

* RocksDB requires the persistence feature flag: cargo build --features persistence

Processing Flow

Processing flow pipeline

Components

Compiler

Parse VPL via Pest PEG parser
Generates IR (Intermediate Representation)
Static optimizations

Execution Graph

DAG (Directed Acyclic Graph) of operations
Intelligent scheduling
Operator fusion when possible

Ingestion Layer

Source connectors (Kafka, files, HTTP, etc.)
Deserialization (JSON)
Schema validation

Pattern Matcher (SASE+ with ZDD)

NFA-based SASE+ engine for sequence and Kleene pattern detection
ZDD (varpulis-zdd crate): compactly represents Kleene capture combinations — e.g., 100 matching events produce ~100 ZDD nodes instead of 2^100 explicit subsets
Temporal constraints, negation, partition-by support

State Manager

See state-management.md

Aggregation Engine (Hamlet)

Aggregation functions (sum, avg, count, min, max, stddev, etc.)
Temporal windows (tumbling, sliding, session)
Key-based grouping
Hamlet (hamlet/ module): multi-query trend aggregation with graphlet-based sharing — O(1) per-event propagation, 3-100x faster than the experimental zdd_unified/ aggregation module
See trend-aggregation.md

Note on ZDD modules: The varpulis-zdd crate (used by the Pattern Matcher above) compactly represents Kleene match combinations during detection. The zdd_unified/ module is a separate experimental research module that explored ZDD-based aggregation — Hamlet supersedes it for production use.

Pattern Forecasting (PST)

Prediction Suffix Tree (pst/ module): variable-order Markov model for pattern completion forecasting
Online learning — no pre-training required, starts predictions after a configurable warmup period
Pattern Markov Chain (PMC): maps PST predictions onto SASE NFA states to forecast which events complete active sequence patterns
Sub-microsecond prediction latency (51 ns single symbol, 105 ns full distribution)
Activated via the .forecast() operator in VPL
See forecasting.md

Parallelism Manager

See parallelism.md

Context Orchestrator

Named execution contexts with OS thread isolation
CPU affinity pinning via core_affinity
Cross-context routing via bounded mpsc channels
Zero overhead when no contexts are declared
See contexts guide

Observability Layer

See observability.md

Checkpoint Manager

Versioned state snapshots (windows, SASE runs, aggregation, variables, watermarks)
Three backends: in-memory, filesystem, RocksDB (requires persistence feature)
Crash recovery with automatic watermark restoration
Opt-in by design — see Feature Maturity above
See state-management.md

Feature Maturity

Several advanced features are opt-in by design — they are production-ready but require explicit activation. This table clarifies their status to prevent them from being mistaken for incomplete implementations during audits.

Feature	Status	Activation	Module
Checkpointing	Production	Pass a `StateStore` to `Engine`; call `checkpoint_tick()` periodically	`persistence.rs`
Contexts	Production	Declare `context` blocks in VPL; engine auto-creates OS threads	`context.rs`
`.concurrent()`	Production	Add `.concurrent()` to a stream; creates a rayon thread pool	`engine/compilation.rs`
Worker Pool	Production	Instantiate `WorkerPool` with a config and dispatch events to it	`worker_pool.rs`
Hamlet Aggregation	Production	Use `.trend_aggregate()` in VPL after a sequence pattern	`hamlet/`
PST Forecasting	Production	Use `.forecast()` in VPL after a sequence pattern	`pst/`

Why opt-in? These features add runtime cost (threads, memory, I/O) that would be wasted in simple pipelines. A filter-only pipeline should not pay for checkpointing or multi-threading. The engine defaults to zero-overhead single-threaded execution and scales up only when the VPL program or API caller requests it.

LSP Server

Language Server Protocol implementation for VPL
Go-to-definition, find-references, completions, hover docs, semantic tokens
See lsp.md

MCP Server

Model Context Protocol server exposing VPL validation, pipeline management, and engine status to AI assistants
See mcp.md

System Architecture ​

Overview ​

Processing Flow ​

Components ​

Compiler ​

Execution Graph ​

Ingestion Layer ​

Pattern Matcher (SASE+ with ZDD) ​

State Manager ​

Aggregation Engine (Hamlet) ​

Pattern Forecasting (PST) ​

Parallelism Manager ​

Context Orchestrator ​

Observability Layer ​

Checkpoint Manager ​

Feature Maturity ​

LSP Server ​

MCP Server ​