The Hermes Pipeline
Abstraction: From Instructions to Execution
At its core, Hermes is an abstract pipeline engine — a framework for turning a declarative description of what should happen into an executable sequence of how it happens. While Hermes ships with nodes for OpenFOAM and CFD simulations, the pipeline itself is entirely domain-agnostic. It can orchestrate any sequence of instructions: file operations, shell commands, Python code, template rendering, or simulation setup.
The key insight is the separation between defining a workflow and executing it. You describe your pipeline as a JSON document — a data structure, not a program. Hermes transforms that data into executable code, resolves all dependencies, and runs the tasks in the correct order.
flowchart LR
subgraph Define ["Define (JSON)"]
What["What to do\n& in what order"]
end
subgraph Transform ["Transform (Hermes)"]
How["Resolve dependencies\n& generate code"]
end
subgraph Execute ["Execute (Engine)"]
Run["Schedule & run\ntasks"]
end
Define --> Transform --> Execute
Five Layers of Abstraction
The Hermes pipeline is built from five layers, each with a clear responsibility:
flowchart TB
L1["Layer 1: JSON Workflow\n─────────────────\nDeclarative definition\nof nodes, parameters,\nand dependencies"]
L2["Layer 2: Workflow Engine\n─────────────────\nLoads JSON, resolves\ntemplates, builds\ndependency graph"]
L3["Layer 3: Task Wrappers\n─────────────────\nWraps each node with\nmetadata, parameter\nmapping, connectivity"]
L4["Layer 4: Code Generation\n─────────────────\nTranslates task graph\ninto engine-specific\nexecutable code"]
L5["Layer 5: Execution\n─────────────────\nEngine schedules and\nruns tasks respecting\ndependencies"]
L1 --> L2 --> L3 --> L4 --> L5
Layer 1: JSON Workflow Definition
The workflow is pure data — a JSON document that declares what nodes exist, what parameters they need, and how they depend on each other. There is no executable code in the workflow file itself.
{
  "workflow": {
    "nodeList": ["PrepareCase", "RunSimulation", "PostProcess"],
    "nodes": {
      "PrepareCase": {
        "type": "general.CopyDirectory",
        "Execution": {
          "input_parameters": {
            "Source": "template_case",
            "Target": "my_simulation"
          }
        }
      },
      "RunSimulation": {
        "type": "general.RunOsCommand",
        "Execution": {
          "input_parameters": {
            "Command": "cd {PrepareCase.output.Target} && ./Allrun"
          }
        }
      },
      "PostProcess": {
        "type": "general.RunPythonCode",
        "Execution": {
          "input_parameters": {
            "ModulePath": "my_analysis",
            "ClassName": "Analyzer",
            "MethodName": "run",
            "Parameters": {
              "case_dir": "{PrepareCase.output.Target}"
            }
          }
        }
      }
    }
  }
}
This is the only thing you write. Everything below happens automatically.
Layer 2: Workflow Engine
The workflow class loads the JSON, resolves any template references, and builds a complete dependency graph. It detects dependencies in two ways:
- Implicit: by scanning parameter values for {NodeName.output.*} references
- Explicit: through the requires field
The result is a directed acyclic graph (DAG) of tasks with fully resolved parameters.
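The two detection modes can be sketched in a few lines of Python. This is a minimal illustration, not the Hermes implementation: the regular expression, the function names, the placement of requires at the top level of a node, and the Report node are all assumptions made for the example.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Assumed reference grammar: {NodeName.output.<anything>}
REF_PATTERN = re.compile(r"\{(\w+)\.output\.[^{}]+\}")


def find_references(value):
    """Recursively collect the node names referenced inside a parameter value."""
    if isinstance(value, str):
        return set(REF_PATTERN.findall(value))
    if isinstance(value, dict):
        value = list(value.values())
    if isinstance(value, list):
        found = set()
        for item in value:
            found |= find_references(item)
        return found
    return set()


def build_dependencies(workflow):
    """Map each node name to the set of node names it depends on."""
    dependencies = {}
    for name, node in workflow["workflow"]["nodes"].items():
        params = node.get("Execution", {}).get("input_parameters", {})
        implicit = find_references(params)
        # Assumption: "requires" holds a node name or a list of node names.
        explicit = node.get("requires", [])
        if isinstance(explicit, str):
            explicit = [explicit]
        dependencies[name] = (implicit | set(explicit)) - {name}
    return dependencies


def execution_order(dependencies):
    """Return one valid run order; graphlib raises CycleError if there is a cycle."""
    return list(TopologicalSorter(dependencies).static_order())


# "Report" is a hypothetical node used only to show an explicit dependency.
example = {
    "workflow": {
        "nodes": {
            "PrepareCase": {"Execution": {"input_parameters": {"Target": "my_simulation"}}},
            "RunSimulation": {
                "Execution": {
                    "input_parameters": {"Command": "cd {PrepareCase.output.Target} && ./Allrun"}
                }
            },
            "Report": {"requires": "RunSimulation", "Execution": {"input_parameters": {}}},
        }
    }
}
deps = build_dependencies(example)
order = execution_order(deps)
```

Here RunSimulation picks up PrepareCase implicitly from its Command string, while Report depends on RunSimulation only through its requires entry.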
Layer 3: Task Wrappers
Each node is wrapped in a hermesTaskWrapper that holds:
- The node's identity (name, type)
- Resolved input parameters
- List of required upstream tasks
- Mapping from parameter paths to upstream outputs
The wrapper is an abstraction layer between the workflow definition and the execution engine — it carries all the metadata needed to generate executable code without being tied to any specific engine.
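As a rough picture of the metadata listed above, a wrapper could be modeled as a plain data class. The class name and field names below are hypothetical and chosen for readability; they are not the attributes of the real hermesTaskWrapper.

```python
from dataclasses import dataclass, field


@dataclass
class TaskWrapperSketch:
    """Engine-neutral description of one node (illustrative field names)."""

    name: str                      # node identity: name
    node_type: str                 # node identity: type
    input_parameters: dict         # resolved input parameters
    requires: list = field(default_factory=list)           # upstream task names
    parameter_mapping: dict = field(default_factory=dict)  # parameter path -> upstream output


wrapper = TaskWrapperSketch(
    name="RunSimulation",
    node_type="general.RunOsCommand",
    input_parameters={"Command": "cd {PrepareCase.output.Target} && ./Allrun"},
    requires=["PrepareCase"],
    parameter_mapping={"Command": "PrepareCase.output.Target"},
)
```

Nothing in this structure mentions Luigi, which is the point: a builder for any engine could consume the same records.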
Layer 4: Code Generation
A builder (currently LuigiBuilder) translates the task wrapper graph into engine-specific code. For Luigi, each node becomes a Python Task class with:
- requires(): returns the list of dependency tasks
- output(): specifies where results are stored (as JSON files)
- run(): resolves parameters from upstream outputs, invokes the node's executer, and writes the result
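To make the idea of code generation concrete, the sketch below renders task descriptions into the source text of Luigi Task classes with those three methods. It is an assumption-laden illustration rather than the output of LuigiBuilder: the template, generate_module, and the run_executer placeholder are invented for the example, and node names are assumed to be valid Python identifiers. Only luigi.Task, luigi.LocalTarget, requires(), output(), run(), and input() are actual Luigi API.

```python
from string import Template

MODULE_HEADER = '''\
import json

import luigi


def run_executer(node_type, parameters, upstream):
    """Placeholder for looking up and invoking the node's executer."""
    raise NotImplementedError(node_type)
'''

TASK_TEMPLATE = Template('''

class $name(luigi.Task):
    """Generated from node $name (type $node_type)."""

    def requires(self):
        return [$requires]

    def output(self):
        return luigi.LocalTarget("$name.json")

    def run(self):
        upstream = {}
        for task, target in zip(self.requires(), self.input()):
            with target.open("r") as handle:
                upstream[type(task).__name__] = json.load(handle)
        result = run_executer("$node_type", $parameters, upstream)
        with self.output().open("w") as handle:
            json.dump(result, handle)
''')


def generate_module(tasks):
    """Render a list of task descriptions into Python source text."""
    parts = [MODULE_HEADER]
    for task in tasks:
        parts.append(
            TASK_TEMPLATE.substitute(
                name=task["name"],
                node_type=task["type"],
                requires=", ".join(f"{dep}()" for dep in task["requires"]),
                parameters=repr(task["parameters"]),
            )
        )
    return "".join(parts)


source = generate_module([
    {"name": "PrepareCase", "type": "general.CopyDirectory",
     "parameters": {"Source": "template_case", "Target": "my_simulation"}, "requires": []},
    {"name": "RunSimulation", "type": "general.RunOsCommand",
     "parameters": {"Command": "cd {PrepareCase.output.Target} && ./Allrun"},
     "requires": ["PrepareCase"]},
])
```

The generator itself never imports Luigi; it only produces text, so the resulting file can be inspected before anything runs.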
Layer 5: Execution
The execution engine (Luigi) schedules and runs all tasks, respecting the dependency order. Each task reads its inputs from the JSON outputs of its dependencies, runs the domain-specific logic, and writes its own JSON output for downstream consumers.
The Executer Pattern
The pipeline's flexibility comes from the executer pattern. Each node type has a corresponding executer — a Python class that implements the actual work:
flowchart TB
subgraph Abstract ["Abstract Layer (Hermes Core)"]
WF["Workflow Engine"]
TW["Task Wrapper"]
end
subgraph Concrete ["Concrete Layer (Executers)"]
direction LR
E1["CopyDirectory\nexecuter"]
E2["RunOsCommand\nexecuter"]
E3["JinjaTransform\nexecuter"]
E4["ControlDict\nexecuter"]
E5["Your Custom\nexecuter"]
end
WF --> TW
TW --> E1
TW --> E2
TW --> E3
TW --> E4
TW --> E5
All executers follow the same interface:
- Accept a dictionary of input parameters
- Perform their specific operation
- Return a dictionary of output values
This means the pipeline doesn't know or care what each node does — it only manages the flow of data between them. A CopyDirectory node, an OpenFOAM ControlDict node, and your own custom node all look the same to the pipeline.
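The dictionary-in, dictionary-out contract can be illustrated with a small abstract base class. The class names, the run method name, and the WordCountSketch node are hypothetical; the real Hermes base class and its method signature may differ.

```python
import shutil
from abc import ABC, abstractmethod
from pathlib import Path


class ExecuterSketch(ABC):
    """Assumed common interface: a dict of inputs in, a dict of outputs out."""

    @abstractmethod
    def run(self, inputs: dict) -> dict:
        ...


class CopyDirectorySketch(ExecuterSketch):
    """Copies Source to Target and reports absolute paths."""

    def run(self, inputs):
        source = Path(inputs["Source"])
        target = Path(inputs["Target"])
        shutil.copytree(source, target, dirs_exist_ok=True)  # Python 3.8+
        return {"Source": str(source.resolve()), "Target": str(target.resolve())}


class WordCountSketch(ExecuterSketch):
    """A made-up custom node: counts the words in a Text parameter."""

    def run(self, inputs):
        return {"Count": len(inputs["Text"].split())}


def run_node(executer, inputs):
    """The pipeline's view of a node: it only sees dictionaries."""
    outputs = executer.run(inputs)
    if not isinstance(outputs, dict):
        raise TypeError("an executer must return a dictionary")
    return outputs
```

run_node treats both classes identically, which mirrors how the pipeline can stay unaware of what a node actually does.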
The Executer Registry
When Hermes encounters a node type like general.CopyDirectory, it uses the executer registry to locate the right Python class:
"general.CopyDirectory" → hermes/Resources/general/CopyDirectory/executer.py
"openFOAM.system.ControlDict" → hermes/Resources/openFOAM/system/ControlDict/executer.py
Executers are loaded dynamically at runtime (late binding), so you can add new node types without modifying any core code.
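The mapping shown above, together with late binding, could look roughly like the following. The path rule is inferred only from the two examples given, and both function names are hypothetical; the actual registry may resolve types differently.

```python
import importlib.util
from pathlib import PurePosixPath


def executer_path(node_type, root="hermes/Resources"):
    """Turn a dotted node type into the path of its executer module."""
    return str(PurePosixPath(root, *node_type.split("."), "executer.py"))


def load_executer_module(node_type, root="hermes/Resources"):
    """Import the executer module only when the node type is first needed."""
    path = executer_path(node_type, root)
    module_name = "executer_" + node_type.replace(".", "_")
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # raises FileNotFoundError if the file is missing
    return module
```

Under this scheme, adding a node type amounts to dropping a new executer.py into the matching directory; no central table has to be edited.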
Data Flow: How Parameters Move Through the Pipeline
The parameter reference system is what connects nodes into a coherent pipeline. When you write:
"Command": "cd {PrepareCase.output.Target} && ./Allrun"
This is what happens at execution time:
- The PrepareCase task runs and writes its output as JSON: {"Target": "/absolute/path/to/my_simulation", ...}
- The RunSimulation task reads PrepareCase's JSON output
- The reference {PrepareCase.output.Target} is resolved to /absolute/path/to/my_simulation
- The command becomes: cd /absolute/path/to/my_simulation && ./Allrun
Every piece of data flows through JSON files — making the pipeline fully observable and debuggable. You can inspect any intermediate JSON file to see exactly what a node produced.
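The resolution step described above can be sketched as a recursive substitution over parameter values. This is a simplified illustration under stated assumptions: the reference grammar, the resolve function, and the in-memory outputs dictionary (standing in for the JSON files on disk) are all invented for the example.

```python
import json
import re

# Assumed grammar: {Node.output.Field} or {Node.output.Nested.Field}
REFERENCE = re.compile(r"\{(\w+)\.output\.([\w.]+)\}")


def resolve(value, outputs):
    """Replace every reference in value using the upstream outputs."""
    if isinstance(value, str):
        def lookup(match):
            current = outputs[match.group(1)]
            for key in match.group(2).split("."):
                current = current[key]
            return str(current)
        return REFERENCE.sub(lookup, value)
    if isinstance(value, dict):
        return {key: resolve(item, outputs) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve(item, outputs) for item in value]
    return value


# Stand-in for reading PrepareCase's JSON output file.
outputs = {"PrepareCase": json.loads('{"Target": "/absolute/path/to/my_simulation"}')}
command = resolve("cd {PrepareCase.output.Target} && ./Allrun", outputs)
```

With the output shown in the steps above, command ends up as cd /absolute/path/to/my_simulation && ./Allrun. A reference to a node or field that does not exist raises KeyError in this sketch.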
flowchart LR
A["Node A\nruns & writes\nA.json"] -- "A.json" --> B["Node B\nreads A.json,\nresolves params,\nruns & writes B.json"]
B -- "B.json" --> C["Node C\nreads B.json,\nresolves params,\nruns & writes C.json"]
Why This Abstraction Matters
Simulation workflows become reproducible
Because the entire workflow is a JSON file, you can:
- Version control it alongside your simulation files
- Compare two workflows to see exactly what changed
- Re-run the exact same pipeline on a different machine
- Store workflows in a database for querying and analysis
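Because a workflow is plain data, comparing two versions needs no special tooling. The sketch below is one possible way to list changed input parameters; the function names are hypothetical and it only looks at the Execution.input_parameters section used in the example workflow.

```python
def flatten_parameters(workflow):
    """Map (node name, parameter name) to the parameter's value."""
    flat = {}
    for name, node in workflow["workflow"]["nodes"].items():
        params = node.get("Execution", {}).get("input_parameters", {})
        for key, value in params.items():
            flat[(name, key)] = value
    return flat


def diff_workflows(old, new):
    """Return {(node, parameter): (old value, new value)} for every difference.

    A parameter missing on one side is reported as None on that side.
    """
    before, after = flatten_parameters(old), flatten_parameters(new)
    changes = {}
    for key in sorted(before.keys() | after.keys()):
        if before.get(key) != after.get(key):
            changes[key] = (before.get(key), after.get(key))
    return changes


old = {"workflow": {"nodes": {"PrepareCase": {"Execution": {"input_parameters": {
    "Source": "template_case", "Target": "my_simulation"}}}}}}
new = {"workflow": {"nodes": {"PrepareCase": {"Execution": {"input_parameters": {
    "Source": "template_case", "Target": "my_simulation_v2"}}}}}}
changes = diff_workflows(old, new)
```

In this example the only reported change is the Target of PrepareCase.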
The pipeline is not limited to simulations
The same pipeline can orchestrate:
| Use Case | Node Types Used |
|---|---|
| CFD simulation | Parameters → BlockMesh → ControlDict → FvSchemes → BuildAllrun |
| File processing | CopyDirectory → RunOsCommand → FilesWriter |
| Data transformation | Parameters → JinjaTransform → FilesWriter |
| Custom computation | RunPythonCode → RunOsCommand → CopyDirectory |
| Mixed workflows | Any combination of the above |
Separation of concerns
| Concern | Who handles it | How |
|---|---|---|
| What to do | You (JSON file) | Declare nodes and parameters |
| How to do it | Executers | Domain-specific Python classes |
| When to do it | Execution engine (Luigi) | Automatic dependency scheduling |
| Where data goes | Parameter references | {Node.output.Field} paths |
This means you can change the simulation parameters without touching the pipeline logic, add new node types without modifying the engine, or, because engine-specific code is confined to the builder layer, target a different execution engine without rewriting your workflows.
Putting It All Together
Here's the complete lifecycle of a Hermes pipeline execution:
sequenceDiagram
participant User
participant CLI as hermes-workflow
participant Expand as Expand
participant WF as Workflow Engine
participant Builder as Luigi Builder
participant Luigi as Luigi Engine
participant Exec as Executers
User->>CLI: buildExecute workflow.json
CLI->>Expand: Resolve templates
Expand-->>CLI: Expanded JSON
CLI->>WF: Load expanded JSON
WF->>WF: Parse nodes & parameters
WF->>WF: Detect dependencies
WF->>WF: Build task wrapper graph
CLI->>Builder: Generate Luigi code
Builder-->>CLI: Python file with Task classes
CLI->>Luigi: Run workflow
loop For each task (in dependency order)
Luigi->>Exec: Run executer with resolved params
Exec-->>Luigi: Output JSON
end
Luigi-->>User: Workflow complete
Each step in this sequence adds one layer of concreteness — from an abstract JSON definition to concrete executed results — while keeping each layer independent and replaceable.