Roadmap¶
Planned architectural improvements for future Hera releases.
Contract-First Design with Typed Interfaces¶
Status: Planned
Problem¶
Currently, Hera's internal interfaces rely on duck typing and dictionaries. Function signatures accept str, dict, or geopandas.GeoDataFrame interchangeably, and the desc metadata field is an untyped dict. This makes it hard to:
- Know what parameters a function actually expects without reading the source
- Validate inputs before they reach MongoDB
- Auto-generate API documentation with accurate type information
- Catch errors early (wrong column name, missing field, wrong CRS)
Proposed changes¶
- Pydantic models for document metadata: define typed schemas for desc fields instead of free-form dicts.
- Typed method signatures: replace **kwargs and **desc patterns with explicit typed parameters.
- Enum for data formats: replace string constants with a proper enum.
- Protocol classes for toolkit layers: define what analysis and presentation layers must implement.
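The four proposals above can be sketched together as follows. This is an illustration under stated assumptions, not Hera's actual API: the names `DataFormat`, `RasterDesc`, `AnalysisLayer`, `analyze`, and `add_document`, as well as the metadata fields, are hypothetical. A stdlib dataclass with manual validation stands in for a Pydantic model so the sketch runs without extra dependencies; a Pydantic version would declare the same fields on a `BaseModel` subclass.

```python
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Any, Optional, Protocol, runtime_checkable


class DataFormat(str, Enum):
    """Hypothetical format names; Hera's real constants may differ."""
    STRING = "string"
    PARQUET = "parquet"
    GEOPANDAS = "geopandas"


@dataclass(frozen=True)
class RasterDesc:
    """Typed stand-in for a free-form desc dict (fields are illustrative)."""
    source_name: str
    crs: int                              # EPSG code, e.g. 4326
    resolution_m: Optional[float] = None

    def __post_init__(self) -> None:
        # Reject bad metadata before anything would reach the database.
        if not self.source_name:
            raise ValueError("source_name must be non-empty")
        if self.crs <= 0:
            raise ValueError(f"crs must be a positive EPSG code, got {self.crs}")


@runtime_checkable
class AnalysisLayer(Protocol):
    """What an analysis layer must implement (method name is hypothetical)."""
    def analyze(self, data: Any) -> dict: ...


def add_document(resource: str, data_format: DataFormat, desc: RasterDesc) -> dict:
    """Explicit typed parameters instead of **kwargs / **desc."""
    return {
        "resource": resource,
        "dataFormat": data_format.value,
        "desc": asdict(desc),
    }


class SlopeAnalysis:
    """Satisfies AnalysisLayer structurally, without inheriting from it."""
    def analyze(self, data: Any) -> dict:
        return {"n": len(data)}
```

With this shape, a wrong CRS or a missing field fails at construction time, and a type checker can flag a layer class that lacks the required method.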
Migration path¶
- Phase 1: Add type annotations to all public methods (backward compatible)
- Phase 2: Introduce Pydantic models alongside existing dict interfaces
- Phase 3: Deprecate untyped interfaces with warnings
- Phase 4: Remove untyped interfaces in a major version
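Phases 2 and 3 could coexist in a single function that accepts both forms and warns on the old one. The sketch below is one possible shape, assuming a hypothetical `Desc` model and `save_metadata` function; neither name comes from Hera.

```python
import warnings
from dataclasses import asdict, dataclass
from typing import Union


@dataclass
class Desc:
    """Hypothetical typed metadata model."""
    source_name: str
    crs: int


def save_metadata(desc: Union[Desc, dict]) -> dict:
    """Accept the typed model, or a legacy dict with a deprecation warning."""
    if isinstance(desc, dict):
        warnings.warn(
            "Passing desc as a dict is deprecated; pass a Desc instance instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not this function
        )
        desc = Desc(**desc)  # raises TypeError on missing or unexpected keys
    return asdict(desc)
```

Removing the `isinstance(desc, dict)` branch in a later major version would complete Phase 4 without touching callers that already pass the typed model.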
Unified Toolkit Registry¶
Status: Planned
Problem¶
Currently, toolkits are discovered from two different sources:
- Internal (built-in): Hardcoded Python dict in ToolkitHome.__init__; requires a source code edit to add or remove
- Dynamic (external): Registered in MongoDB via the registerToolkit CLI; no source code change needed
This creates:
- Two code paths for toolkit resolution (dict lookup vs DB query)
- Built-in toolkits can't be overridden without modifying source
- No single source of truth for "what toolkits are available"
- Adding a new built-in toolkit requires editing Python code
Proposed changes¶
- Single registration mechanism: all toolkits (built-in and external) are registered in the database using the same ToolkitDataSource document type.
- Built-in registry JSON: ship a toolkits_registry.json with Hera.
- Registration command: make install runs hera-project registerBuiltins, which reads the JSON and registers all built-in toolkits in the DB.
- Single resolution path: getToolkit(name) always queries the DB. No hardcoded fallback dict.
- Populate command: make populate loads all repositories into all projects (already implemented).
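A minimal sketch of the registration and resolution steps above, under several assumptions: the JSON layout (toolkit name mapped to `"module:ClassName"`) is a guess at what toolkits_registry.json might contain, a plain dict stands in for the MongoDB collection, and stdlib classes stand in for real toolkit classes. The function names `register_builtins` and `get_toolkit` are illustrative.

```python
import importlib
import json
from typing import Dict

# Hypothetical file content: toolkit name -> "module.path:ClassName".
# Stdlib classes are used as placeholders for real toolkit classes.
REGISTRY_JSON = """
{
  "OrderedDictToolkit": "collections:OrderedDict",
  "CounterToolkit": "collections:Counter"
}
"""

# Stand-in for the MongoDB collection of ToolkitDataSource documents.
fake_db: Dict[str, dict] = {}


def register_builtins(registry_text: str, db: Dict[str, dict]) -> int:
    """Read the registry JSON and store one document per toolkit."""
    entries = json.loads(registry_text)
    for name, target in entries.items():
        module_path, class_name = target.split(":")
        db[name] = {
            "type": "ToolkitDataSource",
            "module": module_path,
            "cls": class_name,
        }
    return len(entries)


def get_toolkit(name: str, db: Dict[str, dict]) -> type:
    """Single resolution path: look up the document, then import the class."""
    try:
        doc = db[name]
    except KeyError:
        raise KeyError(f"toolkit {name!r} is not registered") from None
    module = importlib.import_module(doc["module"])
    return getattr(module, doc["cls"])
```

Because built-in and external toolkits would share one document shape, overriding a built-in becomes a matter of re-registering its name rather than editing source.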
Flow after unification¶
make install
  → mongo-up
  → hera-project registerBuiltins ← reads toolkits_registry.json → DB
  → make populate ← loads repositories into all projects
Adding a new built-in toolkit:
1. Create the class in hera/
2. Add one line to toolkits_registry.json
3. Run: make install
Adding an external toolkit:
1. Create the class anywhere on disk
2. Run: hera-project addToolkit myToolkit /path/to/toolkit
3. Run: make populate (optional, to propagate to all projects)
Migration path¶
- Phase 1: Create toolkits_registry.json and the registerBuiltins command (keep hardcoded dict as fallback)
- Phase 2: make install runs registration automatically
- Phase 3: Remove hardcoded _toolkits dict; DB is the single source of truth
- Phase 4: Constants like toolkitHome.GIS_RASTER_TOPOGRAPHY remain as string aliases (no behavior change for users)
Backward compatibility¶
- toolkitHome.getToolkit("MeteoLowFreq") continues to work; resolution just comes from DB instead of dict
- toolkitHome.GIS_RASTER_TOPOGRAPHY constant still works; it's just a string "GIS_Raster_Topography"
- Existing registered dynamic toolkits continue to work unchanged
- Users who never run registerBuiltins get the hardcoded fallback (during transition)
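The transitional behavior described above (DB first, hardcoded dict as fallback) could look roughly like this. The class below is a simplified stand-in for ToolkitHome: a dict replaces the database, and the toolkit values are placeholder strings rather than real classes.

```python
from typing import Dict, Optional


class ToolkitHome:
    """Simplified stand-in showing DB-first lookup with a fallback dict."""

    # String alias, as described above: the constant is just the toolkit name.
    GIS_RASTER_TOPOGRAPHY = "GIS_Raster_Topography"

    # Transitional hardcoded dict; values are placeholders for this sketch.
    _toolkits: Dict[str, str] = {
        "GIS_Raster_Topography": "builtin:TopographyToolkit",
        "MeteoLowFreq": "builtin:MeteoLowFreqToolkit",
    }

    def __init__(self, db: Optional[Dict[str, str]] = None) -> None:
        # A dict stands in for the MongoDB-backed registry.
        self._db: Dict[str, str] = db if db is not None else {}

    def getToolkit(self, name: str) -> str:
        if name in self._db:            # registered entries take precedence
            return self._db[name]
        if name in self._toolkits:      # fallback while registerBuiltins is optional
            return self._toolkits[name]
        raise KeyError(f"toolkit {name!r} is not available")
```

Once Phase 3 removes `_toolkits`, only the first branch remains and callers see no difference as long as registration has run.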
Other Planned Improvements¶
Environment variable configuration for MongoDB¶
Support HERA_DB_* environment variables as an override for ~/.pyhera/config.json, enabling container and CI deployments without config files.
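One way the override could work is sketched below. It assumes a flat JSON config and maps each `HERA_DB_<NAME>` variable to a lowercase `<name>` key; both the file layout and the specific variable names (such as `HERA_DB_HOST`) are assumptions, not the actual Hera configuration format.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional

ENV_PREFIX = "HERA_DB_"


def load_db_config(
    config_path: Path,
    environ: Optional[Mapping[str, str]] = None,
) -> dict:
    """Load the JSON config if present, then apply HERA_DB_* overrides."""
    environ = os.environ if environ is None else environ

    config: dict = {}
    if config_path.is_file():           # the file is optional (containers, CI)
        config = json.loads(config_path.read_text())

    for key, value in environ.items():
        if key.startswith(ENV_PREFIX) and len(key) > len(ENV_PREFIX):
            # HERA_DB_HOST -> "host"; values stay strings, as in the environment
            config[key[len(ENV_PREFIX):].lower()] = value
    return config
```

A call such as `load_db_config(Path.home() / ".pyhera" / "config.json")` would then work whether or not the file exists, provided the environment supplies the needed values.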
Async support¶
Add async versions of database operations for use in web servers and Jupyter notebooks with event loops.
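A low-effort starting point is to wrap the existing blocking calls rather than rewrite them. The sketch below uses `asyncio.to_thread` (Python 3.9+) around a hypothetical blocking `get_documents` function; it does not use a native async MongoDB driver, which would be a separate design choice.

```python
import asyncio
import time
from typing import List


def get_documents(project: str) -> List[dict]:
    """Blocking stand-in for a synchronous database query (hypothetical)."""
    time.sleep(0.05)  # simulate network latency
    return [{"project": project, "type": "ToolkitDataSource"}]


async def get_documents_async(project: str) -> List[dict]:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(get_documents, project)


async def main() -> List[List[dict]]:
    # The two queries overlap instead of running back to back.
    return await asyncio.gather(
        get_documents_async("projA"),
        get_documents_async("projB"),
    )
```

In a script this would be driven with `asyncio.run(main())`; in a Jupyter notebook, where an event loop is already running, `await main()` in a cell is the usual form.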
Plugin system for data handlers¶
Allow third-party packages to register custom DataHandler_* classes without modifying datahandler.py.
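One possible mechanism is a registry that handler classes join through a decorator, so a third-party package only needs to be imported. The names `register_handler`, `get_handler`, `getData`, and the `csv_lines` format below are hypothetical and do not reflect the current contents of datahandler.py.

```python
from typing import Callable, Dict

# Maps a data format name to the handler class responsible for it.
_HANDLERS: Dict[str, type] = {}


def register_handler(data_format: str) -> Callable[[type], type]:
    """Class decorator that adds a handler to the registry."""
    def decorator(cls: type) -> type:
        if data_format in _HANDLERS:
            raise ValueError(f"handler for {data_format!r} already registered")
        _HANDLERS[data_format] = cls
        return cls
    return decorator


def get_handler(data_format: str) -> type:
    """Look up the handler class for a format; raises KeyError if unknown."""
    return _HANDLERS[data_format]


# What a third-party package might contain:
@register_handler("csv_lines")
class DataHandler_csv_lines:
    @staticmethod
    def getData(resource: str) -> list:
        # Split each line of the text resource into its comma-separated fields.
        return [line.split(",") for line in resource.splitlines()]
```

Package entry points, read through `importlib.metadata`, are an alternative that discovers installed plugins without requiring an explicit import.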