
Experiment Toolkit

Toolkit name: experiment

Manages experimental workflows — organizing raw data files into structured experiments with trials, device types, and sensors. Provides analysis (transmission frequency, turbulence, metadata enrichment) and presentation (device maps, heatmaps, LaTeX reports).

from hera import toolkitHome

# Tip: if you created the project with `hera-project project create`, you can omit projectName
home = toolkitHome.getToolkit(toolkitHome.EXPERIMENT, projectName="MY_PROJECT")

# List available experiments
print(home.keys())  # ['IMS_experiment', 'Haifa2014']

# Get a specific experiment
exp = home.getExperiment("Haifa2014")
# or dictionary-style:
exp = home["Haifa2014"]

# Access trial data
trial = exp.trialSet["Measurements"]["Trial_01"]
df = trial.getData(deviceType="Sonic")

# Analyze transmission health
freq = exp.analysis.getDeviceTypeTransmissionFrequencyOfTrial(
    deviceType="Sonic", trialName="Trial_01"
)

# Visualize device functionality heatmap
exp.presentation.plotDeviceTypeFunctionality(
    deviceType="Sonic", trialName="Trial_01"
)

For the full API, see the API Reference. For implementation details, see the Developer Guide.


Concepts

The Argos data model

An experiment is defined in the Argos experiment management system (ArgosWEB) and exported as a ZIP file. The data model has four core objects:

Entity Types and Entities — devices and sensors:

  • An Entity Type is a class of device (e.g., "Sonic", "TRH", "Gateway"). It defines the attribute schema — which properties every device of this type has.
  • An Entity is a specific device instance (e.g., "sonic01", "TRH_North"). It has its own attribute values.

Trial Sets and Trials — experimental configurations:

  • A Trial Set groups related trials (e.g., "Measurements", "Calibration"). It defines the trial-level property schema.
  • A Trial is a specific time-bounded experimental run. It assigns entities to locations, sets per-trial attribute values, and defines TrialStart/TrialEnd timestamps.

Property scopes

Each attribute has a scope that determines where its value is set:

Scope      Level                  Changes per trial?                        Example
Constant   Entity type            No — same for all devices of this type    StoreDataPerDevice=false
Device     Entity instance        No — fixed per device                     stationName="Check_Post", height=9
Trial      Per-device-per-trial   Yes — different in each trial             location, calibration values, thresholds
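
The scope rules amount to a layered lookup: trial-scope values override device-scope values, which override type-level constants. The following is a hypothetical illustration of that merge order in plain Python, not Hera code; constant_props, device_props, and trial_props are made-up names.

```python
# Hypothetical sketch (not the Hera API) of how the three scopes combine.
constant_props = {"StoreDataPerDevice": False}              # Constant: per entity type
device_props = {"stationName": "Check_Post", "height": 9}   # Device: per entity instance
trial_props = {"location": (120.0, 45.0)}                   # Trial: per device, per trial

# Later scopes win on key collisions, so a trial value overrides a device value
effective = {**constant_props, **device_props, **trial_props}
```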

Containment hierarchy

Entities can be nested — a TRH sensor can be "contained in" a sonic anemometer station. Child entities inherit location and attributes from their parents. For example, if TRH01 is contained in sonic01, and TRH01 has no location set, it inherits sonic01's location.
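
The inheritance rule can be sketched as a walk up the containment chain. This is a hypothetical illustration of the behaviour described above, not the Hera implementation; parent_of, location_of, and resolve_location are made-up names.

```python
# Hypothetical sketch (not the Hera API) of location inheritance through containment.
parent_of = {"TRH01": "sonic01"}            # containment: child → parent
location_of = {"sonic01": (120.0, 45.0)}    # TRH01 has no location of its own

def resolve_location(entity):
    """Walk up the containment chain until an entity with a location is found."""
    while entity is not None and entity not in location_of:
        entity = parent_of.get(entity)
    return location_of.get(entity)
```

With these tables, resolve_location("TRH01") falls back to sonic01's location, exactly as in the example above.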

Hera class hierarchy

In Hera, the Argos data model is extended with data-engine awareness:

Level            Class                     Description
Experiment Home  experimentHome            Factory — lists and retrieves experiments in a project
Experiment       experimentSetupWithData   A single experiment with its configuration, trials, and devices
Trial Set        TrialSetWithData          A named group of trials (e.g., "Measurements", "Calibration")
Trial            TrialWithData             A single trial with start/end times and data access
Entity Type      EntityTypeWithData        A device type (e.g., "Sonic", "TRH") — all sensors of that kind
Entity           EntityWithData            A single sensor/device (e.g., "S01", "TRH_North")

Each experiment has a data engine that handles the actual data retrieval — Parquet files, MongoDB via Pandas, or MongoDB via Dask. All trial and entity objects share the same engine instance.


Experiment lifecycle

1. Define in ArgosWEB

Create the experiment in the Argos web UI:

  • Define entity types and their attribute schemas
  • Create entity instances (devices/sensors)
  • Create trial sets and trials with TrialStart/TrialEnd dates
  • Place devices on map images with coordinates
  • Set up the containment hierarchy (which sensor is on which station)
  • Export as a ZIP file

2. Create experiment directory

hera-experiment create MyExperiment --zip /path/to/exported.zip --path /experiments/

This creates the standard directory structure:

MyExperiment/
├── code/
│   └── MyExperiment.py              # Experiment class (customisable)
├── data/                            # Parquet files (one per device type)
│   ├── Sonic.parquet
│   └── TRH.parquet
├── runtimeExperimentData/
│   ├── Datasources_Configurations.json
│   └── MyExperiment.zip             # Argos metadata
└── MyExperiment_repository.json     # For loading into Hera projects

3. Collect data

During the experiment, data flows from sensors to Parquet files:

Devices → Node-RED → Kafka → pyArgos consumer → Parquet files
         (normalise)  (1 topic   (batch consume)   (data/ dir)
                      per type)

Or data can be loaded from Campbell binary/TOA5 files after the fact.
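
Hera handles the Campbell file loading itself; purely for illustration, here is a generic pandas sketch of reading a TOA5 (Campbell Scientific ASCII) file. TOA5 files carry four header lines — environment, field names, units, processing type — and only the field-names line becomes the DataFrame header. The file contents below are made up; binary Campbell files need a different reader.

```python
# Generic sketch of parsing a TOA5 file with pandas (not Hera's own loader).
import io
import pandas as pd

toa5 = io.StringIO(
    '"TOA5","Station","CR1000","1234","CPU:prog.CR1","sig","SonicTable"\n'
    '"TIMESTAMP","RECORD","Ux","Uy"\n'
    '"TS","RN","m/s","m/s"\n'
    '"","","Smp","Smp"\n'
    '"2024-03-15 08:00:00",0,1.2,0.3\n'
    '"2024-03-15 08:00:01",1,1.4,0.1\n'
)

# Skip the environment (0), units (2) and processing (3) lines;
# the field-names line (1) then serves as the header.
df = pd.read_csv(toa5, skiprows=[0, 2, 3], parse_dates=["TIMESTAMP"])
```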

4. Load into Hera project

# Register repository (one-time)
hera-project repository add MyExperiment/MyExperiment_repository.json

# Create project (loads all registered repositories)
hera-project project create MY_PROJECT

# Or update existing project
hera-project project updateRepositories MY_PROJECT

5. Analyse in Python

from hera import toolkitHome

home = toolkitHome.getToolkit(toolkitHome.EXPERIMENT, projectName="MY_PROJECT")
exp = home["MyExperiment"]

# Access trial data
df = exp.trialSet["Measurements"]["Trial_01"].getData(deviceType="Sonic")

# Analyse
exp.analysis.addTrialProperties(df, "Trial_01")

Exploring experiment metadata

Before accessing data, you can inspect the experiment's structure:

exp = home["MyExperiment"]

# Experiment configuration
print(exp.name)
print(exp.configuration)

# Entity types and their properties
for name, etype in exp.entityType.items():
    print(f"{name}: {etype.numberOfEntities} entities")
    print(etype.propertiesTable)       # attribute schema
    print(etype.entitiesTable)         # all devices as DataFrame

# Trial sets and trials
for ts_name, ts in exp.trialSet.items():
    print(f"Trial set: {ts_name}")
    print(ts.trialsTable)              # all trials as DataFrame
    for trial_name, trial in ts.items():
        print(f"  Trial: {trial_name}")
        print(f"    Start: {trial.properties['TrialStart']}")
        print(f"    End: {trial.properties['TrialEnd']}")
        print(trial.entitiesTable())   # devices in this trial with locations

Data storage: StoreDataPerDevice

Each entity type (device type) has a StoreDataPerDevice flag that controls how measurement data is organized on disk:

  • false (default): one file per entity type. All devices of that type share a single Parquet file, with a deviceName column to distinguish them (e.g., data/Sonic.parquet contains data from sonic01, sonic02, ...).
  • true: one file per device. Each device has its own Parquet file (e.g., data/sonic01.parquet, data/sonic02.parquet, ...).

This flag is defined in the experiment metadata (Argos zip file) as a Constant-scope property on the entity type. It affects:

  • How data is stored: the repository JSON creates one Experiment_rawData document per type (if false) or per device (if true)
  • How data is queried: when StoreDataPerDevice=false, the engine loads the single file and filters by deviceName; when true, it loads the specific device's file directly
  • CLI usage: when using hera-experiment data, pass --perDevice True if the entity type stores data per device
# StoreDataPerDevice=false (default): one file, filter by device name
df = trial.getData(deviceType="Sonic", deviceName="sonic01")
# Loads Sonic.parquet, filters to sonic01 rows

# StoreDataPerDevice=true: separate files per device
df = trial.getData(deviceType="PID", deviceName="PID_01")
# Loads PID_01.parquet directly

Listing experiments

home = toolkitHome.getToolkit(toolkitHome.EXPERIMENT, projectName="MY_PROJECT")

# List experiment names
home.keys()
# ['IMS_experiment', 'Haifa2014']

# Get a map of experiment names → datasource documents
home.getExperimentsMap()

# Get a formatted table of all experiments
home.getExperimentsTable()

Loading an experiment

exp = home.getExperiment("Haifa2014")

# Experiment properties
print(exp.name)                    # 'Haifa2014'
print(exp.configuration)           # full config dict
print(exp.defaultTrialSet)         # name of the default trial set

# Available trial sets and device types
print(list(exp.trialSet.keys()))   # ['Measurements', 'Calibration']
print(list(exp.entityType.keys())) # ['Sonic', 'TRH', 'PID']

Accessing trial data

Trials are time-bounded segments of an experiment. Each trial has TrialStart and TrialEnd properties that are used automatically when you call getData() without specifying a time range.

# Navigate: experiment → trial set → trial
trial = exp.trialSet["Measurements"]["Trial_01"]

# Get all Sonic data for this trial
df = trial.getData(deviceType="Sonic")

# Get data for a specific device
df = trial.getData(deviceType="Sonic", deviceName="S01")

# Get data with device metadata merged in
df = trial.getData(deviceType="Sonic", withMetadata=True)

# Override time range
df = trial.getData(
    deviceType="TRH",
    startTime="2024-03-15 08:00",
    endTime="2024-03-15 12:00"
)

Shortcut: default trial set

# Access trials from the default trial set directly
trials = exp.trialsOfDefaultTrialSet

Accessing device data

Entity types (device types) and entities (individual devices) also provide data access:

# All data for a device type
sonic_type = exp.entityType["Sonic"]
df_all = sonic_type.getData()

# Data for a device type during a specific trial
df_trial = sonic_type.getDataTrial(trialSetName="Measurements", trialName="Trial_01")

# Data for a single device
device = sonic_type["S01"]
df_device = device.getData()

# With time filtering
df_device = device.getData(startTime="2024-03-15 08:00", endTime="2024-03-15 12:00")

Time-range queries

For queries not tied to a specific trial, use getDataFromDateRange on the experiment:

df = exp.getDataFromDateRange(
    deviceType="TRH",
    startTime="2024-03-15 00:00",
    endTime="2024-03-16 00:00",
    deviceName="TRH_North",    # optional — all devices if omitted
    withMetadata=True           # merge device metadata
)

Direct data engine access

For advanced use, access the data engine directly:

engine = exp.getExperimentData()

# Parquet engine: lazy Dask DataFrame
dask_df = engine.getData(deviceType="Sonic", autoCompute=False)
pandas_df = dask_df.compute()

# Or compute immediately
pandas_df = engine.getData(deviceType="Sonic", autoCompute=True)

# Per-device organization
df = engine.getData(deviceType="Sonic", perDevice=True)

Analysis

The analysis layer provides methods for device diagnostics, metadata enrichment, and turbulence calculations.

analysis = exp.analysis

Device locations

locations = analysis.getDeviceLocations(
    entityTypeName="Sonic",
    trialName="Trial_01",
    trialSetName="Measurements"   # uses default trial set if omitted
)
# Returns DataFrame with device positions and metadata

Transmission frequency

Analyze how reliably each device transmitted data during a trial:

freq = analysis.getDeviceTypeTransmissionFrequencyOfTrial(
    deviceType="Sonic",
    trialName="Trial_01",
    trialSetName="Measurements",   # uses default trial set if omitted
    samplingWindow="1min",         # time bin size (default: "1min")
    normalize=True,                # normalize to planned message rate
    completeTimeSeries=True,       # fill gaps with zeros
    completeDevices=True,          # include non-transmitting devices
    wideFormat=True,               # pivot table (devices × time)
    recalculate=False              # use cached result if available
)

When normalize=True, values represent fraction of expected messages (1.0 = perfect). Results are cached in the data layer — set recalculate=True to force recomputation.
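
The normalisation amounts to counting messages per time bin and dividing by the planned count for that bin. The following is a generic pandas sketch of that idea, not the Hera implementation; the timestamps and the 10 s reporting interval are made up.

```python
# Generic sketch (not the Hera implementation): messages counted per window,
# divided by the number the device was planned to send in that window.
import pandas as pd

# A device planned to report every 10 s, with a gap in the second minute
times = pd.to_datetime([
    "2024-03-15 08:00:00", "2024-03-15 08:00:10", "2024-03-15 08:00:20",
    "2024-03-15 08:00:30", "2024-03-15 08:00:40", "2024-03-15 08:00:50",
    "2024-03-15 08:01:00", "2024-03-15 08:01:30",
])
messages = pd.Series(1, index=times)

planned_per_window = pd.Timedelta("1min") / pd.Timedelta("10s")   # 6.0
freq = messages.resample("1min").count() / planned_per_window
# First minute: 6/6 = 1.0 (perfect); second minute: 2/6 ≈ 0.33 (degraded)
```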

Planned message count

expected = analysis.getDeviceTypePlannedMessageCount(
    deviceType="Sonic",
    samplingWindow="1min"
)
# Returns float: expected messages per window

Adding metadata to data

# Merge device metadata (location, properties) into a DataFrame
df_with_meta = analysis.addMetadata(
    dataset=raw_df,
    trialName="Trial_01",
    trialSetName="Measurements"
)

# Add time-from-start and time-from-release columns
df_enriched = analysis.addTrialProperties(
    data=df_with_meta,
    trialName="Trial_01",
    trialSetName="Measurements"
)
# Adds columns: fromStart, fromRelease, fromStartSeconds, fromReleaseSeconds
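
In spirit, the timing columns are simple offsets from the trial start. Here is a generic pandas sketch of the kind of columns addTrialProperties produces, not the Hera implementation; fromRelease is analogous, relative to a release timestamp, and the data below is made up.

```python
# Generic sketch (not the Hera implementation): elapsed time since TrialStart,
# as a timedelta column and in seconds.
import pandas as pd

trial_start = pd.Timestamp("2024-03-15 08:00:00")
df = pd.DataFrame(
    {"wind_speed": [1.2, 1.4]},
    index=pd.to_datetime(["2024-03-15 08:00:30", "2024-03-15 08:01:00"]),
)

df["fromStart"] = df.index - trial_start
df["fromStartSeconds"] = df["fromStart"].dt.total_seconds()
```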

Turbulence statistics

For sonic anemometer data:

stats = analysis.getTurbulenceStatistics(
    sonicData=sonic_df,
    samplingWindow="30min",
    height=10   # measurement height in meters
)
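
The core idea behind such statistics is Reynolds decomposition: split the wind component into a window mean and fluctuations, then average products of fluctuations over the sampling window. A minimal pandas sketch of that idea (not the Hera implementation; the synthetic 1 Hz data and 30 min windows are made up):

```python
# Sketch of Reynolds decomposition per sampling window (not the Hera code).
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
idx = pd.date_range("2024-03-15 08:00", periods=3600, freq="1s")  # one hour at 1 Hz
sonic = pd.DataFrame({"u": 5.0 + rng.normal(0.0, 0.8, len(idx))}, index=idx)

window_mean = sonic["u"].groupby(pd.Grouper(freq="30min")).transform("mean")  # ū per window
fluctuation = sonic["u"] - window_mean                                        # u' = u − ū
uu = (fluctuation ** 2).resample("30min").mean()                              # <u'u'> per window
```

With a fluctuation standard deviation of 0.8 m/s, each 30 min variance comes out near 0.64 m²/s².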

Presentation

The presentation layer provides visualizations for experiment setup, device diagnostics, and reporting.

pres = exp.presentation

# Control figure saving
pres.saveFigures = True
pres.savePath = "/path/to/output"

Experiment site image

# Plot an experiment site image with coordinate grid
ax = pres.plotImage(
    imageName="site_overview",
    withGrid=True,
    majorLocator=10   # grid spacing
)

Device locations on map

# Plot devices on an experiment map image
fig, ax = pres.plotDevicesOnImage(
    trialSetName="Measurements",
    trialName="Trial_01",
    deviceType="Sonic",
    mapName="floor_plan"
)

# Plot devices in ITM coordinates
fig, ax = pres.plotDevices(
    trialSetName="Measurements",
    trialName="Trial_01",
    deviceType="Sonic",
    mapName="site_overview"
)

Device functionality heatmap

Visualize transmission health across devices and time. Color-codes each cell: red = no data, orange = partial, green = healthy.

ax, pivot_table = pres.plotDeviceTypeFunctionality(
    deviceType="Sonic",
    trialName="Trial_01",
    trialSetName="Measurements",
    samplingWindow="1min",
    equalSquares=False   # True for square cells
)

LaTeX report generation

Generate a PDF report with device maps and metadata tables:

pres.generateLatexTable(
    latex_template="report_template.tex",   # Jinja2 template
    folder_path="/path/to/output"
)

CLI reference

List experiments

hera-experiment list --projectName MY_PROJECT

Show experiment table

hera-experiment table --projectName MY_PROJECT

Retrieve data

hera-experiment data Haifa2014 Sonic --projectName MY_PROJECT
hera-experiment data Haifa2014 TRH --deviceName TRH_North --perDevice True

Create a new experiment

Scaffolds a complete experiment directory with boilerplate code, config files, and a repository JSON:

hera-experiment create my_experiment --path /path/to/experiments
hera-experiment create my_experiment --zip /path/to/argos_export.zip --relative

This creates:

my_experiment/
├── code/
│   └── my_experiment.py              # Experiment class (extends experimentSetupWithData)
├── data/                             # Place data files here
├── runtimeExperimentData/
│   ├── Datasources_Configurations.json
│   └── my_experiment.zip             # Argos metadata (if --zip provided)
└── my_experiment_repository.json     # Repository for loading into projects

The generated class provides hooks for custom analysis and presentation:

class my_experiment(experimentSetupWithData):
    def __init__(self, projectName, pathToExperiment, filesDirectory):
        super().__init__(projectName, pathToExperiment, filesDirectory)
        self._analysis = my_experimentAnalysis(self)
        self._presentation = my_experimentPresentation(self, self.analysis)

class my_experimentAnalysis(experimentAnalysis):
    pass  # Add custom analysis methods here

class my_experimentPresentation(experimentPresentation):
    pass  # Add custom presentation methods here

Load experiment into project

# Method 1: Register repository, then create/update project
hera-project repository add my_experiment/my_experiment_repository.json
hera-project project create MY_PROJECT
# or: hera-project project updateRepositories MY_PROJECT

# Method 2: Direct load
hera-experiment load --experiment /path/to/my_experiment MY_PROJECT

Data engine types

The experiment toolkit supports three data backends, selected at initialization:

Engine             Constant      Backend                          Returns
Parquet (default)  PARQUETHERA   Hera data layer + Parquet files  dask.DataFrame (lazy) or pandas.DataFrame
Pandas/MongoDB     PANDASDB      Direct MongoDB queries           pandas.DataFrame
Dask/MongoDB       DASKDB        MongoDB via Dask                 dask.DataFrame (lazy)

To use a non-default engine when loading an experiment programmatically:

from hera.measurements.experiment.dataEngine import PANDASDB

exp = experimentSetupWithData(
    projectName="MY_PROJECT",
    pathToExperiment="/path/to/experiment",
    dataType=PANDASDB
)

Complete example

from hera import toolkitHome

# Load experiment
# Tip: if you created the project with `hera-project project create`, you can omit projectName
home = toolkitHome.getToolkit(toolkitHome.EXPERIMENT, projectName="WindTunnel")
exp = home["march_2024"]

# Explore structure
print(f"Experiment: {exp.name}")
print(f"Trial sets: {list(exp.trialSet.keys())}")
print(f"Device types: {list(exp.entityType.keys())}")

# Get trial data with metadata
trial = exp.trialSet["Measurements"]["Release_01"]
df = trial.getData(deviceType="Sonic", withMetadata=True)

# Enrich with trial timing
df = exp.analysis.addTrialProperties(df, trialName="Release_01")
print(df[["deviceName", "fromReleaseSeconds", "wind_speed"]].head())

# Check device health
ax, freq = exp.presentation.plotDeviceTypeFunctionality(
    deviceType="Sonic",
    trialName="Release_01",
    samplingWindow="1min"
)

# Get device locations
locations = exp.analysis.getDeviceLocations(
    entityTypeName="Sonic",
    trialName="Release_01"
)

# Plot devices on site map
fig, ax = exp.presentation.plotDevicesOnImage(
    trialSetName="Measurements",
    trialName="Release_01",
    deviceType="Sonic",
    mapName="site_plan"
)