
Trainer

Trainer is Limen's promotion layer for finished experiment rounds. It takes a completed artifact-rich experiment directory, reconstructs the manifest and round parameters, validates selected permutations, and retrains them into reusable Sensor objects.

This is the bridge between "a good row in an experiment" and "a trained model object I can carry forward."

What Trainer Needs

Trainer works only with experiments that were run through the artifact-rich UEL path and wrote an experiment_dir.

At minimum, the directory must contain:

  • metadata.json
  • round_data.jsonl

If results.csv is also present, Trainer performs Pass 1 validation against the original metrics. If it is missing, Trainer skips validation and proceeds directly to retraining.
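As a quick pre-flight check before constructing a Trainer, the file requirements above can be sketched as follows. The helper name inspect_experiment_dir is hypothetical and not part of Limen; it only mirrors the rules stated in this section.

```python
from pathlib import Path

# File names taken from the requirements above.
REQUIRED_FILES = ("metadata.json", "round_data.jsonl")


def inspect_experiment_dir(experiment_dir):
    """Hypothetical helper: return (missing_required, can_validate).

    missing_required lists required files that are absent.
    can_validate is True when results.csv exists, i.e. when Pass 1
    validation would have original metrics to compare against.
    """
    root = Path(experiment_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    can_validate = (root / "results.csv").is_file()
    return missing, can_validate
```

An empty missing list with can_validate set to False corresponds to the case where Trainer skips validation and goes straight to retraining.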

Prerequisites

  • the experiment must have been created with experiment_dir=...
  • the SFD must be manifest-driven
  • the experiment directory must be trusted

The trust warning matters because Trainer imports the SFD module path stored in metadata.json and executes its top-level code.

SFD Module Resolution

Trainer.__init__ reads metadata.json["sfd_module"] and resolves it in two stages:

  1. Experiment-local file — if <experiment_dir>/<sfd_module>.py exists, it is loaded via importlib.util.spec_from_file_location + module_from_spec. The SFD module is not added to sys.path and is not registered in sys.modules under its bare name, so each Trainer construction loads the SFD freshly without polluting global import state. This is the path used by self-contained experiment bundles produced by tools like Praxis's trainer_prep.py.
  2. importlib.import_module fallback — if no experiment-local file is present, the name is passed to importlib.import_module and resolved against sys.path like any other Python import. This is the legacy path used by experiments referencing built-in SFDs by fully-qualified package path (e.g. limen.sfd.foundational_sfd.logreg_binary).
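The two stages above can be approximated with the standard library alone. This is a minimal sketch, not Trainer's actual code: load_sfd_module is a hypothetical name, and it assumes the module name has already been validated.

```python
import importlib
import importlib.util
from pathlib import Path


def load_sfd_module(experiment_dir, sfd_module):
    """Hypothetical sketch of the two-stage lookup described above."""
    local_file = Path(experiment_dir) / f"{sfd_module}.py"
    if local_file.is_file():
        # Stage 1: load the experiment-local file directly.
        # Nothing is added to sys.path or registered in sys.modules.
        spec = importlib.util.spec_from_file_location(sfd_module, str(local_file))
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # executes the file's top-level code
        return module
    # Stage 2: ordinary import resolved against sys.path.
    return importlib.import_module(sfd_module)
```

Because exec_module runs the file's top-level code, this sketch carries the same trust requirement as Trainer itself.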

Names are validated up-front: sfd_module must be a dotted sequence of valid Python identifiers (name.split('.') and str.isidentifier() per segment). Anything else — .., /, \, leading/trailing dots, empty segments — raises ValueError before either branch runs, so a malicious metadata.json cannot path-traverse out of experiment_dir via the local-file branch.
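The validation rule described above fits in a few lines. The function name validate_sfd_module_name and the error message are illustrative assumptions; only the split-and-isidentifier check comes from the description.

```python
def validate_sfd_module_name(name):
    """Illustrative check: every dot-separated segment must be an identifier."""
    segments = name.split(".")
    # An empty segment (from "", "a..b", ".a", "a.") fails isidentifier(),
    # as does any segment containing "/" or "\\".
    if not all(segment.isidentifier() for segment in segments):
        raise ValueError(f"invalid sfd_module name: {name!r}")
    return name
```

For example, validate_sfd_module_name('limen.sfd.foundational_sfd.logreg_binary') returns the name unchanged, while '../evil', 'a/b', and '.hidden' each raise ValueError.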

The import_module fallback still trusts any module name that resolves on sys.path, including arbitrary site-packages modules. That residual surface is documented as TD-001 in docs/TechnicalDebt.md and should be tightened (e.g. allowlisted) before any live-trading deploy where the upstream bundle pipeline is not under the same trust boundary as the deploy operator.

Typical Workflow

Start from an existing experiment directory:

import limen

trainer = limen.Trainer(
    experiment_dir='path/to/experiment',
    data=my_data,  # optional override
)

sensors = trainer.train([0, 7, 19])
sensor = sensors[0]

result = sensor.predict({'x_test': live_features})

In this workflow:

  • Trainer reconstructs the manifest and round metadata
  • train([ ... ]) validates and retrains the selected rounds
  • each returned Sensor wraps one trained ReferenceModel

Why Trainer Exists

Experiment runs are usually done on train/validation/test splits. Promotion is different: once a round is selected, you usually want to retrain it on all available data before carrying it downstream.

Trainer handles that transition cleanly by:

  • reconstructing the original experiment logic
  • validating that the pipeline still reproduces the logged round
  • retraining on all data with split_config=(1, 0, 0)

Two-Pass Training

Pass 1: Validation

Trainer reruns:

  • manifest.prepare_data(...)
  • manifest.run_model(...)

with the original round parameters and compares the resulting metrics against the original experiment log.

If the model is deterministic, validation expects the rerun metrics to match the logged ones to within a very small float tolerance. If the model is stochastic, Trainer uses a looser scaled tolerance.

If validation fails, Trainer raises ReconstructionError.

from limen import ReconstructionError

try:
    sensors = trainer.train([42])
except ReconstructionError as e:
    print(e)

This is Limen's guard against pipeline drift.

Pass 2: Retraining

After validation, Trainer deep-copies the manifest with:

split_config=(1, 0, 0)

and retrains the resolved ReferenceModel on the full dataset.

That trained model is then wrapped in a Sensor.
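The deep-copy step can be illustrated with a stand-in object. ToyManifest below is hypothetical and much simpler than Limen's real manifest, and its default split values are made up for the example; the point is only that the original manifest is left untouched while the copy gets split_config=(1, 0, 0).

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ToyManifest:
    """Hypothetical stand-in; not Limen's manifest class."""
    split_config: tuple = (8, 1, 1)  # illustrative train/val/test split
    steps: list = field(default_factory=list)


def full_data_copy(manifest):
    """Return an independent copy configured to train on all data."""
    retrain_manifest = copy.deepcopy(manifest)
    retrain_manifest.split_config = (1, 0, 0)
    return retrain_manifest
```

Using a deep copy rather than mutating in place means the manifest used for Pass 1 keeps its original split.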

Deterministic Vs Stochastic Models

Trainer uses the model class's deterministic attribute to choose the validation tolerance.

| Model type | Validation style |
| --- | --- |
| deterministic = True | near-exact metric match |
| deterministic = False | scaled tolerance for expected randomness |

This is why promotion is more reliable for deterministic reference models than for intentionally stochastic ones.
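A tolerance switch of this kind might look like the sketch below. The function name, the two tolerance values, and the use of math.isclose are all assumptions made for illustration; Trainer's actual thresholds and scaling rule are not documented here.

```python
import math


def find_metric_mismatches(original, rerun, deterministic,
                           exact_tol=1e-9, stochastic_tol=0.05):
    """Hypothetical comparison: return the metric names that drifted.

    exact_tol and stochastic_tol are illustrative values, not Limen's.
    """
    tol = exact_tol if deterministic else stochastic_tol
    mismatches = []
    for name, logged_value in original.items():
        if name not in rerun or not math.isclose(
            logged_value, rerun[name], rel_tol=tol, abs_tol=tol
        ):
            mismatches.append(name)
    return mismatches
```

With a logged auc of 0.70 and a rerun auc of 0.71, this sketch reports a mismatch when deterministic is True and none when it is False.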

Trainer And The Reference-Architecture Contract

Trainer does not promote arbitrary model objects. It resolves exactly one ReferenceModel subclass from the original model module and uses that class for retraining.

That means the promotion stack depends on the Reference Architecture contract:

  • train(data, **params)
  • predict(data)
  • evaluate(data, inline_metrics=True)
  • deterministic
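To make the shape of that contract concrete, here is a toy class with the same four members. It is not a ReferenceModel subclass and does not import Limen; the y_train and y_test keys and the return shapes are assumptions chosen for the example.

```python
class MajorityClassModel:
    """Toy model shaped like the contract above; purely illustrative."""

    deterministic = True  # same inputs always give the same metrics

    def train(self, data, **params):
        labels = list(data["y_train"])  # assumed key name
        self.majority_ = max(set(labels), key=labels.count)
        return self

    def predict(self, data):
        # Predict the majority label for every row of x_test.
        return {"_preds": [self.majority_ for _ in data["x_test"]]}

    def evaluate(self, data, inline_metrics=True):
        result = self.predict(data)
        if inline_metrics:
            truth = list(data["y_test"])  # assumed key name
            hits = sum(p == t for p, t in zip(result["_preds"], truth))
            result["accuracy"] = hits / len(truth)
        return result
```

A real reference model would follow the Reference Architecture page for the exact signatures and payloads.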

On a live local logreg_binary promotion run in this repo:

  • Pass 1 validation completed with validation_mismatches == []
  • the promoted Sensor.results included task metrics plus backtest_* keys
  • Sensor.predict() returned _preds and _probs
  • the promoted sensor produced predictions for 884 test bars

On a live local random_binary promotion run in this repo, Trainer raised ReconstructionError because the stochastic rerun did not reproduce the original logged metrics closely enough.

This is expected behavior: Pass 1 is meant to reject a stochastic round whose rerun cannot reproduce the logged metrics within tolerance.

Trainer(experiment_dir, data=None)

Arguments

| Argument | Meaning |
| --- | --- |
| experiment_dir | path to the completed experiment directory |
| data | optional dataframe override; if omitted, Trainer fetches data from the reconstructed manifest |

Use data= when you already have the exact dataframe you want Trainer to use. Otherwise Trainer falls back to manifest.fetch_data().

train(permutation_ids)

sensors = trainer.train([0, 1, 2])

This method:

  • verifies that the requested permutation ids exist in round_data.jsonl
  • validates them against results.csv when available
  • retrains them on all data
  • returns list[Sensor]

Raises:

  • ValueError if a permutation id is missing
  • ReconstructionError if validation detects metric drift

Sensor

A Sensor is the promoted form of a trained round.

Each sensor stores:

  • permutation_id
  • model
  • round_params
  • metadata
  • results

Example

sensor = sensors[0]

print(sensor.permutation_id)
print(sensor.round_params)
print(sensor.metadata['sfd_module'])

pred = sensor.predict({'x_test': live_features})

Sensors are also callable:

pred = sensor({'x_test': live_features})

What predict() expects

Most reference models only need:

  • x_test

Some models may require more. The requirement comes from the underlying model class, not from the Sensor wrapper itself.
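One way to picture the relationship between the wrapper and the model is a thin delegating class. ToySensor and ConstantModel below are hypothetical stand-ins, not Limen's Sensor; they only show that both call styles end up in the model's own predict, which is where input requirements are enforced.

```python
class ConstantModel:
    """Hypothetical model that needs only x_test."""

    def predict(self, data):
        return {"_preds": [0 for _ in data["x_test"]]}


class ToySensor:
    """Hypothetical wrapper; the real Sensor also stores more fields."""

    def __init__(self, permutation_id, model):
        self.permutation_id = permutation_id
        self.model = model

    def predict(self, data):
        # The wrapper adds no input checks of its own.
        return self.model.predict(data)

    def __call__(self, data):
        # Calling the sensor is shorthand for predict().
        return self.predict(data)
```

If the wrapped model needed extra keys, passing only x_test would fail inside the model, not inside the wrapper.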

What results contains

Sensor.results comes from the Pass 1 evaluation result, not from a stripped-down inference-only payload.

In a live local logreg promotion run in this repo, the stored keys included:

  • _preds
  • accuracy
  • auc
  • backtest_edge_per_signal_bps_p50
  • backtest_trade_pnl_net_bps_p50
  • backtest_cvar_95_return_bps

When the promoted round used calibration, results also includes:

  • optimal_threshold — the threshold chosen during the validation pass
  • val_score — the metric score at that threshold

That is why Sensor.results is useful for provenance and review, while Sensor.predict() is the smaller live inference surface.

What Trainer Reads From Disk

Trainer uses:

  • metadata.json to discover the SFD module and experiment metadata
  • round_data.jsonl to load round_params, stored predictions, and alignment metadata
  • results.csv when available for Pass 1 metric validation

On a live local artifact-rich run in this repo, metadata.json contained:

  • sfd_module
  • limen_version
  • created_at

and round_data.jsonl contained entries with:

  • round_id
  • round_params
  • preds
  • alignment
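Reading such a file and checking requested ids needs only the standard library. The sketch below assumes that a permutation id passed to train() corresponds to an entry's round_id, which this page implies but does not state outright; the function names are hypothetical.

```python
import json
from pathlib import Path


def load_rounds(round_data_path):
    """Index JSON Lines entries by round_id (one JSON object per line)."""
    rounds = {}
    with Path(round_data_path).open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():  # tolerate blank lines
                entry = json.loads(line)
                rounds[entry["round_id"]] = entry
    return rounds


def require_permutations(rounds, permutation_ids):
    """Return the requested entries, or raise ValueError if any is missing."""
    missing = [pid for pid in permutation_ids if pid not in rounds]
    if missing:
        raise ValueError(f"permutation ids not found: {missing}")
    return [rounds[pid] for pid in permutation_ids]
```

This mirrors the ValueError that train() raises for an unknown permutation id, without reproducing Trainer's internals.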

Scope Note

Trainer depends on the artifact-rich UEL path, which in turn depends on a concrete SearchStrategy. Limen ships built-in strategies (GridStrategy, RandomStrategy) and the SearchStrategy abstraction for writing your own.

So the clean mental model is:

  • UEL artifact-rich runs create the promotion-ready experiment directory
  • Trainer turns selected rounds from that directory into sensors

  • Continue to Reference Architecture for the class-based model contract that Trainer reconstructs and retrains.
  • Continue to Cohort if you want to bind selected sensors into an ensemble inference surface.
  • Continue to Universal Experiment Loop if you need the run layer that produces experiment_dir.