
limen.scalers

Fit train-only feature scaling and apply it consistently across validation and test data.

Canonical docs

What this package owns

This package owns the fitted scaler classes, the scaler registry, and the rule-based logic that decides how each column is transformed. It does not own raw feature creation or the experiment loop itself.

Key entry points

Entry point | Use it when | Notes
LinearScaler | You want rule-based scaling over mixed market-data columns | Exported at the package root
LogRegScaler | You want the standard scaler used by logistic-regression SFDs | Exported at the package root
RobustScaler | You want median and IQR scaling for outlier-heavy data | Exported at the package root
CausalRollingRobustScaler | You want robust scaling that adapts to drift, with no look-ahead | Exported at the package root
RankGaussScaler | You want rank-based Gaussianization | Exported at the package root
SCALER_REGISTRY | You want to resolve scalers by manifest parameter name | Used by set_scaler_from_params()
build_rules, inverse_transform | You need to customize or interpret LinearScaler behavior | Available from the module-level implementations
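
To make name-based resolution concrete, here is a minimal sketch of a generic registry pattern. It is not limen's implementation: DemoRobustScaler, DemoRankGaussScaler, DEMO_REGISTRY, resolve_scaler, and the "scaler" parameter key are all hypothetical stand-ins, and the real SCALER_REGISTRY keys and the set_scaler_from_params() signature may differ.

```python
from typing import Any, Dict, Type


class DemoRobustScaler:
    """Hypothetical stand-in; not limen's RobustScaler."""


class DemoRankGaussScaler:
    """Hypothetical stand-in; not limen's RankGaussScaler."""


# Hypothetical registry: parameter-name string -> scaler class.
DEMO_REGISTRY: Dict[str, Type[Any]] = {
    "robust": DemoRobustScaler,
    "rank_gauss": DemoRankGaussScaler,
}


def resolve_scaler(params: Dict[str, Any]) -> Any:
    """Look up the scaler named in params and return a new instance."""
    name = params["scaler"]  # assumed key, for illustration only
    try:
        scaler_cls = DEMO_REGISTRY[name]
    except KeyError:
        known = ", ".join(sorted(DEMO_REGISTRY))
        raise ValueError(
            f"unknown scaler {name!r}; expected one of: {known}"
        ) from None
    return scaler_cls()
```

Keeping the mapping in one dictionary means a manifest only has to carry a string, and an unknown name fails early with a message listing the valid choices.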

Adjacent modules

  • limen.experiment.Manifest is the main consumer of this package.
  • limen.transforms handles stateless transforms, which is a different stage from fitted scaling.
  • limen.features and limen.indicators produce the columns that scalers later operate on.

Quick orientation

scalers/
├── linear_scaler.py # LinearScaler, rule helpers, inverse transform
├── logreg_scaler.py # Logistic-regression tuned scaling
├── robust_scaler.py # Median and IQR scaling
├── causal_rolling_robust_scaler.py # Causal trailing-window median and IQR scaling
├── rank_gauss_scaler.py # Rank-based Gaussianization
└── registry.py # SCALER_REGISTRY
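
As a rough illustration of what "causal trailing-window median and IQR scaling" can mean, the sketch below scales each value using only observations that came strictly before it. This is an assumption-laden sketch, not the code in causal_rolling_robust_scaler.py: the function name, the choice to exclude the current point from its own window, the quartile method, and the use of None for undefined positions are all illustrative choices.

```python
from statistics import median, quantiles
from typing import List, Optional


def causal_rolling_robust(values: List[float], window: int) -> List[Optional[float]]:
    """Scale values[i] by the median and IQR of the `window` values before it.

    Only strictly earlier observations are used, so no output depends on
    future data. Positions without a full trailing window, or whose window
    has zero IQR, yield None.
    """
    if window < 2:
        raise ValueError("window must be at least 2")
    scaled: List[Optional[float]] = []
    for i, x in enumerate(values):
        past = values[max(0, i - window):i]  # excludes values[i] itself
        if len(past) < window:
            scaled.append(None)
            continue
        q1, _, q3 = quantiles(past, n=4, method="inclusive")
        iqr = q3 - q1
        scaled.append(None if iqr == 0 else (x - median(past)) / iqr)
    return scaled
```

Because each window ends before the current index, changing a later observation never alters an earlier output, which is the no-look-ahead property in practice.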

Things to know

  • Scalers are fit on the training split and then reused on validation and test splits. That fit/apply discipline is part of why this package stays separate from limen.transforms.
  • LinearScaler uses ordered regex rules. The first matching rule wins.
  • Unrecognized columns fall through to the catch-all none rule and stay unchanged.
  • Zero-variance columns are passed through unchanged rather than breaking the fit step, for example through a division by zero.
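
The points above can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not LinearScaler itself: the rule patterns, the "standard" and "none" transform names, the DemoRuleScaler class, the dict-of-lists data layout, and the use of population standard deviation are all hypothetical choices made for illustration.

```python
import re
from statistics import mean, pstdev
from typing import Dict, List, Tuple

Columns = Dict[str, List[float]]

# Hypothetical ordered rules: (regex, transform name). First match wins.
RULES: List[Tuple[str, str]] = [
    (r"^raw_", "none"),      # checked first, so it beats the rule below
    (r"_pct$", "standard"),
    (r".*", "none"),         # catch-all: leave the column unchanged
]


def pick_transform(column: str) -> str:
    """Return the transform of the first rule whose pattern matches."""
    for pattern, transform in RULES:
        if re.search(pattern, column):
            return transform
    return "none"


class DemoRuleScaler:
    """Hypothetical stand-in; not limen's LinearScaler."""

    def fit(self, train: Columns) -> "DemoRuleScaler":
        # Statistics come from the training split only.
        self.stats: Dict[str, Tuple[float, float]] = {}
        for name, values in train.items():
            if pick_transform(name) != "standard":
                continue
            std = pstdev(values)
            if std == 0:
                continue  # zero variance: pass the column through
            self.stats[name] = (mean(values), std)
        return self

    def transform(self, data: Columns) -> Columns:
        # Reuse the fitted statistics; nothing is re-estimated here.
        out: Columns = {}
        for name, values in data.items():
            if name in self.stats:
                mu, sigma = self.stats[name]
                out[name] = [(v - mu) / sigma for v in values]
            else:
                out[name] = list(values)
        return out


train = {
    "ret_pct": [1.0, 2.0, 3.0],
    "raw_ret_pct": [1.0, 2.0, 3.0],
    "flat_pct": [2.0, 2.0, 2.0],
    "hour": [9.0, 10.0, 11.0],
}
test = {"ret_pct": [4.0], "raw_ret_pct": [4.0], "flat_pct": [5.0], "hour": [12.0]}

scaler = DemoRuleScaler().fit(train)
scaled_test = scaler.transform(test)
```

In this toy run only ret_pct is rescaled on the test split, using the training mean and standard deviation. raw_ret_pct is claimed by the earlier rule, hour falls through to the catch-all, and flat_pct has zero variance in training, so all three come back unchanged.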