Picture a predictive-maintenance system trained on vibration and thermal sensor streams from industrial cells. Drift is not an academic topic when a false negative means a motor seizes at 2 a.m.
Four drift scenarios you will see
- Sensor drift. A transducer's baseline shifts by 3% per year. Silent until you look.
- Regime shift. The factory starts a new product line; the physics of the measurement change.
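- Covariate shift. Temperature in the hall changed because the HVAC was upgraded. The input distribution moves while the relationship between inputs and failures stays the same.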
- Label shift. Maintenance policies changed; what used to be called a “minor anomaly” is now “urgent”, so the same readings now map to different labels.
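To see why sensor drift stays silent, it helps to put numbers on the 3% per year figure. The sketch below assumes the baseline drifts linearly (real transducers may not), and the 1% tolerance is a made-up value for illustration; `days_until_offset` is a hypothetical helper, not part of any library.

```python
def days_until_offset(rate_per_year: float, tolerance: float) -> float:
    """Days until a linearly drifting baseline has moved by `tolerance`.

    Both arguments are fractions of the original baseline (0.03 == 3%).
    Linear drift is an assumption; real transducers may drift unevenly.
    """
    if rate_per_year <= 0:
        raise ValueError("rate_per_year must be positive")
    return 365.0 * tolerance / rate_per_year


daily_shift = 0.03 / 365.0             # under 0.01% of baseline per day
days = days_until_offset(0.03, 0.01)   # hypothetical 1% tolerance
print(f"daily shift: {daily_shift:.6%}, days to 1% offset: {days:.0f}")
```

A per-day change that small disappears into ordinary sensor noise, yet under these assumptions it adds up to a full percent in roughly four months. That is why the comparison has to be against a fixed reference window rather than against yesterday.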
The detection stack
- ADWIN (adaptive windowing) on the model residuals. First line of defence; reliable and cheap.
- Page-Hinkley on the prediction confidence distribution.
- Population stability index on feature distributions.
- A second, slower model trained on the last 30 days of data, whose predictions are compared with the main model's.
- A human-in-the-loop review when any two of the four detectors above agree.
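Two of these detectors are small enough to sketch in plain Python, along with the two-of-four rule. This is a minimal illustration, not the production stack: the class and function names, the `delta` and `threshold` defaults, the equal-width binning, and the detector keys in the vote are all assumptions. ADWIN and the 30-day comparison model are left out because they need more machinery than fits here.

```python
import math
from dataclasses import dataclass


@dataclass
class PageHinkley:
    """One-sided Page-Hinkley test for an upward shift in a stream's mean.

    To watch for falling confidence, feed it (1 - confidence).
    """
    delta: float = 0.005     # tolerated change magnitude (assumed value)
    threshold: float = 5.0   # alarm threshold (assumed value)
    n: int = 0
    mean: float = 0.0
    cum: float = 0.0         # cumulative deviation from the running mean
    cum_min: float = 0.0     # smallest cumulative deviation seen so far

    def update(self, x: float) -> bool:
        """Feed one observation; return True when a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


def psi(expected: list[float], actual: list[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population stability index of `actual` against the `expected` reference.

    Bins are equal-width over the reference range; values outside that
    range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def fractions(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def needs_human_review(alarms: dict[str, bool], quorum: int = 2) -> bool:
    """True when at least `quorum` detectors are currently alarming."""
    return sum(alarms.values()) >= quorum


# Synthetic stream: flat at 0.0, then a step up to 1.0 at index 200.
ph = PageHinkley()
stream = [0.0] * 200 + [1.0] * 20
first_alarm = next((i for i, x in enumerate(stream) if ph.update(x)), None)

reference = [i / 100 for i in range(100)]
shifted = [0.5 + i / 200 for i in range(100)]
alarms = {
    "adwin": False,                            # placeholder, not computed here
    "page_hinkley": first_alarm is not None,
    "psi": psi(reference, shifted) > 0.2,      # 0.2 is a rule-of-thumb cutoff
    "shadow_model": False,                     # placeholder, not computed here
}
print(first_alarm, needs_human_review(alarms))
```

Requiring a quorum rather than acting on any single alarm trades some detection delay for fewer false pages, which fits the idea that a detector firing is a prompt to investigate rather than a conclusion.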
The retraining protocol
Drift is a signal, not a verdict. When a detector fires, we don't retrain blindly. We:
- Freeze the current model version.
- Generate a drift report: which features, which segments, suspected cause.
- Check whether the drift is explained by a known operational change.
- Only then trigger a shadow retrain and an A/B comparison against the frozen model.
- Promote only if the new model beats the old on the full validation suite, not just on drifted segments.
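The final promotion rule can be written down as a small gate. The sketch below assumes each model is summarised as a dictionary of "higher is better" scores with a `full_suite` entry; the key names and the numbers are invented for illustration.

```python
def should_promote(candidate: dict[str, float], frozen: dict[str, float],
                   full_key: str = "full_suite") -> bool:
    """Return True only if the candidate beats the frozen model on the full suite.

    Scores are assumed to be "higher is better". Per-segment scores may be
    present for the drift report, but they never decide promotion alone.
    """
    return candidate[full_key] > frozen[full_key]


# Made-up scores: the candidate fixes the drifted segment but regresses overall.
frozen = {"full_suite": 0.91, "drifted_cell_7": 0.62}
candidate = {"full_suite": 0.89, "drifted_cell_7": 0.80}
print(should_promote(candidate, frozen))
```

Here the candidate is rejected despite a large gain on the drifted segment, which is exactly the case the rule exists to catch: a retrain that repairs one cell while quietly degrading the rest.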
Boring? Yes. But the operators stop getting surprise failures, and that's the metric that matters.