Anomaly detection for batch yield: a small model beats our heuristics

For the first two years of Platform development we used hand-written heuristics to flag batches trending toward poor yield. They worked, mostly. An experienced operator would recognise the same patterns by eye, and the rules codified that intuition. The problem was that intuition is a moving target, and rules need someone to update them. When we finally trained a small anomaly detection model on our historical batch data, it outperformed the heuristics in both recall and lead time. This post explains the approach and what we found.

Why we kept the model small

The temptation when a problem looks like a machine-learning problem is to reach for the largest available tool. We resisted that for two reasons. First, our dataset is narrow: a few thousand completed batches with logged sensor readings and final yield outcomes. A large model would overfit to noise we cannot yet separate from signal. Second, a model that a rearing technician can roughly understand is one they will trust, and one they will correct when it misfires.

The model we settled on is an isolation forest trained on six features:

Mean substrate moisture across days 3 to 7 of a cycle
Variance in daily weight gain during the same window
Feed conversion ratio at the midpoint weigh-in
Ambient temperature range (max minus min) over the preceding 48 hours
Larval density deviation from target stocking rate
Number of manual interventions logged against the tray
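
As a rough illustration of what fitting an isolation forest on these six features can look like, here is a minimal sketch using scikit-learn. The column names, the synthetic data, and the hyperparameters (n_estimators, contamination) are all assumptions made for the example. They are not our pipeline's schema or our production settings.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical feature names, in the same order as the list above.
FEATURES = [
    "mean_moisture_d3_7",
    "weight_gain_variance_d3_7",
    "fcr_midpoint",
    "ambient_temp_range_48h",
    "density_deviation",
    "manual_interventions",
]


def make_synthetic_batches(n, rng):
    """Stand-in for historical data: one row per batch, one column per feature.

    The distributions are invented so the example runs on its own.
    """
    return np.column_stack([
        rng.normal(65.0, 3.0, n),            # substrate moisture
        rng.normal(0.8, 0.15, n),            # variance in daily weight gain
        rng.normal(1.9, 0.2, n),             # feed conversion ratio
        rng.normal(4.0, 1.0, n),             # ambient temperature range
        rng.normal(0.0, 0.05, n),            # deviation from target density
        rng.poisson(1.0, n).astype(float),   # manual interventions
    ])


rng = np.random.default_rng(0)
X_train = make_synthetic_batches(2000, rng)

# contamination is the assumed share of anomalous batches; it sets the flag threshold.
model = IsolationForest(n_estimators=200, contamination=0.1, random_state=0)
model.fit(X_train)

typical_batch = [65.0, 0.8, 1.9, 4.0, 0.0, 1.0]
odd_batch = [48.0, 2.5, 3.2, 12.0, 0.4, 9.0]   # far off-pattern on every feature
X_new = np.array([typical_batch, odd_batch])

flags = model.predict(X_new)          # 1 = looks normal, -1 = anomalous
scores = model.score_samples(X_new)   # lower score = more anomalous

for name, flag, score in zip(["typical", "odd"], flags, scores):
    print(f"{name}: flag={flag}, score={score:.3f}")
```

An isolation forest splits on one feature at a time, so the features need no scaling before fitting, which keeps the pipeline easy to explain.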

A model you cannot explain to the person acting on its output is a model that will be ignored when it matters most.

What the comparison showed

We ran the model in shadow mode alongside the heuristics for 90 days, flagging batches without surfacing those flags to operators. At the end of the window we compared predictions to outcomes.

The heuristics caught 61% of failing batches, with a median lead time of 1.9 days before the yield shortfall became measurable. The model caught 78% of failing batches, with a median lead time of 3.4 days. The model's false positive rate was slightly higher: 14%, against 11% for the heuristics.
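
For readers who want to run the same kind of comparison, the sketch below shows one way to compute the three figures from shadow-mode records. The ShadowRecord type and the toy data are hypothetical. The definition of false positive rate used here (flagged healthy batches divided by all healthy batches) is an assumption for the example.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional


@dataclass
class ShadowRecord:
    """One batch observed in shadow mode (hypothetical record shape)."""
    failed: bool                       # batch ended with a yield shortfall
    flagged: bool                      # detector raised a flag during the cycle
    lead_days: Optional[float] = None  # days from flag to measurable shortfall


def summarise(records):
    """Return recall, false positive rate, and median lead time for one detector."""
    failing = [r for r in records if r.failed]
    healthy = [r for r in records if not r.failed]
    caught = [r for r in failing if r.flagged]
    false_alarms = [r for r in healthy if r.flagged]
    return {
        "recall": len(caught) / len(failing) if failing else None,
        # Assumed definition: share of healthy batches that were flagged anyway.
        "false_positive_rate": len(false_alarms) / len(healthy) if healthy else None,
        "median_lead_days": median(r.lead_days for r in caught) if caught else None,
    }


# Toy data, invented for the example: 4 failing batches and 10 healthy ones.
records = (
    [ShadowRecord(failed=True, flagged=True, lead_days=d) for d in (2.0, 3.0, 4.0)]
    + [ShadowRecord(failed=True, flagged=False)]
    + [ShadowRecord(failed=False, flagged=True)]
    + [ShadowRecord(failed=False, flagged=False) for _ in range(9)]
)

summary = summarise(records)
print(summary)
```

Running summarise once per detector over the same set of batches gives figures that can be compared directly.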

The extra 1.5 days of lead time is the operational prize. It is enough to adjust misting, add supplemental feed, or in the worst case, close out the batch early and reclaim resources.

What we have not solved

The model degrades when we introduce new substrate blends or change stocking density meaningfully, because those shifts move the feature distributions. We retrain quarterly, which is manageable, but it means the first few batches after a process change are scored by a model trained on the old process. Flagging process changes explicitly in the data pipeline is the next improvement on the list.
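
One possible shape for that explicit flag is sketched below, assuming each batch carries a record of the substrate blend and target stocking density it ran under. The ProcessConfig type, the field names, the density unit, and the 10% tolerance are all hypothetical. We have not built this yet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessConfig:
    """Process settings a batch ran under (hypothetical fields)."""
    substrate_blend: str
    target_density: float  # target stocking rate, in whatever unit the pipeline logs


def is_stale(trained_on, current, density_tolerance=0.10):
    """Return True if the model's training process differs meaningfully from the batch's.

    A different blend always counts as a change. A density change counts only
    when it exceeds density_tolerance as a fraction of the training-time target.
    """
    if current.substrate_blend != trained_on.substrate_blend:
        return True
    relative_shift = (
        abs(current.target_density - trained_on.target_density)
        / trained_on.target_density
    )
    return relative_shift > density_tolerance


trained_on = ProcessConfig(substrate_blend="blend-A", target_density=1000.0)

same_process = ProcessConfig("blend-A", 1000.0)
new_blend = ProcessConfig("blend-B", 1000.0)
small_density_change = ProcessConfig("blend-A", 1050.0)
large_density_change = ProcessConfig("blend-A", 1200.0)

for cfg in (same_process, new_blend, small_density_change, large_density_change):
    print(cfg, "stale" if is_stale(trained_on, cfg) else "ok")
```

A flag like this would not fix the stale model, but it would let the pipeline mark affected predictions as lower confidence until the next retrain.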