Overview
This was the workbench, not the product. The question was simple to ask and annoying to answer: which statistical measures actually distinguish a meaningful absence pattern from random noise? Rather than guess, the approach was to generate synthetic attendance data with known patterns baked in, then throw candidate functions at it and see which ones reliably told consecutive absences apart from scattered ones.
It's the prototyping layer that fed the Student Absence Anomaly Detection tool: the place where the math got argued with before it got shipped.
Background
The honest reason this exists as its own entry: most of the value of the absence work was in the experimentation, not the final detector. Using AI as a fast collaborator made it cheap to test a hypothesis, look at the result, and throw the idea out within minutes, which is exactly what you want when you're hunting for the right statistical signal and have no idea up front which one it'll be.
Synthetic datasets mattered here because real attendance data is messy and privacy-laden. Generating data with a known ground truth meant you could actually measure whether a function was picking up the pattern you planted or just hallucinating one.
How It Works
The loop: build a synthetic dataset with a deliberate pattern, compute a candidate measure across it, check whether the measure separated the planted signal from the control, keep or discard. Iterating this way surfaced the period-weighting, non-consecutivity, and spread measures that later anchored the production detector.
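A minimal sketch of that loop, in Python with only the standard library. Everything named here is an assumption made for illustration: the constants, the function names, the three candidate measures, and the pairwise separation score are not the project's actual code, and the `spread` function below is a stand-in rather than the production spread measure. One candidate (`total_absences`) is deliberately useless, to show what "discard" looks like.

```python
import random

DAYS = 60          # school days per synthetic student (assumed)
N_ABSENCES = 6     # absences per student, identical in both groups (assumed)
N_STUDENTS = 200   # students per group (assumed)
KEEP_MARGIN = 0.3  # how far from 0.5 a score must be to keep a measure (assumed)


def consecutive_student(rng):
    """Planted pattern: all absences fall in one unbroken block."""
    start = rng.randrange(DAYS - N_ABSENCES + 1)
    return [start <= d < start + N_ABSENCES for d in range(DAYS)]


def scattered_student(rng):
    """Control: the same number of absences on randomly chosen days."""
    absent = set(rng.sample(range(DAYS), N_ABSENCES))
    return [d in absent for d in range(DAYS)]


def longest_run(days):
    """Candidate: length of the longest streak of absences."""
    best = current = 0
    for absent in days:
        current = current + 1 if absent else 0
        best = max(best, current)
    return best


def spread(days):
    """Candidate: number of days spanned from first to last absence."""
    idx = [i for i, absent in enumerate(days) if absent]
    return idx[-1] - idx[0] + 1 if idx else 0


def total_absences(days):
    """Deliberately useless candidate: both groups have the same count."""
    return sum(days)


def separation(measure, planted, control):
    """Chance that a random planted student outscores a random control
    student, with ties counted as half. 0.5 means no signal; values near
    0 or 1 mean the measure separates the groups in one direction."""
    p_scores = [measure(s) for s in planted]
    c_scores = [measure(s) for s in control]
    wins = sum(
        1.0 if p > c else 0.5 if p == c else 0.0
        for p in p_scores
        for c in c_scores
    )
    return wins / (len(p_scores) * len(c_scores))


def evaluate(seed=0):
    """Run every candidate against one synthetic dataset; keep or discard."""
    rng = random.Random(seed)
    planted = [consecutive_student(rng) for _ in range(N_STUDENTS)]
    control = [scattered_student(rng) for _ in range(N_STUDENTS)]
    results = {}
    for measure in (longest_run, spread, total_absences):
        score = separation(measure, planted, control)
        results[measure.__name__] = (score, abs(score - 0.5) >= KEEP_MARGIN)
    return results


if __name__ == "__main__":
    for name, (score, keep) in evaluate().items():
        print(f"{name:15s} separation={score:.3f} {'keep' if keep else 'discard'}")
```

Holding the absence count equal across both groups is the point of the design: any measure that still separates them is responding to the arrangement of absences, not the volume.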
An attendance reformatter sat alongside this work as the practical data-prep step: a browser-based tool (JavaScript, PapaParse, Lodash) that turns raw one-row-per-period exports into the consolidated one-row-per-student-day format the analysis expects.
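The reshaping itself is simple enough to sketch. The version below is Python with the standard library, not the tool's JavaScript, and it is only an illustration of the transformation: the column names (`student_id`, `date`, `period`, `status`), the `period_N` output columns, and the `consolidate` function are all assumptions, since the real export layout isn't described here.

```python
import csv
import io
from collections import defaultdict


def consolidate(raw_csv: str) -> str:
    """Collapse one-row-per-period records into one row per student-day.

    Assumes input columns student_id, date, period, status (hypothetical).
    """
    reader = csv.DictReader(io.StringIO(raw_csv))
    grouped = defaultdict(dict)  # (student_id, date) -> {period: status}
    periods = set()
    for row in reader:
        key = (row["student_id"], row["date"])
        grouped[key][row["period"]] = row["status"]
        periods.add(row["period"])

    # Shorter labels first, then alphabetical, so "2" sorts before "10".
    period_cols = sorted(periods, key=lambda p: (len(p), p))

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["student_id", "date"] + [f"period_{p}" for p in period_cols])
    for (student_id, date), statuses in sorted(grouped.items()):
        # Periods with no record for that day are left blank.
        writer.writerow([student_id, date] + [statuses.get(p, "") for p in period_cols])
    return out.getvalue()


if __name__ == "__main__":
    sample = (
        "student_id,date,period,status\n"
        "S1,2025-06-02,1,A\n"
        "S1,2025-06-02,2,P\n"
        "S2,2025-06-02,1,P\n"
    )
    print(consolidate(sample))
```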
Current Status
Archived as a Summer 2025 effort. It did its job (the useful functions graduated into the absence detector), so there's nothing more to do here. It's preserved as the record of how that detector's math got found.
- Statistical functions for consecutive-vs-scattered absence discrimination, validated on synthetic data.
- Browser-based attendance reformatter for data prep (drag-drop, client-side, CSV out).
- Outputs folded into Student Absence Anomaly Detection; closed out as a prototype.