Msgpack detection-blob contract
Per-frame object data (one bounding box per object per frame, for every frame of a video) is too granular to store as relational rows for every classification. Instead, it is encoded as a MessagePack blob, zstd-compressed, and stored as a single file on Classification.frames_msgpack. This page documents that wire format, which is defined in trapper/apps/media_classification/frames.py.
Source of truth
This page describes the Expert-side dataclasses (AggregateFramesPayload/ObjectFrames/FrameSample). The AI Worker emits an equivalent structure on its side of the contract — if you're working in trapper-ai-worker, cross-check against its own payload-building code rather than assuming byte-for-byte identity; the two services share the shape of the contract, not necessarily a single shared schema package. See Frame timeseries for why this file, not the ObjectFrameObservation hypertable, is the source of truth.
Encoding
```python
raw = msgpack.packb(payload.asdict(), use_bin_type=True)
compressed = zstd_compressor.compress(raw)
```
Served over HTTP with Content-Type: application/x-msgpack+zstd via GET /api/media-classifications/classifications/<id>/frames/.
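
Reading the blob back is the same two steps in reverse. The sketch below is a minimal client-side example, not code from frames.py: it assumes the third-party `msgpack` and `zstandard` packages are installed, and that the dict keys match the field names in the tables on this page. `decode_frames_blob` and `iter_boxes` are hypothetical helper names.

```python
from typing import Any, Iterator


def decode_frames_blob(blob: bytes) -> dict[str, Any]:
    """Reverse the encoding above: zstd-decompress, then msgpack-unpack."""
    # Third-party imports are deferred so iter_boxes() below stays usable
    # without them. Both packages are assumptions about the client setup.
    import msgpack
    import zstandard

    # decompressobj() does not need the frame header to carry a content size.
    raw = zstandard.ZstdDecompressor().decompressobj().decompress(blob)
    return msgpack.unpackb(raw, raw=False)


def iter_boxes(payload: dict[str, Any]) -> Iterator[tuple[int, int, tuple]]:
    """Yield (object_id, frame_index, (x, y, w, h)) for samples that have a box."""
    for obj in payload.get("objects", []):
        for sample in obj.get("data", []):
            box = tuple(sample.get(key) for key in ("x", "y", "w", "h"))
            if None not in box:  # box fields are nullable, so skip empty ones
                yield obj["id"], sample["frame_index"], box
```

A caller would pass the raw response body of the frames endpoint to `decode_frames_blob` and then iterate the result with `iter_boxes`.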
Top-level: AggregateFramesPayload
One per classification (AI, USER, or FEEDBACK — never FINAL).
| Field | Type | Notes |
|---|---|---|
| `classification_type` | `"AI"`, `"USER"`, or `"FEEDBACK"` | Mirrors Classification.classification_type. |
| `classification_id` | int | |
| `base_timestamp` | ISO string or null | Resource's base timestamp. |
| `timezone` | string | Resource's timezone name. |
| `fps` | float or null | null for images. |
| `duration` | float or null | Video duration in seconds; null for images. |
| `classification_model` | string or null | AI model name, only set for AI-type classifications. |
| `objects` | list of ObjectFrames | One entry per tracked object. |
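
The constraints in this table can be turned into a quick structural check on a decoded payload. The following is a sketch only: `check_payload` is a hypothetical helper, not part of frames.py, and it assumes that `asdict()` always emits every top-level key, including the null ones.

```python
from typing import Any

# FINAL classifications never carry a payload, so FINAL is not listed here.
ALLOWED_TYPES = {"AI", "USER", "FEEDBACK"}
TOP_LEVEL_FIELDS = (
    "classification_type", "classification_id", "base_timestamp", "timezone",
    "fps", "duration", "classification_model", "objects",
)


def check_payload(payload: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means it looks sane."""
    problems = [f"missing field: {name}" for name in TOP_LEVEL_FIELDS if name not in payload]
    ctype = payload.get("classification_type")
    if ctype not in ALLOWED_TYPES:
        problems.append(f"unexpected classification_type: {ctype!r}")
    elif ctype != "AI" and payload.get("classification_model") is not None:
        # The model name is documented as set only for AI-type classifications.
        problems.append("classification_model set on a non-AI classification")
    if "objects" in payload and not isinstance(payload["objects"], list):
        problems.append("objects is not a list")
    return problems
```

Returning a list of problems rather than raising on the first one makes the helper convenient when debugging a payload produced by another service.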
ObjectFrames
One per detected/tracked object (track) within the classification.
| Field | Type | Notes |
|---|---|---|
| `id` | int | The corresponding ClassificationDynamicAttrs row ID. |
| `data` | list of FrameSample | Per-frame observations for this object. |
| `observation_type` | string or null | animal / human / vehicle / blank / etc. |
| `species` | string or null | Latin name, if resolved. |
FrameSample
One per frame where the object was observed (sparse — frames with no detection simply have no entry, they aren't padded with nulls in this list).
| Field | Type | Notes |
|---|---|---|
| `frame_index` | int | Zero-based, within the resource's timeline. |
| `x`, `y`, `w`, `h` | float or null | Normalized bounding box, xywh. |
| `confidence` | float or null | Detection confidence. |
| `ts` | string or null | Timestamp, `%Y-%m-%dT%H:%M:%S%z`. |
| `distance` | float or null | Per-frame distance estimate in metres; see Distance estimation. |
| `kpts` | list of int or null | Sparse pose keypoints (17 OKS-anchor subset), flat array of `round(coord_norm * 10000)` pairs. Only present on pose-model target-FPS frames. |
| `kpt_scores` | list of int or null | Per-keypoint confidence, `round(score * 1000)`. |
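
The fixed-point fields and the timestamp need a small decoding step before use. The sketch below shows one way to do it with the standard library. The scale factors and the timestamp format come from the table; the helper names are hypothetical, and the x-then-y ordering within each pair is an assumption that should be checked against frames.py.

```python
from datetime import datetime
from typing import Optional

TS_FORMAT = "%Y-%m-%dT%H:%M:%S%z"
COORD_SCALE = 10000  # kpts store round(coord_norm * 10000)
SCORE_SCALE = 1000   # kpt_scores store round(score * 1000)


def parse_ts(ts: Optional[str]) -> Optional[datetime]:
    """Parse a FrameSample timestamp; null stays None."""
    return datetime.strptime(ts, TS_FORMAT) if ts else None


def decode_keypoints(kpts: Optional[list[int]], kpt_scores: Optional[list[int]] = None):
    """Turn the flat integer array into [(x, y, score), ...] in normalized floats."""
    if kpts is None:
        return None
    if len(kpts) % 2:
        raise ValueError("kpts must hold an even number of values")
    count = len(kpts) // 2
    scores = kpt_scores if kpt_scores is not None else [None] * count
    # Assumed layout: [x0, y0, x1, y1, ...]
    pairs = zip(kpts[0::2], kpts[1::2])
    return [
        (x / COORD_SCALE, y / COORD_SCALE, None if s is None else s / SCORE_SCALE)
        for (x, y), s in zip(pairs, scores)
    ]
```

Because the stored values are rounded integers, the decoded coordinates are accurate to about 1e-4 of the frame size and the scores to about 1e-3.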
Don't confuse this with the frontend's dense bboxes array
The video annotation frontend works with a dense, null-padded bboxes array indexed by frame number (so it can interpolate gaps). The wire format above is the sparse on-disk encoding — the frontend's dense array is built from this on load, and collapsed back to sparse FrameSample entries on save.
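
The sparse-to-dense round trip can be illustrated in a few lines. This is a sketch of the idea only, not the frontend's actual code: `to_dense`, `to_sparse`, and the `frame_count` parameter are hypothetical, and samples are treated as plain dicts keyed as in the FrameSample table.

```python
from typing import Any, Optional

Sample = dict[str, Any]


def to_dense(samples: list[Sample], frame_count: int) -> list[Optional[Sample]]:
    """Expand sparse samples into a null-padded list indexed by frame number."""
    dense: list[Optional[Sample]] = [None] * frame_count
    for sample in samples:
        idx = sample["frame_index"]
        if 0 <= idx < frame_count:  # ignore samples outside the timeline
            dense[idx] = sample
    return dense


def to_sparse(dense: list[Optional[Sample]]) -> list[Sample]:
    """Collapse a dense list back to sparse samples, dropping the null padding."""
    return [
        dict(sample, frame_index=idx)  # the list position is the frame index
        for idx, sample in enumerate(dense)
        if sample is not None
    ]
```

For example, two samples at frames 1 and 3 of a five-frame clip expand to a five-element list with `None` at positions 0, 2, and 4, and `to_sparse` restores the original two entries.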
Rebuilding from ClassificationDynamicAttrs
For images only, the msgpack can be rebuilt on the fly from the relational dyn_attrs JSON columns if the file goes missing (e.g. because of a storage misconfiguration), using SmartFrameService().build_from_dyn_attrs(classification=...). Video tracks cannot be rebuilt this way. The full per-frame track only ever exists in the msgpack (or in the project's ObjectFrameObservation hypertable snapshot, if it has been populated), never in ClassificationDynamicAttrs itself, which stores only the first occurrence's bbox.
See also
- Frame timeseries — why the msgpack, not the hypertable, is authoritative
- Frame hypertable — the queryable mirror built from this contract, populated per-project on demand
- Video annotation & interpolation — the main consumer/editor of this data