.. _configuration:

=============
Configuration
=============

This page covers TRAPPER Expert's runtime configuration — the values
that flow through the rendered ``.env`` file, the database / storage /
email knobs, and the cross-cutting subsystems (AI pipeline, trackers,
distance estimation, frame-level data). For first-time deployment
setup, see :ref:`installation` and the ``trapper-setup`` README.

Where things live
+++++++++++++++++

In v2 the deployment toolchain is layered:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Concern
     - Where it's configured
   * - Compose stack composition (which services run, ports, SSL, volumes)
     - ``trapper-setup/docker-compose*.yml`` + the per-profile overlay rendered
       by ``trapper-setup/configure.py``.
   * - Per-profile env values (DB credentials, secrets, feature flags)
     - ``trapper-setup/profiles/<profile>/.env``, generated by ``configure.py``.
   * - Expert-side runtime options (this module)
     - This page. Values are read by Django from the same ``.env`` at startup.
   * - AI Manager runtime options
     - ``trapper-ai`` repo's settings, with values flowing from the same
       ``.env``.
   * - AI Worker / coordinator options
     - Per-profile ``coordinator.yaml`` rendered by ``configure.py`` from the
       template under ``trapper-setup/config/coordinator/``.
   * - Per-Classification-Project AI behaviour
     - Django admin (this module). See :ref:`admin-classification-projects`.
   * - Per-Deployment depth / distance settings
     - Django admin (this module). See :ref:`admin-distance`.

If you find yourself editing ``.env`` by hand on a deployed instance,
re-render the profile through ``configure.py`` instead — that way the
values stay coherent across the stack and survive a re-run of the
wizard. ``./configure.py interactive --from-profile <name>`` is the
canonical "edit one value, keep the rest" path. See the
``trapper-setup`` README for ``--reuse-env`` semantics.

Basic settings
++++++++++++++

The values below appear in every ``.env`` regardless of profile flavour.

.. code-block:: bash

   DOMAIN_NAME=demo.trapper-project.org
   SECRET_KEY='cudv86xkzva)o1lkt-ow72)y__*xwsb1@43wtp1cc@zak=i@@e'

The ``SECRET_KEY`` is used by Django for cryptographic signing and must
be a unique, unpredictable value
(see `the Django docs <https://docs.djangoproject.com/en/5.1/ref/settings/#secret-key>`_).
``configure.py`` generates one for you on first run; never reuse it
across deployments.
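
Should a key ever be needed outside the wizard, Python's standard
library can produce one of comparable strength. The sketch below is an
illustration only, not what ``configure.py`` does internally; the
``generate_secret_key`` helper, its alphabet, and the 50-character
default are assumptions.

.. code-block:: python

   import secrets
   import string

   # Letters, digits and two punctuation marks that need no quoting in a .env file.
   ALPHABET = string.ascii_letters + string.digits + "-_"


   def generate_secret_key(length: int = 50) -> str:
       """Return a random key drawn from a cryptographically secure source."""
       return "".join(secrets.choice(ALPHABET) for _ in range(length))


   if __name__ == "__main__":
       print(f"SECRET_KEY={generate_secret_key()}")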

Admin and manager notifications:

.. code-block:: bash

   ADMIN1=admin1,admin1@oscf.org      # {username},{email}
   ADMIN2=admin2,admin2@oscf.org

   MANAGER1=manager1,manager1@oscf.org
   MANAGER2=manager2,manager2@oscf.org

When :ref:`configuration-email` is configured, admins receive HTTP 500
error reports and managers receive notifications about new user
registrations and new research project requests.
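
Django's ``ADMINS`` and ``MANAGERS`` settings have traditionally been
lists of ``(name, email)`` tuples, so the numbered variables have to be
collected and split somewhere. The following is a minimal sketch of one
way to do that; ``parse_contacts`` is a hypothetical helper and does not
claim to mirror the actual settings module.

.. code-block:: python

   def parse_contacts(env: dict[str, str], prefix: str) -> list[tuple[str, str]]:
       """Collect PREFIX1, PREFIX2, ... entries as (username, email) tuples."""
       contacts = []
       index = 1
       # Stop at the first missing number, so entries must be contiguous.
       while f"{prefix}{index}" in env:
           username, email = env[f"{prefix}{index}"].split(",", 1)
           contacts.append((username.strip(), email.strip()))
           index += 1
       return contacts


   env = {
       "ADMIN1": "admin1,admin1@oscf.org",
       "ADMIN2": "admin2,admin2@oscf.org",
       "MANAGER1": "manager1,manager1@oscf.org",
   }
   ADMINS = parse_contacts(env, "ADMIN")
   MANAGERS = parse_contacts(env, "MANAGER")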

HTTP / HTTPS port mappings (consumed by the Traefik or Nginx container,
depending on overlay):

.. code-block:: bash

   HTTP_PORTS=80:80
   HTTPS_PORTS=443:443

.. _configuration-ssl-certificates:

SSL certificates
++++++++++++++++

The ``trapper-setup`` wizard handles three SSL paths — Let's Encrypt,
static certificates, self-signed — and renders the right Compose
overlay for each. See its README for the wizard prompts.

If you're on the legacy Expert-only ``./start.sh`` path
(:ref:`installation-expert-only-dev`), you can point at existing
certificates directly:

.. code-block:: bash

   # default
   SSL_CERTIFICATE=${PWD}/ssl/cert.pem
   SSL_CERTIFICATE_KEY=${PWD}/ssl/key.pem

   # Let's Encrypt
   SSL_CERTIFICATE=/etc/letsencrypt/live/trapper/fullchain.pem
   SSL_CERTIFICATE_KEY=/etc/letsencrypt/live/trapper/privkey.pem

These same certificates secure the HTTPS Nginx proxy and the FTPS
``pure-ftpd`` container. If neither variable is set, the start script
generates self-signed certs with ``openssl``.

.. _configuration-database:

Database
++++++++

The default Compose stack ships a TimescaleDB-enabled PostgreSQL image
under ``trapper-postgresql:timescaledb``. The TimescaleDB extension is
required when ``USE_FRAMES_HYPERTABLE=true`` (default); see
:ref:`configuration-frame-hypertable` for the optional flag that lets
you fall back to plain Postgres.

Default credentials (overridable):

.. code-block:: bash

   POSTGRES_USER=trapper
   POSTGRES_PASSWORD=trapper
   POSTGRES_DB=trapper

External database
=================

Production deployments often want an external Postgres (RDS, on-prem,
…). Set:

.. code-block:: bash

   DB_NAME=trapper
   DB_USER=trapper
   DB_PASS=trapper
   DB_HOST=172.17.0.1
   DB_PORT=5432

The host ``172.17.0.1`` is the Docker bridge gateway on most Linux
hosts; use it when Postgres is running on the same machine outside
Docker. ``trapper-setup``'s wizard exposes this as
``USE_EXTERNAL_POSTGRES=true``.
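
For orientation, the ``DB_*`` variables correspond naturally to the keys
of Django's ``DATABASES`` setting. The sketch below shows that mapping
under stated assumptions: ``build_databases`` is a hypothetical helper,
the fallback values simply repeat the example above, and the GeoDjango
PostGIS engine is assumed because PostGIS is a hard requirement.

.. code-block:: python

   def build_databases(env: dict[str, str]) -> dict:
       """Map the DB_* variables onto the shape of Django's DATABASES setting."""
       return {
           "default": {
               # Assumed engine: GeoDjango's PostGIS backend.
               "ENGINE": "django.contrib.gis.db.backends.postgis",
               "NAME": env.get("DB_NAME", "trapper"),
               "USER": env.get("DB_USER", "trapper"),
               "PASSWORD": env.get("DB_PASS", "trapper"),
               "HOST": env.get("DB_HOST", "172.17.0.1"),
               "PORT": int(env.get("DB_PORT", "5432")),
           }
       }


   DATABASES = build_databases({"DB_HOST": "db.internal", "DB_PORT": "6543"})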

Your external server needs:

- PostGIS extension (always required).
- TimescaleDB extension (required unless
  ``USE_FRAMES_HYPERTABLE=false``).
- Network access from the Docker host.
- ``listen_addresses = '*'`` in ``postgresql.conf``.
- A permissive entry in ``pg_hba.conf``, e.g.

  .. code-block:: text

     # TYPE   DATABASE   USER   ADDRESS       METHOD
     host     all        all    172.0.0.0/8   md5

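Tighten that for your network: ``172.0.0.0/8`` also spans public
addresses, whereas Docker's default bridge network (``172.17.0.0/16``)
sits inside the private ``172.16.0.0/12`` range.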
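
How much each candidate ``ADDRESS`` range admits can be checked with
Python's ``ipaddress`` module. This is only a sketch for comparing
ranges; the two narrower networks are suggestions based on Docker's
usual defaults, not values taken from the Trapper tooling.

.. code-block:: python

   import ipaddress

   docker_gateway = ipaddress.ip_address("172.17.0.1")

   permissive = ipaddress.ip_network("172.0.0.0/8")        # entry shown above
   private_block = ipaddress.ip_network("172.16.0.0/12")   # RFC 1918 range
   default_bridge = ipaddress.ip_network("172.17.0.0/16")  # Docker's usual bridge

   for network in (permissive, private_block, default_bridge):
       # Each range admits the gateway, but the address counts differ widely.
       print(network, docker_gateway in network, network.num_addresses)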

Sanity-check the connection before starting the stack:

.. code-block:: bash

   psql -h <host> -U trapper -d trapper

Storage
+++++++

Multimedia files live either on local disk (default) or in a supported
cloud bucket. Local mode:

.. code-block:: bash

   DEFAULT_FILE_STORAGE=
   VOL_MEDIA=/storage/trapper_data/media
   VOL_EXTERNAL_MEDIA=/storage/trapper_data/external_media

The ``VOL_MEDIA`` volume keeps original images and videos plus their
previews and thumbnails:

.. code-block:: text

   protected/storage/resources/{user_id}/{date_uploaded}/
   protected/storage/resources/{user_id}/{date_uploaded}/previews/
   protected/storage/resources/{user_id}/{date_uploaded}/thumbnails/

``VOL_EXTERNAL_MEDIA`` holds user-uploaded archives (FTP drops, zip
batches) and exported data packages.

The per-classification frames msgpack files
(``Classification.frames_msgpack``) live under:

.. code-block:: text

   protected/media_classification/msgpacks/{project_id}/{deployment_id}/{resource_id}/
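
When scripting against the storage volume, that layout can be built
with ``pathlib``. This is a small sketch; ``frames_msgpack_dir`` is a
hypothetical helper, and treating the three identifiers as integers is
an assumption.

.. code-block:: python

   from pathlib import PurePosixPath

   MSGPACK_ROOT = PurePosixPath("protected/media_classification/msgpacks")


   def frames_msgpack_dir(
       project_id: int, deployment_id: int, resource_id: int
   ) -> PurePosixPath:
       """Return the relative directory holding one resource's frames msgpack."""
       return MSGPACK_ROOT / str(project_id) / str(deployment_id) / str(resource_id)


   print(frames_msgpack_dir(3, 12, 4501))
   # protected/media_classification/msgpacks/3/12/4501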

.. _configuration-storage-cloud:

Cloud storage
=============

Cloud-storage support is provided by `django-storages <https://django-storages.readthedocs.io/>`_.

.. note::

   The active storage backend is wired through Django's ``STORAGES`` setting
   (``conf/settings/base.py``), which points the ``default`` alias at
   ``trapper.apps.storage.cloud_storages.TrapperMediaStorage``. The
   ``DEFAULT_FILE_STORAGE`` *environment variable* (``amazon_s3`` | ``ovh`` |
   ``azure`` | unset) still selects the concrete cloud engine; the legacy
   Django ``DEFAULT_FILE_STORAGE`` *setting* was removed in Django 5.1 and is no
   longer used. Azure deployments must also install ``azure-storage-blob``.
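
The split described in the note can be pictured as follows. This is a
sketch, not the contents of ``conf/settings/base.py``: the
``resolve_storage_engine`` helper and the ``staticfiles`` entry (Django's
stock backend) are assumptions, while the ``default`` backend path and
the accepted engine names come from the note above.

.. code-block:: python

   SUPPORTED_ENGINES = {"amazon_s3", "ovh", "azure"}

   STORAGES = {
       "default": {
           "BACKEND": "trapper.apps.storage.cloud_storages.TrapperMediaStorage",
       },
       # Assumed: Django's default static files backend.
       "staticfiles": {
           "BACKEND": "django.contrib.staticfiles.storage.StaticFilesStorage",
       },
   }


   def resolve_storage_engine(env: dict[str, str]):
       """Return the cloud engine name, or None for local-disk storage."""
       value = env.get("DEFAULT_FILE_STORAGE", "").strip()
       if not value:
           return None  # unset or empty means local storage
       if value not in SUPPORTED_ENGINES:
           raise ValueError(f"Unsupported DEFAULT_FILE_STORAGE: {value!r}")
       return value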

**Azure**

.. code-block:: bash

   DEFAULT_FILE_STORAGE=azure
   AZURE_ACCOUNT_NAME=trapper
   AZURE_ACCOUNT_KEY=key
   AZURE_CONTAINER=trapper-data
   AZURE_CUSTOM_DOMAIN=trapper.blob.core.windows.net
   MEDIA_LOCATION=media

**Amazon S3** (and S3-compatible: MinIO, Backblaze B2, …)

.. code-block:: bash

   DEFAULT_FILE_STORAGE=amazon_s3
   AWS_ACCESS_KEY_ID=...
   AWS_SECRET_ACCESS_KEY=...
   AWS_STORAGE_BUCKET_NAME=trapper-media
   AWS_S3_REGION_NAME=eu-central-1
   AWS_S3_ENDPOINT_URL=https://s3.example.com    # only for non-AWS S3

.. _configuration-email:

Email service
+++++++++++++

The notification framework is off by default. To enable it (Gmail
example):

.. code-block:: bash

   EMAIL_NOTIFICATIONS=True
   EMAIL_NOTIFICATIONS_RESEARCH_PROJECT=True
   EMAIL_HOST=smtp.gmail.com
   EMAIL_PORT=587
   EMAIL_HOST_USER=project@gmail.com
   EMAIL_HOST_PASSWORD=password
   EMAIL_USE_TLS=True

A working email backend is needed for new-user activation links,
admin error reports, manager notifications about new project
requests, and the *Mail users* admin action.

User registration
+++++++++++++++++

.. code-block:: bash

   USER_REGISTRATION_OPEN=True

   USE_RECAPTCHA=True
   RECAPTCHA_PUBLIC_KEY=public_key
   RECAPTCHA_PRIVATE_KEY=private_key

When registration is open, anyone can sign up, but the resulting User
row stays ``is_active=False`` until an admin runs the
:ref:`set roles <admin-users>` action.

Debugging
+++++++++

For developers:

.. code-block:: bash

   USE_DEBUG_TOOLBAR=True
   DEBUG_TOOLBAR_USERS=admin1,admin2

   USE_SILK=True

`Django Debug Toolbar <https://github.com/jazzband/django-debug-toolbar>`_
shows a per-request panel; ``django-silk`` profiles SQL and view-level
performance.

.. _configuration-ai-pipeline:

AI pipeline
+++++++++++

The AI pipeline is what turns an uploaded resource into a set of
detections, species classifications, and (optionally) per-frame distance
estimates. It involves all three Trapper services plus the GBIF-keyed
species table:

.. code-block:: text

   1. User uploads resource           [Expert backend]
        │
        ▼
   2. ClassificationProject auto-     [Expert post-save signal]
      submits resource to AI Manager
        │  POST /api/prediction_jobs/
        ▼
   3. AI Manager batches resources    [trapper-ai]
      and dispatches to runtime queue
        │  Celery → trapperai_runtime.run_batch_prediction
        ▼
   4. AI Worker coordinator picks the [trapper-ai-worker, host process]
      task, downloads the weights via
      model_manifest.yaml, runs the
      configured runtime
        │  msgpack blob (zstd) of detections
        ▼
   5. Worker POSTs results back       [→ trapper-ai]
        │  /api/resource_batches/<id>/save_results/
        ▼
   6. Expert post-processor creates   [Expert]
      AI Classification +
      ClassificationDynamicAttrs
      rows; resolves species via
      provider's categories JSON
      (gbifSpeciesKey → Species.taxon_id)
        │
        ▼
   7. Frames msgpack written;         [Expert]
      ObjectFrameObservation
      hypertable populated (if
      USE_FRAMES_HYPERTABLE=true)

The configuration that drives this is split across three places:

#. **AI provider catalog** — what models are available. Sourced from
   ``trapper-schemas``, registered on the AI Manager via
   ``sync_models_from_schemas``, surfaced on the Expert via
   ``sync_ai_models``. Run both syncs on every upgrade. See
   :ref:`admin-ai-providers`.
#. **ClassificationProject AI fields** — which providers to use, when
   to require AI, IoU thresholds, blurring rules, video FPS. See
   :ref:`admin-classification-projects`.
#. **Job-time overrides** — the snapshot ``job_config`` JSON written
   per AI Classification Job. Captures exactly what was submitted
   for that run; usable for forensic debugging via
   :ref:`admin-ai-jobs`.

Stages of the pipeline
======================

Each new resource flows through these stages in order. The
``ClassificationProject`` controls which stages run by setting AI
providers on the relevant fields:

#. **Object detection.** Set ``object_detection_ai_model`` to a
   *detection*-type AI Provider (typically MegaDetector v5/v6, YOLOv8
   variants, or a Trapper-tuned detector).
#. **Species classification.** Set ``species_ai_model`` to a
   *classification*-type provider (DeepFaune, EfficientNet, SDZWA …).
   Runs only on objects classified as ``observation_type=animal`` by
   stage 1.
#. **Sequence building.** Triggered by the
   ``celery_build_sequences`` task (Celery beat) — groups resources
   into ecological events using the time interval configured per
   collection.
#. **Privacy blurring.** Triggered after detection if the project sets
   ``blur_humans`` / ``blur_vehicles``. Rewrites the original media
   files; ``blur_backup`` keeps copies.
#. **Distance estimation** *(optional)*. See
   :ref:`configuration-distance` for the dedicated configuration path.
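
The gating rules in the list above can be summarised in a few lines.
The sketch below is a reading of this page, not Expert's code:
``ProjectAISettings`` and ``planned_stages`` are hypothetical names, only
the four fields mentioned above are modelled, and distance estimation is
left out because it has its own configuration path.

.. code-block:: python

   from dataclasses import dataclass
   from typing import List, Optional


   @dataclass
   class ProjectAISettings:
       """Hypothetical stand-in for the AI fields on ClassificationProject."""

       object_detection_ai_model: Optional[str] = None
       species_ai_model: Optional[str] = None
       blur_humans: bool = False
       blur_vehicles: bool = False


   def planned_stages(project: ProjectAISettings) -> List[str]:
       """List the stages that would run, in pipeline order."""
       detection = bool(project.object_detection_ai_model)
       stages = []
       if detection:
           stages.append("object_detection")
       # Species classification only sees objects produced by detection.
       if detection and project.species_ai_model:
           stages.append("species_classification")
       # Sequence building is driven by Celery beat, not by an AI provider.
       stages.append("sequence_building")
       if detection and (project.blur_humans or project.blur_vehicles):
           stages.append("privacy_blurring")
       return stages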

.. _configuration-trackers:

Trackers
========

For video resources, a tracker is what turns frame-by-frame detections
into multi-frame *DetectedObject* tracks. The Expert doesn't run the
tracker itself — it ships in the ``trapper-ai-worker`` packages
(``trapperai-trackers`` and ``trapperai-trackers-ultralytics``). Expert
selects *which* tracker to use through the ``tracker`` field of the
``DetectionJob.spec`` it submits to the AI Manager.

Available trackers (defined in ``trapper-schemas`` and implemented in
the worker):

+------------------------+-----------------------------------------------+
| Tracker                | When to pick it                               |
+========================+===============================================+
| ``noop``               | No tracking; each detection is its own        |
|                        | "track of length 1". Cheapest. Default for    |
|                        | image collections (where tracking is          |
|                        | meaningless).                                 |
+------------------------+-----------------------------------------------+
| ``majority_voting``    | Consensus across frames inside a single       |
|                        | classification job. Cheap; works well when    |
|                        | the camera is static and the animal is        |
|                        | mostly visible.                               |
+------------------------+-----------------------------------------------+
| ``bytetrack``          | Ultralytics-shipped ByteTrack.                |
|                        | High-quality, good identity preservation,     |
|                        | needs more compute.                           |
+------------------------+-----------------------------------------------+
| ``botsort``            | Ultralytics-shipped BoT-SORT.                 |
|                        | Strongest in dense / overlapping-animals      |
|                        | scenarios.                                    |
+------------------------+-----------------------------------------------+
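
To make "consensus across frames" concrete, the idea behind
``majority_voting`` can be sketched with a simple counter. This
illustrates the general technique only; it is not the worker's
implementation, and ``majority_vote`` is a hypothetical function.

.. code-block:: python

   from collections import Counter


   def majority_vote(frame_labels):
       """Return (label, share of frames) for the most common per-frame label."""
       if not frame_labels:
           raise ValueError("at least one frame label is required")
       label, votes = Counter(frame_labels).most_common(1)[0]
       return label, votes / len(frame_labels)


   labels = ["roe deer", "roe deer", "red deer", "roe deer"]
   print(majority_vote(labels))  # ('roe deer', 0.75)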

Where the choice is configured
------------------------------

Today, tracker selection is wired into the project-level AI
configuration via the AI Manager's job-spec template. To override per
project:

#. Open the AI Manager admin.
#. Edit the ``PredictionModel`` row for the detection model used by
   your project.
#. Adjust ``model_config.tracker`` (a ``trapper-schemas`` ``Tracker``
   discriminated-union object).

For a one-off override of a single ``AIClassificationJob``, edit
``job_config`` on the job row before re-submitting. In practice this
usually means tweaking the AI Manager config and re-running via the
:ref:`rerun_ai_pipeline <admin-classification-projects>` action.

Schema reference: ``trapper_schemas.jobs.spec`` and the predictor /
tracker classes in ``trapper_schemas.predictors``. Worker-side
implementations live under
``trapper-ai-worker/packages/trackers/`` and
``trapper-ai-worker/packages/trackers-ultralytics/``.

.. _configuration-distance:

Distance calibration & estimation
+++++++++++++++++++++++++++++++++

Trapper Expert estimates the distance from each detected object to the
camera by combining a depth-model prediction with a per-deployment
calibration model. The schemas for the configuration are in
``trapper-schemas`` (subpackage ``depth``), the inference runs in
``trapper-ai-worker`` (``trapperai-predictors-torch-depth`` /
``trapperai-predictors-hailo-depth``), and the per-deployment fitting and
estimation runs are configured in the Expert admin.

The two phases
==============

#. **Calibration** — once per deployment hardware (camera + lens).
   A *calibration deployment* (Deployment with
   ``is_calibration=True``) captures a few resources where you mark
   reference points at known distances. Trapper fits a
   ``CalibrationModel`` (linear or curve-knot) that maps raw depth-
   model output to metres for that camera.
#. **Estimation** — every time a real (non-calibration) deployment
   runs the AI pipeline, the configured distance-estimation
   parameters integrate the depth-model output across each tracked
   object and write the median distance into
   ``ClassificationDynamicAttrs.distance``.

The calibration model
=====================

``CalibrationConfig`` (rendered as a grouped fieldset in the admin)
holds:

- **Reference points** — a list of ``(image_pixel, world_distance_metres)``
  pairs collected from the UI.
- **Method** — ``linear`` (single scale + offset) or ``curve_knots``
  (piecewise linear with N control knots; better for cameras with
  non-linear depth behaviour).
- **Alignment** — how the fitted model is anchored against the
  depth-model output (``mean`` / ``median`` / ``min``).

The fitted ``CalibrationModel`` is stored on the calibration
Deployment and read by every estimation run that references this
calibration.
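
The two fitting methods can be illustrated with plain Python. This is
a sketch of the underlying maths, not the ``CalibrationModel`` code: it
assumes each reference point has already been turned into a
``(raw_depth, metres)`` pair by sampling the depth-model output at the
marked pixel, and both function names are hypothetical.

.. code-block:: python

   from bisect import bisect_right


   def fit_linear(raw, metres):
       """Least-squares fit of metres = scale * raw + offset."""
       n = len(raw)
       if n < 2 or n != len(metres):
           raise ValueError("need at least two matching reference points")
       mean_x = sum(raw) / n
       mean_y = sum(metres) / n
       sxx = sum((x - mean_x) ** 2 for x in raw)
       if sxx == 0:
           raise ValueError("raw depth values must not all be equal")
       sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(raw, metres))
       scale = sxy / sxx
       return scale, mean_y - scale * mean_x


   def interpolate_knots(knots, value):
       """Piecewise-linear lookup between (raw, metres) knots, clamped at the ends."""
       knots = sorted(knots)
       xs = [x for x, _ in knots]
       if value <= xs[0]:
           return knots[0][1]
       if value >= xs[-1]:
           return knots[-1][1]
       i = bisect_right(xs, value)  # xs[i - 1] <= value < xs[i]
       (x0, y0), (x1, y1) = knots[i - 1], knots[i]
       return y0 + (y1 - y0) * (value - x0) / (x1 - x0)

With reference points lying exactly on a line, ``fit_linear`` recovers
that line; ``interpolate_knots`` is the better choice when the points
bend away from it.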

To re-fit (e.g. after editing reference points):

#. Select the calibration deployment(s) in the admin.
#. Run the **Rerun depth calibration**
   (``rerun_depth_calibration``) action.
#. Verify the fitted parameters in the deployment row.

The distance-estimation configuration
=====================================

``DistanceEstimationConfig`` is a ~20-field Pydantic model. Field
groups (rendered as separate admin fieldsets):

+------------------------+--------------------------------------------+
| Group                  | Fields                                     |
+========================+============================================+
| Frame sampling         | ``target_fps`` — downsample input frames   |
|                        | before depth/SAM. ``max_frames_per_object``|
|                        | — cap per-track frame count.               |
+------------------------+--------------------------------------------+
| SAM (segmentation)     | ``sam_model`` — which SAM variant.         |
|                        | ``sam_threshold`` — mask cut-off.          |
|                        | ``sam_use_best_box`` — whether to feed     |
|                        | only the highest-confidence detection box  |
|                        | into SAM.                                  |
+------------------------+--------------------------------------------+
| Kalman filter          | ``kalman_q``, ``kalman_r``, ``kalman_init``|
|                        | — process / measurement noise; initial     |
|                        | state. Smooths jitter in raw per-frame     |
|                        | distance.                                  |
+------------------------+--------------------------------------------+
| Smoothing              | ``smoothing_window`` — frames; 5 is a good |
|                        | starting point. ``smoothing_method``       |
|                        | — ``moving_avg`` / ``median`` / ``savgol``.|
+------------------------+--------------------------------------------+
| Calibration alignment  | ``alignment_method`` — how to align this   |
|                        | run's depth-model output against the       |
|                        | calibration. ``alignment_max_offset``      |
|                        | — clamp on alignment correction.           |
+------------------------+--------------------------------------------+

Per-deployment defaults are set in the Deployment edit form. To
override for a single re-run without persisting, use the
**Rerun distance estimation** (``rerun_distance_estimation``) admin
action — its form exposes the same fields.
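
To give a feel for what the Kalman parameters control, here is a
textbook scalar Kalman filter applied to per-frame distances, followed
by the median that becomes the track's distance. It is a sketch under
assumptions: the arguments mirror ``kalman_q``, ``kalman_r`` and
``kalman_init`` by name only, the initial variance of ``1.0`` is
invented for the example, and the worker's actual filter may differ.

.. code-block:: python

   from statistics import median


   def kalman_smooth(measurements, q, r, init):
       """Smooth a series with a constant-state scalar Kalman filter."""
       estimate = init
       variance = 1.0  # assumed initial uncertainty
       smoothed = []
       for z in measurements:
           variance += q                      # predict: add process noise
           gain = variance / (variance + r)   # weigh measurement vs. estimate
           estimate += gain * (z - estimate)  # update towards the measurement
           variance *= 1.0 - gain
           smoothed.append(estimate)
       return smoothed


   raw_distances = [4.0, 6.0, 5.0]  # metres, one value per sampled frame
   track_distance = median(kalman_smooth(raw_distances, q=0.01, r=0.5, init=5.0))

A larger ``r`` relative to ``q`` trusts the running estimate more and
smooths harder; a larger ``q`` follows the raw measurements more closely.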

Walkthrough: see :ref:`tutorial-distance-calibration` for a
step-by-step tour from "create the calibration deployment" through
"verify the distances on a test deployment".

.. _configuration-frame-hypertable:

Frame-level data and ``USE_FRAMES_HYPERTABLE``
++++++++++++++++++++++++++++++++++++++++++++++

Per-frame trajectory data lives in two places:

#. The ``Classification.frames_msgpack`` file on disk (zstd-compressed
   msgpack) — the **source of truth**, served to clients via
   ``ClassificationFramesDownloadView``.
#. The ``ObjectFrameObservation`` TimescaleDB hypertable — a
   **queryable index** built from the msgpack, used internally by
   analytics and the cleanup tasks.

The hypertable is optional. Set:

.. code-block:: bash

   USE_FRAMES_HYPERTABLE=false

…in the profile ``.env`` to disable it. Effects:

- Migration 0095 skips the ``CREATE EXTENSION timescaledb`` and
  ``create_hypertable`` calls — fresh installs work on plain Postgres.
- Migration 0096's per-record loop writes per-frame data directly to
  ``Classification.frames_msgpack`` instead of the hypertable.
- Migration 0096b skips the compression policy.
- Runtime: ``upsert_frames_from_msgpack`` Celery tasks become no-ops, while
  ``celery_cleanup_dangling_frame_observations`` and
  ``celery_get_hypertable_stats`` raise an error.
- Management commands ``configure_hypertable`` and
  ``rebuild_frames_msgpack`` refuse to run.
- The ``hypertable_cleanup`` Celery beat entry is not registered.

Disabling the flag never drops data. Re-enabling it later does *not*
auto-backfill — for that, run ``populate_frames_hypertable`` (see
:ref:`admin-mgmt-commands`).
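
The no-op behaviour is the usual feature-flag pattern. The sketch below
shows the shape of such a guard; ``env_flag`` and ``upsert_frames`` are
hypothetical, the accepted spellings of "true" are an assumption, and
the real tasks do considerably more than count rows.

.. code-block:: python

   import os


   def env_flag(name, default=True, env=None):
       """Read a boolean flag from the environment (or a supplied mapping)."""
       source = os.environ if env is None else env
       raw = source.get(name)
       if raw is None:
           return default
       return raw.strip().lower() in {"1", "true", "yes", "on"}


   def upsert_frames(frames, env=None):
       """Return the number of rows written; 0 when the hypertable is off."""
       if not env_flag("USE_FRAMES_HYPERTABLE", env=env):
           return 0  # no-op: the msgpack file remains the source of truth
       return len(frames)  # stand-in for the real hypertable write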

When to disable: when running on plain Postgres, when the migration
cost of 0096 is unacceptable on a multi-million-row instance, or when
you don't need the analytics queries the hypertable enables.

When to keep enabled (default): when you want fast per-frame queries
across millions of objects, or when you use the
:ref:`admin-cmd-generate-thumbnails` analytics features.

Trapper Tools (data upload)
+++++++++++++++++++++++++++

The companion CLI ``trapper-tools`` packages and uploads camera-trap
data to a Trapper Expert instance. Configuration on the *Trapper Tools
side* lives under ``~/.trapper-tools/default.toml``; on the *Expert
side* the relevant knobs are:

- FTPS endpoint — Pure-FTPd on port 21, secured by the same SSL certs
  as the web stack. Per-user accounts are created via the
  :ref:`admin-users` action.
- Chunked upload service — FastAPI on port 8088 (the
  ``trapper_uploader`` Compose service). Used by both ``trapper-tools``
  and the Citizen Science frontend for resumable uploads.
- ``MAX_UPLOAD_SIZE`` (per-file) and ``MAX_UPLOAD_COLLECTION_SIZE``
  (per-collection) — cap individual transfers.

The tooling-side workflow is documented in :ref:`tutorial-upload-tools`.

Trapper Expert features in the AI Manager / Worker
++++++++++++++++++++++++++++++++++++++++++++++++++

A few Expert-side settings affect how the AI Manager / Worker behave:

- ``TRAPPER_AI_API_URL`` / ``TRAPPER_AI_API_AUTH_LOGIN`` /
  ``TRAPPER_AI_API_AUTH_PASSW`` — the AIProviderConnection defaults
  rendered into ``.env`` by ``configure.py``. The admin wizard
  populates these on first run and the
  :ref:`admin-users` *Create AI Worker user* action keeps them in
  sync.
- ``USE_FRAMES_HYPERTABLE`` (above) — if ``false``, the AI Worker still
  produces per-frame data but the hypertable side is skipped.

The AI Worker coordinator's own configuration
(``coordinator.yaml``, hardware autodetect, runtime selection,
``$TRAPPERAI_HOME`` paths) lives in the worker repo and is documented
in ``trapper-ai-worker``'s README.