Models and Datasets¶
xcalib can load released paper checkpoints and the public A9 dataset cache
from the Hugging Face Hub. Local files also work, which is useful for offline
deployment, custom checkpoints, or experiments.
Load A Pretrained Matcher¶
The usual path is Matcher.from_pretrained(model, site=...). It resolves the
packaged config and downloads the matching checkpoint when the release artifact
is available:
from xcalib import Matcher
matcher = Matcher.from_pretrained("crlite", site="a9_dataset_r02_s01")
Pin a release tag or commit when reproducibility matters:
matcher = Matcher.from_pretrained(
    "crlite",
    site="a9_dataset_r02_s01",
    revision="v0.3.0",
)
Download Weights Ahead Of Time¶
For machines that should not download at runtime, fetch the weights in advance:
xcalib pull-weights --model crlite --site a9_dataset_r02_s01 --out checkpoints/
Then load the local files:
matcher = Matcher.from_pretrained(
    "crlite",
    weights="checkpoints/crlite_a9_dataset_r02_s01_best.pth",
    config="checkpoints/crlite_a9_dataset_r02_s01.yaml",
)
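On an offline machine it can help to confirm that both files are present before constructing the matcher. The following is a minimal sketch, not part of xcalib: the helper name `local_checkpoint_paths` is hypothetical, and the `<model>_<site>_best.pth` / `<model>_<site>.yaml` naming pattern is an assumption inferred from the file names in the example above.

```python
from pathlib import Path


def local_checkpoint_paths(out_dir, model, site):
    """Return (weights, config) paths for a pre-fetched checkpoint.

    Assumption: files follow the naming seen in the example above,
    i.e. <model>_<site>_best.pth and <model>_<site>.yaml.
    This helper is illustrative and not part of xcalib.
    """
    root = Path(out_dir)
    weights = root / f"{model}_{site}_best.pth"
    config = root / f"{model}_{site}.yaml"

    # Fail early with a clear message instead of at model-load time.
    missing = [str(p) for p in (weights, config) if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing checkpoint files: " + ", ".join(missing))
    return weights, config
```

The returned paths can then be passed as `weights=str(weights)` and `config=str(config)` to `Matcher.from_pretrained`, as in the example above.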
Load The A9 Dataset Cache¶
load_dataset() first checks for a local cache and then falls back to released
Hub artifacts:
from xcalib import load_dataset
loader = load_dataset("a9_dataset_r02_s01", split="test")
You can also pre-fetch a split:
xcalib pull-dataset --site a9_dataset_r02_s01 --split test --out datasets/
Local Custom Weights¶
Custom or fine-tuned weights should be loaded from local paths with the matching YAML config:
matcher = Matcher.from_pretrained(
    "crlite",
    weights="runs/site42/best.pth",
    config="runs/site42/crlite_site42.yaml",
)
If the config changes model dimensions, the checkpoint must have been trained with that same config.
Integrity¶
For released artifacts, compare downloaded files against the checksums listed in the model or dataset card.
PowerShell:
Get-FileHash .\checkpoints\crlite_a9_dataset_r02_s01_best.pth -Algorithm SHA256
Linux/macOS:
sha256sum checkpoints/crlite_a9_dataset_r02_s01_best.pth
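The same check can be done from Python, which is convenient in deployment scripts that run on more than one platform. This is a minimal sketch using only the standard library. The helper names `sha256_of_file` and `matches_checksum` are illustrative and not part of xcalib, and the expected digest must be copied from the model or dataset card.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with a published checksum, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example usage (the digest string is a placeholder to be taken from the card):
# ok = matches_checksum("checkpoints/crlite_a9_dataset_r02_s01_best.pth", "<sha256 from card>")
```

Normalizing case matters because `Get-FileHash` prints uppercase hex while `sha256sum` prints lowercase.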
Dataset Terms¶
The A9 HDF5 caches are derived from the TUM Traffic / A9 dataset. Users remain responsible for complying with the upstream dataset's terms and citation requirements.