comma video compression challenge

./videos/0.mkv is a 1 minute 37.5 MB dashcam video. Make it as small as possible while preserving semantic content and temporal dynamics.

  • semantic content distortion is measured using:
    • a SegNet: average class disagreements between the predictions of a SegNet evaluated on original vs. reconstructed frames
  • temporal dynamics distortion is measured using:
    • a PoseNet: MSE between the outputs of a PoseNet evaluated on pairs of consecutive frames from the original vs. the reconstructed video
  • the compression rate is:
    • the size of the compressed archive divided by the size of the original archive
  • the final score is computed as (lower is better):
    • score = 100 * segnet_distortion + 25 * rate + √ (10 * posenet_distortion)
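To get a feel for how the three terms trade off, the formula can be written as a small function. This is a minimal sketch, not the repo's evaluation code: the name `challenge_score` and its argument names are hypothetical, and only the formula itself comes from the text above.

```python
import math


def challenge_score(segnet_distortion: float, posenet_distortion: float, rate: float) -> float:
    """Score as stated above (lower is better). Function name is illustrative only."""
    return 100 * segnet_distortion + 25 * rate + math.sqrt(10 * posenet_distortion)


# Sending the original untouched: no distortion, rate of 1.
no_compress = challenge_score(segnet_distortion=0.0, posenet_distortion=0.0, rate=1.0)
print(no_compress)  # 25.0
```

The PoseNet term sits under a square root, so halving an already small PoseNet distortion buys less than the same relative cut in SegNet distortion or rate.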

prize pool - submit by May 3rd, 2026 11:59pm AoE

The challenge is still open for submissions! Submit to get on the leaderboard, apply for a job/internship, or just for fun! See submission format and rules

Congratulations to the competition winners! See leaderboard for more submissions.

quickstart

Clone the repo

git clone https://github.com/commaai/comma_video_compression_challenge.git && cd comma_video_compression_challenge

Install dependencies

sudo apt-get update && sudo apt-get install -y git-lfs ffmpeg  # Linux
brew install git-lfs ffmpeg                                    # (or) macOS (with Homebrew)
git lfs install && git lfs pull
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync --group cpu                                            # cpu|cu126|cu128|cu130|mps
source .venv/bin/activate

Test Dataloaders and Models

python frame_utils.py
python modules.py

Create a submission dir and copy the baseline_fast scripts

mkdir -p submissions/my_submission
cp submissions/baseline_fast/{compress.sh,inflate.{sh,py}} submissions/my_submission/

Compress

bash submissions/my_submission/compress.sh

Evaluate

bash evaluate.sh --submission-dir ./submissions/my_submission --device cpu  # cpu|cuda|mps

If everything worked as expected, this should produce a report.txt file with this content:

=== Evaluation config ===
  batch_size: 16
  device: cpu
  num_threads: 2
  prefetch_queue_depth: 4
  report: submissions/baseline_fast/report.txt
  seed: 1234
  submission_dir: submissions/baseline_fast
  uncompressed_dir: /home/batman/comma_video_compression_challenge/videos
  video_names_file: /home/batman/comma_video_compression_challenge/public_test_video_names.txt
=== Evaluation results over 600 samples ===
  Average PoseNet Distortion: 0.38042614
  Average SegNet Distortion: 0.00946623
  Submission file size: 2,244,900 bytes
  Original uncompressed size: 37,545,489 bytes
  Compression Rate: 0.05979147
  Final score: 100*segnet_dist + √(10*posenet_dist) + 25*rate = 4.39
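As a sanity check, the rate and final score can be recomputed from the other lines of the report. The sketch below assumes the "key: value" line layout shown above; the helper names `parse_report` and `to_number` are hypothetical and not part of the repo.

```python
import math

# Lines copied from the sample report above.
REPORT = """\
  Average PoseNet Distortion: 0.38042614
  Average SegNet Distortion: 0.00946623
  Submission file size: 2,244,900 bytes
  Original uncompressed size: 37,545,489 bytes
"""


def parse_report(text: str) -> dict:
    """Split each 'key: value' line into a dict entry (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep:
            fields[key] = value
    return fields


def to_number(value: str) -> float:
    """Take the first token and drop thousands separators, e.g. '2,244,900 bytes'."""
    return float(value.split()[0].replace(",", ""))


fields = parse_report(REPORT)
posenet = to_number(fields["Average PoseNet Distortion"])
segnet = to_number(fields["Average SegNet Distortion"])
rate = to_number(fields["Submission file size"]) / to_number(fields["Original uncompressed size"])
score = 100 * segnet + math.sqrt(10 * posenet) + 25 * rate
print(f"rate={rate:.8f} score={score:.2f}")
```

For the baseline numbers this reproduces the reported compression rate and the final score of 4.39.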

submission format and rules

A submission is a Pull Request to this repo that includes:

  • a download link to archive.zip — your compressed data.
  • inflate.sh — a bash script that converts the extracted archive/ into raw video frames.
  • optional: a compression script that produces archive.zip from the original videos, and any other assets you want to include (code, models, etc.)

See submissions/baseline_fast/ for a working example, and ./evaluate.sh for how the evaluation process works.

Open a Pull Request with your submission and follow the template instructions to be evaluated.

evaluation

bash evaluate.sh --submission-dir ./submissions/baseline_fast --device cpu  # cpu|cuda|mps

The official evaluation has a time limit of 30 minutes. Pick your runtime: GitHub's "linux-nvidia-t4" GPU instance (RAM: 26GB, VRAM: 16GB) or GitHub's "ubuntu-latest" CPU instance (CPU: 4, RAM: 16GB).

rules

  • External libraries and tools can be used and won't count towards compressed size, unless they use large artifacts (neural networks, meshes, point clouds, etc.), in which case those artifacts should be included in the archive and will count towards the compressed size. This rule also applies to the PoseNet and SegNet.
  • You can use anything for compression, including the models, original uncompressed video, and any other assets you want to include.
  • Submissions are done via public Pull Requests. You may include your compression script in the submission, but it's not required.
  • Final ranking will be based on the public leaderboard; no private testing will be performed.

leaderboard (lower is better)

score name link
0.192 hnerv_fec6_fixed_huffman_k16 #110
0.193 hnerv_ft_microcodec 👑 #101
0.195 hnerv_lc_ac 👑 #103
0.195 hnerv_lc_v2_scale095_rplus1 👑 #102
0.195 hnerv_lc_v2 💡 #100
0.197 hnerv_muon_finetuned_from_pr95 #98
0.198 kitchen_sink #105
0.199 hnerv_muon 💡 📖 #95
0.206 rem2_HNeRV #96
0.209 belt_and_suspenders #106
0.229 vibe_coder_final_boss #97
0.229 apogee #107
0.231 qhnerv_ft_best #104
0.249 hpac_coder_hybrid #91
0.258 adaptive_masking_joint_frame_model #85
0.260 qzs3_range_joint_r258 #92
0.274 jas0xf_adversarial_neural_representation 💡 #86
0.275 adaptive_range_mask #84
0.280 qrepro 💡 #90
0.288 qzs3_range_mask 💡 #81
0.315 qpose14_r55_segactions_minp #79
0.315 qzs3_tile_delta_r147 #77
0.316 qpose14_qzs3_filmq9g_slsb1_r55 #67
0.320 henosis_qz_n3z_r25_clean #65
0.321 flatpup #93
0.325 qpose14 #63
0.331 unified_brotli #64
0.333 quantizr 💡 #55
0.344 qpose14_poseq6 #76
0.368 ph4ntom_drv #74
0.375 fp4_mask_gen #62
0.382 selfcomp #56
0.602 mask2mask 💡 #53
0.717 tomasdousek #71
1.236 codex_metric_yshift_av1 💡 #60
1.891 neural_inflate 💡 #49
1.914 svtav1_dilated_ren #58
1.944 roi_v2 #48
1.947 av1_roi_lanczos_unsharp #31
1.979 svtav1_av1grain_10bit #51
1.981 damir_bearclaw_002 💡 #30
2.005 roi_gop300_c34 #43
2.020 v4_qp_aq2_roi 💡 #44
2.033 av1_crf31_bicubic #52
2.052 svtav1_cheetah #24
2.070 svtav1_45pct_unsharp20_direct #27
2.083 svtav1_gop360_binomial_unsharp #26
2.083 av1_sharp1_adaptive #23
2.086 svtav1_45pct_unsharp 💡 #20
2.158 svtav1_spline_fg22 #37
2.200 svt_av1_lanczos_fg #18
2.553 h265_g16_512x384_veryslow #21
3.323 optimized #22
3.833 delta_codec 💡 #61
4.390 baseline_fast #1
5.086 damir_bearclaw_003 #39
25.000 no_compress #0

mirrored from comma.ai/leaderboard

going further

Check out this large grid search over various ffmpeg parameters. Each point in the figure corresponds to an ffmpeg setting. The fastest encoder setting was submitted as baseline_fast. You can inspect the grid search here and look for patterns.

You can also use test_videos.zip, which is a 2.4 GB archive of 64 driving videos from the comma2k19 dataset, to test your compression strategy on more samples.

The evaluation script and the dataloader are designed to be scalable and can handle different batch sizes, sequence lengths, and video resolutions. You can modify them to fit your needs.

community write-ups and forks
