GitHub - wheevu/focus-lock-rs: App for generating tracked vertical fancams from landscape videos.

Automated fancam generator. It takes a standard landscape video and a reference photo of a person (say, your bias), tracks them, and generates a stabilized, vertical (9:16) cropped video locked onto them.

It features a modular Rust core for high-speed video processing, a CLI for batch operations, and a modern Tauri v2 desktop application for easy usage.

Features

Architecture

This project is organized as a Cargo workspace:

Logic Flow

Offline two-pass (default render path)

  1. Decode pass: decode source frames with FFmpeg
  2. Detect + score identities and build short-term tracklets
  3. Solve globally across tracklets to stabilize identity assignment
  4. Plan camera path from the selected solved identity observations
  5. Render pass writes stabilized vertical H.264 output
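
Step 2 of the offline path can be illustrated with a small sketch. It assumes greedy frame-to-frame linking by box overlap (IoU); the README does not say how the project actually associates detections, and `BBox`, `Tracklet`, `build_tracklets`, and `min_iou` are hypothetical names used only for illustration.

```rust
/// Axis-aligned detection box in pixel coordinates (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct BBox {
    pub x: f32,
    pub y: f32,
    pub w: f32,
    pub h: f32,
}

/// Intersection-over-union of two boxes, in the range [0, 1].
pub fn iou(a: &BBox, b: &BBox) -> f32 {
    let x1 = a.x.max(b.x);
    let y1 = a.y.max(b.y);
    let x2 = (a.x + a.w).min(b.x + b.w);
    let y2 = (a.y + a.h).min(b.y + b.h);
    let inter = (x2 - x1).max(0.0) * (y2 - y1).max(0.0);
    let union = a.w * a.h + b.w * b.h - inter;
    if union > 0.0 { inter / union } else { 0.0 }
}

/// A short run of boxes believed to belong to one person.
#[derive(Debug, Default)]
pub struct Tracklet {
    /// (frame index, box) pairs in frame order.
    pub observations: Vec<(usize, BBox)>,
}

/// Greedily link per-frame detections into tracklets (assumed strategy).
/// A detection extends the best-overlapping tracklet that ended on the
/// previous frame; otherwise it starts a new tracklet.
pub fn build_tracklets(frames: &[Vec<BBox>], min_iou: f32) -> Vec<Tracklet> {
    let mut tracklets: Vec<Tracklet> = Vec::new();
    for (frame_idx, detections) in frames.iter().enumerate() {
        // Marks tracklets that already received a detection in this frame.
        let mut claimed = vec![false; tracklets.len()];
        for det in detections {
            let mut best: Option<(usize, f32)> = None;
            for (i, t) in tracklets.iter().enumerate() {
                if claimed[i] {
                    continue;
                }
                let (last_frame, last_box) = *t.observations.last().unwrap();
                if last_frame + 1 != frame_idx {
                    continue; // only link across consecutive frames
                }
                let score = iou(&last_box, det);
                if score >= min_iou && best.map_or(true, |(_, s)| score > s) {
                    best = Some((i, score));
                }
            }
            match best {
                Some((i, _)) => {
                    tracklets[i].observations.push((frame_idx, *det));
                    claimed[i] = true;
                }
                None => {
                    tracklets.push(Tracklet { observations: vec![(frame_idx, *det)] });
                    claimed.push(true);
                }
            }
        }
    }
    tracklets
}
```

A tracklet that loses its subject for a frame simply ends here. Reconnecting such fragments under one identity is what the global solve in step 3 is for.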

Online fallback (legacy-compatible path)

  1. Decode video frames with FFmpeg
  2. Detect people with YOLOv8
  3. Match the target identity with ArcFace
  4. Track motion across frames with Kalman smoothing
  5. Render a stabilized vertical crop to H.264
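
Step 3 is commonly implemented by comparing face embeddings with cosine similarity. The sketch below assumes that approach and assumes the CLI's `--threshold` value is a minimum similarity; neither is confirmed by this README, and `cosine_similarity` and `best_match` are hypothetical helper names.

```rust
/// Cosine similarity between two embedding vectors.
/// Returns None for mismatched lengths, empty input, or a zero-norm vector.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Index of the candidate embedding most similar to the reference,
/// provided its similarity reaches `threshold` (assumed semantics).
pub fn best_match(reference: &[f32], candidates: &[Vec<f32>], threshold: f32) -> Option<usize> {
    candidates
        .iter()
        .enumerate()
        .filter_map(|(i, c)| cosine_similarity(reference, c).map(|s| (i, s)))
        .filter(|&(_, s)| s >= threshold)
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}
```

Returning `None` when nobody clears the threshold lets the caller treat the frame as "target not visible" instead of locking onto the nearest stranger.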

Prerequisites

Setup

git clone https://github.com/wheevu/focus-lock-rs.git
cd focus-lock-rs
cargo build --release -p cli

Create a models/ directory in the project root and add:

For macOS, ensure a CoreML-enabled libonnxruntime.dylib is available at models/onnxruntime/lib/ (or set ORT_DYLIB_PATH).
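
The lookup order described above can be sketched as follows. This is an illustration only, not the project's code: `resolve_ort_dylib` and `resolve_ort_dylib_from_env` are hypothetical names, and it assumes an empty `ORT_DYLIB_PATH` should be treated as unset.

```rust
use std::env;
use std::ffi::OsString;
use std::path::{Path, PathBuf};

/// Pick the ONNX Runtime library path: an explicit override wins,
/// otherwise fall back to models/onnxruntime/lib/ under the project root.
pub fn resolve_ort_dylib(project_root: &Path, env_override: Option<OsString>) -> PathBuf {
    match env_override {
        // Assumption: an empty variable is treated the same as an unset one.
        Some(path) if !path.is_empty() => PathBuf::from(path),
        _ => project_root
            .join("models")
            .join("onnxruntime")
            .join("lib")
            .join("libonnxruntime.dylib"),
    }
}

/// Convenience wrapper that reads ORT_DYLIB_PATH from the environment.
pub fn resolve_ort_dylib_from_env(project_root: &Path) -> PathBuf {
    resolve_ort_dylib(project_root, env::var_os("ORT_DYLIB_PATH"))
}
```

Passing the override in as a parameter keeps the lookup testable without mutating the process environment.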

Desktop Application (GUI)

cd ui
npm install
npm run tauri:dev

For a production build:

CLI

Generate a fancam from a landscape video and reference image:

cargo run --release -p cli -- fancam \
  --video "/path/to/concert.mp4" \
  --bias "/path/to/face_photo.jpg" \
  --output "output_fancam.mp4" \
  --yolo-model "models/yolov8n.onnx" \
  --face-model "models/w600k_mbf.onnx" \
  --threshold 0.6

Note: render runs a mandatory offline prepass (tracklet build + global solve) before final encode, so startup may be slower on long videos.

License

MIT

Tracking, performance, and GUI details

Tracking and identity locking

The pipeline combines person detection and face recognition to keep the crop locked onto a specific subject rather than just the most visible person in frame.

This makes the tracker more stable in crowded performance footage where multiple people may appear and disappear across frames.

Identity discovery pass (GUI)

The desktop app includes a pre-tracking discovery flow designed to make target selection more reliable before rendering begins.

The Tauri backend persists scan sessions and validates review state server-side before allowing a render to begin.

Scan session lifecycle

To make the review and render flow more robust, scan sessions track explicit lifecycle states:

Audit events are recorded through the session lifecycle, and run_fancam enforces that a validated session and selected identity match exist on the backend side, not just in the UI.
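
The backend-side check can be pictured with a small sketch. The lifecycle state names are not listed in this README, so `Scanning`, `AwaitingReview`, and `Validated` are placeholders, and `ScanSession`, `RenderError`, and `authorize_render` are likewise hypothetical names rather than the project's API.

```rust
/// Placeholder lifecycle states (the project's real states are not shown here).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionState {
    Scanning,
    AwaitingReview,
    Validated,
}

/// Reasons a render request is rejected.
#[derive(Debug, PartialEq, Eq)]
pub enum RenderError {
    NotValidated(SessionState),
    NoIdentitySelected,
    IdentityMismatch { selected: u32, requested: u32 },
}

/// Server-side view of one scan session (hypothetical shape).
#[derive(Debug)]
pub struct ScanSession {
    pub state: SessionState,
    pub selected_identity: Option<u32>,
    pub audit_log: Vec<String>,
}

impl ScanSession {
    /// Allow a render only for a validated session whose selected identity
    /// matches the requested one. Every attempt is written to the audit log.
    pub fn authorize_render(&mut self, requested: u32) -> Result<(), RenderError> {
        let result = match (self.state, self.selected_identity) {
            (SessionState::Validated, Some(selected)) if selected == requested => Ok(()),
            (SessionState::Validated, Some(selected)) => {
                Err(RenderError::IdentityMismatch { selected, requested })
            }
            (SessionState::Validated, None) => Err(RenderError::NoIdentitySelected),
            (state, _) => Err(RenderError::NotValidated(state)),
        };
        self.audit_log
            .push(format!("render_request identity={requested} result={result:?}"));
        result
    }
}
```

Because the check lives with the session data, a UI bug or a hand-crafted request cannot start a render for an identity that was never reviewed.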

The GUI also supports manual split requests per identity, with a split-rescan path that refreshes candidate clustering when the initial grouping is not clean enough.

Smoothing and motion stability

To avoid shaky or jumpy crops, the render path uses a 2D Kalman filter to smooth subject motion across frames.

This helps with:

If the subject becomes occluded, the filter predicts the next likely position based on previous motion until visual confirmation is regained.
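
The predict-then-correct behavior can be sketched with a constant-velocity Kalman filter. This is a minimal illustration under stated assumptions, not the project's filter: the x and y axes are filtered independently, the time step is one frame, process noise is added only to the covariance diagonal, and `AxisFilter`, `CenterFilter`, `q`, and `r` are illustrative names and parameters.

```rust
/// One-axis constant-velocity Kalman filter (state: position, velocity).
#[derive(Debug, Clone, Copy)]
pub struct AxisFilter {
    pos: f32,
    vel: f32,
    // Symmetric 2x2 covariance stored as (p00, p01, p11).
    p00: f32,
    p01: f32,
    p11: f32,
    q: f32, // process noise (assumed value, simplified diagonal form)
    r: f32, // measurement noise (assumed value)
}

impl AxisFilter {
    pub fn new(initial_pos: f32, q: f32, r: f32) -> Self {
        Self { pos: initial_pos, vel: 0.0, p00: 1.0, p01: 0.0, p11: 1.0, q, r }
    }

    /// Advance one frame: pos += vel, covariance P = F P F^T + Q.
    pub fn predict(&mut self) {
        self.pos += self.vel;
        let p00 = self.p00 + 2.0 * self.p01 + self.p11 + self.q;
        let p01 = self.p01 + self.p11;
        let p11 = self.p11 + self.q;
        self.p00 = p00;
        self.p01 = p01;
        self.p11 = p11;
    }

    /// Correct the state with a measured position z.
    pub fn update(&mut self, z: f32) {
        let s = self.p00 + self.r; // innovation variance
        let k0 = self.p00 / s; // gain for position
        let k1 = self.p01 / s; // gain for velocity
        let innovation = z - self.pos;
        self.pos += k0 * innovation;
        self.vel += k1 * innovation;
        let (p00, p01, p11) = (self.p00, self.p01, self.p11);
        self.p00 = (1.0 - k0) * p00;
        self.p01 = (1.0 - k0) * p01;
        self.p11 = p11 - k1 * p01;
    }
}

/// Smooths a 2D crop center using two independent axis filters.
pub struct CenterFilter {
    x: AxisFilter,
    y: AxisFilter,
}

impl CenterFilter {
    pub fn new(cx: f32, cy: f32, q: f32, r: f32) -> Self {
        Self { x: AxisFilter::new(cx, q, r), y: AxisFilter::new(cy, q, r) }
    }

    /// Process one frame. Pass None while the subject is occluded:
    /// the filter then coasts on its predicted motion.
    pub fn step(&mut self, measurement: Option<(f32, f32)>) -> (f32, f32) {
        self.x.predict();
        self.y.predict();
        if let Some((mx, my)) = measurement {
            self.x.update(mx);
            self.y.update(my);
        }
        (self.x.pos, self.y.pos)
    }
}
```

Calling `step(None)` during an occlusion keeps the crop moving at the last estimated velocity. A larger `r` relative to `q` gives a smoother but slower-reacting crop.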

Performance pipeline

The online processing path is built around a 3-thread decode / inference / encode pipeline with bounded channels.
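
The shape of such a pipeline can be sketched with the standard library's bounded `sync_channel`. The stages below are stand-ins (integers for frames, a dummy computation for inference, strings for encoded output), and `run_pipeline` and `capacity` are hypothetical names; the project may well use a different channel implementation.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Run a decode -> inference -> encode pipeline on three threads.
/// `capacity` bounds each channel, so a fast stage blocks instead of
/// buffering an unbounded number of frames in memory.
pub fn run_pipeline(frame_count: u32, capacity: usize) -> Vec<String> {
    let (decoded_tx, decoded_rx) = sync_channel::<u32>(capacity);
    let (inferred_tx, inferred_rx) = sync_channel::<(u32, f32)>(capacity);

    // Stage 1: "decode" frames (stand-in: emit frame indices).
    let decoder = thread::spawn(move || {
        for frame in 0..frame_count {
            if decoded_tx.send(frame).is_err() {
                break; // downstream stage has shut down
            }
        }
    });

    // Stage 2: "inference" (stand-in: derive a fake crop center per frame).
    let inference = thread::spawn(move || {
        for frame in decoded_rx {
            let crop_center = frame as f32 * 0.5;
            if inferred_tx.send((frame, crop_center)).is_err() {
                break;
            }
        }
    });

    // Stage 3: "encode" (stand-in: format one line per frame).
    let encoder = thread::spawn(move || {
        inferred_rx
            .iter()
            .map(|(frame, center)| format!("frame {frame} @ {center:.1}"))
            .collect::<Vec<_>>()
    });

    decoder.join().expect("decode thread panicked");
    inference.join().expect("inference thread panicked");
    encoder.join().expect("encode thread panicked")
}
```

Dropping a sender closes its channel, so each stage finishes once the stage before it has finished, and the pipeline drains in order without any explicit shutdown signal.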

The offline render path adds a pre-render pass that builds tracklets and a solved camera plan before the final encode pass.

Performance-oriented behavior includes:

These optimizations are aimed at keeping the pipeline responsive and practical for longer videos without turning the whole thing into a heater-core cosplay.

Rendering behavior

Rendering is optimized for vertical fancam output while remaining resilient when tracking quality changes.

This keeps output usable even when the tracker cannot confidently maintain a tight crop for every frame.
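
One ingredient of a robust vertical crop is keeping the 9:16 window inside the source frame however far the tracked center drifts. The sketch below shows that geometry only. It assumes a full-height crop and rounds the width down to an even number, since 4:2:0 H.264 encoding generally expects even dimensions; `CropRect` and `vertical_crop` are hypothetical names.

```rust
/// Crop window in source-frame pixels.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct CropRect {
    pub x: u32,
    pub y: u32,
    pub width: u32,
    pub height: u32,
}

/// Full-height 9:16 crop centered horizontally on `center_x`,
/// clamped so the window never leaves the frame.
pub fn vertical_crop(frame_w: u32, frame_h: u32, center_x: f32) -> CropRect {
    // Width for a 9:16 aspect ratio at full frame height, rounded down to even.
    let mut width = (frame_h as u64 * 9 / 16) as u32;
    width -= width % 2;
    let width = width.min(frame_w);

    // Left edge that centers the window, clamped to the valid range.
    let max_x = frame_w - width;
    let half = width as f32 / 2.0;
    let x = (center_x - half).clamp(0.0, max_x as f32).round() as u32;

    CropRect { x, y: 0, width, height: frame_h }
}
```

For a 1920x1080 source this yields a 606x1080 window. A center near either edge pins the window to that edge instead of producing an out-of-bounds crop.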

Interfaces

The project supports two main usage paths:

Desktop app

The Tauri desktop application is intended for interactive use:

CLI

The CLI is better suited for:

Cross-platform scope

The project is designed to run across Windows, macOS, and Linux, with a shared Rust processing core and a Tauri-based desktop frontend.