Find where people contradict themselves.
Point it at a subject. It discovers the videos, reads the transcripts, and surfaces the moments where one person says two things that cannot both be true.
A pipeline that turns a question into evidence.
Six stages, each doing one job. The expensive work never runs on a video that does not earn it.
Discover
Seed queries become a deduplicated list of candidate videos.
Gate
An LLM judges each video against the subject from metadata alone, so noise never costs a transcript.
Transcribe
Captions are pulled through a residential proxy and stored with per-word timing.
Analyze
Every check-worthy claim is extracted with speaker, stance, quote, time, and a strength tier.
Contradict
A stronger model matches claims across the corpus and judges which pairs genuinely conflict.
Surface
Confirmed contradictions land in one place, each with its quotes and timestamps.
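The first two stages are where the cost is saved, so they are worth sketching. This is a minimal illustration, not the actual implementation: the `Video` record, `discover`, `gate`, and the keyword-based `keyword_judge` are hypothetical names, and the keyword check only stands in for the LLM judgment described above.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Video:
    # Hypothetical metadata record; real metadata would carry more fields.
    video_id: str
    title: str


def discover(results_per_query: Iterable[list[Video]]) -> list[Video]:
    """Merge per-query search results into one list, keeping first occurrences."""
    seen: set[str] = set()
    unique: list[Video] = []
    for results in results_per_query:
        for video in results:
            if video.video_id not in seen:
                seen.add(video.video_id)
                unique.append(video)
    return unique


def gate(videos: list[Video], is_relevant: Callable[[Video], bool]) -> list[Video]:
    """Keep only videos the judge accepts, using metadata alone."""
    return [v for v in videos if is_relevant(v)]


def keyword_judge(subject: str) -> Callable[[Video], bool]:
    # Stand-in for the LLM judge (assumption): a plain title keyword match.
    def judge(video: Video) -> bool:
        return subject.lower() in video.title.lower()
    return judge


# Made-up search results for two seed queries; "a1" appears in both.
results = [
    [Video("a1", "Budget debate highlights"), Video("b2", "Cooking pasta")],
    [Video("a1", "Budget debate highlights"), Video("c3", "Budget Q&A session")],
]

candidates = discover(results)
to_transcribe = gate(candidates, keyword_judge("budget"))
print([v.video_id for v in to_transcribe])
```

Only the videos that survive `gate` would go on to transcription, which is the point of running the cheap metadata check first.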
Two planes, one job each.
The brain runs on Cloudflare. The muscle runs on a server with a residential IP, because YouTube hands datacenter addresses a login wall on transcript requests.
Orchestration and discovery
- A durable, event-driven workflow drives every run
- Discovery runs natively on Cloudflare IPs
- D1 holds the workspaces, claims, and contradictions
The proxy-bound work
- Transcription through a cellular proxy, one pull at a time
- Claim extraction on a fast local model
- Contradiction matching on a stronger one
The server never holds a long request open. The workflow asks it to start a stage, then waits on a callback. Nothing polls.
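The start-then-callback handshake can be sketched in a single process with `asyncio`. This is an illustration under assumptions, not the real system: the `Server` and `Workflow` classes, `start_stage`, `on_callback`, and the job ID format are all hypothetical, and in production the two sides would talk over HTTP rather than direct method calls.

```python
import asyncio


class Server:
    """Hypothetical stand-in for the proxy-bound server."""

    def __init__(self):
        self._tasks = set()

    async def start_stage(self, job_id, payload, callback):
        # Schedule the work and acknowledge immediately; no request stays open.
        task = asyncio.create_task(self._work(job_id, payload, callback))
        self._tasks.add(task)  # keep a reference so the task is not collected
        task.add_done_callback(self._tasks.discard)
        return {"accepted": job_id}

    async def _work(self, job_id, payload, callback):
        await asyncio.sleep(0.01)  # placeholder for the slow stage
        callback(job_id, {"stage": payload["stage"], "status": "done"})


class Workflow:
    """Hypothetical stand-in for the orchestrating workflow."""

    def __init__(self, server):
        self._server = server
        self._waiting = {}

    def on_callback(self, job_id, result):
        # The server reports completion; wake whoever is waiting on this job.
        self._waiting.pop(job_id).set_result(result)

    async def run_stage(self, stage):
        job_id = f"{stage}-1"
        future = asyncio.get_running_loop().create_future()
        self._waiting[job_id] = future
        ack = await self._server.start_stage(job_id, {"stage": stage}, self.on_callback)
        assert ack["accepted"] == job_id
        return await future  # suspended until the callback fires; no polling loop


async def main():
    return await Workflow(Server()).run_stage("transcribe")


result = asyncio.run(main())
print(result)
```

The workflow holds nothing open while it waits: the start call returns as soon as the job is accepted, and the result arrives later through `on_callback`.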
The matching is the product.
Older tools paired claims by matching speaker names, which breaks the moment the same person appears under two spellings. Here a model matches the claims directly. It resolves identity itself, so Araghchi, Araqchi, and the Arabic spelling are treated as one person, and it judges the conflict in the same pass.
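A short sketch shows why name-keyed pairing fails. The claim records and stance values below are invented placeholders, and `pairs_by_name` and `pairs_for_model` are hypothetical helpers. How the real system selects candidate pairs for the model is not stated here; comparing every pair is only an assumption for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Invented placeholder claims: same person, two transliterations of the name.
claims = [
    {"id": 1, "speaker": "Araghchi", "stance": "X"},
    {"id": 2, "speaker": "Araqchi", "stance": "not X"},
]


def pairs_by_name(claims):
    """Pair claims only when the speaker strings match exactly."""
    buckets = defaultdict(list)
    for claim in claims:
        buckets[claim["speaker"]].append(claim)
    return [
        (a["id"], b["id"])
        for group in buckets.values()
        for a, b in combinations(group, 2)
    ]


def pairs_for_model(claims):
    """Offer every pair to the judge, leaving identity resolution to the model."""
    return [(a["id"], b["id"]) for a, b in combinations(claims, 2)]


print(pairs_by_name(claims))
print(pairs_for_model(claims))
```

Exact-name bucketing never puts the two claims side by side, so the conflict cannot be found. Handing the pair to a model leaves both the identity question and the conflict question to one judgment.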
Claim lists travel as TOON (Token-Oriented Object Notation), a compact tabular notation that declares the field names once, so long lists cost a fraction of the tokens JSON would spend repeating keys.
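The saving comes from writing the keys once instead of once per row. The encoder below is a simplified sketch modeled loosely on TOON's tabular form, not a conforming implementation: `to_tabular` is a hypothetical helper, it assumes values contain no commas or newlines, and it compares character counts as a rough proxy for tokens. The claim rows are made up.

```python
import json


def to_tabular(name, rows):
    """Encode uniform dicts as one header line plus one comma-separated row each."""
    fields = list(rows[0].keys())
    header = f"{name}[{len(rows)}]{{{','.join(fields)}}}:"
    lines = [header]
    for row in rows:
        # Assumption: values need no quoting or escaping.
        lines.append("  " + ",".join(str(row[field]) for field in fields))
    return "\n".join(lines)


# Made-up claim rows for illustration only.
claims = [
    {"speaker": "A", "stance": "for", "time": 12.5},
    {"speaker": "A", "stance": "against", "time": 98.0},
    {"speaker": "B", "stance": "for", "time": 40.2},
]

tabular = to_tabular("claims", claims)
as_json = json.dumps(claims)

print(tabular)
print(len(tabular), "characters vs", len(as_json), "for JSON")
```

The gap widens with every added row, because each JSON object repeats all of its keys while the tabular form adds only values.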