AI Hiring Verification Platform
How we built EvidenceHire: a 7-stage LLM pipeline that cross-checks every candidate claim against real evidence and runs live AI video interviews
Project Overview
Product Type
B2B SaaS Platform
Category
Hiring Verification & Assessment
AI Models
GPT-5 + Realtime Video
Runtime
Cloudflare Workers (Edge)
The Challenge
Hiring runs on self-reported claims. Résumés and profiles are easy to inflate and, increasingly, padded with AI, while the tools meant to catch this are siloed, gameable, and disconnected from the candidate's real background:
- AI tools let candidates generate polished résumés that pass keyword filters but have zero substance
- Traditional ATS and job boards surface self-reported profiles with no way to tell who is telling the truth
- Assessment platforms use static, gameable question banks disconnected from the candidate's actual background
- The cost of a bad hire runs 30–50% of annual salary, yet HR teams face 100+ applicants per role
- Agencies re-screen the same talent pool against many client roles, paying for redundant analysis each time
- Hiring decisions must be fair and defensible, uninfluenced by gender, race, age, or other protected attributes
Recruiters needed a second layer of verification: one that checks every claim against real evidence, proves skills directly, and produces defensible, auditable decisions at volume.
The Verification Pipeline
We built a durable, cache-aware Cloudflare Workflow where specialized LLM stages transform raw candidate evidence into a verified, cited, ranked fit report, with a skill-assessment engine feeding results back in:
1. Ingest & Normalize
LinkedIn is scraped, linked sources (GitHub, portfolio, Dribbble, Behance, Upwork, publications) are fetched, and the résumé is parsed. The LLM produces a structured profile where every field cites its exact source excerpt.
2. Skill Evidence Check
Each claimed skill is graded individually against all sources (corroborated, weakly corroborated, claim-only, or contradicted) with depth, recency, confidence, and verbatim evidence pointers.
3. Leadership, Relocation & Flight Risk
Parallel stages estimate team size and P&L ownership, location and willingness compatibility, and a 0–100 retention likelihood with drivers and counterforces.
4. Fit Assessment & Recruiter Intel
A weighted composite score against the job description's must-haves, plus recruiter intel: outreach templates, interview prep, and compensation estimates. Verified assessment results feed back in to raise the score.
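The grading and scoring stages above can be sketched in a few lines. This is a minimal illustration, not EvidenceHire's actual schema: the type names, the per-verdict credit values, and the `compositeFit` function are all assumptions made for the example. It shows one plausible way a per-skill verdict with evidence pointers could roll up into a weighted 0–100 score against a job description's must-haves.

```typescript
// Hypothetical schema: field names, credits, and weights are illustrative assumptions.
type Verdict = "corroborated" | "weakly_corroborated" | "claim_only" | "contradicted";

interface EvidencePointer {
  source: string;  // e.g. "github", "linkedin", "resume"
  excerpt: string; // verbatim text the verdict rests on
}

interface SkillCheck {
  skill: string;
  verdict: Verdict;
  confidence: number; // 0..1, as reported by the grading stage
  evidence: EvidencePointer[];
}

interface MustHave {
  skill: string;
  weight: number; // relative importance in the job description
}

// Assumed credit per verdict; a real system would tune these.
const VERDICT_CREDIT: Record<Verdict, number> = {
  corroborated: 1,
  weakly_corroborated: 0.6,
  claim_only: 0.25,
  contradicted: 0,
};

// Returns a 0-100 composite: the weighted share of must-haves backed by evidence.
function compositeFit(mustHaves: MustHave[], checks: SkillCheck[]): number {
  const bySkill = new Map<string, SkillCheck>();
  for (const check of checks) bySkill.set(check.skill.toLowerCase(), check);

  let earned = 0;
  let total = 0;
  for (const need of mustHaves) {
    total += need.weight;
    const check = bySkill.get(need.skill.toLowerCase());
    if (check === undefined) continue; // never claimed: no credit
    earned += need.weight * VERDICT_CREDIT[check.verdict] * check.confidence;
  }
  return total === 0 ? 0 : Math.round((earned / total) * 100);
}
```

Under these assumed credits, a candidate with TypeScript (weight 2) corroborated and SQL (weight 1) claim-only, both at confidence 1, scores (2 × 1 + 1 × 0.25) / 3, which rounds to 75.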
The Assessment Engine
When a profile shows weak or unsubstantiated skills, EvidenceHire gives the candidate a chance to prove competence directly: no multiple choice, ever. Recruiters assign assessments individually or in bulk to everyone above a threshold, and candidates start from a one-time link with no login.
- Text: LLM-generated open scenario questions targeting the candidate's exact evidence gaps and the role's must-haves, graded against dynamic rubrics and checked for AI authorship.
- Live video: a conversational AI voice interviewer over WebRTC, with full recording and transcript, AI-authorship signals drawn from the transcript, and a vision-based real-person and liveness check.
- Feedback loop: passing an assessment promotes the skill to verified, which refreshes the fit score against every role the candidate is being considered for.
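The feedback loop in the last bullet could look roughly like the sketch below. Everything here is hypothetical: the `SkillStatus` values, the credit table, and the `applyAssessmentResult` helper are invented for illustration. The sketch also assumes a failed assessment leaves the skill's status unchanged, which the description above does not specify.

```typescript
// Hypothetical model of the feedback loop; names and credit values are assumptions.
type SkillStatus = "verified" | "corroborated" | "weakly_corroborated" | "claim_only" | "contradicted";

interface Role {
  id: string;
  mustHaves: { skill: string; weight: number }[];
}

interface Candidate {
  id: string;
  skills: Map<string, SkillStatus>;
  roleIds: string[]; // roles the candidate is being considered for
}

const STATUS_CREDIT: Record<SkillStatus, number> = {
  verified: 1,
  corroborated: 0.8,
  weakly_corroborated: 0.5,
  claim_only: 0.2,
  contradicted: 0,
};

function scoreForRole(candidate: Candidate, role: Role): number {
  let earned = 0;
  let total = 0;
  for (const need of role.mustHaves) {
    total += need.weight;
    const status = candidate.skills.get(need.skill);
    if (status !== undefined) earned += need.weight * STATUS_CREDIT[status];
  }
  return total === 0 ? 0 : Math.round((earned / total) * 100);
}

// Promotes the skill on a pass, then refreshes the score for every role in play.
function applyAssessmentResult(
  candidate: Candidate,
  skill: string,
  passed: boolean,
  roles: Role[],
): Map<string, number> {
  if (passed) candidate.skills.set(skill, "verified");
  const refreshed = new Map<string, number>();
  for (const role of roles) {
    if (candidate.roleIds.includes(role.id)) {
      refreshed.set(role.id, scoreForRole(candidate, role));
    }
  }
  return refreshed;
}
```

The returned map holds one refreshed score per role the candidate is attached to, so a single passed assessment updates every open consideration at once.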
Platform Capabilities
Evidence Verification
Core Differentiator
Every candidate claim is cross-checked against real, sourced evidence and cited to the exact excerpt. Unsupported claims are flagged, not silently ranked.
Text Assessment
Skill Proof
Per-candidate, LLM-generated open scenario questions targeting the exact evidence gaps in a profile. No multiple choice, graded against dynamic rubrics.
Live AI Video Interview
Deep Verification
A conversational AI voice interviewer over WebRTC, with full recording, transcript, AI-authorship signals, and a vision-based real-person / liveness check.
Fit Reports
Decision Support
Ranked, weighted fit scores against a specific job description, with skill-by-skill matching, flight-risk, and relocation fit, all auditable.
Recruiter Intel
Outreach & Prep
Outreach templates, opening hooks, interview questions, and compensation estimates generated from the candidate's own evidence.
Bulk & Branded
Agencies & Volume
CSV import, bulk assessment assignment above a threshold, spam-resistant public apply, and AI-summarized branded PDF shortlists.
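Bulk assignment above a threshold, combined with one-time links, can be sketched as follows. This is an illustrative assumption about the mechanics, not the platform's code: `assignInBulk`, `redeem`, and the in-memory `pending` map are hypothetical, and the token generator is injected so the example stays self-contained. On Cloudflare Workers, `crypto.randomUUID()` would be a natural choice for `makeToken`.

```typescript
// Hypothetical sketch: names and the in-memory store are assumptions for illustration.
interface ScoredCandidate {
  id: string;
  fitScore: number; // 0-100
}

// Issues one single-use token per candidate at or above the threshold.
function assignInBulk(
  candidates: ScoredCandidate[],
  threshold: number,
  makeToken: () => string,
  pending: Map<string, string>, // token -> candidate id
): string[] {
  const issued: string[] = [];
  for (const candidate of candidates) {
    if (candidate.fitScore < threshold) continue;
    const token = makeToken();
    pending.set(token, candidate.id);
    issued.push(token);
  }
  return issued;
}

// Resolves a token to its candidate exactly once; later attempts return undefined.
function redeem(token: string, pending: Map<string, string>): string | undefined {
  const candidateId = pending.get(token);
  pending.delete(token);
  return candidateId;
}
```

Deleting the token on first use is what makes the link one-time; a production store would also need expiry and durable storage rather than a process-local map.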
The Transformation
Before EvidenceHire
- Screening 500 résumés by hand with no way to verify a single claim
- AI-padded résumés passing keyword filters undetected
- Gameable multiple-choice tests disconnected from the candidate's background
- Re-interviewing the same talent for every new client role
- Subjective, hard-to-defend hiring decisions
After EvidenceHire
- Multi-source cross-checking flags any skill claimed but unsupported by GitHub, LinkedIn, portfolio, or publications
- Per-skill verbatim citations make every verdict auditable: no black-box scores
- Free-text and live AI video-interview assessments prove competence, with AI-authorship and real-person detection
- A cache-aware pipeline reuses normalized profiles, so re-screening across roles costs a fraction of a fresh run
- Weighted, objective fit scoring that explicitly excludes gender, race, age, religion, disability, and photo
- A durable Cloudflare Workflow analyzes every applicant in parallel, so the report is ready before a recruiter opens it
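The cache-aware reuse described above can be illustrated with a small memoizing wrapper. This is a sketch under stated assumptions: `withProfileCache`, `screenAcrossRoles`, and the `NormalizedProfile` shape are hypothetical, the cache is an in-memory map, and the raw evidence text itself forms part of the key (a real system would more likely key on a content hash in durable storage).

```typescript
// Hypothetical sketch of profile reuse; names and key scheme are assumptions.
interface NormalizedProfile {
  candidateId: string;
  fields: Record<string, string>;
}

type Normalize = (candidateId: string, rawEvidence: string) => Promise<NormalizedProfile>;

// Wraps an expensive normalizer so each (candidate, evidence) pair runs once.
function withProfileCache(normalize: Normalize): Normalize {
  const inFlightOrDone = new Map<string, Promise<NormalizedProfile>>();
  return (candidateId, rawEvidence) => {
    const key = JSON.stringify([candidateId, rawEvidence]);
    let cached = inFlightOrDone.get(key);
    if (cached === undefined) {
      cached = normalize(candidateId, rawEvidence);
      // Drop failed runs so a later call can retry.
      cached.catch(() => inFlightOrDone.delete(key));
      inFlightOrDone.set(key, cached);
    }
    return cached;
  };
}

// Screens one candidate against many roles; normalization runs at most once.
async function screenAcrossRoles(
  normalize: Normalize,
  candidateId: string,
  rawEvidence: string,
  roleIds: string[],
): Promise<string[]> {
  const reports: string[] = [];
  for (const roleId of roleIds) {
    const profile = await normalize(candidateId, rawEvidence);
    reports.push(`${roleId}:${profile.candidateId}`);
  }
  return reports;
}
```

Caching the promise rather than the resolved value means concurrent screenings of the same candidate share one in-flight normalization instead of each starting their own.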
Results & Impact
Final Outcome
EvidenceHire shipped as a production B2B platform running entirely on Cloudflare's edge. It gives recruitment agencies and high-volume hiring teams a second layer of verification: it cross-checks every claim against real evidence, proves skills through free-text assessments and live AI video interviews, and produces ranked, cited, bias-free fit reports. AppliPlus built it on durable Workflows, GPT-5, and the OpenAI Realtime API.
Fairness & Integrity by Design
Bias-Free Scoring
Gender, race, age, religion, disability, and profile photo are excluded from every score
Fully Auditable
Every verdict cites the exact excerpt and source it came from: no silent inference
AI-Manipulation Detection
Assessment answers and interviews are checked for AI authorship and real-person liveness
Multi-Source Truth
A claim absent from every corroborating source is flagged, never quietly ranked
Spam-Resistant Intake
Turnstile-protected, rate-limited, blacklist-aware public apply keeps the pipeline clean
Edge Runtime
Cloudflare Workers, Workflows, Queues, and R2: global, durable, and cost-efficient
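One simple way to enforce the bias-free scoring rule listed above is to strip protected fields from the profile before any scoring stage sees it. The sketch below is an assumption about how that could be done, not a description of the shipped system; the field names in `PROTECTED_FIELDS` and the `redactProtected` helper are hypothetical.

```typescript
// Hypothetical field names; the real profile schema is not documented here.
const PROTECTED_FIELDS = ["gender", "race", "age", "religion", "disability", "photoUrl"] as const;

// Returns a copy of the profile with protected top-level fields removed.
function redactProtected(profile: Record<string, unknown>): Record<string, unknown> {
  const blocked = new Set<string>(PROTECTED_FIELDS);
  const safe: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(profile)) {
    if (!blocked.has(key)) safe[key] = value;
  }
  return safe;
}
```

Removing top-level fields is only a first line of defense: protected attributes can still leak through free text or proxies such as graduation year, so a scoring prompt would also need explicit instructions to ignore them.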
Want to Build an AI-Powered Product?
From durable multi-agent LLM pipelines to live AI video and edge-native SaaS, we build products that scale