[ PROJECT CASE ]

TalentAI: hiring intelligence performance case

This case shows how a hiring intelligence system moves from demo to measurable delivery: the public workflow proves usability, estimates give target ranges, benchmarks verify P50 / P95 / QPS / cost, and evaluation verifies Recall@K, evidence coverage, and regression risk.


[ CASE SUMMARY ]

Business capacity, agent quality, and delivery boundary first

Business capacity

Latency, throughput, and cost show whether hiring systems can carry load

TalentAI is not only about matching. Candidate-library scale, P50 / P95 / P99 latency, successful matches per minute, and cost per successful task all belong in the case scope.

Agent quality

Recall@K and Evidence Coverage measure matching judgment

A hiring intelligence system cannot return scores alone. It must show that the candidates experts would approve are recalled, ranked near the top, and explained with evidence from their resumes.

Delivery boundary

Estimates can be stated; verification needs scope

The public demo shows the workflow, engineering estimates give target ranges, controlled benchmarks verify numbers, and production SLA is calibrated in the client environment.

[ METRIC STATES ]

Estimates are legitimate when they can be verified

The website can state engineering estimates and target baselines when assumptions, data scale, concurrency, and verification paths are explicit. The deliverable is making those numbers real, measured, and monitored.

Demo-observed

Workflow opens live

Talent pool, JD matching, resume parsing, and evidence explanation are real surfaces for walkthroughs.

Engineering estimate

P50 / P95 / QPS / Cost

Target ranges are based on dataset size, concurrency, model routing, cache, and retry assumptions.
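As a rough illustration of how such assumptions become a target range, the sketch below derives a throughput ceiling and an expected cost per successful task from concurrency, cache hit rate, and a bounded retry policy. The class name, field names, and every number are hypothetical placeholders, not TalentAI measurements. It also assumes that cache hits always succeed and cost nothing, which a real estimate would need to revisit.

```python
from dataclasses import dataclass


@dataclass
class EstimateAssumptions:
    """Hypothetical inputs; replace with values agreed for the target environment."""
    concurrency: int            # requests in flight at once
    mean_latency_s: float       # mean end-to-end latency per request, in seconds
    cache_hit_rate: float       # fraction of requests served without a model call
    model_cost_per_call: float  # currency units per model call
    failure_rate: float         # fraction of model calls that fail
    max_retries: int            # retries allowed after the first attempt


def throughput_ceiling_qps(a):
    """Little's law: sustained QPS is at most concurrency / mean latency."""
    return a.concurrency / a.mean_latency_s


def expected_model_calls(a):
    """Expected model calls for one cache-miss request with bounded retries."""
    # Attempt i + 1 only happens if the first i attempts all failed.
    return sum(a.failure_rate ** i for i in range(a.max_retries + 1))


def success_probability(a):
    """A request fails only if it misses the cache and exhausts every attempt."""
    return 1.0 - (1.0 - a.cache_hit_rate) * a.failure_rate ** (a.max_retries + 1)


def cost_per_successful_task(a):
    """Expected spend per request divided by the chance the request succeeds."""
    cost_per_request = (1.0 - a.cache_hit_rate) * expected_model_calls(a) * a.model_cost_per_call
    return cost_per_request / success_probability(a)
```

Stating the estimate as a function of named assumptions makes it checkable: when a benchmark later disagrees, the gap can be traced to a specific input rather than to the number itself.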

Benchmark-verified

Throughput and stability

A fixed dataset, environment, and version, together with raw logs, verify latency, throughput, failure rate, and unit cost.
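One way to turn raw benchmark logs into those figures is sketched below. It assumes each log record is a hypothetical `(latency_ms, succeeded, cost)` tuple and uses the nearest-rank percentile definition; other percentile definitions interpolate and give slightly different values, so the chosen method should be recorded with the results.

```python
import math


def percentile(sorted_values, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(records, window_s):
    """Summarize one benchmark run.

    records:  list of (latency_ms, succeeded, cost) tuples (hypothetical log shape)
    window_s: wall-clock duration of the run in seconds
    """
    if not records:
        raise ValueError("no records")
    latencies = sorted(r[0] for r in records)
    successes = sum(1 for r in records if r[1])
    total_cost = sum(r[2] for r in records)
    return {
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
        "qps": len(records) / window_s,
        "failure_rate": 1 - successes / len(records),
        # Cost is divided by successes only, so failed requests raise the unit cost.
        "cost_per_success": total_cost / successes if successes else float("inf"),
    }
```

Dividing total cost by successful requests rather than by all requests keeps retries and failures visible in the unit cost.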

Evaluation

Recall@K / Evidence Coverage

Gold sets and regression sets verify recall, ranking, evidence coverage, and degradation risk.
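A minimal sketch of these checks follows. Recall@K uses its standard definition. The evidence coverage definition here (the share of scoring claims that carry at least one evidence snippet) is an assumption, since the page does not define it, and the `claims` structure and the regression tolerance are likewise hypothetical.

```python
def recall_at_k(ranked_ids, gold_ids, k):
    """Fraction of gold (expert-approved) candidates found in the top k results."""
    gold = set(gold_ids)
    if not gold:
        raise ValueError("gold set is empty")
    return len(set(ranked_ids[:k]) & gold) / len(gold)


def evidence_coverage(claims):
    """Share of claims backed by at least one evidence snippet.

    claims: list of dicts with an 'evidence' list (assumed shape).
    """
    if not claims:
        raise ValueError("no claims to evaluate")
    supported = sum(1 for c in claims if c.get("evidence"))
    return supported / len(claims)


def regressed(baseline, current, tolerance=0.02):
    """Names of metrics that dropped by more than `tolerance` against the baseline."""
    return sorted(
        name for name, base in baseline.items()
        if current.get(name, 0.0) < base - tolerance
    )
```

Running `regressed` against a stored baseline on every model, prompt, or index change is one way to make degradation risk a tracked quantity.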

[ RETRIEVAL QUALITY ]

RAG is the implementation behind the metrics, not the case itself

Hiring matching should not ask a model to guess fit from nothing. Candidate recall must be accurate, ranking stable, and evidence explicit before generation or explanation happens.

Query Decomposition

Break the JD into retrieval intents

Role requirements are split into hard filters, skill entities, experience semantics, industry background, and risk signals before entering structured filters, full-text search, and vector recall.
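The decomposition above can be pictured as a small data structure plus a routing step. This is a sketch only: the class, the field names, and the mapping of each intent group to a channel are assumptions made for illustration, and the step that actually extracts these fields from a JD is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class DecomposedJD:
    """Hypothetical container for the retrieval intents extracted from one JD."""
    hard_filters: dict = field(default_factory=dict)          # e.g. minimum years, city
    skill_entities: list = field(default_factory=list)        # named skills and tools
    experience_semantics: str = ""                            # free-text description of the work
    industry_background: list = field(default_factory=list)   # sectors or company types
    risk_signals: list = field(default_factory=list)          # patterns a recruiter wants flagged


def route(jd):
    """Assign each intent group to the recall channel assumed to serve it."""
    return {
        "structured_filter": dict(jd.hard_filters),
        "full_text": jd.skill_entities + jd.industry_background,
        "vector": jd.experience_semantics,
        # Risk signals are assumed to be checked after recall, not used to recall.
        "post_rank_checks": list(jd.risk_signals),
    }
```

Keeping the intents separate means each channel receives only the part of the JD it can evaluate well, instead of one long query string.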

Hybrid Recall

Full-text, field filters, and vector recall run together

Chinese skill terms are covered by zhparser and full-text indexes, profile fields are filtered in PostgreSQL, and experience/project descriptions use pgvector HNSW for semantic recall.
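For orientation, the three channels might look roughly like the queries below. The `candidates` table, its columns, the index names, and the `zh_cfg` text search configuration are all hypothetical. The setup statements follow the documented zhparser and pgvector usage, and the placeholders use the psycopg named-parameter style. Treat this as a sketch of the shape of each query, not the system's actual schema.

```python
# Hypothetical one-time setup: zhparser config for Chinese full-text, HNSW index for vectors.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS zhparser;
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TEXT SEARCH CONFIGURATION zh_cfg (PARSER = zhparser);
ALTER TEXT SEARCH CONFIGURATION zh_cfg ADD MAPPING FOR n,v,a,i,e,l WITH simple;
CREATE INDEX candidates_fts_idx ON candidates USING gin (fts);
CREATE INDEX candidates_vec_idx ON candidates USING hnsw (embedding vector_cosine_ops);
"""

# Channel 1: structured profile fields filtered in plain PostgreSQL.
FIELD_SQL = """
SELECT id FROM candidates
WHERE years_experience >= %(min_years)s AND city = %(city)s
ORDER BY years_experience DESC
LIMIT %(limit)s
"""

# Channel 2: full-text match on a tsvector column built with the zhparser config.
FULLTEXT_SQL = """
SELECT id FROM candidates
WHERE fts @@ plainto_tsquery('zh_cfg', %(skills)s)
ORDER BY ts_rank(fts, plainto_tsquery('zh_cfg', %(skills)s)) DESC
LIMIT %(limit)s
"""

# Channel 3: nearest neighbours by cosine distance (the pgvector <=> operator).
VECTOR_SQL = """
SELECT id FROM candidates
ORDER BY embedding <=> %(query_vec)s::vector
LIMIT %(limit)s
"""


def build_recall_queries(min_years, city, skills, query_vec, limit=50):
    """Return (sql, params) per channel; executing them is left to the caller."""
    vec_literal = "[" + ",".join(str(x) for x in query_vec) + "]"
    return {
        "field": (FIELD_SQL, {"min_years": min_years, "city": city, "limit": limit}),
        "full_text": (FULLTEXT_SQL, {"skills": skills, "limit": limit}),
        "vector": (VECTOR_SQL, {"query_vec": vec_literal, "limit": limit}),
    }
```

Note that `plainto_tsquery` requires every term to match, which may be too strict for long skill lists; an OR-style query could be a better fit depending on recall targets.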

Fusion Ranking

RRF reduces single-channel recall bias

Keyword, field, and vector recall each have bias. RRF fuses rankings first, then the LLM Gateway reranks, scores, and explains role fit by dimension.
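Reciprocal Rank Fusion itself is small enough to show in full. In the sketch below each channel contributes 1 / (k + rank) for every candidate it returns, and candidates are ordered by the summed score. The value k = 60 is a commonly used default and an assumption here, as is the tie-break by candidate id; the reranking and explanation step is not shown.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of candidate ids with Reciprocal Rank Fusion.

    rankings: iterable of lists, each ordered best-first by one recall channel
    k:        smoothing constant; larger values flatten the gap between ranks
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, candidate_id in enumerate(ranking, start=1):
            scores[candidate_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


fused = rrf_fuse([
    ["a", "b", "c"],   # e.g. full-text channel
    ["b", "a", "d"],   # e.g. field filter channel
    ["b", "c"],        # e.g. vector channel
])
print(fused)  # ['b', 'a', 'c', 'd']
```

Because RRF uses only ranks, the channels do not need comparable scores, which is why it suits mixing full-text relevance with vector distance.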

Grounded Answer

Match conclusions must carry evidence snippets

The output is not just similarity. It returns candidates, matched fields, resume evidence, risk points, and scoring rationale for recruiter review.
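A result of that shape, together with a simple grounding check, could be sketched as follows. The class and field names are hypothetical, and the check shown (each evidence snippet must appear verbatim in the resume text after whitespace normalization) is one possible rule chosen for illustration, not a description of the production logic.

```python
from dataclasses import dataclass, field


@dataclass
class EvidenceItem:
    dimension: str   # scoring dimension the snippet supports, e.g. "skills"
    snippet: str     # text quoted from the resume


@dataclass
class MatchResult:
    candidate_id: str
    score: float
    matched_fields: list = field(default_factory=list)
    evidence: list = field(default_factory=list)      # list of EvidenceItem
    risk_points: list = field(default_factory=list)
    rationale: str = ""


def _normalize(text):
    """Collapse runs of whitespace so line breaks in a resume do not break matching."""
    return " ".join(text.split())


def ungrounded_snippets(result, resume_text):
    """Return evidence snippets that cannot be found verbatim in the resume text."""
    haystack = _normalize(resume_text)
    return [e.snippet for e in result.evidence if _normalize(e.snippet) not in haystack]
```

A result with any ungrounded snippet can be withheld or flagged before it reaches the recruiter, which keeps explanations tied to text that actually exists in the resume.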

[ RAG STACK ]

Retrieval is split across fields, full-text, vectors, and reranking

Nuxt 3: Retrieval result and evidence workspace

FastAPI: RAG retrieval and matching API

PostgreSQL: Structured profiles and filters

zhparser: Chinese full-text and skill recall

pgvector HNSW: Semantic vector recall index

RRF + LLM Gateway: Fusion ranking, rerank, and explanation

[ DELIVERY FLOW ]

From resumes to explanations, then private operations

Resume structuring

PDF, Word, and text resumes become raw records plus experience, skills, education, companies, and project context.

JD query decomposition

Job descriptions are split into hard filters, soft preferences, skill entities, experience semantics, and negative risk signals.

Multi-channel recall and fusion

PostgreSQL, zhparser, and pgvector HNSW recall candidates in parallel, then RRF merges rankings.
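The parallel part of this step can be sketched with `asyncio.gather`. The three channel functions below are stand-in stubs that return fixed ids, not real database calls, and treating a failed channel as an empty ranking is an assumed policy that a real deployment would need to decide on and log.

```python
import asyncio


# Stub channels; in a real service each would run one database query.
async def recall_fields(jd):
    await asyncio.sleep(0)
    return ["c1", "c2"]


async def recall_full_text(jd):
    await asyncio.sleep(0)
    return ["c2", "c3"]


async def recall_vector(jd):
    await asyncio.sleep(0)
    return ["c3", "c1", "c4"]


async def recall_all(jd, channels):
    """Run every channel concurrently and return {channel_name: ranked ids}."""
    results = await asyncio.gather(
        *(channel(jd) for channel in channels.values()),
        return_exceptions=True,  # one failing channel must not sink the request
    )
    rankings = {}
    for name, result in zip(channels, results):
        # Assumed policy: a failed channel contributes an empty ranking.
        rankings[name] = [] if isinstance(result, Exception) else result
    return rankings


CHANNELS = {
    "field": recall_fields,
    "full_text": recall_full_text,
    "vector": recall_vector,
}

if __name__ == "__main__":
    print(asyncio.run(recall_all({"title": "example JD"}, CHANNELS)))
```

The per-channel rankings returned here are exactly the input a rank-fusion step needs, so overall latency is bounded by the slowest channel rather than by the sum of all three.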

Evidence explanation and private boundary

The LLM Gateway scores only over retrieved evidence while permissions, logs, backups, and recovery stay in a controlled client environment.

[ INTERFACE PROOF ]

The live UI handles demos; the case page explains delivery logic

Talent pool view

Scan candidates by skills, company, tenure, and match quality.

JD match view

Role requirements, candidate scores, and explanations live in one decision surface.

Resume parsing view

Uploads become structured experience and skill tags with less manual entry.

[ NEXT ]

Hiring, knowledge bases, and CRM must become measurable systems

The method is to translate business complexity into latency, throughput, success rate, unit cost, recall quality, and human intervention rate, and then to decide where the boundaries sit for retrieval, scheduling, model routing, evaluation, and private operations.
