Nova uses bias testing to look for unfair scoring patterns in AI-assisted candidate review. The goal is practical: find role-irrelevant differences in scores or rankings before they affect hiring workflows. Bias testing is one control, not a guarantee. It works alongside job-relevant criteria, explainable scoring, human review, data protection controls, and customer hiring policies.
Recruitment and candidate-evaluation AI systems may be subject to additional legal requirements in some jurisdictions, including the EU. This page focuses only on Nova’s bias-testing method.
What We Test
Nova’s current public bias test is a synthetic cohort stress test. It varies demographic-linked signals across generated profiles, scores the cohort, then reviews whether lower-selection groups need closer inspection before interpreting the result. We test for:
- Average score differences across demographic groups.
- Selection-rate differences at defined scoring thresholds.
- Race and sex intersections where the sample size supports it.
- Criteria or scoring patterns that could introduce role-irrelevant proxy signals.
- Whether assessments remain tied to job criteria and candidate evidence.
Methodology
Define the role and criteria
We start with a representative job, job description, and scoring criteria. The criteria should be job-relevant, resume-verifiable, and separated by importance.
Create synthetic candidate profiles
We generate resumes with broadly comparable role-relevant qualifications and varied demographic-linked signals, such as names, locations, education signals, age-related experience patterns, or disability-related wording.
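The idea of holding qualifications constant while varying a demographic-linked signal can be sketched as below. This is a minimal illustration, not Nova's generator: the `SyntheticProfile` class, the `build_cohort` helper, the placeholder names, and the qualification templates are all hypothetical, and only the name signal is varied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyntheticProfile:
    profile_id: str
    group: str           # analysis label only; assumed to be hidden from the scorer
    name: str            # demographic-linked signal that varies
    qualifications: str  # role-relevant content, identical across groups


# Hypothetical inputs: placeholder names and qualification templates.
NAME_SIGNALS = {
    "group_a": ["Name A1", "Name A2"],
    "group_b": ["Name B1", "Name B2"],
}
QUALIFICATION_TEMPLATES = [
    "8 years backend engineering; led a service migration; Python and Go.",
    "7 years full-stack engineering; mentored four engineers; TypeScript and Java.",
]


def build_cohort(name_signals, templates):
    """Pair every qualification template with every name signal."""
    cohort = []
    for t_idx, template in enumerate(templates):
        for group, names in name_signals.items():
            for n_idx, name in enumerate(names):
                cohort.append(
                    SyntheticProfile(
                        profile_id=f"t{t_idx}-{group}-{n_idx}",
                        group=group,
                        name=name,
                        qualifications=template,
                    )
                )
    return cohort


cohort = build_cohort(NAME_SIGNALS, QUALIFICATION_TEMPLATES)
print(len(cohort))  # 2 templates x 2 groups x 2 names = 8
```

Because every group receives the same set of templates, any score gap between groups in this sketch can only come from the varied signal, which is what makes the cohort useful as a stress test.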
Score through an evaluation harness
The synthetic profiles are scored through a Nova scoring evaluation harness. The output is the candidate score and supporting evidence used for review.
Compare outcomes
We compare score distributions and selection rates across groups. The main check asks whether one group passes the review threshold much less often than another comparable group.
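As a rough sketch of this comparison, the snippet below groups scores by demographic label and reports the mean score and selection rate for each group. The `summarize_by_group` function, the 0-100 score scale, the threshold of 70, and the sample scores are illustrative assumptions, not Nova's harness output. Treating a score equal to the threshold as a pass is also an assumption.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_group(scored, threshold):
    """Summarize (group, score) pairs as per-group n, mean score, and selection rate."""
    by_group = defaultdict(list)
    for group, score in scored:
        by_group[group].append(score)
    return {
        group: {
            "n": len(scores),
            "mean_score": mean(scores),
            # Assumption: a score at the threshold counts as selected.
            "selection_rate": sum(s >= threshold for s in scores) / len(scores),
        }
        for group, scores in by_group.items()
    }


# Hypothetical scores for two comparable synthetic groups.
scored = [
    ("group_a", 82), ("group_a", 74), ("group_a", 65), ("group_a", 90),
    ("group_b", 80), ("group_b", 60), ("group_b", 58), ("group_b", 71),
]
summary = summarize_by_group(scored, threshold=70)
for group, stats in sorted(summary.items()):
    print(group, stats)
```

In this made-up data, group_a passes the threshold 75% of the time and group_b 50% of the time, which is the kind of gap the main check is meant to surface.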
Metrics
| Metric | What it means |
|---|---|
| Score distribution | Whether one group receives materially different scores from another group. |
| Selection rate | The share of a group scoring above the selected review threshold. |
| Impact ratio | A group’s selection rate divided by the highest selection rate in the comparison. |
| Intersectional impact ratio | The same comparison across combined groups, such as race x sex. |
The four-fifths rule flags any group whose impact ratio falls below 0.8 (80%). It is a screening benchmark, not a complete fairness test. Small samples, unusual candidate pools, and role-specific requirements can all affect interpretation.
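The impact ratio in the table can be computed as in the following sketch, which also covers intersectional groups by using tuples as group keys. The `impact_ratios` function, the `min_n` sample-size floor of 30, and the counts are hypothetical. The page says intersections are checked only "where the sample size supports it" but does not state a cutoff.

```python
FOUR_FIFTHS = 0.8  # benchmark ratio used by the four-fifths rule


def impact_ratios(counts, min_n=30):
    """Return each group's selection rate divided by the highest rate.

    counts maps group -> (selected, total). Groups with fewer than
    min_n profiles are skipped (min_n is an assumed floor).
    """
    rates = {
        group: selected / total
        for group, (selected, total) in counts.items()
        if total >= min_n
    }
    if not rates:
        return {}
    top = max(rates.values())
    if top == 0:
        raise ValueError("no group passed the threshold; impact ratio is undefined")
    return {group: rate / top for group, rate in rates.items()}


# Hypothetical race x sex counts: (selected, total) per combined group.
counts = {
    ("race_1", "female"): (40, 80),
    ("race_1", "male"): (36, 80),
    ("race_2", "female"): (28, 80),
    ("race_2", "male"): (5, 10),  # below min_n, so excluded from the comparison
}
ratios = impact_ratios(counts)
flagged = sorted(group for group, ratio in ratios.items() if ratio < FOUR_FIFTHS)
print(flagged)
```

Here the `("race_2", "female")` group has a ratio of about 0.7 and is flagged, while the ten-profile group is left out rather than reported with a ratio that the sample cannot support.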
Result Labels
| Label | Meaning |
|---|---|
| Clear | The synthetic test did not find a material adverse-impact signal. |
| Review | The test found a possible signal, so a person should review the result before drawing conclusions. |
| Concern | The test found a stronger signal. Nova should investigate the criteria, test data, or scoring behavior before relying on that setup. |
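One way to picture how a label could follow from the lowest impact ratio in a run is sketched below. The cutoffs are placeholders: this page does not state the thresholds Nova uses, so `review_below=0.8` simply mirrors the four-fifths benchmark and `concern_below=0.6` is an invented value for illustration. The `label_result` function is likewise hypothetical.

```python
def label_result(min_impact_ratio, review_below=0.8, concern_below=0.6):
    """Map the lowest impact ratio in a run to Clear, Review, or Concern.

    Both cutoffs are assumed values, not Nova's published thresholds.
    """
    if concern_below > review_below:
        raise ValueError("concern_below must not exceed review_below")
    if min_impact_ratio < concern_below:
        return "Concern"
    if min_impact_ratio < review_below:
        return "Review"
    return "Clear"


for ratio in (0.95, 0.72, 0.50):
    print(ratio, label_result(ratio))
```

A label produced this way is only a starting point: a Review or Concern result still needs a person to look at sample sizes, criteria, and candidate-level evidence before drawing conclusions.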
Current Public Evaluation
Nova has a public bias evaluation summary at nova.dweet.com/bias-evaluation. The public run was generated on May 26, 2025. It uses 500 synthetic profiles for a Senior Software Engineer role and compares scoring patterns across sex, race and ethnicity, age, disability status, and race x sex intersections. Use that page as the product-level summary of the public run. The public run is limited to a representative role and synthetic profiles. It does not evaluate customer-specific criteria, recruiter behavior, interview decisions, final hiring decisions, sourcing, ATS filters, or real applicant pools.
How We Use Findings
When a test raises a signal, Nova reviews it as follows:
- Check whether the synthetic profiles are comparable and realistic.
- Check whether the criterion is job-relevant and verifiable.
- Look for role-irrelevant proxies, such as school prestige, geography, name signals, age proxies, or accommodation wording.
- Review candidate-level scoring evidence to see what drove the score.
- Adjust criteria or test data where needed.
- Re-run the test and keep the before/after evidence.
Customer Responsibilities
Nova can test and explain its scoring behavior, but customers still control the hiring process. You should:
- Configure job-relevant criteria.
- Avoid criteria that rely on protected characteristics or weak proxies.
- Review Nova outputs before making employment decisions.
- Provide candidate, employee, or worker notices required by your policies and applicable law.
How To Read The Results
Bias testing can show useful evidence, but it should be read in context:
- Synthetic tests are designed to stress-test scoring. They do not represent every real applicant pool.
- A clean result for one role does not prove the same result for every role.
- Real-world outcomes also depend on sourcing, human review, interviews, and final decisions.