Workflow-risk triage, not forensic proof.
VeracityAPI is designed to answer the product question agents need before acting on content: should this workflow allow, revise, queue for human review, or reject?
What we measure
- Specificity gaps and generic slop risk.
- Weak provenance, unsupported claims, and missing source context.
- Visible synthetic-image cues for uploaded/public image URLs.
- Synthetic-audio and workflow risk cues for short HTTPS audio URLs.
What we do not claim
- No proof of human vs AI authorship.
- No legal, academic, employment, identity, or forensic determination.
- No speaker identity verification or voice-clone evidence.
- No guarantee that content is true, false, safe, or lawful.
Trust model
Each response combines model-observed evidence with deterministic post-processing. Agents should branch on recommended_action, store evidence and limitations, and escalate high-stakes uncertainty to humans.
| Signal | Purpose | Safe use |
|---|---|---|
| content_trust_score | Summarize confidence that content is usable for the workflow. | Rank queues and monitor quality trends. |
| risk_level | Normalize model and modality-specific risk. | Set default routing thresholds. |
| recommended_action | Expose deterministic workflow routing. | allow / revise / human_review / reject. |
| evidence | Explain why a route was chosen. | Give reviewers and rewrite agents concrete spans. |
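
As a rough illustration of branching on recommended_action while keeping evidence and limitations for later audit, the following Python sketch may help. The response shape (a dict with `recommended_action`, `evidence`, and `limitations` keys), the `RoutingDecision` type, and the fallback to `human_review` for unrecognized actions are assumptions made for this example, not the documented VeracityAPI schema.

```python
from dataclasses import dataclass, field

# The four routing actions named in this document.
KNOWN_ACTIONS = {"allow", "revise", "human_review", "reject"}


@dataclass
class RoutingDecision:
    """Hypothetical record an agent could persist alongside its own logs."""
    action: str
    evidence: list = field(default_factory=list)
    limitations: list = field(default_factory=list)


def route(response: dict) -> RoutingDecision:
    """Branch on recommended_action and carry evidence forward.

    Assumption: a missing or unrecognized action is treated as uncertainty
    and escalated to a human rather than silently allowed.
    """
    action = response.get("recommended_action")
    if action not in KNOWN_ACTIONS:
        action = "human_review"
    return RoutingDecision(
        action=action,
        evidence=list(response.get("evidence", [])),
        limitations=list(response.get("limitations", [])),
    )
```

Failing closed to `human_review` is one possible design choice; a workflow with different risk tolerance might prefer to retry or reject instead.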
Privacy posture
Default text requests avoid raw-content retention. Image/audio analysis stores no raw media bytes, base64 payloads, or full media URLs. Logs keep operational metadata such as request ID, hashed inputs, hostname, latency, and model version.
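
To make the logging posture concrete, here is a minimal sketch of a log record that keeps only operational metadata for a media request. The field names, the choice of SHA-256, and the decision to hash the full URL string are assumptions for illustration; the text above does not specify which hash or which inputs are used.

```python
import hashlib
from urllib.parse import urlsplit


def build_log_record(request_id: str, media_url: str,
                     latency_ms: float, model_version: str) -> dict:
    """Build a log entry without the raw media bytes or the full URL.

    Only a hash of the input and the hostname are retained, so the path
    and query string (which may carry tokens) never reach the log.
    """
    return {
        "request_id": request_id,
        "input_sha256": hashlib.sha256(media_url.encode("utf-8")).hexdigest(),
        "hostname": urlsplit(media_url).hostname,
        "latency_ms": latency_ms,
        "model_version": model_version,
    }
```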
Evaluation posture
Published evals report routing-action agreement and macro F1. The benchmark is calibration evidence for workflow routing, not a forensic detector certification.
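
For readers unfamiliar with the two metrics, the sketch below shows one common way to compute routing-action agreement and macro F1 over the four actions. It is not the published evaluation harness; the function names and the convention of scoring a label with no expected or predicted examples as 0.0 are assumptions.

```python
ACTIONS = ("allow", "revise", "human_review", "reject")


def agreement(expected, predicted):
    """Fraction of items where the predicted action equals the expected one."""
    if not expected or len(expected) != len(predicted):
        raise ValueError("expected and predicted must be non-empty and equal length")
    return sum(e == p for e, p in zip(expected, predicted)) / len(expected)


def macro_f1(expected, predicted, labels=ACTIONS):
    """Unweighted mean of per-action F1, so rare actions count as much as common ones."""
    scores = []
    for label in labels:
        pairs = list(zip(expected, predicted))
        tp = sum(e == label and p == label for e, p in pairs)
        fp = sum(e != label and p == label for e, p in pairs)
        fn = sum(e == label and p != label for e, p in pairs)
        denom = 2 * tp + fp + fn
        # Assumed convention: a label with no support and no predictions scores 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Macro averaging matters for routing because an action such as reject is typically rare, and a plain agreement rate can look strong even when that action is routed poorly.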
Recommended escalation policy
- Use `allow` for low-risk reversible workflows.
- Use `revise` when evidence can be addressed by adding specificity, sources, provenance, or disclosure.
- Use `human_review` for uncertainty, safety claims, identity/media claims, financial consequences, or citation/training workflows.
- Use `reject` only when local policy and evidence agree the content should be blocked or quarantined.
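
The policy above could be layered on top of the API's recommendation roughly as follows. This is a sketch under stated assumptions: the tag names in `HIGH_STAKES`, the `reversible` and `policy_allows_block` parameters, and the rule that an irreversible workflow escalates an `allow` are hypothetical local choices, not part of VeracityAPI.

```python
KNOWN_ACTIONS = {"allow", "revise", "human_review", "reject"}

# Hypothetical local tags for the high-stakes categories listed above.
HIGH_STAKES = {"safety_claim", "identity_claim", "media_claim",
               "financial", "citation", "training"}


def apply_local_policy(recommended_action: str, workflow_tags,
                       reversible: bool, policy_allows_block: bool) -> str:
    """Combine the recommended action with local policy to pick a final route."""
    if recommended_action not in KNOWN_ACTIONS:
        return "human_review"
    if recommended_action == "reject":
        # Reject only when local policy agrees; otherwise a human decides.
        return "reject" if policy_allows_block else "human_review"
    if set(workflow_tags) & HIGH_STAKES:
        return "human_review"
    if recommended_action == "allow" and not reversible:
        # Assumed rule: allow is reserved for reversible workflows.
        return "human_review"
    return recommended_action
```

For example, `apply_local_policy("allow", ["financial"], True, False)` escalates to `human_review` even though the recommendation was `allow`.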