Audio detector for impersonator call review
Define how agents should handle suspected executive, family, vendor, or support impersonation clips using the live audio workflow-triage endpoint, which returns a transcript with each result.
Business value
- Protects trust in impersonator audio workflows before automated agents take irreversible action.
- Creates a concrete QA policy agents can apply consistently.
- Turns risk evidence into review, revision, or provenance requests.
Agent job to be done
Act as the impersonator audio trust triage layer. Score or prepare the asset, inspect evidence, and choose allow, revise, human_review, or reject based on workflow stakes.
format: other
intended_use: moderate
domain: impersonator call review
When to call VeracityAPI
Run after asset intake/export and before publish, moderation, citation, training, payment, or account-impacting decisions.
What audio URL to submit
Submit an HTTPS audio URL, an optional caller transcript, and local metadata to the live audio workflow-triage endpoint; VeracityAPI returns a Gemini-generated transcript with the result.
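As a sketch, the request body described above might look like the following. The `type`, `content`, and `context` fields come from this page; the `caller_transcript` key is an assumed placeholder, since the page does not name the transcript field.

```javascript
// Illustrative request body for POST /v1/analyze (audio workflow triage).
// type, content, and context follow this page; caller_transcript is an
// assumed placeholder key for the optional caller transcript.
const payload = {
  type: "audio",
  content: "https://your-storage.example.com/calls/clip-001.mp3", // HTTPS audio URL
  context: {
    format: "other",
    intended_use: "moderate",
    domain: "impersonator call review",
    caller_transcript: "Hi, this is your CEO. I need a wire transfer today.", // optional, assumed key
  },
};
```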
Decision policy
- allow: low risk and low-stakes use with no conflicting local signals.
- revise: medium risk or evidence that can be fixed by replacement, disclosure, or better provenance.
- human_review: high risk, sensitive claims, identity/fraud implications, or evidentiary use.
- reject: repeated high-risk assets combined with policy violations or missing provenance.
Request template
curl https://api.veracityapi.com/v1/analyze -H "Authorization: Bearer DOC_KEY" -H "Content-Type: application/json" -d '{"type":"audio","content":"https://your-storage.example.com/clip.mp3","context":{"format":"other","intended_use":"moderate","domain":"impersonator call review"}}'
Automation recipe
- Store consented audio in your own controlled storage.
- Call POST /v1/analyze with type=audio, an HTTPS audio URL in content, an optional caller transcript, and context; VeracityAPI returns a Gemini-generated transcript.
- Route medium/high-risk clips to human review with evidence and recommended fixes.
- Record reviewer outcomes for future calibration.
- Never treat audio scores as voice-clone proof or speaker identity verification.
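The routing and record-keeping steps of the recipe can be sketched as below. `result.recommended_action`, `result.evidence`, and `result.recommended_fixes` follow this page's pseudocode; the queue, outcome store, and function names are hypothetical placeholders for your own systems.

```javascript
// Hypothetical sketch of recipe steps 3-4: routing risky clips to human
// review with evidence attached, and recording reviewer outcomes for
// future calibration. Only the result fields come from this page.
const reviewQueue = [];
const outcomes = [];

function routeForReview(clipUrl, result) {
  // Route medium/high-risk clips (revise or human_review) to reviewers.
  if (["revise", "human_review"].includes(result.recommended_action)) {
    reviewQueue.push({
      clipUrl,
      evidence: result.evidence,
      fixes: result.recommended_fixes,
    });
  }
  return result.recommended_action;
}

function recordOutcome(clipUrl, reviewerDecision) {
  // Store the reviewer's decision so future thresholds can be calibrated.
  outcomes.push({ clipUrl, reviewerDecision, at: Date.now() });
}
```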
Evidence spans agents should inspect
- synthetic-looking texture or cadence
- geometry, text, label, transcript, or continuity mismatch
- weak or missing provenance
- signals that conflict with local metadata
Policy pseudocode
if (result.recommended_action === "allow") continueWorkflow();
if (result.recommended_action === "revise") rewriteWith(result.evidence, result.recommended_fixes);
if (result.recommended_action === "human_review") queueForHumanReview(result);
if (result.recommended_action === "reject") discardOrRebuild();
KPIs to track
- assets triaged
- human-review precision
- bad publishes or decisions prevented
- false-positive appeal rate
- average review latency
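One way the KPIs above could be aggregated is sketched below; the event shape (`routed`, `reviewerConfirmedRisk`, `appealed`, `reviewLatencyMs`) is an assumption about your own logging, not part of the API.

```javascript
// Illustrative KPI aggregator for the metrics listed above.
// The event record shape is an assumed local logging format.
function summarizeKpis(events) {
  const reviewed = events.filter((e) => e.routed === "human_review");
  const confirmed = reviewed.filter((e) => e.reviewerConfirmedRisk).length;
  return {
    assetsTriaged: events.length,
    // Share of human-reviewed clips that reviewers confirmed as risky.
    humanReviewPrecision: reviewed.length ? confirmed / reviewed.length : null,
    // Appeals where the reviewer did not confirm risk (false positives).
    falsePositiveAppeals: events.filter((e) => e.appealed && !e.reviewerConfirmedRisk).length,
    avgReviewLatencyMs: reviewed.length
      ? reviewed.reduce((sum, e) => sum + e.reviewLatencyMs, 0) / reviewed.length
      : null,
  };
}
```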
What can go wrong
- Audio scoring is a beta workflow-triage endpoint, not a forensic detector or voice-clone evidence.
- Do not use a single score as forensic evidence.
- Combine VeracityAPI with local metadata, source reputation, and human escalation.
Cost and latency notes
Audio workflow triage is a flat $0.01 per request. Submit an HTTPS audio URL under the size cap plus an optional caller transcript; VeracityAPI returns a Gemini-generated transcript and stores no audio bytes, base64, or full URLs.
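At flat per-request pricing, budgeting is simple multiplication; a throwaway estimate based on the $0.01 figure above:

```javascript
// Flat per-request pricing from this page: $0.01 per audio triage call.
const COST_PER_REQUEST_USD = 0.01;
const monthlyCost = (requestsPerDay, days = 30) =>
  requestsPerDay * days * COST_PER_REQUEST_USD;
// e.g. 1,000 clips/day for 30 days costs about $300.
```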
Agent evaluation checklist
- Is the page clearly marked as beta workflow triage, not forensic proof?
- What consent, retention, and privacy rules govern the audio clips?
- What independent verification is required before consequential decisions?
- How will the agent behave if audio scoring is unavailable or inconclusive?
- Which reviewer outcome should be captured for future calibration?