Research-stage integrity check

Reddit source validation for content sourcing

When agents mine Reddit for victim stories, scam reports, tips, or product feedback, VeracityAPI can flag suspiciously generic or weak-provenance posts before they become source material.


Business value

  • Protects content integrity before bad sources enter the research corpus.
  • Reduces risk from AI-planted marketing, competitor sabotage, astroturfing, or generic reposts.
  • Helps research agents prioritize firsthand accounts with concrete details.

Agent job to be done

Act as a source triage filter. Score candidate posts/comments, keep credible-specific sources, and send suspicious ones to verification or discard queues.

format: social_post
intended_use: cite
domain: Reddit source validation / travel scams

When to call VeracityAPI

Run after Reddit search/scrape and deduplication, before summarization or content generation.

What text to submit

Post title, body, selected comments, subreddit, timestamp, author age/karma if available, and permalink. VeracityAPI scores text; your agent should keep metadata separately.
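
A minimal sketch of that split, assuming illustrative field names on the scraped post object (this is not a fixed Reddit schema):

```javascript
// Build the text submitted to VeracityAPI for scoring, and keep the
// metadata out of band. All field names on `post` are assumptions.
function buildCandidate(post) {
  const parts = [
    post.title,
    post.body,
    ...(post.selectedComments || []),
  ].filter(Boolean);
  return {
    // Text sent to VeracityAPI for scoring.
    content: parts.join("\n\n"),
    // Metadata the agent keeps separately; never submitted for scoring.
    metadata: {
      subreddit: post.subreddit,
      timestamp: post.timestamp,
      authorAgeDays: post.authorAgeDays,
      authorKarma: post.authorKarma,
      permalink: post.permalink,
    },
  };
}
```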

Decision policy

  • allow: low risk with concrete first-person details, timestamps, places, sequence of events, and no obvious promotion.
  • human_review: medium risk, missing provenance, or source is important to a claim.
  • reject: high risk for citation/training workflows, especially if evidence flags generic phrasing or unsupported claims.
  • local override: never cite a Reddit post as fact without corroboration; use it as lead/source material.

Request template

curl https://api.veracityapi.com/v1/analyze \
  -H "Authorization: Bearer DOC_KEY" \
  -H "Content-Type: application/json" \
  -d '{"type":"text","content":"Paste content here","context":{"format":"social_post","intended_use":"cite"}}'
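
The same request can be sketched in JavaScript with Node's built-in fetch (Node 18+). The endpoint and payload come from the template above; the API key environment variable name is an assumption:

```javascript
// Build the request body shown in the curl template above.
function buildAnalyzeRequest(content) {
  return {
    type: "text",
    content,
    context: { format: "social_post", intended_use: "cite" },
  };
}

// Send it with Node's built-in fetch. VERACITY_API_KEY is a placeholder
// environment variable name, not a documented convention.
async function analyze(content) {
  const res = await fetch("https://api.veracityapi.com/v1/analyze", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.VERACITY_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(buildAnalyzeRequest(content)),
  });
  if (!res.ok) throw new Error(`VeracityAPI error: ${res.status}`);
  return res.json();
}
```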

Automation recipe

  • Research agent collects candidate Reddit posts.
  • Deduplicate near-identical stories.
  • Score each candidate with intended_use=cite.
  • Rank by content_trust_score and evidence quality.
  • Send top sources to corroboration search; discard or quarantine high-risk posts.
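
The recipe above can be sketched as a single triage function. `scorePost` is a synchronous stand-in for the VeracityAPI call, the exact-match dedup is a simplification (real pipelines would use fuzzy or embedding-based dedup), and ranking here uses `content_trust_score` only:

```javascript
// Dedupe, score, rank, and route candidate posts. Thresholds and field
// names are illustrative assumptions.
function triage(posts, scorePost) {
  // Deduplicate near-identical stories (here: exact normalized text).
  const seen = new Set();
  const unique = posts.filter((p) => {
    const key = p.content.toLowerCase().replace(/\s+/g, " ").trim();
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });

  // Score each candidate with intended_use=cite, rank by trust score.
  const scored = unique
    .map((p) => ({ ...p, result: scorePost(p.content) }))
    .sort((a, b) => b.result.content_trust_score - a.result.content_trust_score);

  // Route: credible sources to corroboration search, high-risk to quarantine.
  return {
    corroborate: scored.filter((p) => p.result.recommended_action === "allow"),
    review: scored.filter((p) => p.result.recommended_action === "human_review"),
    quarantine: scored.filter((p) => p.result.recommended_action === "reject"),
  };
}
```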

Evidence spans agents should inspect

  • story lacks place/time/sequence details
  • generic warning language without lived experience
  • promotional phrasing
  • claims that read like summarized advice rather than firsthand report
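
A sketch of how an agent might filter returned evidence spans down to the categories above. The `label` values are assumptions about how such flags might be named, not a documented enum:

```javascript
// Weak-provenance evidence categories from the list above; label strings
// are illustrative assumptions.
const WEAK_PROVENANCE_LABELS = new Set([
  "missing_place_time_sequence",
  "generic_warning_language",
  "promotional_phrasing",
  "summarized_advice",
]);

// Keep only the spans that signal weak provenance; tolerate a missing
// evidence array.
function weakProvenanceSpans(evidence) {
  return (evidence || []).filter((span) => WEAK_PROVENANCE_LABELS.has(span.label));
}
```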

Policy pseudocode

const action = result.recommended_action;
if (action === "allow") continueWorkflow();
else if (action === "revise") rewriteWith(result.evidence, result.recommended_fixes);
else if (action === "human_review") queueForHumanReview(result);
else if (action === "reject") discardOrRebuild();

KPIs to track

  • percentage of scraped posts filtered
  • corroboration success rate
  • human reviewer acceptance rate
  • bad-source incidents prevented
  • research time saved
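
The first three KPIs can be rolled up from counts the pipeline already logs. Field names here are illustrative assumptions:

```javascript
// Compute ratio KPIs from raw pipeline counts; guard against empty
// denominators early in a rollout.
function kpis({ scraped, filtered, corroborationTried, corroborationPassed, reviewed, accepted }) {
  const ratio = (num, den) => (den === 0 ? 0 : num / den);
  return {
    filteredRate: ratio(filtered, scraped),
    corroborationSuccessRate: ratio(corroborationPassed, corroborationTried),
    reviewerAcceptanceRate: ratio(accepted, reviewed),
  };
}
```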

What can go wrong

  • VeracityAPI cannot prove a Reddit user is real.
  • A truthful victim may write vaguely; do not reject high-impact sources without corroboration review.
  • Combine with account metadata, cross-source search, and moderator signals.

Cost and latency notes

Analyze-only costs $0.005 per 1,000 characters; analyze + revise (auto_revise=true) costs $0.010 per 1,000 characters. Both round up to the nearest 1,000 characters, so a short caption or email usually costs $0.005, while longer pages or chapters scale linearly with length. Current v0.1 latency is LLM-bound, so batch or concurrent orchestration is recommended for high-volume pipelines.
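
The pricing above reduces to a small estimator, useful for budgeting a scraping run before submitting it:

```javascript
// Estimate the cost of one request from the published rates: $0.005 per
// 1,000 characters (analyze only) or $0.010 (analyze + revise), rounded
// up to the nearest 1,000 characters.
function estimateCostUSD(chars, autoRevise = false) {
  const ratePer1k = autoRevise ? 0.01 : 0.005;
  const blocks = Math.max(1, Math.ceil(chars / 1000));
  return blocks * ratePer1k;
}
```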

Agent evaluation checklist