Singapore, Singapore October 27, 2025 –(PR.com)– When the Originality Benchmark Dataset was revisited following an independent audit, something significant was discovered.
Facticity.AI, the automated fact-checking engine that powers ArAIstotle, identified several benchmark inconsistencies that traditional binary “True or False” systems missed. By re-grounding ambiguous claims and reassessing their linguistic framing, the system achieved a new verified accuracy rate of 98.33% (118 out of 120 correct classifications).
For comparison, a competing fact-checking model achieved 94.17% (113 out of 120) after the same review.
What Makes Facticity.AI Different
Facticity.AI doesn’t simply label information; it reasons with it. The framework evaluates each claim through a tri-label system:
True: supported by primary or credible secondary evidence
False: contradicted by authoritative documentation
Unverifiable: insufficient or ambiguous evidence to confirm or refute
That third label matters most. “Unverifiable” means that no credible source exists to confirm or reject a claim as phrased, whether because the evidence is anecdotal, outdated, or linguistically vague. If the core premise is identified correctly but the claim itself is untestable, Facticity.AI still earns credit for resolving the factual essence correctly.
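To make the three labels concrete, here is a minimal sketch in Python. It is an illustration only, not Facticity.AI's actual implementation: the `Label`, `Evidence`, and `classify` names are hypothetical, and the decision rule (abstain unless the evidence clearly points one way) is a simplifying assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    TRUE = "True"
    FALSE = "False"
    UNVERIFIABLE = "Unverifiable"


@dataclass
class Evidence:
    """Hypothetical evidence summary for a single claim."""
    supporting: int = 0      # credible sources that support the claim
    contradicting: int = 0   # authoritative sources that contradict it


def classify(evidence: Evidence) -> Label:
    """Toy rule (an assumption): commit only when the evidence is one-sided."""
    if evidence.supporting > 0 and evidence.contradicting == 0:
        return Label.TRUE
    if evidence.contradicting > 0 and evidence.supporting == 0:
        return Label.FALSE
    # No evidence at all, or evidence pointing both ways.
    return Label.UNVERIFIABLE


print(classify(Evidence(supporting=2)).value)
print(classify(Evidence(contradicting=1)).value)
print(classify(Evidence()).value)
```

The point of the sketch is the third branch: a binary system would be forced to pick True or False when there is no usable evidence, while a tri-label system can abstain.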
6 Claims That Show How Truth Evolves
Below are examples from the recent benchmark review, showing how language, time, and evidence all play into factual precision.
Happywhale Is an Online Whale Identification Database
Original label: True
Facticity.AI finding: False – counted as Correct
Happywhale is an AI-based whale identification platform, but the dataset cited was outdated. The original claim referenced 30,000 humpback whales, whereas current records show 68,000 humpbacks and 112,000 whales total.
The core premise that Happywhale exists and identifies whales by fluke patterns is True, but the numerical detail is False.
Oppenheimer’s Score Contains No Percussion
Original label: True
Facticity.AI finding: False – counted as Correct
Composer Ludwig Göransson confirmed the absence of traditional percussion instruments (like drums), but the score includes percussive sounds such as foot stomps and explosions.
Distinguishing between “percussion” and “percussion instruments” reveals the nuance—the score is minimalist, not percussion-free.
Blur Announced a One-Off Reunion Show
Original label: True
Facticity.AI finding: False – counted as Correct
Blur initially announced a “one-off” show for July 8, 2023, at Wembley. High demand changed that—a second show on July 9 was added. Thus, the “one-off” phrasing became factually inaccurate once additional dates were confirmed.
South Korea Counts Ages Three Ways
Original label: True
Facticity.AI finding: False – counted as Correct
Until June 28, 2023, South Korea officially recognized three age systems: Korean Age, International Age, and Year Age.
A new law has since standardized all official usage to International Age (Reuters, 2023; New York Times, 2023). The claim was historically True, but now False under current law.
Dinosaurs Had Belly Buttons
Original label: True
Facticity.AI finding: False – counted as Correct
A Psittacosaurus fossil (BMC Biology, 2022) preserved an umbilical scar—evidence that some dinosaurs had yolk-sac attachment marks.
However, generalizing this across all species is unsupported. The claim was False by overgeneralization.
Human Babies Detect Spicy Flavors
Original label: True
Facticity.AI finding: Unverifiable – counted as Correct
The claim is Unverifiable as phrased. While infants are born with the physiological ability to sense capsaicin’s burning sensation through the trigeminal nerve, they lack the perceptual framework to identify “spicy flavor” as a distinct taste. In other words, babies feel the heat, but don’t yet perceive spice.
When “False” Isn’t the Same as “Unverifiable”
Facticity.AI also flagged multiple claims marked as False in the dataset that were actually unverifiable due to lack of evidence, a distinction that matters deeply in automated fact-checking.
Example 1: Emily White’s Sleep System
“Tech entrepreneur Emily White spent over $2 million developing a sleep-enhancement system.”
No credible evidence links Emily White to such a project. The $2M figure belongs to Bryan Johnson’s longevity research, not White’s.
Example 2: Mars Walks by “Astronauts” John Smith and Alice Johnson
“Astronauts John Smith and Alice Johnson conducted mock Mars walks last March in a 70-pound suit.”
NASA records do not confirm their astronaut status or participation. John Smith is a Langley scientist, not an astronaut.
Example 3: Werner Herzog and Joaquin Phoenix’s “Hot Sauce Coaching”
“Filmmaker Werner Herzog used hot sauce to coach Joaquin Phoenix for a movie scene.”
Reliable sources only confirm Herzog’s 2006 rescue of Phoenix after a car accident; there’s no evidence of any “hot sauce coaching.”
Facticity.AI correctly labeled this Unverifiable, not False, showing its commitment to epistemic precision over speculation.
Key Lessons Learned
Temporal Precision: Facts are time-dependent. Numbers, laws, and data drift.
Semantic Precision: Absolutist phrasing (“no,” “one-off,” “proven”) can distort nuance.
Taxonomic Clarity: Scientific claims require verifiable registries and precise definitions.
Linguistic Granularity: Micro-level distinctions often determine factual correctness.
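The temporal precision lesson can be sketched in code using the South Korea age-counting claim from the review. This is a hypothetical illustration, not part of Facticity.AI: the `TimedClaim` class and its fields are assumptions, and the end date is treated as exclusive on the assumption that the claim stops holding on June 28, 2023 itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class TimedClaim:
    """Hypothetical claim with an optional validity window."""
    text: str
    valid_from: Optional[date] = None   # None means no known start date
    valid_until: Optional[date] = None  # None means still valid; exclusive bound

    def holds_on(self, day: date) -> bool:
        """Return True if the claim is inside its validity window on `day`."""
        if self.valid_from is not None and day < self.valid_from:
            return False
        if self.valid_until is not None and day >= self.valid_until:
            return False
        return True


claim = TimedClaim(
    text="South Korea officially counts ages three ways",
    valid_until=date(2023, 6, 28),
)

print(claim.holds_on(date(2023, 1, 1)))    # before the change
print(claim.holds_on(date(2025, 10, 27)))  # after the change
```

Attaching a validity window to a claim, rather than a single fixed label, is one simple way to express that the same sentence can be True on one date and False on another.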
Why Dynamic Grounding Matters
The Originality Benchmark is not static, and truth shouldn’t be either. As the review showed, linguistic and evidentiary drift demands dynamic, source-linked verification over static truth labels.
Facticity.AI’s tri-label scheme (True, False, and Unverifiable) enforces accountability, distinguishing between what’s supported, refuted, and currently unknowable.
Final Results
After this review:
Facticity.AI: 118 / 120 correct classifications (98.33%)
Competing system: 113 / 120 correct classifications (94.17%)
Without access to the raw outputs of other models, independent verification of premise recognition isn’t possible, but the distinction underscores Facticity.AI’s superior factual comprehension and evidentiary integrity.
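The reported percentages follow directly from the counts. As a quick arithmetic check (the `accuracy_percent` helper is a hypothetical name introduced here for illustration), 118 of 120 comes to 98.33% and 113 of 120 comes to about 94.17%:

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return accuracy as a percentage, rounded to two decimal places."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * correct / total, 2)


facticity = accuracy_percent(118, 120)
competitor = accuracy_percent(113, 120)

print(facticity)    # 98.33
print(competitor)   # 94.17
print(118 - 113)    # 5 claims separate the two systems
```

On a 120-claim benchmark each claim is worth roughly 0.83 percentage points, so the gap between the two systems corresponds to five claims.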
The Originality dataset is evolving, and so must the understanding of truth.
Facticity.AI’s performance isn’t just about accuracy; it’s about redefining what it means for AI to know something. By grounding every claim in verifiable context, Facticity.AI moves the world closer to a future where authenticity is infrastructure, and misinformation has nowhere left to hide.
Contact Information:
AI Seer Pte. Ltd.
Dennis Yap
65 83050508
www.linktr.ee/yapdennis
Please contact through LinkedIn (www.linkedin.com/in/dennisye) before trying to call.
Read the full story here: https://www.pr.com/press-release/952054
Press Release Distributed by PR.com




















