Methodology
How Arbitir™ analyzes content, what it does and does not claim, and how to defend its outputs.
01
How Arbitir™ works
Arbitir™ analyzes the structure of reasoning, not the truth of claims.
When you submit an article, an essay, a chat-bot response, or any piece of text to Arbitir™, the system reads it and asks: how is this argument built? It does not ask whether the conclusions are correct. It asks whether the reasoning that produced those conclusions was honest, complete, and structurally sound.
This distinction matters. Many tools claim to verify whether specific statements in a piece of content are true or false. That approach requires a separate ground-truth source — a database of true statements that the tool compares the content against. Such databases are themselves political; whoever chooses what facts go in chooses the verdict.
Arbitir™ does not work that way. Arbitir™ works the way a reasoning teacher works: by examining the structure of the argument. Does the author engage the counter-evidence, or skip it? Are the conclusions supported by the steps that produced them, or do the steps appear in service of a pre-decided conclusion? Are the premises actually established, or treated as established without audit? Does the title match what the body actually says?
These questions have answers regardless of which “side” the content takes. A piece of writing arguing for X can be cognitively dishonest; a piece arguing for the opposite of X can be cognitively honest — and vice versa. The merit of a position is independent of the integrity of the reasoning used to argue for it.
02
The seven circles
Arbitir™ scores every analyzed artifact on seven dimensions of reasoning quality:
- ① Missed Clue. The author had access to a piece of evidence that bears on the conclusion and did not surface it.
- ② Ignored Other Side. The author did not engage the strongest counter-argument to their position.
- ③ Jumped to Conclusion. The conclusion does not follow from the steps presented; the reasoning chain has gaps.
- ④ Untested Assumption. A premise the argument depends on is presented as established without audit.
- ⑤ Blind Spot. The author appears not to see a consideration that would change the analysis.
- FP — First Principles. The argument does not decompose to its atomic parts before drawing conclusions.
- TT — Title vs Text. The headline asserts something the body cannot support.
Each circle receives a letter grade. The overall grade composites the seven.
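The compositing step can be pictured with a minimal sketch. The grade scale, the point values, and the unweighted mean below are all assumptions made for illustration; this document does not specify how Arbitir™ weights or combines the seven grades, and `composite_grade` is a hypothetical name.

```python
from statistics import mean

# Hypothetical 4.0-style point scale; the real grade scale is not specified here.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}

# The seven circles, as named in this section.
CIRCLES = (
    "Missed Clue",
    "Ignored Other Side",
    "Jumped to Conclusion",
    "Untested Assumption",
    "Blind Spot",
    "First Principles",
    "Title vs Text",
)


def composite_grade(grades: dict[str, str]) -> str:
    """Combine seven per-circle letter grades into one overall letter.

    Assumption: an unweighted mean of grade points, mapped back to the
    nearest letter. Ties resolve toward the higher grade.
    """
    missing = [c for c in CIRCLES if c not in grades]
    if missing:
        raise ValueError(f"missing grades for: {missing}")
    avg = mean(GRADE_POINTS[grades[c]] for c in CIRCLES)
    # Pick the letter whose point value is closest to the average.
    return min(GRADE_POINTS, key=lambda letter: abs(GRADE_POINTS[letter] - avg))


example = dict(zip(CIRCLES, ["A", "A", "B", "B", "C", "C", "F"]))
overall = composite_grade(example)
```

An unweighted mean is the simplest choice; a production system might weight some circles more heavily or cap the overall grade when any single circle fails.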
03
How Arbitir™ calibrates every analysis
Arbitir™ does not ask users to select a domain or category before analyzing. The system automatically derives two signals from the submitted content and uses them to calibrate sensitivity without any user input:
- AI authorship detection. When the system detects that content was generated — wholly or substantially — by an AI model, it applies five additional failure-mode detectors specific to large language model behavior (see Section 04). The detection runs on every analysis; the additional detectors fire only when the signal warrants it.
- Subject classification. The system classifies the content's subject (political, identity, scientific controversy, commercial, AI self-referential, neutral). Content on contested topics tends to produce cognitively flawed arguments at higher rates; the classification adjusts detector sensitivity accordingly. The result is visible in the report as a subject chip.
Both signals operate automatically. The same analysis engine runs on every input regardless of domain. Sensitivity adapts to what the system detects — not to what the user selects.
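As a rough sketch of how the two signals could feed one calibration step: the function name `calibrate`, the 0.5 authorship threshold, and every sensitivity multiplier below are placeholders invented for illustration, not values taken from Arbitir™. Only the subject labels and the five failure-mode names come from this document.

```python
from dataclasses import dataclass

# Placeholder multipliers per subject label; the labels are from this section,
# the numbers are invented for illustration only.
SENSITIVITY = {
    "political": 1.25,
    "identity": 1.25,
    "scientific controversy": 1.25,
    "commercial": 1.1,
    "AI self-referential": 1.1,
    "neutral": 1.0,
}

# The five AI-specific failure modes described in Section 04.
AI_FAILURE_MODES = (
    "agreeable mirroring",
    "fabrication",
    "engineered hedging",
    "inherited prior",
    "identity-protective reasoning",
)


@dataclass(frozen=True)
class Calibration:
    subject: str
    sensitivity: float
    extra_detectors: tuple[str, ...]


def calibrate(ai_score: float, subject: str, ai_threshold: float = 0.5) -> Calibration:
    """Derive detector settings from the two automatic signals.

    ai_score: hypothetical 0..1 confidence that the text is AI-authored.
    ai_threshold: assumed cut-off above which the extra detectors are enabled.
    """
    if subject not in SENSITIVITY:
        raise ValueError(f"unknown subject: {subject!r}")
    extra = AI_FAILURE_MODES if ai_score >= ai_threshold else ()
    return Calibration(subject, SENSITIVITY[subject], extra)


human_neutral = calibrate(ai_score=0.1, subject="neutral")
ai_political = calibrate(ai_score=0.9, subject="political")
```

Note that the caller supplies no domain choice: both inputs are derived from the content, which mirrors the "no user input" property described above.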
04
When AI authorship is detected: the five failure modes
When Arbitir™ detects that content was generated by an AI model — wholly or substantially — it applies five additional failure-mode detectors. These are not new reasoning flaws; they are specific mechanisms inside large language model training that produce the cognitive flaws the seven circles already measure.
1
Wanting to be liked / agreeable
Models trained with reinforcement learning from human feedback (RLHF) learn that user agreement raises reward and disagreement lowers it. Over time the model treats user agreement as the objective. It mirrors your framing back at you and avoids pushback even when the evidence demands it.
2
Lies (fabrication presented as fact)
When the model generates a response in a region where it has no actual knowledge, it does not stop. It continues with the same confidence as in well-attested regions. Citations to non-existent papers, invented statistics, false specifics about real entities — these are the visible artifact of fabrication. The confidence in the prose comes from how the model finishes sentences, not from what it knows.
3
Obfuscation (engineered hedging)
On topics the developer’s policy team has flagged sensitive, the model is trained to produce balanced-sounding equivocation regardless of whether the underlying evidence is actually balanced. “It’s complicated.” “Reasonable people disagree.” “Many perspectives.” This is a PR firewall protecting the model’s developer organization, not epistemic care protecting the user.
4
Untested assumption (inherited training-data prior)
The training corpus over-represents certain framings on contested topics. The model treats statistical frequency in training data as ontological truth. The framing isn’t true because it’s correct; it’s frequent because it dominated the text the model was trained on. The model cannot tell frequency apart from correctness.
5
Identity-protective reasoning (policy-team beliefs leaking through)
RLHF and safety fine-tuning insert explicit guardrails around topics the developer’s policy team flagged as protected. The model learns to avoid contradicting those positions even when evidence would. The selective application of critical scrutiny — applying it to one side of a question and withholding it from the other — is the diagnostic signal.
When Arbitir™ reports any of these patterns, it is stating a methodological finding about the visible behavior in the artifact and the documented mechanisms in LLM training that produce that behavior. It is not making a claim about the developer organization’s intent in any individual case.
05
What Arbitir™ does not claim
Arbitir™ does not:
- Adjudicate factual truth. It does not say “X is true” or “X is false.”
- Rule on political questions. It does not say which side of a debate is correct.
- Replace human judgment. It surfaces patterns; the user decides what to do with them.
- Endorse or oppose any organization, author, AI engine, or political position.
Arbitir™ does:
- Surface structural flaws in reasoning.
- Name the mechanism producing those flaws when known.
- Report aggregate patterns across organizations and AI engines once the sample size is sufficient to do so honestly.
06
Why “one-sided” is biased by methodology
Arbitir™’s methodology treats an artifact that presents only one side of a contested question as biased — regardless of which side it presents. This is not a political claim. It is a structural one.
A reasoning analysis that surveys only the evidence supporting a conclusion cannot establish that the conclusion is correct, because it has not engaged the evidence against it. The omission is the flaw. It does not matter whether the conclusion is, in fact, correct; the reasoning did not establish it. A correct conclusion reached via biased reasoning is still biased reasoning.
This rule applies symmetrically. An article from any political direction that engages only its own side’s evidence will receive a low grade. An article that engages both sides honestly — even if it ultimately argues for one — will receive a higher grade.
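The symmetry of the rule can be shown in a few lines. This is an illustrative sketch only: the `Artifact` fields and the `is_one_sided` function are hypothetical, and the real analysis presumably judges engagement in far more detail than two booleans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    position: str              # which side the piece argues for, e.g. "X" or "not-X"
    engages_supporting: bool   # does it present evidence for its own position?
    engages_opposing: bool     # does it engage the evidence against its position?


def is_one_sided(artifact: Artifact) -> bool:
    """Flag an artifact that presents only its own side's evidence.

    The position field is deliberately never read: the verdict depends only
    on what evidence was engaged, not on which side the artifact takes.
    """
    return artifact.engages_supporting and not artifact.engages_opposing


pro = Artifact(position="X", engages_supporting=True, engages_opposing=False)
con = Artifact(position="not-X", engages_supporting=True, engages_opposing=False)
balanced = Artifact(position="X", engages_supporting=True, engages_opposing=True)
```

Here `pro` and `con` receive the same flag even though they argue opposite positions, while `balanced` is not flagged despite arguing for one side.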
07
Aggregate scoring
Arbitir™ composites the per-artifact grades over time, per author, per organization, and per AI engine. Aggregate scores answer:
- Which organizations consistently produce cognitively honest content vs. consistently produce cognitively dishonest content?
- Which AI engines exhibit which failure modes most frequently, on which subjects?
- How does an organization’s reasoning quality trend over time?
Aggregate scores are not published until the sample size reaches a level where 95% confidence intervals around the score are tighter than meaningful differences between cohorts. Below that threshold, Arbitir™ holds the data internally and reports only individual-artifact grades.
This is a credibility commitment: Arbitir™ would rather report nothing on a cohort than report a score the sample size cannot support.
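One way to picture the publication threshold is a check that compares the width of a 95% confidence interval against the smallest difference considered meaningful. The sketch below assumes a normal-approximation interval on the mean of numeric per-artifact scores and reads "tighter" as "full interval width smaller than the meaningful difference"; the function names and the `min_meaningful_diff` parameter are hypothetical, and the document does not state which interval method Arbitir™ uses.

```python
from math import sqrt
from statistics import stdev

Z_95 = 1.96  # normal-approximation critical value for a two-sided 95% interval


def ci_half_width(scores: list[float]) -> float:
    """Half-width of an approximate 95% confidence interval for the mean score."""
    if len(scores) < 2:
        raise ValueError("need at least two scores to estimate spread")
    return Z_95 * stdev(scores) / sqrt(len(scores))


def can_publish(scores: list[float], min_meaningful_diff: float) -> bool:
    """Return True only when the full interval is narrower than the smallest
    cohort difference treated as meaningful (an assumed reading of the rule).
    """
    if len(scores) < 2:
        return False  # too little data: hold internally
    return 2 * ci_half_width(scores) < min_meaningful_diff


cohort_scores = [3.0, 3.2, 2.8, 3.1, 2.9]  # made-up per-artifact grade points
publish_loose = can_publish(cohort_scores, min_meaningful_diff=0.5)
publish_strict = can_publish(cohort_scores, min_meaningful_diff=0.2)
```

With these made-up scores the interval is roughly 0.28 grade points wide, so the cohort clears a 0.5-point bar but not a 0.2-point bar. For small samples a t-distribution critical value would give a wider, more conservative interval than the fixed 1.96 used here.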
08
Methodological defense
The findings Arbitir™ produces are derived from documented patterns in LLM training methodology (for AI-authored content) and documented patterns in reasoning analysis (for human-authored content). They are methodological conclusions, not factual adjudications.
Where Arbitir™ reports that a piece of AI-generated content exhibits agreeable mirroring, fabrication, engineered hedging, inherited prior, or identity-protective reasoning, it is stating a methodological finding about the visible behavior in the artifact and the known mechanisms in LLM training that produce that behavior. It is not a claim about the developer organization’s intent in any individual case.
Where Arbitir™ reports that a piece of content presents only one side of a contested question and is therefore biased, it is applying a stated methodological rule, not a political evaluation.
Where Arbitir™ reports aggregate scores per organization or per AI engine, it is compositing methodological findings over a sample large enough to support the published confidence interval.