🔬 PaperTrail Verify  ·  From $0.60/report

When you need to go
deeper than the numbers.

StyleMatch gives you eight metrics. Verify gives you what the AI can see that the metrics can't — internal consistency patterns across the whole essay, or a ten-dimension qualitative analysis that reads the writing itself, not just its statistics.

  • Two report types — no controlled sample needed for the first
  • AI reads the writing, not just the statistics
  • Evidence pulled directly from the student's text
  • Suggested conversation questions generated per report
  • Credits never expire — use them when you need them
[Product demo: PaperTrail StyleMatch form — student details (name, ID, date; teacher, course, and assignment optional), a Submitted Work field (1,766 words) with a "Fill from open doc" action and sample essay excerpt, a Controlled Sample field (1,861 words), and three actions: 🔍 Run StyleMatch · 🔬 Verify — Submitted · ⚖️ Verify — Compare]
Two Report Types

One uses what you have.
The other goes further.

Verify uses AI to read the writing itself — not just count words and measure sentences. Both report types draw on the same underlying stylometric signals as StyleMatch, but layer qualitative analysis on top: what the writing actually says about how it was produced, who produced it, and whether the two samples sound like the same person.

No sample needed
🔬 Consistency Report

One essay.
Is it all the same writer?

The Consistency Report works on a single submitted document — no controlled sample required. The AI chunks the essay into ~150-word segments and scores each one on six dimensions: Register, Vocabulary, Sentence Complexity, Argument Depth, Error Density, and Cohesion.

  • A colour-coded heatmap shows which chunks stand out from the document's own baseline
  • Shifts in register, vocabulary sophistication, or argument depth mid-essay are flagged
  • A writer baseline profile is generated — the characteristic measurements for this document
  • Pairs well with AI detection tools that highlight specific passages — Verify shows you whether those passages fit the rest of the document stylistically
Use this when: You don't have in-class writing, or you want to check whether a submitted essay is internally consistent — one voice throughout, or something else.
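The segmentation step described above is simple to picture in code. This is an illustrative sketch only, not PaperTrail's actual implementation — the six-dimension scoring itself is done by the AI and is not reproduced here:

```python
def chunk_essay(text: str, target_words: int = 150) -> list[str]:
    """Split an essay into consecutive chunks of roughly target_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + target_words])
        for i in range(0, len(words), target_words)
    ]

# A 450-word stand-in essay splits into three 150-word chunks.
chunks = chunk_essay(("word " * 450).strip())
print(len(chunks))  # 3
```

Each resulting chunk is then scored independently, so a shift in voice partway through an essay shows up as a run of chunks that diverge from the rest.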
⚖️ Baseline Deviation Report

Two samples.
Does the submitted work sound like the same person?

The Baseline Deviation Report goes beyond StyleMatch's statistical comparison. It uses AI to analyse ten qualitative dimensions — including patterns that numbers can't capture, like error fingerprints, characteristic phrase recurrence, and syntactic construction habits.

  • StyleMatch scores serve as quantitative anchors — the qualitative analysis builds on them
  • Ten dimensions rated Anomalous / Notable / Within Range with evidence pulled from the text
  • AI-suggested follow-up questions tailored to the specific deviations found
  • Research citations for every dimension — ten peer-reviewed sources cited per report
Use this when: StyleMatch raised flags and you want qualitative depth before a student conversation — or when you need a documented, evidence-based report.
Report Type 1

Consistency Report —
no controlled sample needed

This is the report that truly sets Verify apart. You don't need anything except the submitted essay itself. The AI segments the document into chunks, establishes a baseline from the document as a whole, then measures every chunk against that baseline.

What it's looking for is not bad writing — it's inconsistent writing. A sudden jump in vocabulary sophistication. A chunk where argument depth drops to descriptive. A change in register halfway through. These are the kinds of shifts a teacher notices but can't quite name.

🌡️

Six-dimension chunk scoring

Each chunk is scored 1–10 on Register, Vocabulary, Sentence Complexity, Argument Depth, Error Density, and Cohesion. The document's own average becomes the baseline — deviations are measured against the writer's own fingerprint, not a generic standard.
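One way to picture "measured against the writer's own fingerprint": treat the document's mean score as the baseline and label each chunk by how many standard deviations it sits from that mean. The scores and thresholds below are invented for illustration and are not PaperTrail's actual values:

```python
from statistics import mean, stdev

def flag_chunks(scores: list[float], threshold: float = 1.5) -> list[str]:
    """Label each chunk by its distance, in standard deviations,
    from the document's own mean score."""
    baseline, spread = mean(scores), stdev(scores)
    labels = []
    for s in scores:
        z = abs(s - baseline) / spread if spread else 0.0
        if z > 2 * threshold:
            labels.append("Flagged")
        elif z > threshold:
            labels.append("Notable")
        else:
            labels.append("Within")
    return labels

# Invented vocabulary scores: one chunk jumps in sophistication.
print(flag_chunks([5, 6, 5, 6, 9, 5]))
```

With these toy numbers the fifth chunk comes back "Notable" while the rest stay "Within" — the same three labels the heatmap colour-codes.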

🎨

Visual heatmap

Chunks are colour-coded: teal for within baseline, amber for notable, red for flagged. A clean grid gives you a visual read of the essay's consistency before you look at a single number.

🤝

Pairs with AI detection tools

If another tool has flagged specific passages as potentially AI-generated, the Consistency Report tells you whether those passages are also stylistically inconsistent with the rest of the document. It's a second, independent signal from a different angle.

👤

Writer baseline profile

The report closes with a characterisation of this writer's style — their typical register, vocabulary level, sentence complexity, and argument depth. A useful reference if you later run a Baseline Deviation Report on a second piece from the same student.

🔬 PaperTrail Verify — Consistency Report
Consistency Score: 78
Consistent with Single Authorship
29-chunk analysis · no anomalies flagged · stable formal register throughout
Words: 3,937 · Chunks: 29 · Flagged: 0 · Student: Emma Kowalski
Stylometric Heatmap — 29 chunks · 0 differences noted
[Heatmap grid: chunks 1–29, all within baseline]
Legend: Within Baseline · Notable · Flagged
Chunk Measurements — sample
Chunk  Register  Vocab  Complexity  Arg. Depth  Cohesion  Status
1      3 ↓       4 ↓    3 ↓         2 ↓         4 ↓       Within
3      7 ↑       7 ↑    6 ↑         6           7 ↑       Within
11     7 ↑       7 ↑    6 ↑         7 ↑         7 ↑       Within
18     6         6      7 ↑         7 ↑         6         Within
Document Baseline Profile
Register 5.6 · Vocabulary 5.8 · Complexity 5.3 · Arg. Depth 6.0 · Error Density 7.5 · Cohesion 6.1
Report Type 2

Baseline Deviation Report —
ten dimensions, cited evidence

This report is an extension of StyleMatch — it takes the statistical anchors StyleMatch produces and layers ten qualitative dimensions on top. The AI reads both writing samples and assesses each dimension, pulling actual phrases from the text as evidence rather than just reporting numbers.

These ten dimensions were chosen because they capture things that statistical metrics miss: how a writer constructs sentences, not just how long they are. Which error patterns recur, not just how many. Whether characteristic phrases reappear across both samples, or whether the vocabulary and idiom belong to two different people.

🔡

Error & Mechanics Fingerprint

Recurring grammar patterns, article errors, tense inconsistencies, comma splices. Error patterns are among the most individual stylometric signals — and the absence of expected errors is as significant as their presence. If a student's in-class writing has consistent spacing errors around numbers that vanish in the submission, that's a finding.

🔧

Syntactic Patterning

Structural construction preferences beyond complexity: how the writer opens sentences, how they link clauses, what phrase-structure habits recur. Two writers can produce sentences of identical average length using completely different structural habits. This dimension captures that.

💬

Characteristic Phrase Recurrence

Whether the same analytical hedges, evidence-introduction patterns, and idiosyncratic expressions recur across both samples. Recent LLM-era authorship research has found that characteristic content words carry strong authorial signal — particularly when a student has a distinctive way of framing claims.

🗂️

Nominal vs. Verbal Style

Whether the writer favours noun-heavy constructions (the implementation of, a consideration of) or verb-driven ones (implementing, considering). This cognitive-stylistic habit is remarkably stable across topics and genres — and hard to fake consistently.

💡

AI-generated follow-up questions

The report closes with suggested questions tailored to the specific deviations found — not generic prompts, but questions that reference what was actually observed. What to ask a student about their function word patterns. How to probe the absence of characteristic errors. Ready to use before the conversation.

⚖️ PaperTrail Verify — Baseline Deviation Report
Notable Baseline Deviation
2 anomalous · 2 notable · 6 within range across ten dimensions
4,242-word submitted vs. 5,045-word baseline
Report ID: PT-K7N9M3  ·  Date: March 10, 2026  ·  Student: Emma Kowalski
StyleMatch Algorithmic Baseline
Function Word Profile: 43/100
Punctuation Fingerprint: 82/100
Sentence Length: 75/100
FK Grade Level: 92/100
Type-Token Ratio: 87/100
Informality Markers: 98/100
2 Anomalous · Exceeds expected variation
2 Notable · Warrants attention
6 Within Range · Consistent with baseline
Ten-Dimension Analysis
Function Word Distribution
🔴 Anomalous
Submitted Work
in which · of the · has been
Controlled Sample
to a · in the · which is
43/100 similarity — substantial deviation in preposition and conjunction patterns. Function words are used unconsciously and remain stable across topics.
Error & Mechanics Fingerprint
🔴 Anomalous
Submitted Work
border cross militancy · compelling questions over
Controlled Sample
'800,000people' · '42million' people
Characteristic spacing errors around numbers in the controlled sample are completely absent in the submitted work. The absence of expected errors is a documented authorship signal.
Syntactic Patterning
🟡 Notable
Submitted
As of 2023… · The CFR remarked that
Baseline
According to… · According to…
Both favour source-attribution openings, but submitted work shows more varied structural patterns — temporal markers, prepositional phrases — vs. repetitive "According to" in the baseline.
Nominal vs. Verbal Style
✅ Within Range
Submitted
the implementation of · the articulation of
Baseline
the ratification of · the lack of
Consistent noun-heavy academic constructions in both samples. Stable cognitive-stylistic preference.
AI-Generated Follow-Up Questions
1.
Can you walk me through how you decided to structure this opening paragraph — specifically why you used "in which" rather than a more direct construction?
2.
I noticed your in-class writing has some consistent patterns around numbers and quotations that don't appear here. Can you tell me about your editing process for this piece?
3.
In your controlled sample you tend to introduce sources with "According to" — why did you vary that structure more in this essay?
The Research Basis

Ten dimensions.
All of them grounded.

Every dimension in the Baseline Deviation Report is anchored in peer-reviewed research. The AI isn't pattern-matching against a generic rubric — it's applying stylometric frameworks that have been validated across thousands of texts in authorship attribution, forensic linguistics, and computational humanities research.

The research citations appear in every report — not as a disclaimer, but because a teacher handing a report to a department head or parent deserves to be able to say exactly what it's based on and where it comes from.

Function Word Distribution
Burrows' Delta — the most validated authorship attribution method
Burrows (2002) Literary & Linguistic Computing · Mosteller & Wallace (1964) Federalist Papers attribution
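Burrows' Delta itself is straightforward to sketch: standardise function-word frequencies into z-scores against a reference corpus, then take the mean absolute difference between two texts. A toy illustration, not PaperTrail's implementation — the corpus and feature words here are invented:

```python
from statistics import mean, stdev

def relative_freqs(text: str, features: list[str]) -> list[float]:
    """Per-text relative frequency of each feature word."""
    words = text.lower().split()
    return [words.count(f) / len(words) for f in features]

def burrows_delta(text_a: str, text_b: str,
                  corpus: list[str], features: list[str]) -> float:
    """Mean absolute difference of corpus-standardised feature z-scores."""
    table = [relative_freqs(t, features) for t in corpus]
    fa = relative_freqs(text_a, features)
    fb = relative_freqs(text_b, features)
    diffs = []
    for i in range(len(features)):
        column = [row[i] for row in table]
        mu, sigma = mean(column), stdev(column)
        if sigma == 0:
            continue  # feature carries no signal in this corpus
        diffs.append(abs((fa[i] - mu) / sigma - (fb[i] - mu) / sigma))
    return mean(diffs) if diffs else 0.0

# Identical texts score 0; smaller Delta means more similar usage.
corpus = ["the cat and the dog sat", "a dog of a cat the end"]
print(burrows_delta(corpus[0], corpus[0], corpus, ["the", "of", "and"]))  # 0.0
```

In practice Delta is run over the most frequent function words of a large reference corpus, which is what makes it robust across topics.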
Error & Mechanics Fingerprint
Idiosyncratic error patterns are among the most individual stylometric signals
Sapkota et al. (2014) NAACL-HLT · Stamatatos (2013) Journal of Law and Policy · Character n-gram stability research
Syntactic Complexity & Patterning
Sentence structure and clause-linking preferences are stable cross-genre signals
McNamara et al. (2014) Coh-Metrix · Björklund & Zechner (2017) Natural Language Engineering
Cohesive Device Profile
Discourse connective preferences are stable per writer and reflect reasoning habits
Ferracane, Wang & Mooney (2017) IJCNLP · Hyland, Metadiscourse (2005)
Characteristic Phrase Recurrence
Content words and characteristic phrases carry strong authorial signal in LLM-era research
PMC / National Library of Medicine (2025) PLoS ONE doi:10.1371/journal.pone.0320609
Lexical Diversity
Length-normalised measures (MATTR, MTLD) provide stable cross-text comparisons
Covington & McFall (2010) Journal of Quantitative Linguistics · Biber (1988) Variation Across Speech and Writing
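MATTR, one of the length-normalised measures cited above, averages the type-token ratio over a fixed sliding window so that short and long texts can be compared fairly. A minimal sketch, with an arbitrary default window size:

```python
def mattr(text: str, window: int = 50) -> float:
    """Moving-average type-token ratio over a fixed-size sliding window."""
    tokens = text.lower().split()
    if len(tokens) < window:
        return len(set(tokens)) / len(tokens)  # fall back to plain TTR
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)

# Maximally repetitive text scores 0.5 with a window of 2.
print(mattr("a a a a", window=2))  # 0.5
```

Because every window is the same length, the measure does not drift downward on longer texts the way raw type-token ratio does.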

Pay for what
you actually use.

Verify runs on credits. Buy a pack when you need one — they never expire. Most teachers use Verify when something in Inspect or StyleMatch warrants a deeper look, not on every submission.

Inspect is always free. StyleMatch requires an active subscription.

Starter Pack

10 reports — enough for a full class set when something comes up. Mix Consistency and Baseline Deviation as needed.

$6 · 10 reports · $0.60 each · Credits never expire

Standard Pack

35 reports — a semester's worth of deeper investigation for a busy teacher.

$18 · 35 reports · $0.51 each · Credits never expire

Department Pack

100 reports — for a department or shared license. Coordinate through one account.

$45 · 100 reports · $0.45 each · Credits never expire
🔒

Text not retained

Student text is transmitted to generate the report, then immediately discarded. Nothing is stored after analysis completes.

👤

No student accounts

Students never interact with PaperTrail. The teacher runs the report from their own account. No student data is collected.

📋

Educator judgment first

Every report carries a disclaimer — it is a data point, not a verdict. No report should be used as the sole basis for any disciplinary decision.

🏫

FERPA & PIPEDA conscious

Designed with North American school privacy requirements in mind. No advertising, no data resale, no third-party tracking.

Start with Inspect free. Use Verify when it matters.

Most of the time, Inspect tells you what you need. When something warrants a deeper look — before the conversation with the student — that's when Verify earns its credit.

⬇ Add PaperTrail to Chrome
Inspect always free · StyleMatch $4.99/mo · Verify from $0.60/report · Credits never expire