How We Use AI Without Letting It Hallucinate Into Reviews

An LLM will tell you, in confident prose, that a tool has a free tier it does not have, a price that changed eight months ago, and an integration that was never shipped. None of those are typos. They are the model filling a gap in its training data with the most plausible-looking token, and plausible is exactly the problem: a hallucinated spec reads identically to a correct one. If you publish reviews, that failure mode is not a curiosity. It is the thing that gets a reader to sign up for the wrong plan.

We use AI to write here, and we say so on every article that an LLM touched. So the honest question is not whether we use it — it’s what we do to keep it from inventing facts. This is the workflow.

The one rule: AI never sources its own facts

The single decision that prevents most hallucinations is structural, not clever. We separate two jobs that LLMs are wrongly assumed to do together: generating prose and establishing facts. The model is allowed to do the first. It is never allowed to do the second.

Concretely, that means every load-bearing claim in a review — a price, a tier limit, a launch date, whether feature X exists — comes from a source we opened ourselves, not from the model’s memory. The pricing page. The changelog. The docs. The actual product, in a trial account. We paste those facts into a notes document first, with the URL and the date we checked it, and only then does the model get to write around them.

The prompt we hand the model is the inverse of how most people use these tools. Instead of “tell me about Tool X’s pricing,” it’s “here are the four pricing facts, verified today; write the comparison paragraph using only these and flag anything you’d normally add that isn’t here.” That last clause matters. It turns the model’s instinct to embellish into a list of things for a human to go verify, rather than a list of things that quietly ship.
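One way to picture that inverted prompt is as a small function that refuses to run without facts. This is a minimal sketch under stated assumptions: `VerifiedFact`, `build_prompt`, and the `UNVERIFIED:` marker are hypothetical names invented for illustration, and the example fact and URL are placeholders, not real pricing.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class VerifiedFact:
    """One claim a human confirmed against a primary source (hypothetical record)."""
    claim: str         # the fact, in the words the human wrote down
    source_url: str    # the page that was opened to confirm it
    checked_on: date   # the day that page was checked


def build_prompt(task: str, facts: list[VerifiedFact]) -> str:
    """Assemble a prompt that hands the model its facts instead of asking for them."""
    if not facts:
        # No verified facts means the model would have to rely on memory.
        raise ValueError("refusing to build a prompt with no verified facts")
    # The source URL stays in the notes for the human; the model only needs the claim.
    fact_lines = "\n".join(
        f"{i}. {fact.claim} (checked {fact.checked_on.isoformat()})"
        for i, fact in enumerate(facts, start=1)
    )
    return (
        f"{task}\n\n"
        "Use ONLY the verified facts below. Do not add prices, limits, dates, "
        "or features from memory.\n\n"
        f"Verified facts:\n{fact_lines}\n\n"
        "After the draft, list under 'UNVERIFIED:' anything you would normally "
        "add that is not in the facts above."
    )


if __name__ == "__main__":
    # Placeholder data for illustration only.
    example = [
        VerifiedFact(
            claim="Tool X Pro plan is $12 per user per month",
            source_url="https://example.com/pricing",
            checked_on=date(2025, 1, 15),
        )
    ]
    print(build_prompt("Write the comparison paragraph for Tool X.", example))
```

The design choice worth noting is the early `ValueError`: an empty facts list is treated as a mistake, so there is no path where the model is asked to write about a product from memory alone.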

A related discipline: we don’t let the model cite. If a draft comes back with “according to a 2024 study” or “users report,” that phrase gets cut unless we can produce the study or the actual thread. Models generate citations the same way they generate everything else — by pattern — and a confidently formatted fake reference is worse than no reference, because it borrows the authority of a real one.
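A first pass at catching those phrases can be mechanical. The sketch below is an assumption about how such a filter might look, not a description of a real tool: the pattern list is a hypothetical starter set and `flag_attributions` is an illustrative name. It only surfaces lines for a human to check; it cannot tell whether a study exists.

```python
import re

# Hypothetical starter patterns; a real list would grow with each draft reviewed.
ATTRIBUTION_PATTERNS = [
    r"\baccording to\b",
    r"\busers (?:report|say|complain)\b",
    r"\b(?:a|one|the) (?:\d{4} )?(?:study|survey|report)\b",
    r"\bresearch (?:shows|suggests|finds)\b",
]
_ATTRIBUTION_RE = re.compile("|".join(ATTRIBUTION_PATTERNS), re.IGNORECASE)


def flag_attributions(draft: str) -> list[tuple[int, str]]:
    """Return (line_number, line) for every line containing an attribution phrase."""
    return [
        (number, line.strip())
        for number, line in enumerate(draft.splitlines(), start=1)
        if _ATTRIBUTION_RE.search(line)
    ]
```

Every flagged line then gets the same treatment as in the paragraph above: produce the study or the thread, or cut the phrase.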

What the model is actually good for

Saying “we don’t trust it with facts” can read as “we don’t really use it,” which isn’t true. The model does a lot of work; it just does the kind of work where being wrong is visible and cheap to fix.

It restructures. Hand it a messy set of verified notes and it produces a clean section order faster than we would. It catches the second “however” in a paragraph. It rewrites a sentence we’ve stared at too long. It generates the three FAQ questions a reader probably has, which we then answer ourselves from sources. It drafts the comparison-table skeleton so we’re filling cells instead of building markup.

None of those tasks require the model to know a single true fact about the outside world. They’re transformations of text we already verified, or structural suggestions a human signs off on instantly. That’s the sweet spot: the model’s output is checkable at a glance, and a wrong answer costs us ten seconds, not a reader’s trust.

The source of truth (the verified facts, the dated URLs, the “do not let the model touch this” list) needs to live in a real document, not a chat scrollback. We run it in a structured workspace so each claim has a checkbox, a source link, and a last-checked date that an editor can sort by.

Notion

Where our verified-facts sheet lives: one row per claim, each with a source URL and a last-checked date, so an editor can sort by what's gone stale before anything republishes.

Free for personal use; paid plans from $10/user/mo

Affiliate link · We earn a commission at no cost to you.

The check before publish, and the check after

Before a review goes out, it gets a pass whose only job is to find unsourced claims. The reviewer isn’t reading for style; they’re reading every factual sentence and asking “where did this come from?” If the answer isn’t in the notes doc, the sentence doesn’t ship. This is deliberately a separate pass from the editing pass — bundling them is how a smooth, well-written, factually invented paragraph slips through, because good prose lulls you into trusting the content.
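Part of that sourcing pass can be pre-sorted by a script, as long as nobody mistakes it for the pass itself. The following is a rough sketch, assuming the notes document is available as plain text; `unsourced_figures` is a hypothetical helper, and it only looks at numbers (prices, limits, years), so claims like "feature X exists" still need a human reader.

```python
import re

# Matches figures such as $10, 1,000, 99.9% or 2024. Deliberately broad.
_FIGURE_RE = re.compile(r"\$?\d+(?:,\d{3})*(?:\.\d+)?%?")


def unsourced_figures(draft: str, notes: str) -> list[tuple[str, str]]:
    """Return (figure, sentence) pairs for figures in the draft that never appear in the notes."""
    known = set(_FIGURE_RE.findall(notes))
    hits = []
    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", draft.strip()):
        for figure in _FIGURE_RE.findall(sentence):
            if figure not in known:
                hits.append((figure, sentence))
    return hits
```

The comparison is exact on purpose: `$10` in the notes does not vouch for a bare `10` in the draft. That produces some false alarms, which is the cheaper direction to be wrong in for a check like this.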

The after-publish problem is different and sneakier. A review can be 100% accurate the day it ships and wrong three months later because the tool changed its pricing. No amount of pre-publish discipline catches that. So the dated source links aren’t just for the initial check — they’re a recheck schedule. When a fact’s last-checked date gets old, or when a tool announces a change, we re-open the primary source and update the article, and we log it in the changelog so readers can see what moved and when. An AI-assisted review that’s never revisited drifts into the same wrongness as a hallucinated one; it just takes longer to get there.
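The recheck schedule reduces to a date comparison over the facts sheet. This is a minimal sketch, assuming each row has been exported as a claim, a source URL, and a last-checked date; the `Claim` type, the `stale_claims` name, and the 90-day default are all assumptions chosen for illustration, not a stated policy.

```python
from datetime import date, timedelta
from typing import NamedTuple


class Claim(NamedTuple):
    """One row of the verified-facts sheet (hypothetical shape)."""
    text: str
    source_url: str
    last_checked: date


def stale_claims(claims: list[Claim], today: date, max_age_days: int = 90) -> list[Claim]:
    """Return claims last checked more than max_age_days ago, oldest first."""
    cutoff = today - timedelta(days=max_age_days)
    overdue = [claim for claim in claims if claim.last_checked < cutoff]
    return sorted(overdue, key=lambda claim: claim.last_checked)
```

Passing `today` in explicitly, rather than calling `date.today()` inside the function, keeps the check repeatable. A date threshold only covers slow drift; an announced pricing change still has to trigger a recheck regardless of how recent the last one was.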

That’s the whole system, and it’s intentionally unglamorous. The model writes; humans own the facts; every claim has a dated source; two reads before publish and a recheck after. None of it depends on the model getting better or being prompted more cleverly. It depends on never asking the model to be the thing it can’t reliably be.

FAQ

Do you disclose which articles used AI?

Yes. Any article where an LLM wrote part of the body carries an AI-assisted note. The disclosure is non-negotiable: without it, the rest of the process means nothing.

If a human verifies every fact anyway, what does the AI actually save?

Structuring, drafting, and rewriting — the text transformation work where a wrong output is obvious and costs seconds to fix. We move the human effort off prose mechanics and onto fact-checking, which is where it's actually needed.

How do you handle a price or feature that changed after publishing?

Each factual claim is stored with a source URL and a last-checked date. When a date goes stale or a tool announces a change, we re-open the primary source, update the article, and record it in the changelog so the edit is visible.
