How We Pick What to Review: The Content Calendar Behind Pickuma

We maintain a running spreadsheet of tool candidates scored by search demand, community discussion volume, and testability. Here is exactly how we decide which tools get reviewed, which get skipped, and why the queue looks the way it does.

The most common question I get from readers is not about a specific tool. It is, consistently: what are you reviewing next, and when?

When you are evaluating tools for a time-sensitive decision, a review that arrives in two months is a review that arrives too late. If you do not know whether it is coming at all, you cannot plan around it. This article is the answer: a transparent walkthrough of the content calendar that drives Pickuma — how I maintain it, how I score candidates, how I decide what gets published when, and which topics I kill before they reach a draft.

The Content Calendar, Explained

I maintain a single Google Sheet that functions as the editorial calendar. It is not a project management tool or a publishing platform. It is a spreadsheet with columns, and the columns are: tool name, category, demand score, testability score, editorial fit score, composite priority, estimated publish window, current status, and notes.

The sheet contains roughly sixty candidates at any given time. About twelve are active: they have cleared the selection gates and are somewhere between initial research and final draft. About twenty are scored and queued but not yet started. The remaining thirty or so are flagged for re-evaluation before they enter the queue.

Scores are recalculated once a month, on the first Monday. Scoring more frequently would produce noise — minor search volume fluctuations or a single Reddit thread would bounce priorities around. A monthly cadence lets signals accumulate enough to be meaningful and keeps the queue stable enough that I can commit to publish windows.
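The first-Monday cadence is easy to compute if the recalculation is ever scripted rather than done by hand. The following is a minimal sketch, not a description of how the sheet actually works; the function names `first_monday` and `next_recalculation` are hypothetical.

```python
from datetime import date, timedelta


def first_monday(year: int, month: int) -> date:
    """Return the first Monday of the given month."""
    first = date(year, month, 1)
    # date.weekday() returns 0 for Monday, so this is the number of
    # days to step forward from the 1st to reach a Monday.
    offset = (0 - first.weekday()) % 7
    return first + timedelta(days=offset)


def next_recalculation(today: date) -> date:
    """Return the next scoring date on or after `today`."""
    candidate = first_monday(today.year, today.month)
    if candidate >= today:
        return candidate
    # This month's first Monday has passed; roll over to next month.
    if today.month == 12:
        return first_monday(today.year + 1, 1)
    return first_monday(today.year, today.month + 1)
```

Calling `next_recalculation(date.today())` gives the date of the next scoring pass.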

How We Score and Prioritize

The scoring system is deliberately simple. Each candidate gets three scores on a 1-10 scale, and the composite priority is a weighted average. I stop at three because, past that point, additional dimensions stop adding signal and start adding noise.

Demand score (weight: 40%). Combines search volume for comparison and evaluation queries, community discussion persistence, and direct reader requests. Comparison queries (“X vs Y”) carry more weight than brand searches. A reader request describing a specific evaluation problem carries more weight than a generic “can you review X.” Five or more unique requests for the same tool force the demand score to at least 7, regardless of what search and community signals say.

Testability score (weight: 40%). Can I actually evaluate this tool in a way that produces useful observations? A tool with a generous free tier, clean documentation, and sub-30-minute setup scores a 9 or 10. A tool requiring a multi-node cluster, opaque pricing hidden behind a sales call, and a week of configuration before meaningful results scores a 2 or 3. Tools in the 2-3 range are effectively excluded — I review them only when demand is overwhelming and I can justify the infrastructure cost.

Editorial fit score (weight: 20%). The subjective dimension. Do I understand the category well enough to evaluate it properly? Would the review add something that does not already exist? Is the tool genuinely interesting rather than incrementally better? A tool that changes how developers think about a category scores high. A tool that is 15% cheaper with no other differentiation scores low.
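The three weights and the reader-request floor described above can be expressed in a few lines of Python. This is a sketch under stated assumptions: the weights (40/40/20) and the floor (five or more unique requests force demand to at least 7) come from the text, while the `Candidate` record, its field names, and the example values are hypothetical.

```python
from dataclasses import dataclass

# Weights as stated in the text: demand 40%, testability 40%, editorial fit 20%.
WEIGHTS = {"demand": 0.4, "testability": 0.4, "editorial_fit": 0.2}

# Five or more unique reader requests force demand to at least 7.
REQUEST_THRESHOLD = 5
DEMAND_FLOOR = 7.0


@dataclass
class Candidate:
    # Hypothetical record; field names are illustrative, not the sheet's columns.
    name: str
    demand: float          # 1-10
    testability: float     # 1-10
    editorial_fit: float   # 1-10
    unique_requests: int = 0


def effective_demand(c: Candidate) -> float:
    """Apply the reader-request floor to the raw demand score."""
    if c.unique_requests >= REQUEST_THRESHOLD:
        return max(c.demand, DEMAND_FLOOR)
    return c.demand


def composite_priority(c: Candidate) -> float:
    """Weighted average of the three scores, on the same 1-10 scale."""
    return (
        WEIGHTS["demand"] * effective_demand(c)
        + WEIGHTS["testability"] * c.testability
        + WEIGHTS["editorial_fit"] * c.editorial_fit
    )


# Made-up example: weak search demand (4), but five reader requests lift it to 7.
tool = Candidate("example-db", demand=4, testability=9, editorial_fit=6, unique_requests=5)
print(round(composite_priority(tool), 2))  # 7.6
```

Because the weights sum to 1, the composite stays on the same 1-10 scale as its inputs. In this example the request floor is worth 1.2 composite points: the same tool with no reader requests would score 6.4.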

The composite determines queue position, but I maintain a separate “publish window” column that spaces out same-category reviews. Publishing two database reviews back-to-back is less useful than alternating — database, then CI/CD, then monitoring, then returning to databases.
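The category spacing is described as a manually maintained column, not an algorithm. If it were automated, one simple approach is a greedy pass: always take the highest-priority remaining candidate whose category differs from the one just scheduled. The sketch below assumes that approach; the function name, the tuple layout, and the sample data are all hypothetical.

```python
from typing import List, Tuple

# (tool name, category, composite priority); an illustrative layout only.
Entry = Tuple[str, str, float]


def space_by_category(candidates: List[Entry]) -> List[Entry]:
    """Order candidates by priority while avoiding same-category neighbours."""
    remaining = sorted(candidates, key=lambda c: c[2], reverse=True)
    schedule: List[Entry] = []
    while remaining:
        last_category = schedule[-1][1] if schedule else None
        # Index of the best remaining candidate in a different category.
        # Falls back to index 0 when only the same category is left.
        pick = next(
            (i for i, c in enumerate(remaining) if c[1] != last_category),
            0,
        )
        schedule.append(remaining.pop(pick))
    return schedule


queue = [
    ("A", "database", 9.0),
    ("B", "database", 8.5),
    ("C", "ci/cd", 8.0),
    ("D", "monitoring", 7.0),
]
print([name for name, _, _ in space_by_category(queue)])  # ['A', 'C', 'B', 'D']
```

The trade-off is visible in the output: B outranks C on composite score but is published after it, because the queue gives up strict priority order to avoid back-to-back database reviews.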

Balancing trending and evergreen topics is the hardest editorial decision I make, and I do not have a formula for it. The tension is this: trending topics generate traffic now, but evergreen content keeps generating traffic for years. A review of a tool that launched last week and is dominating Hacker News will bring readers immediately, but the same review will be stale in six months when the tool has shipped three major versions. A comparison guide covering a settled category generates modest but consistent traffic for years, but it does not feel urgent to publish.

I aim for roughly 70% evergreen and 30% trending, but this is aspirational, not a quota. Some months the candidate pool tilts toward comparisons and settled categories; other months a wave of new tools shifts the ratio toward trending reviews. I do not force a ratio. I watch it.

The discipline I enforce: I do not rush a review to catch a trend. Most reviews take two to four weeks of testing, writing, and revision, plus another week for the editorial pipeline. If a tool launches and the hype cycle will be over before I can publish a quality review, I let it go. A rushed review that is wrong damages the site more than a review that is late.

This is also why I prioritize evergreen content so heavily. Trending reviews have a shelf life. Evergreen content accumulates. The comparison guides I published a year ago are still generating traffic today. A review of a tool that was hot for two weeks in March would be generating nothing by May.

What Gets Rejected and Why

Beyond the scoring system, there are categories of candidates that I kill outright — not deprioritize, not move to a “maybe later” column, but remove from the sheet entirely.

Pre-launch tools. If a tool exists only as a GitHub README, a waitlist, or a demo video, it is not a candidate. I made this mistake once — I evaluated a tool based on documentation and a private demo, wrote a positive review, then watched the public launch ship something materially different from what I had tested. The review was wrong. I pulled it.

Tools in categories I do not understand. I review developer tools, infrastructure, SaaS productivity tools, and select finance tools where I have domain expertise. If a reader requests a review of a video editing suite or a CRM platform, I decline. Publishing a review without the context to distinguish real innovation from feature parity produces exactly the shallow, summary-of-summaries content that I built Pickuma to replace.

Tools with predatory pricing. If pricing is hidden behind a sales call, or if the pricing model penalizes existing customers without corresponding value improvements, I exclude the tool from the candidate list regardless of demand. This is not a gate that most review sites apply, and I understand why — it eliminates high-revenue affiliate categories. But I cannot in good conscience recommend a tool where the cost is unknowable at evaluation time.

Review requests from vendors. These get filed separately and carry zero weight. A vendor email that says “we would love for you to review our product” does not influence the scoring system. If the tool is worth reviewing, developers are already asking about it.
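Taken together, the four rules above behave like a filter that runs before any scoring. The following sketch shows one way to express them. Everything in it is an assumption made for illustration: the `Request` record, its field names, and the category strings are hypothetical, and "select finance tools" is simplified to a single category.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the categories the text says are covered.
COVERED_CATEGORIES = {
    "developer tools",
    "infrastructure",
    "saas productivity",
    "finance",  # simplification: the text covers only select finance tools
}


@dataclass
class Request:
    tool: str
    category: str
    publicly_launched: bool
    pricing_public: bool
    penalizes_existing_customers: bool
    from_vendor: bool


def rejection_reason(r: Request) -> Optional[str]:
    """Return why a tool is removed from the sheet, or None if it may be scored."""
    if not r.publicly_launched:
        return "pre-launch"
    if r.category not in COVERED_CATEGORIES:
        return "outside covered categories"
    if not r.pricing_public or r.penalizes_existing_customers:
        return "predatory pricing"
    return None


def demand_weight(r: Request) -> int:
    """Vendor requests are filed separately and carry zero weight."""
    return 0 if r.from_vendor else 1
```

Note that a vendor request does not reject the tool. It simply contributes nothing to demand, so the tool can still enter the queue if readers ask about it independently.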

FAQ

How far ahead do you plan the content calendar?
I maintain a rolling three-month publish window. The current month is committed — those articles are in active testing or writing. The following month is queued but flexible — unexpected demand signals or reader requests can reshuffle it. The third month is aspirational — it contains candidates that I intend to review but have not started. Beyond three months, the candidate list is a pool, not a calendar. Tools move from the pool into the publish window when their composite score rises, when a reader request tips them over the threshold, or when a personal project forces me to evaluate them anyway.
Why are there gaps between reviews in the same category?
Because publishing two reviews in the same category back-to-back helps no one. A reader evaluating database tools needs a review of a database, not three reviews of three databases published on consecutive days. I also need the testing window between category-adjacent reviews to reset — switching from evaluating a monitoring platform to evaluating a CI/CD platform is less taxing than switching from one monitoring platform to another. The deliberate spacing is less efficient for production volume but produces reviews where the evaluation is sharper and the comparisons are more meaningful.
Can I see the full candidate list and scoring spreadsheet?
Not the full spreadsheet, which contains internal notes and scoring details that would not be useful out of context. But I maintain a public view of the active pipeline on the site that shows which tools are currently being evaluated, which are queued for the next publish window, and which have been deprioritized, with the reason. This public view is updated monthly alongside the scoring recalculation.

The calendar is a living document. It is wrong more often than it is right — demand shifts, testing reveals surprises, tools I expected to recommend turn out to be poor fits, and tools I expected to dismiss turn out to be excellent. The calendar’s job is not to be correct as a prediction. Its job is to be honest as a process.

If there is a tool you are evaluating and the answer to “is Pickuma reviewing it” would change your decision, send the request through the contact form. I read every one. I cannot promise a review, but I can promise that the request will be logged, scored, and considered alongside every other signal in the sheet. That is more than most review sites will tell you, and I think that matters.
