pickuma.

How We Pick Tools to Review: Selection Criteria and Prioritization

The criteria Pickuma uses to decide which developer tools get reviewed — from community demand signals to practical testability constraints. An inside look at our candidate selection process.


For every tool review we publish, there are roughly fifty tools we evaluated and decided not to cover — and several hundred more that never made it past the initial candidate scan. I want to explain exactly how those decisions are made, because the selection process determines what you see on this site as much as the writing process does.

Here is the honest starting point: I receive more review requests than I can fulfill, my personal interests do not perfectly align with what the audience wants, and my testing infrastructure has real constraints that exclude entire categories of tools. The process described here is my attempt to make those trade-offs explicit rather than pretending they do not exist.

How I Discover Candidate Tools

The candidate pool comes from four sources. I log every candidate in a spreadsheet that I review every two weeks.

My own work. This is where most reviews start. When I encounter a tool in the course of building something — a new database I spin up to test a side project, a deployment platform I try because the one I am using frustrates me, an open-source library that a colleague recommends — I add it to the candidate list with a note about what problem it solved and whether it felt like a genuine improvement over the default choice. These tools get priority because my testing starts from genuine need, not from a “what should I review next” brainstorm. The insights that come from using a tool because you have to are different from the insights that come from using a tool because you plan to write about it.

Community signal monitoring. I spend a few hours each week reading developer communities — Hacker News, specific subreddits, Discord servers for technologies I work with, and GitHub Discussions on popular projects. When a tool generates sustained conversation over weeks or months — not a launch-day spike that dies in two days, but recurring discussion with people reporting real experiences — it goes on the candidate list. I track this manually because automated sentiment analysis of developer conversations is unreliable, and I would rather do an incomplete manual job than an automated job I cannot verify.

Reader requests. Every request that comes through the site contact form gets logged with a date, the tool name, and a brief summary of what the reader wants to know. When a tool accumulates requests from five or more different people, it moves to the top of the priority queue regardless of search volume or my personal interest. This is a deliberate correction mechanism: the tools I find interesting are not always the tools developers are struggling to evaluate, and the reader request pipeline is how I calibrate editorial taste against audience demand.

Search demand analysis. I use keyword data to identify categories where developers are searching for comparisons and reviews but finding thin results. The signal I look for is comparison intent — queries like “X vs Y for Z use case” or “best tool for Python data pipelines” — because these indicate someone in the evaluation phase who is not finding adequate information. Generic brand-name searches do not count as demand because the searcher is likely a current user, not a prospect.
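The reader-request rule described above (promotion once five or more different people ask about the same tool) can be sketched in a few lines. This is an illustration only, not the actual spreadsheet logic: the `ReaderRequest` record, the `requester` identifier, and the lowercase normalization of tool names are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ReaderRequest:
    received: date
    tool: str
    requester: str  # hypothetical ID for the person, e.g. a hashed email
    summary: str


PROMOTION_THRESHOLD = 5  # "five or more" different people


def tools_to_promote(requests, threshold=PROMOTION_THRESHOLD):
    """Return tool names requested by at least `threshold` distinct people."""
    requesters_by_tool = {}
    for req in requests:
        key = req.tool.strip().lower()  # assumed normalization of tool names
        requesters_by_tool.setdefault(key, set()).add(req.requester)
    return sorted(
        tool
        for tool, people in requesters_by_tool.items()
        if len(people) >= threshold
    )
```

Counting distinct requesters rather than raw messages is the point of the sketch: one person writing in six times about the same tool still counts as a single request.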

The Three-Gate Filter

Once a tool is on the candidate list, it passes through three sequential gates. Tools that fail any gate are either deprioritized or removed from the list entirely, with the reason documented.

Gate one: real demand. The question here is simple: are developers actually looking for information about this tool? I measure this through the signals described above — search volume for comparison queries, community discussion persistence, and reader request count. If a tool shows no demand across any of these channels, it stays on the candidate list but at the bottom. I do not review tools because a vendor asked me to, offered me a trial account, or suggested that their tool “would be great for your audience.” Vendor outreach counts as zero demand signal. If the tool is good enough to review, developers are already asking about it. If they are not asking about it, a review is not going to manufacture demand where none exists, and I would rather spend my time on tools people are actively evaluating.

Gate two: practical testability. This is the gate that eliminates the most candidates, and it is the one I have the least control over. Can I actually test this tool in a way that produces meaningful observations?

The tools I can test share several characteristics: they have a free tier or open-source version that is fully functional, they can be evaluated on a single machine or a small cluster, and their core value proposition is observable within hours rather than months. This biases our coverage toward tools that serve individual developers and small teams — which, as it happens, is also the audience reading our reviews.

The tools I cannot test meaningfully include enterprise platforms that require a multi-node cluster to evaluate, security products that need a realistic threat simulation environment, and data platforms that require terabyte-scale datasets to expose their performance characteristics. When a tool falls into this category, I do not pretend it does not exist. I publish a note explaining that the tool was deprioritized due to testing constraints, not quality judgment. Some of the tools I cannot test are probably excellent. I just cannot verify that, and I will not publish a review that pretends otherwise.

This testability constraint is a real limitation of the site, and I want to be direct about it. Pickuma will never be able to review every developer tool worth reviewing. The infrastructure budget and the time budget are both finite. What we can do is be honest about which tools we can evaluate properly and which ones we cannot.

Gate three: editorial judgment. This is the most subjective gate, and I do not pretend it is anything other than my personal taste. After demand and testability are satisfied, the remaining decision is whether the tool is interesting enough to write about.

I define “interesting” narrowly. A tool is interesting if it changes the calculus for developers choosing in its category, not if it is incrementally better on a spec-sheet dimension. A new managed database that is 15% cheaper than RDS with the same feature set is not interesting, because the savings are unlikely to justify the cost and risk of a migration for most teams. A new database that eliminates the distinction between transactional and analytical workloads and lets you replace both your Postgres instance and your data warehouse with a single deployment is interesting, because it changes what you compare against.

A CI/CD platform with slightly better caching than GitHub Actions is not interesting. A CI/CD platform that eliminates YAML configuration entirely and lets you define pipelines in the same language as your application code is interesting. A monitoring tool with prettier dashboards than Grafana is not interesting. A monitoring tool that correlates application errors with infrastructure changes without manual configuration is interesting, because it changes the workflow, not just the UI.

This gate is shaped by who I am: a developer who has spent years evaluating, adopting, and migrating between tools, who has opinions about what counts as innovation and what counts as feature parity with better marketing. Your taste may differ from mine. If it does, the reader request pipeline is the correction mechanism — tools that I find uninteresting but that the audience is genuinely struggling to evaluate will surface through community demand signals and direct requests.
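As a rough sketch, the three gates can be read as a sequential check that stops at the first failure and records the reason. The field names on `Candidate` are hypothetical labels for the signals described above, and the editorial gate is reduced to a single boolean that a person sets by hand; nothing here is meant as the actual tooling behind the site.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    name: str
    # Gate one signals
    comparison_search_demand: bool
    sustained_discussion: bool
    distinct_requesters: int
    vendor_pitched: bool  # recorded, but deliberately ignored as a signal
    # Gate two signals
    has_usable_free_tier: bool
    fits_small_cluster: bool
    value_visible_in_hours: bool
    # Gate three: a human judgment call, entered manually
    changes_category_calculus: bool


def first_failed_gate(c: Candidate) -> Optional[str]:
    """Return the name of the first gate the candidate fails, or None."""
    has_demand = (
        c.comparison_search_demand
        or c.sustained_discussion
        or c.distinct_requesters > 0
    )
    if not has_demand:  # vendor outreach counts as zero demand
        return "demand"
    testable = (
        c.has_usable_free_tier
        and c.fits_small_cluster
        and c.value_visible_in_hours
    )
    if not testable:
        return "testability"
    if not c.changes_category_calculus:
        return "editorial"
    return None
```

The returned gate name doubles as the documented reason for deprioritizing a tool. The sketch does not model the reader-request promotion rule, which reorders the queue rather than changing what the gates check.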

What We Actively Avoid

Beyond the three gates, there are categories of tools I consciously deprioritize regardless of demand or testability signals.

Category leaders with no credible competition. GitHub does not need a review. There is no developer evaluating source control platforms who is making a genuine comparison between GitHub and the alternatives — the decision is already made. When a category is settled, I cover it in roundups and comparison guides rather than dedicating full reviews to the incumbent.

Tools with predatory pricing models. If a tool’s pricing is designed to extract revenue from locked-in customers rather than compete on value, I deprioritize it. This includes tools that hide pricing entirely behind a sales call, tools with usage-based pricing that is opaque until the first invoice arrives, and tools with a documented history of raising prices on existing customers without corresponding improvements.

Pre-release software that is not publicly usable. A GitHub repository with a README, a star count, and no working code is not a tool. A product launched with a waitlist and a demo video is not testable. I wait until there is something to install and use before I consider it for review.

Tools where the review would be indistinguishable from the product page. Some tools are so simple and well-documented that a review would add nothing. If the official documentation already answers every question a developer would ask during evaluation, a review is redundant. I would rather direct readers to the documentation than produce content that exists only to capture search traffic.

What This Process Produces

The selection pipeline produces a review queue that is smaller and slower than what a pure traffic-maximizing strategy would generate. That is intentional. Every review we publish represents a tool that developers are genuinely trying to evaluate, that we can test in a way that produces real observations, and that we believe is worth your attention — whether or not we ultimately recommend it.

That last point is important and frequently misunderstood. We review tools we do not recommend, and we publish those negative reviews. A detailed critique of a popular tool that explains why it did not work in practice is often more valuable than an endorsement of a lesser-known alternative, because it helps developers eliminate options they were already considering. The selection process does not pre-judge the recommendation. It determines what is worth evaluating. The testing determines what we say about it.

FAQ

How can I request a review of a specific tool?
Use the contact form on the site. Every request is logged with a date and a summary of what you want to know. Tools that accumulate multiple requests from different people get prioritized, especially when the requests describe a specific evaluation challenge that existing reviews do not address. A request that says “I want a review of X” is less useful than one that says “I am comparing X and Y for a Python data pipeline with real-time requirements and cannot find information about how they handle backpressure.”
Why have you not reviewed [popular tool X]?
Usually one of three reasons. First, it may be in the evaluation pipeline but not yet published — reviews take time, and the queue is full. Second, it may have failed the testability gate because proper evaluation requires infrastructure or datasets we cannot justify at our current scale. Third, it may be a category leader with a settled market position where a review would not change anyone's decision. You can check the public candidate list on the site for the specific reason tied to any tool.
Do you review tools that compete with your affiliate partners?
Yes, and we have published both positive reviews of tools that compete with our affiliate partners and critical reviews of tools we have affiliate relationships with. The selection process and the editorial judgment are independent of the revenue model. If they were not, the site would be indistinguishable from the generalist affiliate publishers we built Pickuma to replace. The editorial team makes all selection and recommendation decisions independently.
