pickuma.

Studis Review: Turning Product Photos Into Social Ads With Gemini and Claude

We tested Studis, an AI tool that turns one product photo into ad creatives, generated copy, hashtags, and audience targeting using a Gemini Flash Image and Claude model stack.

Product photography has a familiar bottleneck. You have one clean shot of the thing you’re selling, and then you need a dozen versions of it: a square for the feed, a vertical for Reels, one with a discount badge, one with copy that actually converts. Studis wants to collapse that into a single upload. You drop in a product photo, and it hands back ad creatives with generated copy, hashtag sets, and a suggested audience.

We looked at it from a developer’s angle, because the pitch to marketers is not the interesting part. The interesting part is the model stack. Studis runs Gemini 3.1 Flash Image for the visuals and Claude for the text — a working example of a pattern more teams will end up building themselves: send the image work to one model, the language work to another, and stitch the results into one artifact.

What Studis actually does

The workflow is deliberately short. You upload a product photo — a clean, well-lit shot with an uncluttered background gives the model the most to work with — then pick a target platform and a rough creative direction. Studis returns a batch of creatives rather than a single image: the product dropped into different generated backgrounds, recolored scenes, lifestyle settings, and layouts sized for specific placements.

Each creative comes with text attached. There’s a headline and primary copy written for the platform you picked, a block of hashtags, and a short audience description — the kind of interests-and-demographics summary you’d paste straight into an ad manager’s targeting fields. The output is positioned as close to publish-ready, not as a mood board. If a batch misses, you regenerate with a nudge — a different scene, a punchier tone — instead of starting from a blank screen.
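To make that output concrete, here is a minimal sketch of how one creative could be modeled if you were storing results yourself. Studis does not publish its schema, so the `Creative` class, its field names, and the `normalize_hashtags` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Creative:
    """One ad creative as described above; field names are illustrative."""
    image_path: str          # where the generated image was saved
    platform: str            # e.g. "instagram"
    headline: str
    primary_copy: str
    hashtags: list[str] = field(default_factory=list)
    audience: str = ""       # short interests-and-demographics summary


def normalize_hashtags(raw: list[str]) -> list[str]:
    """Strip whitespace, force one leading '#', drop blanks and duplicates."""
    seen: set[str] = set()
    cleaned: list[str] = []
    for tag in raw:
        body = tag.strip().lstrip("#").replace(" ", "")
        if not body:
            continue
        key = body.lower()  # compare case-insensitively, keep original casing
        if key not in seen:
            seen.add(key)
            cleaned.append("#" + body)
    return cleaned
```

Keeping the text fields next to the image reference means a regenerated batch can replace whole records rather than patching copy and images separately.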

In practice, that framing holds up about half the time. The background generation is the strongest part of the tool: the product stays recognizable across variations, and the lighting usually matches the scene instead of looking pasted on. The copy is competent and on-brief. What needs a human pass is anything specific: exact pricing, product claims, and text rendered inside the image, which still comes out garbled often enough that you can’t trust it unread.

The multi-model stack underneath

For developers, Studis is most useful as a reference architecture, because it is doing something you can reproduce.

Gemini 3.1 Flash Image handles the visual half. It’s a fast, low-cost image model tuned for editing and composition rather than pure text-to-image generation, which is the right call here — the job isn’t “invent a product,” it’s “keep this exact product and build a scene around it.” The model edits around an anchor image instead of generating from scratch, and that constraint is what keeps the product identity stable across a whole batch.

Claude handles the language half: headline, body copy, hashtags, and the audience summary. It receives the product context — category, key features, target platform — and writes copy constrained to that. Splitting the work this way is the lesson worth taking home. A single model asked to do both jobs tends to be mediocre at one of them. Routing each subtask to a model that’s genuinely good at it, then merging the outputs, produces a noticeably better artifact than forcing one model through the entire pipeline.
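The text half of that split can be sketched without committing to any SDK. The example below only builds a prompt from the product context and validates a JSON reply; the actual model call is left out. The function names, the prompt wording, and the JSON keys are assumptions for illustration, not what Studis sends.

```python
import json

REQUIRED_KEYS = {"headline", "primary_copy", "hashtags", "audience"}


def build_copy_prompt(category: str, features: list[str], platform: str) -> str:
    """Assemble a prompt that confines the copy to the supplied product facts."""
    feature_lines = "\n".join(f"- {f}" for f in features)
    return (
        f"Write ad copy for a {category} product on {platform}.\n"
        "Use only these features; do not invent prices or claims:\n"
        f"{feature_lines}\n"
        'Reply with JSON only, using the keys "headline", "primary_copy", '
        '"hashtags" (a list of strings) and "audience".'
    )


def parse_copy_reply(reply_text: str) -> dict:
    """Parse the model's reply and fail loudly if a field is missing."""
    data = json.loads(reply_text)
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"reply is missing fields: {sorted(missing)}")
    if not isinstance(data["hashtags"], list):
        raise ValueError("hashtags must be a list")
    return data
```

Asking for a fixed set of keys is what makes the later merge step mechanical: the text side either returns every field or raises before anything reaches a creative.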

The cost math also favors this design. Flash-class image models are cheap per generation, so producing eight or ten variations from one upload stays affordable. Claude’s text calls are short and inexpensive. The per-upload total stays low enough that generating a wide batch is the default behavior rather than a premium upsell.
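The arithmetic behind that claim is simple enough to write down. The sketch below uses placeholder prices chosen only to show the shape of the calculation; they are not real Gemini or Claude rates, so substitute current figures from each provider's pricing page.

```python
def cost_per_upload(n_images: int, image_price: float,
                    n_text_calls: int, text_price: float) -> float:
    """Total model spend for one upload: image generations plus text calls."""
    return n_images * image_price + n_text_calls * text_price


# Placeholder prices for illustration only, not real provider rates.
EXAMPLE_IMAGE_PRICE = 0.04   # hypothetical dollars per generated image
EXAMPLE_TEXT_PRICE = 0.01    # hypothetical dollars per copy request

batch = cost_per_upload(10, EXAMPLE_IMAGE_PRICE, 10, EXAMPLE_TEXT_PRICE)
single = cost_per_upload(1, EXAMPLE_IMAGE_PRICE, 1, EXAMPLE_TEXT_PRICE)
print(f"batch of 10: ${batch:.2f}, single: ${single:.2f}")
```

Cost scales linearly with batch size, so whether a wide batch is affordable depends entirely on the per-call prices you plug in.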

Where it fits and where it doesn’t

Studis is a good fit if you’re a solo founder or a small team shipping social-first products and you need volume — many creatives, many placements, fast iteration — more than you need pixel-perfect brand control. Generating a wide batch and keeping the two or three that land is exactly what the tool is built around.

It’s a poor fit if your brand has strict visual guidelines, if your category carries regulated claims (supplements, finance, medical), or if you need legible text baked into the image. And generated copy has to be read before it ships — both for plain accuracy and because an AI-written ad claim is still your legal responsibility, not the model’s.
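A cheap way to support that human read is a pre-publish check that flags copy containing anything checkable. The sketch below is a rough heuristic, not a compliance tool: the pattern list and the `flag_claims` name are assumptions, and the regulated-term list in particular is a small illustrative sample.

```python
import re

# Patterns that usually signal a specific, checkable claim.
CLAIM_PATTERNS = {
    "price": re.compile(r"[$€£]\s?\d"),
    "percentage": re.compile(r"\d+(\.\d+)?\s?%"),
    "number": re.compile(r"\b\d+(\.\d+)?\b"),
    "regulated_term": re.compile(
        r"\b(cure[sd]?|guaranteed?|clinically|risk-free|FDA)\b", re.IGNORECASE
    ),
}


def flag_claims(copy_text: str) -> list[str]:
    """Return the name of every pattern found, so a reviewer knows what to verify."""
    return [name for name, pattern in CLAIM_PATTERNS.items()
            if pattern.search(copy_text)]
```

An empty result does not mean the copy is safe to ship; it only means none of these patterns matched, so the check works as a prioritization aid for the reviewer rather than a gate.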

If the multi-model pattern is the part that caught your attention, the build-it-yourself version is not a large project. The Gemini and Claude APIs are each a handful of calls. The real work is the orchestration layer: accepting the anchor image, deriving structured metadata from it, fanning that out to both models, and assembling the returned image and text into one creative. That’s a weekend prototype with the right editor.
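The fan-out and assembly steps can be sketched with the standard library alone. In the example below the two model calls are injected as plain functions, so nothing here depends on a specific SDK; `build_creatives`, the two fake model functions, and the dictionary layout are all hypothetical. Deriving the metadata from the anchor image is left to the caller.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

ImageFn = Callable[[bytes, str], bytes]   # (anchor image, scene) -> new image
CopyFn = Callable[[dict, str], dict]      # (metadata, platform) -> copy fields


def build_creatives(anchor: bytes, metadata: dict, platform: str,
                    scenes: list[str], make_image: ImageFn,
                    write_copy: CopyFn) -> list[dict]:
    """Fan one upload out to both models, then merge the results per scene."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Image and text calls are independent, so submit them all up front.
        image_jobs = [pool.submit(make_image, anchor, s) for s in scenes]
        copy_jobs = [pool.submit(write_copy, metadata, platform) for _ in scenes]
        images = [job.result() for job in image_jobs]
        texts = [job.result() for job in copy_jobs]
    return [
        {"scene": scene, "platform": platform, "image": image, **text}
        for scene, image, text in zip(scenes, images, texts)
    ]


# Stand-ins so the sketch runs without network access or API keys.
def fake_image_model(anchor: bytes, scene: str) -> bytes:
    return anchor + b"|" + scene.encode()


def fake_copy_model(metadata: dict, platform: str) -> dict:
    return {"headline": f"Meet the {metadata['category']}",
            "hashtags": ["#" + platform]}


creatives = build_creatives(
    b"photo", {"category": "mug"}, "instagram",
    ["kitchen counter", "picnic table"], fake_image_model, fake_copy_model,
)
```

Swapping the fakes for real clients changes only the two injected functions. Threads are a reasonable fit here because the work is network-bound, though rate limits and retries would need handling in anything beyond a prototype.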

Cursor

The AI code editor for wiring up your own Gemini-plus-Claude pipeline — multi-file edits, inline API scaffolding, and fast iteration on the orchestration layer.

Free tier; Pro $20/month

Affiliate link · We earn a commission at no cost to you.

The honest summary: Studis is a competent assembler of a stack you could build yourself, sold to people who don’t want to build it. For non-technical marketers, that’s a real product. For developers, it’s a clear, well-chosen blueprint — and a reminder that many of the AI tools worth paying for right now are just two specialized models with good plumbing between them.

FAQ

Does Studis replace a product photographer?
No. It still needs a real product photo as input — it edits and recomposes around that anchor rather than inventing the product from nothing. What it replaces is the variation work: resizing, restaging, and reformatting one good shot into many placements.
Why split work between Gemini for images and Claude for text?
Each subtask has a model that is better at it. Gemini 3.1 Flash Image is tuned for fast, low-cost editing around an anchor image; Claude is stronger at on-brief, platform-specific copy. Routing each subtask separately and merging the results beats forcing one model through the whole pipeline.
Is the generated ad copy safe to publish as-is?
Not without review. Language models can introduce specific claims — discounts, features, numbers — that your product does not back. The advertiser is legally responsible for those claims, so every creative needs a human read before it goes live.

Some links above are affiliate links. We may earn a commission if you sign up. See our disclosure for details.
