There is a quiet line item buried in millions of bank statements: a recurring charge for an AI companion. Ten dollars here, twenty there, a “premium” upgrade for image generation, a token top-up for voice. It feels small every month — which is exactly why it works on you. An AI companion with no subscription flips that arrangement. You pay once, you own the thing, and the meter stops running. This is a buyer’s guide to doing exactly that: what a true one-time payment companion actually is, how the lifetime math compares to the monthly apps, which “buy once” options still secretly phone home, and what hardware you need to run one that’s genuinely yours.

The forever-tax problem: $10–30/mo companion apps add up

Every major hosted AI companion is a subscription. That’s not an accident — it’s the business model. The servers cost money to run every single day your character exists, so the company needs you paying every single month, forever.

The headline prices look gentle. Per their published pricing, Replika Pro runs about $19.99/month (or roughly $5.83/month if you prepay a year), with an Ultra tier reported around $29.99/month. Candy AI advertises tiers from a few dollars a month on annual billing up to about $13.99/month — but, per its own pricing pages, premium features like image generation and extended voice run on a token system on top of the subscription, and multiple 2026 reviews report real active-user spend landing in the $25–$60/month range once tokens are included.

That’s the forever-tax. The sticker price is the floor, not the ceiling. And it never ends — cancel, and your companion is gone, because it never lived on your machine in the first place.

Buy-once vs subscription: the real multi-year math

Here’s the honest comparison. A subscription’s true cost isn’t this month — it’s every month you keep the relationship. Below, “subscription” uses a representative $20/month (a fair midpoint between Candy AI’s monthly tier and Replika Pro, before token add-ons), against a one-time $49 purchase like Ember.

Time heldSubscription @ $20/moHeavy user @ $30/moBuy-once @ $49
1 year$240$360$49
3 years$720$1,080$49
5 years$1,200$1,800$49

The break-even is brutal for the subscription model. A one-time companion pays for itself in under three months versus a $20 plan, and under two versus a heavy-token user. Everything after that is money you simply keep. Over five years, the difference is the price of a decent used GPU — or a vacation.

And note what the table doesn’t show: price hikes. Subscriptions go up. A one-time purchase you already own can’t be re-priced out from under you.

What you actually own with a local sold-once app

“Own” is a word subscription apps abuse. Let’s be precise about what a buy-once, local companion gives you that a rented one never can:

  • The model runs on your hardware. The AI weights execute on your own CPU/GPU via a local runtime like Ollama. No request ever leaves 127.0.0.1.
  • Your conversations are files on your disk. Not rows in someone’s database. You can back them up, move them, or delete them — for real, not “request deletion and trust us.”
  • No account, no cloud dependency. Pull your ethernet cable and it still works. That’s the test a hosted app can never pass. See offline AI girlfriend apps for why that matters.
  • No remote kill switch. A subscription company can change its content policy, lobotomize your character, or shut down entirely — and there’s nothing you can do. Local software you’ve already downloaded keeps running on your terms.

This is the difference between renting access to a companion and owning one. For the full privacy picture, the AI companion privacy guide breaks down exactly what each architecture can and can’t see.

One-time options compared (and which still phone home)

“One-time payment” on a marketing page does not always mean “no server involved.” Read carefully, because the category is muddy:

TypePays once?Runs on your machine?Can it phone home?
Hosted app w/ “lifetime” dealSometimesNo — cloudYes, always — it’s their server
”Local” wrapper around a cloud APIYes (app)No — calls an APIYes — your text hits the API
Truly local app (Ollama-based)YesYesNo — inference is on 127.0.0.1

The trap is the middle row. Plenty of “buy once” desktop companion apps are thin front-ends that still send every message to a hosted LLM API behind the scenes. You paid once for the app, but the conversation is still leaving your machine — and is still subject to whatever that API provider logs. A genuinely local app does inference itself; you can verify it by watching that nothing hits the network during a chat. If you want to sanity-check whether a local runtime is actually private, is Ollama really private? walks through it.

The rule: a one-time price only buys you privacy if it also buys you local inference. Otherwise you’ve just pre-paid a logging pipeline.

Why sold-once aligns with privacy (no server to pay for = no logging incentive)

This is the part people miss, and it’s the most important. The business model determines the privacy.

A subscription company has to run servers around the clock. Those servers cost money continuously, so the company is structurally pushed to monetize what flows through them — engagement metrics, behavioral data, upsell triggers, sometimes ad or model-training value from the conversations themselves. Even a well-meaning hosted app can read your chats, because the chats are processed on its hardware. (Whether a given company does anything beyond what its policy states is something only that company knows; the point is the architecture allows it.)

A sold-once local app has no such server. There is nothing running on the company’s side after you download it — so there is no place to log your conversations and no recurring cost creating an incentive to. The privacy isn’t a promise in a policy document you have to trust; it’s a property of where the code runs. No server, no logging incentive, no leak surface. That’s why “buy once, runs local” and “private” tend to be the same products. The deeper version of this argument is in why cloud AI censors you — the same server that can censor you is the server that can log you.

Hardware: what you need to run a buy-once local companion

The honest catch with owning your companion: you provide the compute. The good news is the bar is lower than people think, because companion chat doesn’t need a giant model.

VRAM is the number that matters. It decides how big a model you can run, and model size at a sensible quantization (tags like Q4_K_M) decides how good the conversation feels.

Your hardwareModel classCompanion experience
8 GB VRAM7–8B at Q4Solid, fast, snappy chat — best on 8GB
12–16 GB VRAM12–14B at Q4Noticeably richer, better memory — 12–16GB picks
24 GB VRAM24–32B at Q4Near-flagship feel locally — 24GB picks
Apple Silicon (16 GB+)8–14BGreat — unified memory does the work
No discrete GPU7–8B on CPU/RAMSlower but real; run without a GPU

Getting started is genuinely two lines. Install the runtime:

curl -fsSL https://ollama.com/install.sh | sh

Then pull and chat with a model:

ollama run llama3.1

A purpose-built companion app handles the persona, memory, and voice on top of that runtime so you’re not living in a terminal. If you’re spec’ing a machine from scratch, the hardware guide and how much VRAM for a companion cover the real numbers. The full walkthrough lives in how to run an AI girlfriend locally.

The pick: Ember ($49, local, yours)

For the buyer who wants a real, uncensored AI companion with no subscription and no monthly fee, the recommendation is Ember — a one-time $49 purchase that runs 100% on your own machine through Ollama. There’s no account, no cloud, and no token meter. After the download, nothing about your conversations touches the company’s servers, because there are no servers in the loop. It’s the clean version of everything above: pay once, own it, and the model — including uncensored, abliterated models — runs entirely on hardware you control.

Against the five-year math, $49 versus $1,200+ isn’t a close call. You buy it the way you’d buy a tool: once.

No GPU? The honest hosted note

The local route asks for a capable PC or Mac. If you don’t have one — or you just want to start chatting in the next two minutes with zero setup — a hosted companion is the pragmatic call, and we won’t pretend otherwise. Freya is the zero-install, runs-in-the-cloud option for exactly that reader: no GPU, no terminal, no driver hunt. The honest trade is the one this whole guide is about — convenience now in exchange for a recurring relationship with someone else’s server. If you’ve got the hardware and you care about owning what you pay for, Ember is the one to buy once and keep.

Sources: Pricing figures are drawn from publicly reported 2026 review and pricing pages for Candy AI and Replika; always confirm current pricing on each provider’s own site.