Freetokens.dev

Gemini 2.5 Pro on Google AI Studio

Thu, 23 Apr 2026 16:30:00 +0000

Gemini 2.5 Pro is free on AI Studio. 1M context, Google's flagship, no credit card — the headline free-tier deal in AI right now. The catch: Google trains on everything you send through the free tier. Perfect for prototypes. Disastrous if you ship user data through it.

Llama 3.1 8B on Groq

Wed, 22 Apr 2026 10:15:00 +0000

14,400 free requests a day on Llama 3.1 8B via Groq. That's 10 per minute sustained, zero dollars, zero credit card. Stop hunting 70Bs and build your side project on this — it's the most usable free tier in the game.

Qwen 3 235B A22B on Cerebras

Mon, 20 Apr 2026 09:45:00 +0000

This is the most underrated free deal in AI. Qwen 3 235B A22B — Alibaba's frontier MoE — running at ~2000 tok/sec on Cerebras wafer-scale, 1M tokens/day free. A flagship model for zero dollars. Stop reading and go sign up.

Llama 3.1 8B on Cerebras

Fri, 17 Apr 2026 11:20:00 +0000

Cerebras hands you 1 million free tokens a day on Llama 3.1 8B at 2000+ tok/sec on wafer-scale silicon. Literally nobody talks about this. Get a key, point your SDK at it, move on. Treat it as free speed and stop overthinking.

Kimi K2 Instruct on Groq

Wed, 15 Apr 2026 17:00:00 +0000

Moonshot's Kimi K2 is quietly on Groq's free tier at 60 RPM. Faster than anyone else serves it and completely free. If you've never run a Chinese frontier model on LPU silicon, this is your on-ramp — five minutes from signup to first call.

Nemotron 3 Super 120B A12B on OpenRouter

Mon, 13 Apr 2026 13:15:00 +0000

NVIDIA's Nemotron 3 Super — a 120B hybrid Mamba-Transformer MoE — is free on OpenRouter. Weird architecture worth poking at, long context, zero dollars. Prompts may be logged for upstream training, so keep it to experiments and synthetic data.

GPT-4o on Microsoft Azure

Fri, 10 Apr 2026 14:00:00 +0000

Microsoft throws $200 of Azure credit at new accounts and — as of March 2026 — the old Azure OpenAI access-request form is finally dead. Anyone with an account can deploy GPT-4o now. Catches: credit card required, 30-day clock on the credit, and free-tier subs still choke on Foundry deployments sometimes. But $200 is $200 for a serious throughput test.

MiMo V2 Pro on Nous Research

Tue, 07 Apr 2026 10:00:00 +0000

Nous and Xiaomi just dropped a 2-week free window on MiMo V2 Pro — Xiaomi's ~1T-parameter MoE flagship — routed through Hermes Agent on the Nous Portal. Install Hermes, run hermes update, sign into a free Nous account, and you're calling a trillion-parameter model for nothing until ~April 21. Not OpenAI-compatible (Hermes Agent CLI only), but you're not going to get another shot at 1T free any time soon. Clock's ticking.

Llama 3.1 70B on NVIDIA NIM

Sun, 05 Apr 2026 12:00:00 +0000

NVIDIA hands out 1,000 free inference credits to Developer Program members, 40 RPM, OpenAI-compatible, 100+ models behind one endpoint — Llama 3.1 70B, Nemotron, Kimi K2.5, MiniMax. The Dev Program form is 5 minutes of friction for 1,000 free calls. Good trade. Underused because nobody talks about it.

Llama 3.3 70B on SambaNova Cloud

Fri, 03 Apr 2026 09:30:00 +0000

SambaNova gives you $5 of credits on signup — enough to touch Llama 3.1 405B on their RDU silicon, one of the only places you can run a 405B model free. Credits expire in 30 days, rate-limited free tier continues after. No credit card. If you haven't tried RDU inference yet, this is the cheapest test drive there is.

Llama 3.3 70B on Groq

Wed, 01 Apr 2026 14:30:00 +0000

Llama 3.3 70B free on Groq at ~275 tok/sec. The 1K-request-a-day ceiling stops you running a SaaS on it, but for agent loops, evals, and weekend builds, it's the fastest free 70B on the planet. Go.

Qwen 3.6 Plus Preview on OpenRouter

Tue, 31 Mar 2026 12:00:00 +0000

Alibaba's Qwen 3.6 Plus Preview just landed free on OpenRouter. 1M context. The 'Preview' label means the second Alibaba flips it to GA, the free endpoint dies — and nobody knows when that drops. This is exactly the kind of deal you check your inbox for. Use it now, not next week.

Llama 3.1 8B on Cloudflare Workers AI

Sat, 28 Mar 2026 10:00:00 +0000

Cloudflare Workers AI gives you 10,000 free Neurons a day across Llama, Mistral, Qwen, and more — edge-deployed in a one-line Worker call. Neurons aren't tokens, so small models stretch way further than you'd think. If you already live on CF, this is effectively free inference co-located with your app.

GPT-4.1 on GitHub Models

Thu, 26 Mar 2026 15:45:00 +0000

GitHub Models lets you hit GPT-4.1, Llama, Mistral, and xAI behind your GitHub PAT for free. Read the fine print: the free tier caps context at 8K/4K — even on GPT-4.1 — and tops out at 50 req/day on the big models. Wrong tool for production. Right tool for one-token multi-provider experiments.

DeepSeek V3 on Hugging Face

Mon, 23 Mar 2026 18:00:00 +0000

HuggingFace's free Inference tier is ten cents a month. Yes, ten cents. It's a tasting flight across DeepSeek V3, Llama, Qwen routed through HF's Inference Providers — you get a handful of calls, then you either upgrade to PRO at $9/mo or bounce. At least it's honest about what it is.

Command A on Cohere

Fri, 20 Mar 2026 11:30:00 +0000

Cohere's trial tier gives you Command A, Command R+, embeddings, and rerank for free. Non-commercial only, and the 1,000-calls-per-month hard cap will bite fast. Useless for an app — perfect for testing Cohere's rerank against your RAG pipeline before you commit.

DeepHermes 3 8B Preview on Nous Research

Tue, 17 Mar 2026 11:00:00 +0000

Nous dropped their own portal with a $5 signup credit and an OpenAI-compatible endpoint. Get DeepHermes 3 8B, Hermes 3 70B, and Hermes 4 behind one key without paying upfront. Rate limits are a little flaky right now — that's the Nous energy, lean in.