Dispatches
Gemini 2.5 Pro is free on AI Studio. 1M context, Google's flagship, no credit card — the headline free-tier deal in AI right now. The catch: Google trains on everything you send through the free tier. Perfect for prototypes. Disastrous if you ship user data through it.
14,400 free requests a day on Llama 3.1 8B via Groq. That's 10 per minute sustained, zero dollars, zero credit card. Stop hunting 70Bs and build your side project on this — it's the most usable free tier in the game.
This is the most underrated free deal in AI. Qwen 3 235B A22B — Alibaba's frontier MoE — running at ~2000 tok/sec on Cerebras wafer-scale, 1M tokens/day free. A flagship model for zero dollars. Stop reading and go sign up.
Cerebras hands you 1 million free tokens a day on Llama 3.1 8B at 2000+ tok/sec on wafer-scale silicon. Literally nobody talks about this. Get a key, point your SDK at it, move on. Treat it as free speed and stop overthinking.
Moonshot's Kimi K2 is quietly on Groq's free tier at 60 RPM. Faster than anyone else serves it and completely free. If you've never run a Chinese frontier model on LPU silicon, this is your on-ramp — five minutes from signup to first call.
NVIDIA's Nemotron 3 Super — a 120B hybrid Mamba-Transformer MoE — is free on OpenRouter. Weird architecture worth poking at, long context, zero dollars. Prompts may be logged for upstream training, so keep it to experiments and synthetic data.
Microsoft throws $200 of Azure credit at new accounts and — as of March 2026 — the old Azure OpenAI access-request form is finally dead. Anyone with an account can deploy GPT-4o now. Catches: credit card required, 30-day clock on the credit, and free-tier subs still choke on Foundry deployments sometimes. But $200 is $200 for a serious throughput test.
Nous and Xiaomi just dropped a 2-week free window on MiMo V2 Pro — Xiaomi's ~1T-parameter MoE flagship — routed through Hermes Agent on the Nous Portal. Install Hermes, run hermes update, sign into a free Nous account, and you're calling a trillion-parameter model for nothing until ~April 21. Not OpenAI-compatible (Hermes Agent CLI only), but you're not going to get another shot at 1T free any time soon. Clock's ticking.
NVIDIA hands out 1,000 free inference credits to Developer Program members, 40 RPM, OpenAI-compatible, 100+ models behind one endpoint — Llama 3.1 70B, Nemotron, Kimi K2.5, MiniMax. The Dev Program form is 5 minutes of friction for 1,000 free calls. Good trade. Underused because nobody talks about it.
SambaNova gives you $5 of credits on signup — enough to touch Llama 3.1 405B on their RDU silicon, one of the only places you can run a 405B model free. Credits expire in 30 days, rate-limited free tier continues after. No credit card. If you haven't tried RDU inference yet, this is the cheapest test drive there is.
Llama 3.3 70B free on Groq at ~275 tok/sec. The 1K-request-a-day ceiling stops you running a SaaS on it, but for agent loops, evals, and weekend builds, it's the fastest free 70B on the planet. Go.
Alibaba's Qwen 3.6 Plus Preview just landed free on OpenRouter. 1M context. The 'Preview' label means the second Alibaba flips it to GA, the free endpoint dies — and nobody knows when that drops. This is exactly the kind of deal you check your inbox for. Use it now, not next week.
Cloudflare Workers AI gives you 10,000 free Neurons a day across Llama, Mistral, Qwen, and more — edge-deployed in a one-line Worker call. Neurons aren't tokens, so small models stretch way further than you'd think. If you already live on CF, this is effectively free inference co-located with your app.
GitHub Models lets you hit GPT-4.1, Llama, Mistral, and xAI behind your GitHub PAT for free. Read the fine print: the free tier caps context at 8K/4K — even on GPT-4.1 — and tops out at 50 req/day on the big models. Wrong tool for production. Right tool for one-token multi-provider experiments.
HuggingFace's free Inference tier is ten cents a month. Yes, ten cents. It's a tasting flight across DeepSeek V3, Llama, Qwen routed through HF's Inference Providers — you get a handful of calls, then you either upgrade to PRO at $9/mo or bounce. At least it's honest about what it is.