Subscription Plans & API Pricing¶

End-to-end reference for ALwrity's usage-based subscription tiers, API cost configuration, and plan-specific limits. All data is sourced from backend/services/subscription/pricing_service.py.

Subscription Plans¶

Legend: ∞ = Unlimited. Limits reset at the start of each billing cycle.

Plan	Price (Monthly / Yearly)	AI Text Generation Calls*	Token Limits (per provider)	Key API Limits	Video Generation	Image Editing	Audio Generation	Monthly Cost Cap	Highlights
Free	`$0 / $0`	100 Gemini • 50 Mistral (legacy enforcement)	100K Gemini tokens	20 Tavily • 20 Serper • 10 Metaphor • 10 Firecrawl • 5 Stability • 100 Exa	Not included	10 edits/mo	20 generations/mo	`$0`	Basic content generation & limited research
Basic	`$29 / $290`	50 unified LLM calls (Gemini + OpenAI + Anthropic + Mistral combined)	100K tokens each (Gemini, OpenAI, Anthropic, Mistral)	200 Tavily • 200 Serper • 100 Metaphor • 100 Firecrawl • 50 Images (OSS models) • 500 Exa	30 videos/mo (OSS: WAN 2.5)	50 edits/mo (OSS: Qwen Edit)	100 generations/mo (OSS: Minimax Speech)	`$45`	OSS-powered: Full content generation, advanced research, all tools access
Pro	`$79 / $790`	5K Gemini • 2.5K OpenAI • 1K Anthropic • 2.5K Mistral	5M Gemini • 2.5M OpenAI • 1M Anthropic • 2.5M Mistral	1K Tavily • 1K Serper • 500 Metaphor • 500 Firecrawl • 200 Stability • 2K Exa	50 videos/mo	100 edits/mo	200 generations/mo	`$150`	Premium research, advanced analytics, priority support
Enterprise	`$199 / $1,990`	∞ across all LLM providers	∞	∞ across every research/media API	∞	∞	∞	`$500`	White-label, dedicated support, custom integrations

*The Basic plan enforces a unified ai_text_generation_calls_limit of 50 requests across all LLM providers (increased from 10). Legacy per-provider columns remain for analytics dashboards but do not control enforcement.

OSS Models: Basic tier prioritizes Open-Source AI models via WaveSpeed for cost efficiency: - Image Generation: Qwen Image ($0.03) or Ideogram V3 Turbo ($0.05) - Image Editing: Qwen Edit ($0.02) or FLUX Kontext Pro ($0.04) - Video Generation: WAN 2.5 ($0.25 per ~5s video) - Audio Generation: Minimax Speech 02 HD ($0.05 per 1K characters)

Plan Feature Notes¶

OSS-First Strategy (Basic Tier)¶

The Basic tier prioritizes Open-Source AI models via WaveSpeed for cost efficiency, allowing more generous limits: - Image Generation: Defaults to Qwen Image OSS ($0.03/image) vs Stability ($0.04/image) - 25% savings - Image Editing: Defaults to Qwen Edit OSS ($0.02/edit) vs Stability ($0.04/edit) - 50% savings - Video Generation: Defaults to WAN 2.5 OSS ($0.25/video) - Better quality/value than HuggingFace - Audio Generation: Uses Minimax Speech 02 HD OSS ($0.05 per 1K chars) - High-quality TTS

Other Features¶

Video Generation: Basic tier uses WAN 2.5 OSS ($0.25 per ~5s video). Pro/Enterprise can use HuggingFace tencent/HunyuanVideo ($0.10) or premium models.
Image Generation: Basic tier uses OSS models (Qwen Image $0.03, Ideogram V3 Turbo $0.05). Pro/Enterprise can use Stability AI ($0.04/image) or premium models.
Research APIs: Tavily, Serper, Metaphor, Exa, and Firecrawl are individually rate-limited per plan.
Cost Caps: monthly_cost_limit hard stops spend at $45 (Basic) / $150 (Pro) / $500 (Enterprise). Enterprise caps are adjustable via support.

Provider Pricing Matrix¶

Gemini 2.5 & 1.5 (Google)¶

gemini-2.5-pro — $0.00000125 input / $0.00001 output per token ($1.25 / $10 per 1M tokens)
gemini-2.5-pro-large — $0.0000025 / $0.000015 per token (large context)
gemini-2.5-flash — $0.0000003 / $0.0000025 per token
gemini-2.5-flash-audio — $0.000001 / $0.0000025 per token
gemini-2.5-flash-lite — $0.0000001 / $0.0000004 per token
gemini-2.5-flash-lite-audio — $0.0000003 / $0.0000004 per token
gemini-1.5-flash — $0.000000075 / $0.0000003 per token
gemini-1.5-flash-8b — $0.0000000375 / $0.00000015 per token
gemini-1.5-pro — $0.00000125 / $0.000005 per token
gemini-1.5-pro-large — $0.0000025 / $0.00001 per token
gemini-embedding — $0.00000015 per input token
gemini-grounding-search — $35 per 1,000 requests after the free tier

OpenAI (estimates — update when official pricing changes)¶

gpt-4o — $0.0000025 input / $0.00001 output per token
gpt-4o-mini — $0.00000015 input / $0.0000006 output per token

Anthropic¶

claude-3.5-sonnet — $0.000003 input / $0.000015 output per token

Hugging Face / Mistral (GPT-OSS-120B via Groq)¶

Pricing is configurable through environment variables:

HUGGINGFACE_INPUT_TOKEN_COST=0.000001   # $1 per 1M tokens
HUGGINGFACE_OUTPUT_TOKEN_COST=0.000003  # $3 per 1M tokens

Models covered: openai/gpt-oss-120b:groq, gpt-oss-120b, and default (fallback).

Search, Image, and Video APIs¶

Search APIs¶

Tavily — $0.001 per search
Serper — $0.001 per search
Metaphor — $0.003 per search
Exa — $0.005 per search (1–25 results)
Firecrawl — $0.002 per crawled page

Image Generation (OSS Models via WaveSpeed)¶

Qwen Image (OSS) — $0.03 per image ⭐ Default for Basic tier
Ideogram V3 Turbo (OSS) — $0.05 per image (photorealistic, text rendering)
Stability AI — $0.04 per image (Pro/Enterprise)

Image Editing (OSS Models via WaveSpeed)¶

Qwen Image Edit (OSS) — $0.02 per edit ⭐ Default for Basic tier
Qwen Image Edit Plus (OSS) — $0.02 per edit (multi-image)
FLUX Kontext Pro (OSS) — $0.04 per edit (professional, typography)

Video Generation¶

WAN 2.5 (OSS) — $0.25 per video (~5 seconds) ⭐ Default for Basic tier
Seedance 1.5 Pro (OSS) — $0.40 per video (~5 seconds, longer duration)
HunyuanVideo (HuggingFace) — $0.10 per video request
Kling v2.5 Turbo (5s) — $0.21 per video
Kling v2.5 Turbo (10s) — $0.42 per video

Audio Generation (OSS Models via WaveSpeed)¶

Minimax Speech 02 HD (OSS) — $0.05 per 1,000 characters ⭐ Default

Updating Pricing & Plans¶

Initial Seed — python backend/scripts/create_subscription_tables.py creates plans and pricing.
Env Overrides — Hugging Face pricing refreshes from HUGGINGFACE_* vars every boot.
Scripts & Maintenance — Use backend/scripts/ utilities (e.g., update_basic_plan_limits.py, cap_basic_plan_usage.py) to roll forward changes.
Direct DB Edits — Modify subscription_plans or api_provider_pricing tables for emergency adjustments.

Cost Examples¶

Scenario	Calculation	Cost
Gemini 2.5 Flash (1K input / 500 output tokens)	(1,000 × 0.0000003) + (500 × 0.0000025)	$0.00155
Tavily Search	1 request × $0.001	$0.001
Hugging Face GPT-OSS-120B (2K in / 1K out)	(2,000 × 0.000001) + (1,000 × 0.000003)	$0.005
Image Generation (Basic - Qwen Image OSS)	1 image × $0.03	$0.03 (counts toward 50-image quota)
Image Editing (Basic - Qwen Edit OSS)	1 edit × $0.02	$0.02 (counts toward 50-edit quota)
Video Generation (Basic - WAN 2.5 OSS)	1 video × $0.25	$0.25 (counts toward 30-video quota)
Audio Generation (Basic - Minimax Speech OSS)	2,000 chars × $0.05/1K	$0.10 (counts toward 100-audio quota)

Enforcement & Monitoring¶

Middleware estimates usage and calls UsageTrackingService.track_api_usage.
UsageService.enforce_usage_limits validates the request before the downstream provider call.
When a limit would be exceeded, the API returns 429 with upgrade guidance.
The Billing Dashboard (/billing) shows real-time usage, cost projections, provider breakdowns, renewal history, and usage logs.

Additional Resources¶

Last Updated: January 2026

Recent Changes (OSS-Focused Strategy): - ✅ Basic tier limits increased: 50 AI calls (was 10), 100K tokens (was 20K), 50 images (was 5), 50 edits (was 30), 30 videos (was 20), 100 audio (was 50) - ✅ Cost cap adjusted: $45 (was $50) to align with $40-50 hard limit target - ✅ OSS models prioritized: Qwen Image ($0.03), Qwen Edit ($0.02), WAN 2.5 ($0.25), Minimax Speech ($0.05/1K chars) - ✅ 25-50% cost savings vs proprietary models enable more generous limits