The Best Models for Token Efficiency When Using Lucena Coder

Our short list of models that tend to give Lucena Coder the best blend of price, coding usefulness, and working speed.

The cheapest coding run is rarely just the cheapest model.

A model can have beautiful per-token pricing and still waste money if it drifts, over-explains, misses tool calls, or needs three correction loops. Lucena Coder is built to keep the working context tight, so small differences in model behavior show up quickly: which models act cleanly on current state, which ones burn output, and which ones are pleasant enough to use all day.
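As a rough illustration, the sketch below compares cost per finished task rather than cost per token. Everything in it is an assumption made up for the example: the two unnamed models, their prices, the token counts, and the attempt counts are not measurements, and `cost_per_task` is a hypothetical helper, not part of Lucena Coder.

```python
def cost_per_task(input_price, output_price, input_tokens, output_tokens, attempts=1):
    """Dollar cost of finishing one task.

    Prices are dollars per 1M tokens; token counts are per attempt.
    Assumes every attempt costs about the same, which is a simplification.
    """
    per_attempt = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return per_attempt * attempts


# Hypothetical model A: cheap per token, but verbose and needs three correction loops.
cheap_but_loopy = cost_per_task(0.30, 1.20, input_tokens=20_000, output_tokens=6_000, attempts=3)

# Hypothetical model B: three times the per-token price, terse, gets it right once.
pricier_one_shot = cost_per_task(0.90, 3.00, input_tokens=20_000, output_tokens=2_000, attempts=1)

print(f"A: ${cheap_but_loopy:.4f} per task")
print(f"B: ${pricier_one_shot:.4f} per task")
```

Under these invented numbers the model with the higher list price comes out cheaper per task, because the retries and extra output cost more than the per-token discount saves.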

These are our current favorite models for token-efficient coding work in Lucena Coder.

Performance Favorite: GLM 5.1

$0.98 input / $3.08 output per 1M tokens.

Favorite for Speed vs Price: DeepSeek V4 Flash

$0.09 input / $0.18 output per 1M tokens.

Cheapest Model That Runs Well: Gemma 4 31B

$0.12 input / $0.35 output per 1M tokens.

Our Favorite Models to Use with Lucena Coder

Prices are listed per 1 million tokens and were checked against OpenRouter's public model catalog on June 19, 2026. Speed tiers are a vibe check based only on our own internal testing.

Model                   Input / 1M  Output / 1M  Speed
Claude Haiku 4.5        $1          $5           Fast
Claude Sonnet 4.6       $3          $15          Medium
Claude Fable 5          $10         $50          Slow
Claude Opus 4.8         $5          $25          Slow
Claude Opus 4.7         $5          $25          Slow
DeepSeek V4 Flash       $0.09       $0.18        Fast
DeepSeek V4 Pro         $0.435      $0.87        Medium
Gemma 4 31B             $0.12       $0.35        Medium
Gemini 3.5 Flash        $1.50       $9           Fast
Gemini 3.1 Pro Preview  $2          $12          Medium
Mistral Large 3 2512    $0.50       $1.50        Medium
Devstral 2 2512         $0.40       $2           Medium
GPT-5.4 Mini            $0.75       $4.50        Fast
GPT-5.4                 $2.50       $15          Medium
GPT-5.5                 $5          $30          Slow
MiniMax M2.7            $0.25       $1           Medium
Qwen3.7 Max             $1.25       $3.75        Medium
Qwen3.7 Plus            $0.32       $1.28        Fast
GLM 5                   $0.60       $1.92        Fast
GLM 5.1                 $0.98       $3.08        Fast
GLM 5.2                 $1.20       $4.10        Medium
Grok Build 0.1          $1          $2           Fast
Grok 4.20               $1.25       $2.50        Fast
Grok 4.3                $1.25       $2.50        Fast
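
Because input and output are priced separately, one way to compare rows on a single number is a blended price for an assumed traffic mix. The sketch below uses four rows copied from the table above. The 10:1 input-to-output token ratio is only an assumption for illustration (your own mix may differ), and `blended_price` is a hypothetical helper.

```python
# Prices in dollars per 1M tokens (input, output), copied from the table above.
PRICES = {
    "GLM 5.1": (0.98, 3.08),
    "DeepSeek V4 Flash": (0.09, 0.18),
    "Gemma 4 31B": (0.12, 0.35),
    "Claude Haiku 4.5": (1.00, 5.00),
}


def blended_price(input_price, output_price, input_share=10, output_share=1):
    """Weighted average price per 1M tokens for an assumed input:output mix."""
    total = input_share + output_share
    return (input_price * input_share + output_price * output_share) / total


# Cheapest blended price first, under the assumed 10:1 mix.
ranked = sorted(PRICES, key=lambda name: blended_price(*PRICES[name]))
for name in ranked:
    print(f"{name}: ${blended_price(*PRICES[name]):.3f} per 1M blended tokens")
```

This only ranks list prices. It says nothing about how many tokens a model actually needs to finish a task.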

This list changes frequently as we test more models with Lucena's token-efficient architecture. Have a model we should test? Reach out: hello@lucena.one