AI · 2h ago
30-Day Benchmark: DeepSeek, Qwen, Kimi & GLM Compared on Cost and Speed
A developer benchmarked four Chinese LLM families (DeepSeek, Qwen, Kimi, and GLM) over 30 days, running 1,247 prompts across code, reasoning, and chat tasks. Qwen3-8B cost as little as $0.01 per million output tokens, while DeepSeek V4 Flash offered the best quality per dollar. The results show only a weak correlation between price and performance.
Meridian48 take
The benchmark is a useful real-world snapshot but lacks peer review and may not generalize to all workloads.
Read the full reporting
I Benchmarked DeepSeek, Qwen, Kimi & GLM for 30 Days — The Numbers →
DEV Community
llm-benchmark chinese-llms