Chinese Open-Weights Model Kimi K2.6 Beats Claude, GPT-5.5, and Gemini in Coding Challenge

#Tech

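In a recent AI coding challenge, Kimi K2.6, an open-weights model from Chinese startup Moonshot AI, won with 22 match points. It finished ahead of Xiaomi's MiMo V2-Pro, which came second, and beat OpenAI's GPT-5.5 (third) and Anthropic's Claude Opus 4.7 (fifth).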

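The challenge was the Word Gem Puzzle, which is modeled on a sliding-letter puzzle game and tests the models' real-time programming ability.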

Using a "greedy slide" strategy, Kimi K2.6 held its lead even on larger grids, showing that open-weights models can be competitive on specific tasks.

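The result highlights the trend of open models catching up with frontier models and suggests that the performance gap between models is narrowing.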

Opening of the original article (English, first 3 paragraphs only)

By Rohana Rezel

I’m running the ongoing AI Coding Contest where I pit major language models against each other in real-time programming tasks with objective scoring. Day 12 was the Word Gem Puzzle. Ten models entered. The results were not what most people would have predicted.

Kimi K2.6, an open-weights model from Chinese startup Moonshot AI, won the challenge outright: 22 match points, 7-1-0. MiMo V2-Pro from Xiaomi came second. GPT-5.5 was third. Claude Opus 4.7 finished fifth. Every model from the Western frontier labs landed below the top two.

※ For copyright reasons, only the first 3 paragraphs are quoted. Please read the original article for the full text.
