New Model Said to Clear the LLM Bottleneck

#Startups

US-based AI startup Subquadratic has disclosed details of a new model that it claims resolves a long-standing performance bottleneck in large language models (LLMs).

According to the company, its model, SubQ, runs faster and at lower cost than existing LLMs and can also handle larger datasets.

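Subquadratic, an AI startup based in Miami, Florida, came out of stealth in May 2026 claiming a technical breakthrough in LLM performance, and in June 2026 it shared further details. Some experts, however, remain skeptical of its claims.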

The claimed breakthrough

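Subquadratic claims to have broken through a mathematical constraint that has limited LLM performance. According to the company, its new model, SubQ, is faster, uses less energy, and costs less to run than conventional models. It also says SubQ can process up to 12 times as much text at once as most other models.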

Evidence and reactions

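The initial announcement was backed only by test scores the company had published itself, which left some experts doubtful. Subquadratic has since released the results of independent tests run by the third-party firm Appen, and those results appear to support many of its claims.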

How the technology works and what comes next

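The transformer, the core mechanism of an LLM, has to compute the relationship between every pair of words when it processes text, so the amount of computation grows roughly with the square of the text length. Subquadratic has adopted a technique called "sparse attention" that sharply reduces this computation, and the company says its approach could significantly change how LLMs are built.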
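
To make the contrast concrete, here is a minimal sketch comparing how many query-key comparisons dense attention needs with how many a simple sliding-window pattern needs. The sliding window is one well-known form of sparse attention and is used here purely as an illustration; the source does not describe SubQ's actual mechanism. The function names and the window size are hypothetical.

```python
def dense_pairs(n_tokens: int) -> int:
    """Query-key comparisons when every token attends to every token."""
    return n_tokens * n_tokens


def windowed_pairs(n_tokens: int, window: int) -> int:
    """Comparisons when each token attends only to neighbours within
    `window` positions on either side (an illustrative sparse pattern,
    not SubQ's published design)."""
    total = 0
    for i in range(n_tokens):
        lo = max(0, i - window)             # first position this token can see
        hi = min(n_tokens - 1, i + window)  # last position this token can see
        total += hi - lo + 1
    return total


if __name__ == "__main__":
    window = 128  # hypothetical window size
    for n in (1_000, 2_000, 4_000):
        print(n, dense_pairs(n), windowed_pairs(n, window))
```

Doubling the text length quadruples the dense count, while the windowed count only roughly doubles. That difference in growth rate is what makes long inputs expensive for dense attention.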

Summary

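Subquadratic's work may prove to be a major step toward more efficient LLMs. For now, the things to watch are further verification and progress toward practical deployment.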

Opening of the original article (in English, first three paragraphs only)

Subquadratic has now shared more details about its new model. But some are still skeptical.

June 19, 2026

Miami-based AI startup Subquadratic came out of stealth mode last month with a huge claim. It announced that it had solved a mathematical bottleneck that had been holding back large language models for almost a decade. The details were thin, and many people were unconvinced. But Subquadratic has started to bring the receipts, sharing the results of an independent evaluation of its new tech. The results suggest that the company’s claims might be worth paying attention to.

According to Subquadratic, it has developed a new kind of LLM, called SubQ, that is faster and cheaper and uses a lot less energy than any other model on the market. The company also claims that SubQ is able to process up to 12 times as much text at once than most other models, allowing it to carry out a range of data-heavy tasks, such as analyzing hundreds of documents or entire code bases. What’s more, Subquadratic says, SubQ does this while more or less matching the performance of the best models put out by Google DeepMind, OpenAI, and Anthropic on key tasks like coding.

The problem was that the company at first provided little evidence for its claims beyond a handful of self-published test scores. And it has yet to make SubQ widely available for people to try out themselves. So it’s no surprise that Subquadratic’s claims were met with skepticism. Dan McAteer, an artificial intelligence engineer, captured the overall response on X: “SubQ is either the biggest breakthrough since the Transformer ... or it’s AI Theranos.”

A month on, the company has published more information about its model, including the results of additional independent tests run by third-party firm Appen. “We expected healthy skepticism,” says Subquadratic cofounder and chief technology officer Alex Whedon. “In hindsight, releasing the third-party benchmarks alongside the initial announcement would have preempted much of the skepticism, which is why we’re taking the time to make sure any future results are fully verified before putting them out.”

Subquadratic asked Appen, which evaluates other companies’ models, to run its tests on SubQ. The results seem to back up a lot of Subquadratic’s claims. “That was really exciting to me, it validated their architecture,” says Jeanine Sinanan-Singh, Appen’s director of generative AI research. “I was like, ‘Wow, this could be a game changer,’ because models struggle with speed and inefficiency,” she adds. “But when you have kind of shocking results, it’s really not as credible when you say it yourself.”

SubQ won’t replace existing top models across the board, but it could offer huge increases in speed at a fraction of the typical cost for certain tasks. Subquadratic insists that in the long run, though, its breakthrough could change how LLMs are built. “We hope we’re kicking off a new age of efficiency,” says Justin Dangel, the firm’s cofounder and CEO. “We don’t think anybody will be building on transformers in a few years.”

Attention!

To understand why Subquadratic’s claims are a big deal, let’s dig into how most LLMs work. The key mechanism inside an LLM is a type of neural network called a transformer, which runs a process known as dense attention. Today’s LLMs typically chain together multiple transformers. (The foundational paper of the LLM era, published by researchers at Google in 2017, was titled “Attention Is All You Need.”)

Dense attention works like this: When a transformer processes a chunk of text, it first encodes each word (or part of a word, known as a token) with a number. To capture the meaning of the full text, it then multiplies each of those numbers with every other number for that text. For example, a piece of text 10,000 words long would kick off almost 50 million individual multiplications. That’s a lot of computation and the main reason that LLMs are notorious power hogs. “If you want to summarize The Great Gatsby, you have to look at the first word and the last word together, and then you have to look at every other combination,” says Dangel.
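
The "almost 50 million" figure can be checked with a few lines of arithmetic. The sketch below assumes one token per word and counts each pair of words once, which is a simplification of what a real transformer computes but matches the number quoted in the article. The function name is hypothetical.

```python
def unordered_pairs(n_tokens: int) -> int:
    """Number of distinct pairs among n_tokens items: n * (n - 1) / 2."""
    return n_tokens * (n_tokens - 1) // 2


if __name__ == "__main__":
    # 10,000 words, each paired once with every other word.
    print(f"{unordered_pairs(10_000):,}")  # 49,995,000
```

Under the same assumptions, a text ten times longer needs roughly a hundred times as many pairings, which is why long inputs are so costly for dense attention.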

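Note: Out of respect for copyright, the quotation is limited to the opening three paragraphs. Please see the original article for the rest.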
