Designing a Closed-Loop Data Platform

#Tech

Knowledge Is the Key

AI-native companies are building "closed-loop" data platforms that make the entire company queryable.

This is grounded in the MAPE-K framework: a cycle of Monitor, Analyze, Plan, and Execute, built on shared Knowledge.

Current data platforms are strong at the Monitor function, but they lack the Knowledge component that Plan and Execute depend on.

Even OpenAI's data agent still lacks policy descriptions.

Achieving a true closed loop requires a mechanism that records the intent behind the data.

In companies that use AI well, every piece of information (meeting recordings, tracked tickets, customer interactions) is reportedly made legible to an "intelligence layer" that learns from it, making the entire company searchable. This state is called a "closed loop," and building one takes more than data collection and analysis; it requires an approach that rethinks the system as a whole. This article walks through the key elements of designing a closed-loop data platform.

MAPE-K: The Foundation of Self-Adaptive Systems

MAPE-K, defined in IBM's Autonomic Computing Initiative launched in 2003, consists of the four functions a self-adaptive system needs (Monitor, Analyze, Plan, Execute) plus the shared substrate that supports them all: Knowledge. Monitor collects system information through sensors; Analyze compares that information against expectations; Plan decides what to do when reality drifts from expectation; Execute acts on that decision. Knowledge holds the policies, rules, history, constraints, and goals that every function reads from and updates.
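As a rough illustration of how the four functions share Knowledge, here is a toy loop in Python. The metric, threshold, and remediation are hypothetical and not from any particular framework; this is a sketch of the structure, not an implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Knowledge:
    """Shared substrate: policies (setpoints) plus a history of past decisions."""
    policies: dict = field(default_factory=dict)   # e.g. {"cpu": 0.8} = max tolerated
    history: list = field(default_factory=list)    # past cycles' actions

def monitor(system_state: dict) -> dict:
    # Sensors: collect raw metrics from the managed system.
    return {"cpu": system_state["cpu"]}

def analyze(metrics: dict, k: Knowledge) -> list:
    # Compare observations against the setpoints held in Knowledge.
    return [m for m, v in metrics.items() if v > k.policies.get(m, float("inf"))]

def plan(deviations: list) -> list:
    # Decide an action for each out-of-tolerance metric.
    return [("scale_out", m) for m in deviations]

def execute(actions: list, system_state: dict, k: Knowledge) -> None:
    # Effectors act on the system; the decision is also written back to Knowledge.
    for action, metric in actions:
        if action == "scale_out":
            system_state[metric] *= 0.5   # toy remediation
        k.history.append((action, metric))

k = Knowledge(policies={"cpu": 0.8})
state = {"cpu": 0.9}
execute(plan(analyze(monitor(state), k)), state, k)
```

Running one cycle brings the metric back inside tolerance and leaves a record of the action in `k.history`, which the next cycle can consult.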

Modern systems such as Kubernetes implicitly follow the MAPE-K model, and its influence is broad.

The Importance of Knowledge

Many companies focus on strengthening the Monitor function (data collection and monitoring), but in designing a closed-loop data platform, the importance of Knowledge must not be underestimated. Analyze, Plan, and Execute all depend on Knowledge; when Knowledge is thin, the whole system is constrained. Because Knowledge holds not only information about system state but also policies (rules), it is what allows the system to judge whether a change detected by Analyze is within tolerance, and what lets Plan determine which actions are sanctioned.
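To make that last point concrete, a small sketch with hypothetical metric names and thresholds: without a policy in Knowledge, a comparator can detect that something changed, but it cannot say whether the change is acceptable.

```python
def classify_change(metric: str, old: float, new: float, policies: dict) -> str:
    """Detecting change needs only data; judging it needs a policy (a setpoint)."""
    if old == new:
        return "no_change"
    policy = policies.get(metric)        # e.g. {"max_delta": 0.05}
    if policy is None:
        return "changed_unjudgeable"     # no setpoint: detection is all we can do
    delta_ok = abs(new - old) <= policy["max_delta"]
    return "acceptable" if delta_ok else "violation"

policies = {"error_rate": {"max_delta": 0.05}}
print(classify_change("error_rate", 0.01, 0.10, policies))     # violation
print(classify_change("error_rate", 0.01, 0.02, policies))     # acceptable
print(classify_change("latency_p99", 120.0, 300.0, policies))  # changed_unjudgeable
```

The third call is the interesting one: the metric moved sharply, but with no policy on record the system can only report "changed," not "violation."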

Most current data platforms specialize in reading from Knowledge and lack the ability to write to it, so they fall short of a true closed loop.

Closing the Loop: The Write Path

In a closed-loop data platform, what matters is not only the read path from Knowledge into Monitor, but also the write path from Plan and Execute back into Knowledge. Each cycle's actions and reasoning then become the next cycle's context, and the system learns and improves continuously. Unlike a dashboard that merely visualizes data, the write path is what lets the system respond to change and evolve autonomously.
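A minimal sketch of that write path, with hypothetical event names and a plain dict standing in for a real Knowledge store: each cycle reads prior decisions as context and appends its own action and reasoning for the cycle after it.

```python
def run_cycle(event: str, knowledge: dict) -> str:
    # Read path: prior cycles' decisions are this cycle's context.
    prior = [d for d in knowledge["decisions"] if d["event"] == event]
    action = "reuse_prior_fix" if prior else "diagnose_from_scratch"
    # Write path: record what was done *and why*, so the next cycle can see it.
    knowledge["decisions"].append({
        "event": event,
        "action": action,
        "reason": "matched an earlier decision" if prior else "no prior context",
    })
    return action

knowledge = {"decisions": []}
first = run_cycle("schema_drift:orders", knowledge)   # diagnose_from_scratch
second = run_cycle("schema_drift:orders", knowledge)  # reuse_prior_fix
```

The same event produces different behavior the second time only because the first cycle wrote its decision back. Remove the append and every cycle starts from scratch, which is exactly the failure mode described above.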

What AI-driven companies should aim for is not just data collection and analysis, but a closed loop built on a rich Knowledge layer and an established write path.

Summary

Designing a closed-loop data platform means following the MAPE-K framework, recognizing the central role of Knowledge, and establishing the write path. With this approach, companies can make more effective data-driven decisions, automate more of their operations, and strengthen their competitiveness.

Excerpt from the original article (English)

In the Summer 2026 Request for Startups, Diana Hu described what she’s seeing inside the best AI-native companies: “they’ve made their entire company queryable. Every meeting recorded, every ticket tracked, every customer interaction captured, all legible to an intelligence layer that learns from it. This turns a company from an open loop into a closed loop.”

The framing is correct, and the analogy is more precise than people are giving it credit for. “Closed loop” is not a metaphor. It’s a thing with a definition, an engineering history, and a set of components that all have to be present for the system to actually work. If you are building the connective layer Hu is describing, or any of the related ideas YC put in front of founders this batch, the question worth asking is not whether closed loops are a good idea. They obviously are. The question is what a closed-loop data platform actually requires when you design it from first principles.

When IBM (yes, that IBM) launched its Autonomic Computing Initiative in 2003, it formalized what a self-adaptive system needs to actually self-adapt. The result was MAPE-K: four functions sharing a fifth.

- Monitor collects information from the managed system through sensors.
- Analyze interprets that information against expectations.
- Plan decides what to do when reality drifts from expectation.
- Execute acts on the decision through effectors.
- Knowledge is the shared substrate all four functions read from and write to. It holds the policies, the rules, the historical context, the constraints, and the goals.

MAPE-K is now the most influential reference model for self-adaptive systems. It is taught in distributed systems courses. It is the architecture Kubernetes implicitly follows. It shows up in everything from autonomous vehicle safety frameworks to HPC operations.

Every closed-loop data platform is an instance of MAPE-K, whether the team building it knows the acronym or not. Which means designing one from first principles starts with the component that matters most, and current data platforms have least of: Knowledge.

The temptation, when you set out to build this, is to start with Monitor. Sensors are concrete. They produce events. Events are easy to think about and easy to integrate. Most data platforms today are extremely good at the Monitor function. Warehouses ingest, observability tools watch, catalogs index. The Monitor surface is genuinely solved.

Analyze is harder, but tractable. Anomaly detection, threshold alerts, drift monitoring, schema diffs. The data observability category was built on this layer. It works.

Plan and Execute are where teams typically get stuck. The reason is not that planning algorithms are bad. The reason is that Plan and Analyze both depend entirely on the Knowledge component, and Knowledge is the part of MAPE-K that current data platforms have almost nothing for.

Read the original MAPE-K papers carefully and Knowledge is described as containing “topology information, historical logs, metrics, symptoms, and policies.” The first four are observability primitives. The fifth, policies, is what makes the loop close.

A policy, in MAPE-K terms, is the comparator’s setpoint. It is the rule the Analyze function checks observed reality against. It is the authority structure the Plan function consults to decide whether a remediation is allowed. Without policies, Analyze can only detect change; it cannot decide whether the change is acceptable. Without policies, Plan can only propose actions; it cannot tell which actions are sanctioned.

There is a second thing missing, and it is the more important one. The arrows in every MAPE-K diagram point both ways. Monitor reads from Knowledge, but Plan and Execute write to it. The loop closes because each cycle’s actions and reasoning become next cycle’s context. A platform that only reads from its Knowledge layer is not a closed loop, it is a dashboard with extra steps.

This is the write path. It is the link that converts an open loop into a closed one. Read-path infrastructure (warehouses, catalogs, observability) is optimized for the Monitor and retrieval side of MAPE-K. Write-path infrastructure captures the policies, the decisions, and the reasoning at the moment they are produced, in a form the next iteration of Monitor can actually see. Without it, every cycle starts from scratch.

The clearest published example of what a sophisticated Knowledge component looks like, and what’s still missing from it, is OpenAI’s writeup of their internal data agent. I wrote about this piece in detail when it came out. Two engineers, three months, four thousand daily users querying 70,000 datasets across 600 petabytes. The architecture they describe is the most honest, technically detailed account of context engineering for a data agent that has been published anywhere.

The system has six layers of context: table metadata, human annotations, Codex-powered code enrichment, institutional knowledge from Slack and Notion, learning memory across queries, and runtime data samples. Every layer is well-engineered. Codex Enrichment is genuinely novel: it reads the pipeline code that produces important tables and infers what the metadata never captures, including upstream dependencies, join semantics, and business intent baked into transformation logic. The benchmark on memory impact (one query going from 22 minutes to 82 seconds) is real.

Map this onto MAPE-K and the picture sharpens. OpenAI built the most sophisticated Monitor and retrieval-side Analyze layers anyone has shipped. Their Knowledge component contains exceptional topology, lineage, historical context, and learned signal. From what they’ve shared publicly, it contains essentially no policy.

The agent can tell you what a table means, what code produced it, and how it has been queried before. It cannot tell you whether a particular use of that table is sanctioned. It cannot tell you what authority structure governs changes to the rule that produced the value. It cannot tell you whether the exception that shows up in last quarter’s report was a documented override or a workaround that nobody flagged.

This is not a flaw in OpenAI’s engineering. The piece is candid that the agent is positioned as a “knowledgeable teammate” for analysts, not an autonomous remediator. They built exactly what they set out to build. The structural point is that even the best published Knowledge component for a data agent is a retrieval surface over what was. It does not contain what should be, and it has no mechanism for cycle N’s decisions to become cycle N+1’s context.

If the goal is a closed-loop data platform, here is what each MAPE-K layer needs to look like, and where current architectures need to extend.

The Monitor layer in most data platforms captures state. A row was written. A schema changed. A metric moved. These are necessary, but a closed-loop system needs more than state transitions. It needs the intent behind them.

When a record is normalized, was the normalization a default rule, an explicit override, or a one-off exception? When a value was nullified, was that the source’s null or a downstream choice? When a rule fired, what version of the rule was in effect, and who authorized that version?

OpenAI’s Codex Enrichment is the most ambitious published attempt to recover this kind of intent from existing artifacts. Reading the pipeline code that produces a table is a smarter Monitor signal than reading the table itself. But it is still reconstruction. The pipeline tells you what logic ran. It does not tell you why that logic was chosen, what alternatives were rejected, or what business constraint forced the call. That information was never written into the code.

Closing this gap is a write-path problem. Intent has to be captured at the moment a record is created or modified, in the system where the work happens, alongside the record itself. Data products that sit in execution paths (the unified APIs and integration layers) already have a hook into that moment. They see records cross system boundaries, which is exactly when intent and outcome diverge. The infrastructure to capture the why exists structurally; it has not been treated as a product.

The Analyze layer needs more than statistical baselines. It needs a comparator with a setpoint, and the setpoint has to be queryable. That means policies in the Knowledge layer must be:

- First-class data, not documentation. A policy that lives in a Notion page cannot be evaluated by a comparator at runtime.
- Versioned, with effective dates. A 2024 decision evaluated against 2026 policy is a category error. Every policy needs to be queryable as of a specific point in time.
- Authoritative, with an explicit source. Who authored this rule? Who is allowed to change it? When was it last reviewed? Without this, the comparator cannot tell whether a deviation reflects a stale rule or a real violation.

This is not exotic. MLOps reference architectures already version models, features, and decisions with effective dates and audit trails. The same pattern applied to organizational policy is the missing piece.

Most data platforms collapse Plan into Analyze. If the metric is bad, fire an alert. If the schema drifted, raise an issue. This treats the loop as if Plan is just “tell a human about the Analyze result.”

A real Plan layer asks two questions Analyze cannot answer alone. First: given this deviation, what set of remediations is even possible? Second: which of those remediations is the system authorized to execute on its own, versus which require escalation, and to whom?

These questions cannot be answered without policy and authority data in Knowledge. Which is why teams that try to build autonomous remediation on top of observability platforms keep producing systems that either over-act (wrong remediations applied without authorization) or under-act (every deviation routed to a human, which is just an alert with extra steps).

Execute is the layer that closes the loop. The action taken has to land back in the managed system in a way that is itself observable. Otherwise the next Monitor cycle has no idea anything happened, and you’ve built a write-only side channel.

This is where most homegrown closed-loop attempts fail. The remediation gets applied in the warehouse but not the source system. Or it gets applied in the source system but the reasoning isn’t recorded, so the next cycle sees the change as a mystery. Execute needs to write the action and the justification atomically, and both have to be visible to the next iteration of Monitor.

Pull these requirements together and the picture of Knowledge sharpens. A closed-loop data platform’s Knowledge component is not a knowledge graph in the marketing sense. It is a versioned, authoritative, queryable store of:

- Schema and lineage (current, well-served by existing tools)
- Metrics and quality history (current, well-served)
- Policies with effective dates and authors (mostly missing)
- Authority and approval chains (mostly missing)
- Decision context: the why behind exceptions, overrides, and judgment calls (almost entirely missing)

The first two come from the read path. The last three have to be captured on the write path, at the moment work happens, because they exist nowhere else. There is no after-the-fact retrieval that will reconstruct them.

If you are taking on Hu’s RFS, or building any of the adjacent ideas in the Summer 2026 batch, the design call to make early is this one: a closed-loop data platform is not a connective layer over read-path artifacts. It is a system where Monitor, Analyze, Plan, and Execute all share a Knowledge component that holds policy and decision context as first-class data, and where every cycle’s actions write back into that Knowledge so the next cycle can see them.

The canonical reference architecture has been around for two decades. MAPE-K is well-studied. The components are well-defined.

What’s missing is the write path: the infrastructure that captures policy, authority, and reasoning at the moment they are produced, in a form the rest of the loop can use. Without it, the most sophisticated read-side context system ever shipped (and OpenAI’s is currently that system) is still a smarter dashboard. With it, the loop actually closes.

Hu’s analogy is more precise than people are giving it credit for. A closed loop is what a self-adaptive system requires. The startups that build the version of this that actually closes will be the ones that designed for the full MAPE-K stack from day one, with the write path treated as the structural primitive it is, not the ones that bolted Plan and Execute onto an observability platform and called it a feedback loop.

The loop can’t close if Knowledge is empty. And Knowledge stays empty without a write path.

Beyond the Traverse is about the space past the last node, where the tools stop and the real work begins. If this resonated, forward it to someone building in this space, and if you’re designing closed-loop data infrastructure, I’d genuinely like to talk.
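One way to picture the Execute requirement from the quoted article (record the action and its justification together, in a form the next Monitor pass can see) is a toy append-only log. The names are hypothetical, and a real system would use a transactional store rather than an in-memory list.

```python
import json

def execute_remediation(action: dict, reason: str, event_log: list) -> None:
    # Action and justification go into ONE record: the toy stand-in for an
    # atomic write. Neither can be recorded without the other.
    event_log.append(json.dumps({"action": action, "reason": reason}))

def monitor_pass(event_log: list) -> list:
    # The next Monitor cycle observes prior actions *and* their reasoning,
    # so a change in the managed system is never a mystery.
    return [json.loads(entry) for entry in event_log]

log: list = []
execute_remediation(
    {"op": "backfill", "table": "orders"},
    "late-arriving partition violated the freshness policy",
    log,
)
observed = monitor_pass(log)
```

If the action and the reason were written separately, a crash between the two writes would reproduce the failure mode the article describes: the next cycle sees the change but not the why.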

※ Out of copyright consideration, only an excerpt of the original is quoted here. Please see the original article for the rest.

Read the original article ↗