The shift is already underway. Data platform vendors are embedding semantic capabilities as standard infrastructure. Enterprise data teams are building semantic models as a matter of course. This post explains what is driving the shift, what it means for how data teams organize their work, and what it takes to build a semantic layer that serves both analytics and AI.
There is a pattern in how enterprise infrastructure evolves. A capability that was once a specialist addition — requiring dedicated tooling, bespoke implementation, and specialist skills — gradually becomes expected. It is embedded into standard platforms. Teams that do not have it find themselves at a structural disadvantage. Teams that built it early find that it compounds in value over time.
Semantic layers are undergoing that transition now. Two years ago, an enterprise semantic layer was a differentiated architectural choice. Today, Databricks is positioning Business Semantics as platform infrastructure. Snowflake is embedding Semantic Views as first-class schema objects. The industry consensus is shifting from “why would you build a semantic layer?” to “what kind of semantic layer should you build, and for what purpose?”
For data teams, this creates a decision that has real long-term consequences. Understanding what is driving the shift, and what distinguishes a semantic layer built for BI metrics from one built for enterprise AI, is the starting point for making it well.
Semantic layers are no longer a specialist investment. They are becoming the standard infrastructure layer between your data and the AI and analytics systems that depend on it. The question is no longer whether to build one; it is what to build, and how deep to go.
What is driving semantic layers into core infrastructure
Three converging forces are making semantic layers mandatory rather than optional for enterprise data teams.
1. AI systems need explicit meaning, not just data
Large language models and AI agents are capable of impressive language generation and reasoning. But they do not know what your data means. They cannot reliably determine whether “revenue” means gross or net, whether “customer” refers to a contact, an account, or a billing entity, or how an asset’s maintenance history relates to its operational risk profile. Without explicit meaning, AI answers are plausible-sounding inferences from statistical patterns — not reliable reflections of your business.
A semantic layer is the mechanism that makes meaning explicit. When business concepts are formally defined — entities, relationships, terminology, and rules — AI systems can operate from those definitions rather than constructing them from raw schema. The quality of AI outputs in enterprise settings is increasingly determined by the quality of the semantic layer underneath them.
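As a rough illustration of what "making meaning explicit" can look like, the sketch below keeps a small registry of business terms and refuses to answer for a term that has no definition. Every name in it (`TermDefinition`, `SemanticModel`, the column names in `source_expression`) is hypothetical and not drawn from any vendor's product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TermDefinition:
    """One explicitly defined business term (hypothetical structure)."""
    name: str
    meaning: str
    source_expression: str  # illustrative only; not tied to a real schema


@dataclass
class SemanticModel:
    terms: dict[str, TermDefinition] = field(default_factory=dict)

    def define(self, term: TermDefinition) -> None:
        key = term.name.lower()
        if key in self.terms:
            # Refuse silent redefinition so a term keeps a single meaning.
            raise ValueError(f"'{term.name}' is already defined")
        self.terms[key] = term

    def resolve(self, name: str) -> TermDefinition:
        # Fail loudly on undefined terms instead of letting a caller guess.
        try:
            return self.terms[name.lower()]
        except KeyError:
            raise LookupError(f"no explicit definition for '{name}'") from None


model = SemanticModel()
model.define(TermDefinition(
    name="net revenue",
    meaning="Gross revenue minus refunds and discounts",
    source_expression="SUM(gross_amount) - SUM(refund_amount) - SUM(discount_amount)",
))

print(model.resolve("Net Revenue").meaning)
```

The design choice worth noting is the failure mode: asking for plain "revenue" raises an error here, because nothing in the model says whether gross or net is meant.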
2. Platform vendors are embedding semantic capabilities as standard
The vendor landscape has shifted materially in the past two years. Databricks Business Semantics is positioned as a foundation for centralising business definitions, providing lineage and audit visibility, and powering natural-language experiences like Genie. Snowflake Semantic Views encode business concepts, metrics, and relationships so that Cortex Analyst can generate accurate SQL from business questions. Microsoft Fabric is embedding semantic models as a core layer.
This is not incidental. Platform vendors are building semantic capabilities into their platforms because they have concluded — based on customer experience and competitive pressure — that without a semantic layer, the AI and analytics capabilities built on top of their platforms will underperform. The semantic layer is becoming a platform requirement, not an add-on.
3. Governance, lineage, and AI accountability are converging
AI governance requirements — from the EU AI Act, from financial regulators, from internal risk and compliance functions — are demanding that AI decisions be traceable to their data sources and reasoning logic. A semantic layer that is integrated with the data platform’s governance model provides the lineage foundation for that traceability. As AI accountability becomes a regulatory and organisational expectation, the semantic layer becomes the governance layer for AI, not just for analytics.
The infrastructure inflection point
Infrastructure transitions happen when a capability moves from differentiated to expected. Email, web presence, cloud computing, CI/CD pipelines: each of these was once a specialist investment. Each became standard practice. Semantic layers are at that inflection point now. The teams that built robust semantic models early will find that investment compounding in value as AI use cases multiply. The teams that did not will face a retrofit problem that grows harder with each new AI system deployed on top of ungoverned data.
Not all semantic layers serve the same purpose
As semantic layers move into mainstream conversation, a definitional problem is emerging. The same term is being used to describe capabilities that have meaningfully different architectures and serve meaningfully different purposes. For data teams making architectural decisions, the distinction matters.
| Semantic layer type | What it does | What it does not do |
| --- | --- | --- |
| BI metrics layer (dbt Semantic Layer, AtScale) | Standardises KPI definitions, metric calculations, and business term mapping across BI tools. Ensures that “monthly active users” or “net revenue” means the same thing in every dashboard and report. | Does not model entity relationships. Cannot answer questions that require connecting entities across domains. Not designed for AI grounding or multi-hop reasoning. |
| Platform semantic capabilities (Databricks Business Semantics, Snowflake Semantic Views) | Encodes business concepts and metrics as first-class schema objects within the data platform. Enables natural-language query interfaces (Genie, Cortex Analyst) to generate accurate SQL from business questions. | Optimised for metric consistency and BI acceleration within the platform. Limited support for complex entity relationship modelling or cross-domain semantic reasoning at the depth required for enterprise AI. |
| Enterprise knowledge layer (Kobai on Databricks) | Models enterprise entities, their relationships, and the rules that govern them. Provides cross-domain, relationship-aware semantic context for analytics, AI, and agents. Supports multi-hop reasoning and explainable AI grounding. | Extends rather than replaces BI metric layers and platform semantic capabilities. Designed to sit on top of Databricks Business Semantics and Unity Catalog governance, adding entity relationship depth and AI grounding that platform-native capabilities are not optimized for. |
The practical implication is that data teams often need both: a BI metrics layer for KPI consistency across reporting, and an enterprise knowledge layer for cross-domain entity reasoning and AI grounding. These are complementary investments. Platform semantic capabilities like Databricks Business Semantics establish the metric and BI foundation. An enterprise knowledge layer extends that foundation into relationship-aware reasoning and AI context — the capability the platform layer is not designed to provide.
Databricks Business Semantics + Unity Catalog + Genie = the governed analytics foundation. An enterprise knowledge layer extends that foundation into cross-domain entity reasoning and AI context. These are additive investments: each does a job the other is not designed for.
What this shift means for how data teams work
The transition of semantic layers into core infrastructure changes the organisational model for data teams in several specific ways.
Semantic modelling becomes a standing capability, not a project
When semantic layers were specialist infrastructure, they were typically built as projects — a bounded engagement with a defined scope, deliverable, and end date. As they become core infrastructure, the ongoing maintenance and evolution of the semantic model becomes a standing team responsibility. New data sources need to be mapped to the model. New business requirements generate new entity definitions and relationship types. Model changes need to be reviewed, versioned, and governed.
This is a meaningful shift in how data engineering teams are organised. Teams that have built semantic modelling into their operating model — with clear ownership, tooling for domain expert participation, and governance processes for model changes — are better positioned for the AI era than those that have treated it as a one-time initiative.
Domain experts become semantic model contributors, not just consumers
The most consequential shift that semantic layers create for data teams is in the ownership model for meaning. When the semantic model is maintained by data engineers alone — because the tooling requires engineering skills — the model reflects how engineers interpreted the business. When domain experts can contribute directly — because the tooling is accessible to non-engineers — the model reflects how the business actually works.
This shift changes the relationship between data teams and business teams. Data engineers become the stewards of the semantic model’s technical foundation. Domain experts become the authors of the business definitions within it. The model is a shared artefact rather than a technical deliverable.
The semantic layer becomes the interface between data and AI
As AI copilots, agents, and decision support systems multiply, the semantic layer becomes the primary interface through which they access enterprise data. Each new AI system or Genie space that connects to the semantic layer draws from the same shared definitions rather than constructing its own interpretation of the schema. The consistency and accuracy of AI outputs across the enterprise are determined, in large part, by the quality of the semantic layer they share.
For data teams, this means that investment in the semantic model compounds in value with every new AI system deployed. A well-maintained semantic model that five teams and three AI systems depend on is significantly more valuable than five isolated semantic models that each serve a single use case.
Semantic lineage becomes the foundation for AI governance
As AI governance requirements mature, data teams are finding that the semantic layer is the natural place to establish lineage for AI outputs. When an AI system answers a question by traversing a semantic model — rather than by making statistical inferences from raw data — the reasoning path is explicit and inspectable. Compliance teams can trace an AI answer from the output back through the semantic query to the governed source data. Model risk management functions can validate the semantic model as the foundation for AI behaviour. This traceability is the practical implementation of AI explainability, and it requires the semantic layer to be present.
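One way to picture that traceability is an answer object that carries its own lineage. The sketch below is a minimal illustration, assuming a hypothetical mapping (`ENTITY_SOURCES`) from semantic entities to governed tables; the entity and table names are invented, and a real system would record far more detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageStep:
    entity: str        # semantic entity the answer touched
    source_table: str  # governed table backing that entity (hypothetical name)


@dataclass(frozen=True)
class TracedAnswer:
    value: object
    lineage: tuple[LineageStep, ...]


# Hypothetical mapping from semantic entities to governed source tables.
ENTITY_SOURCES = {
    "Asset": "ops.assets",
    "MaintenanceEvent": "ops.maintenance_events",
}


def answer_with_lineage(entities_used: list[str], value: object) -> TracedAnswer:
    """Attach to an answer the entities it used and the tables behind them."""
    steps = tuple(LineageStep(e, ENTITY_SOURCES[e]) for e in entities_used)
    return TracedAnswer(value=value, lineage=steps)


answer = answer_with_lineage(["Asset", "MaintenanceEvent"], value=3)
for step in answer.lineage:
    print(f"{step.entity} <- {step.source_table}")
```

A reviewer can then walk `answer.lineage` from the output back to the source tables without re-running the query.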
What to build: the characteristics of a semantic layer that scales
Not every semantic layer will serve as a foundation for enterprise AI. Understanding what distinguishes a semantic layer that scales — across teams, use cases, and AI systems — from one that delivers value in a limited scope is essential for teams making investment decisions.
Governance continuity, not governance overhead
A semantic layer that introduces its own governance model — separate from the data platform’s access controls, lineage tracking, and audit capabilities — creates operational overhead that compounds over time. As teams and use cases multiply, maintaining two parallel governance models becomes a significant cost and a compliance risk. A semantic layer that inherits governance from the underlying data platform — so that the policies and lineage already in place extend naturally to semantic queries and AI answers — scales without introducing that overhead.
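The idea of inheriting governance can be sketched as a semantic query check that consults the platform's existing grants instead of keeping a second access list. `PLATFORM_GRANTS` and `ENTITY_TABLES` below are hypothetical in-memory stand-ins, not a real platform API.

```python
# Hypothetical stand-in for the data platform's existing table grants.
PLATFORM_GRANTS = {
    "analyst": {"sales.orders"},
    "risk_officer": {"sales.orders", "finance.ledger"},
}

# Hypothetical mapping from semantic entities to the tables that back them.
ENTITY_TABLES = {
    "Order": {"sales.orders"},
    "LedgerEntry": {"finance.ledger"},
}


def can_query(principal: str, entities: set[str]) -> bool:
    """Allow a semantic query only if the platform already grants every backing table."""
    required = set().union(*(ENTITY_TABLES[e] for e in entities))
    return required <= PLATFORM_GRANTS.get(principal, set())


print(can_query("analyst", {"Order"}))
print(can_query("analyst", {"Order", "LedgerEntry"}))
```

Because the semantic layer holds no permissions of its own in this sketch, revoking a table grant on the platform immediately narrows what the semantic layer will answer.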
Domain expert participation, not engineering dependency
A semantic model that can only be modified by data engineers through code will not stay current with the business it represents. Businesses change faster than engineering backlogs can accommodate. No-code semantic modelling tools that allow domain experts to define, modify, and extend entity definitions and relationship types directly — without requiring a data engineering ticket — produce semantic models that reflect the business as it actually operates. They also produce significantly higher adoption, because the teams whose questions the model is built to answer recognise it as a reflection of their own domain.
Entity relationships, not just metric definitions
A semantic layer optimised for BI metric consistency — standardising the definitions of KPIs and business terms across dashboards — is valuable for reporting but insufficient for AI grounding. AI systems that need to answer cross-domain questions — connecting assets to maintenance history to engineer certifications to operational schedules — require a semantic layer that models the relationships between entities explicitly. The distinguishing capability is not metric standardisation; it is relationship-aware reasoning across connected entities.
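To make the contrast with metric definitions concrete, here is a minimal sketch of a multi-hop question answered over typed relationships. The edge list, node identifiers, and relation names are all invented for illustration, and a plain breadth-first search stands in for whatever traversal a real knowledge layer would use.

```python
from collections import deque

# Hypothetical typed edges: (source, relation, target).
EDGES = [
    ("asset:pump-7", "has_maintenance", "event:m-101"),
    ("event:m-101", "performed_by", "engineer:ana"),
    ("engineer:ana", "holds", "cert:high-pressure"),
    ("engineer:ana", "scheduled_in", "window:night-shift"),
]


def neighbours(node):
    """Return (relation, target) pairs for edges leaving the given node."""
    return [(rel, dst) for src, rel, dst in EDGES if src == node]


def find_path(start, goal_prefix):
    """Breadth-first search for the first node whose id starts with goal_prefix."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node != start and node.startswith(goal_prefix):
            return path
        for _rel, dst in neighbours(node):
            if dst not in seen:
                seen.add(dst)
                queue.append(path + [dst])
    return None  # no connected entity of that kind


print(find_path("asset:pump-7", "cert:"))
```

The question "which certification is connected to this asset?" takes three hops here, and none of them could be expressed as a metric definition.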
Incremental build, compounding value
Semantic models that attempt to model the entire enterprise before delivering value consistently fail. The pattern that works is narrow initial scope — a single domain, 3–5 target questions, the entities that answer those questions — expanded incrementally as each new domain adds value through shared entities. When an engineer model is extended to include asset certifications, it can answer workforce questions. When asset certifications are connected to asset types, it can answer scheduling questions. When scheduling is connected to operational windows, it can answer revenue optimization questions. Each extension multiplies the value of what exists.
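The incremental pattern can be sketched by tracking which target questions become answerable as entities are added. The questions and entity names below are hypothetical, and the sketch only counts answerable questions; it makes no claim about how value should be measured.

```python
# Hypothetical target questions and the entities each one needs.
QUESTIONS = {
    "Which engineers hold a given certification?":
        {"Engineer", "Certification"},
    "Which engineers are certified for this asset type?":
        {"Engineer", "Certification", "AssetType"},
    "Who is qualified and available in the next window?":
        {"Engineer", "Certification", "AssetType", "OperationalWindow"},
}


def answerable(modelled: set[str]) -> list[str]:
    """Return the questions whose required entities are all in the model."""
    return sorted(q for q, needed in QUESTIONS.items() if needed <= modelled)


modelled = {"Engineer"}
for extension in ("Certification", "AssetType", "OperationalWindow"):
    modelled.add(extension)
    print(f"+ {extension}: {len(answerable(modelled))} question(s) answerable")
```

Each extension reuses the entities already in place, which is why a narrow first domain is not wasted effort when the scope widens.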
[Figure omitted: Characteristics of a semantic layer built for the AI era]
Where semantic layers are heading: three trends worth watching
Agentic AI raises the stakes for semantic quality
As AI agents become capable of taking autonomous actions (executing workflows, triggering integrations, and making decisions, not just answering questions), the semantic foundation they operate on becomes safety-critical. If the semantic model contains an ambiguous entity definition or a missing relationship, an agent can act on a wrong answer and cause real operational harm. The quality standards for semantic models will rise in proportion to the autonomy of the AI systems that depend on them.
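A simple guard against the two failure modes named above could look like the following sketch, which blocks an agent from acting while the model contains an ambiguous entity or a relationship pointing at an undefined one. The function names and sample data are hypothetical, and real validation would cover many more cases.

```python
def validate_model(entities, relationships):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    for name, definitions in entities.items():
        # More than one distinct definition means the entity is ambiguous.
        if len(set(definitions)) > 1:
            problems.append(f"ambiguous entity: {name}")
    for src, rel, dst in relationships:
        for end in (src, dst):
            if end not in entities:
                problems.append(
                    f"relationship '{rel}' references unknown entity: {end}"
                )
    return problems


def agent_may_act(entities, relationships):
    """Gate autonomous actions on a model with no detected problems."""
    return not validate_model(entities, relationships)


entities = {
    "Customer": ["billing account", "individual contact"],  # competing meanings
    "Order": ["confirmed purchase"],
}
relationships = [
    ("Customer", "places", "Order"),
    ("Order", "ships_to", "Address"),  # Address is not defined above
]

for problem in validate_model(entities, relationships):
    print(problem)
```

Running the check before each agent deployment, rather than after an incident, is the point of the sketch.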
Semantic layers become the AI context plane
The proliferation of AI systems in enterprises — multiple copilots, agents, and AI-assisted workflows operating across different teams and domains — creates a coordination problem around context. Each system needs to understand the same enterprise entities in a consistent way. The semantic layer that provides that shared context becomes, in effect, the AI context plane: the central resource that all AI systems draw from to ensure they are operating from the same understanding of the business. Organizations that build this plane deliberately will have materially better AI outcomes than those that allow each AI system to construct its own interpretation of the data.
Data team roles will reorganize around semantic ownership
As semantic layers become core infrastructure, the skills and roles needed to build and maintain them will become central rather than peripheral to data team composition. Semantic model architects, ontology stewards, and the processes for governing model changes will become standard parts of how mature data teams operate. The skills involved are partly technical and partly domain-oriented — the most effective semantic model owners understand both how to model a domain formally and how the business actually uses the entities in that domain.
The data teams that will be best positioned for the AI era are not the ones with the most data or the most models. They are the ones with the clearest, most current, most governed shared understanding of what their data means, expressed in a semantic layer that every system in the organization can draw from.
Kobai: the enterprise knowledge layer that extends platform semantic capabilities
Kobai provides enterprise knowledge graph and semantic AI capabilities on the Databricks Lakehouse. Graph structures are built directly within Databricks, under Unity Catalog governance, with no requirement to export data to a separate graph platform. Kobai operates within the Databricks governance and compute model, providing graph execution and semantic indexing natively inside the Lakehouse.
The positioning is explicit: Kobai extends platform semantic capabilities rather than replacing them. Databricks Business Semantics standardizes metrics, provides lineage visibility, and powers Genie. Kobai adds the cross-domain entity relationship modelling, knowledge graph traversal, and AI grounding that the platform layer is not designed to provide. Domain experts author the semantic model in Kobai Studio using no-code tooling. AI answers from Episteme carry deterministic, traceable graphical lineage. Kobai can integrate with and map to W3C-aligned ontology standards where required, through integration and import/export capabilities.
Two starting points are available on the Databricks Marketplace: the Genie Spaces Accelerator Kit for teams building on Genie, and the Semantic Graph Pilot for teams that want to establish knowledge graph and cross-domain AI reasoning on their existing Lakehouse data.
To explore how a knowledge layer on Databricks extends your existing semantic investments, visit kobai.io or contact us at contact@kobai.io.