Kobai on Databricks Marketplace: What This Means for Lakehouse-Native AI
KobaiApr 28, 2026 5:46:20 AM13 min read

Kobai’s solutions are available on the Databricks Marketplace, making it straightforward for Databricks customers to add a semantic intelligence layer to the Lakehouse they already run. This post explains what that means, what each listing provides, and why the combination matters for enterprise AI.

The Databricks Marketplace has become a meaningful signal of ecosystem maturity. When a solution appears there, it means something specific: it is built on Databricks, validated to run within the Lakehouse architecture, and available through familiar procurement channels without requiring a new data platform, a new governance model, or a separate system of record.

Kobai is listed as a Built on Databricks partner with two solutions on the Marketplace: the Genie Spaces Accelerator Kit and the Semantic Graph Pilot. Both are designed to help Databricks teams extend what their Lakehouse can do for AI by adding semantic intelligence and knowledge graph capabilities directly on top of the infrastructure they already operate.

This post unpacks what each solution provides, how they fit into the Databricks architecture, and what Lakehouse-native AI actually means in practice.

Kobai operates directly on Databricks compute. Semantic queries inherit governance from Unity Catalog. Data stays in Delta Lake tables. There is nothing new to operate alongside your Lakehouse; Kobai is an extension of it.

What “Built on Databricks” means and why it matters

The Built on Databricks designation is not just a badge. It describes a specific architectural relationship: the software runs natively on the Databricks Data Intelligence Platform, uses Databricks compute to execute, and integrates with Unity Catalog for governance and lineage. It does not run alongside Databricks; it runs within it.

For enterprise teams evaluating additions to their data stack, this distinction has practical consequences:

  • No new infrastructure to provision or maintain alongside the Lakehouse
  • No data pipelines to manage between Databricks and an external system
  • Governance, access controls, and data lineage flow through Unity Catalog without additional configuration
  • Workloads that Kobai enables - semantic graph queries, AI reasoning across connected entities - run as Databricks compute jobs, contributing to existing Databricks utilization

The Databricks Marketplace listing makes this accessible through a streamlined procurement and deployment process, reducing the friction of getting started and enabling teams to pilot capabilities within their existing Databricks environment.

What this means for Databricks teams

Evaluating Kobai does not require a new infrastructure decision. Teams with an existing Databricks Lakehouse can deploy either Marketplace solution within that environment, connect it to governed Delta tables, and demonstrate value in a contained pilot before any wider commitment.

What Kobai offers on the Databricks Marketplace

Kobai’s two Marketplace solutions address different entry points. One is oriented toward teams already using Genie who want to scale it across the business. The other is for teams that want to establish a semantic graph capability from the ground up.

1. Kobai Genie Spaces Accelerator Kit

Databricks Genie is a conversational AI interface that allows business users to query their Lakehouse data in natural language. It is a strong capability for making data more accessible, but as teams try to scale it across departments and domains, a consistent challenge emerges: each Genie space ends up defining its own business logic, and definitions drift across teams.

A customer might have one Genie space where “active asset” means one thing, and another where it means something different. Cross-domain questions that span operations and finance, or customer data and supply chain, break down because there is no shared semantic foundation.

The Kobai Genie Spaces Accelerator Kit addresses this directly. It provides the tooling and guided process to establish a shared semantic model across Genie spaces so that enterprise entities, their definitions, and their relationships are defined once and made consistently available to all Genie spaces built on top.

Kobai Genie Spaces Accelerator Kit

Databricks Marketplace → Kobai Genie Spaces Accelerator Kit

A structured accelerator for Databricks teams looking to establish a shared semantic layer across Genie spaces. Includes ontology modelling tooling, data mapping guidance, and a reference implementation for connecting Kobai’s semantic model to Databricks Genie, enabling consistent, cross-domain conversational AI on governed Lakehouse data.

Available now on the Databricks Marketplace · Built on Databricks · Unity Catalog compatible

What the Genie Spaces Accelerator Kit provides in practice:

| Capability | What it does | Why it matters for Genie |
| --- | --- | --- |
| Shared semantic model | Defines enterprise entities, relationships, and business meaning once, as a governed layer over Delta tables | Every Genie space draws from the same definitions — eliminating drift across business units |
| No-code ontology tooling | Domain experts model entities and relationships visually, without writing code or schema definitions | The people who understand the business can own the definitions, without routing through engineering |
| Unity Catalog integration | Semantic model inherits Unity Catalog access controls and lineage | Governance that applies to your data also applies to the semantic layer on top of it |
| Genie space connection | Published semantic views are made available to Genie in a single step, via the Kobai SDK | Genie answers become grounded in a shared, governed semantic model rather than ad hoc space-level logic |
| Cross-domain consistency | Questions that span operations, finance, or supply chain resolve against the same entity definitions | Teams can ask cross-domain questions and trust that the AI is working from consistent context |
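The “define once, consume everywhere” pattern behind the shared semantic model can be pictured with a minimal sketch. This is plain Python, not the Kobai SDK; the class names, entity name, and table name are all illustrative:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class EntityDefinition:
    """One canonical business definition, owned by the semantic layer."""
    name: str
    definition: str
    source_table: str  # governed Delta table backing the entity (illustrative)

@dataclass
class SemanticModel:
    """Single registry of definitions that every Genie space reads from."""
    entities: dict = field(default_factory=dict)

    def define(self, e: EntityDefinition) -> None:
        self.entities[e.name] = e  # defined once, in one place

    def view_for_space(self, space_name: str) -> dict:
        # Every space receives the same governed definitions - no per-space copies
        return {name: e.definition for name, e in self.entities.items()}

model = SemanticModel()
model.define(EntityDefinition(
    name="active_asset",
    definition="Asset with status 'in_service' within the last 30 days",
    source_table="ops.assets"))

# Two different Genie spaces resolve "active_asset" identically
ops_view = model.view_for_space("operations")
fin_view = model.view_for_space("finance")
assert ops_view["active_asset"] == fin_view["active_asset"]
```

The point of the sketch is the ownership model: the definition lives in exactly one registry, so the “active asset” drift described earlier cannot occur by construction.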

2. Kobai Semantic Graph Pilot

For teams that want to go beyond conversational AI to relationship-aware reasoning, multi-hop queries, and explainable AI grounded in a full knowledge graph, the Kobai Semantic Graph Pilot provides a structured path to get there within the Databricks Lakehouse.

The Pilot is a contained, time-bounded engagement designed to produce a working semantic graph from real enterprise data, connected to a defined set of business questions. It is intended for organizations that want to validate knowledge graph capabilities in their environment before broader adoption, without the overhead of a multi-year implementation project.

Kobai Semantic Graph Pilot

Databricks Marketplace → Kobai Semantic Graph Pilot

A guided pilot for Databricks customers that want to establish knowledge graph and semantic AI capabilities directly on their Lakehouse data. The pilot spans ontology design, data source connection, semantic graph build, and demonstration of cross-domain AI reasoning — all within the Databricks Lakehouse architecture, with no data movement or additional platform required.

Available now on the Databricks Marketplace · Built on Databricks · Unity Catalog compatible

What the Semantic Graph Pilot covers:

| Phase | What happens | Output |
| --- | --- | --- |
| Discovery | Working with domain experts to identify the 2–3 business questions most likely to benefit from semantic graph capabilities, and the data sources that support them | Defined pilot scope, prioritized question set, and source-to-ontology mapping plan |
| Ontology design | Building a lightweight semantic model — entity types, relationship types, and properties — that covers the pilot domain using Kobai Studio | A governed ontology, modelled by domain experts, aligned to the business questions |
| Graph build | Connecting source data from Databricks Delta tables to the ontology using Kobai Precursor (automated mapping), and building the semantic graph index via Kobai Saturn | A live semantic graph, queryable through Kobai’s APIs and tools, running on Databricks compute |
| Demonstration | Showing the target business questions answered through the semantic graph, including cross-domain traversal and AI reasoning via Kobai Episteme | A demonstration of GraphAI on real data, with traceable lineage back to source entities |
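The graph-build phase hinges on mapping raw table columns to ontology properties, which is the kind of step Kobai Precursor automates. A toy version of that projection, with invented table names, column names, and ontology terms, looks like this:

```python
# Toy source-to-ontology mapping (illustrative only - Kobai Precursor
# automates this kind of mapping; none of these names come from Kobai).
ontology_mapping = {
    "Supplier": {"table": "scm.suppliers",
                 "properties": {"id": "supplier_id", "name": "supplier_name"}},
    "Part":     {"table": "scm.parts",
                 "properties": {"id": "part_id", "supplied_by": "supplier_id"}},
}

def to_entities(entity_type: str, rows: list) -> list:
    """Project raw table rows into ontology-shaped entities,
    keeping only the columns the ontology declares."""
    props = ontology_mapping[entity_type]["properties"]
    return [{prop: row[col] for prop, col in props.items()} for row in rows]

raw_rows = [{"part_id": "P-100", "supplier_id": "S-1", "unit_cost": 4.2}]
entities = to_entities("Part", raw_rows)
# entities == [{'id': 'P-100', 'supplied_by': 'S-1'}]  (extra columns dropped)
```

The mapping is declarative, so changing where a property comes from is a one-line edit rather than a pipeline change, which is what keeps the graph build lightweight.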

How Kobai sits within the Databricks Lakehouse

Understanding the architecture is important for teams evaluating whether Kobai is a fit for their environment. The key principle is that Kobai does not introduce a separate data platform. It operates as a semantic layer over the Databricks infrastructure you already have.

Reference Architecture — Kobai on Databricks

  • Data sources: Operational databases, ERP/SAP, cloud storage, APIs, IoT sensors, existing data warehouse
  • Databricks Lakehouse (Delta Lake · Unity Catalog · Spark Compute): The governed data foundation. All Kobai queries execute here. No data leaves this layer.
  • Kobai Precursor: Automated mapping of source data to the semantic ontology — reducing the data engineering effort needed to connect raw tables to the graph
  • Kobai Saturn (Semantic Graph Index): The knowledge graph engine, built directly over Delta tables. No data is moved or duplicated. Graph traversals are translated into SQL/Spark and executed on Databricks compute.
  • Kobai Studio: No-code ontology modelling environment where domain experts define entities, relationships, and business rules visually
  • AI · BI · Agents · Genie: Downstream consumers such as Databricks Genie (via the Kobai SDK), AI agents, BI tools, Notebooks, and custom applications — all drawing from the governed semantic layer

The architectural consequence of this design is that there is no synchronisation pipeline to manage between Kobai and Databricks. When data in a Delta table changes, the semantic graph reflects it, because the graph is not a copy of the data; it is a semantic index over it. Unity Catalog access controls determine what each user or agent can see, and that governance flows through to every semantic query.
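One way to picture the claim that graph traversals are translated into SQL over Delta tables is a toy traversal-to-SQL compiler. This is not Kobai Saturn's actual implementation; the hop structure, table names, and join columns are invented for illustration:

```python
def compile_traversal(hops: list) -> str:
    """Compile a chain of (table, join_column, next_table) hops into one
    SQL query. A toy sketch of translating a graph traversal into joins
    over Delta tables; all identifiers are made up."""
    sql = f"SELECT * FROM {hops[0][0]} t0"
    for i, (_, key, nxt) in enumerate(hops):
        sql += f"\nJOIN {nxt} t{i+1} ON t{i}.{key} = t{i+1}.{key}"
    return sql

# Two-hop traversal: supplier -> part -> contract
query = compile_traversal([
    ("scm.suppliers", "supplier_id", "scm.parts"),
    ("scm.parts", "contract_id", "legal.contracts"),
])
```

Because the output is ordinary SQL, it runs on existing Databricks compute under existing Unity Catalog permissions, which is why no separate graph store or synchronisation pipeline is needed.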

Kobai adds meaning and relationship awareness on top of the Databricks Lakehouse without adding a new system of record. The Lakehouse remains the single source of truth. Kobai makes that source of truth traversable.

What semantic intelligence adds to Lakehouse AI workloads

The Databricks Lakehouse is a strong foundation for enterprise AI. Delta Lake provides scalable, governed storage. Unity Catalog provides unified access control and lineage. Databricks AI services including MLflow, Vector Search, and Genie provide capable tooling for building and serving AI.

What a semantic intelligence layer adds is the ability for AI systems to reason about how the data is connected, not just what it contains. There are three capabilities that this unlocks at the Lakehouse level:

Cross-domain reasoning

Most enterprise AI questions span more than one operational domain. “Which of our contracts are most exposed to the supply chain disruption affecting Supplier X?” requires connecting supplier data, parts data, assembly data, product data, and contract data — all of which may live in separate Delta tables, connected by relationships that exist in the real world but are not declared in any schema.

A semantic knowledge graph declares those relationships explicitly. Once declared, any AI system, Genie space, or analytic tool that queries through the semantic layer can traverse them without reconstructing complex joins. The question gets answered across domains, not just within one.
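The contract-exposure question above is, structurally, a multi-hop traversal over declared relationships. A toy in-memory version (entity names invented; in the Lakehouse each edge set would live in a separate Delta table) makes the shape concrete:

```python
from collections import deque

# Toy relationship graph: supplier -> part -> assembly -> product -> contract.
# All entity identifiers here are invented for illustration.
edges = {
    "Supplier:X":       ["Part:valve-9"],
    "Part:valve-9":     ["Assembly:pump-A"],
    "Assembly:pump-A":  ["Product:rig-7"],
    "Product:rig-7":    ["Contract:C-2041", "Contract:C-2042"],
}

def exposed_contracts(start: str) -> set:
    """Breadth-first traversal collecting every contract reachable
    from the starting entity."""
    seen, queue, hits = {start}, deque([start]), set()
    while queue:
        node = queue.popleft()
        if node.startswith("Contract:"):
            hits.add(node)
        for nbr in edges.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return hits

contracts = exposed_contracts("Supplier:X")
# contains both Contract:C-2041 and Contract:C-2042
```

With relationships declared once in the semantic layer, this traversal replaces a hand-written chain of joins across supplier, parts, assembly, product, and contract tables.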

Consistent business meaning across Genie spaces and AI agents

As organizations build more Genie spaces and deploy more AI agents, semantic consistency becomes a governance challenge. Different teams define the same concept differently. An agent operating across multiple domains can encounter conflicting definitions and produce inconsistent answers.

A shared semantic model solves this at the source. Enterprise entities and their definitions are established once, by the people who understand the business. Every Genie space, every agent, and every analytic tool that connects through the semantic layer operates from the same shared ground truth, not from individually-configured logic that drifts over time.

Explainable, traceable AI answers

In regulated industries such as financial services, aerospace, energy, and life sciences, AI answers must be auditable. It is not sufficient for an AI to produce a correct answer; the reasoning path that produced it must be traceable to specific data, definitions, and relationships.

Kobai’s GraphAI capability (Episteme) provides this by design. Every answer generated through the semantic graph includes graphical lineage back to the entities and relationships that contributed to it. This is not traceability retrofitted onto an AI system; it is the natural output of reasoning over a structured, governed semantic model.

On agentic AI and semantic context

Agentic AI systems, which operate autonomously across multiple tools, data sources, and decision points, need reliable context to function well. Without a semantic foundation, agents must reconstruct the relationships between entities from raw data each time they act, which is slow and prone to error. A semantic knowledge graph provides that context as a governed, always-current resource that agents can query directly. Kobai exposes this context to AI agents via the Kobai SDK, with a published path to MCP-based interaction as the ecosystem evolves.

Which Databricks teams are best positioned to benefit

The two Marketplace solutions are not a fit for every team equally. Below is a practical guide to which signals suggest a good fit for each.

| Genie Spaces Accelerator Kit is a good fit when… | Semantic Graph Pilot is a good fit when… |
| --- | --- |
| You are already using Genie and want to scale it beyond a single team or domain | You want knowledge graph and multi-hop reasoning capabilities, not just conversational AI |
| Business logic is being duplicated across Genie spaces and definitions are drifting | Your highest-value questions span multiple systems and require traversing entity relationships |
| You need a shared semantic foundation without building a separate data product | You want to validate knowledge graph capabilities in your environment before broader adoption |
| Domain experts need to own the business logic that Genie uses, without engineering involvement | You are evaluating a graph database and want to explore whether Lakehouse-native capabilities are sufficient |
| You want Genie answers that are consistent, governed, and auditable across the organization | You operate in a regulated environment where AI explainability and answer lineage are required |

For teams that want to pursue both (a shared semantic layer for Genie and a full knowledge graph capability), the two solutions are complementary. The Genie Spaces Accelerator Kit establishes the shared semantic model. The Semantic Graph Pilot extends that model into full graph traversal and GraphAI. Starting with one and expanding into the other is a natural progression.

Where this architecture is particularly well suited

Kobai’s knowledge graph and semantic intelligence capabilities are particularly relevant for organizations with complex, multi-system data environments where cross-domain reasoning and AI explainability matter. The following verticals reflect where these patterns appear most consistently.

| Vertical | Typical challenge | How semantic intelligence on Databricks helps |
| --- | --- | --- |
| Aerospace & Defence | Asset genealogy, maintenance traceability, and compliance data span multiple systems and must be reconciled for audit and operational decisions | Connect design, manufacturing, MRO, and regulatory data in a single semantic model; enable cross-domain investigations in minutes rather than days |
| Energy & Oil/Gas | Operational data from SCADA, ERP, maintenance, and sensor systems is rarely connected; root cause investigations are manual and slow | Semantic graph connects asset, event, and operational data; enables AI-assisted root cause analysis across the full asset network |
| Manufacturing | Supply chain, BOM, production, and supplier data lives in separate systems; disruption response requires manual cross-system investigation | Connected semantic model of suppliers, parts, assemblies, and products enables real-time disruption analysis and scenario planning |
| Life Sciences / Pharma | R&D, clinical, regulatory, and supply chain data must be connected for compliance and AI-assisted discovery; traceability is a regulatory requirement | Semantic knowledge graph links clinical, manufacturing, and regulatory entities; Episteme provides the explainable AI lineage that audits require |
| Financial Services | Customer, product, counterparty, and risk data span multiple books of record; cross-domain risk questions require complex manual joins | Semantic model connects customer, contract, and exposure data; enables consistent cross-domain risk reasoning with auditable AI outputs |

How to access Kobai’s solutions on the Databricks Marketplace

Both solutions are accessible through the Databricks Marketplace today. Existing Databricks customers can browse, request, and deploy from within their Databricks environment using their existing credentials and procurement relationship.

The simplest path to getting started:

  • Browse either listing on the Databricks Marketplace and request access
  • Connect with the Kobai team to scope your starting point — Genie acceleration or semantic graph pilot
  • Identify 2–3 cross-domain questions that would benefit most from semantic intelligence
  • Work with Kobai to deploy the semantic model over your existing Delta tables, within your existing Unity Catalog governance
  • Demonstrate value on a contained scope before expanding to additional domains or use cases

For teams working with a Databricks account team or Databricks Partner Manager, Kobai is a co-sell partner. The fastest route to a working pilot is through a joint conversation between your Databricks contact and the Kobai team at databricks-partner@kobai.io.

Kobai — Semantic Intelligence for the Databricks Lakehouse

Kobai is a Built on Databricks partner providing a Lakehouse-native semantic intelligence layer for enterprise data and AI workloads. Kobai allows organizations to define enterprise entities, relationships, and business meaning directly over governed Databricks data — enabling knowledge graph capabilities, cross-domain AI reasoning, and explainable answers without introducing a separate graph database or additional system of record.

Kobai on the Databricks Marketplace:

→ Genie Spaces Accelerator Kit: marketplace.databricks.com → Kobai Genie Spaces Accelerator Kit

→ Semantic Graph Pilot: marketplace.databricks.com → Kobai Semantic Graph Pilot

→ Kobai on Databricks: kobai.io/databricks

 
