Databricks Lakewatch: The Future of Agentic SIEM

Databricks just launched Lakewatch, signaling the end of the traditional SIEM. Not an upgrade. Not a new pricing tier. A fundamental rethink of how security data gets stored, analyzed, and acted on. The question isn't whether this changes the market. It's whether your organization is positioned to take advantage of it.

The SIEM Tax: Why Your Security Team Is Paying to Stay Blind

Here's an uncomfortable truth most security vendors won't say out loud: your SIEM isn't missing alerts because your team isn't good enough. It's missing them because you've been forced to delete 75% of your telemetry data to keep costs under control.

This is less a process failure than a structural one, built into the pricing model of every traditional SIEM on the market. You pay per gigabyte ingested, so your team becomes expert at deciding which data isn't worth keeping. Every one of those decisions is a blind spot. And blind spots are exactly what attackers look for.

Meanwhile, the threat landscape has shifted. Attackers aren't waiting for a slow human triage cycle. They're running autonomous agents: scanning for vulnerabilities, executing lateral movement, and exfiltrating data at machine speed. The old model of "detect, escalate, respond" was built for a different era.

What Data Deletion Actually Costs You

The financial logic of deleting telemetry feels sound until you price what's on the other side of those blind spots. SLA breaches that trace back to a log you didn't keep. Lateral movement that went undetected because the identity event was archived three months ago. A regulatory audit where the evidence trail stops six months short of what's required.

Traditional SIEMs create a false trade-off: visibility versus cost. Organizations routinely keep 30–90 days of hot data and archive or delete the rest. In a threat environment where attackers maintain persistent access for an average of over 200 days before detection, that retention window isn't a policy choice but a liability.

The SIEM-as-a-product model is the root cause. When the vendor charges for every byte you ingest and store, your data strategy gets shaped by pricing, not security needs. That's the architecture Lakewatch is built to replace.

What Is Databricks Lakewatch?

Lakewatch is built on what Databricks calls the Open Security Lakehouse, an architecture that decouples compute from storage and uses open data formats to eliminate the ingestion tax that has defined traditional SIEM economics.

Instead of data flowing into a proprietary indexing engine where you pay to store every byte, your telemetry lands in your own cloud object storage in open formats: Delta Lake, Apache Iceberg, and the Open Cybersecurity Schema Framework (OCSF). You own the data. You pay cloud commodity rates to store it. Compute costs only occur when you run queries or trigger agents.

The result: 100% telemetry retention at a total cost of ownership up to 80% lower than legacy solutions. The trade-off that defined traditional SIEM economics, visibility versus budget, stops being a trade-off.

The Architecture: From Raw Signal to Autonomous Response

Lakewatch follows the Bronze-Silver-Gold data layering pattern familiar to any Databricks practitioner, applied specifically to security operations:

Bronze (Ingestion): Raw telemetry lands in native format via Lakeflow Connect, which automatically normalizes sources (AWS, Okta, Zscaler, Palo Alto Networks, and others) into the OCSF schema. Full fidelity is preserved. Unity Catalog governs access from day one.

Silver (Normalization): Data is normalized, enriched with business context (HR roles, asset criticality, geography) and made queryable across years of history without cost penalties.

Gold (Detection & Response): Curated data powers detection engineering and the agentic AI layer. Rules are defined as code in YAML or Python notebooks (Detection-as-Code), enabling version control and CI/CD practices in the SOC.
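To make the layering concrete, here is a minimal Python sketch of the Bronze-to-Silver step for a single login event. It is not Lakeflow Connect or Lakewatch code: the raw field names, the HR_ROLES lookup, and both function names are hypothetical, and the OCSF mapping is cut down to a handful of fields. The class_uid value 3002 is assumed to be the OCSF Authentication class; check it against the schema version you target.

```python
from datetime import datetime, timezone

# Hypothetical raw event as it might land in the Bronze layer.
# Field names are assumptions, not a real vendor log format.
RAW_EVENT = {
    "published": "2024-05-01T12:30:00Z",
    "actor": "alice@example.com",
    "outcome": "FAILURE",
    "client_ip": "203.0.113.7",
}

# Hypothetical business context used for Silver-layer enrichment.
HR_ROLES = {"alice@example.com": "finance-admin"}


def to_ocsf_like(raw: dict) -> dict:
    """Map a raw login event to a simplified, OCSF-style Authentication record."""
    ts = datetime.strptime(raw["published"], "%Y-%m-%dT%H:%M:%SZ")
    ts = ts.replace(tzinfo=timezone.utc)
    return {
        "class_uid": 3002,  # assumed OCSF Authentication class; verify for your version
        "time": int(ts.timestamp() * 1000),  # epoch milliseconds
        "status": "Success" if raw["outcome"] == "SUCCESS" else "Failure",
        "user": {"name": raw["actor"]},
        "src_endpoint": {"ip": raw["client_ip"]},
    }


def enrich(event: dict, roles: dict) -> dict:
    """Silver step: attach business context without mutating the input record."""
    enriched = dict(event)
    role = roles.get(event["user"]["name"], "unknown")
    enriched["user"] = {**event["user"], "role": role}
    return enriched


silver_event = enrich(to_ocsf_like(RAW_EVENT), HR_ROLES)
print(silver_event)
```

Keeping the mapping and the enrichment as separate pure functions mirrors the layering: the raw record stays untouched in Bronze, and Silver can be rebuilt from it if the mapping changes.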

The Agentic Reasoning Loop

This is where Lakewatch moves beyond static automation playbooks. Traditional SOAR tools execute predefined workflows. Lakewatch deploys AI agents built on the Agent Bricks framework that can perceive, reason, and act autonomously across the full investigation cycle:

  1. Perception & Triage: Agents ingest signals from across the Open Security Lakehouse (identity, endpoint, network, and cloud telemetry) in a unified context.

  2. Planning: The agent breaks the investigation into logical subtasks and selects the right tools and sequence without human intervention.

  3. Execution: The agent runs SQL queries, API calls, or scripts in isolated environments governed by boundaries enforced by Antimatter, preventing prompt injection and unauthorized credential access.

  4. Adaptation: Results are evaluated, and the agent pivots the investigation if the evidence warrants it.

  5. Resolution: The system completes the task, executes authorized containment actions, and surfaces a reasoning summary for human review.
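The five steps above can be sketched as a small control loop. This is an illustration of the pattern only, not the Agent Bricks API: the tool functions, the threshold of 10 failed logins, the asset name, and the verdict strings are all hypothetical, and the "tools" are stubs standing in for governed SQL queries or API calls.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Investigation:
    alert: dict
    findings: list = field(default_factory=list)
    verdict: str = "undetermined"


# Hypothetical tools; real ones would query the lakehouse or call an API.
def lookup_logins(alert: dict) -> dict:
    return {"tool": "lookup_logins", "failed_logins": alert.get("failed_logins", 0)}


def lookup_asset(alert: dict) -> dict:
    return {"tool": "lookup_asset", "critical": alert.get("asset") == "payroll-db"}


TOOLS: dict[str, Callable[[dict], dict]] = {
    "lookup_logins": lookup_logins,
    "lookup_asset": lookup_asset,
}

SUSPICIOUS_FAILURES = 10  # hypothetical threshold


def investigate(alert: dict) -> Investigation:
    inv = Investigation(alert=alert)   # 1. perception: take in the alert context
    queue = ["lookup_logins"]          # 2. planning: start with the cheapest check
    while queue:
        finding = TOOLS[queue.pop(0)](alert)  # 3. execution: run the next tool
        inv.findings.append(finding)
        # 4. adaptation: pivot to the asset lookup only if the evidence warrants it
        if finding.get("failed_logins", 0) >= SUSPICIOUS_FAILURES and "asset" in alert:
            queue.append("lookup_asset")
    # 5. resolution: produce a verdict for human review
    suspicious = any(f.get("failed_logins", 0) >= SUSPICIOUS_FAILURES for f in inv.findings)
    critical = any(f.get("critical", False) for f in inv.findings)
    inv.verdict = "recommend_containment" if suspicious and critical else "close"
    return inv


result = investigate({"failed_logins": 12, "asset": "payroll-db"})
print(result.verdict, [f["tool"] for f in result.findings])
```

The findings list doubles as the reasoning summary: a reviewer can see which tools ran, in what order, and why the verdict followed.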

Genie agents add a natural language interface on top: Genie Code automates the creation of new detections from threat intelligence, while Genie Spaces lets analysts query years of historical data in plain English. The partnership with Anthropic's Claude models underpins the reasoning layer and enables correlation analysis that goes beyond simple pattern matching.

How Lakewatch Compares

The entry of Databricks into the SIEM market alters the competitive dynamics dominated by Splunk and Microsoft Sentinel. An analysis of these platforms reveals a "triple constraint" of cost, customization, and cloud-native integration that organizations must balance.

Market Analysis

Splunk has historically been the standard for large enterprises requiring deep customization, but its architecture often leads to a high TCO due to ingestion rates and management complexity. The acquisition by Cisco has introduced uncertainty, although Splunk maintains a market share near 47%.

Microsoft Sentinel has grown rapidly by offering a cloud-native alternative for organizations in the Azure ecosystem, with highly mature AI integration (Copilot). However, its economic advantage diminishes in non-Microsoft environments, where third-party ingestion costs can be high.

Lakewatch differentiates itself by applying the "lakehouse disruption" to security. By allowing organizations to store years of telemetry in their own cloud storage at a fraction of the cost, Databricks enables a "total visibility" strategy where discarding data is no longer a financial necessity. Furthermore, the move toward agentic AI positions Lakewatch as a "next-generation" platform capable of countering automated threats in the AI era, while competitors remain mostly in an "augmented AI" phase.

Source: Databricks

What This Means for Your Organization

The numbers are significant. For a scenario of 35TB per day with 365-day retention, traditional cloud SIEMs can run into the tens of millions annually. The Lakewatch architecture enables a 250% increase in data volume and four times the retention period at the same total cost of ownership. That's not a marginal efficiency gain. It's a fundamentally different operating model.
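A quick back-of-the-envelope calculation shows the scale of that claim. The sketch below covers volume only, with no prices, and rests on two assumptions: a "250% increase" is read as 3.5 times the baseline (if it means 2.5 times, adjust VOLUME_MULTIPLIER), and retained volume is simply daily ingest times retention days, ignoring compression and storage tiering.

```python
BASELINE_TB_PER_DAY = 35
BASELINE_RETENTION_DAYS = 365

# Assumption: "250% increase" means baseline + 250% = 3.5x baseline.
VOLUME_MULTIPLIER = 3.5
RETENTION_MULTIPLIER = 4


def retained_tb(tb_per_day: float, retention_days: int) -> float:
    """Steady-state raw volume held, ignoring compression and tiering."""
    return tb_per_day * retention_days


baseline_tb = retained_tb(BASELINE_TB_PER_DAY, BASELINE_RETENTION_DAYS)
expanded_tb = retained_tb(
    BASELINE_TB_PER_DAY * VOLUME_MULTIPLIER,
    BASELINE_RETENTION_DAYS * RETENTION_MULTIPLIER,
)

print(f"baseline: {baseline_tb:,.0f} TB retained")   # 12,775 TB
print(f"expanded: {expanded_tb:,.0f} TB retained")   # 178,850 TB
print(f"ratio:    {expanded_tb / baseline_tb:.1f}x")  # 14.0x
```

Under these assumptions the same budget would cover roughly 14 times as much retained telemetry, which is the product of the two multipliers.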

For CISOs and VPs of Data, the strategic implication is this: Lakewatch converts security data from a cost center to a governed asset. The same Unity Catalog that governs your business data governs your telemetry. The same Delta Lake tables that power your analytics pipelines power your threat detection. Security data is no longer siloed in a proprietary tool; it's part of your data platform.

For security engineers, the operational shift is equally significant. Detection-as-Code means detection rules live in Git, get reviewed like software, and are deployed through CI/CD pipelines. The analyst workflow moves from "sift through alerts manually" to "review agent reasoning summaries and approve containment actions."
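As a rough illustration of what Detection-as-Code looks like in practice, here is a minimal rule written as a plain Python function with a test beside it. The rule, the event shape, and the threshold are hypothetical and not taken from Lakewatch; the point is that a rule in this form can be diffed, reviewed in a pull request, and run in CI like any other code.

```python
from collections import Counter

FAILED_LOGIN_THRESHOLD = 5  # hypothetical tuning value, changed via pull request


def detect_bruteforce(events: list, threshold: int = FAILED_LOGIN_THRESHOLD) -> list:
    """Return the sorted user names with at least `threshold` failed logins."""
    failures = Counter(e["user"] for e in events if e["status"] == "Failure")
    return sorted(user for user, count in failures.items() if count >= threshold)


def test_detect_bruteforce() -> None:
    """A regression test that CI would run before the rule is deployed."""
    events = (
        [{"user": "alice", "status": "Failure"}] * 5
        + [{"user": "bob", "status": "Failure"}] * 2
        + [{"user": "bob", "status": "Success"}]
    )
    assert detect_bruteforce(events) == ["alice"]
    assert detect_bruteforce(events, threshold=2) == ["alice", "bob"]
    assert detect_bruteforce([]) == []


test_detect_bruteforce()
print("detection rule tests passed")
```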

And for organizations operating under NIS2, DORA, or similar regulatory frameworks: multi-year retention at low cost, with full auditability through Unity Catalog lineage, stops being an aspirational capability and becomes a default one.

For companies already on Databricks, Lakewatch means you're not adding a SIEM. You're extending the platform you already run, with the same governance model, the same compute layer, and the same data contracts your engineering teams already know.

If Your Team Is Already on Databricks, Read This First

Lakewatch isn't a rip-and-replace proposition for most Databricks organizations. The Bronze-Silver-Gold architecture, Unity Catalog governance, and Delta Lake storage patterns are things your data engineering team already understands. What changes is how security telemetry gets routed into that architecture — and how detection logic gets built on top of it.

The highest-leverage starting point is usually a data architecture assessment: map your current telemetry sources, identify the retention gaps that are creating blind spots, and model the TCO delta against your existing SIEM contract. That exercise alone tends to clarify the decision quickly.
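The retention-gap part of that assessment can start as something very small. The sketch below assumes you can list each telemetry source with its current and required retention in days; the source names and numbers are placeholders for illustration, not recommendations or real requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str
    retention_days: int  # what you keep today
    required_days: int   # what policy or regulation calls for (your own figure)


def retention_gaps(sources: list) -> dict:
    """Return {source name: missing days} for sources kept shorter than required."""
    return {
        s.name: s.required_days - s.retention_days
        for s in sources
        if s.retention_days < s.required_days
    }


# Placeholder inventory; replace with your own sources and requirements.
inventory = [
    Source("identity", retention_days=90, required_days=365),
    Source("endpoint", retention_days=30, required_days=180),
    Source("cloud_audit", retention_days=400, required_days=365),
]

gaps = retention_gaps(inventory)
print(gaps)
```

Sources that already meet their requirement drop out of the result, so the output is the list of blind spots to price against your existing SIEM contract.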

Conclusion: The Future of the Autonomous Enterprise

The launch of Lakewatch signals the convergence of data, AI, and security. In the future, the ability to analyze multi-modal data (audio and video) alongside traditional telemetry will be a defensive requirement against AI-driven social engineering attacks.

For CISOs, Lakewatch represents the end of passive, rule-based SIEMs and the beginning of intelligent, agentic security platforms that can protect the organization at the same speed at which threats evolve.
