Resources and insights
Our Blog
Explore insights and practical tips on mastering the Databricks Data Intelligence Platform and the full spectrum of today's modern data ecosystem.
Deduplicating Data on the Databricks Lakehouse: Making joins, BI, and AI queries “safe by default.”
Learn 5 proven deduplication strategies for the Databricks Lakehouse. Prevent duplicate data from breaking AI queries, BI dashboards, and analytics. Includes code examples.
Deploy Your Databricks Dashboards to Production
Stop deploying Databricks dashboards manually. Learn how to use Git, Asset Bundles, and CI/CD for reliable, reproducible dashboard deployments across environments.
The Nightmare of Initial Load (And How to Tame It)
Initial data loads don't have to be nightmares. Discover the split Bronze table pattern that separates historical backfills from incremental streaming.
You Pay for the Complexity of Your Move From On-Prem to Cloud
Moving data from on-prem to cloud shouldn't require 5+ systems. Discover why complexity costs you money and how Zerobus Ingest simplifies data pipelines.
Temp Tables Are Here, and They're Going to Change How You Use SQL
Learn how temporary tables in Databricks SQL warehouses enable materialized data, DML operations, and session-scoped ETL workflows. Complete with practical examples.
95% of GenAI projects fail. How to become part of the 5%
MIT reports 95% of GenAI investments produce zero returns. Learn the 5 failure modes keeping AI projects stuck in pilot limbo and how to ship production AI.
Hidden Magic Commands in Databricks Notebooks
Discover 12 powerful Databricks notebook magic commands beyond %sql and %python. Learn shortcuts for file operations, performance testing, and debugging.
5 Reasons You Should Be Using LakeFlow Jobs as Your Default Orchestrator
External orchestrators can account for nearly 30% of Databricks job costs. Discover five compelling reasons why LakeFlow Jobs should be your default orchestration layer, from Infrastructure as Code to SQL-driven workflows.
Excel never dies (and neither does SharePoint)
Learn how Databricks' native Excel import handles multi-sheet workbooks, streaming with Auto Loader, and SharePoint integration. Includes code examples.
Databricks Workflow Backfill
Use Databricks Workflow backfill jobs to reprocess historical data, recover from outages, and handle late-arriving data efficiently.
Snowflake and Databricks: How to balance compute
Compare Snowflake and Databricks compute models. Learn scaling strategies, cost optimization tips, and when to use auto-suspend, multi-cluster, and autoscaling.
Lakebase: The best of both worlds
Lakebase brings transactional speed to Databricks analytics. Get single-record retrieval in milliseconds plus unlimited data processing in one platform.
DABs: Referencing Your Resources
Databricks bundle lookups failing with "does not exist" errors? Resource references solve timing issues and create strong dependencies. Complete guide with examples.
Managing Databricks CLI Versions in Your DAB Projects
Prevent Databricks deployment failures caused by CLI version conflicts. Step-by-step guide to version management in DAB projects with CI/CD automation.
Connecting ChatGPT to Your Databricks SQL Warehouse
Learn how to connect ChatGPT to your Databricks SQL Warehouse using Model Context Protocol (MCP). Step-by-step guide with screenshots and tips.