Our Blog
Explore insights and practical tips on mastering the Databricks Data Intelligence Platform and the full spectrum of today's modern data ecosystem.
New Databricks INSERT Features: INSERT REPLACE ON and INSERT REPLACE USING
Databricks SQL introduces two powerful new INSERT commands: INSERT REPLACE ON for conditional record replacement and INSERT REPLACE USING for complete partition overwrites. These Delta-native features eliminate complex workarounds while maintaining data integrity. Available in Databricks Runtime 16.3+ and 17.1+ respectively, these commands provide developers with precise control over data updates and partition management in modern data engineering workflows.
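A hedged sketch of how the two commands described above might look in practice. Table names (`sales`, `sales_staging`) and columns are illustrative, and the exact grammar is an assumption drawn from the teaser; consult the Databricks SQL reference for the authoritative syntax.

```sql
-- Conditional record replacement: rows in `sales` that match an
-- incoming row on order_id are replaced, all others are kept.
INSERT INTO sales REPLACE ON sales.order_id = src.order_id
SELECT * FROM sales_staging AS src;

-- Complete partition overwrite: every partition value of sale_date
-- that appears in the source replaces the matching target partition.
INSERT INTO sales REPLACE USING (sale_date)
SELECT * FROM sales_staging;
```

Both commands are described as Delta-native, so the replacement happens as a single transactional commit rather than a delete-then-insert workaround.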
Make Joins on Geographical Data: Spatial Support in Databricks
Databricks Runtime 17.1's new geospatial support makes it easy to join geographical data using geography and geometry datatypes. This blog shows you how to map delivery zones, calculate distances, and optimize routes using spatial functions.
Dashboards for Nerds: DataFrame Plotting in Databricks
Discover Spark-native plotting in Databricks: create charts directly from DataFrames without pandas conversion. Perfect for data engineers.
Grant Individual Permissions to Read Secrets in Unity Catalog
Implement granular secret access control in Unity Catalog that goes beyond traditional Key Vault-level permissions. This advanced approach uses Unity Catalog UDFs with service credentials to create secret-specific access controls, allowing you to grant users access to individual secrets rather than entire Key Vaults.
Query Your Lakehouse in Under 1 ms: From OLAP to OLTP
Databricks Lakehouse excels at analytical workloads but struggles with the single-row lookups that customer-facing applications demand. This blog shows how to solve the OLAP vs OLTP dilemma by adding PostgreSQL-based OLTP capabilities to your existing Lakehouse architecture.
Why Financial Institutions Are Ditching Vendor Solutions for Databricks
How 7 Databricks accelerators are helping banks and financial institutions replace expensive vendor solutions with custom in-house data applications for AML, fraud detection, customer analytics, and compliance.
CI/CD Best Practices: Passing tests isn't enough
CI/CD pipelines can pass all jobs yet still deploy broken functionality. This blog covers smoke testing, regression testing, and critical validation strategies, which are especially useful for data projects where data quality is as important as code quality.
Recursive CTE: The beauty of SQL Self-Referencing Queries
Recursive CTEs in SQL: queries that can reference themselves to solve complex problems iteratively. From generating sequences to traversing network graphs and hierarchical data, learn how to eliminate manual looping with SQL solutions.
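As a minimal illustration of the self-referencing pattern the teaser describes, the following standard-SQL recursive CTE generates the sequence 1 through 10 with no procedural loop (the table and column names are illustrative):

```sql
-- The CTE references itself: the anchor row (1) is unioned with
-- each successive row (n + 1) until the termination condition holds.
WITH RECURSIVE numbers (n) AS (
  SELECT 1
  UNION ALL
  SELECT n + 1 FROM numbers WHERE n < 10
)
SELECT n FROM numbers;
```

The same anchor-plus-recursive-step shape extends to graph traversal and hierarchy flattening by joining the CTE back to an edge or parent-child table in the recursive member.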
Managing Data Changes with SCDs in Databricks
Discover how to build trustworthy data systems with Slowly Changing Dimensions in Databricks. This comprehensive guide covers SCD Types 1, 2, and 6 implementations using Delta Lake's MERGE operations and LakeFlow Declarative Pipelines, with practical SQL and Python examples.
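As a taste of the MERGE-based approach the guide covers, here is a hedged sketch of one half of an SCD Type 2 update: expiring the current version of a changed record while inserting brand-new customers. Table and column names (`dim_customer`, `stg_customer`, `is_current`, and so on) are illustrative assumptions, and a full Type 2 implementation needs a second step to insert the new version of each expired row.

```sql
-- Close out the current version of changed customers, preserving history.
MERGE INTO dim_customer AS tgt
USING stg_customer AS src
  ON tgt.customer_id = src.customer_id AND tgt.is_current = true
WHEN MATCHED AND tgt.address <> src.address THEN
  UPDATE SET tgt.is_current = false, tgt.end_date = current_date()
WHEN NOT MATCHED THEN
  INSERT (customer_id, address, start_date, end_date, is_current)
  VALUES (src.customer_id, src.address, current_date(), NULL, true);
```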
Add External Data Sources to Unity Catalog Lineage
Enhance your Unity Catalog lineage by incorporating external data sources such as Kafka streams and IoT devices. This blog covers both UI-based and programmatic methods for creating complete data lineage visibility in your Databricks environment.
AI_PARSE_DOCUMENT(): Get PDF Invoices Into The Database
Learn how to automate invoice processing with Databricks' AI_PARSE_DOCUMENT() function. Step-by-step guide to convert PDF invoices into structured database records using SQL and Agent Bricks. Includes cost analysis and real examples.
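A hedged sketch of the core call: reading raw PDFs from a volume and passing their binary content to the parsing function. The volume path is a placeholder, and the shape of the parsed output should be checked against the Databricks documentation.

```sql
-- Read raw PDF bytes from a Unity Catalog volume (path is illustrative)
-- and extract structured content from each invoice.
SELECT
  path,
  ai_parse_document(content) AS parsed
FROM read_files(
  '/Volumes/finance/invoices/raw/',
  format => 'binaryFile'
);
```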
Managed Iceberg Tables
Learn when to choose Apache Iceberg over Delta tables in Databricks. Complete guide covering manifest files, CDC limitations, liquid partitioning, and table properties with practical examples.
The Hidden Benefits of Databricks Serverless
Most Databricks cost comparisons focus only on compute pricing, missing two critical factors that can save thousands monthly. Learn how serverless waives private link transfer fees (up to $10/TB) and provides persistent remote caching that survives warehouse restarts: hidden benefits that often justify the serverless premium entirely.
Data Intelligence for All: 9 Ways Databricks Just Redefined the Future of Data
Discover how Databricks' 9 major announcements at Summit 2025 are democratizing AI with Agent Bricks, Lakebase, free edition, and more game-changing innovations.
End the Data Engineering Nightmare with Metrics
Learn how Databricks metrics views simplify SQL analytics by centralizing business rules and eliminating repetitive code. Complete tutorial with examples.