Building Production-Ready Databricks Projects with Bundles

We’ve noticed a common pattern among teams adopting Databricks Declarative Automation Bundles (formerly known as Databricks Asset Bundles or DABs): they’re using Bundles, but deployments still feel fragile. Releases get delayed. Production incidents keep happening. Engineers lose confidence every time they ship a change.

We’ve found that the problem isn’t the tool itself; it’s the level of adoption.

In almost every project we've worked on, teams are using roughly 20% of what Bundles actually enable (declarative deployments, maybe some YAML configuration) and layering that on top of existing workflows that were never designed to support it. The result is incremental improvement without real reliability. You get a slightly more organised version of the same underlying fragility.

This post is about the other 80%.

The Real Shift Isn't About Tooling

Bundles don't just give you a better deployment mechanism. Used fully, they give you a framework for enforcing engineering standards across your entire project: reproducibility, testability, environment isolation, and automated quality gates.

But here's the thing: Bundles expose whether those standards exist. They don't create them.

If your team is developing directly in the Databricks UI, running %pip install inside notebooks, managing dependencies inconsistently across clusters, and treating CI/CD as optional, Bundles will surface all of that. They don't fix broken processes. They amplify them.

The shift that unlocks the full value of Bundles is a mindset shift: from loosely structured workflows to modular, testable, reproducible systems. Once you start building that way, a few things become non-negotiable:

  • Source control: every change is versioned and traceable

  • Explicit dependency management: no hidden cluster state, no version drift

  • Automated quality gates: logic is validated before it reaches production

  • CI/CD as the only deployment path: no manual runs, no hidden steps, no exceptions

  • Environment parameterisation: dev, staging, and prod are isolated by configuration

These aren't Bundles features. They're engineering practices that Bundles make easier to enforce. The teams that get full value out of Bundles are the ones that commit to all of them, not just the ones that are easy.

What Partial Adoption Actually Looks Like

Partial adoption is deceptive because everything appears to be working. Bundles are deployed. Pipelines run. The repository exists. But look closer, and the same old problems are underneath.

Development still happens in the UI. Code is written and tested on shared clusters. There's no reproducible local setup, so debugging depends on the environment. The classic symptom: "It works, but only here."

Dependencies exist but aren't controlled. Some projects use a requirements.txt, others rely on cluster-installed libraries. Versions drift between environments without anyone noticing until something breaks in production that worked fine in dev. "It worked yesterday. What changed?"

Quality checks are manual and optional. Linters exist, but run only when someone remembers. There are no pre-commit hooks, no type checking, and pull request reviews become the only line of defence against bad code reaching production.

CI/CD is bypassed when it's inconvenient. Pipelines exist in theory. In practice, deployments get triggered manually, validation steps get skipped "just this once," and there's no clear promotion strategy between environments. The pipeline is aspirational, not mandatory.

Each environment behaves differently. Environment-specific logic leaks into the codebase. Values get hardcoded. Artifacts get rebuilt per environment.

None of this is caused by Bundles. It's caused by adopting the tool without the discipline it requires.

What Full Adoption Looks Like

A single source of truth

Everything lives in code. The repository structure is the project: not just a place to store notebooks, but the authoritative definition of how the project is built, configured, and deployed.

my-bundle-project
├── databricks.yaml                # Entry point for Bundles (defines targets, includes resources)
├── docs
│   ├── deployment.md              # Deployment process and environments
│   └── development.md             # How to run and test the project locally
├── pyproject.toml                 # Dependencies and project configuration (single source of truth)
├── README.md
├── resources                      # Declarative infrastructure (what runs in Databricks)
│   ├── alerts
│   │   └── configuration
│   │       ├── my-alert-1.yaml
│   │       └── my-alert-2.yaml
│   ├── jobs
│   │   ├── my-job-1
│   │   │   ├── configuration
│   │   │   │   └── my-job-1.yaml
│   │   │   └── notebooks
│   │   │       ├── my-task-1.py
│   │   │       └── my-task-2.py
│   │   └── my-job-2
│   │       ├── configuration
│   │       │   └── my-job-2.yaml
│   │       └── notebooks
│   │           ├── my-task-1.py
│   │           └── my-task-2.py
│   ├── pipelines
│   │   ├── my-pipeline-1
│   │   │   ├── configuration
│   │   │   └── transformations
│   │   └── my-pipeline-2
│   │       ├── configuration
│   │       └── transformations
│   └── variables.databricks.yaml  # Environment-specific variables (dev/staging/prod)
├── src                            # Application code (pure Python package)
│   └── my_package_name
│       ├── common
│       │   ├── dates.py
│       │   └── files.py
│       ├── core
│       ├── data
│       └── tasks
├── tests                          # Unit tests mirroring src structure
│   ├── common
│   ├── core
│   └── data
├── uv.lock                        # Locked dependencies (fully reproducible environment)
├── .gitignore
└── .github
    └── workflows
        ├── cd.yaml                # Deployment pipeline
        └── ci.yaml                # Validation pipeline

This layout isn't cosmetic. Every directory has a clear role. Resources are declarative. Application logic lives in a proper Python package under src/. Tests mirror the package structure. CI/CD configuration is part of the project instead of an afterthought.

A reproducible local development environment

Local development mirrors production. If code doesn't run locally, it's not ready to deploy, and it certainly won't behave predictably across environments.

pyproject.toml is the single source of truth for dependencies. We use uv for fast, deterministic environment management. The uv.lock file guarantees that every engineer and every CI runner is working with identical dependency versions.

# pyproject.toml

[project]
name = "your-project-name"
version = "0.1.0"
description = "Short project description"
readme = "README.md"
requires-python = ">=3.11"

authors = [
    { name = "Your Name", email = "you@example.com" }
]

dependencies = [
    "pydantic>=2.0,<3.0",
    "python-dateutil>=2.8,<3.0"
]

[project.optional-dependencies]
dev = [
    "pytest",
    "pytest-cov",
    "ipython",
    "ipdb",
    "ipykernel",
    "pyspark>=3.5.0",
    "delta-spark>=4.1.0",
    "databricks-sdk",
]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["src/my_package_name"]

No cluster-installed libraries. No %pip install in notebooks. If it's a dependency, it's declared here.

A build process that runs once

The artifact is built once and deployed everywhere. No rebuilding per environment, no environment-specific logic in the build step, no manual intervention.

# databricks.yaml

artifacts:
  wheel_files:
    type: whl
    build: uv build

That wheel is then referenced directly by job tasks, either via the libraries key for classic compute or through environments for serverless:

# resources/jobs/my-job-1/configuration/my-job-1.yaml (classic compute)

resources:
  jobs:
    my-job-1:
      tasks:
        - task_key: my-task-1
          notebook_task:
            notebook_path: ../notebooks/my-task-1.py
          libraries:
            - ../../../../dist/*.whl

        - task_key: my-task-2
          notebook_task:
            notebook_path: ../notebooks/my-task-2.py
          libraries:
            - ../../../../dist/*.whl
          depends_on:
            - task_key: my-task-1

# resources/jobs/my-job-1/configuration/my-job-1.yaml (serverless)

resources:
  jobs:
    my-job-1:
      tasks:
        - task_key: my-task-1
          notebook_task:
            notebook_path: ../notebooks/my-task-1.py
          environment_key: my-environment-1

  environments:
    - environment_key: my-environment-1
      spec:
        client: "4"
        dependencies:
          - ../../../../dist/*.whl

Same artifact. Different targets. That's the principle.

CI/CD as the only path to production

Nothing reaches production without passing through the pipeline. Not "usually." Not "except for hotfixes." Never.

The CI pipeline runs on every push and pull request: dependency sync, pre-commit hooks, and the full test suite. The CD pipeline handles promotion: every branch maps to an environment, and every deployment is fully automated.

# .github/workflows/ci.yaml
name: CI

on:
  push:
    branches: [main, develop, staging]
  pull_request:
    branches: [main, develop, staging]

jobs:
  ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install uv
        run: pip install uv

      - name: Sync dependencies
        run: uv sync

      - name: Install pre-commit
        run: pip install pre-commit

      - name: Run pre-commit
        run: pre-commit run --all-files

      - name: Run tests
        run: uv run pytest

# .github/workflows/cd.yaml
name: CD

on:
  push:
    branches: [develop, staging, main]

jobs:
  deploy:
    runs-on: ubuntu-latest

    # Map the pushed branch to a bundle target. (The matrix context is not
    # available in a job-level `if`, so an expression handles the mapping
    # instead of a matrix with per-branch filtering.)
    environment: ${{ github.ref_name == 'main' && 'prod' || github.ref_name == 'staging' && 'staging' || 'dev' }}

    env:
      BUNDLE_TARGET: ${{ github.ref_name == 'main' && 'prod' || github.ref_name == 'staging' && 'staging' || 'dev' }}
      DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
      DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}

    steps:
      - uses: actions/checkout@v4

      # The legacy `pip install databricks-cli` package does not include the
      # `bundle` commands; the official setup action installs the current CLI.
      - name: Install Databricks CLI
        uses: databricks/setup-cli@main

      - name: Validate bundle
        run: databricks bundle validate -t "$BUNDLE_TARGET"

      - name: Deploy bundle
        run: databricks bundle deploy -t "$BUNDLE_TARGET"

The branch-to-environment mapping makes promotion explicit. develop goes to dev. staging goes to staging. main goes to production.

Environment isolation through configuration

Environments differ by configuration, not by code changes. If you're modifying logic to make something work in prod that works in dev, the architecture is wrong.

The databricks.yaml defines targets and variables. The variables.databricks.yaml handles environment-specific values. Application code never needs to know which environment it's running in.

# databricks.yaml

bundle:
  name: my-bundle-project

workspace:
  file_path: ${var.WORKING_DIRECTORY}

include:
  - path/to/configuration/yaml/files

artifacts:
  wheel_files:
    type: whl
    build: uv build

targets:
  dev:
    mode: development
    default: true

  staging:
    mode: production
    git:
      branch: staging
    run_as:
      service_principal_name: <staging-sp>

  prod:
    mode: production
    git:
      branch: main
    run_as:
      service_principal_name: <prod-sp>

# resources/variables.databricks.yaml

variables:
  ENV:
    description: "Deployment target name (dev, staging, or prod)"
    default: ${bundle.target}

  WORKING_DIRECTORY:
    description: "Workspace path where bundle files are deployed"
    default: "/Users/${workspace.current_user.userName}/.bundle/${bundle.name}/${var.ENV}/"

The variable ${bundle.target} is the only thing that changes between deployments. Everything else (logic, structure, and artifact) is identical.
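The same principle applies inside application code: environment-derived values are injected (for example as job or task parameters), and logic never branches on the environment name. A hypothetical sketch; TableConfig and the catalog names are illustrative, not part of the Bundles API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TableConfig:
    """Values injected per environment by the job configuration, never hardcoded."""
    catalog: str
    schema: str

def full_table_name(config: TableConfig, table: str) -> str:
    """Build a fully qualified table name; identical logic in every environment."""
    return f"{config.catalog}.{config.schema}.{table}"

# dev and prod differ only in the injected configuration, not in the code path:
dev = TableConfig(catalog="dev_catalog", schema="sales")
prod = TableConfig(catalog="prod_catalog", schema="sales")

print(full_table_name(dev, "orders"))   # dev_catalog.sales.orders
print(full_table_name(prod, "orders"))  # prod_catalog.sales.orders
```

Because no function ever asks "which environment am I in?", the same wheel behaves identically wherever it is deployed.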

Automated code quality

Quality is enforced automatically on every commit. Not reviewed manually. Not left to the PR author's discretion. Enforced.

Pre-commit hooks handle formatting (black), import ordering (isort), linting (flake8), and type checking (mypy) — all before code ever reaches the remote branch.

# .pre-commit-config.yaml

files: ^(src)
repos:
  - repo: https://github.com/pycqa/isort
    rev: 5.13.2
    hooks:
      - id: isort
        args: ["--profile", "black"]

  - repo: https://github.com/psf/black
    rev: 22.3.0
    hooks:
      - id: black
        language_version: python3

  - repo: https://github.com/pycqa/flake8
    rev: 7.1.0
    hooks:
      - id: flake8
        args:
          - --max-line-length=88
          - --extend-ignore=E501
          - --builtins=spark,dbutils
        additional_dependencies:
          - importlib-metadata<7

  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.10.0
    hooks:
      - id: mypy

The goal isn't to slow engineers down. It's to make consistent quality the path of least resistance.

Testing as a first-class concern

Tests are not documentation; they are the mechanism that lets you ship changes with confidence. The test structure mirrors the src/ package exactly, making it easy to find tests for any module and easy to see what's covered.

# tests/common/test_dates.py

from my_package_name.common import dates

def test_my_date_function() -> None:
    expected_value = ""
    assert dates.my_date_function() == expected_value


# tests/common/test_files.py

from my_package_name.common import files

def test_my_file_function() -> None:
    filewriter = files.FileWriter()
    expected_value = ""
    assert filewriter.my_file_function() == expected_value

Keep tests simple. A clear input, a validated output, a deterministic result. Simple tests are easier to maintain and faster to diagnose when they fail, which matters more than test sophistication when you're shipping changes regularly.
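To make "simple and deterministic" concrete, here is a hypothetical pure date helper together with its test: fixed inputs, one asserted output, no environment dependence:

```python
from datetime import date

def start_of_quarter(d: date) -> date:
    """Return the first day of the quarter containing d (pure and deterministic)."""
    quarter_start_month = ((d.month - 1) // 3) * 3 + 1
    return date(d.year, quarter_start_month, 1)

def test_start_of_quarter() -> None:
    # Fixed inputs, exact expected outputs: the test result never varies by run.
    assert start_of_quarter(date(2024, 5, 17)) == date(2024, 4, 1)
    assert start_of_quarter(date(2024, 12, 31)) == date(2024, 10, 1)

test_start_of_quarter()
```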

Run the full suite locally or in CI with: uv run pytest

The Pattern That Separates Reliable Teams from Fragile Ones

Bundles don't fail. Partial adoption does.

The teams that ship confidently are using the same tools, just with more discipline. They treat every principle here as non-negotiable: local-first development, explicit dependencies, automated quality gates, CI/CD as the single deployment path, and configuration-driven environment isolation.

If you're already using Bundles, you're closer than you think. The gap between 20% and 100% adoption isn't a tooling problem but a commitment one. Commit to the system behind the tool, and the reliability follows.

Next

Watermark-Based Incremental Ingestion (Lakeflow Connect query-based capture)