How an asset management firm put Databricks in the hands of every analyst, portfolio manager, and engineer.

With permissions stuck in Snowflake, deployments stuck in manual processes, and a governance team flying blind across 15 investment boutiques, our client needed more than a migration. SunnyData built the automated governance layer that made Databricks a platform the whole firm could finally use.


Key Metrics


Industry: Asset Management
Solution: Data Governance, Permission Automation, CI/CD, End-User Enablement
Platform: Unity Catalog, Databricks Apps, DABs, GitLab CI/CD


Business Context

The client is one of the world's largest independent asset management firms, operating through 15+ specialized investment boutiques, each with its own compliance requirements and data access policies.

Their data estate reflected that complexity: portfolio data in Snowflake, MS Teams collaboration data on Amazon S3 via AWS Glue Catalog, and unstructured research in S3 Iceberg tables.

Databricks was already in the environment, intended to become the unified platform for analytics, ML, and the firm's Investment Analyst Assist initiative. But without a way to bring existing governance across, onboard code safely, or give end users visibility into what was available to them, adoption was stalled before it started.

The Problem

The organization had years of mature governance built in Snowflake: roles, privileges, and complex group-to-group inheritance hierarchies carefully maintained across 15+ investment boutiques. With portfolio data in Snowflake, collaboration data in AWS Glue, and research assets in S3 Iceberg tables, the goal was to make Unity Catalog the authoritative catalog of catalogs across all three without losing the governance already in place. No native tooling existed to make that happen. Migrating the permission model manually would have taken months and produced only a static snapshot, one that drifted out of sync with every new hire and role change.

Meanwhile, engineering teams were deploying Databricks jobs manually, with no standardized CI/CD pipeline, no automated security scans, and no coordinated approval process between Enterprise Data and functional teams. Deployment cycles stretched across days.

And the data governance team had no consolidated view of Unity Catalog. To check documentation coverage or understand what a user had access to, someone had to click through every object in the UI one at a time. End users had no self-service path into the platform. Adoption stalled not because Databricks wasn't capable, but because people couldn't see what was available to them, and couldn't trust that what they saw was accurate.

The Solution

Automated Permission Migration Engine

SunnyData built a fully automated ETL pipeline that extracts roles, privileges, group-to-group hierarchies, and metadata from Snowflake via the existing foreign catalog connection and applies them continuously to Databricks Unity Catalog. Using SCD Type 2 change detection, the pipeline identifies new, updated, and deleted records and translates them into the equivalent Unity Catalog privilege model, automatically executing GRANT and REVOKE statements. This is not a one-time migration. As Snowflake evolves, Databricks stays current without the governance team lifting a finger.
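As a rough illustration of the translate-then-diff step, the sketch below compares two snapshots of Snowflake grants and plans the Unity Catalog statements needed to close the gap. It is a simplified stand-in, not the production pipeline: a plain snapshot diff replaces the SCD Type 2 history table, and the `Grant` record, the `PRIVILEGE_MAP` entries, and the object names are all hypothetical. Only the `GRANT ... ON ... TO` and `REVOKE ... ON ... FROM` statement shapes come from Databricks SQL.

```python
from typing import Iterable, NamedTuple


class Grant(NamedTuple):
    """One Snowflake grant row (hypothetical shape, for illustration only)."""
    principal: str
    privilege: str     # Snowflake privilege name, e.g. "SELECT"
    object_type: str   # e.g. "TABLE" or "SCHEMA"
    object_name: str   # fully qualified name as it appears in Unity Catalog


# Assumed mapping from (Snowflake privilege, object type) to a Unity Catalog
# privilege. A real mapping would be larger and reviewed by the governance team.
PRIVILEGE_MAP = {
    ("SELECT", "TABLE"): "SELECT",
    ("INSERT", "TABLE"): "MODIFY",
    ("UPDATE", "TABLE"): "MODIFY",
    ("DELETE", "TABLE"): "MODIFY",
    ("USAGE", "SCHEMA"): "USE SCHEMA",
}


def translate(grants: Iterable[Grant]) -> set:
    """Map Snowflake grants to Unity Catalog grants, dropping unmapped ones."""
    translated = set()
    for g in grants:
        uc_privilege = PRIVILEGE_MAP.get((g.privilege, g.object_type))
        if uc_privilege is not None:
            translated.add((g.principal, uc_privilege, g.object_type, g.object_name))
    return translated


def plan_statements(previous: Iterable[Grant], current: Iterable[Grant]) -> list:
    """Return the GRANT/REVOKE statements that move Unity Catalog from the
    previous snapshot to the current one."""
    # Translate first, then diff: several Snowflake privileges can collapse
    # into one Unity Catalog privilege, so diffing raw rows would over-revoke.
    before, after = translate(previous), translate(current)
    statements = []
    for principal, privilege, object_type, object_name in sorted(after - before):
        statements.append(
            f"GRANT {privilege} ON {object_type} {object_name} TO `{principal}`"
        )
    for principal, privilege, object_type, object_name in sorted(before - after):
        statements.append(
            f"REVOKE {privilege} ON {object_type} {object_name} FROM `{principal}`"
        )
    return statements
```

Translating before diffing matters here: if a group loses INSERT but keeps UPDATE on a table, both map to MODIFY under the assumed mapping, so no REVOKE should be issued.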

GitLab + DABs CI/CD Pipeline

SunnyData implemented a standardized CI/CD pipeline using GitLab and Databricks Declarative Automation Bundles. Every commit triggers automated DABs validation, security scans for hardcoded credentials, and team-specific functional tests. Auto-deployment handles DEV and UAT; production requires dual approval from both Enterprise Data and the relevant functional team, turning two misaligned workflows into one auditable process.
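Two of the gates described above, the hardcoded-credential scan and the dual production approval, can be sketched in a few lines. This is an assumption-laden illustration rather than the scanner used in the engagement: the regular expressions, the `scan_text` and `can_deploy_to_prod` helpers, and the team identifiers are hypothetical, and a real pipeline would more likely lean on a dedicated secret-detection tool and GitLab's own approval rules.

```python
import re

# Illustrative patterns only. The AWS access key ID prefix is well known; the
# Databricks token shape and the generic inline-secret rule are assumptions.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "databricks_token": re.compile(r"\bdapi[0-9a-f]{32}\b"),
    "inline_secret": re.compile(
        r"(?i)\b(password|secret|token)\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}

# Hypothetical team identifiers for the two required production approvers.
REQUIRED_APPROVERS = {"enterprise-data", "functional-team"}


def scan_text(text: str) -> list:
    """Return (line_number, pattern_name) for every suspicious line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


def can_deploy_to_prod(approving_teams) -> bool:
    """Production needs sign-off from both required teams."""
    return REQUIRED_APPROVERS <= set(approving_teams)
```

In a CI job, a non-empty result from `scan_text` would fail the commit stage, while `can_deploy_to_prod` would guard only the production deployment step.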

Data Governance Control Tower

SunnyData built a custom Databricks App that surfaces Unity Catalog roles, groups, permissions, metadata quality, and access policies in a single interface: no API calls, no manual navigation. The app runs on Databricks' on-behalf-of (OBO) model, meaning every query runs under the logged-in user's own credentials. The Access Explorer feature lets any user search for a data object and see exactly how they got access, down to the inheritance path. What once required hours of manual investigation now surfaces in seconds, giving end users the confidence to self-serve inside Databricks without filing a ticket.
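The inheritance-path lookup behind a feature like Access Explorer can be pictured as a breadth-first search over group memberships. The sketch below is one plausible way to do it, not the app's actual code: `explain_access`, the `member_of` and `grants` dictionaries, and every group and object name are hypothetical, and in practice the membership and grant data would be read from Unity Catalog under the user's own credentials.

```python
from collections import deque


def explain_access(user, obj, member_of, grants):
    """Return the shortest membership chain from `user` to a principal that
    holds a grant on `obj`, or None if the user has no access.

    member_of: principal -> list of groups it directly belongs to
    grants:    object name -> set of principals granted access to it
    """
    granted = grants.get(obj, set())
    queue = deque([[user]])
    seen = {user}
    while queue:
        path = queue.popleft()
        if path[-1] in granted:
            return path
        # Walk one level up the group-to-group hierarchy.
        for parent in member_of.get(path[-1], []):
            if parent not in seen:
                seen.add(parent)
                queue.append(path + [parent])
    return None
```

The `seen` set keeps the search from looping if the group hierarchy ever contains a cycle, and breadth-first order means the path shown to the user is the most direct one.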

The Results

The client now has a governed, self-maintaining permission layer across three data platforms unified under Unity Catalog. What would have required months of manual engineering runs as an automated, repeatable pipeline. Deployment cycles that once stretched across days now complete in minutes after a merge. And a governance team that previously had no consolidated view of Unity Catalog now has real-time visibility across all 15+ boutiques in a single app.

The more significant shift is structural. Engineers ship to production through a standardized, compliant pipeline. The governance team enforces policy at scale with a lean headcount. And end users (the analysts and ML engineers driving the firm's Investment Analyst Assist initiative) have a self-service path into Databricks that respects their access boundaries without requiring IT intervention.

This is a proof point that Unity Catalog can serve as a catalog of catalogs in a multi-platform enterprise: not just a destination for migrated data, but the authoritative governance layer for data that stays in Snowflake and S3. That is a meaningful expansion of how customers think about their Databricks investment. Customers don't get more value from Databricks by having more data in it. They get more value when more people can confidently use what's already there.

Looking Ahead

What kills platform adoption is often uncertainty: about what you can see, what you're allowed to use, and whether the platform you've been given actually reflects your real permissions. The organization solved it by treating governance not as a migration checkbox, but as the foundation for everything that comes after. When the permission model is accurate, automated, and visible to every user, the platform stops being something IT manages and starts being something the whole firm uses.
