BIND: Building the Data Foundation for Argentina's Digital Banking Ecosystem

SunnyData partnered with BIND to migrate its overloaded IBM DataStage infrastructure to a modern Lakehouse built with Databricks on AWS. We implemented governed data pipelines, CI/CD automation, and an AI-powered executive agent, all while keeping a high-volume operation running without disruption.

Industry: Banking & Financial Services
Solution: Data Platform Migration, Data Engineering, CI/CD & IaC, GenAI Business Agent (RAG)
Platform: Delta Lake, Unity Catalog, Databricks Lakeflow, Databricks Apps, MLflow
Cloud: AWS


Business Context

BIND is Argentina's leading digital financial ecosystem for businesses and fintechs, processing over 340 million monthly transactions. It provides the banking backbone for more than 70% of the country's digital wallets (including Mercado Pago and Prex), impacting tens of millions of users daily.

The Problem

BIND's original data infrastructure, built on IBM DataStage and DB2 Warehouse, supplemented by SQL Server and MariaDB, had reached its operational limits. IBM compute resources were running at 95% of contracted capacity, creating saturation risk and constraining growth. Mid-to-high complexity ETL jobs were taking between 3 and 8 hours to run, delaying data availability for business teams. With 12 TB of active data growing at 8–10% per month and no path to historical versioning or lineage, the infrastructure was becoming both a bottleneck and a liability.
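
To make the growth figure concrete, the short sketch below compounds the quoted 12 TB at 8% and 10% per month. It assumes the monthly rate stays constant, which the source does not claim; the function names are illustrative only.

```python
# Project storage growth from the figures quoted above:
# 12 TB of active data, compounding at 8% to 10% per month.
# Assumption: the monthly rate stays constant over the projection.

def project_tb(start_tb: float, monthly_rate: float, months: int) -> float:
    """Return the projected volume after `months` of compound growth."""
    return start_tb * (1 + monthly_rate) ** months


def months_to_double(monthly_rate: float) -> int:
    """Return the first whole month at which the volume has at least doubled."""
    months, factor = 0, 1.0
    while factor < 2.0:
        factor *= 1 + monthly_rate
        months += 1
    return months


for rate in (0.08, 0.10):
    after_year = project_tb(12.0, rate, 12)
    print(f"{rate:.0%}/month: {after_year:.1f} TB after 12 months, "
          f"doubles within {months_to_double(rate)} months")
```

Under that assumption, 12 TB becomes roughly 30 to 38 TB within a year, and the volume doubles in 8 to 10 months. Against compute already at 95% of contracted capacity, that leaves very little headroom.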

Beyond performance, the platform had significant governance gaps:

  • Access controls and permissions were distributed without centralized auditing

  • Metadata was fragmented

  • Data quality policies were not consolidated

This made it difficult to build reusable analytical models and effectively blocked any path to AI and ML in production.

The Solution

To tackle these issues, SunnyData migrated BIND's entire data ecosystem to a unified Lakehouse platform built with Databricks on AWS. Designed around an iterative, secure, and governed approach, the platform addresses both immediate operational pain points and BIND's long-term strategic ambitions.

Data Platform Migration & Engineering Foundation

The migration covers approximately 600 ETL jobs, identified and classified with Lakebridge and migrated progressively from IBM DataStage to Databricks Lakeflow across four primary data sources. Rather than replicating existing jobs one-to-one, the team used the migration as an opportunity to refactor processes with Lakeflow, DLT, and dbt, implementing distributed execution, automated observability, and CI/CD via Databricks Asset Bundles.

Storage was modernized from DB2 to Delta Lake, enabling open formats, complete data lineage, historical versioning, and meaningful cost reduction compared to IBM's fixed-license model. Unity Catalog provides the governance backbone: centralized access controls, audit trails, data quality policies, and lineage tracking. Together, these form the governance foundation needed to meet BIND's regulatory requirements and enable trustworthy self-service analytics.

The architecture follows a medallion model and is designed to be Data Mesh-compliant from the outset, mapping assets to business domains with clear ownership so that individual units can eventually operate independently without creating a fragile monolith.
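
As a rough illustration of how a medallion layout and domain ownership can fit together, the sketch below models each asset with a domain, a layer, and an owning group, then derives a three-level name and a read grant. The domain, table, and group names are hypothetical, and the one-catalog-per-domain layout is an assumption rather than BIND's actual design.

```python
from dataclasses import dataclass

# Medallion layers, ordered from raw to business-ready.
LAYERS = ("bronze", "silver", "gold")


@dataclass(frozen=True)
class DataAsset:
    domain: str       # hypothetical business domain, e.g. "payments"
    layer: str        # one of LAYERS
    table: str        # table name within the layer
    owner_group: str  # hypothetical group accountable for the asset

    def full_name(self) -> str:
        """Three-level catalog.schema.table name (assumes one catalog per domain)."""
        if self.layer not in LAYERS:
            raise ValueError(f"unknown layer: {self.layer}")
        return f"{self.domain}.{self.layer}.{self.table}"


def read_grant(asset: DataAsset, reader_group: str) -> str:
    """Build a SQL statement granting read access on a single asset."""
    return f"GRANT SELECT ON TABLE {asset.full_name()} TO `{reader_group}`"


# Hypothetical assets, for illustration only.
assets = [
    DataAsset("payments", "bronze", "raw_transfers", "payments-data-owners"),
    DataAsset("payments", "gold", "daily_settlements", "payments-data-owners"),
]

for asset in assets:
    print(f"{asset.full_name()} (owner: {asset.owner_group})")

print(read_grant(assets[1], "finance-analysts"))
```

Keeping the owner on the asset itself is one way to make accountability explicit per domain; consumers in other units only ever receive grants on the curated gold layer.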

AI Business Agent

Once the foundation was in place, SunnyData also built a GenAI-powered business agent deployed on Databricks Apps. The system ingests BIND's monthly management reporting (management reports, board presentations, and supporting Excel files) into a RAG pipeline backed by S3, enabling senior executives to query financial performance data in natural language. The MVP launched for directors and senior leaders, with a phased roadmap to extend access to middle management and eventually the full organization. Unity Catalog manages role-based permissions at each stage.
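
The retrieval step of such a pipeline can be sketched in a few lines. This is a simplified stand-in, not the deployed system: it scores chunks with bag-of-words cosine similarity instead of embeddings, filters by an in-memory role set instead of Unity Catalog permissions, and uses invented placeholder text and file names.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Placeholder chunks: sources, roles, and text are invented for illustration.
CHUNKS = [
    {"source": "management_report.pdf", "roles": {"director"},
     "text": "Net interest income grew in March driven by corporate lending."},
    {"source": "board_presentation.pptx", "roles": {"director"},
     "text": "The board reviewed operating expenses and headcount plans."},
    {"source": "kpi_summary.xlsx", "roles": {"director", "manager"},
     "text": "Monthly transaction volume and active wallet counts by segment."},
]


def retrieve(question: str, role: str, top_k: int = 2) -> list[dict]:
    """Return the top_k chunks visible to `role` that overlap with the question."""
    query = Counter(tokenize(question))
    visible = [c for c in CHUNKS if role in c["roles"]]  # permission filter first
    scored = [(cosine(query, Counter(tokenize(c["text"]))), c) for c in visible]
    scored = [(s, c) for s, c in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]


def build_prompt(question: str, chunks: list[dict]) -> str:
    """Assemble the grounded prompt that would be sent to the language model."""
    context = "\n".join(f"[{c['source']}] {c['text']}" for c in chunks)
    return f"Answer using only the context below.\n\n{context}\n\nQuestion: {question}"


hits = retrieve("How did net interest income change?", role="director")
print(build_prompt("How did net interest income change?", hits))
```

Filtering by role before scoring means a user never receives context drawn from documents they cannot open, which is the property a phased rollout to wider audiences depends on.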

The Results

The migration to Databricks delivered an immediate and measurable shift in how BIND's data infrastructure performs. ETL jobs that previously took 3 to 8 hours to complete now run in under an hour, so business teams get reliable, actionable data much sooner.

Beyond speed, the move from IBM's fixed-capacity model to Databricks on AWS transformed BIND's cost structure from a constrained, overloaded licensing arrangement to a transparent model where compute scales with demand and powers down automatically when idle.

The governance picture changed just as significantly: where access controls were previously distributed and unaudited, Unity Catalog now provides centralized lineage, role-based access, and the audit trail required to meet regulatory standards.

On the AI side, directors and senior managers who previously spent hours manually reviewing monthly management reports can now get answers in seconds by querying metrics and business highlights in plain language through the AI Agent.

Looking Ahead

With a clean, governed, and modeled data foundation in place, BIND is positioned to move fast on the initiatives that matter most. Wholesale banking analytics for recently acquired loan portfolios are being built directly on Databricks from day one. The architecture is being prepared for Kafka integration to enable near-real-time ingestion, and legacy reporting is migrating to Databricks Dashboards for a unified, auditable layer.

Most importantly, BIND now has the infrastructure to do what wasn't previously possible: deploy AI and ML in production, from predictive analytics to advanced customer segmentation, on a platform built to support it.