Cloud Data Transformation with Data Vault
Telco
Leveraging dbt to build a Data Vault on Snowflake for a complete cloud data transformation and greater business flexibility.
The client is a UK mobile service provider with 4 million members. They are a B Corp, always looking to create a positive impact for the people and the planet. Their goal is to make it easier for people to connect, offering flexible services and doing business consciously.
What we did
modelling
ingestion
The Challenge
The client was looking to start providing greater flexibility and offering new products to their customers, but their legacy data warehouse was holding them back.
The telco needed to migrate to a cloud-based data platform and had identified Data Vault as a modelling methodology and Snowflake as the warehousing solution for their project.
When the client came to Infinite Lambda, they were looking to re-engineer their legacy data warehouse and migrate it to a new data architecture where all their data was clean, integrated and accessible, addressing the limitations of the existing system.
They needed to switch to a future-proof approach where the data was modelled after the business logic and could adapt more easily to support a more member-centric approach and new product offerings.
Guiding a full-scale data transformation on Snowflake
The client had already adopted Snowflake but found they did not have the internal capacity to deliver the project while maintaining existing reporting. Having also selected Data Vault 2.0 as the modelling methodology, they were looking for a partner with strong expertise in both the Data Cloud and the specific framework.
We built a data platform based on the following stack:
- Snowflake as the cloud-based data warehouse;
- dbt for transforming the data loaded in the warehouse, testing the transformation and performing quality checks;
- AWS as the underlying cloud platform for the deployment;
- Python for platform integration and automation.
Migrating to dbt and the Data Cloud
The client prioritised the predictability of their Snowflake migration. Having delivered over 60 cloud data migrations, Infinite Lambda offered a tried and tested methodology that minimises uncertainty around timescale, effort and cost.
The telco had identified Snowflake as their warehouse of choice to reduce the effort required by the data team to maintain and manage the solution, whilst also offering the scalability they needed to deliver on their strategic business goals.
The client had two systems running in parallel: the legacy SQL Server warehouse and the new Snowflake solution, where ingestion was set up.
Together with the client’s data team, we outlined the following requirements:
- Scalable computation infrastructure
- Fully automated and tested data transformations
- Near real-time ingestion and processing of raw events, with data quality monitoring
- Governance and compliance by design
- Data integrated according to the enterprise data model
- A Data Vault model built from the data warehouse layer
- Data marts built on top of the data warehouse layer to minimise the impact of changes on data feeds
Building the business logic on Data Vault
After completing a like-for-like, fully validated migration of the data model to Snowflake and dbt, the second phase was to develop a Data Vault model that would allow for a more logical organisation of the data that reflected the actual business model.
The objective was to centralise the business logic in a single view so that each and every department would use the same logic when working with the data as opposed to creating their own view.
We started by documenting the business processes for the priority business domains to enable Data Vault modellers to build the data model with a business focus. We leveraged dbt to transform the feeds for those domains into the Persistent Staging Area (PSA) layer, cleaning the data, applying PII tags and adding unique member ID data keys.
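As an illustration only, the staging step described above can be sketched in Python, the language used for platform integration in this stack. The function names, the column names and the choice of SHA-256 are assumptions made for this example; the dbt models actually built for the client are not shown in this case study.

```python
import hashlib


def hash_key(*business_keys: str) -> str:
    """Derive a deterministic key from one or more business keys.

    Keys are trimmed and upper-cased first so that the same member
    always produces the same key, whatever the formatting in the feed.
    """
    normalised = [key.strip().upper() for key in business_keys]
    return hashlib.sha256("||".join(normalised).encode("utf-8")).hexdigest()


def stage_record(raw: dict, pii_columns: set) -> dict:
    """Clean one raw feed record and attach a member key and PII tags.

    `member_id` and `pii_columns` are hypothetical names for this sketch.
    """
    # Trim whitespace on string values; leave other types untouched.
    cleaned = {
        col: (val.strip() if isinstance(val, str) else val)
        for col, val in raw.items()
    }
    return {
        "member_hk": hash_key(cleaned["member_id"]),
        "payload": cleaned,
        # Only tag the PII columns that are actually present in this feed.
        "pii_columns": sorted(pii_columns & set(cleaned)),
    }


if __name__ == "__main__":
    record = {"member_id": " m-001 ", "email": "member@example.com", "plan": "flex"}
    print(stage_record(record, pii_columns={"email", "msisdn"}))
```

Normalising the business key before hashing is what makes the key stable across feeds, which is the property a shared member key needs.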
The Infinite Lambda team created a business layer in Snowflake that represents how the business works as opposed to how the data is ingested. We did that by ingesting all of the data feeds in the Raw Vault layer first, where cleaned, unmodelled data was ready to work with. We leveraged dbt to implement, test, deploy and document all of the transformations required to build the new data model.
Our consultants modelled the data so that it was logically combined into business domains from multiple different feeds. We replicated some views and provided data scientists with logically structured, modelled data they could use.
We built a single source of truth that could provide a single view combining all of the key data from over 15 feeds. For instance, for the payments domain, the modelled data could provide a holistic representation of what was purchased, who paid for it, when and how they paid for it.
During the migration, a new requirement was added: the data had to link to an account rather than to a single SIM, which called for a creative solution. We introduced the concept of the unique member ID, which allows engineers to link all records and payments back to the member and so provide an enhanced, customisable user experience.
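The idea of linking SIM-level records back to a member can be sketched as follows. This is a minimal Python illustration under stated assumptions: the field names (`sim_id`, `amount_pence`), the `sim_to_member` mapping and the function itself are hypothetical and do not describe the client's actual model.

```python
from collections import defaultdict


def payments_by_member(payments: list, sim_to_member: dict) -> tuple:
    """Roll SIM-level payments up to the member that owns each SIM.

    Returns the total paid per member (in pence) and the list of
    payments whose SIM has no known member, so they can be reviewed.
    """
    totals = defaultdict(int)
    unmatched = []
    for payment in payments:
        member_id = sim_to_member.get(payment["sim_id"])
        if member_id is None:
            unmatched.append(payment)
            continue
        totals[member_id] += payment["amount_pence"]
    return dict(totals), unmatched


if __name__ == "__main__":
    # Hypothetical data: one member holding two SIMs, plus an unknown SIM.
    sim_to_member = {"SIM-1": "MEMBER-A", "SIM-2": "MEMBER-A"}
    payments = [
        {"sim_id": "SIM-1", "amount_pence": 1000},
        {"sim_id": "SIM-2", "amount_pence": 1500},
        {"sim_id": "SIM-9", "amount_pence": 700},
    ]
    print(payments_by_member(payments, sim_to_member))
```

Keeping unmatched payments separate, rather than dropping them silently, makes gaps in the SIM-to-member mapping visible.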
Knowledge transfer
A key component of our migration methodology is making sure the client is fully prepared to take over the new system, having the knowledge and skills to leverage it efficiently and maintain it in the long run.
Infinite Lambda partnered with the client’s data platform team, which meant we would work side by side and often have pair programming sessions.
We also provided a series of training sessions with key members of our staff to ensure the client’s team were proficient in the technology and the Data Vault framework so they could make the most of their platform and quickly become self-sufficient.
The technology we used
A successful data transformation that enables scalability and collaboration
Within 14 months, we completed the migration, adhering to the timeframe we had initially agreed. Migrating existing logic to their new platform, we ensured a seamless transition for the business, giving them full confidence in using their data.
Our consultants guided the rollout and supported the client in adopting their new platform, empowering them to start building data products and drive value for their organisation.
Migrating from SQL Server to dbt and Snowflake has made reporting significantly faster and more reliable for the client's data users. Reporting that used to take 8 hours now takes a couple of hours and is far more reliable than before, enabling the finance team to self-serve.
Let’s walk the walk together
We are a generation of engineers and technologists who are passionate about transforming organisations with digital-age solutions and seeing them thrive on the cloud.
We have helped over 50 organisations to deliver projects at different scales with over £100m in ROI.