Integrated Data Analytics Platform

moshi

Setting up a modern data stack from scratch to enable customer insights and ultimately contribute to stress-free bedtime routines for children.

Moshi is a mindfulness app for children that helps them fall asleep fast and contributes to stress-free bedtime routines.

As the volume and complexity of Moshi’s data increased, the client realised they needed a more reliable and sustainable solution. We considered their challenges, drew up a detailed plan and set up a modern data stack from scratch.

What we did

Modern data stack set up from scratch
Ingestion & custom Fivetran connectors
Automated reporting
Behavioural analysis foundations

The Challenge

Like many other fast-growing organisations, Moshi did not have a data analytics platform in place. KPI reporting was done manually in Excel by a single team member, which was extremely time-consuming.

Moshi relied heavily on third-party data providers, but their data arrived at different times and in different formats. Reports aggregated from these sources were unreliable because of latency, which often resulted in discrepancies.

The client needed a sustainable, scalable solution that could handle the growing volume and complexity of their data.

The solution

Setting up a modern data stack from scratch

The Infinite Lambda team set up a modern data stack from scratch, automating all processes from data ingestion to final reporting. We built a customised dashboard that mirrored the original KPI report and automated the data pipelines to ensure seamless reporting.

Data engineering

We started out by drafting the architecture. To significantly reduce build time, we chose Fivetran, a tool that can robustly ingest data from hundreds of data sources with minimal operational overhead.

We created a data lake with two layers, one for raw data and one for quality data, so we could pull only the data that was relevant to a specific report. We then implemented dbt for data transformation, which gave the client lineage for understanding the transformation process, and replicated the original Excel KPI report in Looker.

Our data engineers deployed Airflow with Kubernetes and Docker as infrastructure on AWS, leveraged Airflow to automate the data ingestion, and set up manual deployment with Docker, EKS and ECR.

Building custom Fivetran connectors

As there were no direct connectors available for two of the vital sources, we realised we needed to build them ourselves. Since both sources held large datasets, we made sure to extract only the data that mattered most to the KPI report. This avoided redundant data, optimised cost and kept things simple.

With the combination of out-of-the-box Fivetran connectors and custom data pipelines for other data sources that we orchestrated through Airflow, the client could now extract the most important data out of large datasets without any delays, as well as work out close estimates of missing data immediately.
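The idea of extracting only what the KPI report needs can be sketched in Python. This is a minimal illustration, not the connectors we built: the field list, the table name `kpi_events` and the `fetch_page` callable are hypothetical, and the returned dictionary only loosely follows the state / insert / hasMore convention used by Fivetran's function-style connectors, so it should be checked against the Fivetran documentation before being relied on.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical list of fields the KPI report needs (not taken from the project).
KPI_FIELDS = ("id", "event_date", "country", "revenue")

Record = Dict[str, Any]
# A page fetcher takes the last saved cursor and returns
# (records, new_cursor, has_more). It stands in for a real API client.
FetchPage = Callable[[Optional[str]], Tuple[List[Record], Optional[str], bool]]


def project(record: Record, fields: Tuple[str, ...] = KPI_FIELDS) -> Record:
    """Keep only the fields the report needs; absent fields become None."""
    return {name: record.get(name) for name in fields}


def sync_once(state: Dict[str, Any], fetch_page: FetchPage) -> Dict[str, Any]:
    """Run one incremental sync step and return trimmed rows plus new state."""
    records, new_cursor, has_more = fetch_page(state.get("cursor"))
    return {
        # The cursor is saved so that the next run resumes where this one ended.
        "state": {"cursor": new_cursor},
        # Only the projected rows are passed on to the warehouse.
        "insert": {"kpi_events": [project(r) for r in records]},
        "hasMore": has_more,
    }
```

Dropping unused fields at the source keeps both the transferred volume and the warehouse tables small, which is where the cost saving comes from.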

Automation

The manual approach to reporting consisted of downloading CSV files from each data source and aggregating all of the data in Excel. This was neither robust nor sustainable.

We analysed the original Excel report to get a full understanding of the data and the way the KPIs were calculated. Once we had determined all the data sources, we used Fivetran and Python scripting to ingest them into Snowflake and made sure that the results matched the data in Excel.
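A check of this kind, comparing the KPI values in the warehouse against those in the Excel report, could look roughly like the sketch below. The KPI names and the tolerance are assumptions made for illustration, and the two dictionaries stand in for values that would in practice be read from Excel and queried from Snowflake.

```python
import math
from typing import Dict, List, Optional, Tuple

Mismatch = Tuple[str, Optional[float], Optional[float]]


def reconcile(
    excel: Dict[str, float],
    warehouse: Dict[str, float],
    rel_tol: float = 1e-6,
) -> List[Mismatch]:
    """Return (kpi, excel_value, warehouse_value) for every KPI that differs.

    A KPI present on only one side is reported with None for the other side.
    """
    mismatches: List[Mismatch] = []
    for kpi in sorted(set(excel) | set(warehouse)):
        expected = excel.get(kpi)
        actual = warehouse.get(kpi)
        if (
            expected is None
            or actual is None
            or not math.isclose(expected, actual, rel_tol=rel_tol)
        ):
            mismatches.append((kpi, expected, actual))
    return mismatches
```

An empty result means the two sources agree within the tolerance; anything else gives a short list of KPIs to investigate.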

We deployed Airflow to automatically run the scripts to fetch data from two APIs and to run dbt models that transform the data in Snowflake every morning. Data pipelines could now be scheduled and automated to ingest into Snowflake.
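The ordering behind this schedule, two API fetches followed by the dbt run, can be illustrated without Airflow using Python's standard `graphlib` module. This is only a sketch of the dependency logic that an orchestrator resolves; it is not an Airflow DAG, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set

# Each task maps to the set of tasks that must finish before it starts.
# Task names are hypothetical placeholders for the real scripts and models.
DEPENDENCIES: Dict[str, Set[str]] = {
    "fetch_api_a": set(),
    "fetch_api_b": set(),
    "dbt_run": {"fetch_api_a", "fetch_api_b"},
}


def run_pipeline(tasks: Dict[str, Callable[[], None]]) -> List[str]:
    """Run every task in dependency order and return the execution order.

    If a task raises, the exception propagates and downstream tasks
    (here, the dbt run) are never started.
    """
    executed: List[str] = []
    for name in TopologicalSorter(DEPENDENCIES).static_order():
        tasks[name]()
        executed.append(name)
    return executed
```

In a real deployment the orchestrator adds what this sketch leaves out: the daily schedule, retries, logging and alerting.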

The next step was to use dbt to transform the data to make it more analysis-friendly. This included cleaning and structuring tasks, such as creating unique identifiers, removing duplicates, creating data models and aggregated metrics.
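In the project these steps were written as dbt models. The Python sketch below only illustrates the logic of two of them, creating a unique identifier and removing duplicates, under assumptions of our own: the identifier is an MD5 hash of the key fields, the latest record wins when duplicates occur, and the column names are hypothetical.

```python
import hashlib
from typing import Any, Dict, Iterable, List, Sequence

Record = Dict[str, Any]


def surrogate_key(record: Record, key_fields: Sequence[str]) -> str:
    """Build a deterministic identifier by hashing the key fields."""
    raw = "|".join(
        "" if record.get(field) is None else str(record.get(field))
        for field in key_fields
    )
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


def deduplicate(records: Iterable[Record], key_fields: Sequence[str]) -> List[Record]:
    """Keep one record per key (the last one seen) and attach its identifier."""
    unique: Dict[str, Record] = {}
    for record in records:
        key = surrogate_key(record, key_fields)
        # A later record with the same key replaces the earlier one.
        unique[key] = {**record, "record_id": key}
    return list(unique.values())
```

Because the identifier is derived from the data itself, re-running the transformation on the same input produces the same keys, which keeps downstream joins stable.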

Foundation for behavioural analysis

Behavioural analysis requires instant raw data, but the third-party data the client relied on came in different formats. In addition, it would arrive several days later, which created a latency problem.

We helped Moshi to ingest data from their internal database into Snowflake, where available data could be displayed instantly. We further built out the CI/CD pipelines from ingestion to transformation and applied automation. This way, a Looker report would be generated as soon as the data was refreshed, and all end-users would be informed via email once a report was ready.

Building an Excel report replica on Looker

As there were a number of users at Moshi who were used to receiving the reports, we were tasked with creating an exact replica of the original Excel report but in a BI dashboard.

We successfully mirrored the original report format and managed to aggregate all data from multiple Excel sheets into a single Looker dashboard.

The technology we used

The Result

Actionable customer experience insights

Our work enabled the Moshi team to build automated near real-time reporting and broader analytics capability that gives them a better understanding of the customer journey. These insights would help with further product development and new features that would ultimately lead to more children getting a better night’s sleep.

Where it used to take hours for Moshi’s analyst to pull the reports, data was now automatically ingested from multiple first- and third-party data sources, tested and validated. This way, Moshi’s growing analytics team could focus on the important tasks, while the data was far more easily accessible to others within the organisation.

Infinite Lambda helped us accelerate our data projects and achieve our goals in rapid time. Delivering us not only an end to end data pipeline but also detailed business KPI reports. They have provided much-needed capacity and knowledge, and have listened and adapted to our needs. I would highly recommend.

Ian Trayler, CTO at Moshi
