
Dynamic Pipeline Generation on GitLab

Andras Kelle
October 5, 2021

Running a pipeline in a monorepo can be very resource- and time-consuming.

A monorepo (not to be confused with a monolith) is a software development strategy in which a single repository stores the code for more than one project. Google, Facebook, Microsoft and other big tech companies all employ monorepos. At first sight, you might think that this setup has many disadvantages, such as a large number of commits, branches and tracked files, or a lack of access control. However, it also has numerous advantages, including:

  • visibility
  • easier package management
  • consistency

As projects get more complex, the .gitlab-ci.yml file grows in proportion to the content of the repository. Besides the conceptual challenges, numerous performance issues can affect a monorepo setup. Don’t let the pipeline be one of them.


Structure and architecture

Automation makes our life easier, so let’s use it to generate a configuration file and its content for a child pipeline. At an abstract level, a child pipeline is a set of generated jobs that execute as a directed acyclic graph.

A common practice is to use a for loop to iterate over data and save the results in an array. In the following example, dynamic pipeline generation relies on the same idea: after the iteration, each element of the array is used to create one of the jobs defined in the generated pipeline.

N.B. I use Python in the examples to replace variables with values during the child-pipeline generation process but feel free to use any other preferred programming language.

Let’s have a look at the directory structure:

 

libs
└── java
    ├── auth
    ├── payments
    └── subscription

 

Since this is a demonstration project, it needs very little data: just enough to illustrate the concept.

First, separate the generation and the trigger processes into two different jobs in .gitlab-ci.yml. The trigger job must run after the generator because it needs the generated configuration file, which becomes available only after the generation process has succeeded and the file has been saved as an artifact. The trigger job should also use strategy: depend so that the parent pipeline waits for the child pipeline and reflects its status.
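A minimal sketch of such a parent .gitlab-ci.yml could look like the following. The stage names, main.py and child-pipeline-gitlab-ci.yml come from this article; the job name child-pipeline-trigger and the Docker image are assumptions made for illustration.

```yaml
stages:
  - child-pipeline-generator
  - child-pipeline-trigger

generator:
  stage: child-pipeline-generator
  image: python:3.9-alpine  # assumed image; any image with Python 3 works
  script:
    - python3 main.py
  artifacts:
    paths:
      - child-pipeline-gitlab-ci.yml  # generated file handed to the trigger job

child-pipeline-trigger:  # assumed job name
  stage: child-pipeline-trigger
  trigger:
    include:
      - artifact: child-pipeline-gitlab-ci.yml
        job: generator
    strategy: depend  # parent pipeline waits for and mirrors the child status
```

Because the two jobs sit in consecutive stages, the trigger job starts only once the generator has finished and uploaded its artifact.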

 

But what’s happening under the hood?

A brief description: The generator job at the child-pipeline-generator stage calls main.py in its script block. That executes the generator() function, which collects data from get_libs() and creates the child pipeline’s configuration file, child-pipeline-gitlab-ci.yml.

get_libs() iterates through the content of libs/java and returns a list made up of the entries contained in libs_path.
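The flow described above can be sketched roughly as follows. The names generator(), get_libs() and libs_path come from this article; render_job() is a hypothetical stand-in for the real job template, and keeping only directories is an assumption about what counts as a library.

```python
import os

LIBS_PATH = os.path.join("libs", "java")
CHILD_PIPELINE_FILE = "child-pipeline-gitlab-ci.yml"


def get_libs(libs_path=LIBS_PATH):
    """Return the sorted names of the directories directly under libs_path."""
    return sorted(
        entry
        for entry in os.listdir(libs_path)
        # Assumption: only directories are libraries; stray files are skipped.
        if os.path.isdir(os.path.join(libs_path, entry))
    )


def render_job(lib):
    """Hypothetical helper: render one child job with a placeholder script."""
    return (
        f"build-{lib}-lib:\n"
        f"  script:\n"
        f"    - echo \"Building {lib}\"\n"
    )


def generator(libs_path=LIBS_PATH, output_file=CHILD_PIPELINE_FILE):
    """Write one job per library into the child pipeline configuration file."""
    libs = get_libs(libs_path)
    with open(output_file, "w") as handle:
        handle.write("\n".join(render_job(lib) for lib in libs))
    return libs


if __name__ == "__main__":
    generator()
```

With the directory structure shown earlier, this would produce three jobs: build-auth-lib, build-payments-lib and build-subscription-lib.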

 

The PipelineWriter class contains all the configurations and templates that we need to generate the child pipeline, including hidden (parent) jobs and the child pipeline job template.

GitLab supports extends with multi-level inheritance, so the build-{lib}-lib job inherits all the configurations defined in the .basic job.

You can think of these indented multi-line strings returned by each method as different parts of the child pipeline, which can later be put together into a valid configuration file.
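To make this concrete, here is a minimal sketch of what such a class might look like. Only the names PipelineWriter, .basic and build-{lib}-lib come from this article; the intermediate .build job, the method names, the image and the scripts are assumptions chosen to show two levels of extends.

```python
from textwrap import dedent


class PipelineWriter:
    """Holds the YAML templates that make up the child pipeline."""

    @staticmethod
    def basic_job():
        # Hidden job (leading dot): never runs itself, only serves as a base.
        return dedent("""\
            .basic:
              image: alpine:latest
              before_script:
                - echo "Preparing the environment"
            """)

    @staticmethod
    def build_job():
        # Assumed second hidden job; it extends .basic (inheritance level 1).
        return dedent("""\
            .build:
              extends: .basic
              stage: build
            """)

    @staticmethod
    def lib_job(lib):
        # Concrete job per library; extends .build (inheritance level 2),
        # so it also receives everything defined in .basic.
        return dedent(f"""\
            build-{lib}-lib:
              extends: .build
              script:
                - echo "Building {lib}"
            """)

    @classmethod
    def render(cls, libs):
        """Join the stage list, the hidden jobs and one job per library."""
        parts = ["stages:\n  - build\n", cls.basic_job(), cls.build_job()]
        parts.extend(cls.lib_job(lib) for lib in libs)
        return "\n".join(parts)
```

Calling PipelineWriter.render(["auth", "payments", "subscription"]) returns a single string that can be written to child-pipeline-gitlab-ci.yml.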

 

Tip: Validate your generated configuration file with CI Lint before deploying it.
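Besides the CI Lint page in the UI, the check can also be scripted. The sketch below assumes the project-level CI Lint endpoint of the GitLab REST API (POST /projects/:id/ci/lint with a content field) and a personal access token; the function names are hypothetical, and you should confirm the endpoint against the API documentation for your GitLab version.

```python
import json
import urllib.request


def build_lint_request(gitlab_url, project_id, token, config_text):
    """Build (but do not send) a request to the project-level CI Lint endpoint."""
    url = f"{gitlab_url.rstrip('/')}/api/v4/projects/{project_id}/ci/lint"
    body = json.dumps({"content": config_text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "PRIVATE-TOKEN": token},
        method="POST",
    )


def lint(request):
    """Send the request and return (valid, errors) from the JSON response."""
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    # Assumption: the response carries a boolean "valid" and a list "errors".
    return result.get("valid", False), result.get("errors", [])
```

Building the request separately from sending it keeps the first function easy to test without network access.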

 

What it looks like on the UI

Dynamically generated jobs

Conclusion

In this article, I showed an example of how automation makes our life easier and increases efficiency by relying on GitLab’s child-pipeline processes. I used Python to dynamically generate a configuration file for the child pipeline and, in my opinion, this is a great way to create a more organised YAML file with less repetition.

To optimise further, you can check beforehand which services, libs, packages or directories have changed between two builds and generate a pipeline for only those, reducing build time and resource usage.
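One possible way to approach this, sketched under the assumption that the libraries live in libs/java and that comparing two commits with git diff --name-only is good enough for your workflow (both function names are hypothetical):

```python
import subprocess
from pathlib import PurePosixPath


def changed_files(base_ref, head_ref="HEAD"):
    """List files that differ between two commits, via git diff --name-only."""
    output = subprocess.run(
        ["git", "diff", "--name-only", base_ref, head_ref],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in output.splitlines() if line]


def changed_libs(paths, libs_root="libs/java"):
    """Map changed file paths to the sorted names of the libraries they touch."""
    root_parts = PurePosixPath(libs_root).parts
    libs = set()
    for path in paths:
        parts = PurePosixPath(path).parts
        # A file belongs to a library only if it lies below libs_root/<lib>/.
        if len(parts) > len(root_parts) + 1 and parts[:len(root_parts)] == root_parts:
            libs.add(parts[len(root_parts)])
    return sorted(libs)
```

The result of changed_libs(changed_files(base_ref)) could then replace the full library list when generating the child pipeline, so that only the affected libraries get a job.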


 
