- authors: the MetaboHUB consortium
- creation date: 2023-03-02
- target audience: MetaboHUB’s code developers (all WPs)
- scope: MetaboHUB2 – WP5 – T2 - Infrastructure
- main goal: Best practices and recommendations for CI/CD pipelines
- Warning: this document focuses on GitLab CI/CD advice and recommendations, not on CI/CD techniques and instructions. For those, please refer to the official GitLab CI/CD documentation.
This document contains a list of recommendations and best practices for designing GitLab CI/CD pipelines.
The GitLab forge hosts a simple and powerful CI/CD environment.
A simple YAML-formatted file named `.gitlab-ci.yml`, placed at the root of a Git repository, contains all CI/CD instructions and actions.
Although it is possible, we do not recommend renaming this file unless your project is hosted on several GitLab servers with different available runners or third-party resources.
CI/CD should be used to:
- check that the code compiles successfully (for Java / C / C++ projects)
- validate that every library / dependency is available on a "fresh-install" computer or in an environment built from scratch
- run all unit tests and all e2e tests (if applicable; best practice, highly recommended)
- get a test-coverage report (optional; best practice)
- analyze your code with a dedicated tool (e.g.: SonarQube) to detect potential security issues, bugs, bad practices, ... (if applicable; highly recommended)
- generate archives from functional code / releases and host them on GitLab using the artifacts and/or package mechanisms (optional)
Note: we recommend keeping release artifacts without time restriction. Artifacts generated by development branches should be kept for 24 hours or 1 week. Artifacts generated by the master branch or by tags should be kept forever. See "Figure 1 – artifacts management".

    ## -----------------
    ## ARTIFACTS

    # build artifact for dev. branches
    jar-tmp:
      stage: artifacts
      tags:
        - docker
      except:
        refs:
          - master
          - tags
      script:
        - test -d artifacts || mkdir -p artifacts
        - cp target/*.jar artifacts/
      artifacts:
        name: microservice-xxx-boot-$CI_COMMIT_REF_NAME
        expire_in: 1 day # <= instruction to keep the artifact 24h
        paths:
          - $CI_PROJECT_DIR/artifacts

    # build artifact for stable releases
    jar-lts:
      stage: artifacts
      tags:
        - docker
      only:
        refs:
          - master
          - tags
      script:
        - test -d artifacts || mkdir -p artifacts
        - cp target/*.jar artifacts/
      artifacts:
        name: microservice-xxx-boot-$CI_COMMIT_REF_NAME
        expire_in: never # <= without this, the default expiration set on the GitLab instance applies
        paths:
          - $CI_PROJECT_DIR/artifacts

Figure 1 - Instructions for artifacts management in the `.gitlab-ci.yml` file.
Jobs should run in Docker images to avoid uncontrolled dependency installations on your runner server.
You have two choices:
- use reference Docker images (e.g.: `maven:3.6-jdk-11`, `node:16.15.0-slim`, …)
- create your own Docker image (if your code requires specific third-party system libraries):
  - write a Dockerfile at the root of your project, and build it at the first step of your CI/CD pipeline
  - or host your Docker image on Quay.io or on DockerHub (or any accessible Docker registry)
Note: if your project uses a Docker image in production, consider using it for your tests.
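As a rough sketch of the "own Docker image" option, the fragment below builds the project's Dockerfile in a first stage, pushes the image to the project's GitLab container registry and reuses it in a later test job. It assumes that the runner tagged `docker` allows Docker-in-Docker (privileged mode) and that the container registry is enabled for the project; the job names, the stage names, the `ci` image name and the `mvn test` command are illustrative placeholders.

```yaml
stages:
  - docker
  - test

# build the project's own image first
# (assumption: the runner allows Docker-in-Docker)
build image:
  stage: docker
  tags:
    - docker
  image: docker:24
  services:
    - docker:24-dind
  variables:
    DOCKER_TLS_CERTDIR: "/certs"
  script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker build -t "$CI_REGISTRY_IMAGE/ci:$CI_COMMIT_REF_SLUG" .
    - docker push "$CI_REGISTRY_IMAGE/ci:$CI_COMMIT_REF_SLUG"

# run the tests inside the image built by the previous stage
unit tests:
  stage: test
  tags:
    - docker
  image: "$CI_REGISTRY_IMAGE/ci:$CI_COMMIT_REF_SLUG"
  script:
    - mvn test   # placeholder for the project's test command
```

If the image rarely changes, building it once and hosting it on a registry avoids paying the build time on every pipeline.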
The cache mechanism is used to keep files between different jobs or stages. It can be used, for example, to keep:
- the `target` directory for Maven projects
- the `node_modules` directory for Node projects
- big files (see the matching section below)
Warning: if your pipeline has more than one job in the same stage and these jobs use the same cached file(s), random bugs may occur.
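One possible way to limit the conflicts mentioned in the warning is to let a single job update the cache and have the other jobs only read it, using the `policy` keyword. The sketch below assumes a Maven project; the job names, stage names and commands are illustrative.

```yaml
# global cache definition, shared by all jobs of the same branch
cache:
  key: ${CI_COMMIT_REF_SLUG}
  paths:
    - target/

# default policy (pull-push): this job downloads the cache, then updates it
build:
  stage: build
  script:
    - mvn compile

# read-only access: this job downloads the cache but never uploads it
unit tests:
  stage: test
  cache:
    key: ${CI_COMMIT_REF_SLUG}
    paths:
      - target/
    policy: pull
  script:
    - mvn test
```

A job-level `cache` section replaces the global one, which is why the key and paths are repeated in the read-only job.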
Jobs run on a specific runner server. If a test run in a job requires a web-hosted resource (e.g., a URL or a REST web service), check that the runner server's firewall allows access to it. All URLs must use secure protocols (HTTPS or SFTP). If you call these resources during your unit tests, you should mock them instead of calling a live version, so that you test only your code and not the third-party web service. However, it is important to periodically check all third-party resources and web services to verify:
- that they are still up,
- that their path has not changed (e.g., a 302 redirect status code),
- and that their response schema has not changed (HTML, XML or JSON).
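The periodic check described above could be sketched as a job that runs only in scheduled pipelines. In this example `THIRD_PARTY_URL` is a hypothetical CI/CD variable holding the address to monitor, and the schedule itself is assumed to be created in the GitLab interface; the schema check is left out because it depends on each web service.

```yaml
check third-party resources:
  stage: test
  tags:
    - docker
  image: alpine:3.19
  only:
    - schedules   # run only in scheduled pipelines
  before_script:
    - apk add --no-cache curl
  script:
    # fail unless the resource answers with a plain 200:
    # redirects are not followed, so a moved resource (301 / 302) also fails the job
    - test "$(curl --silent --output /dev/null --write-out '%{http_code}' "$THIRD_PARTY_URL")" = "200"
```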
If a job runs tests that require database transactions, you should strongly consider using the GitLab `services` job keyword. If your project can use different DBMSs (e.g.: both MySQL and PostgreSQL), consider testing all scenarios. This can be done with parallel jobs in the same stage (but code analysis and coverage reporting should be performed in only one test job).
Large static files (e.g.: raw data used during unit tests) should not be hosted in your Git history. You can create a specific job to download those files and keep them in a cached path. See "Figure 2 – job to download non-git managed resources".
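A minimal sketch of such parallel database jobs is given below. It assumes a Maven project that reads its JDBC URL from a hypothetical `db.url` property; the database names, passwords and coverage regular expression are examples to adapt to your project and coverage tool.

```yaml
# shared definition for the database test jobs (hidden job, never run directly)
.db_tests:
  stage: test
  tags:
    - docker
  image: maven:3.6-jdk-11
  script:
    - mvn test -Ddb.url="$DB_URL"   # db.url is a hypothetical project property

test mysql:
  extends: .db_tests
  services:
    - mysql:8.0
  variables:
    MYSQL_DATABASE: test_db
    MYSQL_ROOT_PASSWORD: ci_only_password
    DB_URL: "jdbc:mysql://mysql:3306/test_db"
  # coverage is collected in this job only
  coverage: '/Total.*?([0-9]{1,3})%/'

test postgresql:
  extends: .db_tests
  services:
    - postgres:15
  variables:
    POSTGRES_DB: test_db
    POSTGRES_USER: ci_user
    POSTGRES_PASSWORD: ci_only_password
    DB_URL: "jdbc:postgresql://postgres:5432/test_db"
```

Both jobs belong to the same stage, so they run in parallel when enough runners are available.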

    # CI/CD variables
    variables:
      LIB_PATH: "src/main/webapp/WEB-INF/lib/"
      WGET_URL: "https://nextcloud.inrae.fr/s/xxxxxxxxx/download?files"
      WGET_PASSWORD: "********" # <= should be kept in a GitLab 'CI/CD Variables' entry, with the 'masked' option

    # cache
    cache:
      key: ${CI_COMMIT_REF_SLUG}
      paths:
        - src/main/webapp/WEB-INF/lib/
        - target/

    # [...] define stages, other jobs, ...

    # main jobs: build and test
    get libs:
      stage: get_libs
      tags:
        - docker
      image: cirrusci/wget
      script:
        - echo "[info] fetching non-maven lib. from NextCloud"
        # download the library only if it is not already in the cache
        - test -f "$LIB_PATH/converter.jar" || wget --password "$WGET_PASSWORD" "$WGET_URL=converter.jar" -O "$LIB_PATH/converter.jar"

Figure 2 - job to download non-git managed resources
Sensitive data (e.g., passwords) should not be written in any Git-versioned file (whether "properties", "ini" or ".gitlab-ci.yml" files).
You can store your project's sensitive data in a non-versioned file (listed in your `.gitignore`) and/or use GitLab CI/CD variables (of type "File" or "Variable").
Notes:
- CI/CD variables can be masked (hidden in job logs) and/or protected (available only to pipelines running on protected branches or tags).
- CI/CD variables can be defined at group level (and are then available to all projects in that group).
- secrets in configuration files can be overwritten during a CI/CD job with a `sed` command (using a GitLab variable) in a `before_script` section.
- only users with owner, maintainer or administrator rights on the project can view and edit these variables.
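The `sed` approach mentioned in the notes could look like the sketch below. It assumes that the versioned configuration file contains a placeholder `@DB_PASSWORD@` and that `DB_PASSWORD` is defined as a masked CI/CD variable; both names and the file path are hypothetical.

```yaml
unit tests:
  stage: test
  tags:
    - docker
  image: maven:3.6-jdk-11
  before_script:
    # replace the placeholder committed in the versioned file with the real value
    - sed -i "s|@DB_PASSWORD@|${DB_PASSWORD}|g" src/main/resources/application.properties
  script:
    - mvn test
```

The `|` delimiter is used so that passwords containing `/` do not break the expression; a value containing `|` or `&` would still need escaping.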
Generated binaries (C / C++ projects), JAR/WAR files (Java projects) and PDFs (LaTeX projects) can be kept on the GitLab server and made available for download using the artifacts instructions. Keep only release-generated resources and binaries with no time restriction. Maven, npm and other generated libraries can easily be hosted in the GitLab package registry.
GitLab lets you deploy and/or publish applications. For example, a web application project can be published on your development server using a final CI/CD job. The job's main command can be a simple copy command via SSH or a managed command with docker commands executed remotely via SSH (stop, rm, run, etc.).
Note: to run SCP or SSH commands, you must keep an SSH key in a hidden GitLab variable.
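A deployment job based on SCP could be sketched as follows. The variables `SSH_PRIVATE_KEY`, `SSH_KNOWN_HOSTS`, `DEPLOY_USER` and `DEPLOY_HOST` are hypothetical CI/CD variables defined in the project settings, and the target path is a placeholder.

```yaml
deploy dev:
  stage: deploy
  tags:
    - docker
  image: alpine:3.19
  only:
    refs:
      - master
  before_script:
    - apk add --no-cache openssh-client
    # start an SSH agent and load the private key stored in the CI/CD variable
    - eval $(ssh-agent -s)
    - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add -
    # register the server's public host key to avoid interactive prompts
    - mkdir -p ~/.ssh && chmod 700 ~/.ssh
    - echo "$SSH_KNOWN_HOSTS" > ~/.ssh/known_hosts
  script:
    - scp target/*.war "$DEPLOY_USER@$DEPLOY_HOST:/path/on/server/"
```

Declaring the key as a protected variable limits its exposure to pipelines running on protected branches or tags.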
A pipeline job can trigger the CI/CD of another GitLab repository on a specific branch. It can be used either for the "Meta pipeline" (see specific section) or to call the CI/CD of another project during the current pipeline. For example, a project which implements a library generated by OpenAPI can trigger the CI/CD of the corresponding repository, which will rebuild and/or regenerate the latest version of the binaries before the current CI/CD build job.
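The OpenAPI example could be sketched with the `trigger` keyword as shown below. The project path `my-group/openapi-library` is hypothetical, and the `get_libs` and `build` stages are assumed to be declared in that order in the `stages` section.

```yaml
# trigger the pipeline of the library repository and wait for its result
rebuild openapi library:
  stage: get_libs
  trigger:
    project: my-group/openapi-library   # hypothetical project path
    branch: master
    strategy: depend   # this job succeeds or fails with the downstream pipeline

# runs in a later stage, once the library has been rebuilt
build:
  stage: build
  tags:
    - docker
  image: maven:3.6-jdk-11
  script:
    - mvn package
```

Trigger jobs accept only a small set of keywords, so they define neither `script` nor `tags`.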
If several jobs share some characteristics, you can define a template that these jobs reuse via the `extends` keyword. Each template name starts with a dot.
Example:

    .prepare_step:
      before_script:
        - echo 'prepare'

    build:
      extends:
        - .prepare_step
      script:
        - echo 'build'

    test:
      extends:
        - .prepare_step
      script:
        - echo 'test'

More details are available in the official GitLab documentation.
You can import portions of CI/CD configuration from other projects.
Example:

    include:
      - project: 'commonProject'
        file: '/path/to/the/file/.gitlab-ci.yml'

This makes it possible to define common jobs and templates in a single repository.
More details are available in the official GitLab documentation.
We recommend that you follow this global model:
- Variables section - define all non-sensitive variables specific to your CI/CD repository first. Consider adding sensitive variables to the project's GitLab configuration section and variables shared by multiple projects to a group-level GitLab configuration section.
- Cache section - define the paths and files kept / shared between jobs in different stages.
- Stages section - describe the CI/CD stages, used to parallelize jobs and set their launch order.
- Jobs section - list of all CI/CD jobs. You can follow the order of the job stages to improve the readability of the document. You should also use a standard template for all your jobs.
For a job, consider using this template (mandatory instructions are in bold):
- **stage** - define the job stage
- `needs` - define the jobs that must complete first; this improves the parallelization of jobs
- **tags** - define the job runners
- `image` (mandatory for Docker runners) - define the job Docker image
- `services` - define the job services (e.g.: use a MySQL database)
- `only` / `except` - configure whether a job runs only for specific branches or tags (or is disallowed for specific branches or tags)
- `before_script` - perform operations before executing the main job, for example: update a password in a configuration file with a GitLab hidden variable
- **script** - instructions for the main job commands (for example: build, test, deploy, ...)
- `after_script` - perform operations after executing the main job, for example: copy a generated binary into an "artifact" directory
- `coverage` (for test jobs) - catch the test coverage report using a regular expression
- `artifacts` - store the targeted files in an archive and host them in GitLab (with or without a time limit)
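Put together, a job following this template might look like the sketch below. It assumes a Maven project, a previous job named `build`, and masked CI/CD variables `DB_PASSWORD` and `MYSQL_ROOT_PASSWORD`; the placeholder, the file path and the coverage regular expression are illustrative.

```yaml
unit tests:
  stage: test
  needs:
    - build                  # assumes a job named "build" exists
  tags:
    - docker
  image: maven:3.6-jdk-11
  services:
    - mysql:8.0              # expects MYSQL_ROOT_PASSWORD, e.g. as a CI/CD variable
  except:
    refs:
      - tags
  before_script:
    - sed -i "s|@DB_PASSWORD@|${DB_PASSWORD}|g" src/main/resources/application.properties
  script:
    - mvn verify
  after_script:
    - mkdir -p artifacts && cp target/*.jar artifacts/
  coverage: '/Total.*?([0-9]{1,3})%/'
  artifacts:
    expire_in: 1 week
    paths:
      - artifacts/
```

Keeping the keywords in the same order in every job makes long `.gitlab-ci.yml` files easier to scan.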
Please consider using these steps in the following order:
- Obtain third-party files (download web-hosted libraries, non-git-versioned files, etc.)
- Build the docker image (if applicable)
- Build and/or compile project (if applicable)
- Run tests (if applicable – also generate code analysis, coverage report, javadoc, …)
- (optional) Run job to publish Javadoc and/or coverage report on a web-server
- Run jobs to:
  - generate artifacts – keep files generated during build or test stages
  - publish libraries on a registry – public and/or private
  - deploy on a development or test server
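The order above could translate into a `stages` section such as the following sketch. The stage names are only suggestions (`get_libs` and `artifacts` match the figures of this document, the others are hypothetical); stages that do not apply to a project can simply be left out.

```yaml
stages:
  - get_libs       # obtain third-party files
  - docker         # build the Docker image (if applicable)
  - build          # build and/or compile the project (if applicable)
  - test           # tests, code analysis, coverage report, javadoc
  - publish_docs   # (optional) publish javadoc and/or coverage report
  - artifacts      # generate artifacts
  - publish        # publish libraries on a registry
  - deploy         # deploy on a development or test server
```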
If your development project uses several git repositories, consider running a daily or weekly pipeline to test all CI/CD projects in a specific order to check the consistency of your overall project. Refer to the "trigger pipeline" documentation for more help.
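Such a meta pipeline could be sketched in a dedicated repository as shown below. The project paths and stage names are hypothetical, and the daily or weekly schedule is assumed to be configured in the GitLab interface of that repository.

```yaml
stages:
  - libraries
  - services

# shared settings: run only in scheduled pipelines
.scheduled:
  only:
    - schedules

# first, the repositories the others depend on
library a:
  extends: .scheduled
  stage: libraries
  trigger:
    project: my-group/library-a   # hypothetical project path
    strategy: depend

# then the repositories that use them
service b:
  extends: .scheduled
  stage: services
  trigger:
    project: my-group/service-b   # hypothetical project path
    strategy: depend
```

With `strategy: depend`, a failure in a library pipeline stops the meta pipeline before the dependent projects are tested.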