DLRN internals¶
This document aims at describing the inner workings of DLRN, so a new contributor can get up to speed as quickly as possible.
Main concepts¶
The following basic concepts are used in DLRN. You’ll need to get used to them if you want to understand the code:
- Source Git: DLRN will always take the source code to build the package from a
Git repository, regardless of the
Source0
entry in the spec file. - Distgit: spec files are assumed to be present in a Git repository. DLRN has a driver-based mechanism to allow different options for the distgit location, see the Package Info Drivers section.
- Project: each project corresponds with a package to be built. A package may define a number of subpackages in the spec file, but a single source RPM file is always created. The DLRN driver mechanism allows us to have different sources of information for the project list, such as rdoinfo or a single git repo.
- Commit: a commit is the main abstraction used by DLRN. It aggregates all
information related to each package built, such as:
- Project name
- Hash of the commit from the source git
- Hash of the commit from the distgit
- Build status (successful or not)
- Name of the rpms
Directory structure¶
The DLRN codebase is structured as follows:
- doc/: project documentation
- scripts/: several useful scripts, used by CI and DLRN itself, plus some other
miscellaneous files.
- build_rpm.sh: script that calls mock to build rpm files (see the Building packages section).
- submit_review.sh: script to open a Gerrit review after a build failure (see the Error reporting section).
- centos.cfg, centos8.cfg, fedora.cfg and redhat.cfg: base mock
configurations for CentOS 7, CentOS 8, Fedora and RHEL 8 builders. For a RHEL 8
builder, you will have to make sure the appropriate base repos are configured,
since those are not publicly available. These base configurations can be located
in a separate directory, defined by the
configdir
option in projects.ini.
- dlrn/: main DLRN code
- build.py: build functions, described in detail in the Building packages section.
- config.py: general configuration management.
- db.py: database-related code.
- notifications.py: error reporting functions.
- purge.py: dlrn-purge command, used to reduce the disk usage on long-running instances.
- reporting.py: reporting functions, create simple HTML reports of the repository status.
- repositories.py: functions required to clone a git repo and get information from it.
- rpmspecfile.py: basic rpm spec file parsing to be able to get package names and dependencies.
- rsync.py: synchronizes yum repositories between servers, used to have a multi-node architecture.
- shell.py: main file, reads command-line arguments and launches the build process.
- utils.py: miscellaneous utilities.
- api/: DLRN API code, described in detail in its own page.
- drivers/: modular drivers for project listing and distgit location, described in the Package Info Drivers section. We are also including modular drivers for different build methods.
- migrations/: Alembic scripts for database maintenance.
- stylesheets/: contains a CSS file used by the reporting module.
- templates/: contains Jinja2 templates, also used by the reporting module.
- tests/: unit tests.
High level algorithm¶
When DLRN is run, the following (simplified) sequence of events is executed:
fetch information for available projects
for each project
find last processed commit
refresh source git
find last commit in source git and distgit
if any commit is later than last_processed_commit
add commit to list of commits to be processed
for each commit_to_be_processed
build package
create yum repo with built package and the latest versions of every other package
if errors
report via e-mail or Gerrit review if configured
store commit in DB
generate HTML report
Configuration¶
DLRN uses a simple INI file for its configuration. Most config options are
located in the [DEFAULT]
section and read during startup. Only
driver-specific options have their own section.
The dlrn/config.py file defines a ConfigOptions
class, that will create an
object including all parsed options.
Package Info Drivers¶
Package info drivers are derived from the PkgInfoDriver
class
(see dlrn/drivers/pkginfo.py), and are used to:
- Define a list of projects (packages) to be built
- Define the source and distgit repos for each project
- Fetch the new commits for each project’s source and distgit repos
- Pre-process spec files, if needed
Each driver must provide the following methods:
- getpackages(). This method will return a list of dictionaries. Each
individual dict must contain the following mandatory keys (others are
optional):
- ‘name’: package name
- ‘upstream’: URL for source repo
- ‘master-distgit’: URL for distgit repo
- ‘maintainers’: list of e-mail addresses for package maintainers
- getinfo(). This method will return a list of commits to be processed for a specific package.
- preprocess(). This method will run any required pre-processing for the
spec files. If the
custom_preprocess
variable is defined inprojects.ini
, the external program(s) or script(s) defined in the variable will be executed as the last step of the pre-processing. - distgit_dir(). This method will return the distgit repo directory for a given package name.
You can check the code of the existing
rdoinfo driver
and gitrepo driver
to see their implementation specifics. If you create a new driver, you
need to add the project name to the projects.ini
configuration file, and
if you need any new options, be sure to add them to a driver-specific section
(see the Configuration section for details).
Package Build Drivers¶
Package build drivers are derived from the BuildRPMDriver
class
(see dlrn/drivers/buildrpm.py), and are used to perform the actual package
build from an SRPM file.
Each driver must provide the following method:
- build_package This method will take an output directory, where the SRPM is located, and build it using the driver-specific method.
You can check the code of the existing
mock
driver to see its implementation specifics. If you create a new driver, you
need to add the project name to the projects.ini
configuration file, and
if you need any new options, be sure to add them to a driver-specific section
(see the Configuration section for details).
Building packages¶
The package build logic is included in build.py. There we have several functions:
- build(). This is the function called externally. It gathers some
configuration options and parameters, then calls
build_rpm_wrapper
to launch the build process and returns a list with the built rpms. - build_rpm_wrapper(). This wrapper function prepares the mock configuration
file to be used during the build using the configuration. It will also add
the most current repository to the mock configuration, so we can use packages
in the current repository as dependencies during the build. Then, it will
spawn a Bash script,
build_srpm.sh
to build the source RPM, and call the appropriate build driver to generate the binary RPM.
The build_srpm.sh
script takes care of creating the source RPM. Some magic is
required to build it, specifically:
- The script tries to determine a version and release number for the package.
This version number should be compatible with the
Fedora guidelines,
and allow upgrades from and to packages from stable releases, which is
not always easy. We use the following algorithm:
- For Python projects, take the output from
python setup.py --version
. Most OpenStack projects use PBR, which gives us proper pre-versioning after a tagged release. - For Puppet projects, we take the version from the
metadata.json
orModulefile
files, if available, and increase the .Z version if there are any commits after the tagged release. - For other projects, we take the version number from the latest git tag.
- If everything fails, default to version 0.0.1.
- The release number is always 0.<date>.<upstream source commit short hash>.
- For Python projects, take the output from
- A tarball is generated using
python setup.py sdist
for Python projects,gem build
for Ruby gems, and tar for any other project. Then, the spec file is updated to use this tarball asSource0
, and a source RPM is created.
The binary RPM is built from the SRPM using a the build driver specified in
projects.ini
. This can be done using Mock, Copr, Brew, or any other tool,
provided that the required driver is available.
Hashed yum repositories¶
Each build is stored on a separate directory. A hashed structure is used for the
directories, such as cd/af/cdaf2c77d974d5e794909313dceb3554be69a42e_4b1619fe
.
In this structure, cdaf2c77d974d5e794909313dceb3554be69a42e
is the commit hash
for the source git repo, and 4b1619fe
is the short hash for the distgit commit.
The first two directory levels (cd/af
) are taken from the commit hash.
Component support¶
DLRN now supports the concept of components inside a repository. We can use components to divide the packages in a repo into logical aggregations. For example, in the OpenStack use case, we could have separate components for those packages related to networking, compute, storage, etc.
Currently, only the RdoInfoDriver
and DownstreamInfoDriver
package info
drivers supports this. When components are defined, and enabled with the
use_components=True
option in projects.ini
, DLRN will change its behavior
in the following ways:
- Hashed yum repositories will change their paths, including a component part. For
example, a commit for a package in the compute component will use hash
component/compute/cd/af/cdaf2c77d974d5e794909313dceb3554be69a42e_4b1619fe
. - Each component will have a separate repository (
component/compute
,component/network``and so on), and the ``current
andconsistent
symlinks will also be relative to each component. - To preserve compatibility with instances without component support, the top-level
current
andconsistent
symlinks will be replaced by acurrent
andconsistent
directory. Each directory will contain a single .repo file, and that file will aggregate the .repo files for the current/consistent repositories of all components.
Post-build actions¶
After a package is built, we need to create a package repository with the latest
version for every package in the project list. The post_build()
function in
shell.py
takes care of that. The idea behind this is that the repo for each
build will contain the most current version of each package to date. This
behavior can be skipped if the --no-repo
command-line option is provided, so
only the build package and logs will be stored.
To minimize the amount of storage used for each repo, DLRN does not copy the
packages to the current hashed directory. Instead, post_build()
iterates
through the list of packages, finding the RPMs for their latest successful
builds, and symlinks them in the current hashed directory.
It is probably easier to understand with an example:
Initially, we only have source commit 010b0a and distgit commit 020202 for project foo, then its hashed repo will look like:
01/0b/010b0a_020202/foo-<version>.el7.centos.noarch.rpm
Then, we build project bar, with source commit 030303 and distgit commit 040404. Its hashed repo will be:
03/03/030303_040404/bar-<version>.el7.centos.noarch.rpm 03/03/030303_040404/foo-<version>.el7.centos.noarch.rpm -> ../../../01/0b/010b0a_020202/foo-<version>.el7.centos.noarch.rpm
And the same process will be followed for every new package.
Error reporting¶
DLRN allows two different ways to notify build errors, both included in notifications.py:
- A notification e-mail, sent using the
sendnotifymail()
function. The mail recipient list is taken from themaintainers
project property. - A Gerrit review. This option makes use of a utility script
submit_review.sh
and the configured options in options.ini to create the review. It also adds the project maintainers to the generated review.