Ebuild Functions

From Funtoo

You may find it hard to believe, but what now exists as Portage used to only be a bash shell script of about 100 lines called "ebuild" -- view it here. Ebuild functions, which are still a core component of ebuilds, existed then, and still exist today, to define various sets of steps to build and install a particular piece of software from source code on to a Gentoo or Funtoo Linux system.

The first ebuild script worked similarly to the way modern Portage does -- the contents of a ".ebuild" file (written by a user or developer) would be sourced. This ebuild file defined various bash shell functions. Also contained in a ".ebuild" file were variable definitions, which are covered on a separate wiki page, Portage Variables.

Then, the ebuild script would execute these functions in a particular order to perform certain tasks, such as unpack the source tarball, configure and compile the source code, and copy the results into a temporary image directory so the files can then be put into a .tbz2 package or copied directly to the live filesystem.

When defined in an ebuild file, bash shell functions look like this:

    (bash source code) - Example bash shell function
function_name() {
    commands to run when function_name is called.
    another line of commands.
}

Once defined, the commands inside function_name can be called in a script by simply typing:

    (bash source code)
function_name

In Portage, it's ebuild itself that actually calls these various functions. As an ebuild writer, you simply define them.

Function Walkthrough

Now, let's walk through each ebuild function that is called by ebuild, in the order that they are called, and see what they do:

  • pkg_setup - variable initialization and sanity checks
  • src_unpack - unpack sources
  • src_prepare - prepare sources by patching them or tweaking them as needed
  • src_configure - run autoconf-related things to configure them for compiling
  • src_compile - actually compile the sources
  • src_install - install all the files we want to end up in a package or on the filesystem into a temporary image directory. Portage handles .tbz2 creation or installing to a filesystem for us, using what we install to the image directory.

After src_install, the files are ready to be "merged" into the live filesystem. This is when they are copied from the temporary image directory into /usr, etc.

You may think that we have covered the whole range of ebuild functions by now, but you'd be wrong! There are still other ebuild functions that are executed, which begin with the pkg_ prefix, and these functions are designed to run commands on your local system. Typically, they are used to run commands right before files are merged, and right after, as follows:

  • pkg_preinst - before merging files, are there any local commands or configuration we need to do? If so, we do it here.
  • (files are merged)
  • pkg_postinst - after merging files, are there any local commands or configuration we need to do? If so, we do it here.

Like many of the src_ functions, the pkg_ functions do not always need to be defined and used. But some ebuilds need to make use of them.
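The calling order described above can be illustrated with a small stand-alone bash sketch. This is not Portage's actual code: the run_phases driver and the PHASE_LOG variable are hypothetical names, and the merge step is only a placeholder. The sketch simply skips phases that the ebuild did not define, whereas real Portage supplies defaults for some of them.

```bash
#!/usr/bin/env bash
# Stand-alone sketch; run_phases and PHASE_LOG are hypothetical names,
# not part of Portage.

PHASE_LOG=""

# Functions an ebuild author might define. Only some phases are present.
src_unpack()   { PHASE_LOG+="src_unpack "; }
src_compile()  { PHASE_LOG+="src_compile "; }
src_install()  { PHASE_LOG+="src_install "; }
pkg_postinst() { PHASE_LOG+="pkg_postinst "; }

# Call each phase in the documented order, but only if it is defined.
run_phases() {
    local phase
    for phase in pkg_setup src_unpack src_prepare src_configure \
                 src_compile src_install pkg_preinst MERGE pkg_postinst; do
        if [[ ${phase} == MERGE ]]; then
            PHASE_LOG+="(merge) "      # placeholder for the real file merge
        elif declare -F "${phase}" >/dev/null; then
            "${phase}"
        fi
    done
}

run_phases
echo "${PHASE_LOG}"
```

The declare -F test is what lets the ebuild author leave out any phase that is not needed.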

A Word About EAPIs

As the Portage package manager was developed and grew from 100 lines of code into the monster it now is, lots and lots of features were added. At some point, the number of new features being added made it difficult to write ebuilds that worked with all versions of Portage, since some features changed the default behavior of Portage slightly. For this reason, the concept of EAPI, or "ebuild API", was adopted. Typically a numerical value, the EAPI setting in an ebuild denotes what "version" of the ebuild spec is targeted by a particular ebuild. Modern versions of Portage can handle ebuilds written for various EAPIs by looking at the EAPI variable at the top of the ebuild and using this to determine what features should be enabled when executing functions in that ebuild. You will see references to EAPI below, because the default behavior of some of the ebuild functions changes based on EAPI, and some functions were introduced with a particular EAPI.
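To make the idea of EAPI-dependent behavior concrete, here is a minimal sketch of how a tool could decide whether a phase function is honoured for a given EAPI. It uses only the facts stated on this page (src_prepare and src_configure exist from EAPI 2 onward). The phase_available helper is an invented name, not a Portage function, and the sketch assumes a purely numeric EAPI.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not a Portage function: report whether a phase
# function is honoured for a given numeric EAPI.
phase_available() {
    local eapi=$1 phase=$2
    case ${phase} in
        src_prepare|src_configure)
            # Introduced with EAPI 2.
            (( eapi >= 2 )) ;;
        *)
            # Everything else is treated as always available in this sketch.
            return 0 ;;
    esac
}

# An ebuild declares its EAPI with a plain variable assignment at the top.
EAPI=1
if phase_available "${EAPI}" src_prepare; then
    echo "EAPI ${EAPI}: src_prepare runs"
else
    echo "EAPI ${EAPI}: patch inside src_unpack instead"
fi
```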

src_* functions

Below, I will describe in a high level of technical detail each src_ function that exists in Portage.

Ebuild functions starting with src_ are all related to creating the ebuild or package from source code or their original artifacts (which could be binaries for binary ebuilds.)

I will also reference Portage Variables in these descriptions. Please see the Portage Variables page to familiarize yourself with them as needed.

src_unpack

src_unpack is intended to be used to unpack the source code or artifacts that will be used by the other src_* functions. With EAPI 1 and earlier, it is also used for patching/modifying the sources to prepare them for building, but with EAPI 2 or later the src_prepare function should be used for this instead. When src_unpack starts, the current working directory is set to $WORKDIR, which is the directory within which all source code should be unpacked. Note that the variable $A is set to the names of all the unique source files specified in SRC_URI, and they will all be available in $DISTDIR by the time src_unpack starts. Also note that if no src_unpack function is specified, ebuild.sh will execute the following function for src_unpack by default:

src_unpack() {
  unpack ${A}
}

src_prepare

EAPI 2 and above support the src_prepare function, which is intended to be used for applying patches or making other modifications to the source code. When src_prepare starts, the current working directory is set to $S.
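As an illustration, a src_prepare that tweaks a file in $S might look like the sketch below. It is written to run outside Portage, so it creates $S itself; the Makefile contents and the /usr/local tweak are made-up examples, and a real ebuild would normally append || die to commands that can fail.

```bash
#!/usr/bin/env bash
# Stand-alone sketch: outside Portage we must create $S ourselves.
# The Makefile contents and the /usr/local tweak are made-up examples.
S=$(mktemp -d)
printf 'PREFIX = /usr/local\n' > "${S}/Makefile"

src_prepare() {
    # Point the hard-coded install prefix at /usr instead of /usr/local.
    sed -e 's|^PREFIX = /usr/local$|PREFIX = /usr|' Makefile > Makefile.new \
        && mv Makefile.new Makefile
}

# Portage enters $S before calling src_prepare; emulate that here.
( cd "${S}" && src_prepare )
cat "${S}/Makefile"
```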

src_configure

EAPI 2 and above support the src_configure function, which is used to configure the source code prior to compilation. With EAPI 2 and above, the following default src_configure is defined if none is specified:

src_configure() {
	if [[ -x ${ECONF_SOURCE:-.}/configure ]] ; then
		econf
	fi
}

src_compile

This function defines the steps necessary to compile source code. With EAPI 1 and earlier, this function is also used to configure the source code prior to compilation. However, starting with EAPI 2, the src_configure function must be used for configuration steps instead of bundling them inside src_compile. In addition, starting with EAPI 2, there is now a default src_compile function that will be executed if none is defined in the ebuild:

src_compile() {
	if [ -f Makefile ] || [ -f GNUmakefile ] || [ -f makefile ] ; then
		emake || die "emake failed"
	fi
}

src_test

src_test is an interesting function - by default, an end-user's Portage does not have tests enabled. But if a user has test in FEATURES, or EBUILD_FORCE_TEST is defined, then ebuild.sh will attempt to run a test suite for this ebuild, by executing make check or make test if these targets are defined in the Makefile; otherwise, no tests will execute. If your Makefile supports make check or make test but the test suite is broken, then specify RESTRICT="test" in your ebuild to disable the test suite.

src_install

src_install is used by the ebuild writer to install all to-be-installed files to the $D directory, which can be treated like an empty root filesystem, in that ${D}/usr is the equivalent of the /usr directory, etc. When src_install runs, the Portage sandbox will be enabled, which will prevent any processes from creating or modifying files outside of the ${D} filesystem tree, and a sandbox violation will occur (resulting in the termination of the ebuild) if this is attempted. Once src_install has performed all necessary steps to install all to-be-installed files to $D, Portage will take care of merging these files to the filesystem specified by the $ROOT environment variable, which defaults to / if not set. When Portage merges these files, it will also record information about the installed package to /var/db/pkg/(cat)/$PF. Typically, a src_install function such as this is sufficient for ensuring that all to-be-installed files are installed to $D:

src_install() {
  make DESTDIR="$D" install
}
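When a package has no usable install target, files can be placed into $D by hand. The sketch below shows the "empty root filesystem" idea in a form that runs outside Portage: D, WORKDIR and the hello script are stand-ins created just for the example. Real ebuilds would typically use Portage's install helpers (such as dobin) rather than plain cp.

```bash
#!/usr/bin/env bash
# Stand-alone sketch: D, WORKDIR and the "hello" script are stand-ins so
# the function can run outside Portage.
D=$(mktemp -d)
WORKDIR=$(mktemp -d)
printf '#!/bin/sh\necho hello\n' > "${WORKDIR}/hello"

src_install() {
    # Treat ${D} as an empty root: ${D}/usr/bin becomes /usr/bin after merge.
    mkdir -p "${D}/usr/bin" || return 1
    cp "${WORKDIR}/hello" "${D}/usr/bin/hello" || return 1
    chmod 0755 "${D}/usr/bin/hello"
}

src_install
find "${D}" -type f
```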

pkg_* functions

An ebuild's functions starting with pkg_* take a wider view of the package lifecycle, and may be executed very early or very late in the build or package installation process. They are also all executed even if installing a Portage binary package, so are the intended place for defining any global configuration changes that are also required during binary package installation, such as user and group creation. When these functions are executed, the $ROOT variable will be defined to point to the target root filesystem to which the package is to be (or has been) installed. All logic inside pkg_* functions must function properly even if $ROOT is something other than /.

pkg_setup

The pkg_setup function is unusual in that it runs prior to any src_* function, and also runs prior to any other pkg_* function that runs when a binary package is installed, so it provides a useful place for the ebuild writer to perform any sanity checks, global configuration changes to the system (such as user/group creation) or set any internal global variables that are used by the rest of the ebuild. Using this function for defining global variables that are needed in multiple other functions is a useful way of avoiding duplicate code. You should also look to pkg_setup as the ideal place to put any logic that would otherwise linger in the main body of the ebuild, which should be avoided at all costs as it will slow down dependency calculation by Portage. Also remember that Portage can build binary packages, and this function is a good place to execute any steps that are required to run both prior to building an ebuild, and prior to installing a package. Also consider using pkg_preinst and pkg_postinst for this purpose.
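The following sketch shows the "define once in pkg_setup, reuse later" pattern. MY_CONF_DIR and the paths are hypothetical, and PN, D and ROOT are given example values here only because the script runs outside Portage.

```bash
#!/usr/bin/env bash
# Stand-alone sketch; MY_CONF_DIR and the paths are hypothetical.

pkg_setup() {
    # Compute once, reuse in later phases.
    MY_CONF_DIR="/etc/${PN}"
}

src_install() {
    echo "would install config files to ${D}${MY_CONF_DIR}"
}

pkg_postinst() {
    echo "edit the files in ${ROOT}${MY_CONF_DIR} before starting ${PN}"
}

# Example values; Portage would normally provide these.
PN="foo" D="/tmp/image" ROOT=""
pkg_setup
src_install
pkg_postinst
```

Because the assignment lives inside a function, it costs nothing when Portage merely sources the ebuild to calculate dependencies.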

pkg_pretend

   Note

If you aren't interested in the gory details of pkg_pretend, it's safe to skip to the next ebuild function. This ebuild function is not very useful, and is more interesting as a case study of how Portage features were sometimes added without sufficient technical review.

The pkg_pretend function was added with EAPI 4, and the Gentoo development manual says it is intended to "run sanity checks for a package during dependency calculation time". I'll explain in this section how this function was poorly implemented and cannot perform its intended task of performing sanity checks at dep calculation time, and what pkg_pretend can be reliably used for, which is -- unfortunately -- very little.

Here are the various problems with the implementation of this function that prevent it from being useful:

  • Tests may fail due to yet-to-be-merged dependencies. Therefore, the pkg_pretend function can't be used for any tests that could potentially be satisfied by a build or runtime dependency.
  • Further limiting its usefulness, pkg_pretend runs unconditionally -- if the package is to be built from source and installed, if the package is to be installed from a binary package, or even if the package is to be built from source and not installed on the system. All of these situations result in pkg_pretend being evaluated at dep calculation time, and many tests only apply in certain scenarios. A workaround for this is to use the MERGE_TYPE variable (see below) to differentiate between each of these situations. This makes it very hard to use correctly.
  • Also consider that in the case of a binary package, the package may be installed in a test/QA environment, or on a build server, that does not reflect the system on which the package will ultimately run.
  • pkg_pretend also injects shell scripts into the dependency calculation, further slowing it down. This almost seems like a cruel joke. In Portage's actual implementation, however, these checks are performed immediately after dependencies are successfully resolved, which isn't as horrible as running them as part of dependency calculation itself.

When/How to Use It

One may be tempted to place tests for kernel features in pkg_pretend, and abort if certain features are not present. This is also ill-advised. These tests should only print recommendations to the user, as they may not be accurate. Here's why. linux-info.eclass will use /proc/config.gz to check for kernel features, if it exists. If it does not exist, it falls back to /usr/src/linux/.config. This file may not exist yet -- it may be in the dependency list to be merged, and even if it does exist at the time pkg_pretend is run, it may not accurately reflect the runtime environment of the final package, in the case of a package being built for QA or for being hosted on a binary package server. As you can see, any kernel checks are not guaranteed to be accurate or relevant in all cases, so cannot provide any true sanity checks that would cause the merge to abort early.

So while pkg_pretend could be marginally useful for a typical use case of a Gentoo or Funtoo Linux user installing a package from source, you can see that there are many situations where it could cause a problem.

Here are some recommendations for the proper use of pkg_pretend:

  • If possible, avoid using pkg_pretend entirely:
    • Place recommendations for kernel functionality in pkg_postinst. pkg_pretend has no benefit over this existing method.
    • Perform true pre-build sanity checks in pre_src_unpack, pre_src_prepare or pre_src_compile. These can evaluate the build environment immediately before the build starts, so their tests will be accurate.
  • If you do use it, take the above design limitations into account, and it's recommended that pkg_pretend is only used for tests related to building from source, and that these tests are only run if MERGE_TYPE is set to source.
  • Treat the results of these tests as recommendations only, not as the pre-build sanity checks the function was intended for (since it runs at the wrong time.)
  • Do not call an eclass' pkg_pretend from your own pkg_setup function.

MERGE_TYPE

To use pkg_pretend properly, understand that it runs at dep calculation time, and may be prior to the ebuild building from source, installing from a binary package, or just building a binary package and not installing. To determine what exactly will be happening, examine the MERGE_TYPE variable. It can have one of the following possible values:

  • binary - the ebuild is installing from a binary package.
  • source - the ebuild is installing from source.
  • buildonly - the ebuild is building a package, but not installing.

Perform the appropriate checks based on the setting of MERGE_TYPE, and more importantly, don't perform checks that are inappropriate based on what is happening.
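A minimal sketch of a pkg_pretend that follows these recommendations is shown below. ewarn is a real Portage helper, but it is stubbed here so that the example runs outside Portage, and the disk-space message (including the 2 GiB figure) is invented for illustration.

```bash
#!/usr/bin/env bash
# Stand-alone sketch. ewarn is stubbed so the example runs outside
# Portage; the 2 GiB figure is invented.
ewarn() { echo "WARN: $*"; }

pkg_pretend() {
    case ${MERGE_TYPE} in
        source)
            # Advice that only makes sense when building from source.
            ewarn "Building this package needs roughly 2 GiB of free disk space."
            ;;
        binary|buildonly)
            # Nothing to recommend in these cases.
            :
            ;;
    esac
}

for MERGE_TYPE in source buildonly binary; do
    echo "MERGE_TYPE=${MERGE_TYPE}:"
    pkg_pretend
done
```

Note that the function only warns and never calls die, in line with the advice that these tests should be recommendations.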

pkg_preinst

The pkg_preinst function is called by Portage, prior to merging the to-be-installed files to the target filesystem specified by $ROOT environment variable (which defaults to /.) Keep in mind that these to-be-installed files were either just compiled and installed to $D by src_install, or they were just extracted from a .tbz2 binary package. The pkg_preinst function provides an ideal place to perform any "just before install" actions, such as user and group creation or other necessary steps to ensure that the package merges successfully. It also provides a potential place to perform any sanity checks related to installing the package to the target filesystem. If any sanity checks fail, calling die from this function will cause the package to not be installed to the target filesystem.
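The sketch below shows a pkg_preinst sanity check that respects $ROOT and aborts with die. die is a real Portage helper, stubbed here so the example runs outside Portage; the /etc/foo.conf.lock file is an invented condition used only to have something to test.

```bash
#!/usr/bin/env bash
# Stand-alone sketch. die is stubbed so the example runs outside Portage;
# the /etc/foo.conf.lock check is an invented example.
die() { echo "ERROR: $*" >&2; exit 1; }

pkg_preinst() {
    # Honour ${ROOT}: the target filesystem may not be /.
    if [[ -e ${ROOT%/}/etc/foo.conf.lock ]]; then
        die "foo appears to be mid-upgrade under ${ROOT:-/}; aborting merge"
    fi
}

# Emulate a target root that passes the check, then one that fails it.
ROOT=$(mktemp -d)
( pkg_preinst ) && echo "clean root: merge proceeds"
mkdir -p "${ROOT}/etc" && touch "${ROOT}/etc/foo.conf.lock"
( pkg_preinst ) 2>/dev/null || echo "locked root: merge aborted"
```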

pkg_postinst

The pkg_postinst function is called by Portage after the package has been merged to the target filesystem specified by $ROOT. This is a good place to perform any post-install configuration actions as well as print any informational messages for the user's benefit related to the package that was just installed.

pkg_prerm

The pkg_prerm function is called by Portage before an installed package's files are removed from the filesystem.

pkg_postrm

The pkg_postrm function is called by Portage after an installed package's files are removed from the filesystem.

pkg_config

The pkg_config function is called by Portage when the user calls emerge --config for the ebuild. The current directory will be set to the current directory of the shell from where emerge --config is run.

Skipping over a function

To skip over a function, define it as a function that does nothing. The recommended way is to use the bash no-op command, a single colon (:):

# Skip src_prepare.
src_prepare() { :; }

Extra pre_ and post_ functions

Modern versions of Portage also support functions identical to the above functions but with pre_ and post_ at the beginning of the function name. For example, post_src_configure will be executed after src_configure and before src_compile. These additional functions are supported by all EAPIs, provided that the parent function is supported by the EAPI in use. The initial current working directory should be identical to the initial current working directory of the parent function.
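The ordering of these hooks can be sketched as follows. This is a stand-alone illustration rather than Portage's implementation: run_with_hooks and the CALLS array are hypothetical names.

```bash
#!/usr/bin/env bash
# Stand-alone sketch; run_with_hooks and CALLS are hypothetical names.
CALLS=()

src_configure()      { CALLS+=("src_configure"); }
post_src_configure() { CALLS+=("post_src_configure"); }
src_compile()        { CALLS+=("src_compile"); }

# Run pre_<phase>, <phase>, post_<phase>, skipping any that are undefined.
run_with_hooks() {
    local phase=$1 fn
    for fn in "pre_${phase}" "${phase}" "post_${phase}"; do
        if declare -F "${fn}" >/dev/null; then
            "${fn}"
        fi
    done
}

run_with_hooks src_configure
run_with_hooks src_compile
printf '%s\n' "${CALLS[@]}"
```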

Helper Functions

econf()

econf() is part of ebuild.sh and is intended to be a wrapper to the configure command that is typically used in the src_configure() stage. It has a number of behaviors that are important for ebuild writers to understand. Once you understand what econf() does, you are free to use it in your ebuilds. Note that the behavior of econf() is generally safe for most autoconf-based source archives, but in some cases it may be necessary to avoid using econf() to avoid some of its default behaviors.

Automatically set prefix

--prefix=/usr will be passed to configure automatically, unless a --prefix argument was specified to econf(), in which case, that --prefix setting will be used instead.

Automatically set libdir

If the ABI variable is set (typically done in the profile), then econf() will look for a variable named LIBDIR_$ABI (e.g. LIBDIR_amd64). If this variable is set, its value is used as the final path component of libdir; that is, --libdir={prefix}/(value of LIBDIR_$ABI) is passed to configure.

Automatically set CHOST and CTARGET

The --host=$CHOST argument will be passed to configure. $CHOST is defined in the system profile. In addition, the --target=$CTARGET argument will be passed to configure if $CTARGET is defined. This is not normally required but is done to make Portage more capable of cross-compiling the ebuild. However, this functionality is not a guarantee that your ebuild will successfully cross-compile, as other changes to the ebuild may be necessary.

Disable Dependency Tracking (EAPI 4)

In EAPI 4 and later, the --disable-dependency-tracking argument will be passed to configure in order to optimize the performance of the configuration process. This option should have no impact other than on the performance of the configure script.

List of arguments

The following arguments are passed to configure, and all of them can be overridden by passing the corresponding options to econf():

  • --prefix=/usr
  • --libdir={prefix}/LIBDIR_$ABI
  • --host=${CHOST}
  • if CTARGET is defined, then --target=${CTARGET}
  • --mandir=/usr/share/man
  • --infodir=/usr/share/info
  • --datadir=/usr/share
  • --sysconfdir=/etc
  • --localstatedir=/var/lib
  • if EAPI 4+, then --disable-dependency-tracking
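The list above can be modelled with a short bash sketch. This is a simplified illustration, not Portage's actual econf(): econf_args is a hypothetical name, it only prints the argument list instead of running configure, and it leaves out the EAPI-dependent --disable-dependency-tracking option. It assumes that caller-supplied options are placed after the defaults, relying on autoconf-generated configure scripts generally letting a later option override an earlier one.

```bash
#!/usr/bin/env bash
# Simplified model of the argument list described above; econf_args is a
# hypothetical name and this is not Portage's actual implementation.
econf_args() {
    local prefix=/usr libdir_var arg args=()
    # A caller-supplied --prefix replaces the default and feeds into libdir.
    for arg in "$@"; do
        [[ ${arg} == --prefix=* ]] && prefix=${arg#--prefix=}
    done
    args+=( "--prefix=${prefix}" )
    # Indirect lookup: with ABI=amd64 this reads the variable LIBDIR_amd64.
    if [[ -n ${ABI} ]]; then
        libdir_var="LIBDIR_${ABI}"
        [[ -n ${!libdir_var} ]] && args+=( "--libdir=${prefix}/${!libdir_var}" )
    fi
    args+=( "--host=${CHOST}" )
    [[ -n ${CTARGET} ]] && args+=( "--target=${CTARGET}" )
    args+=( --mandir=/usr/share/man --infodir=/usr/share/info
            --datadir=/usr/share --sysconfdir=/etc --localstatedir=/var/lib )
    # Caller-supplied options go last so that they follow the defaults.
    args+=( "$@" )
    printf '%s\n' "${args[@]}"
}

# Example values; a profile would normally provide these.
ABI=amd64 LIBDIR_amd64=lib64 CHOST=x86_64-pc-linux-gnu
econf_args --sysconfdir=/etc/foo
```

With the example values, the printed list contains both --sysconfdir=/etc and, after it, --sysconfdir=/etc/foo, which is how the caller's choice takes precedence in this model.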