Note

The Funtoo Linux project has transitioned to "Hobby Mode" and this wiki is now read-only.

LXD

LXD is a container "hypervisor" that aims to provide users with a new and fresh experience using LXC technology.

LXD consists of three components:

  • A system-wide daemon (lxd)
  • A command line client (lxc)
  • An OpenStack Nova plugin (nova-compute-lxd)

The lxd daemon provides a REST API that is accessible locally and, if enabled, over the network.

The command line tool is designed to be a very simple, yet very powerful tool to manage all your containers. It can handle connections to multiple container hosts and easily give you an overview of all the containers on your network, let you create some more where you want them and even move them around while they're running.

The OpenStack plugin then allows you to use your lxd hosts as compute nodes, running workloads on containers rather than virtual machines.

The LXD project was founded and is currently led by Canonical Ltd and Ubuntu with contributions from a range of other companies and individual contributors.


Features

Some of the biggest features of LXD are:

  • Secure by design (unprivileged containers, resource restrictions and much more)
  • Scalable (from containers on your laptop to thousands of compute nodes)
  • Intuitive (simple, clear API and crisp command line experience)
  • Image based (no more distribution templates, only good, trusted images)
  • Live migration

Unprivileged Containers

LXD uses unprivileged containers by default. The difference between an unprivileged container and a privileged one is whether the root user in the container is the “real” root user (uid 0 at the kernel level).

The way unprivileged containers are created is by taking a set of normal UIDs and GIDs from the host, usually at least 65536 of each (to be POSIX compliant) and mapping those into the container.

The most common example and what most LXD users will end up with by default is a map of 65536 UIDs and GIDs, with a host base id of 100000. This means that root in the container (uid 0) will be mapped to the host uid 100000 and uid 65535 in the container will be mapped to uid 165535 on the host. UID/GID 65536 and higher in the container aren’t mapped and will return an error if you attempt to use them.
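
The arithmetic behind this default map can be sketched in a few lines of shell. The function name container_to_host_id is made up for illustration, and the defaults (base 100000, range 65536) are simply the values quoted above; your host may use a different map.

```shell
#!/bin/sh
# Hypothetical helper: translate a container UID/GID into the host ID
# for a single contiguous map. Defaults mirror the example in the text.
container_to_host_id() {
    cid=$1
    base=${2:-100000}
    range=${3:-65536}
    # IDs outside 0..range-1 have no mapping on the host.
    if [ "$cid" -lt 0 ] || [ "$cid" -ge "$range" ]; then
        echo "unmapped"
        return 1
    fi
    echo $((base + cid))
}

container_to_host_id 0              # root in the container: prints 100000
container_to_host_id 65535          # last mapped ID: prints 165535
container_to_host_id 65536 || true  # outside the map: prints unmapped
```

A larger map only changes the range argument; the base offset stays whatever the host allocated.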

From a security point of view, that means that anything which is not owned by the users and groups mapped into the container will be inaccessible. Any such resource will show up as being owned by uid/gid “-1” (rendered as 65534 or nobody/nogroup in userspace). It also means that should there be a way to escape the container, even root in the container would find itself with only as many privileges on the host as the nobody user.

LXD does offer a number of options related to unprivileged configuration:

  • Increasing the size of the default uid/gid map
  • Setting up per-container maps
  • Punching holes into the map to expose host users and groups

Relationship with LXC

LXD isn't a rewrite of LXC, in fact it's building on top of LXC to provide a new, better user experience. Under the hood, LXD uses LXC through liblxc and its Go binding to create and manage the containers.

It's basically an alternative to LXC's tools and distribution template system with the added features that come from being controllable over the network.

Licensing

LXD is free software and is developed under the Apache 2 license.

Installing LXD in Funtoo

Kernel prerequisites

These options should be enabled in your kernel to use all of the functions of LXD (options prefixed with ! should be disabled):

 !GRKERNSEC_CHROOT_CAPS
 !GRKERNSEC_CHROOT_CHMOD
 !GRKERNSEC_CHROOT_DOUBLE
 !GRKERNSEC_CHROOT_MOUNT
 !GRKERNSEC_CHROOT_PIVOT
 !GRKERNSEC_PROC
 !GRKERNSEC_SYSFS_RESTRICT
 !NETPRIO_CGROUP
 BRIDGE
 CGROUP_CPUACCT
 CGROUP_DEVICE
 CGROUP_FREEZER
 CGROUP_SCHED
 CGROUPS
 CHECKPOINT_RESTORE
 CPUSETS
 DEVPTS_MULTIPLE_INSTANCES
 DUMMY
 EPOLL 
 EVENTFD 
 FHANDLE 
 IA32_EMULATION 
 INET_DIAG 
 INET_TCP_DIAG
 INET_UDP_DIAG
 INOTIFY_USER
 IP_NF_NAT
 IP_NF_TARGET_MASQUERADE
 IP6_NF_NAT
 IP6_NF_TARGET_MASQUERADE
 IPC_NS
 IPV6
 MACVLAN
 NAMESPACES 
 NET_IPGRE
 NET_IPGRE_DEMUX
 NET_IPIP
 NET_NS
 NETFILTER_XT_MATCH_COMMENT
 NETLINK_DIAG
 NF_NAT_MASQUERADE_IPV4
 NF_NAT_MASQUERADE_IPV6
 PACKET_DIAG 
 PID_NS 
 POSIX_MQUEUE
 UNIX_DIAG
 USER_NS
 UTS_NS
 VETH
 VXLAN
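
Checking a list this long by hand is tedious, so here is a minimal sketch of a checker. The function name check_kernel_opts is hypothetical, and it assumes you have a plain-text kernel config at hand (for example /usr/src/linux/.config, or /proc/config.gz unpacked with zcat if your kernel exposes it). LXC also ships an lxc-checkconfig script that performs a similar, more thorough check.

```shell
#!/bin/sh
# Hypothetical helper: verify kernel options against a plain-text config.
# Usage: check_kernel_opts CONFIG_FILE OPTION...
# An option prefixed with ! must NOT be enabled; all others must be y or m.
check_kernel_opts() {
    config_file=$1
    shift
    status=0
    for opt in "$@"; do
        case $opt in
            '!'*)
                name=${opt#?}   # strip the leading !
                if grep -Eq "^CONFIG_${name}=[ym]" "$config_file"; then
                    echo "CONFIG_${name}: enabled, should be disabled"
                    status=1
                fi
                ;;
            *)
                if ! grep -Eq "^CONFIG_${opt}=[ym]" "$config_file"; then
                    echo "CONFIG_${opt}: missing"
                    status=1
                fi
                ;;
        esac
    done
    return $status
}

# Example (not run here, paths depend on your system):
#   zcat /proc/config.gz > /tmp/kconfig
#   check_kernel_opts /tmp/kconfig USER_NS VETH BRIDGE '!NETPRIO_CGROUP'
```

The function prints one line per problem and returns non-zero if anything is wrong, so it can be used in a script before starting the daemon.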

Getting LXD

Installing LXD is pretty straightforward, as the ebuild exists in our portage tree. I would recommend putting /var (or at least /var/lib/lxd) on btrfs or zfs, as LXD can take advantage of these COW filesystems. LXD doesn’t need any configuration to use btrfs: just make sure that /var/lib/lxd is stored on a btrfs filesystem and LXD will automatically make use of it for you.

root # emerge -av lxd

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild  N     ] dev-lang/go-1.8-r1:0/1.8::gentoo  USE="-gccgo" 69,062 KiB
[ebuild  N     ] dev-go/go-crypto-0_pre20160126:0/0_pre20160126::gentoo  881 KiB
[ebuild  N     ] sys-fs/squashfs-tools-4.3-r2::gentoo  USE="xattr xz -debug -lz4 -lzma -lzo -static" 194 KiB
[ebuild  N     ] sys-libs/libseccomp-2.3.2::gentoo  USE="-static-libs" 547 KiB
[ebuild  N     ] net-libs/libnet-1.2_rc3-r1:1.1::gentoo  USE="-doc -static-libs" 661 KiB
[ebuild  N     ] dev-libs/libnl-3.3.0_rc1:3::gentoo  USE="python -static-libs -utils" PYTHON_TARGETS="python2_7 python3_4 -python3_5" 912 KiB
[ebuild  N     ] dev-python/ipaddr-2.1.11-r1::gentoo  PYTHON_TARGETS="python2_7 python3_4 -pypy -python3_5" 29 KiB
[ebuild  N     ] dev-go/go-text-0_pre20160211:0/0_pre20160211::gentoo  3,922 KiB
[ebuild  N     ] sys-libs/libcap-2.25::gentoo  USE="pam -static-libs" 63 KiB
[ebuild  N     ] dev-go/go-net-0_pre20160216:0/0_pre20160216::gentoo  724 KiB
[ebuild  N     ] net-dns/dnsmasq-2.76-r1::gentoo  USE="dhcp inotify ipv6 nls -auth-dns -conntrack -dbus -dhcp-tools -dnssec -idn -lua -script (-selinux) -static -tftp" LINGUAS="-de -es -fi -fr -id -it -no -pl -pt_BR -ro" 470 KiB
[ebuild  N     ] dev-libs/protobuf-c-1.2.1-r1:0/1.0.0::gentoo  USE="-static-libs {-test}" 448 KiB
[ebuild  N     ] sys-process/criu-2.12::gentoo  USE="python -setproctitle" PYTHON_TARGETS="python2_7" 632 KiB
[ebuild  N     ] app-emulation/lxc-2.0.7::gentoo  USE="python seccomp -cgmanager -doc -examples -lua" PYTHON_TARGETS="python3_4 -python3_5" 774 KiB
[ebuild  N     ] app-emulation/lxd-2.11::gentoo  USE="daemon nls {-test}" LINGUAS="-de -el -fr -ja -nl -ru" 2,352 KiB

Total: 15 packages (15 new), Size of downloads: 81,664 KiB

Would you like to add these changes to your config files? [Yes/No]

Running LXD

Once installed, you need to start the LXD daemon by running:

root # service lxd start
 * Starting lxd server ...

First setup of LXD/Initialisation

Before using LXD for the first time as a user, you should initialize your LXD environment. As recommended earlier, I am using btrfs for this installation.

user $ lxd init
Do you want to configure a new storage pool (yes/no) [default=yes]?
Name of the new storage pool [default=default]:
Name of the storage backend to use (dir, btrfs, lvm) [default=dir]: btrfs
Create a new BTRFS pool (yes/no) [default=yes]?
Would you like to use an existing block device (yes/no) [default=no]?
Would you like to create a new subvolume for the BTRFS storage pool (yes/no) [default=yes]:
Would you like LXD to be available over the network (yes/no) [default=no]?
Would you like stale cached images to be updated automatically (yes/no) [default=yes]?
Would you like to create a new network bridge (yes/no) [default=yes]?
What should the new bridge be called [default=lxdbr0]?
What IPv4 address should be used (CIDR subnet notation, “auto” or “none”) [default=auto]?
What IPv6 address should be used (CIDR subnet notation, “auto” or “none”) [default=auto]?
LXD has been successfully configured.

This creates btrfs subvolumes like these:

user $ btrfs sub list .
ID 260 gen 1047 top level 5 path rootfs
ID 280 gen 1046 top level 260 path var/lib/lxd/storage-pools/default
ID 281 gen 1043 top level 280 path var/lib/lxd/storage-pools/default/containers
ID 282 gen 1044 top level 280 path var/lib/lxd/storage-pools/default/snapshots
ID 283 gen 1045 top level 280 path var/lib/lxd/storage-pools/default/images
ID 284 gen 1046 top level 280 path var/lib/lxd/storage-pools/default/custom

It also creates a new network interface for you:

user $ ip a list dev lxdbr0
8: lxdbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether d2:9b:70:f2:8f:6f brd ff:ff:ff:ff:ff:ff
    inet 10.250.237.1/24 scope global lxdbr0
       valid_lft forever preferred_lft forever
    inet 169.254.59.23/16 brd 169.254.255.255 scope global lxdbr0
       valid_lft forever preferred_lft forever
    inet6 fd42:efd8:662e:3184::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::caf5:b7ed:445e:b112/64 scope link
       valid_lft forever preferred_lft forever

And last but not least it also generates iptables rules for you:

user $ iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:domain /* generated for LXD network lxdbr0 */
ACCEPT     udp  --  anywhere             anywhere             udp dpt:domain /* generated for LXD network lxdbr0 */
ACCEPT     udp  --  anywhere             anywhere             udp dpt:bootps /* generated for LXD network lxdbr0 */

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  anywhere             anywhere             /* generated for LXD network lxdbr0 */
ACCEPT     all  --  anywhere             anywhere             /* generated for LXD network lxdbr0 */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     tcp  --  anywhere             anywhere             tcp spt:domain /* generated for LXD network lxdbr0 */
ACCEPT     udp  --  anywhere             anywhere             udp spt:domain /* generated for LXD network lxdbr0 */
ACCEPT     udp  --  anywhere             anywhere             udp spt:bootps /* generated for LXD network lxdbr0 */

user $ iptables -L -t nat
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination

Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
MASQUERADE  all  --  10.250.237.0/24     !10.250.237.0/24      /* generated for LXD network lxdbr0 */

user $ iptables -L -t mangle
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination

Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
CHECKSUM   udp  --  anywhere             anywhere             udp dpt:bootpc /* generated for LXD network lxdbr0 */ CHECKSUM fill

Some other things done by the initialization and starting of the LXD daemon are:

  • dnsmasq listening on lxdbr0
  • ...

Finishing up the setup of LXD

There are still some things that you need to do manually. We need to set up subuid and subgid values for our containers to use. To run non-systemd containers we will also need app-admin/cgmanager, so emerge and start it now.

root # emerge app-admin/cgmanager
root # rc-update add lxd
root # rc-update add cgmanager && rc

Containers, snapshots and images

Containers in LXD are made of:

  • A filesystem (rootfs)
  • A list of configuration options, including resource limits, environment, security options and more
  • A bunch of devices like disks, character/block unix devices and network interfaces
  • A set of profiles the container inherits configuration from (see below)
  • Some properties (container architecture, ephemeral or persistent and the name)
  • Some runtime state (when using CRIU for checkpoint/restore)

Container snapshots are, as the name states, snapshots of the container at a point in time and cannot be modified in any way. It is worth noting that snapshots can store the container runtime state, which gives us the ability to take “stateful” snapshots. That is, the ability to roll back the container including its CPU and memory state at the time of the snapshot.

LXD is image based: all LXD containers come from an image. Images are typically clean Linux distribution images similar to what you would use for a virtual machine or cloud instance. It is possible to “publish” a container, making an image from it which can then be used by the local or remote LXD hosts.
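
Snapshots and publishing map onto a handful of client commands. The sketch below wraps them in small functions; the function names, the container name c1, the snapshot name snap0 and the alias are all placeholders, and the LXC variable is a convenience of this example (set LXC=echo to preview the commands without touching the daemon).

```shell
#!/bin/sh
# LXC defaults to the real client; LXC=echo gives a dry run.
LXC=${LXC:-lxc}

# Take a snapshot; pass "stateful" as third argument to include
# runtime state (this relies on CRIU being available).
take_snapshot() {
    if [ "${3:-}" = "stateful" ]; then
        $LXC snapshot "$1" "$2" --stateful
    else
        $LXC snapshot "$1" "$2"
    fi
}

# Roll the container back to an earlier snapshot.
rollback() {
    $LXC restore "$1" "$2"
}

# Turn a snapshot into a reusable image under the given alias.
publish_snapshot() {
    $LXC publish "$1/$2" --alias "$3"
}

# Example (not run here):
#   take_snapshot c1 snap0
#   publish_snapshot c1 snap0 funtoo-customized
```

Publishing from a snapshot rather than from the container itself means the container does not have to be stopped first.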

Our first image

Let's get our hands even more dirty and create our first image. We will be using a generic 64-bit Funtoo Linux image. Let's grab it, because we will need to modify it a little bit.

Modifying the stage3 image

The steps that follow download the stage3, extract it to a directory, set rc_sys to lxc in etc/rc.conf and comment out the consoles in etc/inittab.

root # mkdir lxd-images
root # cd lxd-images
root # wget http://build.funtoo.org/funtoo-current/pure64/generic_64-pure64/stage3-latest.tar.xz
root # mkdir rootfs
root # tar -xaf stage3-latest.tar.xz -C rootfs/
root # cd rootfs
root # sed -i 's/^#rc_sys=""/rc_sys="lxc"/' etc/rc.conf
root # sed -i '/^c[1-6]/s/^\(.*\)$/#\1/' etc/inittab
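
Both sed commands fail silently if the pattern does not match (for instance if a newer stage3 ships a different rc.conf default), so it can be worth verifying the result. A minimal sketch, where patch_stage3_for_lxd is a made-up name and the argument is the extracted rootfs directory:

```shell
#!/bin/sh
# Hypothetical helper: apply the two edits from above to an extracted
# stage3 and confirm that they actually took effect. Requires GNU sed.
patch_stage3_for_lxd() {
    rootfs=$1
    sed -i 's/^#rc_sys=""/rc_sys="lxc"/' "$rootfs/etc/rc.conf"
    sed -i '/^c[1-6]/s/^\(.*\)$/#\1/' "$rootfs/etc/inittab"

    # rc_sys must now be set to lxc ...
    if ! grep -q '^rc_sys="lxc"' "$rootfs/etc/rc.conf"; then
        echo "rc_sys was not set in etc/rc.conf" >&2
        return 1
    fi
    # ... and no virtual console entry may remain active.
    if grep -q '^c[1-6]' "$rootfs/etc/inittab"; then
        echo "consoles are still enabled in etc/inittab" >&2
        return 1
    fi
}

# Example, run from the lxd-images directory (not run here):
#   patch_stage3_for_lxd rootfs
```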

Metadata and templates

Now we will create metadata and templates that will be used by the image and later by containers.

root # cd ..
root # mkdir templates
root # echo hostname=\"{{ container.name }}\" > templates/hostname.tpl

Create a file named metadata.yaml in the current directory (lxd-images) with these contents:

architecture: x86_64
creation_date: 1504742400
properties:
  architecture: x86_64
  description: Funtoo Current Generic Pure 64-bit
  name: funtoo-generic_64-pure64-funtoo-current-2016-12-10
  os: funtoo
  release: 1.0
  variant: current
templates:
  /etc/conf.d/hostname:
    template: hostname.tpl
    when:
      - create
      - copy
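
As far as I know, LXD reads creation_date as a UNIX timestamp (seconds since the epoch) rather than a calendar date, so treat that as an assumption and check the image format documentation for your LXD version. If you want to generate the value from a date, a small helper (hypothetical name, GNU date assumed) does it:

```shell
#!/bin/sh
# Hypothetical helper: convert a calendar date to a UNIX timestamp (UTC).
# Relies on the -d option of GNU date.
date_to_epoch() {
    date -u -d "$1" +%s
}

date_to_epoch 2017-09-07   # prints 1504742400
```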

Prepare the archive

Now we recreate the archive and later use it to import into LXD.

root # tar -caf lxd-image.tar.xz metadata.yaml templates rootfs
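
Before importing, it does no harm to confirm that the archive has the expected top-level layout. The sketch below only checks for the three entries we packed above; the function name check_image_tarball is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: make sure the archive contains metadata.yaml,
# rootfs and templates at the top level. GNU tar detects the compression
# automatically when listing.
check_image_tarball() {
    archive=$1
    listing=$(tar -tf "$archive") || return 1
    for entry in metadata.yaml rootfs templates; do
        if ! printf '%s\n' "$listing" | grep -Eq "^(\./)?${entry}(/|\$)"; then
            echo "missing: $entry"
            return 1
        fi
    done
    echo "ok"
}

# Example (not run here):
#   check_image_tarball lxd-image.tar.xz
```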

Import the image

After we have successfully recreated our archive we can now finally import it into LXD and start using it as our "seed" image for all our containers.

root # lxc image import lxd-image.tar.xz --alias funtoo
Image imported with fingerprint: e279c16d1a801b2bd1698df95e148e0a968846835f4769b24988f2eb3700100f
root # lxc image ls
+--------+--------------+--------+------------------------------------+--------+----------+-----------------------------+
| ALIAS  | FINGERPRINT  | PUBLIC |            DESCRIPTION             |  ARCH  |   SIZE   |         UPLOAD DATE         |
+--------+--------------+--------+------------------------------------+--------+----------+-----------------------------+
| funtoo | e279c16d1a80 | no     | Funtoo Current Generic Pure 64-bit | x86_64 | 347.75MB | Sep 8, 2017 at 1:17am (UTC) |
+--------+--------------+--------+------------------------------------+--------+----------+-----------------------------+

And there we have our very first Funtoo Linux image imported inside LXD. You can reference the image through the alias or through the fingerprint. Aliases can also be added later.

Let me show you some basic usage then.

Creating your first container

First we have to add some subuid and subgid values for lxd to use. Add these lines to your /etc/subuid and /etc/subgid files on the host.

root # nano -w /etc/subuid
root:100000:65536
lxd:100000:65536
root # nano -w /etc/subgid
root:100000:65536
lxd:100000:65536

The maps for “lxd” and “root” should always be kept in sync. LXD itself is restricted by the “root” allocation. The “lxd” entry is used to track what needs to be removed if LXD is uninstalled.
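
If you would rather script this than edit the files by hand, the entries can be appended idempotently. This is a minimal sketch; ensure_subid_entry is a hypothetical helper, and the range is the one used above.

```shell
#!/bin/sh
# Hypothetical helper: add "user:start:count" to a subuid/subgid file
# unless that exact line is already present.
ensure_subid_entry() {
    file=$1
    user=$2
    start=$3
    count=$4
    entry="${user}:${start}:${count}"
    touch "$file"
    if ! grep -qxF "$entry" "$file"; then
        echo "$entry" >> "$file"
    fi
}

# Example, as root (not run here): keep root and lxd in sync in both files.
#   for f in /etc/subuid /etc/subgid; do
#       for u in root lxd; do
#           ensure_subid_entry "$f" "$u" 100000 65536
#       done
#   done
```

Running it twice leaves the files unchanged, which makes it safe to keep in a provisioning script.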

This isolates the users inside the container: even if they escaped the container, they would only have nobody's privileges on the host.

Now we can initialize our first container with this command:

root # lxc init funtoo c1
Creating c1
root #  lxc ls
+------+---------+------+------+------------+-----------+
| NAME |  STATE  | IPV4 | IPV6 |    TYPE    | SNAPSHOTS |
+------+---------+------+------+------------+-----------+
| c1   | STOPPED |      |      | PERSISTENT | 0         |
+------+---------+------+------+------------+-----------+

Profiles intermezzo

LXD has the ability to change quite a few container settings, including resource limitation, control of container startup and a variety of device pass-through options, using what are called profiles. Let me show you how this can be used.

This is the default profile that gets inherited by all containers.

root # lxc profile list
+---------+---------+
|  NAME   | USED BY |
+---------+---------+
| default | 1       |
+---------+---------+

root #  lxc profile show default
config: {}
description: Default LXD profile
devices:
  eth0:
    nictype: bridged
    parent: lxdbr0
    type: nic
  root:
    path: /
    pool: default
    type: disk
name: default
used_by:
- /1.0/containers/c1

Now let's make a profile specific for our funtoo containers. It will include shared meta-repo and some other goodies.

root # lxc profile create prf-funtoo
Profile prf-funtoo created
root #  lxc profile edit prf-funtoo
### This is a yaml representation of the profile.
### Any line starting with a '# will be ignored.
###
### A profile consists of a set of configuration items followed by a set of
### devices.
###
### An example would look like:
### name: onenic
### config:
###   raw.lxc: lxc.aa_profile=unconfined
### devices:
###   eth0:
###     nictype: bridged
###     parent: lxdbr0
###     type: nic
###
### Note that the name is shown but cannot be changed

config:   
  raw.lxc: lxc.mount.auto = proc sys cgroup
description: "LXD profile for Funtoo-based containers"
devices:   
  portage:
    path: var/git
    source: /var/git
    type: disk
name: prf-funtoo
used_by: []
root # lxc profile add c1 prf-funtoo
Profile prf-funtoo added to c1
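
The same profile can also be built without the interactive editor, which is handy for scripting. The sketch below assumes the profile set and profile device add subcommands of the client behave as in the LXD 2.x series; the function name is made up, and LXC=echo can be used to preview what would be run.

```shell
#!/bin/sh
# LXC defaults to the real client; LXC=echo gives a dry run.
LXC=${LXC:-lxc}

# Hypothetical helper: create a profile equivalent to the one edited above.
setup_funtoo_profile() {
    profile=$1
    $LXC profile create "$profile"
    # Same raw.lxc setting as in the YAML above.
    $LXC profile set "$profile" raw.lxc "lxc.mount.auto = proc sys cgroup"
    # Share the host's /var/git with the container as a disk device.
    $LXC profile device add "$profile" portage disk source=/var/git path=var/git
}

# Example (not run here):
#   setup_funtoo_profile prf-funtoo
```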

Starting our first container

After we have done all these customizations we can now start our container.

root # lxc start c1

And now we can get a shell inside our container.

root # lxc exec c1 bash

Now you should see a different prompt, starting with:

c1 ~ #

If we run top or ps for example we will see only the processes of the container.

c1 ~ # ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   4248   748 ?        Ss+  13:20   0:00 init [3]
root       266  0.0  0.0  30488   472 ?        Ss   13:20   0:00 /usr/sbin/sshd
root       312  0.2  0.0  17996  3416 ?        Ss   13:29   0:00 bash
root       317  0.0  0.0  19200  2260 ?        R+   13:29   0:00 ps aux
c1 ~ #

As you can see, only the container's processes are shown. The user running the processes is root here. What happens if we search for all sshd processes on the host box, for example?

root # ps aux | grep sshd
root     14505  0.0  0.0  30564  1508 ?        Ss   Sep07   0:00 /usr/sbin/sshd   
100000   25863  0.0  0.0  30488   472 ?        Ss   15:20   0:00 /usr/sbin/sshd   
root     29487  0.0  0.0   8324   828 pts/2    S+   15:30   0:00 grep --colour=auto sshd
root #

So as you can see, the container's sshd process runs under uid 100000 on the host machine and has a different PID there.
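
The kernel exposes the map in effect for a process in /proc/PID/uid_map, one range per line in the form "container-start host-start length". A minimal sketch of a lookup against such a file follows; lookup_host_id is a hypothetical name, and the example relies on the Pid line printed by lxc info.

```shell
#!/bin/sh
# Hypothetical helper: find the host ID for a container ID using a file
# in /proc/PID/uid_map (or gid_map) format.
lookup_host_id() {
    map_file=$1
    cid=$2
    awk -v id="$cid" '
        # $1 = first ID inside the container, $2 = first host ID, $3 = length
        id >= $1 && id < $1 + $3 { print $2 + (id - $1); found = 1; exit }
        END { if (!found) { print "unmapped"; exit 1 } }
    ' "$map_file"
}

# Example on the host (not run here):
#   pid=$(lxc info c1 | awk '/^Pid:/ {print $2}')
#   lookup_host_id "/proc/$pid/uid_map" 0
```

For the container above, looking up uid 0 should give the same 100000 that ps shows on the host.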

Getting information about your containers

Listing containers

root #  lxc ls
+------+---------+----------------------+-----------------------------------------------+------------+-----------+
| NAME |  STATE  |         IPV4         |                     IPV6                      |    TYPE    | SNAPSHOTS |
+------+---------+----------------------+-----------------------------------------------+------------+-----------+
| c1   | RUNNING | 10.214.101.79 (eth0) | fd42:156d:4593:a619:8619:546e:43f:2089 (eth0) | PERSISTENT | 0         |
|      |         |                      | fd42:156d:4593:a619:216:3eff:fe4a:3d4f (eth0) |            |           |
+------+---------+----------------------+-----------------------------------------------+------------+-----------+

Container details

root # lxc info c1
Name: c1
Remote: unix://
Architecture: x86_64
Created: 2017/09/08 02:07 UTC
Status: Running
Type: persistent
Profiles: default, prf-funtoo
Pid: 6366
Ips:
  eth0: inet    10.214.101.79   vethFG4HXG
  eth0: inet6   fd42:156d:4593:a619:8619:546e:43f:2089  vethFG4HXG
  eth0: inet6   fd42:156d:4593:a619:216:3eff:fe4a:3d4f  vethFG4HXG
  eth0: inet6   fe80::216:3eff:fe4a:3d4f        vethFG4HXG
  lo:   inet    127.0.0.1
  lo:   inet6   ::1
Resources:
  Processes: 6
  CPU usage:
    CPU usage (in seconds): 25
  Memory usage:
    Memory (current): 69.01MB
    Memory (peak): 258.92MB
  Network usage:
    eth0:
      Bytes received: 83.65kB
      Bytes sent: 9.44kB
      Packets received: 188
      Packets sent: 93
    lo:
      Bytes received: 0B
      Bytes sent: 0B
      Packets received: 0
      Packets sent: 0

Container configuration

root #  lxc config edit c1
### This is a yaml representation of the configuration.
### Any line starting with a '# will be ignored.
###
### A sample configuration looks like:
### name: container1
### profiles:
### - default
### config:
###   volatile.eth0.hwaddr: 00:16:3e:e9:f8:7f
### devices:
###   homedir:
###     path: /extra
###     source: /home/user
###     type: disk
### ephemeral: false
###
### Note that the name is shown but cannot be changed

architecture: x86_64
config:
  image.architecture: x86_64
  image.description: Funtoo Current Generic Pure 64-bit
  image.name: funtoo-generic_64-pure64-funtoo-current-2016-12-10
  image.os: funtoo
  image.release: "1.0"
  image.variant: current
  volatile.base_image: e279c16d1a801b2bd1698df95e148e0a968846835f4769b24988f2eb3700100f
  volatile.eth0.hwaddr: 00:16:3e:4a:3d:4f
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":100000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":100000,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":100000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":100000,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.power: RUNNING
devices: {}
ephemeral: false
profiles:
- default
- prf-funtoo
stateful: false
description: ""

Managing files

Snapshots

Cloning, copying and moving containers

Resource control

LXD offers a variety of resource limits. Some of those are tied to the container itself, like memory quotas, CPU limits and I/O priorities. Some are tied to a particular device instead, like I/O bandwidth or disk usage limits.

As with all LXD configuration, resource limits can be dynamically changed while the container is running. Some may fail to apply, for example if setting a memory value smaller than the current memory usage, but LXD will try anyway and report back on failure.

All limits can also be inherited through profiles in which case each affected container will be constrained by that limit. That is, if you set limits.memory=256MB in the default profile, every container using the default profile (typically all of them) will have a memory limit of 256MB.
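
In practice a limit is just a configuration key set either on a container or on a profile. The sketch below shows both variants; the function names are made up, limits.memory=256MB comes from the paragraph above, and limits.cpu is used as a second example key on the assumption that your LXD version supports it. Set LXC=echo to preview the commands.

```shell
#!/bin/sh
# LXC defaults to the real client; LXC=echo gives a dry run.
LXC=${LXC:-lxc}

# Set a limit on a single container.
set_container_limit() {
    $LXC config set "$1" "$2" "$3"
}

# Set a limit on a profile; every container using it inherits the value.
set_profile_limit() {
    $LXC profile set "$1" "$2" "$3"
}

# Example (not run here):
#   set_profile_limit default limits.memory 256MB
#   set_container_limit c1 limits.cpu 2
```

A value set directly on a container takes precedence over the same key inherited from a profile, so the profile acts as a default.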

Disk

You can set a size limit on the container’s filesystem and have it enforced against the container. Right now LXD only supports disk limits if you’re using the ZFS or btrfs storage backend.

CPU

Memory

Network

Block I/O

Image manipulations

Remote hosts