cgroup_namespaces manpage

Search topic Section

CGROUP_NAMESPACES(7)	   Linux Programmer's Manual	  CGROUP_NAMESPACES(7)

       cgroup_namespaces - overview of Linux cgroup namespaces

       For an overview of namespaces, see namespaces(7).

       Cgroup  namespaces  virtualize  the  view  of  a process's cgroups (see
       cgroups(7)) as seen via /proc/[pid]/cgroup and /proc/[pid]/mountinfo.

       Each cgroup namespace has its own set of cgroup root directories, which
       are   the   base	  points  for  the  relative  locations	 displayed  in
       /proc/[pid]/cgroup.  When a process  creates  a	new  cgroup  namespace
       using clone(2) or unshare(2) with the CLONE_NEWCGROUP flag, it enters a
       new cgroup namespace in which its current  cgroups  directories	become
       the  cgroup  root directories of the new namespace.  (This applies both
       for the cgroups version 1 hierarchies and the cgroups version 2 unified

       When  viewing /proc/[pid]/cgroup, the pathname shown in the third field
       of each record will be relative to the reading  process's  cgroup  root
       directory.   If the cgroup directory of the target process lies outside
       the root directory of the reading process's cgroup namespace, then  the
       pathname	 will  show  ../ entries for each ancestor level in the cgroup

       The following shell session demonstrates the effect of creating	a  new
       cgroup  namespace.   First,  (as superuser) we create a child cgroup in
       the freezer hierarchy, and put the shell into that cgroup:

	   # mkdir -p /sys/fs/cgroup/freezer/sub
	   # echo $$			  # Show PID of this shell
	   # sh -c 'echo 30655 > /sys/fs/cgroup/sub'
	   # cat /proc/self/cgroup | grep freezer

       Next, we use unshare(1) to create a process running a new shell in  new
       cgroup and mount namespaces:

	   # unshare -Cm bash

       We  then inspect the /proc/[pid]/cgroup files of, respectively, the new
       shell process started by the unshare(1) command, a process that	is  in
       the  original  cgroup  namespace (init, with PID 1), and a process in a
       sibling cgroup:

	   $ cat /proc/self/cgroup | grep freezer
	   $ cat /proc/1/cgroup | grep freezer
	   $ cat /proc/20124/cgroup | grep freezer

       However, when we look in	 /proc/self/mountinfo  we  see	the  following

	   # cat /proc/self/mountinfo | grep freezer
	   155 145 0:32 /.. /sys/fs/cgroup/freezer ...

       The  fourth  field of this file should show the directory in the cgroup
       filesystem which forms the root of this mount.  Since by the definition
       of  cgroup  namespaces,	the process's current freezer cgroup directory
       became its root freezer cgroup directory, we should  see	 '/'  in  this
       field.	The  problem  here is that we are seeing a mount entry for the
       cgroup filesystem corresponding to our initial shell  process's	cgroup
       namespace  (whose  cgroup  filesystem  is  indeed  rooted in the parent
       directory of sub).  We need to remount the  freezer  cgroup  filesystem
       inside this cgroup namespace, after which we see the expected results:

	   # mount --make-rslave /     # Don't propagate mount events
				       # to other namespaces
	   # umount /sys/fs/cgroup/freezer
	   # mount -t cgroup -o freezer freezer /sys/fs/cgroup/freezer
	   # cat /proc/self/mountinfo | grep freezer
	   155 145 0:32 / /sys/fs/cgroup/freezer rw,relatime ...

       Use  of cgroup namespaces requires a kernel that is configured with the
       CONFIG_CGROUPS option.

       Namespaces are a Linux-specific feature.

       Among the purposes served by  the  virtualization  provided  by	cgroup
       namespaces are the following:

       * It  prevents information leaks whereby cgroup directory paths outside
	 of a container would otherwise be visible to processes	 in  the  con-
	 tainer.   Such	 leakages could, for example, reveal information about
	 the container framework to containerized applications.

       * It eases tasks such as container migration.  The virtualization  pro-
	 vided	by  cgroup  namespaces	allows	containers to be isolated from
	 knowledge of the pathnames of ancestor cgroups.  Without such	isola-
	 tion,	the  full  cgroup  pathnames (displayed in /proc/self/cgroups)
	 would need to be replicated on the target  system  when  migrating  a
	 container; those pathnames would also need to be unique, so that they
	 don't conflict with other pathnames on the target system.

       * It allows better confinement of containerized processes,  because  it
	 is possible to mount the container's cgroup filesystems such that the
	 container processes can't gain access to ancestor cgroup directories.
	 Consider, for example, the following scenario:

	   o We have a cgroup directory, /cg/1, that is owned by user ID 9000.

	   o We	 have a process, X, also owned by user ID 9000, that is names-
	     paced under the cgroup /cg/1/2 (i.e.,  X  was  placed  in	a  new
	     cgroup  namespace via clone(2) or unshare(2) with the CLONE_NEWC-
	     GROUP flag).

	 In the absence of cgroup namespacing, because	the  cgroup  directory
	 /cg/1 is owned (and writable) by UID 9000 and process X is also owned
	 by user ID 9000, then process X would be able to modify the  contents
	 of  cgroups  files (i.e., change cgroup settings) not only in /cg/1/2
	 but also in the ancestor cgroup directory /cg/1.  Namespacing process
	 X  under  the	cgroup directory /cg/1/2, in combination with suitable
	 mount operations for the cgroup filesystem (as shown above), prevents
	 it modifying files in /cg/1, since it cannot even see the contents of
	 that directory (or of further removed cgroup  ancestor	 directories).
	 Combined  with	 correct enforcement of hierarchical limits, this pre-
	 vents process X from escaping the limits imposed by ancestor cgroups.

       unshare(1), clone(2), setns(2), unshare(2), proc(5),  cgroups(7),  cre-
       dentials(7), namespaces(7), user_namespaces(7)

       This  page  is  part of release 4.10 of the Linux man-pages project.  A
       description of the project, information about reporting bugs,  and  the
       latest	  version     of     this    page,    can    be	   found    at

Linux				  2016-07-17		  CGROUP_NAMESPACES(7)