YoLinux.com: pid_namespaces manpage

PID_NAMESPACES(7)	   Linux Programmer's Manual	     PID_NAMESPACES(7)



NAME
       pid_namespaces - overview of Linux PID namespaces

DESCRIPTION
       For an overview of namespaces, see namespaces(7).

       PID  namespaces	isolate the process ID number space, meaning that pro-
       cesses in different PID namespaces can have the same PID.   PID	names-
       paces  allow  containers	 to  provide  functionality  such  as suspend-
       ing/resuming the set of processes in the container  and	migrating  the
       container  to a new host while the processes inside the container main-
       tain the same PIDs.

       PIDs in a new PID namespace start at 1, somewhat like a standalone sys-
       tem, and calls to fork(2), vfork(2), or clone(2) will produce processes
       with PIDs that are unique within the namespace.

       Use of PID namespaces requires a kernel that  is	 configured  with  the
       CONFIG_PID_NS option.

   The namespace init process
       The first process created in a new namespace (i.e., the process created
       using clone(2) with the CLONE_NEWPID flag, or the first	child  created
       by  a  process  after a call to unshare(2) using the CLONE_NEWPID flag)
       has the PID 1, and  is  the  "init"  process  for  the  namespace  (see
       init(1)).   A  child process that is orphaned within the namespace will
       be reparented to this process rather than init(1) (unless  one  of  the
       ancestors  of the child in the same PID namespace employed the prctl(2)
       PR_SET_CHILD_SUBREAPER command to mark itself as the reaper of orphaned
       descendant processes).

       If  the "init" process of a PID namespace terminates, the kernel termi-
       nates all of the processes in the namespace via a SIGKILL signal.  This
       behavior reflects the fact that the "init" process is essential for the
       correct operation of a PID  namespace.	In  this  case,	 a  subsequent
       fork(2)	into this PID namespace will fail with the error ENOMEM; it is
       not possible to create a new processes in a PID namespace whose	"init"
       process	has terminated.	 Such scenarios can occur when, for example, a
       process uses an open file descriptor for a /proc/[pid]/ns/pid file cor-
       responding  to  a process that was in a namespace to setns(2) into that
       namespace after the "init" process has  terminated.   Another  possible
       scenario	 can occur after a call to unshare(2): if the first child sub-
       sequently created by a fork(2) terminates,  then	 subsequent  calls  to
       fork(2) will fail with ENOMEM.

       Only signals for which the "init" process has established a signal han-
       dler can be sent to the "init" process by  other	 members  of  the  PID
       namespace.   This restriction applies even to privileged processes, and
       prevents other members of the PID namespace from	 accidentally  killing
       the "init" process.

       Likewise,  a process in an ancestor namespace can--subject to the usual
       permission checks described in  kill(2)--send  signals  to  the	"init"
       process	of a child PID namespace only if the "init" process has estab-
       lished a handler for that signal.  (Within the handler,	the  siginfo_t
       si_pid  field  described	 in  sigaction(2)  will	 be zero.)  SIGKILL or
       SIGSTOP are treated exceptionally: these signals are forcibly delivered
       when sent from an ancestor PID namespace.  Neither of these signals can
       be caught by the "init" process,	 and  so  will	result	in  the	 usual
       actions	associated  with  those signals (respectively, terminating and
       stopping the process).

       Starting with Linux 3.4, the reboot(2) system call causes a  signal  to
       be  sent	 to  the  namespace  "init"  process.	See reboot(2) for more
       details.

   Nesting PID namespaces
       PID namespaces can be nested: each PID namespace has a  parent,	except
       for  the initial ("root") PID namespace.	 The parent of a PID namespace
       is the PID namespace of the process that created	 the  namespace	 using
       clone(2)	 or  unshare(2).   PID	namespaces  thus form a tree, with all
       namespaces ultimately tracing their ancestry to the root namespace.

       A process is visible to other processes in its PID  namespace,  and  to
       the  processes  in each direct ancestor PID namespace going back to the
       root PID namespace.  In this context, "visible" means that one  process
       can  be	the target of operations by another process using system calls
       that specify a process ID.  Conversely, the processes in	 a  child  PID
       namespace  can't see processes in the parent and further removed ances-
       tor namespaces.	More succinctly: a process can see (e.g., send signals
       with kill(2), set nice values with setpriority(2), etc.) only processes
       contained in its own PID namespace and in descendants  of  that	names-
       pace.

       A process has one process ID in each of the layers of the PID namespace
       hierarchy in which is visible, and  walking  back  though  each	direct
       ancestor	 namespace  through  to	 the root PID namespace.  System calls
       that operate on process IDs always operate using the process ID that is
       visible in the PID namespace of the caller.  A call to getpid(2) always
       returns the PID associated with the namespace in which the process  was
       created.

       Some  processes in a PID namespace may have parents that are outside of
       the namespace.  For example, the parent of the initial process  in  the
       namespace  (i.e.,  the  init(1)	process	 with PID 1) is necessarily in
       another namespace.  Likewise, the direct children  of  a	 process  that
       uses  setns(2)  to  cause its children to join a PID namespace are in a
       different PID namespace from the caller of setns(2).   Calls  to	 getp-
       pid(2) for such processes return 0.

       While  processes	 may  freely  descend into child PID namespaces (e.g.,
       using setns(2) with CLONE_NEWPID), they	may  not  move	in  the	 other
       direction.  That is to say, processes may not enter any ancestor names-
       paces (parent, grandparent, etc.).  Changing PID namespaces is  a  one-
       way operation.

       The  NS_GET_PARENT  ioctl(2)  operation	can  be	 used  to discover the
       parental relationship between PID namespaces; see ioctl_ns(2).

   setns(2) and unshare(2) semantics
       Calls to setns(2) that specify a	 PID  namespace	 file  descriptor  and
       calls  to  unshare(2)  with the CLONE_NEWPID flag cause children subse-
       quently created by the caller to be placed in a different PID namespace
       from the caller.	 These calls do not, however, change the PID namespace
       of the calling process, because doing so would change the caller's idea
       of its own PID (as reported by getpid()), which would break many appli-
       cations and libraries.

       To put things another way: a  process's	PID  namespace	membership  is
       determined  when	 the  process  is created and cannot be changed there-
       after.  Among other things, this means that the	parental  relationship
       between	processes mirrors the parental relationship between PID names-
       paces: the parent of a process is  either  in  the  same	 namespace  or
       resides in the immediate parent PID namespace.

   Compatibility of CLONE_NEWPID with other CLONE_* flags
       In  current  versions  of  Linux,  CLONE_NEWPID	can't be combined with
       CLONE_THREAD.  Threads are required to be in  the  same	PID  namespace
       such  that  the	threads	 in  a process can send signals to each other.
       Similarly, it must be possible to see all of the threads of a processes
       in  the	proc(5) filesystem.  Additionally, if two threads were in dif-
       ferent PID namespaces, the process ID of the process sending  a	signal
       could  not  be  meaningfully  encoded  when  a  signal is sent (see the
       description of the siginfo_t type in sigaction(2)).  Since this is com-
       puted  when a signal is enqueued, a signal queue shared by processes in
       multiple PID namespaces would defeat that.

       In earlier versions of Linux, CLONE_NEWPID was additionally  disallowed
       (failing	 with  the  error  EINVAL)  in	combination with CLONE_SIGHAND
       (before Linux 4.3) as  well  as	CLONE_VM  (before  Linux  3.12).   The
       changes that lifted these restrictions have also been ported to earlier
       stable kernels.

   /proc and PID namespaces
       A /proc filesystem shows (in the	 /proc/[pid]  directories)  only  pro-
       cesses  visible	in the PID namespace of the process that performed the
       mount, even if the /proc filesystem is viewed from processes  in	 other
       namespaces.

       After  creating	a  new	PID  namespace,	 it is useful for the child to
       change its root directory and mount a new procfs instance at  /proc  so
       that  tools  such as ps(1) work correctly.  If a new mount namespace is
       simultaneously created by including CLONE_NEWNS in the  flags  argument
       of  clone(2)  or unshare(2), then it isn't necessary to change the root
       directory: a new procfs instance can be mounted directly over /proc.

       From a shell, the command to mount /proc is:

	   $ mount -t proc proc /proc

       Calling readlink(2) on the path /proc/self yields the process ID of the
       caller  in  the PID namespace of the procfs mount (i.e., the PID names-
       pace of the process that mounted the procfs).  This can be  useful  for
       introspection  purposes,	 when  a  process wants to discover its PID in
       other namespaces.

   Miscellaneous
       When a process ID is passed over a UNIX domain socket to a process in a
       different  PID  namespace  (see	the  description of SCM_CREDENTIALS in
       unix(7)), it is translated into the  corresponding  PID	value  in  the
       receiving process's PID namespace.

CONFORMING TO
       Namespaces are a Linux-specific feature.

EXAMPLE
       See user_namespaces(7).

SEE ALSO
       clone(2),   setns(2),  unshare(2),  proc(5),  capabilities(7),  creden-
       tials(7), namespaces(7), user_namespaces(7), switch_root(8)

COLOPHON
       This page is part of release 4.10 of the Linux  man-pages  project.   A
       description  of	the project, information about reporting bugs, and the
       latest	 version    of	  this	  page,	   can	   be	  found	    at
       https://www.kernel.org/doc/man-pages/.



Linux				  2016-12-12		     PID_NAMESPACES(7)
Search topic		Section