path: root/Software/systemd/ControlGroupInterface.mdwn
author LennartPoettering <LennartPoettering@web> 2013-09-26 18:44:02 -0700
committer www <iki-www@freedesktop.org> 2013-09-26 18:44:02 -0700
commit 07e6cd57003e961285423127e81176e471d06dd8 (patch)
tree a8900da003bae116b38617f4674549a55066a43b /Software/systemd/ControlGroupInterface.mdwn
parent 23cf7155f197f278d16d630331502f04ee9ad1e6 (diff)
Diffstat (limited to 'Software/systemd/ControlGroupInterface.mdwn')
-rw-r--r-- Software/systemd/ControlGroupInterface.mdwn 29
1 files changed, 29 insertions, 0 deletions
diff --git a/Software/systemd/ControlGroupInterface.mdwn b/Software/systemd/ControlGroupInterface.mdwn
index eab1d832..c246de4b 100644
--- a/Software/systemd/ControlGroupInterface.mdwn
+++ b/Software/systemd/ControlGroupInterface.mdwn
@@ -2,3 +2,32 @@
Starting with version 205, systemd provides a number of interfaces that may be used to create and manage labelled groups of processes for the purpose of monitoring and controlling them and their resource usage. This is built on top of the Linux kernel Control Groups ("cgroups") facility. Previously, the kernel's cgroups API was exposed directly as application API, following the rules of the [[Pax Control Groups|http://www.freedesktop.org/wiki/Software/systemd/PaxControlGroups/]] document. However, the kernel cgroup interface is in the process of being reworked into an API that requires a single writer in userspace managing it. With this change the cgroup tree becomes private property of that userspace component and is no longer a shared resource. On systemd systems PID 1 takes this role and hence needs to provide APIs for clients to take advantage of the control groups functionality of the kernel.
+## Why this all again?
+
+* Objects placed on the same level of the cgroup tree frequently need to propagate properties to one another. For example, when the "cpu" controller is used for one object, all objects on the same level need to use it too. Otherwise the entire cgroup of the first object will be scheduled against the individual processes of the others, which gives the first object a drastic scheduling penalty if it uses many processes.
+
+* Similarly, some properties also require propagation up the tree.
+
+* The tree needs to be refreshed/built up in scheduled steps as devices show up/go away, because controllers like "blkio" or "devices" refer to devices via major/minor device node numbers, which are not fixed but determined only when a device appears.
+
+* Many of the attributes are too low-level to serve as API. For example, the major/minor device interface is only useful if a userspace component translates stable device paths into major/minor numbers at the right time.
+
+* By unifying the cgroup logic under a single arbiter it is possible to write tools that can manage all objects the system contains, including services, virtual machines, containers and whatever else applications register.
+
+* By unifying the cgroup logic under a single arbiter, a good default that encompasses all kinds of objects may be shipped, so that no manual configuration is needed to benefit from resource control.
+
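As an illustration of the major/minor point in the list above, here is a minimal Python sketch of how a userspace component might resolve a stable device path into the major/minor pair a controller expects. It is not systemd's implementation. The helper names `device_numbers` and `throttle_line` are hypothetical, and the `MAJOR:MINOR value` line format is an assumption modelled on the kernel's cgroup v1 blkio throttle attributes.

```python
import os
import stat


def device_numbers(path):
    """Resolve a device node path to its (major, minor) pair.

    os.stat() follows symlinks, so a stable alias that points at a
    device node resolves to the numbers of the underlying node.
    """
    st = os.stat(path)
    if not (stat.S_ISBLK(st.st_mode) or stat.S_ISCHR(st.st_mode)):
        raise ValueError(f"{path} is not a device node")
    return os.major(st.st_rdev), os.minor(st.st_rdev)


def throttle_line(path, bytes_per_second):
    """Format a 'MAJOR:MINOR value' line (hypothetical helper, assumed format)."""
    major, minor = device_numbers(path)
    return f"{major}:{minor} {bytes_per_second}"
```

Because the numbers are assigned only when a device appears, such a lookup would have to be repeated each time the device shows up again rather than cached across hotplug events.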
+systemd, through its "unit" concept, already implements a dependency network between objects where propagation can take place, and it contains a powerful execution queue. Also, a major part of the objects whose resources need to be controlled are already systemd objects, most prominently the services systemd manages.
+
+## Why is this not managed by a component independent of systemd?
+
+Well, as mentioned above, a dependency network between objects, usable for propagation, combined with a powerful execution engine is basically what systemd *is*. Since cgroups management requires precisely this, it is an obvious choice to simply implement it in systemd itself.
+
+Implementing a similar propagation/dependency network with an execution scheduler outside of systemd, in an independent "cgroup" daemon, would basically mean reimplementing systemd a second time. Also, accessing such an external service from PID 1 for managing other services would result in a cyclic dependency: PID 1 would need this functionality to manage the cgroup service, yet the functionality would only be available after that service had finished starting up. Such cyclic dependencies can certainly be worked around, but they make such a design complex.
+
+## I don't use systemd, what does this mean for me?
+
+Nothing. This page is about systemd's cgroups APIs. If you don't use systemd then the kernel cgroup rework will probably affect you eventually, but a different component will be the single-writer userspace daemon managing the cgroup tree, with different APIs. Note that the APIs described here expose a lot of systemd-specific concepts and hence are unlikely to be available outside of systemd systems.
+
+## I want to write cgroup code that should work on both systemd systems and others (such as Ubuntu), what should I do?
+
+On systemd systems, use the systemd APIs as described below. We are not aware of any component that would take the cgroup-managing role on Upstart/sysvinit systems, so we cannot help you there.