==============
Cgroup Freezer
==============

The cgroup freezer is useful to batch job management systems, which start
and stop sets of tasks in order to schedule the resources of a machine
according to the desires of a system administrator. This sort of program
is often used on HPC clusters to schedule access to the cluster as a
whole. The cgroup freezer uses cgroups to describe the set of tasks to
be started/stopped by the batch job management system. It also provides
a means to start and stop the tasks composing the job.

The cgroup freezer will also be useful for checkpointing running groups
of tasks. The freezer allows the checkpoint code to obtain a consistent
image of the tasks by attempting to force the tasks in a cgroup into a
quiescent state. Once the tasks are quiescent, another task can
walk /proc or invoke a kernel interface to gather information about the
quiesced tasks. Checkpointed tasks can be restarted later should a
recoverable error occur. This also allows the checkpointed tasks to be
migrated between nodes in a cluster by copying the gathered information
to another node and restarting the tasks there.

Sequences of SIGSTOP and SIGCONT are not always sufficient for stopping
and resuming tasks in userspace. Both of these signals are observable
from within the tasks we wish to freeze. While SIGSTOP cannot be caught,
blocked, or ignored, it can be seen by waiting or ptracing parent tasks.
SIGCONT is especially unsuitable since it can be caught by the task. Any
programs designed to watch for SIGSTOP and SIGCONT could be broken by
attempting to use SIGSTOP and SIGCONT to stop and resume tasks. We can
demonstrate this problem using nested bash shells::

	$ echo $$
	16644
	$ bash
	$ echo $$
	16690

	From a second, unrelated bash shell:
	$ kill -SIGSTOP 16690
	$ kill -SIGCONT 16690

	<at this point 16690 exits and causes 16644 to exit too>

This happens because bash can observe both signals and choose how it
responds to them.

Another example of a program which catches and responds to these
signals is gdb. In fact, any program designed to use ptrace is likely to
have a problem with this method of stopping and resuming tasks.

In contrast, the cgroup freezer uses the kernel freezer code to
prevent the freeze/unfreeze cycle from becoming visible to the tasks
being frozen. This allows the bash example above and gdb to run as
expected.

The cgroup freezer is hierarchical. Freezing a cgroup freezes all
tasks belonging to the cgroup and all its descendant cgroups. Each
cgroup has its own state (self-state) and the state inherited from the
parent (parent-state). Iff both states are THAWED, the cgroup is
THAWED.
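
The hierarchy rule can be illustrated with a small shell model. This is
only a sketch of the rule as described above and does not touch cgroupfs;
the function name ``freezer_chain`` and its output format are made up for
this illustration.

```shell
# Model of the hierarchy rule only; nothing here reads or writes cgroupfs.
# Arguments: the self-state (0 = THAWED, 1 = freezing) of each cgroup in a
# chain of nested cgroups, outermost first.
# Prints one line per cgroup: <depth> self=<0|1> parent=<0|1> <state>
freezer_chain() {
    parent=0
    depth=1
    for self in "$@"; do
        # A cgroup is THAWED only if both its self-state and its
        # parent-state are THAWED.
        if [ "$self" -eq 0 ] && [ "$parent" -eq 0 ]; then
            state=THAWED
        else
            state=freezing
        fi
        echo "$depth self=$self parent=$parent $state"
        # A freezing cgroup makes every descendant see parent-state 1.
        if [ "$state" = freezing ]; then
            parent=1
        fi
        depth=$((depth + 1))
    done
}

freezer_chain 0 1 0
```

For the chain above, the second cgroup is freezing because of its own
self-state and the third one is freezing only because of its parent, even
though its own self-state is THAWED.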

The following cgroupfs files are created by the cgroup freezer.

* freezer.state: Read-write.

  When read, returns the effective state of the cgroup - "THAWED",
  "FREEZING" or "FROZEN". This is the combined self and parent-states.
  If either is freezing, the cgroup is freezing (FREEZING or FROZEN).

  A FREEZING cgroup transitions into the FROZEN state when all tasks
  belonging to the cgroup and its descendants become frozen. Note that
  a cgroup reverts to FREEZING from FROZEN after a new task is added
  to the cgroup or one of its descendant cgroups until the new task is
  frozen.

  When written, sets the self-state of the cgroup. Two values are
  allowed - "FROZEN" and "THAWED". If FROZEN is written, the cgroup,
  if not already freezing, enters FREEZING state along with all its
  descendant cgroups.

  If THAWED is written, the self-state of the cgroup is changed to
  THAWED.  Note that the effective state may not change to THAWED if
  the parent-state is still freezing. If a cgroup's effective state
  becomes THAWED, all its descendants which are freezing because of
  the cgroup also leave the freezing state.

* freezer.self_freezing: Read only.

  Shows the self-state. 0 if the self-state is THAWED; otherwise, 1.
  This value is 1 iff the last write to freezer.state was "FROZEN".

* freezer.parent_freezing: Read only.

  Shows the parent-state.  0 if none of the cgroup's ancestors is
  frozen; otherwise, 1.

The root cgroup is non-freezable and the above interface files don't
exist in it.
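
As a rough sketch of how the three files fit together, the helper below
reads them from one cgroup directory and reports why the cgroup is
freezing. The function name ``freezer_report`` and its output format are
hypothetical; only the file names come from the list above.

```shell
# Hypothetical helper: summarise the freezer files of one cgroup.
# $1 is the cgroup directory, e.g. /sys/fs/cgroup/freezer/0
freezer_report() {
    dir=$1
    state=$(cat "$dir/freezer.state") || return 1
    self=$(cat "$dir/freezer.self_freezing") || return 1
    parent=$(cat "$dir/freezer.parent_freezing") || return 1

    # Combine the two read-only flags to name the cause of the freeze.
    case "$self$parent" in
    00) cause=none ;;
    10) cause=self ;;
    01) cause=ancestor ;;
    11) cause=self+ancestor ;;
    *)  cause=unknown ;;
    esac
    echo "$state (cause: $cause)"
}
```

A cause of ``ancestor`` means that writing THAWED to this cgroup's own
freezer.state will not thaw it, because the parent-state is still
freezing.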

* Examples of usage::

   # mkdir /sys/fs/cgroup/freezer
   # mount -t cgroup -ofreezer freezer /sys/fs/cgroup/freezer
   # mkdir /sys/fs/cgroup/freezer/0
   # echo $some_pid > /sys/fs/cgroup/freezer/0/tasks

To get the status of the freezer subsystem::

   # cat /sys/fs/cgroup/freezer/0/freezer.state
   THAWED

To freeze all tasks in the container::

   # echo FROZEN > /sys/fs/cgroup/freezer/0/freezer.state
   # cat /sys/fs/cgroup/freezer/0/freezer.state
   FREEZING
   # cat /sys/fs/cgroup/freezer/0/freezer.state
   FROZEN

To unfreeze all tasks in the container::

   # echo THAWED > /sys/fs/cgroup/freezer/0/freezer.state
   # cat /sys/fs/cgroup/freezer/0/freezer.state
   THAWED

This is the basic mechanism which should do the right thing for user space tasks
in a simple scenario.
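
Because freezer.state may read FREEZING for a while after FROZEN is
written, a script usually has to poll before it can rely on the tasks
being frozen. The following is a minimal sketch of such a loop; the
function name ``freeze_and_wait``, the one-second poll interval and the
default of 10 polls are assumptions made for this example.

```shell
# Hypothetical helper: request a freeze, then poll until it completes.
# $1: cgroup directory, e.g. /sys/fs/cgroup/freezer/0
# $2: maximum number of polls (assumed default: 10)
# Returns 0 once freezer.state reads FROZEN, 1 on error or timeout.
freeze_and_wait() {
    dir=$1
    tries=${2:-10}
    echo FROZEN > "$dir/freezer.state" || return 1
    while [ "$tries" -gt 0 ]; do
        # The state stays FREEZING until every task is frozen.
        [ "$(cat "$dir/freezer.state")" = FROZEN ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

A caller could run ``freeze_and_wait /sys/fs/cgroup/freezer/0`` and, if it
returns non-zero, write THAWED back so the tasks are not left partially
frozen.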