==============
Cgroup Freezer
==============

The cgroup freezer is useful to batch job management systems which start
and stop sets of tasks in order to schedule the resources of a machine
according to the desires of a system administrator. This sort of program
is often used on HPC clusters to schedule access to the cluster as a
whole. The cgroup freezer uses cgroups to describe the set of tasks to
be started/stopped by the batch job management system. It also provides
a means to start and stop the tasks composing the job.

The cgroup freezer will also be useful for checkpointing running groups
of tasks. The freezer allows the checkpoint code to obtain a consistent
image of the tasks by attempting to force the tasks in a cgroup into a
quiescent state. Once the tasks are quiescent another task can
walk /proc or invoke a kernel interface to gather information about the
quiesced tasks. Checkpointed tasks can be restarted later should a
recoverable error occur. This also allows the checkpointed tasks to be
migrated between nodes in a cluster by copying the gathered information
to another node and restarting the tasks there.

Sequences of SIGSTOP and SIGCONT are not always sufficient for stopping
and resuming tasks in userspace. Both of these signals are observable
from within the tasks we wish to freeze. While SIGSTOP cannot be caught,
blocked, or ignored it can be seen by waiting or ptracing parent tasks.
SIGCONT is especially unsuitable since it can be caught by the task. Any
programs designed to watch for SIGSTOP and SIGCONT could be broken by
attempting to use SIGSTOP and SIGCONT to stop and resume tasks. We can
demonstrate this problem using nested bash shells::

   $ echo $$
   16644
   $ bash
   $ echo $$
   16690

   From a second, unrelated bash shell:
   $ kill -SIGSTOP 16690
   $ kill -SIGCONT 16690

   <at this point 16690 exits and causes 16644 to exit too>

This happens because bash can observe both signals and choose how it
responds to them.

Another example of a program which catches and responds to these
signals is gdb. In fact any program designed to use ptrace is likely to
have a problem with this method of stopping and resuming tasks.

In contrast, the cgroup freezer uses the kernel freezer code to
prevent the freeze/unfreeze cycle from becoming visible to the tasks
being frozen. This allows the bash example above and gdb to run as
expected.

The cgroup freezer is hierarchical. Freezing a cgroup freezes all
tasks belonging to the cgroup and all its descendant cgroups. Each
cgroup has its own state (self-state) and the state inherited from the
parent (parent-state). Iff both states are THAWED, the cgroup is
THAWED.

The following cgroupfs files are created by cgroup freezer.

* freezer.state: Read-write.

  When read, returns the effective state of the cgroup - "THAWED",
  "FREEZING" or "FROZEN". This is the combined self and parent-states.
  If any is freezing, the cgroup is freezing (FREEZING or FROZEN).

  FREEZING cgroup transitions into FROZEN state when all tasks
  belonging to the cgroup and its descendants become frozen. Note that
  a cgroup reverts to FREEZING from FROZEN after a new task is added
  to the cgroup or one of its descendant cgroups until the new task is
  frozen.

  When written, sets the self-state of the cgroup. Two values are
  allowed - "FROZEN" and "THAWED". If FROZEN is written, the cgroup,
  if not already freezing, enters FREEZING state along with all its
  descendant cgroups.

  If THAWED is written, the self-state of the cgroup is changed to
  THAWED. Note that the effective state may not change to THAWED if
  the parent-state is still freezing. If a cgroup's effective state
  becomes THAWED, all its descendants which are freezing because of
  the cgroup also leave the freezing state.

* freezer.self_freezing: Read only.

  Shows the self-state. 0 if the self-state is THAWED; otherwise, 1.
  This value is 1 iff the last write to freezer.state was "FROZEN".

* freezer.parent_freezing: Read only.

  Shows the parent-state. 0 if none of the cgroup's ancestors is
  frozen; otherwise, 1.

The root cgroup is non-freezable and the above interface files don't
exist.
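
The relationship between the three files can be illustrated with a small
shell sketch. The function name ``freezer_why`` and the wording of its
messages are hypothetical and not part of any kernel or userspace tool; the
cgroup directory passed as the first argument is assumed to be a non-root
cgroup in a mounted freezer hierarchy. It only reads the files described
above and reports which of the two states is keeping the cgroup freezing:

```shell
#!/bin/sh
# Hypothetical helper (illustration only): report the effective state of a
# freezer cgroup and whether its self-state, its parent-state, or both are
# responsible for it.
# $1: cgroup directory, e.g. /sys/fs/cgroup/freezer/0 (assumed mount point)
freezer_why() {
    fw_dir=$1
    # Read the effective state and the two component states.
    fw_state=$(cat "$fw_dir/freezer.state") || return 1
    fw_self=$(cat "$fw_dir/freezer.self_freezing") || return 1
    fw_parent=$(cat "$fw_dir/freezer.parent_freezing") || return 1

    # The cgroup is THAWED iff both values are 0.
    case "$fw_self$fw_parent" in
        00) fw_reason="neither self-state nor parent-state is freezing" ;;
        10) fw_reason="freezing requested on this cgroup" ;;
        01) fw_reason="freezing because of an ancestor" ;;
        11) fw_reason="freezing requested on this cgroup and by an ancestor" ;;
        *)  fw_reason="unexpected values" ;;
    esac
    echo "$fw_state: $fw_reason"
}
```

For example, ``freezer_why /sys/fs/cgroup/freezer/0`` run on a cgroup whose
parent was frozen would be expected to report that an ancestor is the cause,
which also explains why writing THAWED to that cgroup alone would not thaw it.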

* Examples of usage::

   # mkdir /sys/fs/cgroup/freezer
   # mount -t cgroup -ofreezer freezer /sys/fs/cgroup/freezer
   # mkdir /sys/fs/cgroup/freezer/0
   # echo $some_pid > /sys/fs/cgroup/freezer/0/tasks

to get status of the freezer subsystem::

   # cat /sys/fs/cgroup/freezer/0/freezer.state
   THAWED

to freeze all tasks in the container::

   # echo FROZEN > /sys/fs/cgroup/freezer/0/freezer.state
   # cat /sys/fs/cgroup/freezer/0/freezer.state
   FREEZING
   # cat /sys/fs/cgroup/freezer/0/freezer.state
   FROZEN

to unfreeze all tasks in the container::

   # echo THAWED > /sys/fs/cgroup/freezer/0/freezer.state
   # cat /sys/fs/cgroup/freezer/0/freezer.state
   THAWED

This is the basic mechanism which should do the right thing for user space
tasks in a simple scenario.
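
Because freezer.state can read FREEZING for a while before it reads FROZEN,
a script that needs the tasks to be quiescent has to poll. The following is
a minimal sketch of such a loop, not a definitive implementation. The
function name ``freeze_and_wait``, the default of 10 polls and the one-second
interval are all assumptions chosen for illustration:

```shell
#!/bin/sh
# Hypothetical helper (illustration only): request a freeze and poll
# freezer.state until it reads FROZEN or the poll budget is used up.
# $1: cgroup directory, e.g. /sys/fs/cgroup/freezer/0 (assumed mount point)
# $2: maximum number of polls (assumed default: 10)
freeze_and_wait() {
    fz_dir=$1
    fz_tries=${2:-10}

    # Set the self-state; the effective state may first read FREEZING.
    echo FROZEN > "$fz_dir/freezer.state" || return 1

    while [ "$fz_tries" -gt 0 ]; do
        # Succeed as soon as the effective state is FROZEN.
        if [ "$(cat "$fz_dir/freezer.state")" = "FROZEN" ]; then
            return 0
        fi
        fz_tries=$((fz_tries - 1))
        sleep 1
    done

    # Still not FROZEN after all polls.
    return 1
}
```

A caller might use it as ``freeze_and_wait /sys/fs/cgroup/freezer/0 30 ||
echo "still freezing"``. A successful return only reflects the state at the
time of the last read: a task added to the cgroup afterwards moves it back to
FREEZING until that task is frozen too.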