===============
RDMA Controller
===============

.. Contents

   1. Overview
     1-1. What is RDMA controller?
     1-2. Why is RDMA controller needed?
     1-3. How is RDMA controller implemented?
   2. Usage Examples

1. Overview
===========

1-1. What is RDMA controller?
-----------------------------

The RDMA controller allows the user to limit the RDMA/IB-specific resources
that a given set of processes can use. These processes are grouped using the
RDMA controller.

The RDMA controller defines two resources which can be limited for the
processes of a cgroup.

1-2. Why is RDMA controller needed?
-----------------------------------

Currently, user space applications can easily consume all the RDMA
verb-specific resources such as AH, CQ, QP, MR, etc. As a result, other
applications in other cgroups or kernel space ULPs may not even get a chance
to allocate any RDMA resources. This can lead to service unavailability.

Therefore, an RDMA controller is needed through which the resource
consumption of processes can be limited. Through this controller, different
RDMA resources can be accounted.

1-3. How is RDMA controller implemented?
----------------------------------------

The RDMA cgroup allows resource limits to be configured. It maintains
resource accounting per cgroup, per device, using a resource pool structure.
Each resource pool is currently limited to 64 resources, which can be
extended later if required.

The resource pool object is linked to the cgroup css. In most use cases there
are 0 to 4 resource pool instances per cgroup, per device, but nothing
prevents having more. At present, hundreds of RDMA devices per cgroup may not
be handled optimally; however, there is no known use case or requirement for
such a configuration either.

Since RDMA resources can be allocated by any process and can be freed by any
of the child processes which share the address space, RDMA resources are
always owned by the creator cgroup css. This allows process migration from one
cgroup to another without the major complexity of transferring resource
ownership, because such ownership is not really present due to the shared
nature of RDMA resources. Linking resources around the css also ensures that
cgroups can be deleted after processes have migrated. This also allows process
migration with active resources, even though that is not a primary use case.

Whenever RDMA resource charging occurs, the owner RDMA cgroup is returned to
the caller. The same RDMA cgroup should be passed when uncharging the
resource. This allows a process that migrated with an active RDMA resource to
charge new resources to its new owner cgroup. It also allows a resource to be
uncharged from the previously charged cgroup after the process has migrated
to a new cgroup, even though that is not a primary use case.

A resource pool object is created in the following situations:

(a) The user sets a limit and no resource pool yet exists for the device
    of interest for the cgroup.
(b) No resource limits were configured, but the IB/RDMA stack tries to
    charge the resource. The pool is needed so that resources are uncharged
    correctly when applications start without limits and limits are enforced
    later; otherwise the usage count would drop below zero on uncharge.

A resource pool is destroyed when all of its resource limits are set to max
and its last resource is deallocated.

To remove (unconfigure) the resource pool for a particular device, the user
should set all of its limits to max.

The IB stack honors the limits enforced by the RDMA controller. When an
application queries the maximum resource limits of an IB device, it gets the
minimum of what the user configured for the given cgroup and what the IB
device supports.
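
The rule above can be illustrated with a small shell sketch. It is not the
kernel implementation; the function name ``effective_limit`` and the device
capability of 65536 are hypothetical, chosen only for the example.

```shell
#!/bin/sh
# effective_limit CONFIGURED DEVICE_MAX
# Print the smaller of the cgroup limit and the device capability.
# CONFIGURED may be the literal string "max", meaning no cgroup limit.
# (Hypothetical helper for illustration only.)
effective_limit() {
	configured=$1
	device_max=$2
	if [ "$configured" = "max" ] || [ "$configured" -gt "$device_max" ]; then
		echo "$device_max"
	else
		echo "$configured"
	fi
}

effective_limit 2000 65536     # cgroup limit is lower: prints 2000
effective_limit max 65536      # no cgroup limit: prints 65536
effective_limit 100000 65536   # device is the bottleneck: prints 65536
```

A limit of ``max`` therefore leaves the device capability as the only bound.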

The following resources can be accounted by the RDMA controller:

  ==========    =============================
  hca_handle    Maximum number of HCA Handles
  hca_object    Maximum number of HCA Objects
  ==========    =============================

2. Usage Examples
=================

(a) Configure resource limit::

	echo mlx4_0 hca_handle=2 hca_object=2000 > /sys/fs/cgroup/rdma/1/rdma.max
	echo ocrdma1 hca_handle=3 > /sys/fs/cgroup/rdma/2/rdma.max

(b) Query resource limit::

	cat /sys/fs/cgroup/rdma/2/rdma.max
	#Output:
	mlx4_0 hca_handle=2 hca_object=2000
	ocrdma1 hca_handle=3 hca_object=max

(c) Query current usage::

	cat /sys/fs/cgroup/rdma/2/rdma.current
	#Output:
	mlx4_0 hca_handle=1 hca_object=20
	ocrdma1 hca_handle=1 hca_object=23
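
The ``rdma.max`` and ``rdma.current`` files use the one-line-per-device
``key=value`` layout shown in the outputs above, so a single value can be
extracted with a small helper. This is a minimal sketch assuming that layout;
the function name ``rdma_value`` is hypothetical, and the sample data is
copied from the usage output.

```shell
#!/bin/sh
# rdma_value FILE DEVICE RESOURCE
# Print the value of RESOURCE on the line of FILE that starts with DEVICE.
# (Hypothetical helper, not part of any kernel or userspace tooling.)
rdma_value() {
	awk -v dev="$2" -v res="$3" '
		$1 == dev {
			for (i = 2; i <= NF; i++) {
				split($i, kv, "=")
				if (kv[1] == res) print kv[2]
			}
		}' "$1"
}

# Demonstrate on a temporary copy of the sample output.
sample=$(mktemp)
printf '%s\n' 'mlx4_0 hca_handle=1 hca_object=20' \
	'ocrdma1 hca_handle=1 hca_object=23' > "$sample"
rdma_value "$sample" mlx4_0 hca_object    # prints 20
rdma_value "$sample" ocrdma1 hca_handle   # prints 1
rm -f "$sample"
```

On a live system the first argument would be a path such as
``/sys/fs/cgroup/rdma/2/rdma.current``.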

(d) Delete resource limit::

	echo mlx4_0 hca_handle=max hca_object=max > /sys/fs/cgroup/rdma/1/rdma.max
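
To unconfigure every device of a cgroup, the same reset line can be generated
for each device listed in ``rdma.max``. This is a minimal sketch assuming one
device per line, as in the query output, and one device per write; the
function name ``rdma_reset_lines`` is hypothetical.

```shell
#!/bin/sh
# rdma_reset_lines FILE
# For every device listed in FILE (rdma.max layout), print the line that
# would reset both of its limits to max. (Hypothetical helper.)
rdma_reset_lines() {
	awk 'NF > 0 { print $1, "hca_handle=max", "hca_object=max" }' "$1"
}

# Demonstrate on a temporary copy of the sample limits.
sample=$(mktemp)
printf '%s\n' 'mlx4_0 hca_handle=2 hca_object=2000' \
	'ocrdma1 hca_handle=3 hca_object=max' > "$sample"
rdma_reset_lines "$sample"
rm -f "$sample"
```

Each printed line can then be written to the cgroup's ``rdma.max`` with a
separate ``echo``, as in example (d).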