===============
RDMA Controller
===============

.. Contents

   1. Overview
     1-1. What is RDMA controller?
     1-2. Why is RDMA controller needed?
     1-3. How is RDMA controller implemented?
   2. Usage Examples

1. Overview
===========

1-1. What is RDMA controller?
-----------------------------

The RDMA controller allows the user to limit the RDMA/IB-specific resources
that a given set of processes can use. These processes are grouped using the
RDMA controller.

The RDMA controller defines two resources which can be limited for the
processes of a cgroup.

1-2. Why is RDMA controller needed?
-----------------------------------

Currently, user space applications can easily take away all the RDMA
verb-specific resources such as AH, CQ, QP, MR, etc. As a result, other
applications in other cgroups, or kernel space ULPs, may not even get a chance
to allocate any RDMA resources. This can lead to service unavailability.

Therefore, an RDMA controller is needed through which the resource consumption
of processes can be limited. Through this controller, different RDMA
resources can be accounted.

1-3. How is RDMA controller implemented?
----------------------------------------

The RDMA cgroup allows limit configuration of resources. It maintains
resource accounting per cgroup, per device, using a resource pool structure.
Each such resource pool is currently limited to 64 resources by the RDMA
cgroup, which can be extended later if required.

This resource pool object is linked to the cgroup css. Typically there
are 0 to 4 resource pool instances per cgroup, per device in most use cases,
but nothing prevents having more. At present, hundreds of RDMA devices per
single cgroup may not be handled optimally; however, there is no
known use case or requirement for such a configuration either.

Since RDMA resources can be allocated by any process and can be freed by any
of the child processes which share the address space, RDMA resources are
always owned by the creator cgroup css.
This allows process migration from one
cgroup to another without the major complexity of transferring resource
ownership, because such ownership is not really present due to the shared
nature of RDMA resources. Linking resources around the css also ensures that
cgroups can be deleted after their processes have migrated. This allows
process migration with active resources as well, even though that is not a
primary use case.

Whenever RDMA resource charging occurs, the owner RDMA cgroup is returned to
the caller. The same RDMA cgroup should be passed when uncharging the resource.
This also allows a process that migrated with active RDMA resources to charge
new resources to its new owner cgroup. It also allows the resources of a
process that has migrated to a new cgroup to be uncharged from the previously
charged cgroup, even though that is not a primary use case.

A resource pool object is created in the following situations.
(a) The user sets a limit and no previous resource pool exists for the device
of interest for the cgroup.
(b) No resource limits were configured, but the IB/RDMA stack tries to
charge the resource. The pool is created so that resources are uncharged
correctly when applications start running without limits and limits are
enforced later; otherwise the usage count would drop below zero during
uncharging.
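
The charge and uncharge flow described above can be sketched as a small
user-space model. This is only an illustration in Python, not the kernel
implementation: the names ``ResourcePool``, ``RdmaCgroup``, ``set_limit``,
``try_charge`` and ``uncharge`` are hypothetical, and everything else the
kernel does is left out. It shows a pool being created on the first limit
write or the first charge, and the owner returned by a charge being used
later for the matching uncharge.

```python
MAX = float("inf")  # stands in for the "max" (unlimited) setting
RESOURCES = ("hca_handle", "hca_object")


class ResourcePool:
    """Toy per-cgroup, per-device accounting record (hypothetical)."""

    def __init__(self):
        self.limit = {r: MAX for r in RESOURCES}
        self.usage = {r: 0 for r in RESOURCES}


class RdmaCgroup:
    """Toy stand-in for one RDMA cgroup (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.pools = {}  # device name -> ResourcePool

    def _pool(self, device):
        # Pools are created lazily, on first use for a device.
        return self.pools.setdefault(device, ResourcePool())

    def set_limit(self, device, resource, value):
        # Situation (a): a limit is set and no pool exists yet.
        self._pool(device).limit[resource] = value

    def try_charge(self, device, resource):
        # Situation (b): a charge arrives before any limit was configured.
        pool = self._pool(device)
        if pool.usage[resource] + 1 > pool.limit[resource]:
            raise RuntimeError(f"{resource} limit reached on {device}")
        pool.usage[resource] += 1
        return self  # the owner; pass it back when uncharging

    def uncharge(self, device, resource):
        self.pools[device].usage[resource] -= 1


old_cg, new_cg = RdmaCgroup("1"), RdmaCgroup("2")

# A process in cgroup 1 allocates a resource; the owner is remembered.
owner = old_cg.try_charge("mlx4_0", "hca_object")

# After migrating to cgroup 2, new resources are charged to cgroup 2 ...
new_cg.try_charge("mlx4_0", "hca_object")

# ... while the old resource is still uncharged from its original owner.
owner.uncharge("mlx4_0", "hca_object")
```

Because the owner is stored with the resource rather than looked up from the
freeing process, the usage count of cgroup 1 returns to zero even though the
process now lives in cgroup 2.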

A resource pool is destroyed if all its resource limits are set to max and
the last resource is being deallocated.

The user should set all limits to the max value to remove/unconfigure
the resource pool for a particular device.

The IB stack honors the limits enforced by the RDMA controller. When an
application queries the maximum resource limits of an IB device, the stack
returns the minimum of what is configured by the user for the given cgroup and
what is supported by the IB device.

The following resources can be accounted by the RDMA controller.

  ==========  =============================
  hca_handle  Maximum number of HCA Handles
  hca_object  Maximum number of HCA Objects
  ==========  =============================

2. Usage Examples
=================

(a) Configure resource limit::

        echo mlx4_0 hca_handle=2 hca_object=2000 > /sys/fs/cgroup/rdma/1/rdma.max
        echo ocrdma1 hca_handle=3 > /sys/fs/cgroup/rdma/2/rdma.max

(b) Query resource limit::

        cat /sys/fs/cgroup/rdma/2/rdma.max
        #Output:
        mlx4_0 hca_handle=2 hca_object=2000
        ocrdma1 hca_handle=3 hca_object=max

(c) Query current usage::

        cat /sys/fs/cgroup/rdma/2/rdma.current
        #Output:
        mlx4_0 hca_handle=1 hca_object=20
        ocrdma1 hca_handle=1 hca_object=23

(d) Delete resource limit::

        echo mlx4_0 hca_handle=max hca_object=max > /sys/fs/cgroup/rdma/1/rdma.max
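
For scripts that consume these files, the line format shown in the examples
above (a device name followed by ``key=value`` pairs) can be handled with a
few lines of Python. This is a minimal sketch that assumes the format is
exactly as in the examples; the helper names ``parse_rdma_entries`` and
``format_rdma_entry`` are hypothetical and not part of any kernel or library
interface.

```python
def parse_rdma_entries(text):
    """Parse rdma.max / rdma.current style text into a nested dict."""
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        device, pairs = fields[0], fields[1:]
        values = {}
        for pair in pairs:
            key, _, value = pair.partition("=")
            # "max" means no limit; everything else is a count.
            values[key] = value if value == "max" else int(value)
        entries[device] = values
    return entries


def format_rdma_entry(device, **limits):
    """Build one line suitable for writing to rdma.max."""
    parts = [f"{key}={value}" for key, value in limits.items()]
    return " ".join([device, *parts])


# Sample text copied from example (b) above.
sample = """\
mlx4_0 hca_handle=2 hca_object=2000
ocrdma1 hca_handle=3 hca_object=max
"""
limits = parse_rdma_entries(sample)

# The line used in example (d) to delete the limits for mlx4_0.
reset_line = format_rdma_entry("mlx4_0", hca_handle="max", hca_object="max")
```

In practice the text would come from reading a cgroup's ``rdma.max`` or
``rdma.current`` file, and ``reset_line`` would be written to ``rdma.max``.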