======================================
Wound/Wait Deadlock-Proof Mutex Design
======================================

Please read mutex-design.rst first, as it applies to wait/wound mutexes too.

Motivation for WW-Mutexes
-------------------------

GPUs do operations that commonly involve many buffers. Those buffers
can be shared across contexts/processes, exist in different memory
domains (for example VRAM vs system memory), and so on. And with
PRIME / dmabuf, they can even be shared across devices. So there are
a handful of situations where the driver needs to wait for buffers to
become ready. If you think about this in terms of waiting on a buffer
mutex for it to become available, this presents a problem because
there is no way to guarantee that buffers appear in an execbuf/batch in
the same order in all contexts. That is directly under control of
userspace, and a result of the sequence of GL calls that an application
makes, which results in the potential for deadlock.
The problem gets
more complex when you consider that the kernel may need to migrate the
buffer(s) into VRAM before the GPU operates on the buffer(s), which
may in turn require evicting some other buffers (and you don't want to
evict other buffers which are already queued up to the GPU), but for a
simplified understanding of the problem you can ignore this.

The algorithm that the TTM graphics subsystem came up with for dealing with
this problem is quite simple. For each group of buffers (execbuf) that need
to be locked, the caller would be assigned a unique reservation id/ticket,
from a global counter. In case of deadlock while locking all the buffers
associated with an execbuf, the one with the lowest reservation ticket (i.e.
the oldest task) wins, and the one with the higher reservation id (i.e. the
younger task) unlocks all of the buffers that it has already locked, and then
tries again.

In the RDBMS literature, a reservation ticket is associated with a transaction,
and the deadlock handling approach is called Wait-Die. The name is based on
the actions of a locking thread when it encounters an already locked mutex.
If the transaction holding the lock is younger, the locking transaction waits.
If the transaction holding the lock is older, the locking transaction backs off
and dies. Hence Wait-Die.
There is also another algorithm called Wound-Wait:
If the transaction holding the lock is younger, the locking transaction
wounds the transaction holding the lock, requesting it to die.
If the transaction holding the lock is older, it waits for the other
transaction. Hence Wound-Wait.
The two algorithms are both fair in that a transaction will eventually succeed.
However, the Wound-Wait algorithm is typically stated to generate fewer backoffs
compared to Wait-Die, but is, on the other hand, associated with more work than
Wait-Die when recovering from a backoff. Wound-Wait is also a preemptive
algorithm in that transactions are wounded by other transactions, and that
requires a reliable way to pick up the wounded condition and preempt the
running transaction. Note that this is not the same as process preemption. A
Wound-Wait transaction is considered preempted when it dies (returning
-EDEADLK) following a wound.
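The two decision rules can be illustrated with a few lines of plain C. This is
a user-space sketch only, not the kernel implementation: the helper names and
the idea of comparing raw stamps directly are assumptions made for this
example (a lower stamp means an older transaction, matching the
reservation-ticket scheme above).

```c
/*
 * Illustrative sketch only: "stamp" stands in for the reservation
 * ticket, and a lower stamp means an older transaction.  These helper
 * names are invented for this example and do not exist in the kernel.
 */
enum ww_action { WW_WAIT, WW_BACKOFF };

/* Wait-Die: on contention it is always the *locker* that may back off.
 * An older locker waits; a younger locker dies (backs off, -EDEADLK). */
static enum ww_action wait_die_locker(unsigned long holder_stamp,
                                      unsigned long locker_stamp)
{
        return locker_stamp < holder_stamp ? WW_WAIT : WW_BACKOFF;
}

/* Wound-Wait: on contention it is the *holder* that may back off.
 * An older locker wounds the holder (which then dies at its next
 * contention point); a younger locker waits. */
static enum ww_action wound_wait_holder(unsigned long holder_stamp,
                                        unsigned long locker_stamp)
{
        return locker_stamp < holder_stamp ? WW_BACKOFF : WW_WAIT;
}
```

Note that under both rules the older transaction ultimately gets the lock;
the algorithms differ only in whether the locker (Wait-Die) or the holder
(Wound-Wait) is the one forced to back off.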

Concepts
--------

Compared to normal mutexes two additional concepts/objects show up in the lock
interface for w/w mutexes:

Acquire context: To ensure eventual forward progress it is important that a task
trying to acquire locks doesn't grab a new reservation id, but keeps the one it
acquired when starting the lock acquisition. This ticket is stored in the
acquire context. Furthermore the acquire context keeps track of debugging state
to catch w/w mutex interface abuse. An acquire context represents a
transaction.

W/w class: In contrast to normal mutexes the lock class needs to be explicit for
w/w mutexes, since it is required to initialize the acquire context. The lock
class also specifies what algorithm to use, Wound-Wait or Wait-Die.

Furthermore there are three different classes of w/w lock acquire functions:

* Normal lock acquisition with a context, using ww_mutex_lock.

* Slowpath lock acquisition on the contending lock, used by the task that just
  killed its transaction after having dropped all already acquired locks.
  These functions have the _slow postfix.

  From a simple semantics point-of-view the _slow functions are not strictly
  required, since simply calling the normal ww_mutex_lock functions on the
  contending lock (after having dropped all other already acquired locks) will
  work correctly. After all if no other ww mutex has been acquired yet there's
  no deadlock potential and hence the ww_mutex_lock call will block and not
  prematurely return -EDEADLK. The advantage of the _slow functions is in
  interface safety:

  - ww_mutex_lock has a __must_check int return type, whereas ww_mutex_lock_slow
    has a void return type. Note that since ww mutex code needs loops/retries
    anyway the __must_check doesn't result in spurious warnings, even though the
    very first lock operation can never fail.
  - When full debugging is enabled ww_mutex_lock_slow checks that all acquired
    ww mutexes have been released (preventing deadlocks) and makes sure that we
    block on the contending lock (preventing spinning through the -EDEADLK
    slowpath until the contended lock can be acquired).

* Functions to only acquire a single w/w mutex, which results in the exact same
  semantics as a normal mutex. This is done by calling ww_mutex_lock with a NULL
  context.

  Again this is not strictly required.
  But often you only want to acquire a
  single lock in which case it's pointless to set up an acquire context (and so
  better to avoid grabbing a deadlock avoidance ticket).

Of course, all the usual variants for handling wake-ups due to signals are also
provided.

Usage
-----

The algorithm (Wait-Die vs Wound-Wait) is chosen by using either
DEFINE_WW_CLASS() (Wound-Wait) or DEFINE_WD_CLASS() (Wait-Die).
As a rough rule of thumb, use Wound-Wait iff you
expect the number of simultaneous competing transactions to be typically small,
and you want to reduce the number of rollbacks.

There are three different ways to acquire locks within the same w/w class.
Common definitions for methods #1 and #2::

  static DEFINE_WW_CLASS(ww_class);

  struct obj {
      struct ww_mutex lock;
      /* obj data */
  };

  struct obj_entry {
      struct list_head head;
      struct obj *obj;
  };

Method 1, using a list in execbuf->buffers that's not allowed to be reordered.
This is useful if a list of required objects is already tracked somewhere.
Furthermore the lock helper can propagate the -EALREADY return code back to
the caller as a signal that an object is on the list twice. This is useful if
the list is constructed from userspace input and the ABI requires userspace to
not have duplicate entries (e.g. for a gpu commandbuffer submission ioctl)::

  int lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
  {
      struct obj *res_obj = NULL;
      struct obj_entry *contended_entry = NULL;
      struct obj_entry *entry;
      int ret;

      ww_acquire_init(ctx, &ww_class);

  retry:
      list_for_each_entry (entry, list, head) {
          if (entry->obj == res_obj) {
              res_obj = NULL;
              continue;
          }
          ret = ww_mutex_lock(&entry->obj->lock, ctx);
          if (ret < 0) {
              contended_entry = entry;
              goto err;
          }
      }

      ww_acquire_done(ctx);
      return 0;

  err:
      list_for_each_entry_continue_reverse (entry, list, head)
          ww_mutex_unlock(&entry->obj->lock);

      if (res_obj)
          ww_mutex_unlock(&res_obj->lock);

      if (ret == -EDEADLK) {
          /* we lost out in a seqno race, lock and retry.. */
          ww_mutex_lock_slow(&contended_entry->obj->lock, ctx);
          res_obj = contended_entry->obj;
          goto retry;
      }
      ww_acquire_fini(ctx);

      return ret;
  }

Method 2, using a list in execbuf->buffers that can be reordered. Same
semantics of duplicate entry detection using -EALREADY as method 1 above.
But the
list-reordering allows for a bit more idiomatic code::

  int lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
  {
      struct obj_entry *entry, *entry2;
      int ret;

      ww_acquire_init(ctx, &ww_class);

      list_for_each_entry (entry, list, head) {
          ret = ww_mutex_lock(&entry->obj->lock, ctx);
          if (ret < 0) {
              entry2 = entry;

              list_for_each_entry_continue_reverse (entry2, list, head)
                  ww_mutex_unlock(&entry2->obj->lock);

              if (ret != -EDEADLK) {
                  ww_acquire_fini(ctx);
                  return ret;
              }

              /* we lost out in a seqno race, lock and retry.. */
              ww_mutex_lock_slow(&entry->obj->lock, ctx);

              /*
               * Move buf to head of the list, this will point
               * buf->next to the first unlocked entry,
               * restarting the for loop.
               */
              list_del(&entry->head);
              list_add(&entry->head, list);
          }
      }

      ww_acquire_done(ctx);
      return 0;
  }

Unlocking works the same way for both methods #1 and #2::

  void unlock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
  {
      struct obj_entry *entry;

      list_for_each_entry (entry, list, head)
          ww_mutex_unlock(&entry->obj->lock);

      ww_acquire_fini(ctx);
  }

Method 3 is useful if the list of objects is constructed ad-hoc and not
upfront, e.g. when adjusting edges in a graph where each node has its own
ww_mutex lock, and edges can only be changed when holding the locks of all
involved nodes. w/w mutexes are a natural fit for such a case for two reasons:

- They can handle lock-acquisition in any order which allows us to start walking
  a graph from a starting point and then iteratively discovering new edges and
  locking down the nodes those edges connect to.
- Due to the -EALREADY return code signalling that a given object is already
  held there's no need for additional book-keeping to break cycles in the graph
  or keep track of which locks are already held (when using more than one node
  as a starting point).

Note that this approach differs in two important ways from the above methods:

- Since the list of objects is dynamically constructed (and might very well be
  different when retrying due to hitting the -EDEADLK die condition) there's
  no need to keep any object on a persistent list when it's not locked. We can
  therefore move the list_head into the object itself.
- On the other hand the dynamic object list construction also means that the
  -EALREADY return code can't be propagated.

Note also that methods #1 and #2 can be combined with method #3, e.g. to first
lock a list of starting nodes (passed in from userspace) using one of the above
methods, and then lock any additional objects affected by the operations using
method #3 below. The backoff/retry procedure will be a bit more involved, since
when the dynamic locking step hits -EDEADLK we also need to unlock all the
objects acquired with the fixed list. But the w/w mutex debug checks will catch
any interface misuse for these cases.

Also, method 3 can't fail the lock acquisition step since it doesn't return
-EALREADY. Of course this would be different when using the _interruptible
variants, but that's outside of the scope of these examples here::

  struct obj {
      struct ww_mutex ww_mutex;
      struct list_head locked_list;
  };

  static DEFINE_WW_CLASS(ww_class);

  void __unlock_objs(struct list_head *list)
  {
      struct obj *entry, *temp;

      list_for_each_entry_safe (entry, temp, list, locked_list) {
          /* need to do that before unlocking, since only the current
           * lock holder is allowed to use the object */
          list_del(&entry->locked_list);
          ww_mutex_unlock(&entry->ww_mutex);
      }
  }

  void lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
  {
      struct obj *obj;
      int ret;

      ww_acquire_init(ctx, &ww_class);

  retry:
      /* re-init loop start state */
      loop {
          /* magic code which walks over a graph and decides which
           * objects to lock */

          ret = ww_mutex_lock(&obj->ww_mutex, ctx);
          if (ret == -EALREADY) {
              /* we have that one already, get to the next object */
              continue;
          }
          if (ret == -EDEADLK) {
              __unlock_objs(list);

              ww_mutex_lock_slow(&obj->ww_mutex, ctx);
              list_add(&obj->locked_list, list);
              goto retry;
          }

          /* locked a new object, add it to the list */
          list_add_tail(&obj->locked_list, list);
      }

      ww_acquire_done(ctx);
  }

  void unlock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
  {
      __unlock_objs(list);
      ww_acquire_fini(ctx);
  }

Method 4: Only lock one single object. In that case deadlock detection and
prevention is obviously overkill, since with grabbing just one lock you can't
produce a deadlock within just one class. To simplify this case the w/w mutex
api can be used with a NULL context.
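For completeness, a minimal sketch of the NULL-context pattern. To keep it
compilable outside the kernel, the ww_mutex API is mocked here on top of a
plain pthread mutex; the stub types and the frob_one() helper are assumptions
of this sketch, not kernel code, but the calling convention mirrors the real
interface.

```c
#include <pthread.h>

/* User-space stand-ins so the sketch is self-contained; in the kernel
 * these come from <linux/ww_mutex.h>. */
struct ww_acquire_ctx;                  /* opaque, unused here */
struct ww_mutex { pthread_mutex_t base; };

static int ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
{
        /* With ctx == NULL there is no transaction and hence no
         * -EDEADLK: the call can only block, like a normal mutex. */
        (void)ctx;
        return pthread_mutex_lock(&lock->base);
}

static void ww_mutex_unlock(struct ww_mutex *lock)
{
        pthread_mutex_unlock(&lock->base);
}

struct obj {
        struct ww_mutex lock;
        int data;
};

/* Hypothetical helper touching a single object: no acquire context,
 * no ticket, no backoff loop needed. */
static void frob_one(struct obj *obj)
{
        ww_mutex_lock(&obj->lock, NULL);
        obj->data++;
        ww_mutex_unlock(&obj->lock);
}
```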

Implementation Details
----------------------

Design:
^^^^^^^

  ww_mutex currently encapsulates a struct mutex; this means no extra overhead
  for normal mutex locks, which are far more common. As such there is only a
  small increase in code size if wait/wound mutexes are not used.

  We maintain the following invariants for the wait list:

  (1) Waiters with an acquire context are sorted by stamp order; waiters
      without an acquire context are interspersed in FIFO order.
  (2) For Wait-Die, among waiters with contexts, only the first one can have
      other locks acquired already (ctx->acquired > 0). Note that this waiter
      may come after other waiters without contexts in the list.

  The Wound-Wait preemption is implemented with a lazy-preemption scheme:
  the wounded status of the transaction is checked only when there is
  contention for a new lock and hence a true chance of deadlock. In that
  situation, if the transaction is wounded, it backs off, clears the
  wounded status and retries.
  A great benefit of implementing preemption in
  this way is that the wounded transaction can identify a contending lock to
  wait for before restarting the transaction. Just blindly restarting the
  transaction would likely make the transaction end up in a situation where
  it would have to back off again.

  In general, not much contention is expected. The locks are typically used to
  serialize access to resources for devices, and optimization focus should
  therefore be directed towards the uncontended cases.

Lockdep:
^^^^^^^^

  Special care has been taken to warn for as many cases of api abuse
  as possible. Some common api abuses will be caught with
  CONFIG_DEBUG_MUTEXES, but CONFIG_PROVE_LOCKING is recommended.

  Some of the errors which will be warned about:
   - Forgetting to call ww_acquire_fini or ww_acquire_init.
   - Attempting to lock more mutexes after ww_acquire_done.
   - Attempting to lock the wrong mutex after -EDEADLK and
     unlocking all mutexes.
   - Attempting to lock the right mutex after -EDEADLK,
     before unlocking all mutexes.
   - Calling ww_mutex_lock_slow before -EDEADLK was returned.
   - Unlocking mutexes with the wrong unlock function.
   - Calling one of the ww_acquire_* twice on the same context.
   - Using a different ww_class for the mutex than for the ww_acquire_ctx.
   - Normal lockdep errors that can result in deadlocks.

  Some of the lockdep errors that can result in deadlocks:
   - Calling ww_acquire_init to initialize a second ww_acquire_ctx before
     having called ww_acquire_fini on the first.
   - 'normal' deadlocks that can occur.

FIXME:
  Update this section once we have the TASK_DEADLOCK task state flag magic
  implemented.