======================================
Wound/Wait Deadlock-Proof Mutex Design
======================================

Please read mutex-design.rst first, as it applies to wait/wound mutexes too.

Motivation for WW-Mutexes
-------------------------

GPUs do operations that commonly involve many buffers.  Those buffers
can be shared across contexts/processes, exist in different memory
domains (for example VRAM vs system memory), and so on.  And with
PRIME / dmabuf, they can even be shared across devices.  So there are
a handful of situations where the driver needs to wait for buffers to
become ready.  If you think about this in terms of waiting on a buffer
mutex for it to become available, this presents a problem because
there is no way to guarantee that buffers appear in an execbuf/batch in
the same order in all contexts.  That is directly under control of
userspace, and a result of the sequence of GL calls that an application
makes, which results in the potential for deadlock.  The problem gets
more complex when you consider that the kernel may need to migrate the
buffer(s) into VRAM before the GPU operates on the buffer(s), which
may in turn require evicting some other buffers (and you don't want to
evict other buffers which are already queued up to the GPU), but for a
simplified understanding of the problem you can ignore this.

The algorithm that the TTM graphics subsystem came up with for dealing with
this problem is quite simple.  For each group of buffers (execbuf) that need
to be locked, the caller would be assigned a unique reservation id/ticket,
from a global counter.  In case of deadlock while locking all the buffers
associated with an execbuf, the one with the lowest reservation ticket (i.e.
the oldest task) wins, and the one with the higher reservation id (i.e. the
younger task) unlocks all of the buffers that it has already locked, and then
tries again.

In the RDBMS literature, a reservation ticket is associated with a transaction,
and the deadlock handling approach is called Wait-Die. The name is based on
the actions of a locking thread when it encounters an already locked mutex.
If the transaction holding the lock is younger, the locking transaction waits.
If the transaction holding the lock is older, the locking transaction backs off
and dies. Hence Wait-Die.
There is also another algorithm called Wound-Wait:
If the transaction holding the lock is younger, the locking transaction
wounds the transaction holding the lock, requesting it to die.
If the transaction holding the lock is older, it waits for the other
transaction. Hence Wound-Wait.
The two algorithms are both fair in that a transaction will eventually succeed.
However, the Wound-Wait algorithm is typically stated to generate fewer backoffs
compared to Wait-Die, but is, on the other hand, associated with more work than
Wait-Die when recovering from a backoff. Wound-Wait is also a preemptive
algorithm in that transactions are wounded by other transactions, and that
requires a reliable way to pick up the wounded condition and preempt the
running transaction. Note that this is not the same as process preemption. A
Wound-Wait transaction is considered preempted when it dies (returning
-EDEADLK) following a wound.
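
The two policies differ only in the decision taken on contention. The
following is an illustrative sketch, not the kernel's implementation:
assuming a lower stamp means an older transaction, this is what happens
when transaction "me" finds a lock held by transaction "holder"::

  if (wait_die) {
	if (me->stamp < holder->stamp)
		wait();			/* older: wait for the lock */
	else
		back_off_and_die();	/* younger: back off, return -EDEADLK */
  } else { /* Wound-Wait */
	if (me->stamp < holder->stamp)
		wound(holder);		/* older: ask the holder to back off */
	wait();				/* then wait in either case */
  }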

Concepts
--------

Compared to normal mutexes, two additional concepts/objects show up in the
lock interface for w/w mutexes:

Acquire context: To ensure eventual forward progress it is important that a task
trying to acquire locks doesn't grab a new reservation id, but keeps the one it
acquired when starting the lock acquisition. This ticket is stored in the
acquire context. Furthermore the acquire context keeps track of debugging state
to catch w/w mutex interface abuse. An acquire context represents a
transaction.
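
The resulting lifecycle, as a minimal sketch, looks like this::

  struct ww_acquire_ctx ctx;

  ww_acquire_init(&ctx, &ww_class);
  /* ... acquire all w/w mutexes of the transaction, passing &ctx ... */
  ww_acquire_done(&ctx);	/* optional: marks the end of the acquire phase */
  /* ... use the locked objects ... */
  /* ... ww_mutex_unlock() each of them ... */
  ww_acquire_fini(&ctx);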

W/w class: In contrast to normal mutexes the lock class needs to be explicit for
w/w mutexes, since it is required to initialize the acquire context. The lock
class also specifies what algorithm to use, Wound-Wait or Wait-Die.

Furthermore there are three different classes of w/w lock acquire functions:

* Normal lock acquisition with a context, using ww_mutex_lock.

* Slowpath lock acquisition on the contending lock, used by the task that just
  killed its transaction after having dropped all already acquired locks.
  These functions have the _slow postfix.

  From a simple semantics point-of-view the _slow functions are not strictly
  required, since simply calling the normal ww_mutex_lock functions on the
  contending lock (after having dropped all other already acquired locks) will
  work correctly. After all, if no other ww mutex has been acquired yet there's
  no deadlock potential and hence the ww_mutex_lock call will block and not
  prematurely return -EDEADLK. The advantage of the _slow functions is in
  interface safety:

  - ww_mutex_lock has a __must_check int return type, whereas ww_mutex_lock_slow
    has a void return type. Note that since ww mutex code needs loops/retries
    anyway the __must_check doesn't result in spurious warnings, even though the
    very first lock operation can never fail.
  - When full debugging is enabled ww_mutex_lock_slow checks that all acquired
    ww mutexes have been released (preventing deadlocks) and makes sure that we
    block on the contending lock (preventing spinning through the -EDEADLK
    slowpath until the contended lock can be acquired).

* Functions to only acquire a single w/w mutex, which results in the exact same
  semantics as a normal mutex. This is done by calling ww_mutex_lock with a NULL
  context.

  Again this is not strictly required. But often you only want to acquire a
  single lock in which case it's pointless to set up an acquire context (and so
  better to avoid grabbing a deadlock avoidance ticket).

Of course, all the usual variants for handling wake-ups due to signals are also
provided.
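
For example, the interruptible variants can additionally fail with -EINTR
when a signal is received; a brief sketch (variables and labels here are
illustrative)::

  ret = ww_mutex_lock_interruptible(&obj->lock, ctx);
  if (ret)	/* -EINTR, -EDEADLK or -EALREADY */
	goto err;

  /* unlike ww_mutex_lock_slow, the slowpath variant returns int */
  ret = ww_mutex_lock_slow_interruptible(&contended->lock, ctx);
  if (ret)	/* only -EINTR is possible here */
	goto err_fini;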

Usage
-----

The algorithm (Wait-Die vs Wound-Wait) is chosen by using either
DEFINE_WW_CLASS() (Wound-Wait) or DEFINE_WD_CLASS() (Wait-Die).
As a rough rule of thumb, use Wound-Wait iff you
expect the number of simultaneous competing transactions to be typically small,
and you want to reduce the number of rollbacks.
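
For instance (the class names here are just illustrative)::

  static DEFINE_WW_CLASS(slot_ww_class);	/* uses Wound-Wait */
  static DEFINE_WD_CLASS(slot_wd_class);	/* uses Wait-Die */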

There are three different ways to acquire locks within the same w/w class.
Common definitions for methods #1 and #2::

  static DEFINE_WW_CLASS(ww_class);

  struct obj {
	struct ww_mutex lock;
	/* obj data */
  };

  struct obj_entry {
	struct list_head head;
	struct obj *obj;
  };

Method 1, using a list in execbuf->buffers that's not allowed to be reordered.
This is useful if a list of required objects is already tracked somewhere.
Furthermore the lock helper can propagate the -EALREADY return code back to
the caller as a signal that an object is on the list twice. This is useful if
the list is constructed from userspace input and the ABI requires userspace to
not have duplicate entries (e.g. for a gpu commandbuffer submission ioctl)::

  int lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
  {
	struct obj *res_obj = NULL;
	struct obj_entry *contended_entry = NULL;
	struct obj_entry *entry;
	int ret;

	ww_acquire_init(ctx, &ww_class);

  retry:
	list_for_each_entry (entry, list, head) {
		if (entry->obj == res_obj) {
			res_obj = NULL;
			continue;
		}
		ret = ww_mutex_lock(&entry->obj->lock, ctx);
		if (ret < 0) {
			contended_entry = entry;
			goto err;
		}
	}

	ww_acquire_done(ctx);
	return 0;

  err:
	list_for_each_entry_continue_reverse (entry, list, head)
		ww_mutex_unlock(&entry->obj->lock);

	if (res_obj)
		ww_mutex_unlock(&res_obj->lock);

	if (ret == -EDEADLK) {
		/* we lost out in a seqno race, lock and retry.. */
		ww_mutex_lock_slow(&contended_entry->obj->lock, ctx);
		res_obj = contended_entry->obj;
		goto retry;
	}
	ww_acquire_fini(ctx);

	return ret;
  }

Method 2, using a list in execbuf->buffers that can be reordered. The same
duplicate-entry detection semantics using -EALREADY as in method 1 above apply.
But the list-reordering allows for a bit more idiomatic code::

  int lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
  {
	struct obj_entry *entry, *entry2;
	int ret;

	ww_acquire_init(ctx, &ww_class);

	list_for_each_entry (entry, list, head) {
		ret = ww_mutex_lock(&entry->obj->lock, ctx);
		if (ret < 0) {
			entry2 = entry;

			list_for_each_entry_continue_reverse (entry2, list, head)
				ww_mutex_unlock(&entry2->obj->lock);

			if (ret != -EDEADLK) {
				ww_acquire_fini(ctx);
				return ret;
			}

			/* we lost out in a seqno race, lock and retry.. */
			ww_mutex_lock_slow(&entry->obj->lock, ctx);

			/*
			 * Move buf to head of the list, this will point
			 * buf->next to the first unlocked entry,
			 * restarting the for loop.
			 */
			list_del(&entry->head);
			list_add(&entry->head, list);
		}
	}

	ww_acquire_done(ctx);
	return 0;
  }

Unlocking works the same way for both methods #1 and #2::

  void unlock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
  {
	struct obj_entry *entry;

	list_for_each_entry (entry, list, head)
		ww_mutex_unlock(&entry->obj->lock);

	ww_acquire_fini(ctx);
  }
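
Tying the two together, a caller (with hypothetical names for the execbuf
structure and its buffer list) might look like this sketch::

  int submit_execbuf(struct execbuf *exec)
  {
	struct ww_acquire_ctx ctx;
	int ret;

	ret = lock_objs(&exec->buffers, &ctx);
	if (ret)	/* e.g. -EALREADY for a duplicate entry */
		return ret;

	/* ... queue up the buffers to the GPU ... */

	unlock_objs(&exec->buffers, &ctx);
	return 0;
  }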

Method 3 is useful if the list of objects is constructed ad-hoc and not upfront,
e.g. when adjusting edges in a graph where each node has its own ww_mutex lock,
and edges can only be changed when holding the locks of all involved nodes. w/w
mutexes are a natural fit for such a case for two reasons:

- They can handle lock-acquisition in any order, which allows us to start walking
  a graph from a starting point and then iteratively discovering new edges and
  locking down the nodes those edges connect to.
- Due to the -EALREADY return code signalling that a given object is already
  held there's no need for additional book-keeping to break cycles in the graph
  or keep track of which locks are already held (when using more than one node
  as a starting point).

Note that this approach differs in two important ways from the above methods:

- Since the list of objects is dynamically constructed (and might very well be
  different when retrying due to hitting the -EDEADLK die condition) there's
  no need to keep any object on a persistent list when it's not locked. We can
  therefore move the list_head into the object itself.
- On the other hand the dynamic object list construction also means that the
  -EALREADY return code can't be propagated.

Note also that methods #1 and #2 can be combined with method #3, e.g. to first
lock a list of starting nodes (passed in from userspace) using one of the above
methods, and then lock any additional objects affected by the operations using
method #3 below. The backoff/retry procedure will be a bit more involved, since
when the dynamic locking step hits -EDEADLK we also need to unlock all the
objects acquired with the fixed list; see the sketch below. But the w/w mutex
debug checks will catch any interface misuse for these cases.
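
A rough sketch of such a combined backoff, assuming hypothetical helpers
lock_fixed()/unlock_fixed() and lock_extra() that implement the locking
steps of methods #1 and #3 on an already initialized context, and that
drop their own locks before reporting failure::

  int lock_all(struct list_head *fixed, struct ww_acquire_ctx *ctx)
  {
	int ret;

	ww_acquire_init(ctx, &ww_class);

  retry:
	ret = lock_fixed(fixed, ctx);	/* method #1 body, no init/fini */
	if (ret) {
		/* e.g. -EALREADY; lock_fixed dropped its locks already */
		ww_acquire_fini(ctx);
		return ret;
	}

	if (lock_extra(ctx) == -EDEADLK) {	/* method #3 body */
		/* lock_extra dropped its own locks; drop the fixed list too */
		unlock_fixed(fixed);
		/*
		 * Simply retrying (instead of a _slow call on the contended
		 * lock) is correct but less efficient, as explained for the
		 * _slow functions above.
		 */
		goto retry;
	}

	ww_acquire_done(ctx);
	return 0;
  }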

Also, method 3 can't fail the lock acquisition step since it doesn't return
-EALREADY. Of course this would be different when using the _interruptible
variants, but that's outside of the scope of these examples here::

  struct obj {
	struct ww_mutex ww_mutex;
	struct list_head locked_list;
  };

  static DEFINE_WW_CLASS(ww_class);

  void __unlock_objs(struct list_head *list)
  {
	struct obj *entry, *temp;

	list_for_each_entry_safe (entry, temp, list, locked_list) {
		/*
		 * Must be done before unlocking, since only the current
		 * lock holder is allowed to use the object.
		 */
		list_del(&entry->locked_list);
		ww_mutex_unlock(&entry->ww_mutex);
	}
  }

  void lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
  {
	struct obj *obj;
	int ret;

	ww_acquire_init(ctx, &ww_class);

  retry:
	/* re-init loop start state */
	loop {
		/* magic code which walks over a graph and decides which objects
		 * to lock */

		ret = ww_mutex_lock(&obj->ww_mutex, ctx);
		if (ret == -EALREADY) {
			/* we have that one already, get to the next object */
			continue;
		}
		if (ret == -EDEADLK) {
			__unlock_objs(list);

			ww_mutex_lock_slow(&obj->ww_mutex, ctx);
			list_add(&obj->locked_list, list);
			goto retry;
		}

		/* locked a new object, add it to the list */
		list_add_tail(&obj->locked_list, list);
	}

	ww_acquire_done(ctx);
  }

  void unlock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
  {
	__unlock_objs(list);
	ww_acquire_fini(ctx);
  }

Method 4: Only lock one single object. In that case deadlock detection and
prevention is obviously overkill, since with grabbing just one lock you can't
produce a deadlock within just one class. To simplify this case the w/w mutex
api can be used with a NULL context.
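
A minimal sketch of this case::

  int ret;

  ret = ww_mutex_lock(&obj->lock, NULL);	/* behaves like mutex_lock() */
  /* with a NULL context, ret is always 0 here */

  /* ... exclusive access to obj ... */

  ww_mutex_unlock(&obj->lock);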

Implementation Details
----------------------

Design:
^^^^^^^

  ww_mutex currently encapsulates a struct mutex; this means no extra overhead
  for normal mutex locks, which are far more common. As such there is only a
  small increase in code size if wait/wound mutexes are not used.

  We maintain the following invariants for the wait list:

  (1) Waiters with an acquire context are sorted by stamp order; waiters
      without an acquire context are interspersed in FIFO order.
  (2) For Wait-Die, among waiters with contexts, only the first one can have
      other locks acquired already (ctx->acquired > 0). Note that this waiter
      may come after other waiters without contexts in the list.

  The Wound-Wait preemption is implemented with a lazy-preemption scheme:
  The wounded status of the transaction is checked only when there is
  contention for a new lock and hence a true chance of deadlock. In that
  situation, if the transaction is wounded, it backs off, clears the
  wounded status and retries. A great benefit of implementing preemption in
  this way is that the wounded transaction can identify a contending lock to
  wait for before restarting the transaction. Just blindly restarting the
  transaction would likely make the transaction end up in a situation where
  it would have to back off again.

  In general, not much contention is expected. The locks are typically used to
  serialize access to resources for devices, and optimization focus should
  therefore be directed towards the uncontended cases.

Lockdep:
^^^^^^^^

  Special care has been taken to warn for as many cases of api abuse
  as possible. Some common api abuses will be caught with
  CONFIG_DEBUG_MUTEXES, but CONFIG_PROVE_LOCKING is recommended.

  Some of the errors which will be warned about:
   - Forgetting to call ww_acquire_fini or ww_acquire_init.
   - Attempting to lock more mutexes after ww_acquire_done.
   - Attempting to lock the wrong mutex after -EDEADLK and
     unlocking all mutexes.
   - Attempting to lock the right mutex after -EDEADLK,
     before unlocking all mutexes.
   - Calling ww_mutex_lock_slow before -EDEADLK was returned.
   - Unlocking mutexes with the wrong unlock function.
   - Calling one of the ww_acquire_* functions twice on the same context.
   - Using a different ww_class for the mutex than for the ww_acquire_ctx.
   - Normal lockdep errors that can result in deadlocks.

  Some of the lockdep errors that can result in deadlocks:
   - Calling ww_acquire_init to initialize a second ww_acquire_ctx before
     having called ww_acquire_fini on the first.
   - 'normal' deadlocks that can occur.

FIXME:
  Update this section once we have the TASK_DEADLOCK task state flag magic
  implemented.