======================================
Wound/Wait Deadlock-Proof Mutex Design
======================================

Please read mutex-design.rst first, as it applies to wait/wound mutexes too.

Motivation for WW-Mutexes
-------------------------

GPUs do operations that commonly involve many buffers.  Those buffers
can be shared across contexts/processes, exist in different memory
domains (for example VRAM vs system memory), and so on.  And with
PRIME / dmabuf, they can even be shared across devices.  So there are
a handful of situations where the driver needs to wait for buffers to
become ready.  If you think about this in terms of waiting on a buffer
mutex for it to become available, this presents a problem because
there is no way to guarantee that buffers appear in an execbuf/batch in
the same order in all contexts.  That is directly under control of
userspace, and a result of the sequence of GL calls that an application
makes, which results in the potential for deadlock.  The problem gets
more complex when you consider that the kernel may need to migrate the
buffer(s) into VRAM before the GPU operates on the buffer(s), which
may in turn require evicting some other buffers (and you don't want to
evict other buffers which are already queued up to the GPU), but for a
simplified understanding of the problem you can ignore this.

The algorithm that the TTM graphics subsystem came up with for dealing with
this problem is quite simple.  For each group of buffers (execbuf) that need
to be locked, the caller would be assigned a unique reservation id/ticket,
from a global counter.  In case of deadlock while locking all the buffers
associated with an execbuf, the one with the lowest reservation ticket (i.e.
the oldest task) wins, and the one with the higher reservation id (i.e. the
younger task) unlocks all of the buffers that it has already locked, and then
tries again.

In the RDBMS literature, a reservation ticket is associated with a transaction,
and the deadlock handling approach is called Wait-Die. The name is based on
the actions of a locking thread when it encounters an already locked mutex.
If the transaction holding the lock is younger, the locking transaction waits.
If the transaction holding the lock is older, the locking transaction backs off
and dies. Hence Wait-Die.
There is also another algorithm called Wound-Wait:
If the transaction holding the lock is younger, the locking transaction
wounds the transaction holding the lock, requesting it to die.
If the transaction holding the lock is older, it waits for the other
transaction. Hence Wound-Wait.
The two algorithms are both fair in that a transaction will eventually succeed.
However, the Wound-Wait algorithm is typically stated to generate fewer backoffs
compared to Wait-Die, but is, on the other hand, associated with more work than
Wait-Die when recovering from a backoff. Wound-Wait is also a preemptive
algorithm in that transactions are wounded by other transactions, and that
requires a reliable way to pick up the wounded condition and preempt the
running transaction. Note that this is not the same as process preemption. A
Wound-Wait transaction is considered preempted when it dies (returning
-EDEADLK) following a wound.
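
The two policies differ only in the decision taken on contention. The
following is an illustrative sketch, not the kernel's implementation:
assuming a lower stamp means an older transaction, this is what happens
when transaction "me" finds a lock held by transaction "holder"::

  if (wait_die) {
	if (me->stamp < holder->stamp)
		wait();			/* older: wait for the lock */
	else
		back_off_and_die();	/* younger: back off, return -EDEADLK */
  } else { /* Wound-Wait */
	if (me->stamp < holder->stamp)
		wound(holder);		/* older: ask the holder to back off */
	wait();				/* then wait in either case */
  }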

Concepts
--------

Compared to normal mutexes, two additional concepts/objects show up in the
lock interface for w/w mutexes:

Acquire context: To ensure eventual forward progress it is important that a task
trying to acquire locks doesn't grab a new reservation id, but keeps the one it
acquired when starting the lock acquisition. This ticket is stored in the
acquire context. Furthermore the acquire context keeps track of debugging state
to catch w/w mutex interface abuse. An acquire context represents a
transaction.
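
The resulting lifecycle, as a minimal sketch, looks like this::

  struct ww_acquire_ctx ctx;

  ww_acquire_init(&ctx, &ww_class);
  /* ... acquire all w/w mutexes of the transaction, passing &ctx ... */
  ww_acquire_done(&ctx);	/* optional: marks the end of the acquire phase */
  /* ... use the locked objects ... */
  /* ... ww_mutex_unlock() each of them ... */
  ww_acquire_fini(&ctx);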

W/w class: In contrast to normal mutexes the lock class needs to be explicit for
w/w mutexes, since it is required to initialize the acquire context. The lock
class also specifies what algorithm to use, Wound-Wait or Wait-Die.

Furthermore there are three different classes of w/w lock acquire functions:

* Normal lock acquisition with a context, using ww_mutex_lock.

* Slowpath lock acquisition on the contending lock, used by the task that just
  killed its transaction after having dropped all already acquired locks.
  These functions have the _slow postfix.

  From a simple semantics point-of-view the _slow functions are not strictly
  required, since simply calling the normal ww_mutex_lock functions on the
  contending lock (after having dropped all other already acquired locks) will
  work correctly. After all, if no other ww mutex has been acquired yet there's
  no deadlock potential and hence the ww_mutex_lock call will block and not
  prematurely return -EDEADLK. The advantage of the _slow functions is in
  interface safety:

  - ww_mutex_lock has a __must_check int return type, whereas ww_mutex_lock_slow
    has a void return type. Note that since ww mutex code needs loops/retries
    anyway the __must_check doesn't result in spurious warnings, even though the
    very first lock operation can never fail.
  - When full debugging is enabled ww_mutex_lock_slow checks that all acquired
    ww mutexes have been released (preventing deadlocks) and makes sure that we
    block on the contending lock (preventing spinning through the -EDEADLK
    slowpath until the contended lock can be acquired).

* Functions to only acquire a single w/w mutex, which results in the exact same
  semantics as a normal mutex. This is done by calling ww_mutex_lock with a NULL
  context.

  Again this is not strictly required. But often you only want to acquire a
  single lock in which case it's pointless to set up an acquire context (and so
  better to avoid grabbing a deadlock avoidance ticket).

Of course, all the usual variants for handling wake-ups due to signals are also
provided.
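
For example, the interruptible variants can additionally fail with -EINTR
when a signal is received; a brief sketch (variables and labels here are
illustrative)::

  ret = ww_mutex_lock_interruptible(&obj->lock, ctx);
  if (ret)	/* -EINTR, -EDEADLK or -EALREADY */
	goto err;

  /* unlike ww_mutex_lock_slow, the slowpath variant returns int */
  ret = ww_mutex_lock_slow_interruptible(&contended->lock, ctx);
  if (ret)	/* only -EINTR is possible here */
	goto err_fini;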

Usage
-----

The algorithm (Wait-Die vs Wound-Wait) is chosen by using either
DEFINE_WW_CLASS() (Wound-Wait) or DEFINE_WD_CLASS() (Wait-Die).
As a rough rule of thumb, use Wound-Wait iff you
expect the number of simultaneous competing transactions to be typically small,
and you want to reduce the number of rollbacks.
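
For instance (the class names here are just illustrative)::

  static DEFINE_WW_CLASS(slot_ww_class);	/* uses Wound-Wait */
  static DEFINE_WD_CLASS(slot_wd_class);	/* uses Wait-Die */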

There are three different ways to acquire locks within the same w/w class.
Common definitions for methods #1 and #2::

  static DEFINE_WW_CLASS(ww_class);

  struct obj {
	struct ww_mutex lock;
	/* obj data */
  };

  struct obj_entry {
	struct list_head head;
	struct obj *obj;
  };

Method 1, using a list in execbuf->buffers that's not allowed to be reordered.
This is useful if a list of required objects is already tracked somewhere.
Furthermore the lock helper can propagate the -EALREADY return code back to
the caller as a signal that an object is on the list twice. This is useful if
the list is constructed from userspace input and the ABI requires userspace to
not have duplicate entries (e.g. for a gpu commandbuffer submission ioctl)::

  int lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
  {
	struct obj *res_obj = NULL;
	struct obj_entry *contended_entry = NULL;
	struct obj_entry *entry;
	int ret;

	ww_acquire_init(ctx, &ww_class);

  retry:
	list_for_each_entry (entry, list, head) {
		if (entry->obj == res_obj) {
			res_obj = NULL;
			continue;
		}
		ret = ww_mutex_lock(&entry->obj->lock, ctx);
		if (ret < 0) {
			contended_entry = entry;
			goto err;
		}
	}

	ww_acquire_done(ctx);
	return 0;

  err:
	list_for_each_entry_continue_reverse (entry, list, head)
		ww_mutex_unlock(&entry->obj->lock);

	if (res_obj)
		ww_mutex_unlock(&res_obj->lock);

	if (ret == -EDEADLK) {
		/* we lost out in a seqno race, lock and retry.. */
		ww_mutex_lock_slow(&contended_entry->obj->lock, ctx);
		res_obj = contended_entry->obj;
		goto retry;
	}
	ww_acquire_fini(ctx);

	return ret;
  }

Method 2, using a list in execbuf->buffers that can be reordered. The same
duplicate-entry detection semantics using -EALREADY as in method 1 above apply.
But the list-reordering allows for a bit more idiomatic code::

  int lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
  {
	struct obj_entry *entry, *entry2;
	int ret;

	ww_acquire_init(ctx, &ww_class);

	list_for_each_entry (entry, list, head) {
		ret = ww_mutex_lock(&entry->obj->lock, ctx);
		if (ret < 0) {
			entry2 = entry;

			list_for_each_entry_continue_reverse (entry2, list, head)
				ww_mutex_unlock(&entry2->obj->lock);

			if (ret != -EDEADLK) {
				ww_acquire_fini(ctx);
				return ret;
			}

			/* we lost out in a seqno race, lock and retry.. */
			ww_mutex_lock_slow(&entry->obj->lock, ctx);

			/*
			 * Move buf to head of the list, this will point
			 * buf->next to the first unlocked entry,
			 * restarting the for loop.
			 */
			list_del(&entry->head);
			list_add(&entry->head, list);
		}
	}

	ww_acquire_done(ctx);
	return 0;
  }

Unlocking works the same way for both methods #1 and #2::

  void unlock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
  {
	struct obj_entry *entry;

	list_for_each_entry (entry, list, head)
		ww_mutex_unlock(&entry->obj->lock);

	ww_acquire_fini(ctx);
  }
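
Tying the two together, a caller (with hypothetical names for the execbuf
structure and its buffer list) might look like this sketch::

  int submit_execbuf(struct execbuf *exec)
  {
	struct ww_acquire_ctx ctx;
	int ret;

	ret = lock_objs(&exec->buffers, &ctx);
	if (ret)	/* e.g. -EALREADY for a duplicate entry */
		return ret;

	/* ... queue up the buffers to the GPU ... */

	unlock_objs(&exec->buffers, &ctx);
	return 0;
  }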

Method 3 is useful if the list of objects is constructed ad-hoc and not upfront,
e.g. when adjusting edges in a graph where each node has its own ww_mutex lock,
and edges can only be changed when holding the locks of all involved nodes. w/w
mutexes are a natural fit for such a case for two reasons:

- They can handle lock-acquisition in any order, which allows us to start walking
  a graph from a starting point and then iteratively discovering new edges and
  locking down the nodes those edges connect to.
- Due to the -EALREADY return code signalling that a given object is already
  held there's no need for additional book-keeping to break cycles in the graph
  or keep track of which locks are already held (when using more than one node
  as a starting point).

Note that this approach differs in two important ways from the above methods:

- Since the list of objects is dynamically constructed (and might very well be
  different when retrying due to hitting the -EDEADLK die condition) there's
  no need to keep any object on a persistent list when it's not locked. We can
  therefore move the list_head into the object itself.
- On the other hand the dynamic object list construction also means that the
  -EALREADY return code can't be propagated.

Note also that methods #1 and #2 can be combined with method #3, e.g. to first
lock a list of starting nodes (passed in from userspace) using one of the above
methods, and then lock any additional objects affected by the operations using
method #3 below. The backoff/retry procedure will be a bit more involved, since
when the dynamic locking step hits -EDEADLK we also need to unlock all the
objects acquired with the fixed list; see the sketch below. But the w/w mutex
debug checks will catch any interface misuse for these cases.
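
A rough sketch of such a combined backoff, assuming hypothetical helpers
lock_fixed()/unlock_fixed() and lock_extra() that implement the locking
steps of methods #1 and #3 on an already initialized context, and that
drop their own locks before reporting failure::

  int lock_all(struct list_head *fixed, struct ww_acquire_ctx *ctx)
  {
	int ret;

	ww_acquire_init(ctx, &ww_class);

  retry:
	ret = lock_fixed(fixed, ctx);	/* method #1 body, no init/fini */
	if (ret) {
		/* e.g. -EALREADY; lock_fixed dropped its locks already */
		ww_acquire_fini(ctx);
		return ret;
	}

	if (lock_extra(ctx) == -EDEADLK) {	/* method #3 body */
		/* lock_extra dropped its own locks; drop the fixed list too */
		unlock_fixed(fixed);
		/*
		 * Simply retrying (instead of a _slow call on the contended
		 * lock) is correct but less efficient, as explained for the
		 * _slow functions above.
		 */
		goto retry;
	}

	ww_acquire_done(ctx);
	return 0;
  }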

Also, method 3 can't fail the lock acquisition step since it doesn't return
-EALREADY. Of course this would be different when using the _interruptible
variants, but that's outside of the scope of these examples here::

  struct obj {
	struct ww_mutex ww_mutex;
	struct list_head locked_list;
  };

  static DEFINE_WW_CLASS(ww_class);

  void __unlock_objs(struct list_head *list)
  {
	struct obj *entry, *temp;

	list_for_each_entry_safe (entry, temp, list, locked_list) {
		/*
		 * Must be done before unlocking, since only the current
		 * lock holder is allowed to use the object.
		 */
		list_del(&entry->locked_list);
		ww_mutex_unlock(&entry->ww_mutex);
	}
  }

  void lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
  {
	struct obj *obj;
	int ret;

	ww_acquire_init(ctx, &ww_class);

  retry:
	/* re-init loop start state */
	loop {
		/* magic code which walks over a graph and decides which objects
		 * to lock */

		ret = ww_mutex_lock(&obj->ww_mutex, ctx);
		if (ret == -EALREADY) {
			/* we have that one already, get to the next object */
			continue;
		}
		if (ret == -EDEADLK) {
			__unlock_objs(list);

			ww_mutex_lock_slow(&obj->ww_mutex, ctx);
			list_add(&obj->locked_list, list);
			goto retry;
		}

		/* locked a new object, add it to the list */
		list_add_tail(&obj->locked_list, list);
	}

	ww_acquire_done(ctx);
  }

  void unlock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
  {
	__unlock_objs(list);
	ww_acquire_fini(ctx);
  }

Method 4: Only lock one single object. In that case deadlock detection and
prevention is obviously overkill, since with grabbing just one lock you can't
produce a deadlock within just one class. To simplify this case the w/w mutex
api can be used with a NULL context.
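
A minimal sketch of this case::

  int ret;

  ret = ww_mutex_lock(&obj->lock, NULL);	/* behaves like mutex_lock() */
  /* with a NULL context, ret is always 0 here */

  /* ... exclusive access to obj ... */

  ww_mutex_unlock(&obj->lock);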

Implementation Details
----------------------

Design:
^^^^^^^

  ww_mutex currently encapsulates a struct mutex; this means no extra overhead
  for normal mutex locks, which are far more common. As such there is only a
  small increase in code size if wait/wound mutexes are not used.

  We maintain the following invariants for the wait list:

  (1) Waiters with an acquire context are sorted by stamp order; waiters
      without an acquire context are interspersed in FIFO order.
  (2) For Wait-Die, among waiters with contexts, only the first one can have
      other locks acquired already (ctx->acquired > 0). Note that this waiter
      may come after other waiters without contexts in the list.

  The Wound-Wait preemption is implemented with a lazy-preemption scheme:
  The wounded status of the transaction is checked only when there is
  contention for a new lock and hence a true chance of deadlock. In that
  situation, if the transaction is wounded, it backs off, clears the
  wounded status and retries. A great benefit of implementing preemption in
  this way is that the wounded transaction can identify a contending lock to
  wait for before restarting the transaction. Just blindly restarting the
  transaction would likely make the transaction end up in a situation where
  it would have to back off again.

  In general, not much contention is expected. The locks are typically used to
  serialize access to resources for devices, and optimization focus should
  therefore be directed towards the uncontended cases.

Lockdep:
^^^^^^^^

  Special care has been taken to warn for as many cases of api abuse
  as possible. Some common api abuses will be caught with
  CONFIG_DEBUG_MUTEXES, but CONFIG_PROVE_LOCKING is recommended.

  Some of the errors which will be warned about:
   - Forgetting to call ww_acquire_fini or ww_acquire_init.
   - Attempting to lock more mutexes after ww_acquire_done.
   - Attempting to lock the wrong mutex after -EDEADLK and
     unlocking all mutexes.
   - Attempting to lock the right mutex after -EDEADLK,
     before unlocking all mutexes.
   - Calling ww_mutex_lock_slow before -EDEADLK was returned.
   - Unlocking mutexes with the wrong unlock function.
   - Calling one of the ww_acquire_* functions twice on the same context.
   - Using a different ww_class for the mutex than for the ww_acquire_ctx.
   - Normal lockdep errors that can result in deadlocks.

  Some of the lockdep errors that can result in deadlocks:
   - Calling ww_acquire_init to initialize a second ww_acquire_ctx before
     having called ww_acquire_fini on the first.
   - 'normal' deadlocks that can occur.

FIXME:
  Update this section once we have the TASK_DEADLOCK task state flag magic
  implemented.