README.md

JobQueue Architecture {#jobqueuearch}
=====================
Notes on the Job queuing system architecture.

## Introduction

The data model consists of the following main components:
* The Job object represents a particular deferred task that happens in the
  background. All jobs subclass the Job object and put the main logic in the
  function called run().
* The JobQueue object represents a particular queue of jobs of a certain type.
  For example, there may be a queue for email jobs and a queue for CDN purge
  jobs.

## Job queues

Each job type has its own queue and is associated with a storage medium. One
queue might save its jobs in Redis while another uses a database.

Each storage medium is implemented by a queue class. Before using a queue class,
you must define in $wgJobTypeConf a mapping of the job type to that class.

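As a rough sketch, such a mapping in LocalSettings.php might look like the following. The `default` entry is believed to match MediaWiki's stock setting. The job type name `exampleEmailJob` and the Redis address are placeholders, and the JobQueueRedis option names (`redisServer`, `redisConfig`, `daemonized`) are assumptions that should be checked against that class's documentation for your MediaWiki version.

```php
// Fallback for every job type without its own entry: store jobs in the database.
$wgJobTypeConf['default'] = [
    'class' => 'JobQueueDB',
    'order' => 'random',
    'claimTTL' => 3600,
];

// Hypothetical job type routed to Redis instead (name and address are placeholders).
$wgJobTypeConf['exampleEmailJob'] = [
    'class' => 'JobQueueRedis',
    'redisServer' => '127.0.0.1:6379',
    'redisConfig' => [],
    'daemonized' => true,
];
```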
The factory class JobQueueGroup provides helper functions for:
- getting the queue for a given job type
- routing new job insertions to the proper queue

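The two helpers can be pictured with a small standalone model. This is only a sketch: `SketchQueueGroup` and `SketchTypeQueue` are hypothetical stand-ins and do not reproduce the real JobQueueGroup API, and jobs are plain arrays rather than Job objects.

```php
<?php
// Toy stand-in for a queue: it just remembers the jobs pushed to it.
class SketchTypeQueue {
    /** @var array[] */
    public $jobs = [];
}

// Toy stand-in for JobQueueGroup: one queue per job type, created on demand.
class SketchQueueGroup {
    /** @var SketchTypeQueue[] keyed by job type */
    private $queues = [];

    // Helper 1: get the queue for a given job type.
    public function get( string $type ): SketchTypeQueue {
        if ( !isset( $this->queues[$type] ) ) {
            $this->queues[$type] = new SketchTypeQueue();
        }
        return $this->queues[$type];
    }

    // Helper 2: route each new job to the queue matching its type.
    public function push( array $jobs ): void {
        foreach ( $jobs as $job ) {
            $this->get( $job['type'] )->jobs[] = $job;
        }
    }
}

// Usage: three jobs of two types end up in two separate queues.
$group = new SketchQueueGroup();
$group->push( [
    [ 'type' => 'email', 'to' => 'user@example.org' ],
    [ 'type' => 'cdnPurge', 'url' => 'https://example.org/wiki/Foo' ],
    [ 'type' => 'email', 'to' => 'other@example.org' ],
] );
```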
The following queue classes are available:
* JobQueueDB (stores jobs in the `job` table in a database)
* JobQueueRedis (stores jobs in a Redis server)

All queue classes support some basic operations (though some may be no-ops):
* enqueueing a batch of jobs
* dequeueing a single job
* acknowledging that a job is completed
* checking if the queue is empty

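To make the four operations concrete, here is a minimal in-memory model. `SketchArrayQueue` and its method names are hypothetical and chosen for illustration; this toy happens to dequeue in FIFO order, which the real queue classes do not necessarily do.

```php
<?php
// Hypothetical in-memory queue, for illustration only; not one of the real classes.
class SketchArrayQueue {
    private $ready = [];   // jobs waiting to be dequeued, keyed by id
    private $claimed = []; // dequeued jobs that are not yet acknowledged
    private $nextId = 1;

    // Enqueue a batch of jobs.
    public function push( array $jobs ): void {
        foreach ( $jobs as $job ) {
            $this->ready[$this->nextId++] = $job;
        }
    }

    // Dequeue a single job, or return null if nothing is ready.
    public function pop(): ?array {
        if ( $this->ready === [] ) {
            return null;
        }
        $id = array_key_first( $this->ready );
        $job = $this->ready[$id];
        unset( $this->ready[$id] );
        $this->claimed[$id] = $job;
        return [ 'id' => $id, 'job' => $job ];
    }

    // Acknowledge that a dequeued job is completed.
    public function ack( int $id ): void {
        unset( $this->claimed[$id] );
    }

    // Check whether any job is ready to be dequeued.
    public function isEmpty(): bool {
        return $this->ready === [];
    }

    // Number of dequeued jobs still waiting for an acknowledgement.
    public function claimedCount(): int {
        return count( $this->claimed );
    }
}

// Usage: enqueue a batch, dequeue one job, then acknowledge it.
$queue = new SketchArrayQueue();
$queue->push( [ [ 'task' => 'a' ], [ 'task' => 'b' ] ] );
$first = $queue->pop();
$queue->ack( $first['id'] );
```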
Some queue classes (like JobQueueDB) may dequeue jobs in random order, while
others might dequeue jobs in exact FIFO order. Callers should thus not assume
that jobs are executed in FIFO order.

Also note that not all queue classes will have the same reliability guarantees.
In-memory queues may lose data when restarted, depending on snapshot and journal
settings (including journal fsync() frequency). Some queue types may remove jobs
entirely when they are dequeued and leave the ack() function as a no-op; if a job
runner dequeues such a job and then crashes before completing it, the job is
lost. Some jobs, like purging CDN caches after a template change, may not
require durable queues, whereas other jobs might be more important.

## Job queue aggregator

The aggregators are used by nextJobDB.php, a script that returns a random ready
queue (on any wiki in the farm) that can be used with runJobs.php. The script can
be used in conjunction with any scripts that handle wiki farm job queues.
Note that $wgLocalDatabases defines which wikis are in the wiki farm.

Since each job type has its own queue, and wiki farms may have many wikis,
there might be a large number of queues to keep track of. To avoid wasting
large amounts of time polling empty queues, aggregators exist to keep track
of which queues are ready.

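The idea of tracking ready queues instead of polling every queue can be sketched as follows. `SketchReadyTracker` and its method names are hypothetical and are not the API of the real aggregator classes; a real aggregator would keep this state in shared storage rather than in a PHP array.

```php
<?php
// Hypothetical aggregator: remembers which (wiki, job type) queues have work.
class SketchReadyTracker {
    private $ready = [];

    // Called when jobs are pushed to a queue.
    public function markReady( string $wiki, string $type ): void {
        $this->ready["$wiki:$type"] = [ $wiki, $type ];
    }

    // Called when a queue is found to be empty.
    public function markEmpty( string $wiki, string $type ): void {
        unset( $this->ready["$wiki:$type"] );
    }

    // Pick a random ready queue, or null when there is no work anywhere.
    public function pickReadyQueue(): ?array {
        if ( $this->ready === [] ) {
            return null;
        }
        return $this->ready[array_rand( $this->ready )];
    }
}

// Usage: only queues known to have work are ever offered to a runner.
$tracker = new SketchReadyTracker();
$tracker->markReady( 'enwiki', 'email' );
$tracker->markReady( 'dewiki', 'cdnPurge' );
$tracker->markEmpty( 'enwiki', 'email' );
$picked = $tracker->pickReadyQueue();
```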
The following queue aggregator classes are available:
* JobQueueAggregatorRedis (uses a Redis server to track ready queues)

Some aggregators cache data for a few minutes while others may be always up to date.
This can be an important factor for jobs that need a low pickup time (or latency).

## Jobs

Callers should also try to make jobs maintain correctness when executed twice.
This is useful for queues that actually implement ack(), since they may recycle
dequeued but unacknowledged jobs back into the queue to be attempted again. If
a runner dequeues a job, runs it, but then crashes before calling ack(), the
job may be returned to the queue and run a second time. Jobs like cache purging can
happen several times without any correctness problems. However, a pathological case
arises if a bug makes a job fail systematically, so that it keeps being retried. For
example, a job may always throw a DB error at the end of run(). This problem is
trickier to solve and more obnoxious for things like email jobs, where each retry
may send the same email again. For such jobs, it might be useful to use a queue
that does not retry jobs.

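The difference between a job that tolerates a second run and one that does not can be illustrated with two toy classes. Both are hypothetical and standalone: they do not extend the real Job class, and plain arrays stand in for the cache and the statistics store.

```php
<?php
// Hypothetical purge job: running it twice leaves the same final state.
class SketchPurgeJob {
    private $url;

    public function __construct( string $url ) {
        $this->url = $url;
    }

    // Idempotent: removing an entry that is already gone changes nothing.
    public function run( array &$cache ): bool {
        unset( $cache[$this->url] );
        return true;
    }
}

// Hypothetical counter job: every retry repeats its side effect.
class SketchCounterJob {
    // Not idempotent: each run adds another increment.
    public function run( array &$stats ): bool {
        $stats['emailsSent'] = ( $stats['emailsSent'] ?? 0 ) + 1;
        return true;
    }
}

// Simulate a retry by running each job twice.
$cache = [ '/a' => 'page a', '/b' => 'page b' ];
$purge = new SketchPurgeJob( '/a' );
$purge->run( $cache );
$purge->run( $cache );

$stats = [];
$counter = new SketchCounterJob();
$counter->run( $stats );
$counter->run( $stats );
```

After the simulated retry, the cache is in the same state a single run would have produced, while the counter has been incremented twice.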