This file merges information from the various README.r[5-8] notes
shipped with Ben Jackson and Jay Carlson's "rogue" server patches.
The information below didn't fit well into the changelog format.

From README.r5:

1.8.0r5 is a collection of unofficial patches to Erik Ostrom's
LambdaMOO 1.8.0p6 server release.  They're primarily bug fixes and
speedups.  For logistical reasons they're packaged as a tar file
rather than as a collection of diffs.

It's difficult to measure MOO server performance.  All we can say is
that some plausible synthetic benchmarks are now two to four times
faster.  Users have noted that production systems running this code
feel much more responsive at computationally expensive tasks.

[...]

All files were run through GNU indent with settings given in
.indent.pro in an attempt to normalize coding style.

code_gen.c:

Fixed bizarre bug where uninitialized memory was accessed; usually
multiplied by zero immediately, so nobody ever noticed.

eval_env.c, db_io.c, objects.c, utils.c:

Type identifiers (TYPE_STR et al) now contain a bit flag indicating
whether additional work needs to be done when a Var of their type is
freed.  This allows free_var to run inline without a case statement
when "simple" Vars are freed.  Code to translate between the internal
TYPE_STR and the previous external representation added.
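
As a rough illustration of the flag-bit idea, here is a minimal sketch.
The flag name and value (TYPE_COMPLEX_FLAG, 0x80), the reduced Var
layout, and the helper names complex_free_var and external_type are
assumptions made for this example, not taken from the patched sources.

```c
#include <stdlib.h>

/* Assumed layout: one high bit marks types whose values own heap data. */
#define TYPE_COMPLEX_FLAG 0x80

typedef enum {
    TYPE_INT = 0,
    TYPE_OBJ = 1,
    TYPE_STR = 2 | TYPE_COMPLEX_FLAG,
    TYPE_LIST = 4 | TYPE_COMPLEX_FLAG
} var_type;

typedef struct {
    var_type type;
    union {
        long num;
        char *str;
    } v;
} Var;

static int complex_frees = 0;	/* counts trips through the slow path */

/* Out-of-line slow path: only reached for flagged types. */
static void complex_free_var(Var v)
{
    complex_frees++;
    if (v.type == TYPE_STR)
	free(v.v.str);
    /* list handling omitted in this sketch */
}

/* Fast path: one bit test, cheap enough to inline at every call site. */
static inline void free_var(Var v)
{
    if (v.type & TYPE_COMPLEX_FLAG)
	complex_free_var(v);
}

/* Strip the flag to get a flag-free external type code. */
static inline int external_type(var_type t)
{
    return t & ~TYPE_COMPLEX_FLAG;
}
```

Freeing an integer Var never leaves the inline test; only strings and
lists pay for a function call.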

db_verbs.c, db_objects.c:

(This part is primarily Jay's fault, so we'll let him talk about it
using the first person.)

The verb lookup cache.  Traditionally, the server has spent large
amounts of time searching for what verbcode to run.  MOO verbs can
have aliases ($object_utils:descendents/descendants), incomplete
specification ($room:l*ook), and command-line verbs distinguished by
args...and verb definition order matters during lookup!  These
features ruled out the naive speedup of just dumping all verbdefs in a
hash table per object.

I decided not to work too hard on improving the performance of command
line verb lookups.  Any solution that addressed them looked to be many
times more complex than just fixing verbs calling verbs
(db_find_callable_verb), and the latter appeared more significant to
overall performance.

Originally I built a 7-element per-object table to cache lookups but
this significantly inflated the server size relative to the
performance increase.  If you're interested in this, it's in the
moo-cows archive as one of the steak patches.

My current solution to lookup performance is to build a global hash
table mapping

   (hash(object_key x target_verbname), object_key, target_verbname)
        => (verbdef, handle)

used only for callable verb lookups.

Any action on the db that could affect the validity of this table
clears the whole table by calling
db_priv_affected_callable_verb_lookup().  Here's a list:

  recycle()
  renumber()
  chparent(): in some circumstances
  add_verb()
  delete_verb()
  set_verb_info(): name changes, flag changes
  set_verb_args()

Since a good number of objects don't have verbs on them (inheriting
all behavior from parents) I decided to use "first parent with verbs"
as the object_key.  This means that all those kids of $exit don't need
to have separate table entries for :invoke or whatever.  All kids of a
player class get a single entry for :tell unless the player has verbs
on emself.  (Sadly, on LambdaMOO, the lag reduction feature object
places a trivial :tell on anyone using it.  Since the verb is
immediately at hand the lookup is short but unavoidable for every
player using it.)

Since I use "first parent with verbs" as object_key, chparent() does
not need to clear the table that often.  If the object has no verbs,
it can't be mentioned in the table directly; however, if it has
children it could indirectly affect lookup of its kids that do have
verbs.  Transient objects going through the usual
$recycler:_create()/$recycler:_recycle() life cycle avoid both of
these problems and in this release no longer trigger a flush.

For this release, Ben added negative caching---failed verb lookups are
stored in the table as well.
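
The overall shape of such a cache can be sketched as follows.  This is
not the code in db_verbs.c: the names vc_entry, vc_hash, vc_lookup,
vc_store and vc_clear are made up for the example, the hash function is
arbitrary, the handle half of the cached value is left out, and names
are compared with plain strcmp() for brevity.  A NULL verbdef stands
for a cached failed lookup.

```c
#include <stdlib.h>
#include <string.h>

#define VC_SIZE 7507		/* number of hash chains */

typedef struct vc_entry {
    unsigned hash;
    int object_key;		/* first ancestor that defines any verbs */
    char *verbname;
    const void *verbdef;	/* NULL records a failed (negative) lookup */
    struct vc_entry *next;
} vc_entry;

static vc_entry *vc_table[VC_SIZE];

static unsigned vc_hash(int object_key, const char *verbname)
{
    unsigned h = (unsigned) object_key;

    while (*verbname)
	h = h * 31 + (unsigned char) *verbname++;
    return h;
}

/* Returns 1 on a hit (positive or negative) and stores it in *verbdef. */
static int vc_lookup(int object_key, const char *verbname,
		     const void **verbdef)
{
    unsigned h = vc_hash(object_key, verbname);
    vc_entry *e;

    for (e = vc_table[h % VC_SIZE]; e; e = e->next) {
	if (e->hash == h && e->object_key == object_key
	    && strcmp(e->verbname, verbname) == 0) {
	    *verbdef = e->verbdef;
	    return 1;
	}
    }
    return 0;
}

/* Record the outcome of a full lookup; pass NULL for "no such verb". */
static void vc_store(int object_key, const char *verbname,
		     const void *verbdef)
{
    unsigned h = vc_hash(object_key, verbname);
    size_t len = strlen(verbname) + 1;
    vc_entry *e = malloc(sizeof *e);
    char *name = malloc(len);

    if (!e || !name)
	abort();
    memcpy(name, verbname, len);
    e->hash = h;
    e->object_key = object_key;
    e->verbname = name;
    e->verbdef = verbdef;
    e->next = vc_table[h % VC_SIZE];
    vc_table[h % VC_SIZE] = e;
}

/* Any db change that might invalidate an entry discards everything. */
static void vc_clear(void)
{
    int i;

    for (i = 0; i < VC_SIZE; i++) {
	vc_entry *e = vc_table[i];

	while (e) {
	    vc_entry *next = e->next;

	    free(e->verbname);
	    free(e);
	    e = next;
	}
	vc_table[i] = NULL;
    }
}
```

Clearing the whole table is crude, but it keeps the invalidation rules
short enough to list in one place.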

The table itself is implemented as a fixed number of hash chains.  The
compiled-in default is 7507 (DEFAULT_VC_SIZE in db_verbs.c).
Statistics on occupancy are available through two new wiz-only
primitives.  log_cache_stats() dumps formatted info into the server
log; verb_cache_stats() returns a list of the form:

  {hits, negative_hits, misses, table_clears, histogram}

where histogram is a 17 element list.  histogram[1] is the number of
chains with length 0; histogram[2] is the number of chains with length
1 and so on up to histogram[17] which counts the number of chains with
length of 16 or greater.
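
To make the bucket layout concrete, a histogram like this could be
computed as below.  The chain_node type and chain_histogram function
are hypothetical.  C arrays are 0-based, so out[k] here corresponds to
histogram[k + 1] in the MOO list.

```c
#include <stddef.h>

#define HISTOGRAM_SLOTS 17

typedef struct chain_node {
    struct chain_node *next;
} chain_node;

/* out[k] counts chains of length k for k < 16; out[16] counts chains
 * of length 16 or greater. */
static void chain_histogram(chain_node *const *table, size_t nchains,
			    int out[HISTOGRAM_SLOTS])
{
    size_t i;
    int k;

    for (k = 0; k < HISTOGRAM_SLOTS; k++)
	out[k] = 0;

    for (i = 0; i < nchains; i++) {
	const chain_node *n;
	int len = 0;

	for (n = table[i]; n; n = n->next)
	    len++;
	if (len > HISTOGRAM_SLOTS - 1)
	    len = HISTOGRAM_SLOTS - 1;	/* clamp into the last slot */
	out[len]++;
    }
}
```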

hits, negative_hits, misses, and table_clears are counters only zeroed
at server start.  The histogram is a snapshot of current cache
condition.  If you're running a really busy server you can overflow
the hits counter in a few weeks; your server won't crash but values
reported by these functions will be wrong.  Yes, LambdaMOO executes
*billions* of verbs in a typical run.

If you start fretting about how much memory the lookup table is using,
write a continuously running verb that forces one of the table clear
conditions.

extensions.c, db_tune.h:

The functions in extensions.c that provide verb cache stats need to
talk to the db layer's internals in order to gather information, but
they aren't part of the db layer proper.  db_tune.h was invented as a
middle ground between db.h and db_private.h for source files that
needed access to implementation-specific interfaces provided by the db
layer.

Comments (and suggestions on a better name!) on this are solicited.

decompile.c, program.c:

When errors are thrown, the line number of the error is included in
the traceback information.  Mapping between bytecode program counter
and line number is expensive, so each Program now maintains a single
pc->lineno cache entry---hopefully most programs that fail multiple
times usually fail on the same line.
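
A one-entry cache of this kind might look like the sketch below.  The
reduced Program struct, its field names, and the linear scan standing
in for the expensive pc-to-line mapping are all assumptions for
illustration; the real server derives line numbers differently.

```c
typedef struct {
    const unsigned *line_start_pc;	/* first pc of each line, ascending */
    unsigned nlines;
    unsigned cached_pc;
    unsigned cached_lineno;		/* 0 means the cache is empty */
    unsigned slow_lookups;		/* instrumentation for the example */
} Program;

static unsigned find_line_number(Program *p, unsigned pc)
{
    unsigned line;

    /* Fast path: the same pc was asked about last time. */
    if (p->cached_lineno != 0 && p->cached_pc == pc)
	return p->cached_lineno;

    /* Slow path: stand-in for the expensive mapping. */
    p->slow_lookups++;
    line = 1;
    while (line < p->nlines && p->line_start_pc[line] <= pc)
	line++;

    p->cached_pc = pc;
    p->cached_lineno = line;
    return line;
}
```

A single entry costs two words per Program and pays off whenever the
same failing line is hit repeatedly.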

eval_env.c, execute.c:

To avoid calling malloc()/free() as often, the server now keeps a
central pool of rt_stacks and rt_envs of given sizes.  They revert to
malloc()/free() for large requests.
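
The pooling scheme can be sketched with a simple free list.  The names
stack_block, alloc_stack and free_stack and the POOL_ELEMS size are
invented for this example, and the pool here is unbounded, which the
real one may not be.

```c
#include <stdlib.h>

#define POOL_ELEMS 16		/* assumed size served from the pool */

typedef struct stack_block {
    struct stack_block *next_free;
    size_t capacity;
    long slots[];		/* C99 flexible array member */
} stack_block;

static stack_block *free_list = NULL;

static stack_block *alloc_stack(size_t n)
{
    stack_block *b;
    size_t cap;

    /* Small request and a recycled block is available: no malloc(). */
    if (n <= POOL_ELEMS && free_list) {
	b = free_list;
	free_list = b->next_free;
	return b;
    }
    /* Small requests are rounded up so the block can be pooled later. */
    cap = (n <= POOL_ELEMS) ? POOL_ELEMS : n;
    b = malloc(sizeof *b + cap * sizeof b->slots[0]);
    if (b)
	b->capacity = cap;
    return b;
}

static void free_stack(stack_block *b)
{
    if (b->capacity == POOL_ELEMS) {
	b->next_free = free_list;	/* keep it for the next caller */
	free_list = b;
    } else {
	free(b);			/* large blocks go back to malloc */
    }
}
```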

execute.c:

General optimization; Ben can write more extensively about this.  One of
the more significant is that OP_IMM followed by OP_POP is "peephole
optimized"; this makes verb comments like

  "$string_utils:from_list(l, [, separator])";
  "Return a string etc";
  do_some_work();
  "and do some more work";
  do_more_work();

much cheaper.
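
One way to get this effect is for the interpreter loop to peek at the
next opcode when it sees OP_IMM.  The toy interpreter below assumes
that approach; the opcode numbering, the one-byte operand, and the
counters struct are invented for the example and do not reflect the
real bytecode format.

```c
#include <stddef.h>

/* Toy opcodes; OP_IMM is followed by a one-byte literal slot number. */
enum { OP_IMM, OP_POP, OP_DONE };

typedef struct {
    int pushes;
    int pops;
} counters;

static counters run(const unsigned char *bv)
{
    counters c = { 0, 0 };

    for (;;) {
	switch (*bv++) {
	case OP_IMM: {
	    unsigned char slot = *bv++;

	    (void) slot;	/* a real VM would fetch literal[slot] */
	    if (*bv == OP_POP) {
		bv++;		/* value would be discarded: skip both ops */
		break;
	    }
	    c.pushes++;
	    break;
	}
	case OP_POP:
	    c.pops++;
	    break;
	case OP_DONE:
	default:
	    return c;
	}
    }
}
```

A string-literal statement then costs one opcode dispatch and touches
neither the stack nor any refcount.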

An important memory leak involving failed property lookups was closed.

execute.c, options.h:

Because very few sites actually use protected builtin properties and
using them is a very substantial performance hit, a new options.h
define, IGNORE_PROP_PROTECTED, allows them to be disabled at
compile-time.  This is the default.

functions.c, server.c:

Doing property lookups per builtin function call to determine whether
the function needs the $server_options.protect_foo treatment is
extremely expensive.  A protectedness flag was added directly to the
builtin function struct; the values of these flags are loaded from the
db at startup time, or whenever the new builtin function
load_server_options() is called.
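
The caching pattern looks roughly like this.  The builtin struct, the
reload_protect_flags function, and the demo_reader callback (a stand-in
for reading $server_options.protect_<name> from the db) are
hypothetical names for this sketch.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name;
    int is_protected;		/* cached copy of the db setting */
} builtin;

static builtin builtins[] = {
    { "eval", 0 },
    { "notify", 0 },
    { "read", 0 }
};

#define NBUILTINS (sizeof builtins / sizeof builtins[0])

/* Stand-in for a db property lookup of protect_<name>. */
typedef int (*option_reader) (const char *name);

/* Run once at startup and again whenever the options are reloaded. */
static void reload_protect_flags(option_reader read_option)
{
    size_t i;

    for (i = 0; i < NBUILTINS; i++)
	builtins[i].is_protected = read_option(builtins[i].name) != 0;
}

/* Per-call check: a struct field read instead of a property lookup. */
static int needs_wizard_check(size_t i)
{
    return builtins[i].is_protected;
}

/* Example reader that pretends only eval() is protected. */
static int demo_reader(const char *name)
{
    return strcmp(name, "eval") == 0;
}
```

The price is that a change to the db setting has no effect until the
flags are reloaded.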

list.c:

There's now a canonical empty list.

The regexp pattern cache wasn't storing the case_matters flag, causing
many patterns to be impossible to find in the cache.

decode_binary() was broken on systems where char is signed by default.
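
The note doesn't say which expression in decode_binary() was at fault,
so the following only shows the general class of bug, using two
made-up helpers: a byte of 0x80 or above read through a plain char
turns negative where char is signed, unless it is converted to
unsigned char first.

```c
#include <stddef.h>

/* Platform-dependent: negative for bytes >= 0x80 where char is signed. */
static int byte_value_naive(const char *s, size_t i)
{
    return s[i];
}

/* Portable: always yields a value in the range 0..255. */
static int byte_value(const char *s, size_t i)
{
    return (unsigned char) s[i];
}
```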

doinsert reallocs lists with refcount 1 when appending rather than
calling var_ref/free_var on all the elements.  (The general case could
be sped up with memcpy as well.)

my-types.h:

sys/time.h may be necessary for FD_ZERO et al definitions.

parse_cmd.c, storage.h:

parse_into_words was incorrectly allocating an array of (char *) as
M_STRING.  This caused a million unaligned memory access warnings on
the Alpha.  Created a new M_STRING_PTRS allocation class for this.

pattern.c:

fastmap was allocated with mymalloc() but freed with the normal
free().  Fixed.

ref_count.c:

Refcounts are now allocated as part of objects that can be
addref()'d.  This allows macros to manipulate those counts and makes a
request for the current refcount of an object much cheaper.  This
completely replaces the old hash table implementation.
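
A common way to do this is to put the count in a small header just
before the memory handed to the caller.  The sketch below assumes that
layout; ref_header, ref_alloc, ref_release and the two macros are
illustrative names, not the server's.

```c
#include <stdlib.h>

/* Header placed in front of the caller's data.  The extra union
 * members only force alignment suitable for any payload. */
typedef union {
    int refcount;
    long long ll;
    long double ld;
    void *p;
} ref_header;

/* The count lives just before the pointer the caller holds. */
#define REFCOUNT(ptr)	(((ref_header *) (ptr))[-1].refcount)
#define ADDREF(ptr)	(++REFCOUNT(ptr))

static void *ref_alloc(size_t size)
{
    ref_header *h = malloc(sizeof *h + size);

    if (!h)
	return NULL;
    h->refcount = 1;
    return h + 1;		/* caller sees only the interior pointer */
}

static void ref_release(void *ptr)
{
    if (--REFCOUNT(ptr) == 0)
	free((ref_header *) ptr - 1);
}
```

Reading or bumping a count becomes a single memory access at a fixed
offset, with no hash lookup.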

storage.c:

There's now a canonical empty string.

myrealloc(), the mymalloc/myfree analog of realloc(), is now available.

As a result of the changes, the memory debugging code is no longer
available.  Also, since we now hold pointers to only the interior of
some allocated objects, tools such as Purify will claim a million
possible memory leaks.

tasks.c:

If a forked task was killed before it ever started, it leaked some
memory.  Fixed.

utils.c:

var_refcount(Var v) added.  Returns the refcount of any Var.

From README.r6:

The two big changes in r6 over r5 are:

  o  Bytecode optimizations to try to modify lists in-place whenever
     possible.  List manipulation and mutation should be orders of
     magnitude faster in some cases.

  o  String "interning" during load; initially, there will be one and
     only one in-memory copy of each identical string.  (In JHCore that
     means we only allocate memory for "do" once...)
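
Interning in general works as in the sketch below: look the string up
in a table and hand back the existing copy if there is one.  The
intern function, table size, and hash are assumptions for this
example; how the server actually shares and releases interned strings
is not described in this note.

```c
#include <stdlib.h>
#include <string.h>

#define INTERN_BUCKETS 1024

typedef struct intern_entry {
    char *str;
    struct intern_entry *next;
} intern_entry;

static intern_entry *intern_table[INTERN_BUCKETS];

static unsigned str_hash(const char *s)
{
    unsigned h = 0;

    while (*s)
	h = h * 31 + (unsigned char) *s++;
    return h;
}

/* Returns the single shared copy of s, creating it on first use. */
static const char *intern(const char *s)
{
    unsigned b = str_hash(s) % INTERN_BUCKETS;
    size_t len = strlen(s) + 1;
    intern_entry *e;
    char *copy;

    for (e = intern_table[b]; e; e = e->next)
	if (strcmp(e->str, s) == 0)
	    return e->str;	/* already stored: share it */

    e = malloc(sizeof *e);
    copy = malloc(len);
    if (!e || !copy)
	abort();
    memcpy(copy, s, len);
    e->str = copy;
    e->next = intern_table[b];
    intern_table[b] = e;
    return copy;
}
```

Two equal strings read from different places in the db then resolve to
the same pointer.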

From README.r7:

r7 fixes BYTECODE_REDUCE_REF.  It's now safe to turn on.
[This turned out to be false.]

The default input and output buffer sizes in options.h are now 64k.

From README.r8:

r8 adds more fixes to BYTECODE_REDUCE_REF.  It's now safe to turn on.
However, suspended tasks are a problem for switchover.  From options.h:

 * This option affects the length of certain bytecode sequences.
 * Suspended tasks in a database from a server built with this option
 * are not guaranteed to work with a server built without this option,
 * and vice versa.  It is safe to flip this switch only if there are
 * no suspended tasks in the database you are loading.  (It might work
 * anyway, but hey, it's your database.)  This restriction will be
 * lifted in a future version of the server software.  Consider this
 * option as being BETA QUALITY until then.