This file merges information from the various README.r[5-8] notes
shipped with Ben Jackson and Jay Carlson's "rogue" server patches.
The information below didn't fit well into the changelog format.

From README.r5:

1.8.0r5 is a collection of unofficial patches to Erik Ostrom's
LambdaMOO 1.8.0p6 server release. They're primarily bug fixes and
speedups. For logistical reasons they're packaged as a tar file
rather than as a collection of diffs.

It's difficult to measure MOO server performance. All we can say is
that some plausible synthetic benchmarks are now two to four times
faster. Users have noted that production systems running this code
feel much more responsive at computationally expensive tasks.

[...]

All files were run through GNU indent with settings given in
.indent.pro in an attempt to normalize coding style.

code_gen.c:

Fixed bizarre bug where uninitialized memory was accessed; usually
multiplied by zero immediately, so nobody ever noticed.

eval_env.c, db_io.c, objects.c, utils.c:

Type identifiers (TYPE_STR et al) now contain a bit flag indicating
whether additional work needs to be done when a Var of their type is
freed. This allows free_var to run inline without a case statement
when "simple" Vars are freed. Code to translate between the internal
TYPE_STR and the previous external representation added.

db_verbs.c, db_objects.c:

(This part is primarily Jay's fault, so we'll let him talk about it
using the first person.)

The verb lookup cache. Traditionally, the server has spent large
amounts of time searching for what verbcode to run. MOO verbs can
have aliases ($object_utils:descendents/descendants), incomplete
specification ($room:l*ook), and command-line verbs distinguished by
args...and verb definition order matters during lookup! These
features ruled out the naive speedup of just dumping all verbdefs in a
hash table per object.
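
To make the point about aliases, abbreviations and definition order
concrete, here is a minimal sketch of verb-name matching. It is not
the server's own matching code. It assumes the usual MOO rule that in
"l*ook" everything before the star is required and the rest may be
abbreviated, and that a trailing star accepts any suffix. All function
names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive comparison of the first n characters. */
static int
same_prefix(const char *a, const char *b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (tolower((unsigned char) a[i]) != tolower((unsigned char) b[i]))
            return 0;
    return 1;
}

/* Does `word` match one alias such as "look", "l*ook" or "foo*"? */
int
alias_matches(const char *alias, size_t alen, const char *word)
{
    const char *star = memchr(alias, '*', alen);
    size_t wlen = strlen(word);
    size_t required, optional, rest;

    if (star == NULL)           /* no star: the whole name must match */
        return wlen == alen && same_prefix(alias, word, alen);

    required = (size_t) (star - alias);
    optional = alen - required - 1;
    if (wlen < required || !same_prefix(alias, word, required))
        return 0;
    if (optional == 0)          /* trailing star: any suffix is fine */
        return 1;
    rest = wlen - required;     /* must be a prefix of the optional tail */
    return rest <= optional && same_prefix(star + 1, word + required, rest);
}

/* Does `word` match any alias in a space-separated verb name string? */
int
verb_names_match(const char *names, const char *word)
{
    assert(names != NULL && word != NULL);
    while (*names) {
        size_t alen = strcspn(names, " ");

        if (alen > 0 && alias_matches(names, alen, word))
            return 1;
        names += alen;
        names += strspn(names, " ");
    }
    return 0;
}

/* Index of the first verb, in definition order, that matches; else -1. */
int
find_first_verb(const char *const *verbs, int nverbs, const char *word)
{
    int i;

    for (i = 0; i < nverbs; i++)
        if (verb_names_match(verbs[i], word))
            return i;
    return -1;
}
```

With verbs defined in the order "l*ook", "lo*ck", the word "lo" finds
the first verb; with the order reversed it finds the other one. A
per-object hash table keyed on the literal verb name can express
neither the abbreviations nor that ordering.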

I decided not to work too hard on improving the performance of command
line verb lookups. Any solution that addressed them looked to be many
times more complex than just fixing verbs calling verbs
(db_find_callable_verb), and the latter appeared more significant to
overall performance.

Originally I built a 7 element per object table to cache lookups but
this significantly inflated the server size relative to the
performance increase. If you're interested in this, it's in the
moo-cows archive as one of the steak patches.

My current solution to lookup performance is to build a global hash
table mapping

    (hash(object_key x target_verbname), object_key, target_verbname)
        => (verbdef, handle)

used only for callable verb lookups.

Any action on the db that could affect the validity of this table
clears the whole table by calling
db_priv_affected_callable_verb_lookup(). Here's a list:

    recycle()
    renumber()
    chparent(): in some circumstances
    add_verb()
    delete_verb()
    set_verb_info(): name changes, flag changes
    set_verb_args()

Since a good number of objects don't have verbs on them (inheriting
all behavior from parents) I decided to use "first parent with verbs"
as the object_key. This means that all those kids of $exit don't need
to have separate table entries for :invoke or whatever. All kids of a
player class get a single entry for :tell unless the player has verbs
on emself. (Sadly, on LambdaMOO, the lag reduction feature object
places a trivial :tell on anyone using it. Since the verb is
immediately at hand the lookup is short but unavoidable for every
player using it.)

Since I use "first parent with verbs" as object_key, chparent() does
not need to clear the table that often. If the object has no verbs,
it can't be mentioned in the table directly; however, if it has
children it could indirectly affect lookup of its kids that do have
verbs.
Transient objects going through the usual
$recycler:_create()/$recycler:_recycle() life cycle avoid both of
these problems and in this release no longer trigger a flush.

For this release, Ben added negative caching---failed verb lookups are
stored in the table as well.

The table itself is implemented as a fixed number of hash chains. The
compiled-in default is 7507 (DEFAULT_VC_SIZE in db_verbs.c).
Statistics on occupancy are available through two new wiz-only
primitives. log_cache_stats() dumps formatted info into the server
log; verb_cache_stats() returns a list of the form:

    {hits, negative_hits, misses, table_clears, histogram}

where histogram is a 17 element list. histogram[1] is the number of
chains with length 0; histogram[2] is the number of chains with length
1 and so on up to histogram[17] which counts the number of chains with
length of 16 or greater.

hits, negative_hits, misses, and table_clears are counters only zeroed
at server start. The histogram is a snapshot of current cache
condition. If you're running a really busy server you can overflow
the hits counter in a few weeks; your server won't crash but values
reported by these functions will be wrong. Yes, LambdaMOO executes
*billions* of verbs in a typical run.

If you start fretting about how much memory the lookup table is using,
write a continuously running verb that forces one of the table clear
conditions.

extensions.c, db_tune.h:

The functions in extensions.c that provide verb cache stats need to
talk to the db layer's internals in order to gather information, but
they aren't part of the db layer proper. db_tune.h was invented as a
middle ground between db.h and db_private.h for source files that
needed access to implementation-specific interfaces provided by the db
layer.

Comments (and suggestions on a better name!) on this are solicited.
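
For readers who want a picture of the verb cache and its statistics as
described above, here is a minimal sketch: a fixed number of hash
chains, negative entries, whole-table clears, and the chain-length
histogram. It is not the code in db_verbs.c. The vc_* names, the hash
function and the use of a plain int for object_key are all assumptions
made for illustration, and the handle half of the cached value is left
out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define VC_SIZE 7507            /* the DEFAULT_VC_SIZE quoted above */
#define VC_HIST 17              /* chain lengths 0..15, then 16 or more */

typedef struct vc_entry {
    struct vc_entry *next;
    unsigned hash;
    int object_key;             /* first ancestor that defines verbs */
    char *verbname;             /* stored just past the struct itself */
    const void *verbdef;        /* NULL records a failed lookup */
} vc_entry;

static vc_entry *vc_table[VC_SIZE];
static unsigned long vc_hits, vc_negative_hits, vc_misses, vc_clears;

static unsigned vc_hash(int object_key, const char *verbname)
{
    unsigned h = (unsigned) object_key * 2654435761u;

    while (*verbname)
        h = h * 31u + (unsigned char) *verbname++;
    return h;
}

/* Returns 1 on a hit (positive or negative) and sets *out; 0 on a miss. */
int vc_lookup(int object_key, const char *verbname, const void **out)
{
    unsigned h;
    vc_entry *e;

    assert(verbname != NULL && out != NULL);
    h = vc_hash(object_key, verbname);
    for (e = vc_table[h % VC_SIZE]; e != NULL; e = e->next) {
        if (e->hash == h && e->object_key == object_key
            && strcmp(e->verbname, verbname) == 0) {
            if (e->verbdef != NULL)
                vc_hits++;
            else
                vc_negative_hits++;
            *out = e->verbdef;
            return 1;
        }
    }
    vc_misses++;
    return 0;
}

/* Remember the result of a full lookup; verbdef == NULL caches a failure. */
void vc_store(int object_key, const char *verbname, const void *verbdef)
{
    unsigned h = vc_hash(object_key, verbname);
    size_t n = strlen(verbname) + 1;
    vc_entry *e = malloc(sizeof *e + n);

    if (e == NULL)
        return;                 /* caching is optional; just skip it */
    e->verbname = (char *) (e + 1);
    memcpy(e->verbname, verbname, n);
    e->hash = h;
    e->object_key = object_key;
    e->verbdef = verbdef;
    e->next = vc_table[h % VC_SIZE];
    vc_table[h % VC_SIZE] = e;
}

/* Drop every entry; called for any db change that might affect lookups. */
void vc_clear(void)
{
    int i;

    for (i = 0; i < VC_SIZE; i++) {
        while (vc_table[i] != NULL) {
            vc_entry *e = vc_table[i];

            vc_table[i] = e->next;
            free(e);
        }
    }
    vc_clears++;
}

/* hist[n] counts chains of length n; hist[16] counts length 16 or more. */
void vc_histogram(unsigned long hist[VC_HIST])
{
    int i;

    memset(hist, 0, VC_HIST * sizeof hist[0]);
    for (i = 0; i < VC_SIZE; i++) {
        unsigned len = 0;
        vc_entry *e;

        for (e = vc_table[i]; e != NULL; e = e->next)
            len++;
        hist[len < VC_HIST - 1 ? len : VC_HIST - 1]++;
    }
}
```

A caller would try vc_lookup() first and, on a miss, do the full
inheritance walk and vc_store() the result, storing NULL when no verb
was found. Note that the C array is zero-based, so hist[0] here
corresponds to histogram[1] in the MOO list.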

decompile.c, program.c:

When errors are thrown, the line number of the error is included in
the traceback information. Mapping between bytecode program counter
and line number is expensive, so each Program now maintains a single
pc->lineno cache entry---hopefully most programs that fail multiple
times usually fail on the same line.

eval_env.c, execute.c:

To avoid calling malloc()/free() as often, the server now keeps a
central pool of rt_stacks and rt_envs of given sizes. They revert to
malloc()/free() for large requests.

execute.c:

General optimization; Ben can write more extensively about this. One of
the more significant is that OP_IMM followed by OP_POP is "peephole
optimized"; this makes verb comments like

    "$string_utils:from_list(l, [, separator])";
    "Return a string etc";
    do_some_work();
    "and do some more work";
    do_more_work();

much cheaper.

An important memory leak involving failed property lookups was closed.

execute.c, options.h:

Because very few sites actually use protected builtin properties and
using them is a very substantial performance hit, a new options.h
define, IGNORE_PROP_PROTECTED, allows them to be disabled at
compile-time. This is the default.

functions.c, server.c:

Doing property lookups per builtin function call to determine whether
the function needs the $server_options.protect_foo treatment is
extremely expensive. A protectedness flag was directly added to the
builtin function struct; the values of these flags are loaded from the
db at startup time, or whenever the new builtin function
load_server_options() is called.

list.c:

There's now a canonical empty list.

The regexp pattern cache wasn't storing the case_matters flag, causing
many patterns to be impossible to find in the cache.

decode_binary() was broken on systems where char is signed by default.
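
The note above doesn't say which expression in decode_binary() was at
fault, so the following is only an illustration of the general class
of bug, with hypothetical function names. Where plain char is signed,
a byte in the range 0x80..0xFF converts to a negative int unless it
goes through unsigned char first.

```c
#include <assert.h>
#include <stddef.h>

/* Buggy pattern: where plain char is signed, bytes 0x80..0xFF go negative. */
int
byte_value_naive(char c)
{
    return c;
}

/* Portable pattern: convert through unsigned char, giving 0..255. */
int
byte_value(char c)
{
    return (unsigned char) c;
}

/* Copy len raw bytes into out[] as integers in the range 0..255. */
void
bytes_to_ints(const char *buf, size_t len, int *out)
{
    size_t i;

    assert(buf != NULL && out != NULL);
    for (i = 0; i < len; i++)
        out[i] = byte_value(buf[i]);
}
```

The naive version happens to work on platforms where char is unsigned,
which is how a bug of this kind can go unnoticed for a long time.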

doinsert reallocs lists with refcount 1 when appending rather than
calling var_ref/free_var on all the elements. (The general case could
be sped up with memcpy as well.)

my-types.h:

sys/time.h may be necessary for FD_ZERO et al definitions.

parse_cmd.c, storage.h:

parse_into_words was incorrectly allocating an array of (char *) as
M_STRING. This caused a million unaligned memory access warnings on
the Alpha. Created a new M_STRING_PTRS allocation class for this.

pattern.c:

fastmap was allocated with mymalloc() but freed with the normal
free(). Fixed.

ref_count.c:

Refcounts are now allocated as part of objects that can be
addref()'d. This allows macros to manipulate those counts and makes a
request for the current refcount of an object much cheaper. This
completely replaces the old hash table implementation.

storage.c:

There's now a canonical empty string.

myrealloc(), the mymalloc/myfree analog of realloc(), is now available.

As a result of the changes, the memory debugging code is no longer
available. Also, since we now hold pointers to only the interior of
some allocated objects, tools such as Purify will claim a million
possible memory leaks.

tasks.c:

If a forked task was killed before it ever started, it leaked some
memory. Fixed.

utils.c:

var_refcount(Var v) added. Returns the refcount of any Var.

From README.r6:

The two big changes in r6 over r5 are:

  o Bytecode optimizations to try to modify lists in-place whenever
possible. List manipulation and mutation should be orders of
magnitude faster in some cases.

  o String "interning" during load; initially, there will be one and
only one in-memory copy of each identical string. (In JHCore that
means we only allocate memory for "do" once...)

From README.r7:

r7 fixes BYTECODE_REDUCE_REF.
It's now safe to turn on.
[This turned out to be false.]

The default input and output buffer sizes in options.h are now 64k.

From README.r8:

r8 adds more fixes to BYTECODE_REDUCE_REF. It's now safe to turn on.
However, suspended tasks are a problem for switchover. From options.h:

 * This option affects the length of certain bytecode sequences.
 * Suspended tasks in a database from a server built with this option
 * are not guaranteed to work with a server built without this option,
 * and vice versa. It is safe to flip this switch only if there are
 * no suspended tasks in the database you are loading. (It might work
 * anyway, but hey, it's your database.) This restriction will be
 * lifted in a future version of the server software. Consider this
 * option as being BETA QUALITY until then.