NAME
       bibclean - prettyprint and syntax check BibTeX and Scribe bibliography
       data base files

SYNOPSIS
       bibclean [ -author ] [ -copyleft ] [ -copyright ]
                [ -error-log filename ] [ -help ] [ '-?' ]
                [ -init-file filename ] [ -ISBN-file filename ]
                [ -keyword-file filename ] [ -max-width nnn ]
                [ -[no-]align-equals ] [ -[no-]brace-protect ]
                [ -[no-]check-values ] [ -[no-]debug-match-failures ]
                [ -[no-]delete-empty-values ] [ -[no-]file-position ]
                [ -[no-]fix-accents ] [ -[no-]fix-braces ]
                [ -[no-]fix-degrees ] [ -[no-]fix-font-changes ]
                [ -[no-]fix-initials ] [ -[no-]fix-math ] [ -[no-]fix-names ]
                [ -[no-]German-style ] [ -[no-]keep-linebreaks ]
                [ -[no-]keep-parbreaks ] [ -[no-]keep-preamble-spaces ]
                [ -[no-]keep-spaces ] [ -[no-]keep-string-spaces ]
                [ -[no-]parbreaks ] [ -[no-]prettyprint ]
                [ -[no-]print-ISBN-table ] [ -[no-]print-keyword-table ]
                [ -[no-]print-patterns ] [ -[no-]quiet ]
                [ -[no-]read-init-files ] [ -[no-]remove-OPT-prefixes ]
                [ -[no-]scribe ] [ -[no-]trace-file-opening ]
                [ -[no-]warnings ] [ -output-file filename ] [ -version ]
                <infile or bibfile1 bibfile2 bibfile3 ...
                >outfile

       All options can be abbreviated to a unique leading prefix.

       An explicit file name of ``-'' represents standard input; it is assumed
       if no input files are specified.

       On VAX VMS and IBM PC DOS, the leading ``-'' on option names may be
       replaced by a slash, ``/''; however, the ``-'' option prefix is always
       recognized.

DESCRIPTION
       bibclean prettyprints input BibTeX files to stdout, or to a
       user-specified file, and checks the brace balance and bibliography
       entry syntax as well.  It can be used to detect problems in BibTeX
       files that sometimes confuse even BibTeX itself, and importantly, can
       be used to normalize the appearance of collections of BibTeX files.

       Here is a summary of the formatting actions:

       o  BibTeX items are formatted into a consistent structure with one
          field = "value" pair per line, and the initial @ and trailing right
          brace in column 1.

       o  Tabs are expanded into blank strings; their use is discouraged
          because they inhibit portability, and can suffer corruption in
          electronic mail.

       o  Long string values are split at a blank and continued onto the next
          line with leading indentation.

       o  A single blank line separates adjacent bibliography entries.

       o  Text outside BibTeX entries is passed through verbatim.

       o  Outer parentheses around entries are converted to braces.

       o  Personal names in author and editor field values are normalized to
          the form ``P. D. Q. Bach'', from ``P.D.Q. Bach'' and ``Bach,
          P.D.Q.''.

       o  Hyphen sequences in page numbers are converted to en-dashes.

       o  Month values are converted to standard BibTeX string abbreviations.

       o  In titles, sequences of upper-case characters at brace level zero
          are braced to protect them from being converted to lower-case
          letters by some bibliography styles.

       o  CODEN, ISBN (International Standard Book Number) and ISSN
          (International Standard Serial Number) entry values are examined to
          verify the checksums of each listed number, and correct ISBN
          hyphenation is automatically supplied.
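
       The checksum verification mentioned in the last item can be
       illustrated with a small Python sketch.  It is not bibclean's own
       code: the function name is hypothetical, and it covers only the
       standard modulus-11 check used by 10-digit ISBNs and by ISSNs
       (CODEN and 13-digit ISBN checks are left out).

```python
def mod11_checksum_ok(number: str) -> bool:
    """Return True if an ISBN-10 or ISSN string passes the modulus-11 check.

    Hypothetical helper for illustration; hyphens and spaces are ignored.
    """
    chars = [c for c in number.upper() if c.isdigit() or c == "X"]
    if len(chars) not in (8, 10):      # ISSN: 8 characters, ISBN-10: 10
        return False
    if "X" in chars[:-1]:              # X (value 10) is only a check character
        return False
    values = [10 if c == "X" else int(c) for c in chars]
    # Weights run from the length down to 1; a valid number gives a
    # weighted sum that is a multiple of 11.
    weights = range(len(values), 0, -1)
    return sum(w * v for w, v in zip(weights, values)) % 11 == 0


if __name__ == "__main__":
    for candidate in ("0-201-13448-9", "0378-5955", "0-201-13448-8"):
        print(candidate, mod11_checksum_ok(candidate))
```

       The last candidate differs from the first only in its check digit,
       so it is reported as invalid.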

       The standardized format of the output of bibclean facilitates the later
       application of simple filters, such as bibcheck(1), bibdup(1),
       bibextract(1), bibindex(1), bibjoin(1), biblabel(1), biblook(1),
       biborder(1), bibsort(1), citefind(1), and citetags(1), to process the
       text, and is also the format expected by the GNU Emacs BibTeX support
       functions.

OPTIONS
       Command-line switches may be abbreviated to a unique leading prefix,
       and letter case is not significant.  All options are parsed before any
       input bibliography files are read, no matter what their order on the
       command line.  Options that correspond to a yes/no setting of a flag
       have a form with a prefix "no-" to set the flag to no.  For such
       options, the last setting determines the flag value used.  That is
       significant when options are also specified in initialization files
       (see the INITIALIZATION FILES manual section).

       The leading hyphen that distinguishes an option from a filename may be
       doubled, for compatibility with GNU and POSIX conventions.  Thus,
       -author and --author are equivalent.

       To avoid confusion with options, if a filename begins with a hyphen, it
       must be disguised by a leading absolute or relative directory path,
       e.g., /tmp/-foo.bib or ./-foo.bib.
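
       The matching rules above (unique leading prefix, case-insensitive
       names, optional doubled hyphen, and "last setting wins" for yes/no
       flags) can be sketched in Python as follows.  The function names
       are hypothetical and the sketch is only an illustration of the
       stated rules, not bibclean's actual parser; how bibclean itself
       resolves abbreviated "no-" forms is an assumption here.

```python
def match_option(arg, candidates):
    """Map a command-line word to the single option name it abbreviates."""
    if arg.startswith("--"):           # doubled hyphen is accepted
        word = arg[2:]
    elif arg.startswith("-"):
        word = arg[1:]
    else:
        raise ValueError(f"{arg!r} is not an option")
    word = word.lower()                # letter case is not significant
    lowered = {name.lower(): name for name in candidates}
    if word in lowered:                # an exact match always wins
        return lowered[word]
    hits = [name for low, name in lowered.items() if low.startswith(word)]
    if len(hits) != 1:
        raise ValueError(f"{arg!r} is unknown or ambiguous")
    return hits[0]


def parse_flags(args, flag_names, other_names=()):
    """Return the final yes/no state of each flag mentioned in args."""
    candidates = list(other_names)
    for name in flag_names:
        candidates += [name, "no-" + name]
    state = {}
    for arg in args:
        name = match_option(arg, candidates)
        if name in other_names:
            continue                   # non-flag options are ignored here
        if name.startswith("no-"):
            state[name[3:]] = False    # later settings overwrite earlier ones
        else:
            state[name] = True
    return state


if __name__ == "__main__":
    flags = ["fix-font-changes", "fix-initials", "fix-names", "German-style"]
    print(parse_flags(["-fix-n", "--NO-FIX-NAMES", "-german"], flags,
                      other_names=("author", "help")))
```

       In this sketch -fix-n is accepted because only fix-names starts
       with that prefix, whereas -fix- alone would be rejected as
       ambiguous.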

       -author                     Display an author credit on the standard
                                   error unit, stderr, and then terminate with
                                   a success return code.  Sometimes an
                                   executable program is separated from its
                                   documentation and source code; this option
                                   provides a way to recover from that.

       -copyleft                   Display copyright information on the
                                   standard error unit, stderr, and then
                                   terminate with a success return code.

       -copyright                  Display copyright information on the
                                   standard error unit, stderr, and then
                                   terminate with a success return code.

       -error-log filename         Redirect stderr to the indicated file,
                                   which then contains all of the error and
                                   warning messages.  This option is provided
                                   for those systems that have difficulty
                                   redirecting stderr.

       -help or -?                 Display a help message on stderr, giving a
                                   usage description, similar to this section
                                   of the manual pages, and then terminate
                                   with a success return code.

       -ISBN-file filename         Provide an explicit ISBN-range
                                   initialization file.  It is processed after
                                   any system-wide and job-wide ISBN
                                   initialization files found on the PATH (for
                                   VAX VMS, SYS$SYSTEM) and BIBINPUTS search
                                   paths, respectively, and may override them.
                                   The ISBN initialization file name can be
                                   changed at compile time, or at run time
                                   through a setting of the environment
                                   variable BIBCLEANISBN, but defaults to
                                   .bibclean.isbn on UNIX, and bibclean.isb
                                   elsewhere.  For further details, see the
                                   ISBN INITIALIZATION FILES manual section.

       -init-file filename         Provide an explicit value pattern
                                   initialization file.  It is processed after
                                   any system-wide and job-wide initialization
                                   files found on the PATH (for VAX VMS,
                                   SYS$SYSTEM) and BIBINPUTS search paths,
                                   respectively, and may override them.  It in
                                   turn may be overridden by a subsequent
                                   file-specific initialization file.  The
                                   initialization file name can be changed at
                                   compile time, or at run time through a
                                   setting of the environment variable
                                   BIBCLEANINI, but defaults to .bibcleanrc on
                                   UNIX, and to bibclean.ini elsewhere.  For
                                   further details, see the INITIALIZATION
                                   FILES manual section.

       -keyword-file filename      Provide an explicit keyword initialization
                                   file.  It is processed after any
                                   system-wide and job-wide keyword
                                   initialization files found on the PATH (for
                                   VAX VMS, SYS$SYSTEM) and BIBINPUTS search
                                   paths, respectively, and may override them.
                                   The keyword initialization file name can be
                                   changed at compile time, or at run time
                                   through a setting of the environment
                                   variable BIBCLEANKEY, but defaults to
                                   .bibclean.key on UNIX, and bibclean.key
                                   elsewhere.  For further details, see the
                                   KEYWORD INITIALIZATION FILES manual section.

       -max-width nnn              bibclean normally limits output line widths
                                   to 72 characters, and in the interests of
                                   consistency, that value should not be
                                   changed.  Occasionally, special-purpose
                                   applications may require different maximum
                                   line widths, so this option provides that
                                   capability.  The number following the
                                   option name can be specified in decimal,
                                   octal (starting with 0), or hexadecimal
                                   (starting with 0x).  A zero or negative
                                   value is interpreted to mean unlimited, so
                                   -max-width 0 can be used to ensure that
                                   each field/value pair appears on a single
                                   line.

                                   When -no-prettyprint requests bibclean to
                                   act as a lexical analyzer, the default line
                                   width is unlimited, unless overridden by
                                   this option.

                                   When bibclean is prettyprinting, line
                                   wrapping is done only at a space.
                                   Consequently, a long non-blank character
                                   sequence may result in the output exceeding
                                   the requested line width.

                                   When bibclean is lexing, line wrapping is
                                   done by inserting a backslash-newline pair
                                   when the specified maximum is reached, so
                                   no line length ever exceeds the maximum.
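
       The width rules above can be sketched in Python.  All three function
       names are hypothetical, and Python's textwrap module stands in for
       bibclean's own wrapping code, so this is an illustration of the
       documented behavior under those assumptions rather than the real
       implementation.

```python
import textwrap


def parse_width(text):
    """Parse decimal, octal (leading 0), or hexadecimal (leading 0x)."""
    s = text.strip().lower()
    sign = -1 if s.startswith("-") else 1
    digits = s.lstrip("+-")
    if digits.startswith("0x"):
        value = int(digits[2:], 16)
    elif digits.startswith("0") and len(digits) > 1:
        value = int(digits[1:], 8)
    else:
        value = int(digits)
    return sign * value


def wrap_pretty(text, width):
    """Prettyprinter-style wrapping: break only at spaces."""
    if width <= 0:                     # zero or negative means unlimited
        return [text]
    # A long unbroken token is kept whole, so a line may exceed width.
    return textwrap.wrap(text, width=width,
                         break_long_words=False, break_on_hyphens=False)


def wrap_lexer(text, width):
    """Lexer-style wrapping: backslash-newline, never exceeding width."""
    if width <= 0 or len(text) <= width:
        return text
    if width < 2:
        raise ValueError("width must leave room for the backslash")
    pieces = []
    while len(text) > width:
        pieces.append(text[:width - 1] + "\\")   # backslash is last column
        text = text[width - 1:]
    pieces.append(text)
    return "\n".join(pieces)


if __name__ == "__main__":
    print(parse_width("72"), parse_width("0110"), parse_width("0x48"))
    print(wrap_pretty("see a_long_unbreakable_token now", 10))
    print(wrap_lexer("abcdefghij", 4))
```

       The three width spellings in the first print call all denote the
       same value, the default of 72.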

       -[no-]align-equals          With the positive form, align the equals
                                   sign in key/value assignments at the same
                                   column, separated by a single space from
                                   the value string.  Otherwise, the equals
                                   sign follows the key, separated by a single
                                   space.  Default: no.

       -[no-]brace-protect         Protect uppercase and mixedcase words at
                                   brace-level zero with braces to prevent
                                   downcasing by some BibTeX styles.  Default:
                                   yes.
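
       As a rough illustration of brace protection, the following Python
       sketch wraps runs of two or more upper-case letters that occur at
       brace level zero.  The function name and the exact rule are
       assumptions made for the example; bibclean's real treatment of
       mixed-case words and of TeX control sequences is more elaborate.

```python
def brace_protect(title):
    """Brace runs of 2+ upper-case letters found at brace level zero."""
    out = []
    depth = 0
    i = 0
    while i < len(title):
        ch = title[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if depth == 0 and ch.isupper():
            j = i
            while j < len(title) and title[j].isupper():
                j += 1
            run = title[i:j]
            # Single capitals (e.g. the first letter of a sentence) are
            # left alone in this simplified rule.
            out.append("{" + run + "}" if len(run) > 1 else run)
            i = j
            continue
        out.append(ch)
        i += 1
    return "".join(out)


if __name__ == "__main__":
    print(brace_protect("Solving PDE systems on {IBM} hardware"))
```

       Text that is already inside braces, such as {IBM} above, is passed
       through unchanged.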

       -[no-]check-values          With the positive form, apply heuristic
                                   pattern matching to field values in order
                                   to detect possible errors (e.g., ``year =
                                   "192"'' instead of ``year = "1992"''), and
                                   issue warnings when unexpected patterns are
                                   found.

                                   That checking is usually beneficial, but if
                                   it produces too many bogus warnings for a
                                   particular bibliography file, you can
                                   disable it with the negative form of this
                                   option.  Default: yes.
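
       The idea of value checking can be shown with a short Python sketch.
       Python regular expressions are used here purely as a stand-in:
       bibclean reads its own pattern syntax from initialization files, and
       the pattern table and function name below are hypothetical.

```python
import re

# Hypothetical pattern table: each field maps to the patterns it may match.
PATTERNS = {
    "year": [r"\d{4}"],
}


def check_value(field, value, patterns=PATTERNS):
    """Return a warning string if no pattern matches, otherwise None."""
    candidates = patterns.get(field.lower())
    if not candidates:
        return None                    # fields without patterns are not checked
    if any(re.fullmatch(p, value) for p in candidates):
        return None
    return f'Unexpected value in ``{field} = "{value}"\'\''


if __name__ == "__main__":
    print(check_value("year", "192"))
    print(check_value("year", "1992"))
```

       Only the first call produces a warning, which mirrors the year
       example given in the option description.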

       -[no-]debug-match-failures  With the positive form, print out a warning
                                   when a value pattern fails to match a value
                                   string.

                                   That is helpful in debugging new patterns,
                                   but because the output can be voluminous,
                                   you should use this option only with small
                                   test files, and initialization files that
                                   eliminate all patterns apart from the ones
                                   that you are testing.  Default: no.

       -[no-]delete-empty-values   With the positive form, remove all
                                   field/value pairs for which the value is an
                                   empty string.  That is helpful in cleaning
                                   up bibliographies generated from text
                                   editor templates.  Compare this option with
                                   -[no-]remove-OPT-prefixes described below.
                                   Default: no.

       -[no-]file-position         With the positive form, give detailed file
                                   position information in warning and error
                                   messages.  Default: no.

       -[no-]fix-accents           With the positive form, normalize TeX
                                   accents in annotes, authors, booktitles,
                                   editors, notes, remarks, and titles.
                                   Default: no.

       -[no-]fix-braces            With the positive form, normalize bracing
                                   in annotes, authors, booktitles, editors,
                                   notes, remarks, and titles, by removing
                                   unnecessary levels of braces.  Default: no.

       -[no-]fix-degrees           With the positive form, remove spaces in
                                   author/editor fields inside braces after
                                   letter-ending periods.  That makes
                                   reductions from
                                   J. J. {Thomson, M. A., F. R. S.},
                                   Frederick {Soddy, B. A. (Oxon.)}, and
                                   John A. {Cable, M. A., M. Ed., Dipl.
                                   Deutsch (Marburg), A. L. C. M.} to
                                   J. J. {Thomson, M.A., F.R.S.},
                                   Frederick {Soddy, B.A. (Oxon.)}, and
                                   John A. {Cable, M.A., M.Ed., Dipl.Deutsch
                                   (Marburg), A.L.C.M.}, respectively.

                                   In journals in the humanities and history
                                   of science, as well as in some scientific
                                   journals until well into the 20th Century,
                                   academic, honorary, and professional titles
                                   and degrees are commonly attached to
                                   personal names.  Even though modern
                                   publishing practice avoids such
                                   decorations, for accuracy, bibliography
                                   entries should preferably retain them.
                                   Journal typographical practice generally
                                   follows the reductions described here.
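
       The reductions listed for -fix-degrees can be reproduced with a
       small Python sketch.  The function name is hypothetical, and the
       regular expressions are an assumption about what "letter-ending
       periods" means; only non-nested brace groups are handled.

```python
import re


def fix_degrees(name):
    """Remove spaces after letter-ending periods inside brace groups."""
    def squeeze(match):
        # A period that follows a letter and is followed by spaces and
        # another letter loses the spaces.
        return re.sub(r"(?<=[A-Za-z])\. +(?=[A-Za-z])", ".", match.group(0))

    return re.sub(r"\{[^{}]*\}", squeeze, name)


if __name__ == "__main__":
    for example in (
        "J. J. {Thomson, M. A., F. R. S.}",
        "Frederick {Soddy, B. A. (Oxon.)}",
        "John A. {Cable, M. A., M. Ed., Dipl. Deutsch (Marburg), A. L. C. M.}",
    ):
        print(fix_degrees(example))
```

       Initials outside the braces, such as "J. J." and "John A.", keep
       their spaces because only the braced part is rewritten.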

       -[no-]fix-font-changes      With the positive form, supply an
                                   additional brace level around font changes
                                   in titles to protect against downcasing by
                                   some BibTeX styles.  Font changes that
                                   already have more than one level of braces
                                   are not modified.

                                   For example, if a title contains the Latin
                                   phrase {\em Dictyostelium discoideum} or
                                   {\em {D}ictyostelium discoideum}, then
                                   downcasing incorrectly converts the phrase
                                   to lower-case letters.  Most BibTeX users
                                   are surprised that bracing the initial
                                   letters does not prevent the downcase
                                   action.  The correct coding is
                                   {{\em Dictyostelium discoideum}}.  However,
                                   there are also legitimate cases where an
                                   extra level of bracing wrongly protects
                                   from downcasing.  Consequently, bibclean
                                   normally does not supply an extra level of
                                   braces, but if you have a bibliography
                                   where the extra braces are routinely
                                   missing, you can use this option to supply
                                   them.

                                   If you think that you need this option, it
                                   is strongly recommended that you apply
                                   bibclean to your bibliography file with and
                                   without -fix-font-changes, then compare the
                                   two output files to ensure that extra
                                   braces are not being supplied in titles
                                   where they should not be present.  You must
                                   decide which of the two output files is the
                                   better choice, then repair the incorrect
                                   title bracing by hand.

                                   Because font changes in titles are
                                   uncommon, except for cases of the type that
                                   this option is designed to correct, it
                                   should do more good than harm.  Default: no.

       -[no-]fix-initials          With the positive form, insert a space
                                   after a period following author initials.
                                   Default: yes.

       -[no-]fix-math              With the positive form, improve readability
                                   of math mode in titles by inserting spaces
                                   around operators, deleting other
                                   unnecessary space, and removing braces
                                   around single-character subscripts and
                                   superscripts.  Default: no.

       -[no-]fix-names             With the positive form, reorder author and
                                   editor name lists to remove commas at brace
                                   level zero, placing first names or initials
                                   before last names.  Default: yes.
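
       The combined effect of -fix-initials and -fix-names on simple names
       can be sketched in Python.  The function names are hypothetical, and
       the sketch ignores braces, "Jr." parts, and other cases that a real
       name parser must handle.

```python
import re


def fix_initials(name):
    """Insert a space after a period that separates two capital initials."""
    return re.sub(r"(?<=[A-Z])\.(?=[A-Z])", ". ", name)


def fix_name(name):
    """Turn ``Last, First'' into ``First Last'' and space the initials."""
    if "," in name:
        last, first = (part.strip() for part in name.split(",", 1))
        name = f"{first} {last}"
    return fix_initials(name)


def fix_name_list(value):
    """Apply fix_name to each name of a BibTeX ``and''-separated list."""
    return " and ".join(fix_name(n) for n in value.split(" and "))


if __name__ == "__main__":
    print(fix_name("P.D.Q. Bach"))
    print(fix_name("Bach, P.D.Q."))
    print(fix_name_list("Bach, P.D.Q. and Knuth, Donald E."))
```

       Both spellings of the first name normalize to the form
       ``P. D. Q. Bach'' given in the DESCRIPTION section.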

       -[no-]German-style          With the positive form, interpret quote
                                   characters ["] inside braced value strings
                                   at brace level 1 according to the
                                   conventions of the TeX style file
                                   german.sty, which overloads quote to
                                   simplify input and representation of German
                                   umlaut accents, sharp-s (es-zet), ligature
                                   separators, invisible hyphens,
                                   raised/lowered quotes, French guillemets,
                                   and discretionary hyphens.  Recognized
                                   character combinations are braced to
                                   prevent BibTeX from interpreting the quote
                                   as a string delimiter.

                                   Quoted strings receive no special handling
                                   from this option, and because German nouns
                                   in titles must anyway be protected from the
                                   downcasing operation of most BibTeX
                                   bibliography styles, German value strings
                                   that use the overloaded quote character can
                                   always be entered in the form "{...}",
                                   without the need to specify this option at
                                   all.

                                   Default: no.

       -[no-]keep-linebreaks       Normally, line breaks inside value strings
                                   are collapsed into a single space, so that
                                   long value strings can later be broken to
                                   provide lines of reasonable length.

                                   With the positive form, linebreaks are
                                   preserved in value strings.  If -max-width
                                   is set to zero, this preserves the original
                                   line breaks.  Spacing outside value strings
                                   remains under bibclean's control, and is
                                   not affected by this option.

                                   Default: no.

       -[no-]keep-parbreaks        With the positive form, preserve paragraph
                                   breaks (either formfeeds, or lines
                                   containing only spaces) in value strings.
                                   Normally, paragraph breaks are collapsed
                                   into a single space.  Spacing outside value
                                   strings remains under bibclean's control,
                                   and is not affected by this option.
                                   Default: no.

       -[no-]keep-preamble-spaces  With the positive form, preserve all
                                   whitespace in @Preamble{...} entries.
                                   Default: no.

       -[no-]keep-spaces           With the positive form, preserve all spaces
                                   in value strings.  Normally, multiple
                                   spaces are collapsed into a single space.
                                   This option can be used together with
                                   -keep-linebreaks, -keep-parbreaks, and
                                   -max-width 0 to preserve the form of value
                                   strings while still providing syntax and
                                   value checking.  Spacing outside value
                                   strings remains under bibclean's control,
                                   and is not affected by this option.
                                   Default: no.
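
       The default whitespace collapsing, and the way the keep options
       switch it off, can be pictured with the following Python sketch.
       The function and its keyword arguments are hypothetical stand-ins
       for -keep-linebreaks and -keep-spaces; paragraph breaks and the
       interaction with line wrapping are not modelled.

```python
import re


def normalize_value(value, keep_linebreaks=False, keep_spaces=False):
    """Collapse line breaks and runs of blanks inside a value string."""
    if not keep_linebreaks:
        value = value.replace("\n", " ")          # line break becomes a space
    if not keep_spaces:
        value = re.sub(r"[ \t]+", " ", value)     # runs of blanks collapse
    return value


if __name__ == "__main__":
    raw = "A  study of\n   wrapped   titles"
    print(repr(normalize_value(raw)))
    print(repr(normalize_value(raw, keep_linebreaks=True)))
    print(repr(normalize_value(raw, keep_linebreaks=True, keep_spaces=True)))
```

       With both keep arguments set, the value string is returned exactly
       as it was entered.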

       -[no-]keep-string-spaces    With the positive form, preserve all
                                   whitespace in @String{...} entries.
                                   Default: no.

       -[no-]parbreaks             With the negative form, a paragraph break
                                   (either a formfeed, or a line containing
                                   only spaces) is not permitted in value
                                   strings, or between field/value pairs.
                                   That may be useful to quickly trap runaway
                                   strings arising from mismatched delimiters.
                                   Default: yes.

       -[no-]prettyprint           Normally, bibclean functions as a
                                   prettyprinter.  However, with the negative
                                   form of this option, it acts as a lexical
                                   analyzer instead, producing a stream of
                                   lexical tokens.  See the LEXICAL ANALYSIS
                                   manual section for further details.
                                   Default: yes.

       -[no-]print-ISBN-table      With the positive form, print the
                                   ISBN-range table on stderr, then terminate
                                   with a success return code.

                                   That action is taken after all command-line
                                   options are processed, and before any input
                                   files are read (other than those that are
                                   values of command-line options).

                                   The format of the output ISBN-range table
                                   is acceptable for input as an ISBN
                                   initialization file (see the ISBN
                                   INITIALIZATION FILES manual section).
                                   Default: no.

       -[no-]print-keyword-table   With the positive form, print the keyword
                                   initialization table on stderr, then
                                   terminate with a success return code.

                                   That action is taken after all command-line
                                   options are processed, and before any input
                                   files are read (other than those that are
                                   values of command-line options).

                                   The format of the output table is
                                   acceptable for input as a keyword
                                   initialization file (see the KEYWORD
                                   INITIALIZATION FILES manual section).
                                   Default: no.

       -[no-]print-patterns        With the positive form, print the value
                                   patterns read from initialization files as
                                   they are added to internal tables.  Use
                                   this option to check newly-added patterns,
                                   or to see what patterns are being used.

                                   When bibclean is compiled with native
                                   pattern-matching code (the default), those
                                   patterns are the ones that are used in
                                   checking value strings for valid syntax,
                                   and all of them are specified in
                                   initialization files, rather than
                                   hard-coded into the program.  For further
                                   details, see the INITIALIZATION FILES
                                   manual section.  Default: no.

       -[no-]quiet                 This option is the opposite of
                                   -[no-]warnings; it exists for user
                                   convenience, and for compatibility with
                                   other programs that use -q for quiet
                                   operation, without warning messages.

       -[no-]read-init-files       With the negative form, suppress loading of
                                   system-, user-, and file-specific
                                   initialization files.  Initializations then
                                   come only from those files explicitly given
                                   by -init-file filename options.  Default:
                                   yes.

486       1m-[no-]remove-OPT-prefixes   22mWith the positive form, remove the  ``OPT''
487                                   prefix  from each field name where the cor-
488                                   responding value is 4mnot24m  an  empty  string.
489                                   The  prefix ``OPT'' must be entirely in up-
490                                   per-case to be recognized.
491
492                                   This option is for bibliographies generated
493                                   with the help of the GNU Emacs BibTeX edit-
494                                   ing support, which generates templates with
495                                   optional  fields  identified by the ``OPT''
496                                   prefix.  Although the function 4mM-x24m  4mbibtex-0m
497                                   4mremove-OPT24m, normally bound to the keystrokes
498                                   4mC-c24m 4mC-o24m, does the job, users  often  forget,
499                                   with the result that BibTeX does not recog-
500                                   nize the field name, and ignores the  value
501                                   string.     Compare    this   option   with
502                                   1m-[no-]delete-empty-values 22mdescribed  above.
503                                   Default: 4mno24m.
504
505       1m-[no-]scribe                22mWith the positive form, accept input syntax
506                                   conforming to the Scribe  document  system.
507                                   The  output is converted to conform to Bib-
508                                   TeX syntax.  See  the  1mSCRIBE  BIBLIOGRAPHY0m
509                                   1mFORMAT  22mmanual section for further details.
510                                   Default: 4mno24m.
511
512       1m-[no-]trace-file-opening    22mWith the positive form, record in the error
513                                   log  file  the names of all files that 1mbib-0m
514                                   1mclean 22mattempts to open.  Use this option to
515                                   identify where initialization files are lo-
516                                   cated.  Default: 4mno24m.
517
518       1m-[no-]warnings              22mWith the positive form, allow  all  warning
519                                   messages.   The negative form is 4mnot24m recom-
520                                   mended because it may  mask  problems  that
521                                   should be repaired.  Default: 4myes24m.
522
523       1m-output-file 4m22mfilename24m       Supply  an alternate output file to replace
524                                   4mstdout24m.  If the filename cannot  be  opened
525                                   for  output,  execution  terminates immedi-
526                                   ately with a nonzero exit code.
527
528       1m-version                    22mDisplay  the  program  version  number   on
529                                   4mstderr24m,  and  then terminate with a success
530                                   return code.  That includes  an  indication
531                                   of  who compiled the program, the host name
532                                   on which it was compiled, the time of  com-
533                                   pilation,  and  the  type  of  string-value
534                                   matching code selected, when that  informa-
535                                   tion is available to the compiler.
536
538 1mERROR RECOVERY AND WARNINGS0m
539       When  1mbibclean  22mdetects  an  error,  it issues an error message to both
540       4mstderr24m and 4mstdout24m.  That way, the user is  clearly  notified,  and  the
541       output bibliography also contains the message at the point of error.
542
543       Error  messages begin with a distinctive pair of queries, ??, beginning
544       in column 1, followed by the input file name and line number.   If  the
545       1m-file-position  22moption  was  specified, they also contain the input and
546       output positions of the current file, entry, and value.  Each  position
547       includes  the file byte number, the line number, and the column number.
548       In the event of a runaway string argument, the entry  and  value  posi-
549       tions  should  precisely pinpoint the erroneous bibliography entry, and
550       the file positions indicate where it was detected, which may be  rather
551       later in the files.
552
553       Warning  messages  identify  possible  problems, and are therefore sent
554       only to 4mstderr24m, and not to 4mstdout24m, so they never appear in  the  output
555       file.   They  are identified by a distinctive pair of percents, %%, be-
556       ginning in column 1, and as with error messages,  may  be  followed  by
557       file position messages if the 1m-file-position 22moption was specified.
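
       The two prefixes make it straightforward to sort captured messages
       mechanically.  The following Python fragment is a minimal sketch,  not
       part of bibclean; the function name and the sample message texts are
       hypothetical, and only the ?? and %% prefixes are taken from the
       description above.

```python
def classify_message(line):
    """Classify one captured line by its two-character prefix in column 1."""
    if line.startswith("??"):
        return "error"
    if line.startswith("%%"):
        return "warning"
    return "other"


# Hypothetical sample lines; only the leading ?? and %% are documented.
sample = [
    '?? "mylib.bib", line 12: hypothetical error text',
    '%% "mylib.bib", line 30: hypothetical warning text',
    "@Article{Knuth:1986:TB,",
]

counts = {}
for text in sample:
    kind = classify_message(text)
    counts[kind] = counts.get(kind, 0) + 1
```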
558
559       For  convenience, the first line of each error and warning message sent
560       to 4mstderr24m is formatted according to the expectations of the  GNU  Emacs
561       4mnext-error24m  command.   You  can invoke 1mbibclean 22mwith the Emacs 4mM-x24m 4mcom-0m
562       4mpile<RET>bibclean24m 4mfilename.bib24m  4m>filename.new24m  command,  then  use  the
563       4mnext-error24m  command,  normally bound to 4mC-x24m 4m`24m (that's a grave, or back,
564       accent), to move to the location of the error in the input file.
565
566       If error messages are ignored, and  left  in  the  output  bibliography
567       file,  they  precipitate  an  error  when the bibliography is next pro-
568       cessed with BibTeX.
569
570       After issuing an error message, 1mbibclean 22mthen resynchronizes its  input
571       by copying it verbatim to 4mstdout24m until a new bibliography entry is rec-
572       ognized on a line in which the first non-blank character is an  at-sign
573       (@).   That ensures that nothing is lost from the input file(s), allow-
574       ing corrections to be made in either the input  or  the  output  files.
575       However,  if 1mbibclean 22mdetects an internal error in its data structures,
576       it terminates abruptly without further input or output processing; that
577       kind  of  error  should  never happen, and if it does, it should be re-
578       ported immediately to the author of the program.  Errors in initializa-
579       tion  files, and running out of dynamic memory, also immediately termi-
580       nate 1mbibclean22m.
581
583 1mSEARCH PATHS0m
584       Versions of 1mbibclean 22mbefore 3.00 found  some  of  their  initialization
585       files  in  the  same  directory as the executable program.  That design
586       choice means that those files can be copied anywhere in the  file  sys-
587       tem, and still be found at run time.  Some software distributions, how-
588       ever, prefer to follow the model where initialization and other related
589       files  are  instead stored in a directory whose name is related to that
590       of the executable by a conventional difference in filepath.  For  exam-
591       ple,  a program might be installed in 4m/opt/bin24m and its associated files
592       in   4m/opt/share/lib/PROGRAMNAME/24m   or   4m/opt/share/lib/PROGRAMNAME/PRO-0m
593       4mGRAMVERSION/24m.  The second form is preferable, because it permits multi-
594       ple versions of the same program to be installed, as long as  the  exe-
595       cutable  program  names carry a version suffix. Thus, a site might have
596       installed programs named 4mbibclean-1.0024m,  4mbibclean-2.0024m,  4mbibclean-2.1524m,
597       and  4mbibclean-3.0024m, with the versionless name 4mbibclean24m being a symbolic
598       link to whichever version is the desired local default.
599
600       With most software packages, the absolute path to  the  directory  con-
601       taining associated files is compiled into the program, making it impos-
602       sible to change the installation locations after the program  has  been
603       built from source code.
604
605       Some packages, however, instead use the location of the executable pro-
606       gram to find files by relative path at runtime.  In the above  example,
607       the  program  would  determine  its filesystem location at runtime, say
608       4m/opt/bin24m, then find its associated files relative to that  location  in
609       4m../share/lib/PROGRAMNAME/PROGRAMVERSION/24m.
610
611       From  version 3.00, 4mbibclean24m uses that second approach, with an associ-
612       ated directory like 4m../share/lib/bibclean/3.0024m.  That allows an instal-
613       lation  directory tree to be distributed to other systems and unbundled
614       4manywhere24m in the file system, as long as  the  relative  paths  are  not
615       changed.   4mbibclean24m tests whether its compiled-in library path is a di-
616       rectory on the local system, and if so, uses  it.   Otherwise,  it  re-
617       places  that  path  by a reconstructed one based on the location of the
618       executable program.  If the reconstructed path for the  library  direc-
619       tory does not exist, it issues a warning.  In either case, it continues
620       normally.
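
       The lookup order just described can be sketched in Python.  This is an
       illustration under stated assumptions, not the code used by bibclean:
       the function name and its arguments are hypothetical, and whether the
       real program resolves symbolic links before taking the directory of the
       executable is not stated here, so the sketch does not resolve them.

```python
import os
import tempfile


def resolve_library_dir(compiled_in, executable, relative):
    """Return (path, exists): prefer the compiled-in directory, otherwise
    rebuild the path relative to the directory holding the executable."""
    if os.path.isdir(compiled_in):
        return compiled_in, True
    bindir = os.path.dirname(os.path.abspath(executable))
    candidate = os.path.normpath(os.path.join(bindir, relative))
    return candidate, os.path.isdir(candidate)


# Demonstration in a scratch tree that mimics an unbundled installation.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "bin"))
    libdir = os.path.join(root, "share", "lib", "bibclean", "3.00")
    os.makedirs(libdir)
    found, exists = resolve_library_dir(
        os.path.join(root, "no-such-dir"),      # compiled-in path is absent
        os.path.join(root, "bin", "bibclean"),  # location of the executable
        "../share/lib/bibclean/3.00",
    )
    # When exists is False, a caller would issue a warning and continue.
```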
621
622       With the old approach, initialization files on Unix systems were  named
623       with  a  leading period, making them `hidden' files for the 4mls24m command.
624       With the new practice, initialization files are no longer named as hid-
625       den files.
626
628 1mINITIALIZATION FILES0m
629       1mbibclean  22mcan  be compiled with one of three different types of pattern
630       matching; the choice is made by the installer at compile time:
631
632              o  The original version uses explicit hand-coded tests of value-
633                 string syntax.
634
635              o  The  second  version uses regular-expression pattern-matching
636                 host library routines together with  regular-expression  pat-
637                 terns that come entirely from initialization files.
638
639              o  The  third  version  uses special patterns that come entirely
640                 from initialization files.
641
642       The second and third versions are the ones of most interest  here,  be-
643       cause they allow the user to control what values are considered accept-
644       able.  However, command-line options can also be specified in  initial-
645       ization files, no matter which pattern matching choice was selected.
646
647       When 1mbibclean 22mstarts, it searches for initialization files, finding the
648       first one in the system executable program search path (on UNIX and IBM
649       PC  DOS, 1mPATH22m) and the first one in the 1mBIBINPUTS 22msearch path, and pro-
650       cesses them in turn.  Then, when command-line arguments are  processed,
651       any  additional files specified by 1m-init-file 4m22mfilename24m options are also
652       processed.  Finally, immediately before each 4mnamed24m bibliography file is
653       processed,  an  attempt  is made to process an initialization file with
654       the same name, but with the extension changed to 4m.ini24m.  The default ex-
655       tension  can  be  changed by a setting of the environment variable 1mBIB-0m
656       1mCLEANEXT22m.  That scheme permits  system-wide,  user-wide,  session-wide,
657       and file-specific initialization files to be supported.
658
659       When  input  is taken from 4mstdin24m, there is no file-specific initializa-
660       tion.
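
       As a rough illustration of the file-specific step, the Python sketch
       below derives the name of the initialization file that corresponds to
       a bibliography file.  It is not bibclean's own code.  The function name
       is hypothetical, and the sketch assumes that the value of BIBCLEANEXT
       includes the leading period, which is not stated above.

```python
import os


def file_specific_init(bibfile, environ):
    """Return the per-file initialization file name, or None for stdin."""
    if bibfile == "-":
        return None                      # no file-specific initialization
    # Assumption: BIBCLEANEXT, when set, holds an extension such as ".ini".
    extension = environ.get("BIBCLEANEXT", ".ini")
    root, _old_extension = os.path.splitext(bibfile)
    return root + extension
```

       For example, file_specific_init("mylib.bib", {}) yields mylib.ini.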
661
662       For precise control, the 1m-no-read-init-files 22moption suppresses all ini-
663       tialization  files except those explicitly named by 1m-init-file 4m22mfilename0m
664       options, either on the command line,  or  in  requested  initialization
665       files.
666
667       Recursive  execution of initialization files with nested 1m-init-file 22mop-
668       tions is permitted; if the recursion is circular, 1mbibclean 22mfinally gets
669       a  non-fatal  initialization  file  open failure after opening too many
670       files.  That terminates further initialization file processing.  As the
671       recursion  unwinds,  the  files are all closed, then execution proceeds
672       normally.
673
674       An initialization file may contain empty lines, comments  from  percent
675       to  end  of line (just like TeX), option switches, and field/pattern or
676       field/pattern/message assignments.  Leading and trailing spaces are ig-
677       nored.  That is best illustrated by a short example:
678
679       % This is a small bibclean initialization file
680
681       -init-file /u/math/bib/.bibcleanrc  %% departmental patterns
682
683       chapter = "\"D\""                 %% 23
684
685       pages   = "\"D--D\""              %% 23--27
686
687       volume  = "\"D \\an\\d D\""       %% 11 and 12
688
689       year    = \
690          "\"dddd, dddd, dddd\"" \
691          "Multiple years specified."      %% 1989, 1990, 1991
692
693       -no-fix-names   %% do not modify author/editor lists
694
695       Long  logical lines can be split into multiple physical lines by break-
696       ing at a backslash-newline pair; the  backslash-newline  pair  is  dis-
697       carded.   That  processing happens while characters are being read, be-
698       fore any further interpretation of the input stream.
699
700       Each logical line must contain a complete option  (and  its  value,  if
701       any),  or  a  complete  field/pattern  pair, or a field/pattern/message
702       triple.
703
704       Comments are stripped during the parsing of  the  field,  pattern,  and
705       message  values.   The  comment  start  symbol is not recognized inside
706       quoted strings, so it can be freely used in such strings.
707
708       Comments on logical lines that were input as  multiple  physical  lines
709       via  the  backslash-newline convention must appear on the 4mlast24m physical
710       line; otherwise, the remaining physical lines become part of  the  com-
711       ment.
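
       The line-joining and comment rules can be modelled in a few lines of
       Python.  This is a simplified sketch, not the parser in bibclean: the
       function names are hypothetical, and the second sample line (a note
       field) is invented for illustration.

```python
def logical_lines(text):
    """Discard backslash-newline pairs first, then split into logical lines."""
    return text.replace("\\\n", "").split("\n")


def strip_comment(line):
    """Remove a comment that starts at a percent outside a quoted string."""
    in_string = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_string and ch == "\\" and i + 1 < len(line):
            i += 2                       # skip an escaped character
            continue
        if ch == '"':
            in_string = not in_string
        elif ch == "%" and not in_string:
            return line[:i].strip()
        i += 1
    return line.strip()


sample = (
    'year    = \\\n'
    '   "\\"dddd, dddd, dddd\\"" \\\n'
    '   "Multiple years specified."      %% 1989, 1990, 1991\n'
    'note = "100\\% sure"  % hypothetical line with a trailing comment\n'
)
cleaned = [strip_comment(line) for line in logical_lines(sample)]
cleaned = [line for line in cleaned if line]
```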
712
713       Pattern  strings  must  be  enclosed  in  quotation  marks; within such
714       strings, a backslash starts an escape mechanism that is  commonly  used
715       in UNIX software.  The recognized escape sequences are:
716
717              1m\a     22malarm bell (octal 007)
718
719              1m\b     22mbackspace (octal 010)
720
721              1m\f     22mformfeed (octal 014)
722
723              1m\n     22mnewline (octal 012)
724
725              1m\r     22mcarriage return (octal 015)
726
727              1m\t     22mhorizontal tab (octal 011)
728
729              1m\v     22mvertical tab (octal 013)
730
731              1m\ooo   22mcharacter number octal 4mooo24m (e.g., 1m\012 22mis linefeed).  Up to
732                     3 octal digits may be used.
733
734              1m\0xhh  22mcharacter number hexadecimal 4mhh24m  (e.g.,  1m\0x0a  22mis  line-
735                     feed).   4mxhh24m may be in either letter case.  Any number of
736                     hexadecimal digits may be used.
737
738       Backslash followed by any other character produces just that character.
739       Thus, \% gets a literal percent into a string (preventing its interpre-
740       tation as a comment), \" produces a quotation mark, and \\  produces  a
741       single backslash.
742
743       An ASCII NUL 4m(\0)24m in a string terminates it; that is a feature of the C
744       programming language in which 1mbibclean 22mis implemented.
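
       The escape rules can be summarized by a small decoder.  The Python
       sketch below is an illustration only, not the C code in bibclean.  The
       function name is hypothetical, and reducing numeric values modulo 256
       is an assumption about how over-long octal or hexadecimal values fit
       into a single character.

```python
SIMPLE = {"a": "\a", "b": "\b", "f": "\f", "n": "\n",
          "r": "\r", "t": "\t", "v": "\v"}
OCTAL_DIGITS = "01234567"
HEX_DIGITS = "0123456789abcdefABCDEF"


def decode_escapes(body):
    """Decode the backslash escapes listed above in the body of a string."""
    out = []
    i, n = 0, len(body)
    while i < n:
        ch = body[i]
        if ch != "\\" or i + 1 >= n:
            out.append(ch)
            i += 1
            continue
        nxt = body[i + 1]
        if nxt == "0" and body[i + 2:i + 3] in ("x", "X"):
            j = i + 3
            while j < n and body[j] in HEX_DIGITS:
                j += 1                   # any number of hexadecimal digits
            if j > i + 3:
                out.append(chr(int(body[i + 3:j], 16) % 256))  # assumption
                i = j
                continue
        if nxt in OCTAL_DIGITS:
            j = i + 1
            while j < n and j < i + 4 and body[j] in OCTAL_DIGITS:
                j += 1                   # at most three octal digits
            out.append(chr(int(body[i + 1:j], 8) % 256))       # assumption
            i = j
            continue
        out.append(SIMPLE.get(nxt, nxt))  # any other character is literal
        i += 2
    # An ASCII NUL terminates the string, as in C.
    return "".join(out).split("\0", 1)[0]
```

       Applied to the body of the volume pattern in the example above,
       \"D \\an\\d D\", the decoder yields "D \an\d D", which is the text
       that the pattern matcher then interprets.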
745
746       Field/pattern pairs can be separated by arbitrary  space,  and  option-
747       ally, either an equals sign or colon functioning as an assignment oper-
748       ator.  Thus, the following are equivalent:
749
750       pages="\"D--D\""
751       pages:"\"D--D\""
752       pages "\"D--D\""
753         pages = "\"D--D\""
754         pages : "\"D--D\""
755       pages   "\"D--D\""
756
757       Each field name can have an arbitrary  number  of  patterns  associated
758       with  it; however, they must be specified in separate field/pattern as-
759       signments.
760
761       An empty pattern string  causes  previously-loaded  patterns  for  that
762       field  name  to  be  forgotten.  That feature permits an initialization
763       file to completely discard patterns from earlier initialization files.
764
765       Patterns for value strings are represented in  a  tiny  special-purpose
766       language  that  is both convenient and suitable for bibliography value-
767       string syntax checking.  While not as powerful as the language of regu-
768       lar-expression  patterns,  its  parsing  can be portably implemented in
769       less than 3% of the code in  a  widely-used  regular-expression  parser
770       (the GNU 1mregexp 22mpackage).
771
772       The patterns are represented by the following special characters:
773
774              1m<space>  22mone or more spaces
775
776              1ma        22mexactly one letter
777
778              1mA        22mone or more letters
779
780              1md        22mexactly one digit
781
782              1mD        22mone or more digits
783
784              1mr        22mexactly one Roman numeral
785
786              1mR        22mone or more Roman numerals (i.e. a Roman number)
787
788              1mw        22mexactly one word (one or more letters and digits)
789
790              1mW        22mone or more space-separated words, beginning and ending
791                       with a word
792
793              1m.        22mone  `special'  character,  one   of   the   characters
794                       <space>!#()*+,-./:;?[]~,  a subset of punctuation char-
795                       acters that are typically used in string values
796
797              1m:        22mone or more `special' characters
798
799              1mX        22mone or more `special'-separated  words,  beginning  and
800                       ending with a word
801
802              1m\x       22mexactly  one  x  (x is any character), possibly with an
803                       escape sequence interpretation given earlier
804
805              1mx        22mexactly the character x (x is anything but one of these
806                       pattern characters: aAdDrRwWX.:<space>\)
807
808       The  1mX  22mpattern  character is very powerful, but generally inadvisable,
809       because it matches almost anything likely to be found in a BibTeX value
810       string.  The reason for providing pattern matching on the value strings
811       is to uncover possible errors, not mask them.
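
       To make the pattern language concrete, the Python sketch below
       translates a pattern into a regular expression and tests whether a
       value begins with a match.  It is a model for experimentation, not the
       matcher built into bibclean, and it makes several assumptions: letters
       and digits are ASCII, Roman numerals may be in either letter case, a
       backslash always makes the next character literal, and the regular
       expression engine may backtrack where the native matcher might not.
       All names in the sketch are hypothetical.

```python
import re

LETTER = "[A-Za-z]"
DIGIT = "[0-9]"
ROMAN = "[IVXLCDMivxlcdm]"               # assumption: either letter case
WORD = "[A-Za-z0-9]+"
SPECIAL = "[" + "".join(re.escape(c) for c in " !#()*+,-./:;?[]~") + "]"

FRAGMENTS = {
    " ": " +",                           # one or more spaces
    "a": LETTER, "A": LETTER + "+",
    "d": DIGIT, "D": DIGIT + "+",
    "r": ROMAN, "R": ROMAN + "+",
    "w": WORD,
    "W": WORD + "(?: +" + WORD + ")*",
    ".": SPECIAL,
    ":": SPECIAL + "+",
    "X": WORD + "(?:" + SPECIAL + "+" + WORD + ")*",
}


def pattern_to_regex(pattern):
    """Translate a value pattern into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            parts.append(re.escape(pattern[i + 1]))   # \x: exactly one x
            i += 2
        else:
            parts.append(FRAGMENTS.get(ch, re.escape(ch)))
            i += 1
    return re.compile("".join(parts))


def value_matches(pattern, value):
    """True if the value starts with text matching the whole pattern."""
    return pattern_to_regex(pattern).match(value) is not None
```

       With those definitions, value_matches('"D--D"', '"23--27"') is true,
       and value_matches('"D"', '"23a"') is false.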
812
813       There is no provision for specifying ranges or repetitions  of  charac-
814       ters,  but  that  can  usually be done with separate patterns.  It is a
815       good idea to accompany the pattern with a comment showing the  kind  of
816       thing  it is expected to match.  Here is a portion of an initialization
817       file giving a few of the patterns used to match 4mnumber24m value strings:
818
819       number  =       "\"D\""         %% 23
820       number  =       "\"A AD\""      %% PN LPS5001
821       number  =       "\"A D(D)\""    %% RJ 34(49)
822       number  =       "\"A D\""       %% XNSS 288811
823       number  =       "\"A D\\.D\""   %% Version 3.20
824       number  =       "\"A-A-D-D\""   %% UMIAC-TR-89-11
825       number  =       "\"A-A-D\""     %% CS-TR-2189
826       number  =       "\"A-A-D\\.D\"" %% CS-TR-21.7
827
828       For a bibliography that contains only 4marticle24m entries, that list should
829       probably  be  reduced to just the first pattern, so that anything other
830       than a digit string fails the pattern-match test.  That is easily  done
831       by  keeping bibliography-specific patterns in a corresponding file with
832       extension 4m.ini24m, because that file is read automatically.
833
834       You should be sure to use empty pattern strings in the pattern file  to
835       discard patterns from earlier initialization files.
836
837       The  value  strings  passed  to the pattern matcher contain surrounding
838       quotes, so the patterns should also.  However, you could use a  pattern
839       specification  like  "\"D" to match an initial digit string followed by
840       anything else; the omission of the final quotation mark \" in the  pat-
841       tern allows the match to succeed without checking that the next charac-
842       ter in the value string is a quotation mark.
843
844       Because the value strings are intended to be processed by TeX, the pat-
845       tern  matching ignores braces, and TeX control sequences, together with
846       any space following those control sequences.  Spaces around braces  are
847       preserved.  That convention allows the pattern fragment 4mA-AD-D24m to match
848       the value string 4mTN-K\slash24m 4m27-7024m, because the value is implicitly col-
849       lapsed to 4mTN-K27-7024m during the matching operation.
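
       The collapsing step can be approximated as follows.  This Python
       fragment is a sketch, not bibclean's implementation; the function name
       is hypothetical, and treating a backslash followed by one non-letter
       as a control sequence is an assumption.

```python
import re


def collapse_tex(value):
    """Drop TeX control sequences (with following spaces) and braces."""
    # A control word is a backslash plus letters; the single-character
    # alternative (a control symbol) is an assumption of this sketch.
    without_controls = re.sub(r"\\(?:[A-Za-z]+|.) *", "", value)
    return without_controls.replace("{", "").replace("}", "")
```

       For example, collapse_tex(r"TN-K\slash 27-70") gives TN-K27-70.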
850
851       1mbibclean22m's  normal action when a string value fails to match any of the
852       corresponding patterns is to issue a  4mwarning24m  message  something  like
853       this:  4m"Unexpected24m  4mvalue24m  4min24m 4m``year24m 4m=24m 4m"192"''24m.  In most cases, that is
854       sufficient to alert the user to a problem.  In some cases, however,  it
855       may  be  desirable  to  associate a different message with a particular
856       pattern.  That can be done by supplying a message string following  the
857       pattern  string.  Format items 4m%%24m (single percent), 4m%e24m (entry name), 4m%f0m
858       (field name), 4m%k24m (citation key), and 4m%v24m (string value) are available to
859       get current values expanded in the messages.  Here is an example:
860
861       chapter = "\"D:D\"" "Colon found in ``%f = %v''" %% 23:2
862
863       To  be  consistent  with other messages output by 1mbibclean22m, the message
864       string should 4mnot24m end with punctuation.
865
866       If you wish to make the message an error, rather than just  a  warning,
867       begin it with a query (?), like this:
868
869       chapter = "\"D:D\"" "?Colon found in ``%f = %v''" %% 23:2
870
871       The query is included in the output message.
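
       The format items and the leading query can be modelled with a short
       Python function.  It is a sketch only: the function name and argument
       order are hypothetical, unknown format items are left unchanged by
       assumption, and the template text is otherwise passed through as is.

```python
def expand_message(template, entry, field, key, value):
    """Return (severity, text) for a user-supplied message template."""
    substitutions = {"%": "%", "e": entry, "f": field, "k": key, "v": value}
    severity = "error" if template.startswith("?") else "warning"
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "%" and i + 1 < len(template) and template[i + 1] in substitutions:
            out.append(substitutions[template[i + 1]])
            i += 2
        else:
            out.append(ch)               # ordinary text, or an unknown item
            i += 1
    return severity, "".join(out)


severity, text = expand_message(
    "Colon found in ``%f = %v''", "Article", "chapter", "Knuth:1986:TB", '"23:2"'
)
```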
872
873       Escape  sequences are supported in message strings, just as they are in
874       pattern strings.  You can use that to advantage for fancy things,  such
875       as  terminal display mode control.  If you rewrite the previous example
876       as
877
878       chapter = "\"D:D\"" \
879                 "?\033[7mColon found in ``%f = %v''\033[0m" %% 23:2
880
881       the error message appears in inverse video on display screens that sup-
882       port  ANSI  terminal  control sequences.  Such practice is not normally
883       recommended, because it may have undesirable effects on some output de-
884       vices.   Nevertheless,  you  may find it useful for restricted applica-
885       tions.
886
887       For some types of bibliography fields, 1mbibclean  22mcontains  special-pur-
888       pose code to supplement or replace the pattern matching:
889
890              o  4mCODEN24m,  4mISBN24m  and  4mISSN24m field values are handled that way be-
891                 cause their validation requires evaluation of checksums  that
892                 cannot  be expressed by simple patterns; no patterns are even
893                 used in these three cases.
894
895              o  When 1mbibclean 22mis compiled with pattern-matching code support,
896                 4mchapter24m, 4mnumber24m, 4mpages24m, and 4mvolume24m values are checked only by
897                 pattern matching.
898
899              o  4mmonth24m values are first checked against  the  standard  BibTeX
900                 month  abbreviations,  and only if no match is found are pat-
901                 terns then used.
902
903              o  4myear24m values are first checked against patterns,  then  if  no
904                 match  is  found, the year numbers are found and converted to
905                 integer values for testing against reasonable bounds.
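
       To show why checksums are beyond simple patterns, here is a Python
       sketch of the standard check-digit tests for ten-digit ISBNs and for
       ISSNs.  It is independent of bibclean's own code, the function names
       are hypothetical, and CODEN checking is not covered.

```python
def check_values(text):
    """Digits as integers, X as 10; hyphens and spaces are dropped."""
    return [10 if c in "xX" else int(c) for c in text if c in "0123456789xX"]


def isbn10_checksum_ok(isbn):
    """Weights 10 down to 1; the weighted sum must be divisible by 11."""
    values = check_values(isbn)
    if len(values) != 10 or 10 in values[:-1]:
        return False                     # X is valid only as the check digit
    return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0


def issn_checksum_ok(issn):
    """Weights 8 down to 1; the weighted sum must be divisible by 11."""
    values = check_values(issn)
    if len(values) != 8 or 10 in values[:-1]:
        return False
    return sum(w * v for w, v in zip(range(8, 0, -1), values)) % 11 == 0
```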
906
907       Values for other fields are checked only  against  patterns.   You  can
908       provide  patterns  for  4many24m field you like, even ones 1mbibclean 22mdoes not
909       already know about.  New ones are simply added  to  an  internal  table
910       that is searched for each string to be validated.
911
912       The  special field, 4mkey24m, represents the bibliographic citation key.  It
913       can be given patterns, like any other field.  Here is an initialization
914       file  pattern  assignment that matches an author name, a colon, a four-
915       digit year, a colon, and an alphabetic string, in  the  BibNet  Project
916       style:
917
918       key = "A:dddd:A"                     %% Knuth:1986:TB
919
920       Notice that no quotation marks are included in the pattern, because the
921       citation keys are not quoted.  You can use such patterns  to  help  en-
922       force  uniform  naming conventions for citation keys, which is increas-
923       ingly important as your bibliography data base grows.
924
926 1mISBN INITIALIZATION FILES0m
927       1mbibclean 22mcontains a compiled-in table of ISBN ranges  and  country/lan-
928       guage settings that is suitable for most applications.
929
930       However,  ISBN data change yearly, as new countries adopt ISBNs, and as
931       publishers are granted new, or additional, ISBN prefixes.
932
933       Thus, from version 2.12, 1mbibclean 22msupports  reading  of  run-time  ISBN
934       initialization  files  found  on the 1mPATH 22m(for VAX VMS, 1mSYS$SYSTEM22m) and
935       1mBIBINPUTS 22msearch paths, and then any specified by  1m-ISBN-file  4m22mfilename0m
936       options.
937
938       That  feature  makes  it  possible to incorporate new ISBN data without
939       having to produce a new 1mbibclean 22mrelease and reinstall the software  at
940       end-user sites.
941
942       The  format  of  an  ISBN initialization file is similar to that of the
943       1mbibclean 22minitialization files described in the preceding section:  com-
944       ments  begin  with percent and continue to end of line, blank and empty
945       lines are ignored, backslash-newline joins adjacent lines,  and  other-
946       wise,  lines  are  expected  to  contain  a required pair of ISBN coun-
947       try/language-publisher prefixes forming a non-decreasing range, option-
948       ally  followed  by  one  or  more words of text that are treated as the
949       country/language group value.  The latter value plays no part  in  ISBN
950       validation,  but its presence is strongly recommended, in order to make
951       the ISBN table more understandable for humans.
952
953       Here is a short example:
954              %% The Faeroes got ISBN assignments between 1993 and 1998
955              99918-0         99918-3        Faeroes
956              99918-40        99918-61
957              99918-900       99918-938
958       It is not necessary to repeat the country names on  succeeding  entries
959       with  the  same initial number (99918 in that example); that is handled
960       internally.
961
962       Data from ISBN files normally augment the compiled-in  data.   However,
963       if  the  first  prefix  begins with a hyphen, then 1mbibclean 22mdeletes the
964       first entry in the table matching that first prefix (ignoring the lead-
965       ing hyphen):
966              %% Latvia got ISBN ranges between 1993 and 1998
967              %% so we remove the old placeholder, then add the
968              %% new ranges.
969              -9984-0         9984-9         This one is no longer valid
970
971              9984-00         9984-20        Latvia
972              9984-500        9984-770
973              9984-9000       9984-9984
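
       The file format and the deletion rule can be sketched in Python.  This
       is an illustrative reader, not the one in bibclean: the function name,
       the tuple layout of the table, and the contents of the starting table
       are all hypothetical.

```python
def load_isbn_ranges(text, table=None):
    """Apply ISBN initialization data to a list of (begin, end, group)."""
    table = [] if table is None else list(table)
    groups = {}                          # last group name per initial number
    for raw in text.replace("\\\n", "").splitlines():
        line = raw.split("%", 1)[0].strip()      # drop comments
        if not line:
            continue
        fields = line.split(None, 2)
        if len(fields) < 2:
            raise ValueError("expected a pair of prefixes: " + raw)
        begin, end = fields[0], fields[1]
        name = fields[2] if len(fields) > 2 else ""
        if begin.startswith("-"):
            # Deletion request: remove the first entry with that first prefix.
            for index, entry in enumerate(table):
                if entry[0] == begin[1:]:
                    del table[index]
                    break
            continue
        initial = begin.split("-", 1)[0]
        if name:
            groups[initial] = name
        table.append((begin, end, groups.get(initial, "")))
    return table


sample = """\
%% Latvia got ISBN ranges between 1993 and 1998
-9984-0         9984-9         This one is no longer valid

9984-00         9984-20        Latvia
9984-500        9984-770
9984-9000       9984-9984
"""
# Hypothetical starting table holding only the old placeholder entry.
ranges = load_isbn_ranges(sample, [("9984-0", "9984-9", "Latvia")])
```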
974
976 1mKEYWORD INITIALIZATION FILES0m
977       1mbibclean 22mcontains a compiled-in table of keyword mappings that is suit-
978       able for most applications.  The default settings merely adjust letter-
979       case  in certain keyword names, so that, for example, 4misbn24m is output as
980       4mISBN24m.
981
982       From version 2.12, 1mbibclean 22msupports reading of run-time  keyword  ini-
983       tialization  files found on the 1mPATH 22m(for VAX VMS, 1mSYS$SYSTEM22m) and 1mBIB-0m
984       1mINPUTS 22msearch paths, and then any specified by  1m-keyword-file  4m22mfilename0m
985       options.
986
987       That  feature makes it possible to incorporate special spellings of new
988       keywords without having to produce a new 1mbibclean 22mrelease and reinstall
989       the software at end-user sites.
990
991       The  format  of a keyword initialization file is similar to that of the
992       other 1mbibclean 22minitialization files described  in  the  preceding  sec-
993       tions:  comments  begin with percent and continue to end of line, blank
994       and empty lines are ignored, backslash-newline  joins  adjacent  lines,
995       and otherwise, lines are expected to contain a required pair of old and
996       new keyword names.
997
998       Here is a short example:
999              %% We want special handling of MathReviews keywords
1000              mrclass         MRclass
1001              mrnumber        MRnumber
1002              mrreviewer      MRreviewer
1003
1004       Data from keyword files normally augment  the  compiled-in  data.   How-
1005       ever,  if the first keyword begins with a hyphen, then 1mbibclean 22mdeletes
1006       the first entry in the table matching that keyword (ignoring the  lead-
1007       ing hyphen):
1008              %% Remove special handling of ISBN, ISSN, and LCCN values.
1009              -issn           ISSN
1010              -isbn           ISBN
1011              -lccn           LCCN
1012       Even  though  the  second keyword in each deletion pair is not used, it
1013       still must be specified.
1014
1015       Notice that this feature can be used to regularize keyword  names,  but
1016       use  it  with  care, in order to avoid producing duplicate key names in
1017       output BibTeX entries:
1018              %% Map variations of keywords into a common name:
1019              keys            keywords
1020              keywds          keywords
1021              keyword         keywords
1022              keywrd          keywords
1023              keywrds         keywords
1024              searchkey       keywords
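
       The risk of duplicates can be checked before such a mapping is put
       into use.  The Python sketch below is not part of bibclean; the
       function name is hypothetical, and looking up names in lower case is
       an assumption about how keywords are compared.

```python
def rename_fields(field_names, mapping):
    """Apply a keyword mapping and report names that now occur twice."""
    renamed = [mapping.get(name.lower(), name) for name in field_names]
    seen = set()
    duplicates = []
    for name in renamed:
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    return renamed, duplicates


mapping = {"keys": "keywords", "keywds": "keywords", "isbn": "ISBN"}
renamed, duplicates = rename_fields(["author", "keys", "keywds", "isbn"], mapping)
```

       Here two distinct input fields both become keywords, so the entry
       would need manual repair.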
1025
1027 1mLEXICAL ANALYSIS0m
1028       When 1m-no-prettyprint 22mis specified, 1mbibclean 22macts as a lexical  analyzer
1029       instead of a prettyprinter, producing output in lines of the form
1030
1031              <token-number><tab><token-name><tab>"<token-value>"
1032
1033       Each  output  line  contains  a  single complete token, identified by a
1034       small integer number for use by a computer program, a token  type  name
1035       for human readers, and a string value in quotes.
1036
1037       Special  characters  in  the  token  value  string are represented with
1038       ANSI/ISO Standard C escape sequences, so all characters other than  NUL
1039       are representable, and multi-line values can be represented in a single
1040       line.
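
       A consumer of this format might split each line on the two tabs and
       undo the C escapes.  The following Python sketch assumes exactly the
       three-field layout shown above; the function names decode_c_escapes
       and parse_token_line are hypothetical, and the sample line is
       hand-written rather than captured from bibclean.

```python
import re

# Single-character escapes defined by Standard C.
_SIMPLE = {"a": "\a", "b": "\b", "f": "\f", "n": "\n", "r": "\r",
           "t": "\t", "v": "\v", "\\": "\\", "'": "'", '"': '"', "?": "?"}

# A backslash followed by 1-3 octal digits, x plus hex digits, or any character.
_ESCAPE = re.compile(r"\\(?:([0-7]{1,3})|x([0-9A-Fa-f]+)|(.))", re.DOTALL)


def decode_c_escapes(text):
    """Replace C escape sequences in `text` by the characters they denote."""
    def replace(match):
        octal, hexadecimal, simple = match.groups()
        if octal is not None:
            return chr(int(octal, 8))
        if hexadecimal is not None:
            return chr(int(hexadecimal, 16))
        # Unknown escapes fall back to the escaped character itself.
        return _SIMPLE.get(simple, simple)
    return _ESCAPE.sub(replace, text)


def parse_token_line(line):
    """Split one token line into (number, name, decoded value)."""
    number, name, quoted = line.rstrip("\n").split("\t", 2)
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError(f"token value is not quoted: {line!r}")
    return int(number), name, decode_c_escapes(quoted[1:-1])


sample = '19\tVALUE\t"\\"Line one\\nline two\\""'
print(parse_token_line(sample))
```

       The value is decoded only after the outer quotes are removed, so
       escaped quotes inside the value survive as ordinary characters.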

       Here are the token numbers and token type names that can appear in the
       output when -no-prettyprint is specified:

               0   UNKNOWN
               1   ABBREV
               2   AT
               3   COMMA
               4   COMMENT
               5   ENTRY
               6   EQUALS
               7   FIELD
               8   INCLUDE
               9   INLINE
              10   KEY
              11   LBRACE
              12   LITERAL
              13   NEWLINE
              14   PREAMBLE
              15   RBRACE
              16   SHARP
              17   SPACE
              18   STRING
              19   VALUE
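
       For programs that consume the token stream, the table above can be
       transcribed into an enumeration.  The Python class name Token is an
       assumption made for this sketch; the numbers and names are copied
       from the table.

```python
from enum import IntEnum


class Token(IntEnum):
    """Token numbers and names as listed in the table above."""
    UNKNOWN = 0
    ABBREV = 1
    AT = 2
    COMMA = 3
    COMMENT = 4
    ENTRY = 5
    EQUALS = 6
    FIELD = 7
    INCLUDE = 8
    INLINE = 9
    KEY = 10
    LBRACE = 11
    LITERAL = 12
    NEWLINE = 13
    PREAMBLE = 14
    RBRACE = 15
    SHARP = 16
    SPACE = 17
    STRING = 18
    VALUE = 19


# Look up a token either by its number or by its name.
print(Token(10).name, int(Token["VALUE"]))
```

       Comparing the number and the name found on each input line against
       such a table is a cheap consistency check on the stream.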

       Programs that parse such output should also be prepared for lines
       beginning with the warning prefix, %%, or the error prefix, ??, and for
       ANSI/ISO Standard C line-number directives of the form
              # line 273 "texbook1.bib"
       that record the line number and file name of the current input file.
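
       One way to separate these line kinds before token parsing is
       sketched below in Python.  The function name classify_line, the kind
       labels, and the regular expression for the directive are assumptions
       based only on the forms shown above.

```python
import re

# Matches directives such as:  # line 273 "texbook1.bib"
LINE_DIRECTIVE = re.compile(r'^#\s*line\s+(\d+)\s+"(.*)"\s*$')


def classify_line(line):
    """Return (kind, payload) for one line of lexical-analyzer output."""
    if line.startswith("%%"):
        return "warning", line[2:].strip()
    if line.startswith("??"):
        return "error", line[2:].strip()
    match = LINE_DIRECTIVE.match(line)
    if match:
        return "line", (int(match.group(1)), match.group(2))
    # Anything else is assumed to be an ordinary token line.
    return "token", line


print(classify_line('# line 273 "texbook1.bib"'))
```

       Token lines begin with a number, so testing the prefixes first does
       not misclassify them.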

       If a -max-width nnn command-line option was specified, long output
       lines are wrapped at a backslash-newline pair, and consequently,
       software that processes the lexical token stream should be prepared to
       collapse such wrapped lines back into single lines.
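
       Collapsing the wrapped lines can be done before any other parsing.
       The Python generator below is a minimal sketch; the name
       join_wrapped_lines is hypothetical, and it assumes that a physical
       line ending in a backslash always marks a wrap point, which holds
       for token lines because they otherwise end in a closing quote.

```python
def join_wrapped_lines(lines):
    """Yield logical lines, merging physical lines that end in a backslash."""
    pending = ""
    for line in lines:
        line = line.rstrip("\n")
        if line.endswith("\\"):
            # Drop the continuation backslash and keep accumulating.
            pending += line[:-1]
        else:
            yield pending + line
            pending = ""
    if pending:
        # Input ended in the middle of a wrapped line.
        yield pending


physical = ['19\tVALUE\t"a long \\\n', 'value"\n', '3\tCOMMA\t","\n']
for logical in join_wrapped_lines(physical):
    print(repr(logical))
```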

       As an example of the use of -no-prettyprint, the UNIX command pipeline
              bibclean -no-prettyprint mylib.bib | \
                  awk '$2 == "KEY" {print $3}' | \
                  sed -e 's/"//g' | \
                  sort
       extracts a sorted list of all citation keys in the file mylib.bib.
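
       The same extraction can be sketched in Python for text that has
       already been captured from bibclean -no-prettyprint.  The function
       name citation_keys is hypothetical and the sample text is
       hand-written, not real bibclean output; sorting is by code point,
       which may differ from the locale-dependent order of sort.

```python
def citation_keys(lexer_output):
    """Return the sorted citation keys found in lexical-analyzer output."""
    keys = []
    for line in lexer_output.splitlines():
        fields = line.split("\t", 2)
        if len(fields) == 3 and fields[1] == "KEY":
            # Like the sed step, remove every double quote from the value.
            keys.append(fields[2].replace('"', ""))
    return sorted(keys)


sample = (
    '2\tAT\t"@"\n'
    '5\tENTRY\t"Book"\n'
    '11\tLBRACE\t"{"\n'
    '10\tKEY\t"Knuth:1984"\n'
    '10\tKEY\t"Beebe:1993"\n'
)
print(citation_keys(sample))
```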

       A certain amount of processing has been done on the tokens.  In
       particular, delimiters equivalent to braces have been replaced by
       braces, and braced strings have become quoted strings.

       The LITERAL token type is used for arbitrary text that bibclean does
       not examine further, such as the contents of a @Preamble{...} or a
       @Comment{...}.

       The UNKNOWN token type should never appear in the output stream.  It is
       used internally to initialize token type variables.

SCRIBE BIBLIOGRAPHY FORMAT
       bibclean's support for the Scribe bibliography format is based on the
       syntax description in the Scribe Introductory User's Manual, 3rd
       Edition, May 1980.  Scribe was originally developed by Brian Reid at
       Carnegie-Mellon University, and was marketed by Unilogic, Ltd., later
       renamed to Scribe Systems, and apparently now long defunct.

       The BibTeX bibliography format was strongly influenced by Scribe, and
       indeed, with care, it is possible to share bibliography files between
       the two systems.  Nevertheless, there are some differences, so here is
       a summary of features of the Scribe bibliography file format:

       (1)   Letter case is not significant in field names and entry names,
             but case is preserved in value strings.

       (2)   In field/value pairs, the field and value may be separated by one
             of three characters: =, /, or space.  Space may optionally
             surround these separators.

       (3)   Value delimiters are any of these seven pairs: { }   [ ]   ( )
             < >   ' '   " "   ` `

       (4)   Value delimiters may not be nested, even though with the first
             four delimiter pairs, nested balanced delimiters would be
             unambiguous.

       (5)   Delimiters can be omitted around values that contain only
             letters, digits, sharp (#), ampersand (&), period (.), and
             percent (%).

       (6)   Outside of delimited values, a literal at-sign (@) is represented
             by doubled at-signs (@@).

       (7)   Bibliography entries begin with @name, as for BibTeX, but any of
             the seven Scribe value delimiter pairs may be used to surround
             the values in field/value pairs.  As in (4), nested delimiters
             are forbidden.

       (8)   Arbitrary space may separate entry names from the following
             delimiters.

       (9)   @Comment is a special command whose delimited value is discarded.
             As in (4), nested delimiters are forbidden.

       (10)  The special form

             @Begin{comment}
              ...
             @End{comment}

             permits encapsulating arbitrary text containing any characters or
             delimiters, other than ``@End{comment}''.  Any of the seven
             delimiter pairs may be used around the word ``comment'' following
             the ``@Begin'' or ``@End''; the delimiters in the two cases need
             not be the same, and consequently,
             ``@Begin{comment}''/``@End{comment}'' pairs may not be nested.

       (11)  The key field is required in each bibliography entry.

       (12)  A backslashed quote in a string is assumed to be a TeX accent,
             and braced appropriately.  While such accents do not conform to
             Scribe syntax, Scribe-format bibliographies have been found that
             appear to be intended for TeX processing.

       Because of that loose syntax, bibclean's normal error detection
       heuristics are less effective, and consequently, Scribe mode input is
       not the default; it must be explicitly requested with the -scribe
       option.
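
       Rules (3) and (4) can be illustrated with a small Python sketch that
       reads one delimited value.  It is not bibclean's scanner: the name
       read_scribe_value is hypothetical, a value is assumed to end at the
       first occurrence of its own closing delimiter, other delimiter
       characters inside the value are treated as ordinary text, and the
       TeX accent handling of rule (12) is ignored.

```python
# The seven Scribe value delimiter pairs from rule (3).
SCRIBE_DELIMITERS = {"{": "}", "[": "]", "(": ")", "<": ">",
                     "'": "'", '"': '"', "`": "`"}


def read_scribe_value(text, start=0):
    """Return (value, next_index) for the delimited value at text[start].

    Nesting is not supported, in line with rule (4): the value ends at
    the first occurrence of the closing delimiter.
    """
    opener = text[start]
    closer = SCRIBE_DELIMITERS.get(opener)
    if closer is None:
        raise ValueError(f"not a Scribe value delimiter: {opener!r}")
    end = text.find(closer, start + 1)
    if end < 0:
        raise ValueError("unterminated delimited value")
    return text[start + 1:end], end + 1


print(read_scribe_value("<Brian Reid>, Year = 1980"))
print(read_scribe_value("[a{b}c]"))
```

       Because the search stops at the first closing delimiter, an input
       such as {a{b}c} yields the value a{b, which shows why nested
       delimiters cannot be accepted under this rule.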

ENVIRONMENT VARIABLES
       BIBCLEANEXT   File extension of bibliography-specific initialization
                     files.  Default: .ini.

       BIBCLEANINI   Name of bibclean initialization files.  Default:
                     .bibcleanrc (UNIX), bibclean.ini (non-UNIX).

       BIBCLEANISBN  Name of bibclean ISBN initialization files.  Default:
                     .bibclean.isbn (UNIX), bibclean.isb (non-UNIX).

       BIBCLEANKEY   Name of bibclean keyword initialization files.  Default:
                     .bibclean.key (UNIX), bibclean.key (non-UNIX).

       BIBINPUTS     Search path for bibclean and BibTeX input files.  On
                     UNIX, it is a colon-separated list of directories that
                     are searched in order from first to last.  It is not an
                     error for a specified directory to not exist.

                     On other operating systems, the directory names should be
                     separated by whatever character is used in system search
                     path specifications, such as a semicolon on IBM PC DOS.

       PATH          On Atari TOS, IBM PC DOS, IBM PC OS/2, Microsoft NT, and
                     UNIX, search path for system executable files.  The
                     system-wide bibclean initialization file is searched for
                     in that path.

       SYS$SYSTEM    On VAX VMS, search path for system executable files and
                     the system-wide bibclean initialization file.
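
       The first-to-last directory search described for BIBINPUTS can be
       sketched in Python as follows.  The function name
       find_in_search_path is hypothetical, empty path elements are
       skipped by assumption, and the sketch does not claim to reproduce
       the exact lookup order that bibclean uses.

```python
import os


def find_in_search_path(filename, search_path, separator=os.pathsep):
    """Return the first existing path for `filename`, or None.

    Directories are tried from first to last.  A directory that does
    not exist is simply skipped, since that is not an error.
    """
    for directory in search_path.split(separator):
        if not directory:
            continue  # assumption: empty elements are ignored
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


# os.pathsep is ':' on UNIX and ';' on Windows-style systems.
print(find_in_search_path("mylib.bib", os.environ.get("BIBINPUTS", "")))
```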

FILES
       *.bib          BibTeX and Scribe bibliography data base files.

       *.ini          File-specific initialization files.

       .bibclean.isbn UNIX system-wide and user-specific ISBN initialization
                      files.

       .bibclean.key  UNIX system-wide and user-specific keyword
                      initialization files.

       .bibcleanrc    UNIX system-wide and user-specific initialization files.

       bibclean.ini   Non-UNIX system-wide and user-specific initialization
                      files.

       bibclean.isb   Non-UNIX system-wide and user-specific ISBN
                      initialization files.

       bibclean.key   Non-UNIX system-wide and user-specific keyword
                      initialization files.

SEE ALSO
       bibcheck(1), bibdup(1), bibextract(1), bibindex(1), bibjoin(1),
       biblabel(1), biblex(1), biblook(1), biborder(1), bibparse(1),
       bibsearch(1), bibsort(1), bibtex(1), bibunlex(1), citefind(1),
       citesub(1), citetags(1), latex(1), scribe(1), tex(1).

AUTHOR
       Nelson H. F. Beebe
       University of Utah
       Department of Mathematics, 110 LCB
       155 S 1400 E RM 233
       Salt Lake City, UT 84112-0090
       USA
       Tel: +1 801 581 5254
       FAX: +1 801 581 4148
       Email: beebe@math.utah.edu, beebe@acm.org, beebe@computer.org (Internet)
       URL: http://www.math.utah.edu/~beebe

COPYRIGHT
       ########################################################################
       ########################################################################
       ########################################################################
       ###                                                                  ###
       ###     bibclean: prettyprint and syntax check BibTeX and Scribe     ###
       ###                   bibliography data base files                   ###
       ###                                                                  ###
       ###           Copyright (C) 1990--2016 Nelson H. F. Beebe            ###
       ###                                                                  ###
       ### This program is covered by the GNU General Public License (GPL), ###
       ### version 2 or later, available as the file COPYING in the program ###
       ### source distribution, and on the Internet at                      ###
       ###                                                                  ###
       ###               ftp://ftp.gnu.org/gnu/GPL                          ###
       ###                                                                  ###
       ###               http://www.gnu.org/copyleft/gpl.html               ###
       ###                                                                  ###
       ### This program is free software; you can redistribute it and/or    ###
       ### modify it under the terms of the GNU General Public License as   ###
       ### published by the Free Software Foundation; either version 2 of   ###
       ### the License, or (at your option) any later version.              ###
       ###                                                                  ###
       ### This program is distributed in the hope that it will be useful,  ###
       ### but WITHOUT ANY WARRANTY; without even the implied warranty of   ###
       ### MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the    ###
       ### GNU General Public License for more details.                     ###
       ###                                                                  ###
       ### You should have received a copy of the GNU General Public        ###
       ### License along with this program; if not, write to the Free       ###
       ### Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,   ###
       ### MA 02111-1307 USA                                                ###
       ########################################################################
       ########################################################################
       ########################################################################

Version 3.05                      18 May 2020                      BIBCLEAN(1)