[manpage_begin textutil n 0.8]
[see_also regexp(n)]
[see_also split(n)]
[see_also string(n)]
[keywords formatting]
[keywords hyphenation]
[keywords indenting]
[keywords paragraph]
[keywords {regular expression}]
[keywords string]
[keywords TeX]
[keywords trimming]
[moddesc   {Text and string utilities, macro processing}]
[titledesc {Procedures to manipulate texts and strings.}]
[category  {Text processing}]
[require Tcl 8.2]
[require textutil [opt 0.8]]
[description]

The package [package textutil] provides commands that manipulate
strings or texts (a.k.a. long strings, or strings with embedded
newlines or paragraphs).

It is actually a bundle providing the commands of the six packages

[list_begin definitions]
[def [package textutil::adjust]]
[def [package textutil::repeat]]
[def [package textutil::split]]
[def [package textutil::string]]
[def [package textutil::tabify]]
[def [package textutil::trim]]
[list_end]

in the namespace [namespace textutil].

[para]

The bundle is [emph deprecated] and will be removed in a future
release of Tcllib, after the next release. It is recommended to use
the relevant sub packages instead, for whatever functionality is
needed by the using package or application.

[para]

The complete set of procedures is described below.

[list_begin definitions]

[call [cmd ::textutil::adjust] [arg "string args"]]

Justifies the [arg string] according to [arg args]. The string is
taken as one big paragraph, ignoring any newlines. The paragraph is
then formatted according to the options used, and the command returns
a new string with enough lines to contain all the printable
characters of the input string. A line is a run of characters between
the beginning of the string and a newline, between two newlines, or
between a newline and the end of the string. If the input string is
short enough, the returned string won't contain any newlines.

[para]

Together with [cmd ::textutil::indent] it is possible to create
properly wrapped paragraphs with arbitrary indentations.

[para]

By default, any run of space or tab characters is replaced by a
single space, so that each word in a line is separated from the next
one by exactly one space character. This forms a [emph real] line.
Each [emph real] line is placed in a [emph logical] line, which has
exactly a given length (see the [option -length] option below). The
[emph real] line may be shorter. Also by default, any trailing spaces
are removed before the string is returned (see the [option -full]
option below). The following options may be used after the
[arg string] parameter, and change the way the command places a
[emph real] line in a [emph logical] line.

[list_begin definitions]

[def "[option -full] [arg boolean]"]

If set to [const false], any trailing space characters are deleted
before returning the string. If set to [const true], any trailing
space characters are left in the string. Defaults to [const false].

[def "[option -hyphenate] [arg boolean]"]

If set to [const false], no hyphenation is done. If set to
[const true], the command tries to hyphenate the last word of a line.
Defaults to [const false]. Note: hyphenation patterns must be loaded
beforehand, using the command [cmd ::textutil::adjust::readPatterns].

[def "[option -justify] [const center|left|plain|right]"]

Sets the justification of the returned string to [const center],
[const left], [const plain] or [const right]. Defaults to
[const left]. Every line in the returned string except the last one
is built according to this value. If the justification is
[const plain] and the number of printable characters in the last line
is less than 90% of the length of a line (see [option -length]), then
that line is justified with the [const left] value, which avoids
stretching a line that is too short. The meaning of each value is:

[list_begin definitions]

[def [const center]]

The real line is centered in the logical line. If needed, space
characters are added at the beginning of the line (half of the
needed set) and, if required, at the end of the line (the other half;
see the option [option -full]).

[def [const left]]

The real line is set on the left of the logical line. This means that
there are no space characters at the beginning of the line. If
required, all needed space characters are added at the end of the
line (see the option [option -full]).

[def [const plain]]

The real line fills the logical line exactly. This means that there
are no leading or trailing space characters. All the needed space
characters are added inside the [emph real] line, between two (or
more) words.

[def [const right]]

The real line is set on the right of the logical line. This means
that there are no space characters at the end of the line, and there
may be some space characters at the beginning, regardless of the
[option -full] option.

[list_end]

[def "[option -length] [arg integer]"]

Sets the length of the [emph logical] line in the string to
[arg integer]. [arg integer] must be a positive integer
value. Defaults to [const 72].

[def "[option -strictlength] [arg boolean]"]

If set to [const false], a line can exceed the specified
[option -length] if a single word is longer than [option -length]. If
set to [const true], words that are longer than [option -length] are
split so that no line exceeds the specified [option -length]. Defaults
to [const false].

[list_end]
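
[para]

As a small illustration of the options above, the following sketch
wraps a sentence at 20 columns. The sample text and the variable
names are made up for this example, and it assumes the deprecated
[package textutil] bundle is the package being loaded.

[example {
package require textutil

set text "The quick brown   fox jumps\nover the lazy dog"

# Newlines and runs of whitespace in the input are collapsed first;
# the words are then refilled into lines of at most 20 characters.
set wrapped [::textutil::adjust $text -length 20]
puts $wrapped

# With -justify plain, lines are stretched to the full width by
# adding spaces between words, subject to the last-line rule
# described for that option.
puts [::textutil::adjust $text -length 20 -justify plain]
}]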

[call [cmd ::textutil::adjust::readPatterns] [arg filename]]

Loads the internal storage for hyphenation patterns with the contents
of the file [arg filename]. This has to be done before calling the
command [cmd ::textutil::adjust] with
"[option -hyphenate] [const true]", or the hyphenation process will
not work correctly.

[para]

The package comes with a number of predefined pattern files, and the
command [cmd ::textutil::adjust::listPredefined] can be used to find
out their names.

[call [cmd ::textutil::adjust::listPredefined]]

This command returns a list containing the names of the hyphenation
files coming with this package.

[call [cmd ::textutil::adjust::getPredefined] [arg filename]]

Use this command to query the package for the full path name of the
hyphenation file [arg filename] coming with the package. Only the
filenames found in the list returned by
[cmd ::textutil::adjust::listPredefined] are legal arguments for this
command.
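
[para]

The three hyphenation commands above are meant to be combined. A
minimal sketch, assuming at least one predefined pattern file is
installed and simply taking the first one in the list (which may not
match the language of the sample text, so the hyphenation points are
not meaningful here):

[example {
package require textutil

# Pick the first predefined pattern file and resolve its full path.
set name [lindex [::textutil::adjust::listPredefined] 0]
set path [::textutil::adjust::getPredefined $name]

# Load the patterns before asking adjust to hyphenate.
::textutil::adjust::readPatterns $path

set text "A short sample paragraph used only to demonstrate hyphenation."
puts [::textutil::adjust $text -length 24 -hyphenate true]
}]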

[call [cmd ::textutil::indent] [arg string] [arg prefix] [opt [arg skip]]]

Each line in the [arg string] is indented by adding the string
[arg prefix] at its beginning. The modified string is returned
as the result of the command.

[para]

If [arg skip] is specified, the first [arg skip] lines are left
untouched. The default for [arg skip] is [const 0], causing the
modification of all lines. Negative values for [arg skip] are treated
like [const 0]. In other words, [arg skip] > [const 0] creates a
hanging indentation.

[para]

Together with [cmd ::textutil::adjust] it is possible to create
properly wrapped paragraphs with arbitrary indentations.
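
[para]

The effect of [arg skip] is easiest to see side by side. The sketch
below uses an invented three-line text and a prefix of four spaces;
the variable names are only illustrative.

[example {
package require textutil

set text "first\nsecond\nthird"

# Every line receives the prefix.
set all [::textutil::indent $text "    "]

# skip = 1 leaves the first line alone: a hanging indentation.
set hanging [::textutil::indent $text "    " 1]

puts $all
puts $hanging
}]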

[call [cmd ::textutil::undent] [arg string]]

The command computes the common prefix of all lines in [arg string]
that consists solely of whitespace, removes it from each line, and
returns the modified string.

[para]

Lines containing only whitespace are always reduced to completely
empty lines. They and empty lines are also ignored when computing the
prefix to remove.

[para]

Together with [cmd ::textutil::adjust] it is possible to create
properly wrapped paragraphs with arbitrary indentations.
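
[para]

As an illustration of the rules above, the following sketch undents
an invented block whose shortest indentation is four spaces and which
contains one whitespace-only line.

[example {
package require textutil

set text "    alpha\n      beta\n   \n    gamma"

# The common whitespace prefix (four spaces) is removed from each
# line; the whitespace-only third line becomes an empty line.
set result [::textutil::undent $text]
puts $result
}]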

[call [cmd ::textutil::splitn] [arg string] [opt [arg len]]]

This command splits the given [arg string] into chunks of [arg len]
characters and returns a list containing these chunks. The argument
[arg len] defaults to [const 1] if none is specified. A negative
length is not allowed and will cause the command to throw an
error. Providing an empty string as input is allowed; the command
will then return an empty list. If the length of the [arg string] is
not an integral multiple of the chunk length, then the last chunk in
the generated list will be shorter than [arg len].
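
[para]

A short sketch of the chunking behaviour described above, using
made-up input strings:

[example {
package require textutil

# Eight characters in chunks of three: the last chunk is shorter.
set chunks [::textutil::splitn "abcdefgh" 3]   ;# abc def gh

# Without a length the string is split into single characters.
set chars [::textutil::splitn "abc"]           ;# a b c

# An empty string yields an empty list.
set none [::textutil::splitn "" 4]
}]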

[call [cmd ::textutil::splitx] [arg string] [opt [arg regexp]]]

Splits the [arg string] and returns a list. The string is split
according to the regular expression [arg regexp] instead of a simple
list of characters. Note that if the [arg regexp] contains capturing
parentheses, the part of each separator matched by the parenthesized
subexpression is added to the list as an additional element. If the
[arg string] is empty the result is the empty list, like for
[cmd split]. If [arg regexp] is empty the [arg string] is split at
every character, like [cmd split] does.

[para]

The regular expression [arg regexp] defaults to "[lb]\\t \\r\\n[rb]+".
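
[para]

The difference between a plain separator and one with capturing
parentheses can be sketched as follows. The input strings and
patterns are chosen only for illustration.

[example {
package require textutil

# Default pattern: split on runs of whitespace.
set words [::textutil::splitx "one  two\tthree"]     ;# one two three

# Custom separator: a comma followed by optional whitespace.
set plain [::textutil::splitx "a, b,c" {,\s*}]       ;# a b c

# Capturing the comma adds it to the result as an extra element.
set kept [::textutil::splitx "a, b,c" {(,)\s*}]      ;# a , b , c
}]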

[call [cmd ::textutil::tabify] [arg string] [opt [arg num]]]

Tabifies the [arg string] by replacing any substring of [arg num]
space characters with a tab character and returns the result as a new
string. [arg num] defaults to 8.

[call [cmd ::textutil::tabify2] [arg string] [opt [arg num]]]

Similar to [cmd ::textutil::tabify], this command tabifies the
[arg string] and returns the result as a new string. A different
algorithm is used, however. Instead of replacing any substring of
[arg num] spaces, this command works more like an editor. [arg num]
defaults to 8.

[para]

Each line of the text in [arg string] is treated as if there are
tabstops every [arg num] columns. Only sequences of space characters
containing more than one space character and found immediately before
a tabstop are replaced with tabs.
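
[para]

A brief sketch of the two tabify commands, with invented input. The
results noted in the comments apply to [cmd ::textutil::tabify] only.
The result of [cmd ::textutil::tabify2] is printed with tabs made
visible rather than stated, since it depends on the column rules
described above.

[example {
package require textutil

# Eight spaces collapse into one tab with the default num of 8.
set t8 [::textutil::tabify "        x"]      ;# "\tx"

# With num = 4, a run of four spaces becomes one tab.
set t4 [::textutil::tabify "a    b" 4]       ;# "a\tb"

# tabify2 works on tabstop columns instead of fixed-size runs.
set t2 [::textutil::tabify2 "ab      x"]
puts [string map [list \t "<TAB>"] $t2]
}]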

[call [cmd ::textutil::trim] [arg string] [opt [arg regexp]]]

Removes from [arg string] any leading and trailing substring matching
the regular expression [arg regexp] and returns the result as a new
string. This applies to every [emph line] in the string, that is, any
substring between two newline characters, or between the beginning of
the string and a newline, or between a newline and the end of the
string, or, if the string contains no newline, between the beginning
and the end of the string.

[para]

The regular expression [arg regexp] defaults to "[lb] \\t[rb]+".

[call [cmd ::textutil::trimleft] [arg string] [opt [arg regexp]]]

Removes from [arg string] any leading substring matching the regular
expression [arg regexp] and returns the result as a new string. This
applies to every [emph line] in the string, that is, any substring
between two newline characters, or between the beginning of the
string and a newline, or between a newline and the end of the string,
or, if the string contains no newline, between the beginning and the
end of the string.

[para]

The regular expression [arg regexp] defaults to "[lb] \\t[rb]+".

[call [cmd ::textutil::trimright] [arg string] [opt [arg regexp]]]

Removes from [arg string] any trailing substring matching the regular
expression [arg regexp] and returns the result as a new string. This
applies to every [emph line] in the string, that is, any substring
between two newline characters, or between the beginning of the
string and a newline, or between a newline and the end of the string,
or, if the string contains no newline, between the beginning and the
end of the string.

[para]

The regular expression [arg regexp] defaults to "[lb] \\t[rb]+".
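
[para]

Because the three trim commands work line by line, they differ from
the builtin [cmd "string trim"], which only looks at the two ends of
the whole string. A sketch with an invented two-line text:

[example {
package require textutil

set text "  alpha  \n\tbeta\t"

set both  [::textutil::trim      $text]   ;# "alpha\nbeta"
set left  [::textutil::trimleft  $text]   ;# "alpha  \nbeta\t"
set right [::textutil::trimright $text]   ;# "  alpha\n\tbeta"

# A custom regular expression: strip runs of dashes instead.
set title [::textutil::trim "--title--" {-+}]   ;# "title"
}]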

[call [cmd ::textutil::trimPrefix] [arg string] [arg prefix]]

Removes the [arg prefix] from the beginning of [arg string] and
returns the result. The [arg string] is left unchanged if it doesn't
have [arg prefix] at its beginning.
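
[para]

For example, with made-up strings:

[example {
package require textutil

# The prefix is present and gets removed.
set name [::textutil::trimPrefix "lib-textutil" "lib-"]   ;# textutil

# The prefix is absent, so the string comes back unchanged.
set same [::textutil::trimPrefix "textutil" "lib-"]       ;# textutil
}]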

[call [cmd ::textutil::trimEmptyHeading] [arg string]]

Looks for empty lines (including lines consisting of only whitespace)
at the beginning of the [arg string] and removes them. The modified
string is returned as the result of the command.
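
[para]

A small sketch with an invented text that starts with one empty line
and one whitespace-only line. Note that the indentation of the first
non-empty line is kept.

[example {
package require textutil

set text "\n  \n  body\nmore"
set result [::textutil::trimEmptyHeading $text]   ;# "  body\nmore"
}]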

[call [cmd ::textutil::untabify] [arg string] [opt [arg num]]]

Untabifies the [arg string] by replacing any tab character with a
substring of [arg num] space characters and returns the result as a
new string. [arg num] defaults to 8.

[call [cmd ::textutil::untabify2] [arg string] [opt [arg num]]]

Untabifies the [arg string] by replacing any tab character with a
substring of at most [arg num] space characters and returns the
result as a new string. Unlike [cmd ::textutil::untabify], each tab
is not replaced by a fixed number of space characters. Instead, the
command overlays each line in the [arg string] with tabstops every
[arg num] columns and replaces tabs with just enough space characters
to reach the next tabstop. This is the complement of the actions
taken by [cmd ::textutil::tabify2]. [arg num] defaults to 8.

[para]

There is one asymmetry though: A tab can be replaced with a single
space, but not the other way around.
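
[para]

The difference between the two untabify commands shows up as soon as
a tab does not start at a tabstop. A sketch with an invented input,
where the tab follows two ordinary characters:

[example {
package require textutil

set text "ab\tx"

# untabify always inserts num spaces: 2 + 8 + 1 = 11 characters.
set fixed [::textutil::untabify $text]
puts [string length $fixed]     ;# 11

# untabify2 only pads up to the next tabstop at column 8:
# 2 + 6 + 1 = 9 characters.
set padded [::textutil::untabify2 $text]
puts [string length $padded]    ;# 9
}]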

[call [cmd ::textutil::strRepeat] [arg "text num"]]

The implementation depends on the core executing the package. It uses
[cmd "string repeat"] if it is present, or a fast Tcl implementation
if it is not. Returns a string containing the [arg text] repeated
[arg num] times. The repetitions are joined without characters between
them. A value of [arg num] <= 0 causes the command to return an empty
string.
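
[para]

For example:

[example {
package require textutil

set line  [::textutil::strRepeat "ab" 3]   ;# ababab
set empty [::textutil::strRepeat "ab" 0]   ;# empty string
}]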

[call [cmd ::textutil::blank] [arg num]]

A convenience command. Returns a string of [arg num] spaces.

[call [cmd ::textutil::chop] [arg string]]

A convenience command. Removes the last character of [arg string] and
returns the shortened string.

[call [cmd ::textutil::tail] [arg string]]

A convenience command. Removes the first character of [arg string] and
returns the shortened string.

[call [cmd ::textutil::cap] [arg string]]

Capitalizes the first character of [arg string] and returns the
modified string.

[call [cmd ::textutil::uncap] [arg string]]

The complementary operation to [cmd ::textutil::cap]. Forces the first
character of [arg string] to lower case and returns the modified
string.
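
[para]

The five small string commands above ([cmd ::textutil::blank] through
[cmd ::textutil::uncap]) can be sketched together, using invented
inputs:

[example {
package require textutil

set pad   [::textutil::blank 3]             ;# three spaces
set short [::textutil::chop "abc"]          ;# ab
set rest  [::textutil::tail "abc"]          ;# bc
set upper [::textutil::cap "hello world"]   ;# Hello world
set lower [::textutil::uncap "Hello"]       ;# hello
}]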

[call [cmd ::textutil::longestCommonPrefixList] [arg list]]
[call [cmd ::textutil::longestCommonPrefix] [opt [arg string]...]]

Computes the longest common prefix for either the [arg string]s given
to the command, or the strings specified in the single [arg list], and
returns it as the result of the command.

[para]

If no strings were specified the result is the empty string.  If only
one string was specified, the string itself is returned, as it is its
own longest common prefix.
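
[para]

Both calling conventions, sketched with made-up words:

[example {
package require textutil

# Strings passed as separate arguments.
set p1 [::textutil::longestCommonPrefix foobar foobaz food]   ;# foo

# The same computation on a single list argument.
set p2 [::textutil::longestCommonPrefixList {interstate internet interval}]   ;# inter

# A single string is its own longest common prefix.
set p3 [::textutil::longestCommonPrefix alone]   ;# alone
}]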

[list_end]

[vset CATEGORY textutil]
[include ../common-text/feedback.inc]
[manpage_end]
