[manpage_begin textutil n 0.8]
[see_also regexp(n)]
[see_also split(n)]
[see_also string(n)]
[keywords formatting]
[keywords hyphenation]
[keywords indenting]
[keywords paragraph]
[keywords {regular expression}]
[keywords string]
[keywords TeX]
[keywords trimming]
[moddesc {Text and string utilities, macro processing}]
[titledesc {Procedures to manipulate texts and strings.}]
[category {Text processing}]
[require Tcl 8.2]
[require textutil [opt 0.8]]
[description]

The package [package textutil] provides commands that manipulate
strings or texts (a.k.a. long strings, or strings with embedded
newlines or paragraphs).

It is actually a bundle providing the commands of the six packages

[list_begin definitions]
[def [package textutil::adjust]]
[def [package textutil::repeat]]
[def [package textutil::split]]
[def [package textutil::string]]
[def [package textutil::tabify]]
[def [package textutil::trim]]
[list_end]

in the namespace [namespace textutil].

[para]

The bundle is [emph deprecated], and it will be removed in a future
release of Tcllib, after the next release. It is recommended to use the
relevant sub packages directly for whatever functionality the calling
package or application needs.

[para]

The complete set of procedures is described below.

[list_begin definitions]

[call [cmd ::textutil::adjust] [arg "string args"]]

Justifies the [arg string] according to the options in [arg args]. The
string is taken as one big paragraph, ignoring any newlines. Then the
line is formatted according to the options used, and the command
returns a new string with enough lines to contain all the printable
chars in the input string. A line is a set of chars between the
beginning of the string and a newline, or between 2 newlines, or
between a newline and the end of the string. If the input string is
short enough, the returned string won't contain any newlines.
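
[para]

As a minimal sketch of the behaviour described above, the following
re-wraps a short text whose embedded newline is treated like any other
whitespace. The sample text and the variable names are illustrative
only, and the [option -length] option used here is described below.

[example {
package require textutil

# Illustrative input: the embedded newline is not kept as a line break.
set text "one two\nthree four five six"

# Re-wrap the text into logical lines of at most 10 characters.
set wrapped [::textutil::adjust $text -length 10]
puts $wrapped
}]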

[para]

Together with [cmd ::textutil::indent] it is possible to create
properly wrapped paragraphs with arbitrary indentations.

[para]

By default, any run of space or tab characters is replaced by a single
space, so that each word in a line is separated from the next one by
exactly one space char. This forms a [emph real] line. Each
[emph real] line is placed in a [emph logical] line, which has exactly
a given length (see the [option -length] option below). The
[emph real] line may be shorter. Again by default, any trailing spaces
are removed before returning the string (see the [option -full] option
below). The following options may be used after the [arg string]
parameter, and change the way the command places a [emph real] line in
a [emph logical] line.

[list_begin definitions]

[def "[option -full] [arg boolean]"]

If set to [const false], any trailing space chars are deleted before
returning the string. If set to [const true], any trailing space
chars are left in the string. Defaults to [const false].

[def "[option -hyphenate] [arg boolean]"]

If set to [const false], no hyphenation will be done. If set to
[const true], the command tries to hyphenate the last word of a line.
Defaults to [const false]. Note: hyphenation patterns must be loaded
beforehand, using the command [cmd ::textutil::adjust::readPatterns].

[def "[option -justify] [const center|left|plain|right]"]

Set the justification of the returned string to [const center],
[const left], [const plain] or [const right]. By default, it is set to
[const left]. The justification means that any line in the returned
string but the last one is built according to the value.
If the
justification is set to [const plain] and the number of printable
chars in the last line is less than 90% of the length of a line (see
[option -length]), then this line is justified with the [const left]
value, avoiding the expansion of this line when it is too short. The
meaning of each value is:

[list_begin definitions]

[def [const center]]

The real line is centered in the logical line. If needed, space
characters are added at the beginning of the line (half of the
padding) and, if required, at the end of the line (the other half, see
the option [option -full]).

[def [const left]]

The real line is set on the left of the logical line. It means that
there are no space chars at the beginning of this line. If required,
all needed space chars are added at the end of the line (see the
option [option -full]).

[def [const plain]]

The real line is exactly set in the logical line. It means that there
are no leading or trailing space chars. All the needed space chars are
added in the [emph real] line, between 2 (or more) words.

[def [const right]]

The real line is set on the right of the logical line. It means that
there are no space chars at the end of this line, and there may be
some space chars at the beginning, regardless of the [option -full]
option.

[list_end]

[def "[option -length] [arg integer]"]

Set the length of the [emph logical] line in the string to
[arg integer]. [arg integer] must be a positive integer
value. Defaults to [const 72].

[def "[option -strictlength] [arg boolean]"]

If set to [const false], a line can exceed the specified
[option -length] if a single word is longer than [option -length]. If
set to [const true], words that are longer than [option -length] are
split so that no line exceeds the specified [option -length]. Defaults
to [const false].

[list_end]

[call [cmd ::textutil::adjust::readPatterns] [arg filename]]

Loads the internal storage for hyphenation patterns with the contents
of the file [arg filename]. This has to be done prior to calling the
command [cmd ::textutil::adjust] with
"[option -hyphenate] [const true]", or the hyphenation process will
not work correctly.

[para]

The package comes with a number of predefined pattern files, and the
command [cmd ::textutil::adjust::listPredefined] can be used to find
out their names.

[call [cmd ::textutil::adjust::listPredefined]]

This command returns a list containing the names of the hyphenation
files coming with this package.

[call [cmd ::textutil::adjust::getPredefined] [arg filename]]

Use this command to query the package for the full path name of the
hyphenation file [arg filename] coming with the package. Only the
filenames found in the list returned by
[cmd ::textutil::adjust::listPredefined] are legal arguments for this
command.

[call [cmd ::textutil::indent] [arg string] [arg prefix] [opt [arg skip]]]

Each line in the [arg string] is indented by adding the string
[arg prefix] at its beginning. The modified string is returned
as the result of the command.

[para]

If [arg skip] is specified the first [arg skip] lines are left
untouched. The default for [arg skip] is [const 0], causing the
modification of all lines. Negative values for [arg skip] are treated
like [const 0]. In other words, [arg skip] > [const 0] creates a
hanging indentation.

[para]

Together with [cmd ::textutil::adjust] it is possible to create
properly wrapped paragraphs with arbitrary indentations.
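
[para]

The combination of the two commands might look like the following
sketch. The sample text, the width of 20 columns and the four-space
prefix are arbitrary choices made for illustration.

[example {
package require textutil

# Illustrative text; any paragraph will do.
set text "The quick brown fox jumps over the lazy dog."

# Wrap to 20 columns, then prefix every line with four spaces.
set wrapped  [::textutil::adjust $text -length 20]
set indented [::textutil::indent $wrapped "    "]

# With skip = 1 the first line is left untouched, which gives a
# hanging indentation.
set hanging [::textutil::indent $wrapped "    " 1]

puts $indented
puts $hanging
}]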

[call [cmd ::textutil::undent] [arg string]]

The command computes the common prefix for all
lines in [arg string] consisting solely of whitespace,
removes this from each line and returns the modified string.

[para]

Lines containing only whitespace are always reduced to completely
empty lines. They and empty lines are also ignored when computing the
prefix to remove.

[para]

Together with [cmd ::textutil::adjust] it is possible to create
properly wrapped paragraphs with arbitrary indentations.

[call [cmd ::textutil::splitn] [arg string] [opt [arg len]]]

This command splits the given [arg string] into chunks of [arg len]
characters and returns a list containing these chunks. The argument
[arg len] defaults to [const 1] if none is specified. A negative
length is not allowed and will cause the command to throw an
error. Providing an empty string as input is allowed, the command will
then return an empty list. If the length of the [arg string] is not an
integer multiple of the chunk length, then the last chunk in the
generated list will be shorter than [arg len].

[call [cmd ::textutil::splitx] [arg string] [opt [arg regexp]]]

Split the [arg string] and return a list. The string is split
according to the regular expression [arg regexp] instead of a simple
list of chars. Note that if the [arg regexp] contains capturing
parentheses, the part of the separator they match is added to the
list as an additional element. If the [arg string] is empty the result
is the empty list, like for [cmd split]. If [arg regexp] is empty the
[arg string] is split at every character, like [cmd split] does.

The regular expression [arg regexp] defaults to "[lb]\\t \\r\\n[rb]+".
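
[para]

The two splitting commands can be contrasted with a short sketch. The
input strings and regular expressions are illustrative only.

[example {
package require textutil

# Fixed-size chunks; the last chunk is shorter because 7 is not a
# multiple of 3. The chunks are abc, def and g.
set chunks [::textutil::splitn abcdefg 3]

# Split on a comma or semicolon followed by optional whitespace.
# The elements are a, b and c.
set fields [::textutil::splitx "a,  b;c" {[,;]\s*}]

# Capturing parentheses keep the matched separators as extra
# elements: a, =, 1, &, b, =, 2.
set parts [::textutil::splitx "a=1&b=2" {(=|&)}]
}]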

[call [cmd ::textutil::tabify] [arg string] [opt [arg num]]]

Tabify the [arg string] by replacing any substring of [arg num] space
chars by a tabulation and return the result as a new string. [arg num]
defaults to 8.

[call [cmd ::textutil::tabify2] [arg string] [opt [arg num]]]

Similar to [cmd ::textutil::tabify] this command tabifies the
[arg string] and returns the result as a new string. A different
algorithm is used however. Instead of replacing any substring of
[arg num] spaces this command works more like an editor. [arg num]
defaults to 8.

[para]

Each line of the text in [arg string] is treated as if there are
tabstops every [arg num] columns. Only sequences of space characters
containing more than one space character and found immediately before
a tabstop are replaced with tabs.

[call [cmd ::textutil::trim] [arg string] [opt [arg regexp]]]

Removes from [arg string] any leading and trailing substring matching
the regular expression [arg regexp] and returns the result as a new
string. This applies to every [emph line] in the string, that is any
substring between 2 newline chars, or between the beginning of the
string and a newline, or between a newline and the end of the string,
or, if the string contains no newline, between the beginning and the
end of the string.

The regular expression [arg regexp] defaults to "[lb] \\t[rb]+".

[call [cmd ::textutil::trimleft] [arg string] [opt [arg regexp]]]

Removes from [arg string] any leading substring matching the regular
expression [arg regexp] and returns the result as a new string. This
applies to every [emph line] in the string, that is any substring between
2 newline chars, or between the beginning of the string and a newline,
or between a newline and the end of the string, or, if the string
contains no newline, between the beginning and the end of the string.

The regular expression [arg regexp] defaults to "[lb] \\t[rb]+".

[call [cmd ::textutil::trimright] [arg string] [opt [arg regexp]]]

Removes from [arg string] any trailing substring matching the regular
expression [arg regexp] and returns the result as a new string. This
applies to every [emph line] in the string, that is any substring between
2 newline chars, or between the beginning of the string and a newline,
or between a newline and the end of the string, or, if the string
contains no newline, between the beginning and the end of the string.

The regular expression [arg regexp] defaults to "[lb] \\t[rb]+".

[call [cmd ::textutil::trimPrefix] [arg string] [arg prefix]]

Removes the [arg prefix] from the beginning of [arg string] and
returns the result. The [arg string] is left unchanged if it doesn't
have [arg prefix] at its beginning.

[call [cmd ::textutil::trimEmptyHeading] [arg string]]

Looks for empty lines (including lines consisting of only whitespace)
at the beginning of the [arg string] and removes them. The modified
string is returned as the result of the command.

[call [cmd ::textutil::untabify] [arg string] [opt [arg num]]]

Untabify the [arg string] by replacing any tabulation char by a
substring of [arg num] space chars and return the result as a new
string. [arg num] defaults to 8.

[call [cmd ::textutil::untabify2] [arg string] [opt [arg num]]]

Untabify the [arg string] by replacing any tabulation char by a
substring of at most [arg num] space chars and return the result as a
new string. Unlike [cmd ::textutil::untabify] each tab is not replaced
by a fixed number of space characters. The command overlays each line
in the [arg string] with tabstops every [arg num] columns instead and
replaces tabs with just enough space characters to reach the next
tabstop.
This is the complement of the actions taken by
[cmd ::textutil::tabify2]. [arg num] defaults to 8.

[para]

There is one asymmetry though: A tab can be replaced with a single
space, but not the other way around.

[call [cmd ::textutil::strRepeat] [arg "text num"]]

The implementation depends on the core executing the package. Uses
[cmd "string repeat"] if it is present, or a fast Tcl implementation
if it is not. Returns a string containing the [arg text] repeated
[arg num] times. The repetitions are joined without characters between
them. A value of [arg num] <= 0 causes the command to return an empty
string.

[call [cmd ::textutil::blank] [arg num]]

A convenience command. Returns a string of [arg num] spaces.

[call [cmd ::textutil::chop] [arg string]]

A convenience command. Removes the last character of [arg string] and
returns the shortened string.

[call [cmd ::textutil::tail] [arg string]]

A convenience command. Removes the first character of [arg string] and
returns the shortened string.

[call [cmd ::textutil::cap] [arg string]]

Capitalizes the first character of [arg string] and returns the modified string.

[call [cmd ::textutil::uncap] [arg string]]

The complementary operation to [cmd ::textutil::cap]. Forces the first
character of [arg string] to lower case and returns the modified
string.

[call [cmd ::textutil::longestCommonPrefixList] [arg list]]
[call [cmd ::textutil::longestCommonPrefix] [opt [arg string]...]]

Computes the longest common prefix for either the [arg string]s given
to the command, or the strings specified in the single [arg list], and
returns it as the result of the command.

[para]

If no strings were specified the result is the empty string. If only
one string was specified, the string itself is returned, as it is its
own longest common prefix.
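
[para]

A short sketch of both calling conventions follows. The sample words
and the variable names are illustrative only.

[example {
package require textutil

# Strings passed as separate arguments; the common prefix is "foo".
set p1 [::textutil::longestCommonPrefix foobar foobaz food]

# The same strings passed as one list give the same result.
set p2 [::textutil::longestCommonPrefixList {foobar foobaz food}]

# Strings that share no leading characters yield the empty string.
set p3 [::textutil::longestCommonPrefix abc xyz]

# A single string is its own longest common prefix.
set p4 [::textutil::longestCommonPrefix solo]
}]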

[list_end]

[vset CATEGORY textutil]
[include ../common-text/feedback.inc]
[manpage_end]