% This is for checking how the underlying Lisp copes with
% printing (and exploding) strings and symbols that contain
% multi-byte characters, i.e. utf-8 sequences for characters with
% codes above U+007f.

% The output is a little tedious to decode, but it is intended to
% illustrate a collection of cases, both as regards the actual
% output generated and the calculation of "output columns" and hence
% the way in which lines get wrapped.

% At present this is only expected to give even partly sensible
% results with CSL.

lisp;
on echo;

% test line overflow

<<
% This test displays a sequence of characters in clumps of 7 interleaved
% with numbers showing the output column that has been reached. The
% first instance uses the letter "a", which provides a simple reference case.
% Then there is a "pi", a "forall" symbol and a double-struck capital B:
% those use two, three and four bytes. Note that the #Bopf; may not be
% available in the font you use unless it is somewhat specialised.
% Note that #Zopf;, #Qopf; and #Ropf; are often used to denote the integers,
% rationals and reals, and that #Bopf; is a similar font effect.
%
% If things work well then the display should be similar in all cases,
% both in terms of the column values printed and the positions
% where line-breaks are inserted. If (e.g.) a sequence of utf-8 bytes ends up
% counted as multiple "columns" that could lead to differences.
%
% First try printing strings.
  linelength 72;
  terpri(); terpri();
  prin2 "Check linelength effect with strings";
  terpri();
  prin2 ".. each of the following 4 blocks should show the same layout";
  foreach x in list("a", "#pi;", "#ForAll;", "#Bopf;") do <<
     terpri();
     for i := 1:11 do <<
       for j := 1:7 do prin2 x;
       prin2 posn() >> >>;
  terpri(); terpri();
% Now the same but printing symbols (using prin2).
  prin2 "Check linelength effect with symbols";
  terpri();
  prin2 ".. each of the following 4 blocks should show the same layout";
  foreach x in list('a, '#pi;, '#ForAll;, '#Bopf;) do <<
     terpri();
     for i := 1:11 do <<
       for j := 1:7 do prin2 x;
       prin2 posn() >> >>;

% This section uses prin1 and variations on explode to process first strings
% and then symbols with various contents. For prin1 the requirement is that
% the output be re-inputable.
% The string here is intended to contain a jolly mix of potential issues.
  w1 := "2AbCd #pi; #ForAll; #Bopf; #hash;pi; #quot; #gamma; #Gamma;";
  foreach x in list(w1, intern w1) do <<
     terpri();
     prin2 "Test using ";
     if stringp x then prin2 "strings" else prin2 "symbols";
     terpri();
% prin2 is used just to display the information "naturally" (at least
% if you have a utf-8 capable terminal with enough fonts installed).
     prin2 "Raw: "; prin2 x; print posn();

% prin1 should generate re-inputable material, and to ensure that, it
% renders extended characters as hex sequences such as "#1234;". Within a
% string, if such a sequence occurs literally then the initial "#" is expanded
% to "#hash;". In strings any double quote mark is doubled, while in
% symbols special characters are preceded by an exclamation mark.
     prin2 "prin1: "; prin1 x; print posn();

% explode2 should be rather like prin2 except that it generates a list of
% characters. Note that this means that multi-byte sequences in the data will
% need to be rendered as single multi-byte character objects. E.g.
% explode2 "#alpha;" => (#alpha;), a list of length 1.
% spaces) it must explode2 as
     prin2 "explode2: "; prin1 explode2 x; print posn();

% explode is like prin1 except that it can end up with extended characters...
% thus
% explode "#alpha;" => (!" !#alpha; !"), a list of length 3. The only joker
% here is that if the string contains a literal sequence "# w o r d ;" (without
% the spaces) then that has to end up as (!" !# h a s h !; w o r d !; !")
% so that it is re-inputable.
     prin2 "explode: "; prin1 explode x; print posn();
% explodecn is like explodec but returns a list of the numeric codes of
% the characters involved. E.g.
% explodecn "#alpha;" => (945)
     princ "explodecn: "; prin1 explodecn x; print posn();
% exploden is like explode but returns a list of integer codes.
% Note some codes can be bigger than 0xff.
     princ "exploden: "; prin1 exploden x; print posn();
% explode2uc (and explode2lc, explode2ucn, explode2lcn) are like
% explode2 except that they fold characters to upper or lower case.
% There are two issues here. The first is whether #alpha; will change to
% #Alpha; (and similarly for all other non-Latin letters); the second
% is that the names for special characters will need to retain their
% regular case, so for instance #Alpha; must appear, not #ALPHA;, even
% after conversion to upper case. If in fact extended characters are
% printed in hex rather than using names, much of that worry evaporates.
% In some - perhaps all - locales only a-z and A-Z will be changed
% by case folding...
     princ "explode2uc: "; prin1 explode2uc x; print posn();
     princ "explode2lc: "; prin1 explode2lc x; print posn() >>;
  terpri() >>;

end;