Since some preprocessor directives insert raw HTML, it would be good to
specify, per-format, how to pass HTML so that it goes through the format
OK. With Markdown we cross our fingers; with reST we use the "raw"
directive.
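
As a rough illustration of the reST side, here is a minimal Python sketch of wrapping an HTML fragment in a `raw` directive. The function name `rst_raw_html` is made up for this example; the patch below does the equivalent in Perl, and docutils only honours the directive when `raw_enabled` is on.

```python
def rst_raw_html(html: str) -> str:
    """Wrap an HTML fragment in a reST ``raw`` directive block.

    Hypothetical helper for illustration; not part of ikiwiki.
    """
    # The directive body must be indented relative to the ".. raw::" line.
    # Empty lines are left empty so no trailing whitespace is produced.
    body = "\n".join(("   " + line) if line else "" for line in html.splitlines())
    # A blank line separates the directive line from its body.
    return ".. raw:: html\n\n" + body + "\n"


if __name__ == "__main__":
    print(rst_raw_html('<div class="note">\n<p>Inserted by a directive</p>\n</div>'))
```

Note that this only works at block level: the directive has to start on its own line, so it cannot be dropped into the middle of a sentence.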

I added an extra named parameter to the htmlize hook, which feels sort of
wrong, since none of the other hooks take parameters. Let me know what
you think. --Ethan

Seems fairly reasonable, actually. Shouldn't the `$type` come from `$page`
instead of `$destpage` though? Only other obvious change is to make the
escape parameter optional, and only call it if set. --[[Joey]]

> I couldn't figure out what to make it from, but thinking it through,
> yeah, it should be $page. Revised patch follows. --Ethan

>> I've updated the patch some more, but I think it's incomplete. ikiwiki
>> emits raw html when expanding WikiLinks too, and it would need to escape
>> those. Assuming that escaping html embedded in the middle of a sentence
>> works.. --[[Joey]]

>>> Revised again. I get around this by making another hook, htmlescapelink,
>>> which is called to generate links in whatever language. In addition, it
>>> doesn't (can't?) generate spans, and it doesn't handle inlineable image
>>> links. If these were desired, the approach to take would probably be to
>>> use substitution definitions, which would require generating two bits of
>>> code for each link/html snippet, and putting one at the end of the
>>> paragraph (or maybe the document?).
>>> To specify that (for example) Discussion links are meant to be HTML and
>>> not rst or whatever, I added a "genhtml" parameter to htmllink. It seems
>>> to work -- see <http://ikidev.betacantrips.com/blah.html> for an example.
>>> --Ethan
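
To make the link hook concrete, here is a small Python sketch of the output a reST `htmlescapelink` implementation would need to produce. It mirrors the shape of the Perl function in the patch below; the name `rst_link` and its signature are assumptions made for this example.

```python
def rst_link(url: str, text: str, broken: bool = False) -> str:
    """Format a link as a reST inline hyperlink reference.

    Hypothetical Python counterpart to the patch's htmlescapelink hook.
    """
    if broken:
        # A "?" that links to the page-creation URL, followed by the plain
        # link text. The reST escape "\ " (backslash, space) joins the two
        # without any visible whitespace in the rendered output.
        return f"`? <{url}>`_\\ {text}"
    # Ordinary link: `text <url>`_
    return f"`{text} <{url}>`_"


if __name__ == "__main__":
    print(rst_link("../index.html", "front page"))
    print(rst_link("/ikiwiki.cgi?do=create&page=newpage", "newpage", broken=True))
```

As noted above, this form carries no span or styling information, which is one of the limitations of handing links to the markup language.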

## Alternative solution

[Here](http://www.jk.fr.eu.org/ikiwiki/format-escapes-2.diff) is a patch
largely inspired by the one below, which is up to date and written with
[[todo/multiple_output_formats]] in mind. "htmlize" hooks are generalized
to "convert" ones, which can be registered for any pair of filename
extensions.

Preprocessor directives are allowed to return the content to be inserted
as a hash, in any format they want, provided they supply htmlize hooks for it.
Pseudo filename extensions (such as `"_link"`) can also be introduced,
which aren't used as real extensions but provide useful intermediate types.
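
The idea can be sketched roughly as follows, in Python rather than ikiwiki's Perl. This is not code from the patch: the registry, the function names, the `"type"`/`"content"` keys of the returned hash, and the "url text" encoding of `_link` content are all assumptions made for the example.

```python
from typing import Callable, Dict, Tuple

Converter = Callable[[str], str]

# Hypothetical registry of converters, keyed by (source type, target type).
_converters: Dict[Tuple[str, str], Converter] = {}


def register(src: str, dst: str, func: Converter) -> None:
    """Register a converter for one pair of (pseudo) filename extensions."""
    _converters[(src, dst)] = func


def convert(content: str, src: str, dst: str) -> str:
    """Convert content between two types using a registered converter."""
    if src == dst:
        return content
    if (src, dst) not in _converters:
        raise LookupError(f"no converter registered for {src!r} -> {dst!r}")
    return _converters[(src, dst)](content)


# "_link" is a pseudo type; here its content is assumed to be "url text".
def link_to_html(content: str) -> str:
    url, _, text = content.partition(" ")
    return f'<a href="{url}">{text}</a>'


def link_to_rst(content: str) -> str:
    url, _, text = content.partition(" ")
    return f"`{text} <{url}>`_"


register("_link", "html", link_to_html)
register("_link", "rst", link_to_rst)


def insert_directive_output(result: dict, page_type: str) -> str:
    """Turn a directive's returned hash into markup for the containing page."""
    return convert(result["content"], result["type"], page_type)


if __name__ == "__main__":
    result = {"type": "_link", "content": "index.html Home"}
    print(insert_directive_output(result, "rst"))
    print(insert_directive_output(result, "html"))
```

The point of the intermediate type is that a directive states what it is inserting once, and each page format decides how to express it.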

--[[JeremieKoenig]]

> Wow, this is in many ways a beautiful patch. I did notice one problem:
> if a link is converted to rst and then from there to a hyperlink, the
> styling info usually added to such a link is lost. I wonder if it would
> be better to lose _link stuff and just create link html that is fed into
> the rst,html converter. Other advantage to doing that is that link
> creation has a rather complex interface, with selflink, attrs, url, and
> content parameters.
>
> --[[Joey]]

>> Thanks for the compliment. I must confess that I'm not too familiar with
>> rst. I am using this todo item somewhat as a pretext to get the conversion
>> stuff in, which I need to implement some other stuff. As a result I was
>> less careful with the rst plugin than with the rest of the patch.
>> I just updated the patch to fix some other problems which I found with
>> more testing, and document the current limitations.

>> Rst cannot embed raw html in the middle of a paragraph, which is why
>> "_link" was necessary. Rst links are themselves tricky and can't be made to
>> work inside of words without knowledge about the context.
>> Both problems could be fixed by inserting marks instead of the html/link,
>> which would be replaced at a later stage (htmlize, format), somewhat
>> similar to the way the toc plugin works. When I get more time I will
>> try to fix the remaining glitches this way.
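
A minimal Python sketch of the marks idea, under the assumption that a purely alphanumeric token passes through the markup processor unchanged. The `MarkStore` class and the token format are invented for this example and are not taken from any of the patches.

```python
import re
import uuid


class MarkStore:
    """Swap HTML snippets for opaque marks, then restore them after htmlize."""

    def __init__(self) -> None:
        # A random hex token makes accidental collisions with page text unlikely.
        self._token = uuid.uuid4().hex
        self._snippets: list = []

    def mark(self, html: str) -> str:
        """Remember an HTML snippet and return the mark that stands in for it."""
        self._snippets.append(html)
        index = len(self._snippets) - 1
        # Letters and digits only, so the mark carries no markup meaning.
        return f"IKIMARK{self._token}N{index}END"

    def restore(self, rendered: str) -> str:
        """Replace every mark in the rendered page with its stored snippet."""
        pattern = re.compile(f"IKIMARK{self._token}N(\\d+)END")
        return pattern.sub(lambda m: self._snippets[int(m.group(1))], rendered)


if __name__ == "__main__":
    store = MarkStore()
    source = "See pre" + store.mark('<a href="fix.html">fix</a>') + "es for details."
    # Stand-in for the real markup-to-HTML step, which is out of scope here.
    rendered = "<p>" + source + "</p>"
    print(store.restore(rendered))
```

Because the mark is just a word fragment to the markup processor, it can sit in the middle of a paragraph or even inside a word, which are the two cases described above as problematic.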

>> Also, I think it would be useful if ikiwiki had an option to export
>> the preprocessed source. This way you can use docutils to convert your
>> rst documents to other formats. Raw html would be lost in such a
>> process (both with directives and marks), which is another
>> argument for `"_link"` and other intermediate forms. I think I can
>> come up with a way for rst's convert_link to be used only for export
>> purposes, though.

>> --[[JeremieKoenig]]

> Another problem with this approach is when there is some html (say a
> table), that contains a wikilink. If the link is left up to the markup
> language to handle, it will never convert it to a link, since the table
> will be processed as a chunk of raw html.
> --[[Joey]]

### Updated patch

I've created an updated [patch](http://www.idletheme.org/code/patches/ikiwiki-format-escapes-rlk-2007-09-24.diff) against the current revision.  No real functionality changes, except for a small test script, one minor bugfix (put a "join" around a scalar-context "map" in convert_link), and some wrangling to get it merged properly; I thought it might be helpful for anyone else who wants to work on the code.

(With that out of the way, I think I'm going to take a stab at Jeremie's plan to use marks which would be replaced post-htmlization.  I've also got an eye towards [[todo/multiple_output_formats]].)

--Ryan Koppenhaver

## Original patch
[[!tag patch patch/core plugins/rst]]

<pre>
Index: debian/changelog
===================================================================
--- debian/changelog	(revision 3197)
+++ debian/changelog	(working copy)
@@ -24,6 +24,9 @@
     than just a suggests, since OpenID is enabled by default.
   * Fix a bug that caused link(foo) to succeed if page foo did not exist.
   * Fix tags to page names that contain special characters.
+  * Based on a patch by Ethan, add a new htmlescape hook, that is called
+    when a preprocessor directive emits inline html. The rst plugin uses this
+    hook to support inlined raw html.

   [ Josh Triplett ]
   * Use pngcrush and optipng on all PNG files.
Index: IkiWiki/Render.pm
===================================================================
--- IkiWiki/Render.pm	(revision 3197)
+++ IkiWiki/Render.pm	(working copy)
@@ -96,7 +96,7 @@
 		if ($page !~ /.*\/\Q$discussionlink\E$/ &&
 		   (length $config{cgiurl} ||
 		    exists $links{$page."/".$discussionlink})) {
-			$template->param(discussionlink => htmllink($page, $page, gettext("Discussion"), noimageinline => 1, forcesubpage => 1));
+			$template->param(discussionlink => htmllink($page, $page, gettext("Discussion"), noimageinline => 1, forcesubpage => 1, genhtml => 1));
 			$actions++;
 		}
 	}
Index: IkiWiki/Plugin/rst.pm
===================================================================
--- IkiWiki/Plugin/rst.pm	(revision 3197)
+++ IkiWiki/Plugin/rst.pm	(working copy)
@@ -30,15 +30,36 @@
 html = publish_string(stdin.read(), writer_name='html',
        settings_overrides = { 'halt_level': 6,
                               'file_insertion_enabled': 0,
-                              'raw_enabled': 0 }
+                              'raw_enabled': 1 }
 );
 print html[html.find('<body>')+6:html.find('</body>')].strip();
 ";

 sub import {
 	hook(type => "htmlize", id => "rst", call => \&htmlize);
+	hook(type => "htmlescape", id => "rst", call => \&htmlescape);
+	hook(type => "htmlescapelink", id => "rst", call => \&htmlescapelink);
 }

+sub htmlescapelink ($$;@) {
+	my $url = shift;
+	my $text = shift;
+	my %params = @_;
+
+	if ($params{broken}){
+		return "`? <$url>`_\\ $text";
+	}
+	else {
+		return "`$text <$url>`_";
+	}
+}
+
+sub htmlescape ($) {
+	my $html=shift;
+	$html=~s/^/  /mg;
+	return ".. raw:: html\n\n".$html;
+}
+
 sub htmlize (@) {
 	my %params=@_;
 	my $content=$params{content};
Index: doc/plugins/write.mdwn
===================================================================
--- doc/plugins/write.mdwn	(revision 3197)
+++ doc/plugins/write.mdwn	(working copy)
@@ -121,6 +121,26 @@
 The function is passed named parameters: "page" and "content" and should
 return the htmlized content.

+### htmlescape
+
+	hook(type => "htmlescape", id => "ext", call => \&htmlescape);
+
+Some markup languages do not allow raw html to be mixed in with the markup
+language, and need it to be escaped in some way. This hook is a companion
+to the htmlize hook, and is called when ikiwiki detects that a preprocessor
+directive is inserting raw html. It is passed the chunk of html in
+question, and should return the escaped chunk.
+
+### htmlescapelink
+
+	hook(type => "htmlescapelink", id => "ext", call => \&htmlescapelink);
+
+Some markup languages have special syntax to link to other pages. This hook
+is a companion to the htmlize and htmlescape hooks, and it is called when a
+link is inserted. It is passed the target of the link and the text of the
+link, and an optional named parameter "broken" if a broken link is being
+generated. It should return the correctly-formatted link.
+
 ### pagetemplate

 	hook(type => "pagetemplate", id => "foo", call => \&pagetemplate);
@@ -355,6 +375,7 @@
 * forcesubpage  - set to force a link to a subpage
 * linktext - set to force the link text to something
 * anchor - set to make the link include an anchor
+* genhtml - set to generate HTML and not escape for correct format

 #### `readfile($;$)`

Index: doc/plugins/rst.mdwn
===================================================================
--- doc/plugins/rst.mdwn	(revision 3197)
+++ doc/plugins/rst.mdwn	(working copy)
@@ -10,10 +10,8 @@
 Note that this plugin does not interoperate very well with the rest of
 ikiwiki. Limitations include:

-* reStructuredText does not allow raw html to be inserted into
-  documents, but ikiwiki does so in many cases, including
-  [[WikiLinks|ikiwiki/WikiLink]] and many
-  [[Directives|ikiwiki/Directive]].
+* Some bits of ikiwiki may still assume that markdown is used or embed html
+  in ways that break reStructuredText. (Report bugs if you find any.)
 * It's slow; it forks a copy of python for each page. While there is a
   perl version of the reStructuredText processor, it is not being kept in
   sync with the standard version, so is not used.
Index: IkiWiki.pm
===================================================================
--- IkiWiki.pm	(revision 3197)
+++ IkiWiki.pm	(working copy)
@@ -469,6 +469,10 @@
 	my $page=shift; # the page that will contain the link (different for inline)
 	my $link=shift;
 	my %opts=@_;
+	# we are processing $lpage and so we need to format things in accordance
+	# with the formatting language of $lpage. inline generates HTML so links
+	# will be escaped separately.
+	my $type=pagetype($pagesources{$lpage});

 	my $bestlink;
 	if (! $opts{forcesubpage}) {
@@ -494,12 +498,17 @@
 	}
 	if (! grep { $_ eq $bestlink } map { @{$_} } values %renderedfiles) {
 		return $linktext unless length $config{cgiurl};
-		return "<span><a href=\"".
-			cgiurl(
-				do => "create",
-				page => pagetitle(lc($link), 1),
-				from => $lpage
-			).
+		my $url = cgiurl(
+				 do => "create",
+				 page => pagetitle(lc($link), 1),
+				 from => $lpage
+				);
+
+		if ($hooks{htmlescapelink}{$type} && ! $opts{genhtml}){
+			return $hooks{htmlescapelink}{$type}{call}->($url, $linktext,
+							       broken => 1);
+		}
+		return "<span><a href=\"". $url.
 			"\">?</a>$linktext</span>"
 	}

@@ -514,6 +523,9 @@
 		$bestlink.="#".$opts{anchor};
 	}

+	if ($hooks{htmlescapelink}{$type} && !$opts{genhtml}) {
+	  return $hooks{htmlescapelink}{$type}{call}->($bestlink, $linktext);
+	}
 	return "<a href=\"$bestlink\">$linktext</a>";
 }

@@ -628,6 +640,14 @@
 				preview => $preprocess_preview,
 			);
 			$preprocessing{$page}--;
+
+			# Handle escaping html if the htmlizer needs it.
+			if ($ret =~ /[<>]/ && $pagesources{$page}) {
+				my $type=pagetype($pagesources{$page});
+				if ($hooks{htmlescape}{$type}) {
+					return $hooks{htmlescape}{$type}{call}->($ret);
+				}
+			}
 			return $ret;
 		}
 		else {
</pre>