# Go XML Formatter

[![MIT License](http://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
[![Go Doc](https://img.shields.io/badge/godoc-reference-4b68a3.svg)](https://godoc.org/github.com/go-xmlfmt/xmlfmt)
[![Go Report Card](https://goreportcard.com/badge/github.com/go-xmlfmt/xmlfmt)](https://goreportcard.com/report/github.com/go-xmlfmt/xmlfmt)
[![Codeship Status](https://codeship.com/projects/c49f02b0-a384-0134-fb20-2e0351080565/status?branch=master)](https://codeship.com/projects/190297)

## Synopsis

The Go XML Formatter, xmlfmt, formats an XML string in a readable way.

```go
package main

import (
	"fmt"

	"github.com/go-xmlfmt/xmlfmt"
)

func main() {
	xml1 := `<root><this><is>a</is><test /><message><!-- with comment --><org><cn>Some org-or-other</cn><ph>Wouldnt you like to know</ph></org><contact><fn>Pat</fn><ln>Califia</ln></contact></message></this></root>`
	// Arguments: the XML string, the prefix that starts each line,
	// and the indent string for nested elements.
	x := xmlfmt.FormatXML(xml1, "\t", "  ")
	fmt.Print(x)
}
```

Output:

```xml
	<root>
	  <this>
	    <is>a
	    </is>
	    <test />
	    <message>
	      <!-- with comment -->
	      <org>
	        <cn>Some org-or-other
	        </cn>
	        <ph>Wouldnt you like to know
	        </ph>
	      </org>
	      <contact>
	        <fn>Pat
	        </fn>
	        <ln>Califia
	        </ln>
	      </contact>
	    </message>
	  </this>
	</root>
```

No XML decoding or encoding is involved, only regular-expression matching and replacing, so it is much faster than a decode-and-encode round trip. Moreover, the exact XML source string is preserved rather than being changed by an encoder. This is why this package exists in the first place.

## Command

To use it on the command line, check out [xmlfmt](https://github.com/AntonioSun/xmlfmt):

```
$ xmlfmt
XML Formatter
built on 2019-12-08

The xmlfmt will format the XML string without rewriting the document

Options:

  -h, --help          display help information
  -f, --file         *The xml file to read from (or stdin)
  -p, --prefix        each element begins on a new line and this prefix
  -i, --indent[=  ]   indent string for nested elements
```

## Justification

### The format

The Go XML Formatter is not called XML Beautifier because the result is not *exactly* what people would expect: some, but not all, closing tags stay on the same line, as shown above. Having looked at the result and thought it over, I now think this is actually a better way to present it; in my opinion, the closing tags that stay on the same line are better left that way.

When it comes to very big XML strings, which is what I deal with every day, saving space by not letting those closing tags take extra lines is a plus rather than a minus to me.

### The alternative

Formatting it “properly”, i.e., the way people would normally expect to see it, is very hard using pure regular expressions. In fact, according to Sam Whited from the go-nuts mailing list,

> Regular expression is, well, regular. This means that they can parse regular grammars, but can't parse context free grammars (like XML). It is actually impossible to use a regex to do this task; it will always be fragile, unfortunately.

So if the output format is that important to you, then unfortunately you have to go through decoding and encoding. But that approach has drawbacks beyond being slow, as James McGill put it in http://stackoverflow.com/questions/21117161:

> I like this solution, but am still in search of a Golang XML formatter/prettyprinter that doesn't rewrite the document (other than formatting whitespace). Marshalling or using the Encoder will change namespace declarations.
>
> For example an element like "< ns1:Element />" will be translated to something like '< Element xmlns="http://bla...bla/ns1" >< /Element >' which seems harmless enough except when the intent is to not alter the xml other than formatting. -- James McGill Nov 12 '15

Using Sam's code as an example,

https://play.golang.org/p/JUqQY3WpW5

The above code formats the following XML

```xml
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
  xmlns:ns="http://example.com/ns">
   <soapenv:Header/>
   <soapenv:Body>
     <ns:request>
      <ns:customer>
       <ns:id>123</ns:id>
       <ns:name type="NCHZ">John Brown</ns:name>
      </ns:customer>
     </ns:request>
   </soapenv:Body>
</soapenv:Envelope>
```

into this:

```xml
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/" xmlns:_xmlns="xmlns" _xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" _xmlns:ns="http://example.com/ns">
 <Header xmlns="http://schemas.xmlsoap.org/soap/envelope/"></Header>
 <Body xmlns="http://schemas.xmlsoap.org/soap/envelope/">
  <request xmlns="http://example.com/ns">
   <customer xmlns="http://example.com/ns">
    <id xmlns="http://example.com/ns">123</id>
    <name xmlns="http://example.com/ns" type="NCHZ">John Brown</name>
   </customer>
  </request>
 </Body>
</Envelope>
```

I know they are semantically equivalent; the problem, however, is that they *look* totally different.

That's why this package exists: an XML beautifier that doesn't rewrite the document.

## Credit

The credit goes to **diotalevi** for his post at http://www.perlmonks.org/?node_id=261292.

However, it does not work for all cases. For example,

```sh
$ echo '<Envelope xmlns=http://schemas.xmlsoap.org/soap/envelope/ xmlns:_xmlns=xmlns _xmlns:soapenv=http://schemas.xmlsoap.org/soap/envelope/ _xmlns:ns=http://example.com/ns><Header xmlns=http://schemas.xmlsoap.org/soap/envelope/></Header><Body xmlns=http://schemas.xmlsoap.org/soap/envelope/><request xmlns=http://example.com/ns><customer xmlns=http://example.com/ns><id xmlns=http://example.com/ns>123</id><name xmlns=http://example.com/ns type=NCHZ>John Brown</name></customer></request></Body></Envelope>' | perl -pe 's/(?<=>)\s+(?=<)//g; s(<(/?)([^/>]+)(/?)>\s*(?=(</?))?)($indent+=$3?0:$1?-1:1;"<$1$2$3>".($1&&($4 eq"</")?"\n".("  "x$indent):$4?"\n".("  "x$indent):""))ge'
```
```xml
<Envelope xmlns=http://schemas.xmlsoap.org/soap/envelope/ xmlns:_xmlns=xmlns _xmlns:soapenv=http://schemas.xmlsoap.org/soap/envelope/ _xmlns:ns=http://example.com/ns><Header xmlns=http://schemas.xmlsoap.org/soap/envelope/></Header>
<Body xmlns=http://schemas.xmlsoap.org/soap/envelope/><request xmlns=http://example.com/ns><customer xmlns=http://example.com/ns><id xmlns=http://example.com/ns>123</id>
<name xmlns=http://example.com/ns type=NCHZ>John Brown</name>
</customer>
</request>
</Body>
</Envelope>
```

I simplified the algorithm, and now it should work for all cases:

```sh
echo '<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/" xmlns:_xmlns="xmlns" _xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" _xmlns:ns="http://example.com/ns"><Header xmlns="http://schemas.xmlsoap.org/soap/envelope/"></Header><Body xmlns="http://schemas.xmlsoap.org/soap/envelope/"><request xmlns="http://example.com/ns"><customer xmlns="http://example.com/ns"><id xmlns="http://example.com/ns">123</id><name xmlns="http://example.com/ns" type="NCHZ">John Brown</name></customer></request></Body></Envelope>' | perl -pe 's/(?<=>)\s+(?=<)//g; s(<(/?)([^>]+)(/?)>)($indent+=$3?0:$1?-1:1;"<$1$2$3>"."\n".("  "x$indent))ge'
```
```xml
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/" xmlns:_xmlns="xmlns" _xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" _xmlns:ns="http://example.com/ns">
  <Header xmlns="http://schemas.xmlsoap.org/soap/envelope/">
    </Header>
  <Body xmlns="http://schemas.xmlsoap.org/soap/envelope/">
    <request xmlns="http://example.com/ns">
      <customer xmlns="http://example.com/ns">
        <id xmlns="http://example.com/ns">
          123</id>
        <name xmlns="http://example.com/ns" type="NCHZ">
          John Brown</name>
        </customer>
      </request>
    </Body>
  </Envelope>
```

This package is a direct translation of the above Perl code into Go,
then further enhanced by @ruandao.
