csvutil [![GoDoc](https://godoc.org/github.com/jszwec/csvutil?status.svg)](http://godoc.org/github.com/jszwec/csvutil) [![Build Status](https://travis-ci.org/jszwec/csvutil.svg?branch=master)](https://travis-ci.org/jszwec/csvutil) [![Build status](https://ci.appveyor.com/api/projects/status/eiyx0htjrieoo821/branch/master?svg=true)](https://ci.appveyor.com/project/jszwec/csvutil/branch/master) [![Go Report Card](https://goreportcard.com/badge/github.com/jszwec/csvutil)](https://goreportcard.com/report/github.com/jszwec/csvutil) [![codecov](https://codecov.io/gh/jszwec/csvutil/branch/master/graph/badge.svg)](https://codecov.io/gh/jszwec/csvutil)
=================

<p align="center">
  <img style="float: right;" src="https://user-images.githubusercontent.com/3941256/33054906-52b4bc08-ce4a-11e7-9651-b70c5a47c921.png" width=200/>
</p>

Package csvutil provides fast and idiomatic mapping between CSV and Go (golang) values.

This package does not provide a CSV parser itself. Instead, it is built on the [Reader](https://godoc.org/github.com/jszwec/csvutil#Reader) and [Writer](https://godoc.org/github.com/jszwec/csvutil#Writer)
interfaces, which are implemented by, for example, the standard Go (golang) [csv package](https://golang.org/pkg/encoding/csv). This makes it possible
to plug in any other CSV reader or writer, which may be more performant.
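
As a rough illustration of what plugging in another reader could look like, the sketch below defines a hypothetical in-memory `sliceReader`. It assumes the Reader interface consists of a single `Read() ([]string, error)` method that returns `io.EOF` when the input is exhausted; the interface is re-declared locally as `recordReader` so that the sketch compiles with the standard library only.

```go
package main

import (
	"fmt"
	"io"
)

// recordReader mirrors the assumed single-method shape of csvutil.Reader.
// It is declared locally so the sketch does not depend on the package.
type recordReader interface {
	Read() ([]string, error)
}

// sliceReader is a hypothetical in-memory source of CSV records.
type sliceReader struct {
	records [][]string
	pos     int
}

// Read returns the next record, or io.EOF once all records are consumed.
func (r *sliceReader) Read() ([]string, error) {
	if r.pos >= len(r.records) {
		return nil, io.EOF
	}
	rec := r.records[r.pos]
	r.pos++
	return rec, nil
}

// readAll drains any recordReader, similar to how a decoder pulls rows.
func readAll(r recordReader) ([][]string, error) {
	var out [][]string
	for {
		rec, err := r.Read()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		out = append(out, rec)
	}
}

func main() {
	r := &sliceReader{records: [][]string{{"name", "age"}, {"alice", "25"}}}
	recs, err := readAll(r)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(recs)
}
```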

Installation
------------

    go get github.com/jszwec/csvutil

Requirements
-------------

* Go1.7+

Index
------

1. [Examples](#examples)
    1. [Unmarshal](#examples_unmarshal)
    2. [Marshal](#examples_marshal)
    3. [Unmarshal and metadata](#examples_unmarshal_and_metadata)
    4. [But my CSV file has no header...](#examples_but_my_csv_has_no_header)
    5. [Decoder.Map - data normalization](#examples_decoder_map)
    6. [Different separator/delimiter](#examples_different_separator)
    7. [Decoder and interface values](#examples_decoder_interface_values)
    8. [Custom time.Time format](#examples_time_format)
    9. [Custom struct tags](#examples_struct_tags)
    10. [Slice and Map fields](#examples_slice_and_map_field)
    11. [Nested/Embedded structs](#examples_nested_structs)
    12. [Inline tag](#examples_inlined_structs)
2. [Performance](#performance)
    1. [Unmarshal](#performance_unmarshal)
    2. [Marshal](#performance_marshal)

Examples <a name="examples"></a>
--------

### Unmarshal <a name="examples_unmarshal"></a>

Nice and easy: Unmarshal uses the Go standard library's [csv.Reader](https://golang.org/pkg/encoding/csv/#Reader) with its default options. Use [Decoder](https://godoc.org/github.com/jszwec/csvutil#Decoder) for streaming and more advanced use cases.

```go
	var csvInput = []byte(`
name,age,CreatedAt
jacek,26,2012-04-01T15:00:00Z
john,,0001-01-01T00:00:00Z`,
	)

	type User struct {
		Name      string `csv:"name"`
		Age       int    `csv:"age,omitempty"`
		CreatedAt time.Time
	}

	var users []User
	if err := csvutil.Unmarshal(csvInput, &users); err != nil {
		fmt.Println("error:", err)
	}

	for _, u := range users {
		fmt.Printf("%+v\n", u)
	}

	// Output:
	// {Name:jacek Age:26 CreatedAt:2012-04-01 15:00:00 +0000 UTC}
	// {Name:john Age:0 CreatedAt:0001-01-01 00:00:00 +0000 UTC}
```

### Marshal <a name="examples_marshal"></a>

Marshal uses the Go standard library's [csv.Writer](https://golang.org/pkg/encoding/csv/#Writer) with its default options. Use [Encoder](https://godoc.org/github.com/jszwec/csvutil#Encoder) for streaming or to use a different Writer.

```go
	type Address struct {
		City    string
		Country string
	}

	type User struct {
		Name string
		Address
		Age       int `csv:"age,omitempty"`
		CreatedAt time.Time
	}

	users := []User{
		{
			Name:      "John",
			Address:   Address{"Boston", "USA"},
			Age:       26,
			CreatedAt: time.Date(2010, 6, 2, 12, 0, 0, 0, time.UTC),
		},
		{
			Name:    "Alice",
			Address: Address{"SF", "USA"},
		},
	}

	b, err := csvutil.Marshal(users)
	if err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println(string(b))

	// Output:
	// Name,City,Country,age,CreatedAt
	// John,Boston,USA,26,2010-06-02T12:00:00Z
	// Alice,SF,USA,,0001-01-01T00:00:00Z
```

### Unmarshal and metadata <a name="examples_unmarshal_and_metadata"></a>

Your CSV input may not always have the same header. In addition
to your base fields, you may get extra metadata that you would still like to store.
[Decoder](https://godoc.org/github.com/jszwec/csvutil#Decoder) provides the
[Unused](https://godoc.org/github.com/jszwec/csvutil#Decoder.Unused) method, which after each call to
[Decode](https://godoc.org/github.com/jszwec/csvutil#Decoder.Decode) reports which header indexes
were not used during decoding. Based on that, it is possible to handle and store all these extra values.

```go
	type User struct {
		Name      string            `csv:"name"`
		City      string            `csv:"city"`
		Age       int               `csv:"age"`
		OtherData map[string]string `csv:"-"`
	}

	csvReader := csv.NewReader(strings.NewReader(`
name,age,city,zip
alice,25,la,90005
bob,30,ny,10005`))

	dec, err := csvutil.NewDecoder(csvReader)
	if err != nil {
		log.Fatal(err)
	}

	header := dec.Header()
	var users []User
	for {
		u := User{OtherData: make(map[string]string)}

		if err := dec.Decode(&u); err == io.EOF {
			break
		} else if err != nil {
			log.Fatal(err)
		}

		for _, i := range dec.Unused() {
			u.OtherData[header[i]] = dec.Record()[i]
		}
		users = append(users, u)
	}

	fmt.Println(users)

	// Output:
	// [{alice la 25 map[zip:90005]} {bob ny 30 map[zip:10005]}]
```

### But my CSV file has no header... <a name="examples_but_my_csv_has_no_header"></a>

Some CSV files have no header, but if you know what it should look like, you can
define a struct and generate the header from it. All that is left to do is to pass
it to a decoder.

```go
	type User struct {
		ID   int
		Name string
		Age  int `csv:",omitempty"`
		City string
	}

	csvReader := csv.NewReader(strings.NewReader(`
1,John,27,la
2,Bob,,ny`))

	// In a real application this should be done once, e.g. in an init function.
	userHeader, err := csvutil.Header(User{}, "csv")
	if err != nil {
		log.Fatal(err)
	}

	dec, err := csvutil.NewDecoder(csvReader, userHeader...)
	if err != nil {
		log.Fatal(err)
	}

	var users []User
	for {
		var u User
		if err := dec.Decode(&u); err == io.EOF {
			break
		} else if err != nil {
			log.Fatal(err)
		}
		users = append(users, u)
	}

	fmt.Printf("%+v", users)

	// Output:
	// [{ID:1 Name:John Age:27 City:la} {ID:2 Name:Bob Age:0 City:ny}]
```

### Decoder.Map - data normalization <a name="examples_decoder_map"></a>

The Decoder's [Map](https://godoc.org/github.com/jszwec/csvutil#Decoder.Map) function is a powerful tool that can help clean up or normalize
the incoming data before the actual decoding takes place.

Let's say we want to decode some floats and the CSV input contains some NaN values, but these values are represented by the 'n/a' string. An attempt to decode 'n/a' into a float will end with an error, because strconv.ParseFloat expects 'NaN'. Knowing that, we can implement a Map function that normalizes our 'n/a' string and turns it into 'NaN' only for float types.

```go
	dec, err := csvutil.NewDecoder(r)
	if err != nil {
		log.Fatal(err)
	}

	dec.Map = func(field, column string, v interface{}) string {
		if _, ok := v.(float64); ok && field == "n/a" {
			return "NaN"
		}
		return field
	}
```

Now our float64 fields will be decoded properly into NaN. What about float32, float type aliases and other NaN formats? Look at the full example [here](https://gist.github.com/jszwec/2bb94f8f3612e0162eb16003701f727e).
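
One possible way to cover float32 and named float types without consulting the gist is to switch on the reflected kind of `v` instead of asserting `float64`. The sketch below is not the linked example; it assumes `v` is a non-pointer value of the destination field's type, and the `Celsius` type and the list of accepted spellings are made up for illustration.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
)

// Celsius is a hypothetical named float type used only in this sketch.
type Celsius float64

// normalizeNaN has the same signature as the Map function above. It rewrites
// a few illustrative "missing value" spellings to "NaN" whenever the
// destination has a float kind, which covers float32, float64 and named
// types such as Celsius.
func normalizeNaN(field, column string, v interface{}) string {
	if v == nil {
		return field
	}
	switch reflect.TypeOf(v).Kind() {
	case reflect.Float32, reflect.Float64:
		switch field {
		case "n/a", "N/A", "-":
			return "NaN"
		}
	}
	return field
}

func main() {
	fmt.Println(normalizeNaN("n/a", "temp", float32(0))) // NaN
	fmt.Println(normalizeNaN("n/a", "temp", Celsius(0))) // NaN
	fmt.Println(normalizeNaN("n/a", "name", ""))         // n/a

	// The normalized text is accepted by strconv.ParseFloat.
	f, err := strconv.ParseFloat(normalizeNaN("N/A", "temp", float64(0)), 64)
	fmt.Println(f, err) // NaN <nil>
}
```

Such a function could then be assigned to `dec.Map` in place of the closure shown earlier.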

### Different separator/delimiter <a name="examples_different_separator"></a>

Some files may use different value separators; for example, TSV files use `\t`. The following examples show how to set up a Decoder and Encoder for such a use case.

#### Decoder:
```go
	csvReader := csv.NewReader(r)
	csvReader.Comma = '\t'

	dec, err := csvutil.NewDecoder(csvReader)
	if err != nil {
		log.Fatal(err)
	}

	var users []User
	for {
		var u User
		if err := dec.Decode(&u); err == io.EOF {
			break
		} else if err != nil {
			log.Fatal(err)
		}
		users = append(users, u)
	}
```

#### Encoder:
```go
	var buf bytes.Buffer

	w := csv.NewWriter(&buf)
	w.Comma = '\t'
	enc := csvutil.NewEncoder(w)

	for _, u := range users {
		if err := enc.Encode(u); err != nil {
			log.Fatal(err)
		}
	}

	w.Flush()
	if err := w.Error(); err != nil {
		log.Fatal(err)
	}
```
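
To see the effect of the `Comma` setting in isolation, here is a small standard-library-only sketch that reads and writes tab-separated records. The helper names `readTSV` and `writeTSV` are made up for this illustration and are not part of csvutil.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"strings"
)

// readTSV parses tab-separated input into records.
func readTSV(input string) ([][]string, error) {
	r := csv.NewReader(strings.NewReader(input))
	r.Comma = '\t'
	return r.ReadAll()
}

// writeTSV renders records as tab-separated text.
func writeTSV(records [][]string) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	w.Comma = '\t'
	// WriteAll writes every record and flushes the writer.
	if err := w.WriteAll(records); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	records, err := readTSV("name\tcity\nalice\tla, ca\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The comma inside "la, ca" is ordinary data once Comma is a tab.
	fmt.Printf("%q\n", records)

	out, err := writeTSV(records)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", out)
}
```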

### Decoder and interface values <a name="examples_decoder_interface_values"></a>

In the case of interface struct fields, data is decoded into strings. However, if Decoder finds out that
these fields were initialized with pointer values of a specific type prior to decoding, it will try to decode data into that type.

Why only pointer values? Because these values must be both addressable and settable; otherwise Decoder
would have to initialize these types on its own, which could result in losing some unexported information.

If an interface stores a non-pointer value, it will be replaced with a string.

This example shows how this feature could be useful:
```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"io"
	"log"

	"github.com/jszwec/csvutil"
)

// Value defines one record in the csv input. In this example it is important
// that Type field is defined before Value. Decoder reads headers and values
// in the same order as struct fields are defined.
type Value struct {
	Type  string      `csv:"type"`
	Value interface{} `csv:"value"`
}

func main() {
	// let's say our csv input defines variables with their types and values.
	data := []byte(`
type,value
string,string_value
int,10
`)

	dec, err := csvutil.NewDecoder(csv.NewReader(bytes.NewReader(data)))
	if err != nil {
		log.Fatal(err)
	}

	// we would like to read every variable and store their already parsed values
	// in the interface field. We can use Decoder.Map function to initialize
	// interface with proper values depending on the input.
	var value Value
	dec.Map = func(field, column string, v interface{}) string {
		if column == "type" {
			switch field {
			case "int": // csv input tells us that this variable contains an int.
				var n int
				value.Value = &n // let's initialize interface with an initialized int pointer.
			default:
				return field
			}
		}
		return field
	}

	for {
		value = Value{}
		if err := dec.Decode(&value); err == io.EOF {
			break
		} else if err != nil {
			log.Fatal(err)
		}

		if value.Type == "int" {
			// our variable type is int, Map func already initialized our interface
			// as int pointer, so we can safely assert it and use it.
			n, ok := value.Value.(*int)
			if !ok {
				log.Fatal("expected value to be *int")
			}
			fmt.Printf("value_type: %s; value: (%T) %d\n", value.Type, value.Value, *n)
		} else {
			fmt.Printf("value_type: %s; value: (%T) %v\n", value.Type, value.Value, value.Value)
		}
	}

	// Output:
	// value_type: string; value: (string) string_value
	// value_type: int; value: (*int) 10
}
```
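
The "addressable and settable" requirement mentioned above is a general property of Go reflection rather than something specific to this package. The following standard-library sketch illustrates it; `canSetThrough` is a hypothetical helper and says nothing about how Decoder is implemented internally.

```go
package main

import (
	"fmt"
	"reflect"
)

// canSetThrough reports whether the value stored in an interface can be
// modified in place via reflection. Only a non-nil pointer qualifies,
// because its element is addressable.
func canSetThrough(v interface{}) bool {
	rv := reflect.ValueOf(v)
	if rv.Kind() != reflect.Ptr || rv.IsNil() {
		return false
	}
	return rv.Elem().CanSet()
}

func main() {
	var n int
	fmt.Println(canSetThrough(n))  // false: the interface holds a copy of n
	fmt.Println(canSetThrough(&n)) // true: the pointer leads back to n

	// Writing through the pointer changes the original variable.
	reflect.ValueOf(&n).Elem().SetInt(10)
	fmt.Println(n) // 10
}
```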

### Custom time.Time format <a name="examples_time_format"></a>

Type [time.Time](https://golang.org/pkg/time/#Time) can be used as is in struct fields by both Decoder and Encoder
because both have built-in support for [encoding.TextUnmarshaler](https://golang.org/pkg/encoding/#TextUnmarshaler) and [encoding.TextMarshaler](https://golang.org/pkg/encoding/#TextMarshaler). This means that by default
Time has a specific format; see [MarshalText](https://golang.org/pkg/time/#Time.MarshalText) and [UnmarshalText](https://golang.org/pkg/time/#Time.UnmarshalText). There are two ways to override it; which one you choose depends on your use case:

1. Via Register func (based on encoding/json):
```go
const format = "2006/01/02 15:04:05"

marshalTime := func(t time.Time) ([]byte, error) {
	return t.AppendFormat(nil, format), nil
}

unmarshalTime := func(data []byte, t *time.Time) error {
	tt, err := time.Parse(format, string(data))
	if err != nil {
		return err
	}
	*t = tt
	return nil
}

enc := csvutil.NewEncoder(w)
enc.Register(marshalTime)

dec, err := csvutil.NewDecoder(r)
if err != nil {
	return err
}
dec.Register(unmarshalTime)
```

2. With custom type:
```go
type Time struct {
	time.Time
}

const format = "2006/01/02 15:04:05"

func (t Time) MarshalCSV() ([]byte, error) {
	var b [len(format)]byte
	return t.AppendFormat(b[:0], format), nil
}

func (t *Time) UnmarshalCSV(data []byte) error {
	tt, err := time.Parse(format, string(data))
	if err != nil {
		return err
	}
	*t = Time{Time: tt}
	return nil
}
```
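
For readers less familiar with Go time layouts: the `format` constant is written in terms of Go's reference time (`Mon Jan 2 15:04:05 MST 2006`). The sketch below uses only the standard library to show what text that layout produces and accepts; `formatTime` and `parseTime` are illustrative helpers, not csvutil functions.

```go
package main

import (
	"fmt"
	"time"
)

// format matches the layout used in the snippets above.
const format = "2006/01/02 15:04:05"

// formatTime renders t with the custom layout.
func formatTime(t time.Time) string {
	return string(t.AppendFormat(nil, format))
}

// parseTime is the inverse of formatTime. The layout carries no zone
// information, so time.Parse returns the result in UTC.
func parseTime(s string) (time.Time, error) {
	return time.Parse(format, s)
}

func main() {
	t := time.Date(2012, 4, 1, 15, 0, 0, 0, time.UTC)

	s := formatTime(t)
	fmt.Println(s) // 2012/04/01 15:00:00

	back, err := parseTime(s)
	fmt.Println(back.Equal(t), err) // true <nil>

	// The default RFC 3339 text does not parse with the custom layout.
	_, err = parseTime("2012-04-01T15:00:00Z")
	fmt.Println(err != nil) // true
}
```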

### Custom struct tags <a name="examples_struct_tags"></a>

Like in other Go encoding packages, struct field tags can be used to set
custom names or options. By default, encoders and decoders look at the `csv` tag.
However, this can be overridden by manually setting the Tag field.

```go
	type Foo struct {
		Bar int `custom:"bar"`
	}
```

```go
	dec, err := csvutil.NewDecoder(r)
	if err != nil {
		log.Fatal(err)
	}
	dec.Tag = "custom"
```

```go
	enc := csvutil.NewEncoder(w)
	enc.Tag = "custom"
```
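
To make the idea of a configurable tag key more concrete, here is a standard-library sketch of how column names can be looked up under an arbitrary key with `reflect`. It only illustrates the general mechanism: `columnNames` is a hypothetical helper, it ignores options such as `omitempty` and the `-` name, and it is not meant to describe csvutil's actual implementation.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Foo mixes a tagged and an untagged field for demonstration purposes.
type Foo struct {
	Bar int `custom:"bar"`
	Baz string
}

// columnNames returns one name per struct field: the part of the tag value
// before the first comma, or the Go field name when the tag key is absent.
func columnNames(v interface{}, tag string) []string {
	t := reflect.TypeOf(v)
	if t == nil || t.Kind() != reflect.Struct {
		return nil
	}
	names := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		name := strings.Split(f.Tag.Get(tag), ",")[0]
		if name == "" {
			name = f.Name
		}
		names = append(names, name)
	}
	return names
}

func main() {
	fmt.Println(columnNames(Foo{}, "custom")) // [bar Baz]
	fmt.Println(columnNames(Foo{}, "csv"))    // [Bar Baz]
}
```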

### Slice and Map fields <a name="examples_slice_and_map_field"></a>

There is no default encoding/decoding support for slice and map fields because there is no CSV spec for such values.
In such a case, it is recommended to create a custom type alias and implement the Marshaler and Unmarshaler interfaces.
Please note that slice and map aliases behave differently than aliases of other types: there is no need for type casting.

```go
	type Strings []string

	func (s Strings) MarshalCSV() ([]byte, error) {
		return []byte(strings.Join(s, ",")), nil // strings.Join takes []string but it will also accept Strings
	}

	type StringMap map[string]string

	func (sm StringMap) MarshalCSV() ([]byte, error) {
		return []byte(fmt.Sprint(sm)), nil
	}

	func main() {
		b, err := csvutil.Marshal([]struct {
			Strings Strings   `csv:"strings"`
			Map     StringMap `csv:"map"`
		}{
			{[]string{"a", "b"}, map[string]string{"a": "1"}}, // no type casting is required for slice and map aliases
			{Strings{"c", "d"}, StringMap{"b": "1"}},
		})

		if err != nil {
			log.Fatal(err)
		}

		fmt.Printf("%s\n", b)

		// Output:
		// strings,map
		// "a,b",map[a:1]
		// "c,d",map[b:1]
	}
```
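
The example above covers only the Marshaler side. A matching Unmarshaler for `Strings` might look like the sketch below, assuming the Unmarshaler interface is the `UnmarshalCSV([]byte) error` method used in the custom time type earlier. Treating an empty field as an empty slice is a choice made for this sketch. The methods are called directly here so that the block runs with the standard library only.

```go
package main

import (
	"fmt"
	"strings"
)

// Strings is the same slice type as in the example above.
type Strings []string

// MarshalCSV joins the elements with commas.
func (s Strings) MarshalCSV() ([]byte, error) {
	return []byte(strings.Join(s, ",")), nil
}

// UnmarshalCSV splits the field on commas. An empty field becomes an
// empty slice instead of a slice holding one empty string.
func (s *Strings) UnmarshalCSV(data []byte) error {
	if len(data) == 0 {
		*s = Strings{}
		return nil
	}
	*s = strings.Split(string(data), ",")
	return nil
}

func main() {
	var s Strings
	if err := s.UnmarshalCSV([]byte("a,b")); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(s), s) // 2 [a b]

	b, _ := s.MarshalCSV()
	fmt.Println(string(b)) // a,b
}
```

Note that this simple scheme cannot round-trip elements that themselves contain commas; a real implementation would need an escaping rule or a different separator.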

### Nested/Embedded structs <a name="examples_nested_structs"></a>

Both Encoder and Decoder support nested or embedded structs.

Playground: https://play.golang.org/p/ZySjdVkovbf

```go
package main

import (
	"fmt"

	"github.com/jszwec/csvutil"
)

type Address struct {
	Street string `csv:"street"`
	City   string `csv:"city"`
}

type User struct {
	Name string `csv:"name"`
	Address
}

func main() {
	users := []User{
		{
			Name: "John",
			Address: Address{
				Street: "Boylston",
				City:   "Boston",
			},
		},
	}

	b, err := csvutil.Marshal(users)
	if err != nil {
		panic(err)
	}

	fmt.Printf("%s\n", b)

	var out []User
	if err := csvutil.Unmarshal(b, &out); err != nil {
		panic(err)
	}

	fmt.Printf("%+v\n", out)

	// Output:
	//
	// name,street,city
	// John,Boylston,Boston
	//
	// [{Name:John Address:{Street:Boylston City:Boston}}]
}
```

### Inline tag <a name="examples_inlined_structs"></a>

Fields with the inline tag behave similarly to embedded struct fields. However,
the tag also makes it possible to specify a prefix for all underlying fields. This
can be useful when one structure can define multiple CSV columns because they
differ from each other only by a certain prefix. Look at the example below.

Playground: https://play.golang.org/p/jyEzeskSnj7

```go
package main

import (
	"fmt"

	"github.com/jszwec/csvutil"
)

func main() {
	type Address struct {
		Street string `csv:"street"`
		City   string `csv:"city"`
	}

	type User struct {
		Name        string  `csv:"name"`
		Address     Address `csv:",inline"`
		HomeAddress Address `csv:"home_address_,inline"`
		WorkAddress Address `csv:"work_address_,inline"`
		Age         int     `csv:"age,omitempty"`
	}

	users := []User{
		{
			Name:        "John",
			Address:     Address{"Washington", "Boston"},
			HomeAddress: Address{"Boylston", "Boston"},
			WorkAddress: Address{"River St", "Cambridge"},
			Age:         26,
		},
	}

	b, err := csvutil.Marshal(users)
	if err != nil {
		fmt.Println("error:", err)
	}

	fmt.Printf("%s\n", b)

	// Output:
	// name,street,city,home_address_street,home_address_city,work_address_street,work_address_city,age
	// John,Washington,Boston,Boylston,Boston,River St,Cambridge,26
}
```

Performance
------------

In the benchmarks below, csvutil decodes and encodes faster and with fewer allocations per operation than the other libraries measured.

### Unmarshal <a name="performance_unmarshal"></a>

[benchmark code](https://gist.github.com/jszwec/e8515e741190454fa3494bcd3e1f100f)

#### csvutil:
```
BenchmarkUnmarshal/csvutil.Unmarshal/1_record-12         280696        4516 ns/op        7332 B/op        26 allocs/op
BenchmarkUnmarshal/csvutil.Unmarshal/10_records-12        95750       11517 ns/op        8356 B/op        35 allocs/op
BenchmarkUnmarshal/csvutil.Unmarshal/100_records-12       14997       83146 ns/op       18532 B/op       125 allocs/op
BenchmarkUnmarshal/csvutil.Unmarshal/1000_records-12       1485      750143 ns/op      121094 B/op      1025 allocs/op
BenchmarkUnmarshal/csvutil.Unmarshal/10000_records-12       154     7587205 ns/op     1136662 B/op     10025 allocs/op
BenchmarkUnmarshal/csvutil.Unmarshal/100000_records-12       14    76126616 ns/op    11808744 B/op    100025 allocs/op
```

#### gocsv:
```
BenchmarkUnmarshal/gocsv.Unmarshal/1_record-12         141330        7499 ns/op        7795 B/op        97 allocs/op
BenchmarkUnmarshal/gocsv.Unmarshal/10_records-12        54252       21664 ns/op       13891 B/op       307 allocs/op
BenchmarkUnmarshal/gocsv.Unmarshal/100_records-12        6920      159662 ns/op       72644 B/op      2380 allocs/op
BenchmarkUnmarshal/gocsv.Unmarshal/1000_records-12        752     1556083 ns/op      650248 B/op     23083 allocs/op
BenchmarkUnmarshal/gocsv.Unmarshal/10000_records-12        72    17086623 ns/op     7017469 B/op    230092 allocs/op
BenchmarkUnmarshal/gocsv.Unmarshal/100000_records-12        7   163610749 ns/op    75004923 B/op   2300105 allocs/op
```

#### easycsv:
```
BenchmarkUnmarshal/easycsv.ReadAll/1_record-12         101527       10662 ns/op        8855 B/op        81 allocs/op
BenchmarkUnmarshal/easycsv.ReadAll/10_records-12        23325       51437 ns/op       24072 B/op       391 allocs/op
BenchmarkUnmarshal/easycsv.ReadAll/100_records-12        2402      447296 ns/op      170538 B/op      3454 allocs/op
BenchmarkUnmarshal/easycsv.ReadAll/1000_records-12        272     4370854 ns/op     1595683 B/op     34057 allocs/op
BenchmarkUnmarshal/easycsv.ReadAll/10000_records-12        24    47502457 ns/op    18861808 B/op    340068 allocs/op
BenchmarkUnmarshal/easycsv.ReadAll/100000_records-12        3   468974170 ns/op   189427066 B/op   3400082 allocs/op
```

### Marshal <a name="performance_marshal"></a>

[benchmark code](https://gist.github.com/jszwec/31980321e1852ebb5615a44ccf374f17)

#### csvutil:
```
BenchmarkMarshal/csvutil.Marshal/1_record-12         279558        4390 ns/op        9952 B/op        12 allocs/op
BenchmarkMarshal/csvutil.Marshal/10_records-12        82478       15608 ns/op       10800 B/op        21 allocs/op
BenchmarkMarshal/csvutil.Marshal/100_records-12       10275      117288 ns/op       28208 B/op       112 allocs/op
BenchmarkMarshal/csvutil.Marshal/1000_records-12       1075     1147473 ns/op      168508 B/op      1014 allocs/op
BenchmarkMarshal/csvutil.Marshal/10000_records-12       100    11985382 ns/op     1525973 B/op     10017 allocs/op
BenchmarkMarshal/csvutil.Marshal/100000_records-12        9   113640813 ns/op    22455873 B/op    100021 allocs/op
```

#### gocsv:
```
BenchmarkMarshal/gocsv.Marshal/1_record-12         203052        6077 ns/op        5914 B/op        81 allocs/op
BenchmarkMarshal/gocsv.Marshal/10_records-12        50132       24585 ns/op        9284 B/op       360 allocs/op
BenchmarkMarshal/gocsv.Marshal/100_records-12        5480      212008 ns/op       51916 B/op      3151 allocs/op
BenchmarkMarshal/gocsv.Marshal/1000_records-12        514     2053919 ns/op      444506 B/op     31053 allocs/op
BenchmarkMarshal/gocsv.Marshal/10000_records-12        52    21066666 ns/op     4332377 B/op    310064 allocs/op
BenchmarkMarshal/gocsv.Marshal/100000_records-12        5   207408929 ns/op    51169419 B/op   3100077 allocs/op
```
673