csvutil [![GoDoc](https://godoc.org/github.com/jszwec/csvutil?status.svg)](http://godoc.org/github.com/jszwec/csvutil) [![Build Status](https://travis-ci.org/jszwec/csvutil.svg?branch=master)](https://travis-ci.org/jszwec/csvutil) [![Build status](https://ci.appveyor.com/api/projects/status/eiyx0htjrieoo821/branch/master?svg=true)](https://ci.appveyor.com/project/jszwec/csvutil/branch/master) [![Go Report Card](https://goreportcard.com/badge/github.com/jszwec/csvutil)](https://goreportcard.com/report/github.com/jszwec/csvutil) [![codecov](https://codecov.io/gh/jszwec/csvutil/branch/master/graph/badge.svg)](https://codecov.io/gh/jszwec/csvutil)
=================

<p align="center">
  <img style="float: right;" src="https://user-images.githubusercontent.com/3941256/33054906-52b4bc08-ce4a-11e7-9651-b70c5a47c921.png" width="200"/>
</p>

Package csvutil provides fast and idiomatic mapping between CSV and Go (golang) values.

This package does not provide a CSV parser itself; it is based on the [Reader](https://godoc.org/github.com/jszwec/csvutil#Reader) and [Writer](https://godoc.org/github.com/jszwec/csvutil#Writer)
interfaces, which are implemented by, e.g., the standard Go (golang) [csv package](https://golang.org/pkg/encoding/csv). This makes it possible
to choose any other CSV writer or reader that may be more performant.

Installation
------------

    go get github.com/jszwec/csvutil

Requirements
-------------

* Go1.7+

Index
------

1. [Examples](#examples)
    1. [Unmarshal](#examples_unmarshal)
    2. [Marshal](#examples_marshal)
    3. [Unmarshal and metadata](#examples_unmarshal_and_metadata)
    4. [But my CSV file has no header...](#examples_but_my_csv_has_no_header)
    5. [Decoder.Map - data normalization](#examples_decoder_map)
    6. [Different separator/delimiter](#examples_different_separator)
    7. [Decoder and interface values](#examples_decoder_interface_values)
    8. [Custom time.Time format](#examples_time_format)
    9. [Custom struct tags](#examples_struct_tags)
    10. [Slice and Map fields](#examples_slice_and_map_field)
    11. [Nested/Embedded structs](#examples_nested_structs)
    12. [Inline tag](#examples_inlined_structs)
2. [Performance](#performance)
    1. [Unmarshal](#performance_unmarshal)
    2. [Marshal](#performance_marshal)

Example <a name="examples"></a>
--------

### Unmarshal <a name="examples_unmarshal"></a>

Nice and easy: Unmarshal uses the Go standard library [csv.Reader](https://golang.org/pkg/encoding/csv/#Reader) with its default options. Use [Decoder](https://godoc.org/github.com/jszwec/csvutil#Decoder) for streaming and more advanced use cases.

```go
	var csvInput = []byte(`
name,age,CreatedAt
jacek,26,2012-04-01T15:00:00Z
john,,0001-01-01T00:00:00Z`,
	)

	type User struct {
		Name      string `csv:"name"`
		Age       int    `csv:"age,omitempty"`
		CreatedAt time.Time
	}

	var users []User
	if err := csvutil.Unmarshal(csvInput, &users); err != nil {
		fmt.Println("error:", err)
	}

	for _, u := range users {
		fmt.Printf("%+v\n", u)
	}

	// Output:
	// {Name:jacek Age:26 CreatedAt:2012-04-01 15:00:00 +0000 UTC}
	// {Name:john Age:0 CreatedAt:0001-01-01 00:00:00 +0000 UTC}
```

### Marshal <a name="examples_marshal"></a>

Marshal uses the Go standard library [csv.Writer](https://golang.org/pkg/encoding/csv/#Writer) with its default options. Use [Encoder](https://godoc.org/github.com/jszwec/csvutil#Encoder) for streaming or to use a different Writer.

```go
	type Address struct {
		City    string
		Country string
	}

	type User struct {
		Name string
		Address
		Age       int `csv:"age,omitempty"`
		CreatedAt time.Time
	}

	users := []User{
		{
			Name:      "John",
			Address:   Address{"Boston", "USA"},
			Age:       26,
			CreatedAt: time.Date(2010, 6, 2, 12, 0, 0, 0, time.UTC),
		},
		{
			Name:    "Alice",
			Address: Address{"SF", "USA"},
		},
	}

	b, err := csvutil.Marshal(users)
	if err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println(string(b))

	// Output:
	// Name,City,Country,age,CreatedAt
	// John,Boston,USA,26,2010-06-02T12:00:00Z
	// Alice,SF,USA,,0001-01-01T00:00:00Z
```

### Unmarshal and metadata <a name="examples_unmarshal_and_metadata"></a>

It may happen that your CSV input will not always have the same header. In addition
to your base fields you may get extra metadata that you would still like to store.
[Decoder](https://godoc.org/github.com/jszwec/csvutil#Decoder) provides the
[Unused](https://godoc.org/github.com/jszwec/csvutil#Decoder.Unused) method, which, after each call to
[Decode](https://godoc.org/github.com/jszwec/csvutil#Decoder.Decode), reports which header indexes
were not used during decoding. Based on that, it is possible to handle and store all these extra values.

```go
	type User struct {
		Name      string            `csv:"name"`
		City      string            `csv:"city"`
		Age       int               `csv:"age"`
		OtherData map[string]string `csv:"-"`
	}

	csvReader := csv.NewReader(strings.NewReader(`
name,age,city,zip
alice,25,la,90005
bob,30,ny,10005`))

	dec, err := csvutil.NewDecoder(csvReader)
	if err != nil {
		log.Fatal(err)
	}

	header := dec.Header()
	var users []User
	for {
		u := User{OtherData: make(map[string]string)}

		if err := dec.Decode(&u); err == io.EOF {
			break
		} else if err != nil {
			log.Fatal(err)
		}

		for _, i := range dec.Unused() {
			u.OtherData[header[i]] = dec.Record()[i]
		}
		users = append(users, u)
	}

	fmt.Println(users)

	// Output:
	// [{alice la 25 map[zip:90005]} {bob ny 30 map[zip:10005]}]
```

### But my CSV file has no header... <a name="examples_but_my_csv_has_no_header"></a>

Some CSV files have no header, but if you know what it should look like, it is
possible to define a struct and generate it. All that is left to do is pass
it to a decoder.

```go
	type User struct {
		ID   int
		Name string
		Age  int `csv:",omitempty"`
		City string
	}

	csvReader := csv.NewReader(strings.NewReader(`
1,John,27,la
2,Bob,,ny`))

	// in a real application this should be done once, e.g. in an init function.
	userHeader, err := csvutil.Header(User{}, "csv")
	if err != nil {
		log.Fatal(err)
	}

	dec, err := csvutil.NewDecoder(csvReader, userHeader...)
	if err != nil {
		log.Fatal(err)
	}

	var users []User
	for {
		var u User
		if err := dec.Decode(&u); err == io.EOF {
			break
		} else if err != nil {
			log.Fatal(err)
		}
		users = append(users, u)
	}

	fmt.Printf("%+v", users)

	// Output:
	// [{ID:1 Name:John Age:27 City:la} {ID:2 Name:Bob Age:0 City:ny}]
```

### Decoder.Map - data normalization <a name="examples_decoder_map"></a>

The Decoder's [Map](https://godoc.org/github.com/jszwec/csvutil#Decoder.Map) function is a powerful tool that can help clean up or normalize
the incoming data before the actual decoding takes place.

Let's say we want to decode some floats and the CSV input contains some NaN values, but these values are represented by the 'n/a' string. An attempt to decode 'n/a' into a float will end with an error, because strconv.ParseFloat expects 'NaN'. Knowing that, we can implement a Map function that normalizes our 'n/a' string and turns it into 'NaN' only for float types.

```go
	dec, err := csvutil.NewDecoder(r)
	if err != nil {
		log.Fatal(err)
	}

	dec.Map = func(field, column string, v interface{}) string {
		if _, ok := v.(float64); ok && field == "n/a" {
			return "NaN"
		}
		return field
	}
```

Now our float64 fields will be decoded properly into NaN. What about float32, float type aliases and other NaN formats? Look at the full example [here](https://gist.github.com/jszwec/2bb94f8f3612e0162eb16003701f727e).

### Different separator/delimiter <a name="examples_different_separator"></a>

Some files may use different value separators; for example, TSV files use `\t`. The following examples show how to set up a Decoder and Encoder for such a use case.

#### Decoder:
```go
	csvReader := csv.NewReader(r)
	csvReader.Comma = '\t'

	dec, err := csvutil.NewDecoder(csvReader)
	if err != nil {
		log.Fatal(err)
	}

	var users []User
	for {
		var u User
		if err := dec.Decode(&u); err == io.EOF {
			break
		} else if err != nil {
			log.Fatal(err)
		}
		users = append(users, u)
	}

```

#### Encoder:
```go
	var buf bytes.Buffer

	w := csv.NewWriter(&buf)
	w.Comma = '\t'
	enc := csvutil.NewEncoder(w)

	for _, u := range users {
		if err := enc.Encode(u); err != nil {
			log.Fatal(err)
		}
	}

	w.Flush()
	if err := w.Error(); err != nil {
		log.Fatal(err)
	}
```

### Decoder and interface values <a name="examples_decoder_interface_values"></a>

In the case of interface struct fields, data is decoded into strings. However, if Decoder finds out that
these fields were initialized with pointer values of a specific type prior to decoding, it will try to decode data into that type.

Why only pointer values? Because these values must be both addressable and settable; otherwise Decoder
would have to initialize these types on its own, which could result in losing some unexported information.

If the interface stores a non-pointer value, it will be replaced with a string.

This example shows how this feature could be useful:
```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"io"
	"log"

	"github.com/jszwec/csvutil"
)

// Value defines one record in the csv input. In this example it is important
// that Type field is defined before Value. Decoder reads headers and values
// in the same order as struct fields are defined.
type Value struct {
	Type  string      `csv:"type"`
	Value interface{} `csv:"value"`
}

func main() {
	// let's say our csv input defines variables with their types and values.
	data := []byte(`
type,value
string,string_value
int,10
`)

	dec, err := csvutil.NewDecoder(csv.NewReader(bytes.NewReader(data)))
	if err != nil {
		log.Fatal(err)
	}

	// we would like to read every variable and store their already parsed values
	// in the interface field. We can use Decoder.Map function to initialize
	// interface with proper values depending on the input.
	var value Value
	dec.Map = func(field, column string, v interface{}) string {
		if column == "type" {
			switch field {
			case "int": // csv input tells us that this variable contains an int.
				var n int
				value.Value = &n // let's initialize interface with an initialized int pointer.
			default:
				return field
			}
		}
		return field
	}

	for {
		value = Value{}
		if err := dec.Decode(&value); err == io.EOF {
			break
		} else if err != nil {
			log.Fatal(err)
		}

		if value.Type == "int" {
			// our variable type is int, Map func already initialized our interface
			// as int pointer, so we can safely cast it and use it.
			n, ok := value.Value.(*int)
			if !ok {
				log.Fatal("expected value to be *int")
			}
			fmt.Printf("value_type: %s; value: (%T) %d\n", value.Type, value.Value, *n)
		} else {
			fmt.Printf("value_type: %s; value: (%T) %v\n", value.Type, value.Value, value.Value)
		}
	}

	// Output:
	// value_type: string; value: (string) string_value
	// value_type: int; value: (*int) 10
}
```

### Custom time.Time format <a name="examples_time_format"></a>

Type [time.Time](https://golang.org/pkg/time/#Time) can be used as is in the struct fields by both Decoder and Encoder
because both have builtin support for [encoding.TextUnmarshaler](https://golang.org/pkg/encoding/#TextUnmarshaler) and [encoding.TextMarshaler](https://golang.org/pkg/encoding/#TextMarshaler). This means that by default
Time has a specific format; look at [MarshalText](https://golang.org/pkg/time/#Time.MarshalText) and [UnmarshalText](https://golang.org/pkg/time/#Time.UnmarshalText). There are two ways to override it; which one you choose depends on your use case:

1. Via Register func (based on encoding/json)
```go
const format = "2006/01/02 15:04:05"

marshalTime := func(t time.Time) ([]byte, error) {
	return t.AppendFormat(nil, format), nil
}

unmarshalTime := func(data []byte, t *time.Time) error {
	tt, err := time.Parse(format, string(data))
	if err != nil {
		return err
	}
	*t = tt
	return nil
}

enc := csvutil.NewEncoder(w)
enc.Register(marshalTime)

dec, err := csvutil.NewDecoder(r)
if err != nil {
	return err
}
dec.Register(unmarshalTime)
```

2. With custom type:
```go
type Time struct {
	time.Time
}

const format = "2006/01/02 15:04:05"

func (t Time) MarshalCSV() ([]byte, error) {
	var b [len(format)]byte
	return t.AppendFormat(b[:0], format), nil
}

func (t *Time) UnmarshalCSV(data []byte) error {
	tt, err := time.Parse(format, string(data))
	if err != nil {
		return err
	}
	*t = Time{Time: tt}
	return nil
}
```

### Custom struct tags <a name="examples_struct_tags"></a>

Like in other Go encoding packages, struct field tags can be used to set
custom names or options. By default encoders and decoders look at the `csv` tag.
However, this can be overridden by manually setting the Tag field.

```go
	type Foo struct {
		Bar int `custom:"bar"`
	}
```

```go
	dec, err := csvutil.NewDecoder(r)
	if err != nil {
		log.Fatal(err)
	}
	dec.Tag = "custom"
```

```go
	enc := csvutil.NewEncoder(w)
	enc.Tag = "custom"
```

### Slice and Map fields <a name="examples_slice_and_map_field"></a>

There is no default encoding/decoding support for slice and map fields because there is no CSV spec for such values.
In such cases, it is recommended to create a custom type alias and implement the Marshaler and Unmarshaler interfaces.
Please note that slice and map aliases behave differently than aliases of other types: there is no need for type casting.

```go
	type Strings []string

	func (s Strings) MarshalCSV() ([]byte, error) {
		return []byte(strings.Join(s, ",")), nil // strings.Join takes []string but it will also accept Strings
	}

	type StringMap map[string]string

	func (sm StringMap) MarshalCSV() ([]byte, error) {
		return []byte(fmt.Sprint(sm)), nil
	}

	func main() {
		b, err := csvutil.Marshal([]struct {
			Strings Strings   `csv:"strings"`
			Map     StringMap `csv:"map"`
		}{
			{[]string{"a", "b"}, map[string]string{"a": "1"}}, // no type casting is required for slice and map aliases
			{Strings{"c", "d"}, StringMap{"b": "1"}},
		})

		if err != nil {
			log.Fatal(err)
		}

		fmt.Printf("%s\n", b)

		// Output:
		// strings,map
		// "a,b",map[a:1]
		// "c,d",map[b:1]
	}
```

### Nested/Embedded structs <a name="examples_nested_structs"></a>

Both Encoder and Decoder support nested or embedded structs.

Playground: https://play.golang.org/p/ZySjdVkovbf

```go
package main

import (
	"fmt"

	"github.com/jszwec/csvutil"
)

type Address struct {
	Street string `csv:"street"`
	City   string `csv:"city"`
}

type User struct {
	Name string `csv:"name"`
	Address
}

func main() {
	users := []User{
		{
			Name: "John",
			Address: Address{
				Street: "Boylston",
				City:   "Boston",
			},
		},
	}

	b, err := csvutil.Marshal(users)
	if err != nil {
		panic(err)
	}

	fmt.Printf("%s\n", b)

	var out []User
	if err := csvutil.Unmarshal(b, &out); err != nil {
		panic(err)
	}

	fmt.Printf("%+v\n", out)

	// Output:
	//
	// name,street,city
	// John,Boylston,Boston
	//
	// [{Name:John Address:{Street:Boylston City:Boston}}]
}
```

### Inline tag <a name="examples_inlined_structs"></a>

Fields with the inline tag behave similarly to embedded struct fields. However,
the tag also makes it possible to specify a prefix for all underlying fields. This
can be useful when one structure can define multiple CSV columns that
differ from each other only by a certain prefix. Look at the example below.

Playground: https://play.golang.org/p/jyEzeskSnj7

```go
package main

import (
	"fmt"

	"github.com/jszwec/csvutil"
)

func main() {
	type Address struct {
		Street string `csv:"street"`
		City   string `csv:"city"`
	}

	type User struct {
		Name        string  `csv:"name"`
		Address     Address `csv:",inline"`
		HomeAddress Address `csv:"home_address_,inline"`
		WorkAddress Address `csv:"work_address_,inline"`
		Age         int     `csv:"age,omitempty"`
	}

	users := []User{
		{
			Name:        "John",
			Address:     Address{"Washington", "Boston"},
			HomeAddress: Address{"Boylston", "Boston"},
			WorkAddress: Address{"River St", "Cambridge"},
			Age:         26,
		},
	}

	b, err := csvutil.Marshal(users)
	if err != nil {
		fmt.Println("error:", err)
	}

	fmt.Printf("%s\n", b)

	// Output:
	// name,street,city,home_address_street,home_address_city,work_address_street,work_address_city,age
	// John,Washington,Boston,Boylston,Boston,River St,Cambridge,26
}
```

Performance
------------

csvutil provides fast encoding and decoding with low memory usage: in the benchmarks below it has lower ns/op and fewer allocs/op than the compared packages at every input size.

### Unmarshal <a name="performance_unmarshal"></a>

[benchmark code](https://gist.github.com/jszwec/e8515e741190454fa3494bcd3e1f100f)

#### csvutil:
```
BenchmarkUnmarshal/csvutil.Unmarshal/1_record-12         	  280696	      4516 ns/op	    7332 B/op	      26 allocs/op
BenchmarkUnmarshal/csvutil.Unmarshal/10_records-12       	   95750	     11517 ns/op	    8356 B/op	      35 allocs/op
BenchmarkUnmarshal/csvutil.Unmarshal/100_records-12      	   14997	     83146 ns/op	   18532 B/op	     125 allocs/op
BenchmarkUnmarshal/csvutil.Unmarshal/1000_records-12     	    1485	    750143 ns/op	  121094 B/op	    1025 allocs/op
BenchmarkUnmarshal/csvutil.Unmarshal/10000_records-12    	     154	   7587205 ns/op	 1136662 B/op	   10025 allocs/op
BenchmarkUnmarshal/csvutil.Unmarshal/100000_records-12   	      14	  76126616 ns/op	11808744 B/op	  100025 allocs/op
```

#### gocsv:
```
BenchmarkUnmarshal/gocsv.Unmarshal/1_record-12           	  141330	      7499 ns/op	    7795 B/op	      97 allocs/op
BenchmarkUnmarshal/gocsv.Unmarshal/10_records-12         	   54252	     21664 ns/op	   13891 B/op	     307 allocs/op
BenchmarkUnmarshal/gocsv.Unmarshal/100_records-12        	    6920	    159662 ns/op	   72644 B/op	    2380 allocs/op
BenchmarkUnmarshal/gocsv.Unmarshal/1000_records-12       	     752	   1556083 ns/op	  650248 B/op	   23083 allocs/op
BenchmarkUnmarshal/gocsv.Unmarshal/10000_records-12      	      72	  17086623 ns/op	 7017469 B/op	  230092 allocs/op
BenchmarkUnmarshal/gocsv.Unmarshal/100000_records-12     	       7	 163610749 ns/op	75004923 B/op	 2300105 allocs/op
```

#### easycsv:
```
BenchmarkUnmarshal/easycsv.ReadAll/1_record-12           	  101527	     10662 ns/op	     8855 B/op	      81 allocs/op
BenchmarkUnmarshal/easycsv.ReadAll/10_records-12         	   23325	     51437 ns/op	    24072 B/op	     391 allocs/op
BenchmarkUnmarshal/easycsv.ReadAll/100_records-12        	    2402	    447296 ns/op	   170538 B/op	    3454 allocs/op
BenchmarkUnmarshal/easycsv.ReadAll/1000_records-12       	     272	   4370854 ns/op	  1595683 B/op	   34057 allocs/op
BenchmarkUnmarshal/easycsv.ReadAll/10000_records-12      	      24	  47502457 ns/op	 18861808 B/op	  340068 allocs/op
BenchmarkUnmarshal/easycsv.ReadAll/100000_records-12     	       3	 468974170 ns/op	189427066 B/op	 3400082 allocs/op
```

### Marshal <a name="performance_marshal"></a>

[benchmark code](https://gist.github.com/jszwec/31980321e1852ebb5615a44ccf374f17)

#### csvutil:
```
BenchmarkMarshal/csvutil.Marshal/1_record-12         	  279558	      4390 ns/op	    9952 B/op	      12 allocs/op
BenchmarkMarshal/csvutil.Marshal/10_records-12       	   82478	     15608 ns/op	   10800 B/op	      21 allocs/op
BenchmarkMarshal/csvutil.Marshal/100_records-12      	   10275	    117288 ns/op	   28208 B/op	     112 allocs/op
BenchmarkMarshal/csvutil.Marshal/1000_records-12     	    1075	   1147473 ns/op	  168508 B/op	    1014 allocs/op
BenchmarkMarshal/csvutil.Marshal/10000_records-12    	     100	  11985382 ns/op	 1525973 B/op	   10017 allocs/op
BenchmarkMarshal/csvutil.Marshal/100000_records-12   	       9	 113640813 ns/op	22455873 B/op	  100021 allocs/op
```

#### gocsv:
```
BenchmarkMarshal/gocsv.Marshal/1_record-12           	  203052	      6077 ns/op	    5914 B/op	      81 allocs/op
BenchmarkMarshal/gocsv.Marshal/10_records-12         	   50132	     24585 ns/op	    9284 B/op	     360 allocs/op
BenchmarkMarshal/gocsv.Marshal/100_records-12        	    5480	    212008 ns/op	   51916 B/op	    3151 allocs/op
BenchmarkMarshal/gocsv.Marshal/1000_records-12       	     514	   2053919 ns/op	  444506 B/op	   31053 allocs/op
BenchmarkMarshal/gocsv.Marshal/10000_records-12      	      52	  21066666 ns/op	 4332377 B/op	  310064 allocs/op
BenchmarkMarshal/gocsv.Marshal/100000_records-12     	       5	 207408929 ns/op	51169419 B/op	 3100077 allocs/op
```
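
#### Reading and reproducing the numbers

The results in this section are in the standard `go test -bench` output format: benchmark name, iteration count, ns/op, B/op and allocs/op. The sketch below is not the linked benchmark code; it is a minimal, dependency-free illustration of how figures of the same shape can be collected with `testing.Benchmark`. It assumes a hypothetical `writeRecords` helper that encodes rows with the standard `csv.Writer`; to measure this package instead, the helper body could be swapped for a call to `csvutil.Marshal`.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"strconv"
	"testing"
)

// writeRecords is a hypothetical stand-in for the code under test.
// It writes a header and n rows with the standard csv.Writer.
func writeRecords(n int) ([]byte, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	if err := w.Write([]string{"id", "name"}); err != nil {
		return nil, err
	}
	for i := 0; i < n; i++ {
		if err := w.Write([]string{strconv.Itoa(i), "user"}); err != nil {
			return nil, err
		}
	}
	w.Flush()
	return buf.Bytes(), w.Error()
}

func main() {
	// One benchmark run per input size, mirroring the N_records layout above.
	for _, n := range []int{1, 10, 100} {
		res := testing.Benchmark(func(b *testing.B) {
			b.ReportAllocs()
			for i := 0; i < b.N; i++ {
				if _, err := writeRecords(n); err != nil {
					b.Fatal(err)
				}
			}
		})
		// res.String() holds iterations and ns/op; res.MemString() holds B/op and allocs/op.
		fmt.Printf("%d_records\t%s\t%s\n", n, res.String(), res.MemString())
	}
}
```

Absolute values depend on the machine and Go version, so only comparisons made on the same setup are meaningful.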