CSV
CSV (comma-separated values) is a plain-text file format for tabular data such as numbers and text. RFC 4180 describes the format, but not all CSV files follow it. Some delimit fields with pipes, tabs, or semicolons instead of commas.
CSV values are text
CSV is stored in a text file, so every value is read as text. If you want to use a value as another type, you have to convert it explicitly, for example with the strconv package.
Go provides tools to work with CSV in its encoding/csv package.
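As a minimal sketch of converting text values into other types, the following assumes a record that holds a name, an integer ID, and a floating-point score. The helper name parseRow and this field layout are hypothetical and chosen only for illustration; the conversions use the standard strconv package.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseRow is a hypothetical helper. It assumes the record holds a name,
// an integer ID, and a floating-point score, in that order.
func parseRow(record []string) (string, int, float64, error) {
	if len(record) != 3 {
		return "", 0, 0, fmt.Errorf("expected 3 fields, got %d", len(record))
	}
	// Every CSV field arrives as a string, so parse the numeric ones.
	id, err := strconv.Atoi(record[1])
	if err != nil {
		return "", 0, 0, fmt.Errorf("parse id %q: %w", record[1], err)
	}
	score, err := strconv.ParseFloat(record[2], 64)
	if err != nil {
		return "", 0, 0, fmt.Errorf("parse score %q: %w", record[2], err)
	}
	return record[0], id, score, nil
}

func main() {
	name, id, score, err := parseRow([]string{"Rachel", "9012", "87.5"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// After conversion the values support arithmetic.
	fmt.Println(name, id+1, score/2) // prints: Rachel 9013 43.75
}
```

Returning the conversion error instead of ignoring it makes a malformed field visible to the caller rather than silently turning it into a zero value.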
Read entire file as 2D slice
ReadAll reads all remaining records from the Reader and returns them as a 2D slice of strings ([][]string), where each inner slice is a row:
- Open the CSV file. Open returns a file handle (*os.File), which implements io.Reader.
- Create a new CSV Reader that reads from the file.
- ReadAll reads all remaining records from the CSV Reader.
- Do some work with the records. Here, we print each row in the 2D slice on its own line.
package main

import (
	"encoding/csv"
	"fmt"
	"log"
	"os"
)

func main() {
	file, err := os.Open("email.csv") // 1
	if err != nil {
		// Stop here: continuing with a nil file would fail later.
		log.Fatalln("Cannot open CSV file:", err)
	}
	defer file.Close()

	csvReader := csv.NewReader(file) // 2
	rows, err := csvReader.ReadAll() // 3
	if err != nil {
		log.Fatalln("Cannot read CSV file:", err)
	}
	for _, row := range rows { // 4
		fmt.Printf("%q\n", row)
	}
}
Read one row at a time
For large CSV files, reading one row at a time avoids holding the entire file in memory. The Read method reads one record (a slice of fields) from the reader. To read all records, call it in an infinite for loop with a break condition:
- Open the CSV file. Open returns a file handle (*os.File), which implements io.Reader.
- Create a new CSV Reader that reads from the file.
- Start an infinite loop to read each record individually.
- Read reads the record.
- Check for an EOF. If you reach the end of the file, break out of the infinite loop and continue executing main.
- Print each value in the record with a for...range loop.
- Optional alternative that accesses each value in the record by its index.
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"log"
	"os"
)

func main() {
	file, err := os.Open("email.csv") // 1
	if err != nil {
		log.Fatalln("Cannot open CSV file:", err)
	}
	defer file.Close()

	csvReader := csv.NewReader(file) // 2
	for { // 3
		record, err := csvReader.Read() // 4
		if err == io.EOF { // 5
			break
		}
		if err != nil {
			// Stop instead of looping again on a broken record.
			log.Fatalln("Cannot read CSV file:", err)
		}
		for _, val := range record { // 6
			fmt.Println(val)
		}
		// for i := range record { // 7
		// 	fmt.Printf("%s\n", record[i])
		// }
	}
}
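The Read loop can also be wrapped in a function that returns the error to the caller instead of logging it. The following is a minimal sketch under that assumption: countRecords is a hypothetical helper, and it reads from an in-memory string rather than a file so that it runs without any input files.

```go
package main

import (
	"encoding/csv"
	"errors"
	"fmt"
	"io"
	"strings"
)

// countRecords is a hypothetical helper that streams CSV text one record
// at a time and returns how many records it read successfully.
func countRecords(data string) (int, error) {
	csvReader := csv.NewReader(strings.NewReader(data))
	count := 0
	for {
		_, err := csvReader.Read()
		if errors.Is(err, io.EOF) {
			return count, nil // end of input: the normal way out of the loop
		}
		if err != nil {
			return count, err // stop at the first malformed record
		}
		count++
	}
}

func main() {
	n, err := countRecords("a,b\nc,d\ne,f\n")
	fmt.Println(n, err) // prints: 3 <nil>
}
```

Only one record is held at a time, so memory use does not grow with the size of the input.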
Marshal into structs
Instead of dealing with the 2D slice of strings that ReadAll returns, you can marshal the CSV fields into structs. This example uses the following CSV file:
email,id,first_name,last_name
rachel@yourcompany.com,9012,Rachel,Booker
laura@yourcompany.com,2070,Laura,Grey
craig@yourcompany.com,4081,Craig,Johnson
mary@yourcompany.com,9346,Mary,Jenkins
jamie@yourcompany.com,5079,Jamie,Smith
First, create a struct that models the CSV file:
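package main

// Imports used by the complete example in this section.
import (
	"encoding/csv"
	"fmt"
	"io"
	"log"
	"os"
	"strconv"
)

type User struct {
	Email     string
	Id        int
	FirstName string
	LastName  string
}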
marshalCSVToUsers takes a Reader and returns a slice of User structs:
- Create a new CSV Reader that reads data from the given Reader.
- ReadAll reads all remaining records from the Reader.
- Create a slice of User structs.
- Iterate through the 2D rows returned from the reader with a for...range loop.
- In the CSV file, Id is the second field and holds an integer. Convert the value to an int with Atoi. You have to do this separately because Atoi returns both an int and an error.
- The remaining fields are of type string, so assign each struct field to its corresponding row field with the row index. Assign the converted id variable to the Id field.
- Append the new user struct to the slice.
- Return the slice of User structs.
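func marshalCSVToUsers(r io.Reader) ([]User, error) {
	csvReader := csv.NewReader(r) // 1
	// Discard the header row (email,id,first_name,last_name) so that
	// Atoi is not called on the "id" column label.
	if _, err := csvReader.Read(); err != nil {
		return nil, err
	}
	rows, err := csvReader.ReadAll() // 2
	if err != nil {
		return nil, fmt.Errorf("cannot read CSV data: %w", err)
	}
	var users []User // 3
	for _, row := range rows { // 4
		id, err := strconv.Atoi(row[1]) // 5
		if err != nil {
			return nil, fmt.Errorf("invalid id %q: %w", row[1], err)
		}
		user := User{ // 6
			Email:     row[0],
			Id:        id,
			FirstName: row[2],
			LastName:  row[3],
		}
		users = append(users, user) // 7
	}
	return users, nil // 8
}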
Finally, call the function in main:
- Open the CSV file. Open returns a file handle (*os.File), which implements io.Reader.
- Pass the file Reader to marshalCSVToUsers to get a slice of User structs that contain the data from file.
- Do some work with the users slice.
func main() {
	file, err := os.Open("email.csv") // 1
	if err != nil {
		log.Fatalln("Cannot open CSV file:", err)
	}
	defer file.Close()

	users, err := marshalCSVToUsers(file) // 2
	if err != nil {
		log.Fatalln("Cannot parse CSV file:", err)
	}
	for _, user := range users { // 3
		fmt.Println(user)
	}
}
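Because marshalCSVToUsers accepts an io.Reader rather than a file name, the same kind of parsing logic can be exercised with in-memory data. The sketch below shows one way that might look. parseUsers is a stand-in written for this example: it assumes input without a header row and sets FieldsPerRecord to 4 so that short rows are rejected. usersFromString is a hypothetical convenience wrapper.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strconv"
	"strings"
)

type User struct {
	Email     string
	Id        int
	FirstName string
	LastName  string
}

// parseUsers is a stand-in with the same signature as marshalCSVToUsers.
// Assumption: the input contains data rows only, with no header row.
func parseUsers(r io.Reader) ([]User, error) {
	csvReader := csv.NewReader(r)
	csvReader.FieldsPerRecord = 4 // reject rows without exactly four fields
	rows, err := csvReader.ReadAll()
	if err != nil {
		return nil, err
	}
	var users []User
	for i, row := range rows {
		id, err := strconv.Atoi(row[1])
		if err != nil {
			return nil, fmt.Errorf("row %d: invalid id %q: %w", i+1, row[1], err)
		}
		users = append(users, User{Email: row[0], Id: id, FirstName: row[2], LastName: row[3]})
	}
	return users, nil
}

// usersFromString is a hypothetical wrapper that feeds a string to parseUsers.
func usersFromString(s string) ([]User, error) {
	return parseUsers(strings.NewReader(s))
}

func main() {
	users, err := usersFromString("rachel@yourcompany.com,9012,Rachel,Booker\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", users[0])
}
```

Any io.Reader works here, so the same function could read from a file, a network response, or a test fixture.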
Remove the column headers
CSV files often start with a header row that contains the column labels. You can skip it by calling Read to discard the first record, then reading the remainder of the file with ReadAll:
- Open the CSV file. Open returns a file handle (*os.File), which implements io.Reader.
- Create a new CSV Reader that reads from the file.
- Call Read to read the first row that contains the column headers. The record is assigned to the blank identifier, so it is discarded. Only the error is checked.
- ReadAll reads all remaining records from the csvReader.
- Do some work with the records. Here, we print each row in the 2D slice on its own line.
package main

import (
	"encoding/csv"
	"fmt"
	"log"
	"os"
)

func main() {
	file, err := os.Open("email.csv") // 1
	if err != nil {
		log.Fatalln("Cannot open CSV file:", err)
	}
	defer file.Close()

	csvReader := csv.NewReader(file) // 2
	if _, err := csvReader.Read(); err != nil { // 3
		log.Fatalln("Cannot read header row:", err)
	}
	rows, err := csvReader.ReadAll() // 4
	if err != nil {
		log.Fatalln("Cannot read CSV file:", err)
	}
	for _, row := range rows { // 5
		fmt.Printf("%q\n", row)
	}
}
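Instead of throwing the header row away, it can be used to look up a column by name. The following is a minimal sketch of that idea, not part of the original examples: columnValues is a hypothetical helper, and it reads from an in-memory string so that it is self-contained.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// columnValues is a hypothetical helper. It reads the header row, finds the
// named column, and returns that column's value from every remaining row.
func columnValues(data, column string) ([]string, error) {
	csvReader := csv.NewReader(strings.NewReader(data))
	header, err := csvReader.Read() // the first record is the header row
	if err != nil {
		return nil, err
	}
	index := -1
	for i, name := range header {
		if name == column {
			index = i
			break
		}
	}
	if index < 0 {
		return nil, fmt.Errorf("column %q not found", column)
	}
	rows, err := csvReader.ReadAll() // only the records after the header
	if err != nil {
		return nil, err
	}
	values := make([]string, 0, len(rows))
	for _, row := range rows {
		values = append(values, row[index])
	}
	return values, nil
}

func main() {
	data := "email,id\nrachel@yourcompany.com,9012\nlaura@yourcompany.com,2070\n"
	ids, err := columnValues(data, "id")
	fmt.Println(ids, err) // prints: [9012 2070] <nil>
}
```

Indexing row[index] is safe in this sketch because, with the default FieldsPerRecord of 0, the Reader requires every later record to have the same number of fields as the first one.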
Custom delimiters
When you create a CSV Reader, it uses comma delimiters by default. To change the delimiter, set the Comma field on the Reader instance:
- Create the CSV Reader.
- Set the Reader’s Comma field. Because Comma is a rune, you need to write the value as a rune literal in single quotes.
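package main

import (
	"encoding/csv"
	"fmt"
	"log"
	"os"
)

func main() {
	file, err := os.Open("semicolon.csv")
	if err != nil {
		log.Fatalln("Cannot open CSV file:", err)
	}
	defer file.Close()

	csvReader := csv.NewReader(file) // 1
	csvReader.Comma = ';'            // 2
	rows, err := csvReader.ReadAll()
	if err != nil {
		log.Fatalln("Cannot read CSV file:", err)
	}
	// Do something with rows. Go does not compile unused variables,
	// so this example prints each row.
	for _, row := range rows {
		fmt.Printf("%q\n", row)
	}
}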
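The same approach works for other delimiters, such as tabs or pipes. As a small sketch, the hypothetical helper readDelimited below takes the delimiter as a parameter and parses an in-memory string, so it can be run without a data file.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// readDelimited is a hypothetical helper that parses data using the given
// single-character delimiter instead of the default comma.
func readDelimited(data string, delimiter rune) ([][]string, error) {
	csvReader := csv.NewReader(strings.NewReader(data))
	csvReader.Comma = delimiter
	return csvReader.ReadAll()
}

func main() {
	rows, err := readDelimited("email\tid\nrachel@yourcompany.com\t9012\n", '\t')
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", rows) // prints: [["email" "id"] ["rachel@yourcompany.com" "9012"]]
}
```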
Ignore rows
In many text formats, there is a special character that indicates the line is a comment and should be ignored. For example, if you want to ignore a line of text in a Bash script, you place a # symbol at the start of the line. CSV files do not have a standardized “comment” character, but the CSV Reader type has a Comment field you can set to ignore lines that begin with the given character.
The process is similar to the custom delimiter example, where you set the field with a rune:
- Create the CSV Reader.
- Set the Reader’s Comment field. Because Comment is a rune, you need to write the value as a rune literal in single quotes.
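package main

import (
	"encoding/csv"
	"fmt"
	"log"
	"os"
)

func main() {
	file, err := os.Open("semicolon.csv")
	if err != nil {
		log.Fatalln("Cannot open CSV file:", err)
	}
	defer file.Close()

	csvReader := csv.NewReader(file) // 1
	csvReader.Comma = ';'
	csvReader.Comment = '#' // 2
	rows, err := csvReader.ReadAll()
	if err != nil {
		log.Fatalln("Cannot read CSV file:", err)
	}
	// Do something with rows. Go does not compile unused variables,
	// so this example prints each row.
	for _, row := range rows {
		fmt.Printf("%q\n", row)
	}
}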
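To see which lines the Comment setting actually skips, the following sketch parses a small in-memory sample. The helper name readSkippingComments and the sample data are made up for illustration. Only lines that start with the comment character are ignored; the same character inside a field is kept as data.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// readSkippingComments is a hypothetical helper that ignores every line
// beginning with '#'.
func readSkippingComments(data string) ([][]string, error) {
	csvReader := csv.NewReader(strings.NewReader(data))
	csvReader.Comment = '#'
	return csvReader.ReadAll()
}

func main() {
	data := "# exported users\n" +
		"email,id\n" +
		"# temporarily disabled\n" +
		"rachel@yourcompany.com,9012\n"
	rows, err := readSkippingComments(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(rows)) // prints: 2
}
```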
Writing CSV data
WriteAll CSV data
Write a complete CSV file with the WriteAll method. The data that you write must be a 2D slice of strings:
- Create a file that you want to write data to. Create returns a file handle (*os.File), which implements io.Writer.
- Create a 2D slice that models the CSV data that you want to write to the file.
- Create a new CSV Writer, and pass it the file Writer. The CSV Writer is a wrapper around the file Writer. The file writes directly to the output stream, and the CSV Writer handles CSV-specific tasks, such as escaping and quoting fields.
- WriteAll writes all data in the 2D slice to the underlying Writer and then calls Flush for you. It returns only an error.
package main

import (
	"encoding/csv"
	"log"
	"os"
)

func main() {
	file, err := os.Create("users.csv") // 1
	if err != nil {
		log.Fatalln("Cannot create CSV file:", err)
	}
	defer file.Close()

	// Each row has four fields to match the four-column header.
	data := [][]string{ // 2
		{"email", "id", "first_name", "last_name"},
		{"rachel@yourcompany.com", "9012", "Rachel", "Booker"},
		{"laura@yourcompany.com", "2070", "Laura", "Grey"},
		{"craig@yourcompany.com", "4081", "Craig", "Johnson"},
		{"mary@yourcompany.com", "9346", "Mary", "Jenkins"},
		{"jamie@yourcompany.com", "5079", "Jamie", "Smith"},
	}
	csvWriter := csv.NewWriter(file) // 3
	err = csvWriter.WriteAll(data)   // 4
	if err != nil {
		log.Fatalln("Cannot write to CSV file:", err)
	}
}
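To illustrate the escaping and quoting that the CSV Writer performs, the sketch below writes to an in-memory buffer instead of a file. The helper name encodeCSV and the sample rows are assumptions made for this example. Fields that contain a comma or a double quote are wrapped in quotes, and embedded double quotes are doubled.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// encodeCSV is a hypothetical helper that encodes rows as CSV text.
func encodeCSV(rows [][]string) (string, error) {
	var buf bytes.Buffer
	csvWriter := csv.NewWriter(&buf)
	// WriteAll writes every row and then flushes the Writer.
	if err := csvWriter.WriteAll(rows); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := encodeCSV([][]string{
		{"name", "note"},
		{"Booker, Rachel", `said "hi"`},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
	// Prints:
	// name,note
	// "Booker, Rachel","said ""hi"""
}
```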
Write one row at a time
If you want more control over the writing operation, you can write one row at a time to the file with the Write method. This requires that you use a for...range loop to iterate over the data in the 2D slice and call Write during each iteration:
- Create a file that you want to write data to. Create returns a file handle (*os.File), which implements io.Writer.
- Create a 2D slice that models the CSV data that you want to write to the file.
- Create a new CSV Writer, and pass it the file Writer. The CSV Writer is a wrapper around the file Writer. The file writes directly to the output stream, and the CSV Writer handles CSV-specific tasks, such as escaping and quoting fields.
- Create a for...range loop that ranges over the 2D data slice.
- Each iteration visits a row in the 2D slice. During each iteration, call Write to write the row to the CSV Writer, which buffers it before passing it to the underlying file Writer.
- After the for...range loop exits, call Flush to write any buffered data to the underlying file Writer.
package main

import (
	"encoding/csv"
	"log"
	"os"
)

func main() {
	file, err := os.Create("users2.csv") // 1
	if err != nil {
		log.Fatalln("Cannot create CSV file:", err)
	}
	defer file.Close()

	// Each row has four fields to match the four-column header.
	data := [][]string{ // 2
		{"email", "id", "first_name", "last_name"},
		{"rachel@yourcompany.com", "9012", "Rachel", "Booker"},
		{"laura@yourcompany.com", "2070", "Laura", "Grey"},
		{"craig@yourcompany.com", "4081", "Craig", "Johnson"},
		{"mary@yourcompany.com", "9346", "Mary", "Jenkins"},
		{"jamie@yourcompany.com", "5079", "Jamie", "Smith"},
	}
	csvWriter := csv.NewWriter(file) // 3
	for _, row := range data { // 4
		if err := csvWriter.Write(row); err != nil { // 5
			log.Fatalln("Cannot write to CSV file:", err)
		}
	}
	csvWriter.Flush() // 6
	// Flush does not return an error, so check Error afterward.
	if err := csvWriter.Error(); err != nil {
		log.Fatalln("Cannot flush CSV file:", err)
	}
}
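The need for Flush can be observed by writing to an in-memory buffer and checking its size before and after the call. The sketch below does that; writeRows is a hypothetical helper, and it assumes an input small enough to fit in the Writer's internal buffer, which is why nothing reaches the destination until Flush runs.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// writeRows is a hypothetical helper. It writes rows one at a time and
// returns the number of bytes that reached the buffer before Flush,
// followed by the final CSV text.
func writeRows(rows [][]string) (int, string, error) {
	var buf bytes.Buffer
	csvWriter := csv.NewWriter(&buf)
	for _, row := range rows {
		if err := csvWriter.Write(row); err != nil {
			return 0, "", err
		}
	}
	beforeFlush := buf.Len() // small writes are still held by the CSV Writer
	csvWriter.Flush()
	// Flush has no return value, so errors are reported through Error.
	if err := csvWriter.Error(); err != nil {
		return beforeFlush, "", err
	}
	return beforeFlush, buf.String(), nil
}

func main() {
	n, out, err := writeRows([][]string{{"email", "id"}, {"rachel@yourcompany.com", "9012"}})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(n) // prints: 0
	fmt.Print(out)
}
```

Forgetting Flush is an easy mistake with Write, because the program runs without error and simply produces an empty or truncated file.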