
New liberal_parsing option for parsing bad CSV data

on November 22, 2016
This blog is part of our Ruby 2.4 series.

Comma-Separated Values (CSV) is a widely used data format, and almost every language has a module to parse it. In Ruby, the CSV class does that.

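According to RFC 4180, a double quote may appear inside a field only if the whole field is enclosed in double quotes and the embedded quote is escaped by doubling it. Input with a stray, unescaped double quote does not conform, so a strict parser can't parse it.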
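
To make the escaping rule concrete, here is a small sketch of what compliant input looks like. The sample data is made up for illustration; `CSV.parse_line` and `CSV.generate_line` are the standard library calls.

```ruby
require "csv"

# A field that contains a double quote must itself be wrapped in double
# quotes, and the embedded quote is escaped by doubling it.
compliant = 'one,"two""",three,four'

fields = CSV.parse_line(compliant)
p fields

# CSV.generate_line applies the same escaping when writing a row back out.
line = CSV.generate_line(fields)
puts line
```

The second field round-trips as `two"`, because the writer quotes and doubles the embedded quote for us.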

We get a CSV::MalformedCSVError when the CSV data does not conform to RFC 4180.

Ruby 2.4 has added a liberal_parsing option to parse such bad data. When it is set to true, Ruby will try to parse the data even when it does not conform to RFC 4180.

# Before Ruby 2.4

> CSV.parse_line('one,two",three,four')

CSV::MalformedCSVError: Illegal quoting in line 1.


# With Ruby 2.4

> CSV.parse_line('one,two",three,four', liberal_parsing: true)

=> ["one", "two\"", "three", "four"]
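
In practice we may want strict parsing by default and liberal parsing only as a fallback. Here is a minimal sketch of that idea, assuming Ruby 2.4 or later. The helper name `parse_line_leniently` is hypothetical and not part of the CSV library.

```ruby
require "csv"

# Hypothetical helper (not part of the CSV library): parse strictly first,
# and retry with liberal_parsing only when the line is malformed.
def parse_line_leniently(line)
  CSV.parse_line(line)
rescue CSV::MalformedCSVError
  # Best effort: liberal parsing may still raise for some inputs.
  CSV.parse_line(line, liberal_parsing: true)
end

p parse_line_leniently('one,"two",three')      # well-formed, strict parse succeeds
p parse_line_leniently('one,two",three,four')  # malformed, falls back to liberal parsing
```

Keeping the strict parse first means well-formed rows are handled exactly as before, and only the bad rows take the more permissive path.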
