class CSV - RDoc Documentation

CSV

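CSV (comma-separated values) data is a text representation of a table, in which a row separator delimits the table rows and a column separator delimits the fields in each row.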

This CSV String, with row separator "\n" and column separator ",", has three rows and two columns:

"foo,0\nbar,1\nbaz,2\n"

Despite the name CSV, a CSV representation can use different separators.
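As a minimal sketch of that point, the following parses tab-separated data by passing option col_sep (described later in this document). The sample string is made up for illustration.

```ruby
require 'csv'

# Hypothetical sample data: fields separated by tabs instead of commas.
tsv = "foo\t0\nbar\t1\nbaz\t2\n"

# Option col_sep tells the parser which column separator to expect.
rows = CSV.parse(tsv, col_sep: "\t")
p rows # prints [["foo", "0"], ["bar", "1"], ["baz", "2"]]
```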

For more about tables, see the Wikipedia article “Table (information)”, especially its section “Simple table”.

Class CSV

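Class CSV provides methods for parsing CSV data from a String object, a file (via its file path), or an open IO stream, and for generating CSV data to a String object.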

To make CSV available:

require 'csv'

All examples here assume that this has been done.

Keeping It Simple

A CSV object has dozens of instance methods that offer fine-grained control of parsing and generating CSV data. For many needs, though, simpler approaches will do.

This section summarizes the singleton methods in CSV that allow you to parse and generate without explicitly creating CSV objects. For details, follow the links.

Simple Parsing

Parsing methods commonly return either an Array of Arrays of Strings (each inner Array is a row, each String a field) or a CSV::Table object.

Parsing a String

The input to be parsed can be a string:

string = "foo,0\nbar,1\nbaz,2\n"

Method CSV.parse returns the entire CSV data:

CSV.parse(string)

Method CSV.parse_line returns only the first row:

CSV.parse_line(string)

CSV extends class String with instance method String#parse_csv, which also returns only the first row:

string.parse_csv

Parsing Via a File Path

The input to be parsed can be in a file:

string = "foo,0\nbar,1\nbaz,2\n"
path = 't.csv'
File.write(path, string)

Method CSV.read returns the entire CSV data:

CSV.read(path)

Method CSV.foreach iterates, passing each row to the given block:

CSV.foreach(path) do |row|
  p row
end

Output:

["foo", "0"]
["bar", "1"]
["baz", "2"]

Method CSV.table returns the entire CSV data as a CSV::Table object:

CSV.table(path)
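A short sketch of what CSV.table adds on top of CSV.read, assuming a file whose first row holds headers (the temporary file and its contents are made up here): the first row is used as headers, the headers are converted to Symbols, and numeric fields are converted to numbers.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical file with a header row.
file = Tempfile.new(['table_example', '.csv'])
file.write("Name,Count\nfoo,0\nbar,1\n")
file.close

table = CSV.table(file.path)
p table.class      # prints CSV::Table
p table.headers    # prints [:name, :count]
p table[0][:count] # prints 0

file.unlink
```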

Parsing from an Open IO Stream

The input to be parsed can be in an open IO stream:

Method CSV.read returns the entire CSV data:

File.open(path) do |file|
  CSV.read(file)
end

As does method CSV.parse:

File.open(path) do |file|
  CSV.parse(file)
end

Method CSV.parse_line returns only the first row:

File.open(path) do |file|
  CSV.parse_line(file)
end

Method CSV.foreach iterates, passing each row to the given block:

File.open(path) do |file|
  CSV.foreach(file) do |row|
    p row
  end
end

Output:

["foo", "0"]
["bar", "1"]
["baz", "2"]

Method CSV.table returns the entire CSV data as a CSV::Table object:

File.open(path) do |file|
  CSV.table(file)
end

Simple Generating

Method CSV.generate returns a String; this example uses method CSV#<< to append the rows that are to be generated:

output_string = CSV.generate do |csv|
  csv << ['foo', 0]
  csv << ['bar', 1]
  csv << ['baz', 2]
end
output_string

Method CSV.generate_line returns a String containing the single row constructed from an Array:

CSV.generate_line(['foo', '0'])

CSV extends class Array with instance method Array#to_csv, which forms an Array into a String:

['foo', '0'].to_csv

“Filtering” CSV

Method CSV.filter provides a Unix-style filter for CSV data. The input data is processed to form the output data:

in_string = "foo,0\nbar,1\nbaz,2\n"
out_string = ''
CSV.filter(in_string, out_string) do |row|
  row[0] = row[0].upcase
  row[1] *= 4
end
out_string

CSV Objects

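There are three ways to create a CSV object: method CSV.new returns a new CSV object; method CSV.instance returns a new or cached CSV object; and method CSV() also returns a new or cached CSV object.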
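A minimal sketch of creating CSV objects with CSV.new and CSV.instance from the csv library. The variable names are illustrative only.

```ruby
require 'csv'

string = "foo,0\nbar,1\nbaz,2\n"

# CSV.new always builds a fresh object.
csv = CSV.new(string)
first_row = csv.shift
p first_row # prints ["foo", "0"]

# CSV.instance returns a cached object when called again with the
# same data object and the same options.
cached = CSV.instance(string)
same_object = cached.equal?(CSV.instance(string))
p same_object # prints true
```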

Instance Methods

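CSV has three groups of instance methods: its own internally defined instance methods, methods included by module Enumerable, and methods delegated to class IO.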

Delegated Methods

For convenience, a CSV object will delegate to many methods in class IO. (A few have wrapper “guard code” in CSV.) You may call:
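As a hedged illustration of stream-style methods on a CSV object, this sketch assumes that eof? and rewind are among the available methods (they are in current versions of the csv library, where they carry guard code). The data is made up.

```ruby
require 'csv'

csv = CSV.new("foo,0\nbar,1\n")

all_rows = csv.read   # consume every row
at_end = csv.eof?     # nothing left to read
csv.rewind            # go back to the beginning of the data
first_again = csv.shift

p all_rows    # prints [["foo", "0"], ["bar", "1"]]
p at_end      # prints true
p first_again # prints ["foo", "0"]
```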

Options

The default values for options are:

DEFAULT_OPTIONS = {
  col_sep:            ",",
  row_sep:            :auto,
  quote_char:         '"',

  field_size_limit:   nil,
  converters:         nil,
  unconverted_fields: nil,
  headers:            false,
  return_headers:     false,
  header_converters:  nil,
  skip_blanks:        false,
  skip_lines:         nil,
  liberal_parsing:    false,
  nil_value:          nil,
  empty_value:        "",

  write_headers:      nil,
  quote_empty:        true,
  force_quotes:       false,
  write_converters:   nil,
  write_nil_value:    nil,
  write_empty_value:  "",
  strip:              false,
}

Options for Parsing

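Options for parsing, described in detail below, include row_sep, col_sep, quote_char, field_size_limit, converters, unconverted_fields, headers, return_headers, header_converters, skip_blanks, skip_lines, strip, liberal_parsing, nil_value, and empty_value.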

Option row_sep

Specifies the row separator, a String or the Symbol :auto (see below), to be used for both parsing and generating.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:row_sep)


When row_sep is a String, that String becomes the row separator. The String will be transcoded into the data's Encoding before use.

Using "\n":

row_sep = "\n"
str = CSV.generate(row_sep: row_sep) do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str
ary = CSV.parse(str)
ary

Using | (pipe):

row_sep = '|'
str = CSV.generate(row_sep: row_sep) do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str
ary = CSV.parse(str, row_sep: row_sep)
ary

Using -- (two hyphens):

row_sep = '--'
str = CSV.generate(row_sep: row_sep) do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str
ary = CSV.parse(str, row_sep: row_sep)
ary

Using '' (empty string):

row_sep = ''
str = CSV.generate(row_sep: row_sep) do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str
ary = CSV.parse(str, row_sep: row_sep)
ary


When row_sep is the Symbol :auto (the default), generating uses "\n" as the row separator:

str = CSV.generate do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str

Parsing, on the other hand, invokes auto-discovery of the row separator.

Auto-discovery reads ahead in the data looking for the next \r\n, \n, or \r sequence. The sequence will be selected even if it occurs in a quoted field, assuming that you would have the same line endings there.

Example:

str = CSV.generate do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str
ary = CSV.parse(str)
ary
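Auto-discovery can also be seen with data that does not use "\n". This sketch uses made-up data with Windows-style "\r\n" line endings and compares the default :auto with an explicit row_sep.

```ruby
require 'csv'

# Hypothetical data using "\r\n" line endings.
crlf_data = "foo,0\r\nbar,1\r\n"

# No row_sep given: the default :auto discovers "\r\n" from the data.
rows = CSV.parse(crlf_data)
p rows # prints [["foo", "0"], ["bar", "1"]]

# An explicit row_sep gives the same rows without discovery.
same_rows = CSV.parse(crlf_data, row_sep: "\r\n")
p same_rows == rows # prints true
```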

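The default $INPUT_RECORD_SEPARATOR ($/) is used if any of the following is true: none of those sequences is found; the data is ARGF, STDIN, STDOUT, or STDERR; or the stream is only available for output.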

Obviously, discovery takes a little time. Set manually if speed is important. Also note that IO objects should be opened in binary mode on Windows if this feature will be used as the line-ending translation can cause problems with resetting the document position to where it was before the read ahead.


Raises an exception if the given value is not String-convertible:

row_sep = BasicObject.new

CSV.generate(ary, row_sep: row_sep)

CSV.parse(str, row_sep: row_sep)

Option col_sep

Specifies the String field separator to be used for both parsing and generating. The String will be transcoded into the data's Encoding before use.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:col_sep)

Using the default (comma):

str = CSV.generate do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str
ary = CSV.parse(str)
ary

Using : (colon):

col_sep = ':'
str = CSV.generate(col_sep: col_sep) do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str
ary = CSV.parse(str, col_sep: col_sep)
ary

Using :: (two colons):

col_sep = '::'
str = CSV.generate(col_sep: col_sep) do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str
ary = CSV.parse(str, col_sep: col_sep)
ary

Using '' (empty string):

col_sep = ''
str = CSV.generate(col_sep: col_sep) do |csv|
  csv << [:foo, 0]
  csv << [:bar, 1]
  csv << [:baz, 2]
end
str


Raises an exception if parsing with the empty String:

col_sep = ''

CSV.parse("foo0\nbar1\nbaz2\n", col_sep: col_sep)

Raises an exception if the given value is not String-convertible:

col_sep = BasicObject.new

CSV.generate(line, col_sep: col_sep)

CSV.parse(str, col_sep: col_sep)

Option quote_char

Specifies the character (String of length 1) used to quote fields in both parsing and generating. This String will be transcoded into the data's Encoding before use.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:quote_char)

This is useful for an application that incorrectly uses ' (single-quote) to quote fields, instead of the correct " (double-quote).

Using the default (double quote):

str = CSV.generate do |csv|
  csv << ['foo', 0]
  csv << ["'bar'", 1]
  csv << ['"baz"', 2]
end
str
ary = CSV.parse(str)
ary

Using ' (single-quote):

quote_char = "'"
str = CSV.generate(quote_char: quote_char) do |csv|
  csv << ['foo', 0]
  csv << ["'bar'", 1]
  csv << ['"baz"', 2]
end
str
ary = CSV.parse(str, quote_char: quote_char)
ary


Raises an exception if the String length is greater than 1:

CSV.new('', quote_char: 'xx')

Raises an exception if the value is not a String:

CSV.new('', quote_char: :foo)

Option field_size_limit

Specifies the Integer field size limit.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:field_size_limit)

This is a maximum size CSV will read ahead looking for the closing quote for a field. (In truth, it reads to the first line ending beyond this size.) If a quote cannot be found within the limit CSV will raise a MalformedCSVError, assuming the data is faulty. You can use this limit to prevent what are effectively DoS attacks on the parser. However, this limit can cause a legitimate parse to fail; therefore the default value is nil (no limit).

For the examples in this section:

str = <<~EOT
  "a","b"
  " 2345 ",""
EOT
str

Using the default nil:

ary = CSV.parse(str)
ary

Using 50:

field_size_limit = 50
ary = CSV.parse(str, field_size_limit: field_size_limit)
ary


Raises an exception if a field is too long:

big_str = "123456789\n" * 1024

CSV.parse('valid,fields,"' + big_str + '"', field_size_limit: 2048)

Option converters

Specifies converters to be used in parsing fields. See Field Converters

Default value:

CSV::DEFAULT_OPTIONS.fetch(:converters)

The value may be a field converter name (see Stored Converters):

str = '1,2,3'

array = CSV.parse_line(str)
array

array = CSV.parse_line(str, converters: :integer)
array

The value may be a converter list (see Converter Lists):

str = '1,3.14159'

array = CSV.parse_line(str)
array

array = CSV.parse_line(str, converters: [:integer, :float])
array

The value may be a Proc custom converter (see Custom Field Converters):

str = ' foo , bar , baz '

array = CSV.parse_line(str)
array

array = CSV.parse_line(str, converters: proc {|field| field.strip })
array

See also Custom Field Converters


Raises an exception if the converter is not a converter name or a Proc:

str = 'foo,0'

CSV.parse(str, converters: :foo)

Option unconverted_fields

Specifies the boolean that determines whether unconverted field values are to be available.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:unconverted_fields)

The unconverted field values are those found in the source data, prior to any conversions performed via option converters.

When option unconverted_fields is true, each returned row (Array or CSV::Row) has an added method, unconverted_fields, that returns the unconverted field values:

str = <<-EOT
foo,0
bar,1
baz,2
EOT

csv = CSV.parse(str, converters: :integer)
csv
csv.first.respond_to?(:unconverted_fields)

csv = CSV.parse(str, converters: :integer, unconverted_fields: true)
csv
csv.first.respond_to?(:unconverted_fields)
csv.first.unconverted_fields

Option headers

Specifies a boolean, Symbol, Array, or String to be used to define column headers.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:headers)


Without headers:

str = <<-EOT
Name,Count
foo,0
bar,1
bax,2
EOT
csv = CSV.new(str)
csv
csv.headers
csv.shift


If set to true or the Symbol :first_row, the first row of the data is treated as a row of headers:

str = <<-EOT
Name,Count
foo,0
bar,1
bax,2
EOT
csv = CSV.new(str, headers: true)
csv
csv.headers
csv.shift


If set to an Array, the Array elements are treated as headers:

str = <<-EOT
foo,0
bar,1
bax,2
EOT
csv = CSV.new(str, headers: ['Name', 'Count'])
csv
csv.headers
csv.shift


If set to a String str, method CSV::parse_line(str, options) is called with the current options, and the returned Array is treated as headers:

str = <<-EOT
foo,0
bar,1
bax,2
EOT
csv = CSV.new(str, headers: 'Name,Count')
csv
csv.headers
csv.shift

Option return_headers

Specifies the boolean that determines whether method shift returns or ignores the header row.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:return_headers)

Examples:

str = <<-EOT
Name,Count
foo,0
bar,1
bax,2
EOT

csv = CSV.new(str, headers: true)
csv.shift

csv = CSV.new(str, headers: true, return_headers: true)
csv.shift

Option header_converters

Specifies converters to be used in parsing headers. See Header Converters.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:header_converters)

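Identical in functionality to option converters, except that the converters apply only to the header row, and the built-in header converters are :downcase and :symbol.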

This section assumes prior execution of:

str = <<-EOT
Name,Value
foo,0
bar,1
baz,2
EOT

table = CSV.parse(str, headers: true)
table.headers

The value may be a header converter name (see Stored Converters):

table = CSV.parse(str, headers: true, header_converters: :downcase)
table.headers

The value may be a converter list (see Converter Lists):

header_converters = [:downcase, :symbol]
table = CSV.parse(str, headers: true, header_converters: header_converters)
table.headers

The value may be a Proc custom converter (see Custom Header Converters):

upcase_converter = proc {|field| field.upcase }
table = CSV.parse(str, headers: true, header_converters: upcase_converter)
table.headers

See also Custom Header Converters

Option skip_blanks

Specifies a boolean that determines whether blank lines in the input will be ignored; a line that contains a column separator is not considered to be blank.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:skip_blanks)

See also option skip_lines.

For examples in this section:

str = <<-EOT
foo,0

bar,1
baz,2

,
EOT

Using the default, false:

ary = CSV.parse(str)
ary

Using true:

ary = CSV.parse(str, skip_blanks: true)
ary

Using a truthy value:

ary = CSV.parse(str, skip_blanks: :foo)
ary

Option skip_lines

Specifies an object to use in identifying comment lines in the input that are to be ignored. The object may be a Regexp or a String; lines that match it are skipped.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:skip_lines)

For examples in this section:

str = <<-EOT
# Comment
foo,0
bar,1
baz,2
# Another comment
EOT
str

Using the default, nil:

ary = CSV.parse(str)
ary

Using a Regexp:

ary = CSV.parse(str, skip_lines: /^#/)
ary

Using a String:

ary = CSV.parse(str, skip_lines: '#')
ary


Raises an exception if given an object that is not a Regexp, a String, or nil:

CSV.parse(str, skip_lines: 0)

Option strip

Specifies the boolean value that determines whether whitespace is stripped from each input field.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:strip)

With default value false:

ary = CSV.parse_line(' a , b ')
ary

With value true:

ary = CSV.parse_line(' a , b ', strip: true)
ary

Option liberal_parsing

Specifies the boolean value that determines whether CSV will attempt to parse input not conformant with RFC 4180, such as double quotes in unquoted fields.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:liberal_parsing)

For examples in this section:

str = 'is,this "three, or four",fields'

Without liberal_parsing:

CSV.parse_line(str)

With liberal_parsing:

ary = CSV.parse_line(str, liberal_parsing: true)
ary

Option nil_value

Specifies the object that is to be substituted for each null (no-text) field.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:nil_value)

With the default, nil:

CSV.parse_line('a,,b,,c')

With a different object:

CSV.parse_line('a,,b,,c', nil_value: 0)

Option empty_value

Specifies the object that is to be substituted for each field that has an empty String.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:empty_value)

With the default, "":

CSV.parse_line('a,"",b,"",c')

With a different object:

CSV.parse_line('a,"",b,"",c', empty_value: 'x')

Options for Generating

Options for generating include row_sep, col_sep, quote_char, write_headers, force_quotes, quote_empty, write_converters, write_nil_value, and write_empty_value. Options row_sep, col_sep, and quote_char are used for both parsing and generating and are described under Options for Parsing, above; the others are described in detail below.

Option write_headers

Specifies the boolean that determines whether a header row is included in the output; ignored if there are no headers.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:write_headers)

Without write_headers:

file_path = 't.csv'
CSV.open(file_path, 'w', :headers => ['Name', 'Value']) do |csv|
  csv << ['foo', '0']
end
CSV.open(file_path) do |csv|
  csv.shift
end

With write_headers:

CSV.open(file_path, 'w', :write_headers => true, :headers => ['Name', 'Value']) do |csv|
  csv << ['foo', '0']
end
CSV.open(file_path) do |csv|
  csv.shift
end

Option force_quotes

Specifies the boolean that determines whether each output field is to be double-quoted.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:force_quotes)

For examples in this section:

ary = ['foo', 0, nil]

Using the default, false:

str = CSV.generate_line(ary)
str

Using true:

str = CSV.generate_line(ary, force_quotes: true)
str

Option quote_empty

Specifies the boolean that determines whether an empty value is to be double-quoted.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:quote_empty)

With the default true:

CSV.generate_line(['"', ""])

With false:

CSV.generate_line(['"', ""], quote_empty: false)

Option write_converters

Specifies converters to be used in generating fields. See Write Converters

Default value:

CSV::DEFAULT_OPTIONS.fetch(:write_converters)

With no write converter:

str = CSV.generate_line(["\na\n", "\tb\t", " c "])
str

With a write converter:

strip_converter = proc {|field| field.strip }
str = CSV.generate_line(["\na\n", "\tb\t", " c "], write_converters: strip_converter)
str

With two write converters (called in order):

upcase_converter = proc {|field| field.upcase }
downcase_converter = proc {|field| field.downcase }
write_converters = [upcase_converter, downcase_converter]
str = CSV.generate_line(['a', 'b', 'c'], write_converters: write_converters)
str

See also Write Converters


Raises an exception if the converter returns a value that is neither nil nor String-convertible:

bad_converter = proc {|field| BasicObject.new }

CSV.generate_line(['a', 'b', 'c'], write_converters: bad_converter)

Option write_nil_value

Specifies the object that is to be substituted for each nil-valued field.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:write_nil_value)

Without the option:

str = CSV.generate_line(['a', nil, 'c', nil])
str

With the option:

str = CSV.generate_line(['a', nil, 'c', nil], write_nil_value: "x")
str

Option write_empty_value

Specifies the object that is to be substituted for each field that has an empty String.

Default value:

CSV::DEFAULT_OPTIONS.fetch(:write_empty_value)

Without the option:

str = CSV.generate_line(['a', '', 'c', ''])
str

With the option:

str = CSV.generate_line(['a', '', 'c', ''], write_empty_value: "x")
str

CSV with Headers

CSV lets you specify the column names of a CSV file, whether they are in the data or provided separately. If headers are specified, reading methods return an instance of CSV::Table, consisting of CSV::Row objects.

data = CSV.parse(<<~ROWS, headers: true)
  Name,Department,Salary
  Bob,Engineering,1000
  Jane,Sales,2000
  John,Management,5000
ROWS

data.class
data.first
data.first.to_h

data = CSV.parse('Bob,Engineering,1000', headers: %i[name department salary])
data.first

Converters

By default, each value (field or header) parsed by CSV is formed into a String. You can use a field converter or header converter to intercept and modify the parsed values:

Also by default, each value to be written during generation is written 'as-is'. You can use a write converter to modify values before writing.

Specifying Converters

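You can specify converters for parsing or generating in the options argument to various CSV methods: option converters for parsed field values, option header_converters for parsed header values, and option write_converters for values to be written (generated).

There are three forms for specifying converters: a converter proc (executable code to be used for conversion), a converter name (the name of a stored converter), and a converter list (an Array of converter procs, converter names, and converter lists).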

Converter Procs

This converter proc, strip_converter, accepts a value field and returns field.strip:

strip_converter = proc {|field| field.strip }

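In this call to CSV.parse, the keyword argument converters: strip_converter specifies that proc strip_converter is to be called for each parsed field, and that the converter's return value is to replace the field value.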

Example:

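string = " foo , 0 \n bar , 1 \n baz , 2 \n"
array = CSV.parse(string, converters: strip_converter)
array

A converter proc can receive a second argument, field_info, that contains details about the field. This modified strip_converter displays its arguments:

strip_converter = proc do |field, field_info|
  p [field, field_info]
  field.strip
end
string = " foo , 0 \n bar , 1 \n baz , 2 \n"
array = CSV.parse(string, converters: strip_converter)
array

Output:

[" foo ", #<struct CSV::FieldInfo index=0, line=1, header=nil>]
[" 0 ", #<struct CSV::FieldInfo index=1, line=1, header=nil>]
[" bar ", #<struct CSV::FieldInfo index=0, line=2, header=nil>]
[" 1 ", #<struct CSV::FieldInfo index=1, line=2, header=nil>]
[" baz ", #<struct CSV::FieldInfo index=0, line=3, header=nil>]
[" 2 ", #<struct CSV::FieldInfo index=1, line=3, header=nil>]

Each CSV::FieldInfo object shows the zero-based index of the field in its row, the line of the data source the row is from, and the header for the column, when available.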
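The second argument is useful when a conversion should apply to some columns only. A minimal sketch, with a made-up converter name, that converts only the field at index 1 of each row:

```ruby
require 'csv'

# Hypothetical converter: convert only the second column (index 1)
# to an Integer and leave every other field unchanged.
second_column_to_integer = proc do |field, field_info|
  field_info.index == 1 ? Integer(field) : field
end

rows = CSV.parse("foo,0\nbar,1\nbaz,2\n", converters: second_column_to_integer)
p rows # prints [["foo", 0], ["bar", 1], ["baz", 2]]
```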

Stored Converters

A converter may be given a name and stored in a structure where the parsing methods can find it by name.

The storage structure for field converters is the Hash CSV::Converters. It has several built-in converter procs, including :integer, :float, :date, and :date_time.

This example creates a converter proc, then stores it:

strip_converter = proc {|field| field.strip }
CSV::Converters[:strip] = strip_converter

Then the parsing method call can refer to the converter by its name, :strip:

string = " foo , 0 \n bar , 1 \n baz , 2 \n"
array = CSV.parse(string, converters: :strip)
array

The storage structure for header converters is the Hash CSV::HeaderConverters, which works in the same way. It also has built-in converter procs, including :downcase and :symbol.

There is no such storage structure for write converters.

Converter Lists

A converter list is an Array that may include any assortment of converter procs, names of stored converters, and nested converter lists.

Examples:

numeric_converters = [:integer, :float]
date_converters = [:date, :date_time]
[numeric_converters, strip_converter]
[strip_converter, date_converters, :float]

Like a converter proc, a converter list may be named and stored in either CSV::Converters or CSV::HeaderConverters:

CSV::Converters[:custom] = [strip_converter, date_converters, :float]
CSV::HeaderConverters[:custom] = [:downcase, :symbol]

There are two built-in converter lists:

CSV::Converters[:numeric]
CSV::Converters[:all]

Field Converters

With no conversion, all parsed fields in all rows become Strings:

string = "foo,0\nbar,1\nbaz,2\n"
ary = CSV.parse(string)
ary

When you specify a field converter, each parsed field is passed to the converter; its return value becomes the stored value for the field. A converter might, for example, convert an integer embedded in a String into a true Integer. (In fact, that's what built-in field converter :integer does.)

There are three ways to use field converters.
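Based on the options and methods described elsewhere in this document (option converters, CSV.new, and method convert), the three ways can be sketched as follows. The variable names are illustrative only.

```ruby
require 'csv'

string = "foo,0\nbar,1\nbaz,2\n"
expected = [["foo", 0], ["bar", 1], ["baz", 2]]

# 1. Option converters with a parsing method.
via_parse = CSV.parse(string, converters: :integer)

# 2. Option converters with a new CSV instance.
via_new = CSV.new(string, converters: :integer).read

# 3. Method convert, called on a CSV instance before reading.
csv = CSV.new(string)
csv.convert(:integer)
via_convert = csv.read

p [via_parse, via_new, via_convert].all? { |rows| rows == expected } # prints true
```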

Installing a field converter does not affect already-read rows:

csv = CSV.new(string)
csv.shift

csv.convert(:integer)
csv.converters
csv.read

There are additional built-in converters, and custom converters are also supported.

Built-In Field Converters

The built-in field converters are in Hash CSV::Converters:

Display:

CSV::Converters.each_pair do |name, value|
  if value.kind_of?(Proc)
    p [name, value.class]
  else
    p [name, value]
  end
end

Output:

[:integer, Proc]
[:float, Proc]
[:numeric, [:integer, :float]]
[:date, Proc]
[:date_time, Proc]
[:all, [:date_time, :numeric]]

Each of these converters transcodes values to UTF-8 before attempting conversion. If a value cannot be transcoded to UTF-8 the conversion will fail and the value will remain unconverted.

Converter :integer converts each field that Integer() accepts:

data = '0,1,2,x'

csv = CSV.parse_line(data)
csv

csv = CSV.parse_line(data, converters: :integer)
csv

Converter :float converts each field that Float() accepts:

data = '1.0,3.14159,x'

csv = CSV.parse_line(data)
csv

csv = CSV.parse_line(data, converters: :float)
csv

Converter :numeric converts with both :integer and :float.

Converter :date converts each field that Date::parse accepts:

data = '2001-02-03,x'

csv = CSV.parse_line(data)
csv

csv = CSV.parse_line(data, converters: :date)
csv

Converter :date_time converts each field that DateTime::parse accepts:

data = '2020-05-07T14:59:00-05:00,x'

csv = CSV.parse_line(data)
csv

csv = CSV.parse_line(data, converters: :date_time)
csv

Converter :all converts with both :date_time and :numeric.

As seen above, method convert adds converters to a CSV instance, and method converters returns an Array of the converters in effect:

csv = CSV.new('0,1,2')
csv.converters
csv.convert(:integer)
csv.converters
csv.convert(:date)
csv.converters

Custom Field Converters

You can define a custom field converter:

strip_converter = proc {|field| field.strip }
string = " foo , 0 \n bar , 1 \n baz , 2 \n"
array = CSV.parse(string, converters: strip_converter)
array

You can register the converter in Converters Hash, which allows you to refer to it by name:

CSV::Converters[:strip] = strip_converter
string = " foo , 0 \n bar , 1 \n baz , 2 \n"
array = CSV.parse(string, converters: :strip)
array

Header Converters

Header converters operate only on headers (and not on other rows).

There are three ways to use header converters; these examples use built-in header converter :downcase, which downcases each parsed header.

Built-In Header Converters

The built-in header converters are in Hash CSV::HeaderConverters. The keys there are the names of the converters:

CSV::HeaderConverters.keys

Converter :downcase converts each header by downcasing it:

string = "Name,Count\nFoo,0\nBar,1\nBaz,2"
tbl = CSV.parse(string, headers: true, header_converters: :downcase)
tbl.class
tbl.headers

Converter :symbol converts each header by making it into a Symbol:

string = "Name,Count\nFoo,0\nBar,1\nBaz,2"
tbl = CSV.parse(string, headers: true, header_converters: :symbol)
tbl.headers

Details: the converter strips leading and trailing whitespace, downcases the header, replaces embedded whitespace with underscores, removes non-word characters, and makes the result into a Symbol.
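A small sketch of converter :symbol applied to headers that contain mixed case, spaces, and punctuation. The header text is made up for illustration.

```ruby
require 'csv'

# Hypothetical headers with capitals, an embedded space, and parentheses.
string = "First Name,Total (USD)\nfoo,0\n"

table = CSV.parse(string, headers: true, header_converters: :symbol)
p table.headers # prints [:first_name, :total_usd]
```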

Custom Header Converters

You can define a custom header converter:

upcase_converter = proc {|header| header.upcase }
string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
table = CSV.parse(string, headers: true, header_converters: upcase_converter)
table
table.headers

You can register the converter in HeaderConverters Hash, which allows you to refer to it by name:

CSV::HeaderConverters[:upcase] = upcase_converter
table = CSV.parse(string, headers: true, header_converters: :upcase)
table
table.headers

Write Converters

When you specify a write converter for generating CSV, each field to be written is passed to the converter; its return value becomes the new value for the field. A converter might, for example, strip whitespace from a field.

Using no write converter (all fields unmodified):

output_string = CSV.generate do |csv|
  csv << [' foo ', 0]
  csv << [' bar ', 1]
  csv << [' baz ', 2]
end
output_string

Using option write_converters with two custom write converters:

strip_converter = proc {|field| field.respond_to?(:strip) ? field.strip : field }
upcase_converter = proc {|field| field.respond_to?(:upcase) ? field.upcase : field }
write_converters = [strip_converter, upcase_converter]
output_string = CSV.generate(write_converters: write_converters) do |csv|
  csv << [' foo ', 0]
  csv << [' bar ', 1]
  csv << [' baz ', 2]
end
output_string

Character Encodings (M17n or Multilingualization)

This new CSV parser is m17n savvy. The parser works in the Encoding of the IO or String object being read from or written to. Your data is never transcoded (unless you ask Ruby to transcode it for you) and will literally be parsed in the Encoding it is in. Thus CSV will return Arrays or Rows of Strings in the Encoding of your data. This is accomplished by transcoding the parser itself into your Encoding.

Some transcoding must take place, of course, to accomplish this multiencoding support. For example, :col_sep, :row_sep, and :quote_char must be transcoded to match your data. Hopefully this makes the entire process feel transparent, since CSV's defaults should just magically work for your data. However, you can set these values manually in the target Encoding to avoid the translation.

It's also important to note that while all of CSV's core parser is now Encoding agnostic, some features are not. For example, the built-in converters will try to transcode data to UTF-8 before making conversions. Again, you can provide custom converters that are aware of your Encodings to avoid this translation. It's just too hard for me to support native conversions in all of Ruby's Encodings.

Anyway, the practical side of this is simple: make sure IO and String objects passed into CSV have the proper Encoding set and everything should just work. CSV methods that allow you to open IO objects (CSV::foreach(), CSV::open(), CSV::read(), and CSV::readlines()) do allow you to specify the Encoding.
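A hedged sketch of both points, using made-up ISO-8859-1 data: a String parsed directly keeps its Encoding, and a file can be read with an explicit encoding: option (here transcoding to UTF-8 on read).

```ruby
require 'csv'
require 'tempfile'

# A String in ISO-8859-1: parsed fields keep that Encoding.
latin1 = "caf\u00E9,1\n".encode("ISO-8859-1")
row = CSV.parse_line(latin1)
puts row.first.encoding.name # prints ISO-8859-1

# A temporary file holding the same bytes, read as ISO-8859-1
# and transcoded to UTF-8 by the encoding: option.
file = Tempfile.new(['latin1_example', '.csv'])
file.binmode
file.write(latin1)
file.close

rows = CSV.read(file.path, encoding: "ISO-8859-1:UTF-8")
puts rows.first.first.encoding.name # prints UTF-8

file.unlink
```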

One minor exception comes when generating CSV into a String with an Encoding that is not ASCII compatible. There's no existing data for CSV to use to prepare itself and thus you will probably need to manually specify the desired Encoding for most of those cases. It will try to guess using the fields in a row of output though, when using CSV::generate_line() or Array#to_csv().

I try to point out any other Encoding issues in the documentation of methods as they come up.

This has been tested to the best of my ability with all non-“dummy” Encodings Ruby ships with. However, it is brave new code and may have some bugs. Please feel free to report any issues you find with it.
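As a small illustration of the behavior described above, the following sketch parses ISO-8859-1 data and checks the Encoding of the returned fields. The sample data and variable names are made up for this example.

```ruby
require 'csv'

# Build a small CSV document in ISO-8859-1 rather than UTF-8.
latin1_data = "caf\u00E9,1\nna\u00EFve,2\n".encode(Encoding::ISO_8859_1)

rows = CSV.parse(latin1_data)

# Collect the encodings of all parsed fields; the data was not transcoded.
field_encodings = rows.flatten.map(&:encoding).uniq
p field_encodings

# Transcode explicitly only where UTF-8 is needed.
utf8_field = rows[0][0].encode(Encoding::UTF_8)
p utf8_field
```

If UTF-8 is needed everywhere, transcoding the input once before parsing is usually simpler than transcoding each field afterwards.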

Constants

ConverterEncoding

The encoding used by all converters.

Converters

A Hash containing the names and Procs for the built-in field converters. See Built-In Field Converters.

This Hash is intentionally left unfrozen, and may be extended with custom field converters. See Custom Field Converters.
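A minimal sketch of extending this Hash follows. The converter name :strip_whitespace is a hypothetical name chosen for this example, not one the library defines.

```ruby
require 'csv'

# Register a custom field converter under a hypothetical name.
CSV::Converters[:strip_whitespace] = proc do |field|
  field.respond_to?(:strip) ? field.strip : field
end

# Use the converter by name, alone or chained with a built-in one.
stripped = CSV.parse_line(" foo , 0 ", converters: :strip_whitespace)
chained  = CSV.parse_line(" foo , 0 ", converters: [:strip_whitespace, :integer])
p stripped
p chained
```

Because the Hash is shared by the whole process, a registered name is visible to every later call that names it.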

DEFAULT_OPTIONS

Default values for method options.

DateMatcher

A Regexp used to find and convert some common Date formats.

DateTimeMatcher

A Regexp used to find and convert some common DateTime formats.

FieldInfo

A FieldInfo Struct contains details about a field's position in the data source it was read from. CSV will pass this Struct to some blocks that make decisions based on field structure. See CSV.convert_fields() for an example.

index

The zero-based index of the field in its row.

line

The line of the data source this row is from.

header

The header for the column, when available.

HeaderConverters

A Hash containing the names and Procs for the built-in header converters. See Built-In Header Converters.

This Hash is intentionally left unfrozen, and may be extended with custom header converters. See Custom Header Converters.

VERSION

The version of the installed library.

Attributes

encoding[R]

csv.encoding → encoding

Returns the encoding used for parsing and generating; see Character Encodings (M17n or Multilingualization):

CSV.new('').encoding

Public Class Methods

filter(**options) {|row| ... }

filter(in_string, **options) {|row| ... }

filter(in_io, **options) {|row| ... }

filter(in_string, out_string, **options) {|row| ... }

filter(in_string, out_io, **options) {|row| ... }

filter(in_io, out_string, **options) {|row| ... }

filter(in_io, out_io, **options) {|row| ... }

Reads CSV input and writes CSV output.

For each input row:

Arguments:

Example:

in_string = "foo,0\nbar,1\nbaz,2\n" out_string = '' CSV.filter(in_string, out_string) do |row| row[0] = row[0].upcase row[1] *= 4 end out_string
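Options whose names begin with in_ or input_ apply only to the input, and those beginning with out_ or output_ apply only to the output. A brief sketch, with sample data made up for this example, that reads semicolon-separated input and writes tab-separated output:

```ruby
require 'csv'

in_string = "foo;0\nbar;1\nbaz;2\n"
out_string = +''   # unfrozen String that receives the output

# in_col_sep is used only for parsing, out_col_sep only for generating.
CSV.filter(in_string, out_string, in_col_sep: ';', out_col_sep: "\t") do |row|
  row[0] = row[0].upcase
end

print out_string
```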

def filter(input=nil, output=nil, **options)
  # split the options into input-only, output-only, and shared options
  in_options, out_options = Hash.new, {row_sep: $INPUT_RECORD_SEPARATOR}
  options.each do |key, value|
    case key.to_s
    when /\Ain(?:put)?_(.+)\Z/
      in_options[$1.to_sym] = value
    when /\Aout(?:put)?_(.+)\Z/
      out_options[$1.to_sym] = value
    else
      in_options[key] = value
      out_options[key] = value
    end
  end

  # wrap the input and output
  input = new(input || ARGF, **in_options)
  output = new(output || $stdout, **out_options)

  # pass the headers through by hand when both sides use them
  need_manual_header_output =
    (in_options[:headers] and
     out_options[:headers] == true and
     out_options[:write_headers])
  if need_manual_header_output
    first_row = input.shift
    if first_row
      if first_row.is_a?(Row)
        headers = first_row.headers
        yield headers
        output << headers
      end
      yield first_row
      output << first_row
    end
  end

  # read, yield, write
  input.each do |row|
    yield row
    output << row
  end
end

foreach(path, mode='r', **options) {|row| ... }

foreach(io, mode='r', **options) {|row| ... }

foreach(path, mode='r', headers: ..., **options) {|row| ... }

foreach(io, mode='r', headers: ..., **options) {|row| ... }

foreach(path, mode='r', **options) → new_enumerator

foreach(io, mode='r', **options) → new_enumerator

Calls the block with each row read from source path or io.

Without option headers, returns each row as an Array object.

These examples assume prior execution of:

string = "foo,0\nbar,1\nbaz,2\n" path = 't.csv' File.write(path, string)

Read rows from a file at path:

CSV.foreach(path) {|row| p row }

Output:

["foo", "0"] ["bar", "1"] ["baz", "2"]

Read rows from an IO object:

File.open(path) do |file| CSV.foreach(file) {|row| p row } end

Output:

["foo", "0"] ["bar", "1"] ["baz", "2"]

Returns a new Enumerator if no block given:

CSV.foreach(path) CSV.foreach(File.open(path))
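The returned Enumerator can be combined with the usual Enumerable methods. A short sketch, assuming a scratch file in the system temporary directory (the file name is arbitrary):

```ruby
require 'csv'
require 'tmpdir'

path = File.join(Dir.tmpdir, 'csv_foreach_enum_example.csv')
File.write(path, "foo,0\nbar,1\nbaz,2\n")

# With no block, foreach returns an Enumerator over the rows.
rows = CSV.foreach(path)
names = rows.map { |row| row[0] }

# Take only the first two rows.
first_two = CSV.foreach(path).first(2)

p names
p first_two
```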

Issues a warning if an encoding is unsupported:

CSV.foreach(File.open(path), encoding: 'foo:bar') {|row| }

Output:

warning: Unsupported encoding foo ignored warning: Unsupported encoding bar ignored

With option headers, returns each row as a CSV::Row object.

These examples assume prior execution of:

string = "Name,Count\nfoo,0\nbar,1\nbaz,2\n" path = 't.csv' File.write(path, string)

Read rows from a file at path:

CSV.foreach(path, headers: true) {|row| p row }

Output:

#<CSV::Row "Name":"foo" "Count":"0">
#<CSV::Row "Name":"bar" "Count":"1">
#<CSV::Row "Name":"baz" "Count":"2">

Read rows from an IO object:

File.open(path) do |file| CSV.foreach(file, headers: true) {|row| p row } end

Output:

#<CSV::Row "Name":"foo" "Count":"0">
#<CSV::Row "Name":"bar" "Count":"1">
#<CSV::Row "Name":"baz" "Count":"2">


Raises an exception if path is a String, but not the path to a readable file:

CSV.foreach('nosuch.csv') {|row| }

Raises an exception if io is an IO object, but not open for reading:

io = File.open(path, 'w') {|row| }

CSV.foreach(io) {|row| }

Raises an exception if mode is invalid:

CSV.foreach(path, 'nosuch') {|row| }

def foreach(path, mode="r", **options, &block)
  return to_enum(__method__, path, mode, **options) unless block_given?
  open(path, mode, **options) do |csv|
    csv.each(&block)
  end
end

generate(csv_string, **options) {|csv| ... }

generate(**options) {|csv| ... }


Creates a new CSV object via CSV.new(csv_string, **options); calls the block with the CSV object, which the block may modify; returns the String generated from the CSV object.

Note that a passed String is modified by this method. Pass csv_string.dup if the String must be preserved.

This method has one additional option: :encoding, which sets the base Encoding for the output if no csv_string is specified. CSV needs this hint if you plan to output non-ASCII compatible data.


Add lines:

input_string = "foo,0\nbar,1\nbaz,2\n" output_string = CSV.generate(input_string) do |csv| csv << ['bat', 3] csv << ['bam', 4] end output_string input_string output_string.equal?(input_string)

Add lines into new string, preserving old string:

input_string = "foo,0\nbar,1\nbaz,2\n" output_string = CSV.generate(input_string.dup) do |csv| csv << ['bat', 3] csv << ['bam', 4] end output_string input_string output_string.equal?(input_string)

Create lines from nothing:

output_string = CSV.generate do |csv| csv << ['foo', 0] csv << ['bar', 1] csv << ['baz', 2] end output_string
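Generating options are passed the same way. A small sketch that writes tab-separated output with every field quoted (the variable name is chosen for this example):

```ruby
require 'csv'

# col_sep and force_quotes are ordinary generating options.
tsv_string = CSV.generate(col_sep: "\t", force_quotes: true) do |csv|
  csv << ['foo', 0]
  csv << ['bar', 1]
end

print tsv_string
```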


Raises an exception if csv_string is not a String object:

CSV.generate(0)

def generate(str=nil, **options)
  encoding = options[:encoding]
  # append to the given String, or start from a new empty one
  if str
    str = StringIO.new(str)
    str.seek(0, IO::SEEK_END)
    str.set_encoding(encoding) if encoding
  else
    str = +""
    str.force_encoding(encoding) if encoding
  end
  csv = new(str, **options)
  yield csv
  csv.string
end

generate_line(ary)

generate_line(ary, **options)

Returns the String created by generating CSV from ary using the specified options.

Argument ary must be an Array.

Special options:

For other options, see Options for Generating.


Returns the String generated from an Array:

CSV.generate_line(['foo', '0'])
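Fields that contain the column separator or the quote character are quoted automatically, and nil becomes an empty field. A brief sketch with made-up data:

```ruby
require 'csv'

# Embedded commas and quotes are escaped; nil yields an empty field.
line = CSV.generate_line(['foo', 'bar,baz', 'say "hi"', nil])
print line

# Option row_sep chooses the line ending.
crlf_line = CSV.generate_line(['a', 'b'], row_sep: "\r\n")
p crlf_line
```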


Raises an exception if ary is not an Array:

CSV.generate_line(:foo)

def generate_line(row, **options)
  options = {row_sep: $INPUT_RECORD_SEPARATOR}.merge(options)
  str = +""
  if options[:encoding]
    str.force_encoding(options[:encoding])
  else
    # guess the output encoding from the String fields of the row
    fallback_encoding = nil
    output_encoding = nil
    row.each do |field|
      next unless field.is_a?(String)
      fallback_encoding ||= field.encoding
      next if field.ascii_only?
      output_encoding = field.encoding
      break
    end
    output_encoding ||= fallback_encoding
    if output_encoding
      str.force_encoding(output_encoding)
    end
  end
  (new(str, **options) << row).string
end

instance(string, **options)

instance(io = $stdout, **options)

instance(string, **options) {|csv| ... }

instance(io = $stdout, **options) {|csv| ... }

Creates or retrieves cached CSV objects. For arguments and options, see CSV.new.


With no block given, returns a CSV object.

The first call to instance creates and caches a CSV object:

s0 = 's0' csv0 = CSV.instance(s0) csv0.class

Subsequent calls to instance with that same string or io retrieve that same cached object:

csv1 = CSV.instance(s0) csv1.class csv1.equal?(csv0)

A subsequent call to instance with a different string or io creates and caches a different CSV object.

s1 = 's1' csv2 = CSV.instance(s1) csv2.equal?(csv0)

All the cached objects remain available:

csv3 = CSV.instance(s0) csv3.equal?(csv0) csv4 = CSV.instance(s1) csv4.equal?(csv2)


When a block is given, calls the block with the created or retrieved CSV object; returns the block's return value:

CSV.instance(s0) {|csv| :foo }

def instance(data = $stdout, **options)
  # build a signature from the data object and the options
  sig = [data.object_id] +
        options.values_at(*DEFAULT_OPTIONS.keys.sort_by { |sym| sym.to_s })

  # fetch or create the instance for this signature
  @@instances ||= Hash.new
  instance = (@@instances[sig] ||= new(data, **options))

  if block_given?
    yield instance
  else
    instance
  end
end

new(string)

new(io)

new(string, **options)

new(io, **options)

Returns the new CSV object created using string or io and the specified options.

In addition to the CSV instance methods, several IO methods are delegated. See Delegated Methods.


Create a CSV object from a String object:

csv = CSV.new('foo,0') csv

Create a CSV object from a File object:

File.write('t.csv', 'foo,0') csv = CSV.new(File.open('t.csv')) csv


Raises an exception if the argument is nil:

CSV.new(nil)

def initialize(data, col_sep: ",", row_sep: :auto, quote_char: '"', field_size_limit: nil, converters: nil, unconverted_fields: nil, headers: false, return_headers: false, write_headers: nil, header_converters: nil, skip_blanks: false, force_quotes: false, skip_lines: nil, liberal_parsing: false, internal_encoding: nil, external_encoding: nil, encoding: nil, nil_value: nil, empty_value: "", quote_empty: true, write_converters: nil, write_nil_value: nil, write_empty_value: "", strip: false) raise ArgumentError.new("Cannot parse nil as CSV") if data.nil?

if data.is_a?(String) @io = StringIO.new(data) @io.set_encoding(encoding || data.encoding) else @io = data end @encoding = determine_encoding(encoding, internal_encoding)

@base_fields_converter_options = { nil_value: nil_value, empty_value: empty_value, } @write_fields_converter_options = { nil_value: write_nil_value, empty_value: write_empty_value, } @initial_converters = converters @initial_header_converters = header_converters @initial_write_converters = write_converters

@parser_options = { column_separator: col_sep, row_separator: row_sep, quote_character: quote_char, field_size_limit: field_size_limit, unconverted_fields: unconverted_fields, headers: headers, return_headers: return_headers, skip_blanks: skip_blanks, skip_lines: skip_lines, liberal_parsing: liberal_parsing, encoding: @encoding, nil_value: nil_value, empty_value: empty_value, strip: strip, } @parser = nil @parser_enumerator = nil @eof_error = nil

@writer_options = { encoding: @encoding, force_encoding: (not encoding.nil?), force_quotes: force_quotes, headers: headers, write_headers: write_headers, column_separator: col_sep, row_separator: row_sep, quote_character: quote_char, quote_empty: quote_empty, }

@writer = nil writer if @writer_options[:write_headers] end

open(file_path, mode = "r", **options ) → new_csv

open(io, mode = "r", **options ) → new_csv

open(file_path, mode = "r", **options ) { |csv| ... } → object

open(io, mode = "r", **options ) { |csv| ... } → object

Possible options elements, in hash form:

:invalid => nil # raise error on invalid byte sequence (default)
:invalid => :replace # replace invalid byte sequence
:undef => :replace # replace undefined conversion
:replace => string # replacement string ("?" or "\uFFFD" if not specified)


These examples assume prior execution of:

string = "foo,0\nbar,1\nbaz,2\n" path = 't.csv' File.write(path, string)


With no block given, returns a new CSV object.

Create a CSV object using a file path:

csv = CSV.open(path) csv

Create a CSV object using an open File:

csv = CSV.open(File.open(path)) csv


With a block given, calls the block with the created CSV object; returns the block's return value:

Using a file path:

csv = CSV.open(path) {|csv| p csv} csv

Output:

Using an open File:

csv = CSV.open(File.open(path)) {|csv| p csv} csv

Output:


Raises an exception if the argument is not a String object or IO object:

CSV.open(:foo)
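The examples above open a file for reading. A short sketch of the write side follows, assuming a scratch file in the system temporary directory (the file name is arbitrary):

```ruby
require 'csv'
require 'tmpdir'

path = File.join(Dir.tmpdir, 'csv_open_write_example.csv')

# Mode 'w' opens the file for writing; the block form closes it afterwards.
CSV.open(path, 'w') do |csv|
  csv << ['foo', 0]
  csv << ['bar', 1]
end

written = File.read(path)
rows = CSV.open(path) { |csv| csv.read }

print written
p rows
```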

def open(filename, mode="r", **options)
  # open the File with no newline decorator
  file_opts = {universal_newline: false}.merge(options)
  options.delete(:invalid)
  options.delete(:undef)
  options.delete(:replace)

  begin
    f = File.open(filename, mode, **file_opts)
  rescue ArgumentError => e
    raise unless /needs binmode/.match?(e.message) and mode == "r"
    mode = "rb"
    file_opts = {encoding: Encoding.default_external}.merge(file_opts)
    retry
  end
  begin
    csv = new(f, **options)
  rescue Exception
    f.close
    raise
  end

  # with a block, yield the CSV object and close it afterwards
  if block_given?
    begin
      yield csv
    ensure
      csv.close
    end
  else
    csv
  end
end

parse(string) → array_of_arrays

parse(io) → array_of_arrays

parse(string, headers: ..., **options) → csv_table

parse(io, headers: ..., **options) → csv_table

parse(string, **options) {|row| ... }

parse(io, **options) {|row| ... }

Parses string or io using the specified options.

Without option headers:

These examples assume prior execution of:

string = "foo,0\nbar,1\nbaz,2\n" path = 't.csv' File.write(path, string)


With no block given, returns an Array of Arrays formed from the source.

Parse a String:

a_of_a = CSV.parse(string) a_of_a

Parse an open File:

a_of_a = File.open(path) do |file| CSV.parse(file) end a_of_a


With a block given, calls the block with each parsed row:

Parse a String:

CSV.parse(string) {|row| p row }

Output:

["foo", "0"] ["bar", "1"] ["baz", "2"]

Parse an open File:

File.open(path) do |file| CSV.parse(file) {|row| p row } end

Output:

["foo", "0"] ["bar", "1"] ["baz", "2"]

With option headers:

These examples assume prior execution of:

string = "Name,Count\nfoo,0\nbar,1\nbaz,2\n" path = 't.csv' File.write(path, string)


With no block given, returns a CSV::Table object formed from the source.

Parse a String:

csv_table = CSV.parse(string, headers: ['Name', 'Count']) csv_table

Parse an open File:

csv_table = File.open(path) do |file| CSV.parse(file, headers: ['Name', 'Count']) end csv_table


With a block given, calls the block with each parsed row, which has been formed into a CSV::Row object:

Parse a String:

CSV.parse(string, headers: ['Name', 'Count']) {|row| p row }

Output:

Parse an open File:

File.open(path) do |file| CSV.parse(file, headers: ['Name', 'Count']) {|row| p row } end

Output:


Raises an exception if the argument is not a String object or IO object:

CSV.parse(:foo)
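When the first line of the data holds the headers, headers: true can be combined with field converters. A minimal sketch, with variable names chosen for this example:

```ruby
require 'csv'

string = "Name,Count\nfoo,0\nbar,1\nbaz,2\n"

# The first line supplies the headers; :integer converts numeric fields.
table = CSV.parse(string, headers: true, converters: :integer)

headers = table.headers
counts = table['Count']     # a column, selected by header
second_name = table[1]['Name']   # a row, selected by index

p headers
p counts
p second_name
```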

def parse(str, **options, &block)
  csv = new(str, **options)

  return csv.each(&block) if block_given?

  # with no block, read everything and close
  begin
    csv.read
  ensure
    csv.close
  end
end

parse_line(string) → new_array or nil

parse_line(io) → new_array or nil

parse_line(string, **options) → new_array or nil

parse_line(io, **options) → new_array or nil

parse_line(string, headers: true, **options) → csv_row or nil

parse_line(io, headers: true, **options) → csv_row or nil

Returns the data created by parsing the first line of string or io using the specified options.

Without option headers, returns the first row as a new Array.

These examples assume prior execution of:

string = "foo,0\nbar,1\nbaz,2\n" path = 't.csv' File.write(path, string)

Parse the first line from a String object:

CSV.parse_line(string)

Parse the first line from a File object:

File.open(path) do |file| CSV.parse_line(file) end

Returns nil if the argument is an empty String:

CSV.parse_line('')

With option headers, returns the first row as a CSV::Row object.

These examples assume prior execution of:

string = "Name,Count\nfoo,0\nbar,1\nbaz,2\n" path = 't.csv' File.write(path, string)

Parse the first line from a String object:

CSV.parse_line(string, headers: true)

Parse the first line from a File object:

File.open(path) do |file| CSV.parse_line(file, headers: true) end


Raises an exception if the argument is nil:

CSV.parse_line(nil)

def parse_line(line, **options)
  new(line, **options).each.first
end

read(source, **options) → array_of_arrays

read(source, headers: true, **options) → csv_table

Opens the given source with the given options (see CSV.open), reads the source (see CSV#read), and returns the result, which will be either an Array of Arrays or a CSV::Table.

Without headers:

string = "foo,0\nbar,1\nbaz,2\n" path = 't.csv' File.write(path, string) CSV.read(path)

With headers:

string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" path = 't.csv' File.write(path, string) CSV.read(path, headers: true)

def read(path, **options)
  open(path, **options) { |csv| csv.read }
end

readlines(source, **options)

Alias for CSV.read.

def readlines(path, **options)
  read(path, **options)
end

table(source, **options)

Calls CSV.read with source, options, and these default options: headers: true, converters: :numeric, header_converters: :symbol.

Returns a CSV::Table object.

Example:

string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" path = 't.csv' File.write(path, string) CSV.table(path)
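A brief sketch of what those defaults do, and of overriding one of them, assuming a scratch file in the system temporary directory (the file name is arbitrary):

```ruby
require 'csv'
require 'tmpdir'

path = File.join(Dir.tmpdir, 'csv_table_example.csv')
File.write(path, "Name,Value\nfoo,0\nbar,1\nbaz,2\n")

# Defaults: headers become Symbols, numeric fields become numbers.
table = CSV.table(path)
headers = table.headers
values = table[:value]

# Any default can be overridden; here the fields stay Strings.
raw_values = CSV.table(path, converters: nil)[:value]

p headers
p values
p raw_values
```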

def table(path, **options)
  default_options = {
    headers: true,
    converters: :numeric,
    header_converters: :symbol,
  }
  options = default_options.merge(options)
  read(path, **options)
end

Public Instance Methods

csv << row → self

Appends a row to self.


Append Arrays:

CSV.generate do |csv| csv << ['foo', 0] csv << ['bar', 1] csv << ['baz', 2] end

Append CSV::Rows:

headers = [] CSV.generate do |csv| csv << CSV::Row.new(headers, ['foo', 0]) csv << CSV::Row.new(headers, ['bar', 1]) csv << CSV::Row.new(headers, ['baz', 2]) end

Headers in CSV::Row objects are not appended:

headers = ['Name', 'Count'] CSV.generate do |csv| csv << CSV::Row.new(headers, ['foo', 0]) csv << CSV::Row.new(headers, ['bar', 1]) csv << CSV::Row.new(headers, ['baz', 2]) end


Raises an exception if row is not an Array or CSV::Row:

CSV.generate do |csv|

csv << :foo end

Raises an exception if the output stream is not opened for writing:

path = 't.csv' File.write(path, '') File.open(path) do |file| CSV.open(file) do |csv|

csv << ['foo', 0]

end end

def <<(row)
  writer << row
  self
end

binmode?()

def binmode?
  if @io.respond_to?(:binmode?)
    @io.binmode?
  else
    false
  end
end

col_sep → string

Returns the encoded column separator; used for parsing and writing; see Option col_sep:

CSV.new('').col_sep

def col_sep
  parser.column_separator
end

convert(converter_name) → array_of_procs

convert {|field, field_info| ... } → array_of_procs

See Field Converters.


With no block, installs a field converter:

csv = CSV.new('') csv.convert(:integer) csv.convert(:float) csv.convert(:date) csv.converters


The block, if given, is called for each field:

The examples here assume the prior execution of:

string = "foo,0\nbar,1\nbaz,2\n" path = 't.csv' File.write(path, string)

Example giving a block:

csv = CSV.open(path) csv.convert {|field, field_info| p [field, field_info]; field.upcase } csv.read

Output:

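["foo", #<struct CSV::FieldInfo index=0, line=1, header=nil>]
["0", #<struct CSV::FieldInfo index=1, line=1, header=nil>]
["bar", #<struct CSV::FieldInfo index=0, line=2, header=nil>]
["1", #<struct CSV::FieldInfo index=1, line=2, header=nil>]
["baz", #<struct CSV::FieldInfo index=0, line=3, header=nil>]
["2", #<struct CSV::FieldInfo index=1, line=3, header=nil>]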

The block need not return a String object:

csv = CSV.open(path) csv.convert {|field, field_info| field.to_sym } csv.read

If converter_name is given, the block is not called:

csv = CSV.open(path) csv.convert(:integer) {|field, field_info| fail 'Cannot happen' } csv.read
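The field_info argument makes it possible to convert only selected columns. A minimal sketch, with made-up data, that converts the second column (index 1) and leaves the first untouched:

```ruby
require 'csv'

csv = CSV.new("foo,0\nbar,1\nbaz,2\n")

# field_info.index is the zero-based position of the field in its row.
csv.convert do |field, field_info|
  field_info.index == 1 ? Integer(field) : field
end

rows = csv.read
p rows
```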


Raises a parse-time exception if converter_name is not the name of a built-in field converter:

csv = CSV.open(path)
csv.convert(:nosuch) # => [nil]
csv.read

def convert(name = nil, &converter)
  parser_fields_converter.add_converter(name, &converter)
end

converters → array

Returns an Array containing field converters; see Field Converters:

csv = CSV.new('') csv.converters csv.convert(:integer) csv.converters csv.convert(proc {|x| x.to_s }) csv.converters

def converters
  parser_fields_converter.map do |converter|
    name = Converters.rassoc(converter)
    name ? name.first : converter
  end
end

each → enumerator

each {|row| ...}

Calls the block with each successive row. The data source must be opened for reading.

Without headers:

string = "foo,0\nbar,1\nbaz,2\n" csv = CSV.new(string) csv.each do |row| p row end

Output:

["foo", "0"] ["bar", "1"] ["baz", "2"]

With headers:

string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" csv = CSV.new(string, headers: true) csv.each do |row| p row end

Output:

#<CSV::Row "Name":"foo" "Value":"0">
#<CSV::Row "Name":"bar" "Value":"1">
#<CSV::Row "Name":"baz" "Value":"2">


Raises an exception if the source is not opened for reading:

string = "foo,0\nbar,1\nbaz,2\n" csv = CSV.new(string) csv.close

csv.each do |row| p row end

def each(&block)
  parser_enumerator.each(&block)
end

eof?()

def eof?
  return false if @eof_error
  begin
    parser_enumerator.peek
    false
  rescue MalformedCSVError => error
    @eof_error = error
    false
  rescue StopIteration
    true
  end
end

Also aliased as: eof

field_size_limit → integer or nil

Returns the limit for field size; used for parsing; see Option field_size_limit:

CSV.new('').field_size_limit

def field_size_limit
  parser.field_size_limit
end

flock(*args)

def flock(*args)
  raise NotImplementedError unless @io.respond_to?(:flock)
  @io.flock(*args)
end

force_quotes? → true or false

Returns the value that determines whether all output fields are to be quoted; used for generating; see Option force_quotes:

CSV.new('').force_quotes?

def force_quotes?
  @writer_options[:force_quotes]
end

inspect → string

Returns a String showing certain properties of self:

string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" csv = CSV.new(string, headers: true) s = csv.inspect s

def inspect
  str = ["#<", self.class.to_s, " io_type:"]
  # show the type of the wrapped IO
  if    @io == $stdout then str << "$stdout"
  elsif @io == $stdin  then str << "$stdin"
  elsif @io == $stderr then str << "$stderr"
  else                      str << @io.class.to_s
  end
  # show the path of the IO, if available
  if @io.respond_to?(:path) and (p = @io.path)
    str << " io_path:" << p.inspect
  end
  # show the encoding
  str << " encoding:" << @encoding.name
  # show other attributes
  ["lineno", "col_sep", "row_sep", "quote_char"].each do |attr_name|
    if a = __send__(attr_name)
      str << " " << attr_name << ":" << a.inspect
    end
  end
  ["skip_blanks", "liberal_parsing"].each do |attr_name|
    if a = __send__("#{attr_name}?")
      str << " " << attr_name << ":" << a.inspect
    end
  end
  _headers = headers
  str << " headers:" << _headers.inspect if _headers
  str << ">"
  begin
    str.join('')
  rescue # any encoding error
    str.map do |s|
      e = Encoding::Converter.asciicompat_encoding(s.encoding)
      e ? s.encode(e) : s.force_encoding("ASCII-8BIT")
    end.join('')
  end
end

ioctl(*args)

def ioctl(*args)
  raise NotImplementedError unless @io.respond_to?(:ioctl)
  @io.ioctl(*args)
end

liberal_parsing? → true or false

Returns the value that determines whether illegal input is to be handled; used for parsing; see Option liberal_parsing:

CSV.new('').liberal_parsing?

def liberal_parsing?
  parser.liberal_parsing?
end

line → string

Returns the line most recently read:

string = "foo,0\nbar,1\nbaz,2\n" path = 't.csv' File.write(path, string) CSV.open(path) do |csv| csv.each do |row| p [csv.lineno, csv.line] end end

Output:

[1, "foo,0\n"] [2, "bar,1\n"] [3, "baz,2\n"]

lineno → integer

Returns the count of the rows parsed or generated.

Parsing:

string = "foo,0\nbar,1\nbaz,2\n" path = 't.csv' File.write(path, string) CSV.open(path) do |csv| csv.each do |row| p [csv.lineno, row] end end

Output:

[1, ["foo", "0"]] [2, ["bar", "1"]] [3, ["baz", "2"]]

Generating:

CSV.generate do |csv| p csv.lineno; csv << ['foo', 0] p csv.lineno; csv << ['bar', 1] p csv.lineno; csv << ['baz', 2] end

Output:

0 1 2

def lineno
  if @writer
    @writer.lineno
  else
    parser.lineno
  end
end

path()

def path
  @io.path if @io.respond_to?(:path)
end

quote_char → character

Returns the encoded quote character; used for parsing and writing; see Option quote_char:

CSV.new('').quote_char

def quote_char
  parser.quote_character
end

read → array or csv_table

Forms the remaining rows from self into a CSV::Table object if headers are in use, or into an Array of Arrays otherwise.

The data source must be opened for reading.

Without headers:

string = "foo,0\nbar,1\nbaz,2\n" path = 't.csv' File.write(path, string) csv = CSV.open(path) csv.read

With headers:

string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" path = 't.csv' File.write(path, string) csv = CSV.open(path, headers: true) csv.read


Raises an exception if the source is not opened for reading:

string = "foo,0\nbar,1\nbaz,2\n" csv = CSV.new(string) csv.close

csv.read

def read
  rows = to_a
  if parser.use_headers?
    Table.new(rows, headers: parser.headers)
  else
    rows
  end
end

rewind()

Rewinds the underlying IO object and resets CSV's lineno() counter.
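A short sketch of reading the same source twice (the variable names are chosen for this example):

```ruby
require 'csv'

csv = CSV.new("foo,0\nbar,1\nbaz,2\n")

first_pass = csv.read
at_end = csv.read      # nothing is left to read

csv.rewind             # back to the start of the data
second_pass = csv.read

p first_pass
p at_end
p second_pass
```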

def rewind
  @parser = nil
  @parser_enumerator = nil
  @eof_error = nil
  @writer.rewind if @writer
  @io.rewind
end

row_sep → string

Returns the encoded row separator; used for parsing and writing; see Option row_sep:

CSV.new('').row_sep

def row_sep
  parser.row_separator
end

shift → array, csv_row, or nil

Returns the next row of data as an Array if headers are not in use, or as a CSV::Row object if they are; returns nil when no rows remain.

The data source must be opened for reading.

Without headers:

string = "foo,0\nbar,1\nbaz,2\n" csv = CSV.new(string) csv.shift csv.shift csv.shift csv.shift

With headers:

string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n" csv = CSV.new(string, headers: true) csv.shift csv.shift csv.shift csv.shift


Raises an exception if the source is not opened for reading:

string = "foo,0\nbar,1\nbaz,2\n" csv = CSV.new(string) csv.close

csv.shift

def shift
  if @eof_error
    eof_error, @eof_error = @eof_error, nil
    raise eof_error
  end
  begin
    parser_enumerator.next
  rescue StopIteration
    nil
  end
end

skip_blanks? → true or false

Returns the value that determines whether blank lines are to be ignored; used for parsing; see Option skip_blanks:

CSV.new('').skip_blanks?

def skip_blanks?
  parser.skip_blanks?
end

skip_lines → regexp or nil

Returns the Regexp used to identify comment lines; used for parsing; see Option skip_lines:

CSV.new('').skip_lines

def skip_lines
  parser.skip_lines
end

stat(*args)

def stat(*args)
  raise NotImplementedError unless @io.respond_to?(:stat)
  @io.stat(*args)
end

to_i()

def to_i
  raise NotImplementedError unless @io.respond_to?(:to_i)
  @io.to_i
end

to_io()

def to_io
  @io.respond_to?(:to_io) ? @io.to_io : @io
end

unconverted_fields? → object

Returns the value that determines whether unconverted fields are to be available; used for parsing; see Option unconverted_fields:

CSV.new('').unconverted_fields?

def unconverted_fields?
  parser.unconverted_fields?
end

Private Instance Methods

build_fields_converter(initial_converters, options)

def build_fields_converter(initial_converters, options)
  fields_converter = FieldsConverter.new(options)
  normalize_converters(initial_converters).each do |name, converter|
    fields_converter.add_converter(name, &converter)
  end
  fields_converter
end

build_parser_fields_converter()

def build_parser_fields_converter
  specific_options = {
    builtin_converters: Converters,
  }
  options = @base_fields_converter_options.merge(specific_options)
  build_fields_converter(@initial_converters, options)
end

build_writer_fields_converter()

def build_writer_fields_converter
  build_fields_converter(@initial_write_converters,
                         @write_fields_converter_options)
end

convert_fields(fields, headers = false)

Processes fields with @converters, or @header_converters if headers is passed as true, returning the converted field set. Any converter that changes the field into something other than a String halts the pipeline of conversion for that field. This is primarily an efficiency shortcut.
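The halting behavior can be observed through the public API. In this sketch a pass-through converter (named spy here, a name chosen for the example) records every field it receives; it runs after :integer, so it never sees a field that :integer has already turned into an Integer.

```ruby
require 'csv'

seen = []
# A pass-through converter that records every field it is handed.
spy = proc { |field| seen << field; field }

row = CSV.parse_line("foo,1", converters: [:integer, spy])

p row
p seen
```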

def convert_fields(fields, headers = false)
  if headers
    header_fields_converter.convert(fields, nil, 0)
  else
    parser_fields_converter.convert(fields, @headers, lineno)
  end
end

determine_encoding(encoding, internal_encoding)

def determine_encoding(encoding, internal_encoding)
  # prefer the encoding of the wrapped IO, when it has one
  io_encoding = raw_encoding
  return io_encoding if io_encoding

  return Encoding.find(internal_encoding) if internal_encoding

  if encoding
    encoding, = encoding.split(":", 2) if encoding.is_a?(String)
    return Encoding.find(encoding)
  end

  Encoding.default_internal || Encoding.default_external
end

normalize_converters(converters)

def normalize_converters(converters)
  converters ||= []
  unless converters.is_a?(Array)
    converters = [converters]
  end
  converters.collect do |converter|
    case converter
    when Proc
      [nil, converter]
    else
      [converter, nil]
    end
  end
end

parser()

def parser
  @parser ||= Parser.new(@io, parser_options)
end

parser_enumerator()

def parser_enumerator
  @parser_enumerator ||= parser.parse
end

parser_fields_converter()

def parser_fields_converter
  @parser_fields_converter ||= build_parser_fields_converter
end

parser_options()

def parser_options
  @parser_options.merge(header_fields_converter: header_fields_converter,
                        fields_converter: parser_fields_converter)
end

raw_encoding()

Returns the encoding of the internal IO object.

def raw_encoding
  if @io.respond_to? :internal_encoding
    @io.internal_encoding || @io.external_encoding
  elsif @io.respond_to? :encoding
    @io.encoding
  else
    nil
  end
end

writer()

def writer
  @writer ||= Writer.new(@io, writer_options)
end

writer_fields_converter()

def writer_fields_converter
  @writer_fields_converter ||= build_writer_fields_converter
end

writer_options()

def writer_options
  @writer_options.merge(header_fields_converter: header_fields_converter,
                        fields_converter: writer_fields_converter)
end