(2023-01-25 11:55)

CSV Format

CSV is a file format that you can use in Frictionless for reading and writing. It is arguably the main Open Data format, so it is supported very well in Frictionless.

Reading Data

You can read this format using Package/Resource, for example:

from pprint import pprint
from frictionless import Resource

resource = Resource('table.csv')
pprint(resource.read_rows())
[{'id': 1, 'name': 'english'}, {'id': 2, 'name': '中国人'}]

Writing Data

The same applies to writing:

from frictionless import Resource

source = Resource(data=[['id', 'name'], [1, 'english'], [2, 'german']])
target = source.write('table-output.csv')
print(target)
print(target.to_view())
{'name': 'table-output',
 'type': 'table',
 'path': 'table-output.csv',
 'scheme': 'file',
 'format': 'csv',
 'mediatype': 'text/csv'}
+----+-----------+
| id | name      |
+====+===========+
|  1 | 'english' |
+----+-----------+
|  2 | 'german'  |
+----+-----------+

Configuration

There is a control to configure how Frictionless reads and writes files in this format. For example:

from frictionless import Resource, formats

resource = Resource(data=[['id', 'name'], [1, 'english'], [2, 'german']])
resource.write('tmp/table.csv', control=formats.CsvControl(delimiter=';'))

Reference

formats.CsvControl (class)

CSV dialect representation. Control class for setting the parameters of the CSV reader/writer.

Signature

(*, title: Optional[str] = None, description: Optional[str] = None, delimiter: str = ',', line_terminator: str = '\r\n', quote_char: str = '"', double_quote: bool = True, escape_char: Optional[str] = None, null_sequence: Optional[str] = None, skip_initial_space: bool = False) -> None

Parameters

  • title (Optional[str])
  • description (Optional[str])
  • delimiter (str)
  • line_terminator (str)
  • quote_char (str)
  • double_quote (bool)
  • escape_char (Optional[str])
  • null_sequence (Optional[str])
  • skip_initial_space (bool)

formats.CsvControl.delimiter (property)

Specify the delimiter used to separate fields while reading from or writing to a CSV file. Default value is ",". For example: delimiter=";"

Signature

str
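
To make the effect of the delimiter concrete, here is a minimal sketch that uses Python's standard `csv` module rather than Frictionless itself. It assumes the control's `delimiter` corresponds to the `delimiter` keyword of `csv.reader`; the sample text is made up for illustration.

```python
import csv
import io

# Semicolon-separated text standing in for a file on disk (illustrative data).
text = "id;name\n1;english\n2;german\n"

# With delimiter=";" each line is split on semicolons instead of commas.
rows = list(csv.reader(io.StringIO(text), delimiter=";"))
print(rows)
```

Note that the plain `csv` module returns every cell as a string; type casting of cells is not part of this sketch.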

formats.CsvControl.line_terminator (property)

Specify the line terminator for the CSV file while reading/writing. Default value is "\r\n". For example: line_terminator="\n"

Signature

str
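
The difference between the default terminator and "\n" can be seen in the raw output. This sketch uses Python's standard `csv` module and assumes `line_terminator` corresponds to its `lineterminator` keyword; the `dump` helper is hypothetical and exists only for this example.

```python
import csv
import io


def dump(rows, line_terminator):
    """Write rows to an in-memory buffer and return the raw CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator=line_terminator).writerows(rows)
    return buffer.getvalue()


rows = [["id", "name"], [1, "english"]]

default_output = dump(rows, "\r\n")  # same as the documented default
unix_output = dump(rows, "\n")

# repr() makes the otherwise invisible terminators visible.
print(repr(default_output))
print(repr(unix_output))
```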

formats.CsvControl.quote_char (property)

Specify the quote character for fields that contain a special character such as a comma, CR, LF, or double quote. Default value is '"'. For example: quote_char='|'

Signature

str

formats.CsvControl.double_quote (property)

Controls how a 'quote_char' appearing inside a field is itself quoted. When set to True, the 'quote_char' is doubled; otherwise the escape char is used as a prefix. Default value is True.

Signature

bool
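
As an illustration of quote doubling, the sketch below writes a field that contains the quote character and reads it back. It uses Python's standard `csv` module and assumes `double_quote` and `quote_char` correspond to its `doublequote` and `quotechar` keywords.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer, quotechar='"', doublequote=True, lineterminator="\n")
writer.writerow([1, 'say "hi"'])

# The embedded quotes are doubled and the whole field is wrapped in quotes.
line = buffer.getvalue()
print(line)

# Reading with the same settings restores the original field value.
restored = next(csv.reader(io.StringIO(line), quotechar='"', doublequote=True))
print(restored)
```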

formats.CsvControl.escape_char (property)

A one-character string used by the CSV writer to escape special characters. Default is None, which disables escaping. It is used to escape the 'quote_char' if double_quote is False.

Signature

Optional[str]
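
The interaction between escaping and quote doubling can be sketched with Python's standard `csv` module, assuming `escape_char` and `double_quote` correspond to its `escapechar` and `doublequote` keywords. With doubling turned off, the writer needs an escape character to emit a field that contains the quote character.

```python
import csv
import io

row = ["1", 'say "hi"']
options = {"doublequote": False, "escapechar": "\\", "lineterminator": "\n"}

buffer = io.StringIO()
csv.writer(buffer, **options).writerow(row)
line = buffer.getvalue()
print(line)  # each embedded quote is prefixed with a backslash

# Reading with the same options removes the escape characters again.
restored = next(csv.reader(io.StringIO(line), **options))

# Without an escape character the writer has no way to emit the quote.
try:
    csv.writer(io.StringIO(), doublequote=False).writerow(row)
    failed = False
except csv.Error:
    failed = True
```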

formats.CsvControl.null_sequence (property)

Specify the null sequence. It is not set by default. For example: \\N

Signature

Optional[str]

formats.CsvControl.skip_initial_space (property)

Ignores spaces following the delimiter if set to True. Default value is False. For example, the space after the comma in a CSV file header: "Name", "Team"

Signature

bool
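
The header example above can be reproduced with Python's standard `csv` module, assuming `skip_initial_space` corresponds to its `skipinitialspace` keyword. Without the option, the space after the comma prevents the second field from being treated as quoted, so the space and the quotes end up in the value.

```python
import csv
import io

# Quoted fields separated by a comma followed by a space (illustrative data).
text = '"Name", "Team"\n"Ann", "Red"\n'

strict = list(csv.reader(io.StringIO(text)))
relaxed = list(csv.reader(io.StringIO(text), skipinitialspace=True))

print(strict)   # second column keeps the leading space and the quotes
print(relaxed)  # second column is parsed as a normal quoted field
```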

formats.CsvControl.to_python (method)

Convert to Python's `csv.Dialect`.
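
For readers unfamiliar with `csv.Dialect`, the sketch below shows a hand-written dialect that mirrors the control's documented defaults with a semicolon delimiter. It is a stand-in written for illustration, not the object returned by `to_python`; the class name `SemicolonDialect` is hypothetical, and the mapping of control properties to dialect attributes is an assumption.

```python
import csv
import io


class SemicolonDialect(csv.Dialect):
    """Hypothetical dialect: documented defaults plus delimiter=';'."""

    delimiter = ";"
    lineterminator = "\r\n"
    quotechar = '"'
    doublequote = True
    escapechar = None
    skipinitialspace = False
    quoting = csv.QUOTE_MINIMAL  # required by csv.Dialect; assumed here


# A dialect bundles all settings so they can be passed around as one object.
text = "id;name\n1;english\n"
rows = list(csv.reader(io.StringIO(text), dialect=SemicolonDialect))
print(rows)
```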