Excel is a very popular tabular data format that usually has xlsx
(newer) and xls
(older) file extensions. Frictionless supports Excel files extensively.
pip install frictionless[excel]
pip install 'frictionless[excel]' # for zsh shell
You can read this format using Package/Resource
, for example:
from pprint import pprint
from frictionless import Resource
resource = Resource(path='table.xlsx')
pprint(resource.read_rows())
[{'id': 1, 'name': 'english'}, {'id': 2, 'name': '中国人'}]
The same is actual for writing:
from frictionless import Resource
source = Resource(data=[['id', 'name'], [1, 'english'], [2, 'german']])
target = source.write('table-output.xlsx')
print(target)
print(target.to_view())
There is a dialect to configure how Frictionless read and write files in this format. For example:
from frictionless import Resource, formats
resource = Resource(data=[['id', 'name'], [1, 'english'], [2, 'german']])
resource.write('table-output-sheet.xls', control=formats.ExcelControl(sheet='My Table'))
Excel control representation. Control class to set params for Excel reader/writer.
(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, sheet: Union[str, int] = 1, workbook_cache: Optional[Any] = None, fill_merged_cells: bool = False, preserve_formatting: bool = False, adjust_floating_point_error: bool = False, stringified: bool = False) -> None
Name of the sheet from where to read or write data.
Union[str, int]
An empty dictionary which is used to handle workbook caching for remote workbooks. It stores the path to the temporary file while reading remote workbooks.
Optional[Any]
If True, it will unmerge and fill all merged cells by the visible value. Default value is False.
bool
If set to True, it preserves text formatting for numeric and temporal cells. If not set, it will return all cell value as string. Default value is False.
bool
If True, it corrects the Excel behavior regarding floating point numbers. For example: 274.65999999999997 -> 274.66 (When True).
bool
Stringifies all the cell values. Default value is False. Note that a table resource schema will still be applied and types coerced to match the schema (either provided or inferred) _after_ the rows are read as strings. To return all cells as strings then both set `stringified=True` and specify a schema that defines all fields to be of type string (see #1659).
bool