(2024-11-07 15:17)

Data Actions

Describe

Describe is a high-level function (action) that infers metadata from a data source.

Example

from frictionless import describe

resource = describe('table.csv')
print(resource)
{'name': 'table',
 'type': 'table',
 'path': 'table.csv',
 'scheme': 'file',
 'format': 'csv',
 'mediatype': 'text/csv',
 'encoding': 'utf-8',
 'schema': {'fields': [{'name': 'id', 'type': 'integer'},
                       {'name': 'name', 'type': 'string'}]}}

Reference

describe (function)

Describe the data source

Signature

(source: Optional[Any] = None, *, name: Optional[str] = None, type: Optional[str] = None, stats: bool = False, **options: Any) -> Metadata

Parameters
  • source (Optional[Any]): data source
  • name (Optional[str]): resource name
  • type (Optional[str]): data type: "package", "resource", "dialect", or "schema"
  • stats (bool): if `True`, also infer the resource's stats
  • options (Any)
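
To make "inferring metadata" concrete, the following is a minimal standard-library sketch of the idea behind the schema shown in the example above. It is not how frictionless implements `describe`: the helper names `infer_type` and `sketch_describe` are hypothetical, only two field types are distinguished, and the CSV content is assumed from the example output on this page.

```python
import csv
import io


def infer_type(values):
    """Return 'integer' if every value parses as an int, otherwise 'string'."""
    try:
        for value in values:
            int(value)
    except ValueError:
        return "string"
    return "integer"


def sketch_describe(text):
    """Infer a tiny schema from CSV text (illustrative only, hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    fields = []
    for index, name in enumerate(header):
        column = [row[index] for row in body]
        fields.append({"name": name, "type": infer_type(column)})
    return {"fields": fields}


# Assumed content of table.csv, reconstructed from the examples on this page
schema = sketch_describe("id,name\n1,english\n2,中国人\n")
print(schema)
```

The real function also detects the scheme, format, media type, and encoding, which this sketch leaves out.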

Extract

Extract is a high-level function (action) that reads tabular data from a data source. The output is UTF-8 encoded.

Example

from pprint import pprint
from frictionless import extract

rows = extract('table.csv')
pprint(rows)
{'table': [{'id': 1, 'name': 'english'}, {'id': 2, 'name': '中国人'}]}

Reference

extract (function)

Extract rows

Signature

(source: Optional[Any] = None, *, name: Optional[str] = None, type: Optional[str] = None, filter: Optional[types.IFilterFunction] = None, process: Optional[types.IProcessFunction] = None, limit_rows: Optional[int] = None, resource_name: Optional[str] = None, **options: Any)

Parameters
  • source (Optional[Any])
  • name (Optional[str]): extract only the resource with this name
  • type (Optional[str])
  • filter (Optional[types.IFilterFunction]): row filter function
  • process (Optional[types.IProcessFunction]): row processor function
  • limit_rows (Optional[int]): maximum number of rows to extract
  • resource_name (Optional[str])
  • options (Any)
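
The `filter`, `process`, and `limit_rows` parameters can be pictured with a plain-Python sketch. This is an illustration of the semantics, not the library's code: `sketch_extract` is a hypothetical helper, rows are ordinary dictionaries rather than frictionless row objects, and the ordering (filter first, then process, then limit) is an assumption.

```python
from itertools import islice


def sketch_extract(rows, *, filter=None, process=None, limit_rows=None):
    """Apply a row filter, a row processor, and a row limit, in that order."""
    # Keep only the rows for which the filter function returns a truthy value
    selected = (row for row in rows if filter is None or filter(row))
    # Convert each remaining row with the processor function, if one is given
    processed = (process(row) if process else row for row in selected)
    # islice with a stop of None yields every row, so no limit is applied
    return list(islice(processed, limit_rows))


# Rows taken from the example output above
rows = [{"id": 1, "name": "english"}, {"id": 2, "name": "中国人"}]

result = sketch_extract(
    rows,
    filter=lambda row: row["id"] > 1,       # keep rows with id greater than 1
    process=lambda row: {"id": row["id"]},  # keep only the id field
    limit_rows=1,
)
print(result)
```

With the library itself, the same callables would be passed as keyword arguments to `extract`.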

Validate

Validate is a high-level function (action) to validate data from a data source.

Example

from frictionless import validate

report = validate('table.csv')
print(report.valid)
True

Reference

validate (function)

Validate resource

Signature

(source: Optional[Any] = None, *, name: Optional[str] = None, type: Optional[str] = None, checklist: Union[frictionless.checklist.checklist.Checklist, str, NoneType] = None, checks: List[frictionless.checklist.check.Check] = [], pick_errors: List[str] = [], skip_errors: List[str] = [], limit_errors: int = 1000, limit_rows: Optional[int] = None, parallel: bool = False, resource_name: Optional[str] = None, **options: Any)

Parameters
  • source (typing.Optional[typing.Any]): a data source
  • name (typing.Optional[str])
  • type (typing.Optional[str]): source type - inquiry, package, resource, schema or table
  • checklist (typing.Union[frictionless.checklist.checklist.Checklist, str, NoneType])
  • checks (typing.List[frictionless.checklist.check.Check])
  • pick_errors (typing.List[str])
  • skip_errors (typing.List[str])
  • limit_errors (int)
  • limit_rows (typing.Optional[int])
  • parallel (bool)
  • resource_name (typing.Optional[str])
  • options (typing.Any)
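
One way to read `pick_errors`, `skip_errors`, and `limit_errors` is as successive filters over the errors found during validation. The sketch below assumes that reading and simplifies it: `sketch_select_errors` is a hypothetical helper, errors are plain dictionaries matched only by a `type` key, and the error type names are used purely as sample data.

```python
def sketch_select_errors(errors, *, pick_errors=(), skip_errors=(), limit_errors=1000):
    """Return the errors that survive picking, skipping, and the error limit."""
    selected = []
    for error in errors:
        # When pick_errors is non-empty, only the listed types are kept
        if pick_errors and error["type"] not in pick_errors:
            continue
        # Types listed in skip_errors are always dropped
        if error["type"] in skip_errors:
            continue
        selected.append(error)
        # Stop collecting once the limit is reached
        if len(selected) >= limit_errors:
            break
    return selected


# Hypothetical errors used only to exercise the sketch
found = [
    {"type": "blank-row", "row": 3},
    {"type": "type-error", "row": 4},
    {"type": "type-error", "row": 7},
]

skipped = sketch_select_errors(found, skip_errors=["blank-row"])
picked = sketch_select_errors(found, pick_errors=["type-error"], limit_errors=1)
print(len(skipped), len(picked))
```

In this simplified model, a report would count as valid when the selected list is empty.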

Transform

Transform is a high-level function (action) to transform tabular data from a data source.

Example

from frictionless import transform, steps

resource = transform('table.csv', steps=[steps.cell_set(field_name='name', value='new')])
print(resource.read_rows())
[{'id': 1, 'name': 'new'}, {'id': 2, 'name': 'new'}]

Reference

transform (function)

Transform resource

Signature

(source: Optional[Any] = None, *, type: Optional[str] = None, pipeline: Union[frictionless.pipeline.pipeline.Pipeline, str, NoneType] = None, steps: Optional[List[frictionless.pipeline.step.Step]] = None, **options: Any)

Parameters
  • source (typing.Optional[typing.Any]): data source
  • type (typing.Optional[str]): data type - package, resource or pipeline (default: infer)
  • pipeline (typing.Union[frictionless.pipeline.pipeline.Pipeline, str, NoneType])
  • steps (typing.Optional[typing.List[frictionless.pipeline.step.Step]]): transform steps
  • options (typing.Any)
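
The `steps` parameter describes a pipeline in which each step receives the rows produced by the previous one. A minimal plain-Python sketch of that idea follows. It is not the frictionless implementation: `sketch_transform` is a hypothetical helper, and the `cell_set` defined here is a local stand-in that only imitates the behaviour shown in the example above.

```python
def cell_set(field_name, value):
    """Build a step that sets one field to a fixed value in every row (stand-in)."""
    def step(rows):
        for row in rows:
            # Copy the row so the input data is left unchanged
            yield {**row, field_name: value}
    return step


def sketch_transform(rows, steps):
    """Feed the rows through each step in order and collect the result."""
    for step in steps:
        rows = step(rows)
    return list(rows)


# Rows matching the table.csv content used on this page
source = [{"id": 1, "name": "english"}, {"id": 2, "name": "中国人"}]

result = sketch_transform(source, steps=[cell_set(field_name="name", value="new")])
print(result)
```

Because every step is a generator, rows flow through the pipeline lazily and are only materialised by the final `list` call.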