(2022-09-19 18:33)

Row Checks

Duplicate Row

This check reports rows that duplicate an earlier row. Note that detecting duplicates can consume a lot of memory on large files. Here is an example.

Example

from pprint import pprint
from frictionless import validate, checks

source = b"header\nvalue\nvalue"
report = validate(source, format="csv", checks=[checks.duplicate_row()])
pprint(report.flatten(["type", "message"]))
[['duplicate-row',
  'Row at position 3 is duplicated: the same as row at position "2"']]
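
The memory note above can be illustrated with a small standard-library sketch. It is not Frictionless's actual implementation; the helper name `find_duplicate_rows` is hypothetical, and the sketch only assumes that duplicate detection has to remember every distinct row seen so far, which is why memory grows with file size.

```python
import csv
import io


def find_duplicate_rows(text):
    """Return (position, first_position) pairs for repeated data rows.

    Hypothetical helper for illustration only, not part of Frictionless.
    """
    seen = {}  # row contents -> position where the row was first seen
    duplicates = []
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # skip the header, which sits at position 1
    for position, cells in enumerate(reader, start=2):
        key = tuple(cells)
        if key in seen:
            duplicates.append((position, seen[key]))
        else:
            # Every distinct row is kept here, so memory grows with the file.
            seen[key] = position
    return duplicates


result = find_duplicate_rows("header\nvalue\nvalue")
print(result)
```

For the same input as the example above, the sketch pairs the row at position 3 with the row at position 2.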

Reference

checks.duplicate_row (class)

Check for duplicate rows. This check can be enabled using the `checks` parameter of the `validate` function.

Signature

(*, title: Optional[str] = None, description: Optional[str] = None) -> None

Parameters
  • title (Optional[str])
  • description (Optional[str])

checks.duplicate_row.type (property)

Signature

ClassVar[str]

checks.duplicate_row.Errors (property)

Signature

ClassVar[List[Type[Error]]]

checks.duplicate_row.title (property)

Signature

Optional[str]

checks.duplicate_row.description (property)

Signature

Optional[str]

Row Constraint

This check is the most powerful one: it uses the external simpleeval package, which lets you evaluate Python expressions against data rows. Here is an example.

Example

from pprint import pprint
from frictionless import validate, checks

source = [
    ["row", "salary", "bonus"],
    [2, 1000, 200],
    [3, 2500, 500],
    [4, 1300, 500],
    [5, 5000, 1000],
]
report = validate(source, checks=[checks.row_constraint(formula="salary == bonus * 5")])
pprint(report.flatten(["type", "message"]))
[['row-constraint',
  'The row at position 4 has an error: the row constraint to conform is '
  '"salary == bonus * 5"']]
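
To make the mechanics of the example above concrete, here is a minimal sketch of how a formula can be evaluated per row with the column names bound to the row's values. It is not the library's implementation: the helper `check_row_constraint` is hypothetical, and Python's built-in `eval` with empty builtins stands in for simpleeval purely for illustration (it should not be used on untrusted formulas).

```python
def check_row_constraint(rows, formula):
    """Return positions of data rows for which the formula is not true.

    Hypothetical helper; `eval` is a stand-in for simpleeval here.
    """
    header, *data = rows
    failures = []
    for position, cells in enumerate(data, start=2):  # header is position 1
        names = dict(zip(header, cells))  # e.g. {"salary": 1000, "bonus": 200}
        try:
            ok = bool(eval(formula, {"__builtins__": {}}, names))
        except Exception:
            ok = False  # a formula that cannot be evaluated counts as a failure
        if not ok:
            failures.append(position)
    return failures


source = [
    ["row", "salary", "bonus"],
    [2, 1000, 200],
    [3, 2500, 500],
    [4, 1300, 500],
    [5, 5000, 1000],
]
failures = check_row_constraint(source, "salary == bonus * 5")
print(failures)
```

Only the row at position 4 fails, since 1300 is not equal to 500 * 5, which matches the report above.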

Reference

checks.row_constraint (class)

Check that every row satisfies the provided Python expression.

Signature

(*, title: Optional[str] = None, description: Optional[str] = None, formula: str) -> None

Parameters
  • title (Optional[str])
  • description (Optional[str])
  • formula (str)

checks.row_constraint.formula (property)

The Python expression that every row must satisfy, for example `salary == bonus * 5`.

Signature

str

This is a beta version of Frictionless Framework (v5). See the Frictionless Framework (v4) docs for the version that pip currently installs by default.