(2024-11-22 08:02)

Row Checks

Duplicate Row

This check detects duplicate rows. Keep in mind that checking for duplicate rows can consume a lot of memory on large files. Here is an example.

Example

from pprint import pprint
from frictionless import validate, checks

source = b"header\nvalue\nvalue"
report = validate(source, format="csv", checks=[checks.duplicate_row()])
pprint(report.flatten(["type", "message"]))
[['duplicate-row',
  'Row at position 3 is duplicated: the same as row at position "2"']]
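To make the memory warning above concrete, here is a minimal sketch of one way duplicate detection can work: remember every row already seen and compare each new row against that record. This is an illustration only, not the Frictionless implementation; the function name `find_duplicate_rows` and the use of tuples as keys are assumptions made for the example.

```python
def find_duplicate_rows(rows):
    """Return (row_number, first_seen_row_number) pairs for repeated rows.

    Hypothetical helper for illustration; not part of frictionless.
    """
    seen = {}  # maps row contents to the row number where they first appeared
    duplicates = []
    # Data rows are numbered from 2 because row 1 is the header.
    for row_number, row in enumerate(rows, start=2):
        key = tuple(row)
        if key in seen:
            duplicates.append((row_number, seen[key]))
        else:
            seen[key] = row_number
    return duplicates


# Same data as the example above: one header line, then two identical rows.
duplicates = find_duplicate_rows([["value"], ["value"]])
print(duplicates)
```

In this sketch the `seen` dictionary gains one entry per distinct row, so its size grows with the file, which is the kind of cost to expect on big inputs.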

Reference

checks.duplicate_row (class)

Check for duplicate rows. This check can be enabled using the `checks` parameter of the `validate` function.

Signature

(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None) -> None

Parameters
  • name (Optional[str])
  • title (Optional[str])
  • description (Optional[str])

Row Constraint

This is the most powerful check: it uses the external simpleeval package, which lets you evaluate arbitrary Python expressions against data rows. Here is an example.

Example

from pprint import pprint
from frictionless import validate, checks

source = [
    ["row", "salary", "bonus"],
    [2, 1000, 200],
    [3, 2500, 500],
    [4, 1300, 500],
    [5, 5000, 1000],
]
report = validate(source, checks=[checks.row_constraint(formula="salary == bonus * 5")])
pprint(report.flatten(["type", "message"]))
[['row-constraint',
  'The row at position 4 has an error: the row constraint to conform is '
  '"salary == bonus * 5"']]
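To see why only the row at position 4 is reported, the sketch below evaluates the same formula against each row by hand, with the column names bound to that row's values. It uses Python's built-in `eval` with builtins disabled purely as a stand-in for simpleeval; that substitution, and the helper name `failing_rows`, are assumptions for illustration and do not reflect how Frictionless is implemented.

```python
header = ["row", "salary", "bonus"]
data = [
    [2, 1000, 200],
    [3, 2500, 500],
    [4, 1300, 500],
    [5, 5000, 1000],
]
formula = "salary == bonus * 5"


def failing_rows(header, data, formula):
    """Return the "row" value of every row for which the formula is false.

    Hypothetical helper for illustration; not part of frictionless.
    """
    failures = []
    for cells in data:
        names = dict(zip(header, cells))  # e.g. {"row": 2, "salary": 1000, ...}
        # Stand-in for simpleeval: built-in eval with builtins disabled.
        # Do not use this on untrusted formulas.
        if not eval(formula, {"__builtins__": {}}, names):
            failures.append(names["row"])
    return failures


failures = failing_rows(header, data, formula)
print(failures)
```

For the row at position 4 the bonus is 500, so the formula requires a salary of 2500, but the actual salary is 1300; every other row satisfies the constraint.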

Reference

checks.row_constraint (class)

Check that every row satisfies a provided Python expression.

Signature

(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, formula: str) -> None

Parameters
  • name (Optional[str])
  • title (Optional[str])
  • description (Optional[str])
  • formula (str)

checks.row_constraint.formula (property)

Python expression to apply to every row. The formula is evaluated using the simpleeval library.

Signature

str