Edit page in Livemark
(2024-12-13 12:49)

Cell Steps

The Cell steps are responsible for cell operations like converting, replacing, or formating, along with others.

Convert Cells

Converts cell values of one or more fields using arbitrary functions, method invocations or dictionary translations.

Using Value

We can provide a value to be set as a value of all cells of this field. Take into account that the value type needs to conform to the field type otherwise it will lead to a validation error:

Python

from frictionless import Resource, transform, steps

source = Resource(path="transform.csv")
target = transform(
    source,
    steps=[
        steps.cell_convert(field_name='population', value="100"),
    ],
)
print(target.to_view())

+----+-----------+------------+
| id | name      | population |
+====+===========+============+
|  1 | 'germany' |        100 |
+----+-----------+------------+
|  2 | 'france'  |        100 |
+----+-----------+------------+
|  3 | 'spain'   |        100 |
+----+-----------+------------+

Using Mapping

Another option to modify the field's cell is to provide a mapping. It's a translation table that uses literal matching to replace values. It's usually used for string fields:

Python

from frictionless import Resource, transform, steps

source = Resource(path="transform.csv")
target = transform(
    source,
    steps=[
        steps.cell_convert(field_name='name', mapping = {'germany': 'GERMANY'}),
    ],
)
print(target.to_view())

+----+-----------+------------+
| id | name      | population |
+====+===========+============+
|  1 | 'GERMANY' |         83 |
+----+-----------+------------+
|  2 | 'france'  |         66 |
+----+-----------+------------+
|  3 | 'spain'   |         47 |
+----+-----------+------------+

Using Function

Functions are not supported in declarative pipelines

We can provide an arbitrary function to update the field cells. If you want to modify a non-string field it's really important to normalize the table first otherwise the function will be applied to a non-parsed value:

Python

from frictionless import Resource, transform, steps

source = Resource(path="transform.csv")
target = transform(
    source,
    steps=[
        steps.table_normalize(),
        steps.cell_convert(field_name='population', function=lambda v: v*2),
    ],
)
print(target.to_view())

+----+-----------+------------+
| id | name      | population |
+====+===========+============+
|  1 | 'germany' |        166 |
+----+-----------+------------+
|  2 | 'france'  |        132 |
+----+-----------+------------+
|  3 | 'spain'   |         94 |
+----+-----------+------------+

steps.cell_convert (class)

Convert cell Converts cell values of one or more fields using arbitrary functions, method invocations or dictionary translations.

Signature

(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, value: Optional[Any] = None, mapping: Optional[Dict[str, Any]] = None, function: Optional[Any] = None, field_name: Optional[str] = None) -> None

Parameters

name (Optional[str])
title (Optional[str])
description (Optional[str])
value (Optional[Any])
mapping (Optional[Dict[str, Any]])
function (Optional[Any])
field_name (Optional[str])

steps.cell_convert.value (property)

Value to set in the field's cells

Signature

Optional[Any]

steps.cell_convert.mapping (property)

Mapping to apply to the column

Signature

Optional[Dict[str, Any]]

steps.cell_convert.function (property)

Function to apply to the column

Signature

Optional[Any]

steps.cell_convert.field_name (property)

Name of the field to apply the transform on

Signature

Optional[str]

Fill Cells

Example

Python

from pprint import pprint
from frictionless import Package, Resource, transform, steps

source = Resource(path="transform.csv")
target = transform(
    source,
    steps=[
        steps.cell_replace(pattern="france", replace=None),
        steps.cell_fill(field_name="name", value="FRANCE"),
    ]
)
print(target.schema)
print(target.to_view())

{'fields': [{'name': 'id', 'type': 'integer'},
            {'name': 'name', 'type': 'string'},
            {'name': 'population', 'type': 'integer'}]}
+----+-----------+------------+
| id | name      | population |
+====+===========+============+
|  1 | 'germany' |         83 |
+----+-----------+------------+
|  2 | 'FRANCE'  |         66 |
+----+-----------+------------+
|  3 | 'spain'   |         47 |
+----+-----------+------------+

Reference

Hide
Show

steps.cell_fill (class)

Fill cell Replaces missing values with non-missing values from the adjacent row/column.

Signature

(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, value: Optional[Any] = None, field_name: Optional[str] = None, direction: Optional[str] = None) -> None

Parameters

name (Optional[str])
title (Optional[str])
description (Optional[str])
value (Optional[Any])
field_name (Optional[str])
direction (Optional[str])

steps.cell_fill.value (property)

Value to replace in the field cell with missing value

Signature

Optional[Any]

steps.cell_fill.field_name (property)

Name of the field to replace the missing value cells

Signature

Optional[str]

steps.cell_fill.direction (property)

Directions to read the non missing value from(left/right/above)

Signature

Optional[str]

Format Cells

Example

Python

from pprint import pprint
from frictionless import Package, Resource, transform, steps

source = Resource(path="transform.csv")
target = transform(
    source,
    steps=[
        steps.cell_format(template="Prefix: {0}", field_name="name"),
    ]
)
print(target.schema)
print(target.to_view())

{'fields': [{'name': 'id', 'type': 'integer'},
            {'name': 'name', 'type': 'string'},
            {'name': 'population', 'type': 'integer'}]}
+----+-------------------+------------+
| id | name              | population |
+====+===================+============+
|  1 | 'Prefix: germany' |         83 |
+----+-------------------+------------+
|  2 | 'Prefix: france'  |         66 |
+----+-------------------+------------+
|  3 | 'Prefix: spain'   |         47 |
+----+-------------------+------------+

Reference

Hide
Show

steps.cell_format (class)

Format cell Formats all values in the given or all string fields using the `template` format string.

Signature

(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, template: str, field_name: Optional[str] = None) -> None

Parameters

name (Optional[str])
title (Optional[str])
description (Optional[str])
template (str)
field_name (Optional[str])

steps.cell_format.template (property)

format string to apply to cells

Signature

str

steps.cell_format.field_name (property)

field name to apply template format

Signature

Optional[str]

Interpolate Cells

Example

Python

from pprint import pprint
from frictionless import Package, Resource, transform, steps

source = Resource(path="transform.csv")
target = transform(
    source,
    steps=[
        steps.cell_interpolate(template="Prefix: %s", field_name="name"),
    ]
)
pprint(target.schema)
pprint(target.read_rows())

{'fields': [{'name': 'id', 'type': 'integer'},
            {'name': 'name', 'type': 'string'},
            {'name': 'population', 'type': 'integer'}]}
[{'id': 1, 'name': 'Prefix: germany', 'population': 83},
 {'id': 2, 'name': 'Prefix: france', 'population': 66},
 {'id': 3, 'name': 'Prefix: spain', 'population': 47}]

Reference

Hide
Show

steps.cell_interpolate (class)

Interpolate cell Interpolate all values in a given or all string fields using the `template` string.

Signature

(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, template: str, field_name: Optional[str] = None) -> None

Parameters

name (Optional[str])
title (Optional[str])
description (Optional[str])
template (str)
field_name (Optional[str])

steps.cell_interpolate.template (property)

template string to apply to the field cells

Signature

str

steps.cell_interpolate.field_name (property)

field name to apply template string

Signature

Optional[str]

Replace Cells

Example

Python

from pprint import pprint
from frictionless import Package, Resource, transform, steps

source = Resource(path="transform.csv")
target = transform(
    source,
    steps=[
        steps.cell_replace(pattern="france", replace="FRANCE"),
    ]
)
print(target.schema)
print(target.to_view())

{'fields': [{'name': 'id', 'type': 'integer'},
            {'name': 'name', 'type': 'string'},
            {'name': 'population', 'type': 'integer'}]}
+----+-----------+------------+
| id | name      | population |
+====+===========+============+
|  1 | 'germany' |         83 |
+----+-----------+------------+
|  2 | 'FRANCE'  |         66 |
+----+-----------+------------+
|  3 | 'spain'   |         47 |
+----+-----------+------------+

Hide
Show

steps.cell_replace (class)

Replace cell Replace cell values in a given field or all fields using user defined pattern.

Signature

(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, pattern: str, replace: str, field_name: Optional[str] = None) -> None

Parameters

name (Optional[str])
title (Optional[str])
description (Optional[str])
pattern (str)
replace (str)
field_name (Optional[str])

steps.cell_replace.pattern (property)

Pattern to search for in single or all fields

Signature

str

steps.cell_replace.replace (property)

String to replace

Signature

str

steps.cell_replace.field_name (property)

field name to apply template string

Signature

Optional[str]

Set Cells

Example

Python

from pprint import pprint
from frictionless import Package, Resource, transform, steps

source = Resource(path="transform.csv")
target = transform(
    source,
    steps=[
          steps.cell_set(field_name="population", value=100),
    ]
)
print(target.schema)
print(target.to_view())

{'fields': [{'name': 'id', 'type': 'integer'},
            {'name': 'name', 'type': 'string'},
            {'name': 'population', 'type': 'integer'}]}
+----+-----------+------------+
| id | name      | population |
+====+===========+============+
|  1 | 'germany' |        100 |
+----+-----------+------------+
|  2 | 'france'  |        100 |
+----+-----------+------------+
|  3 | 'spain'   |        100 |
+----+-----------+------------+

Reference

Hide
Show

steps.cell_set (class)

Set cell

Signature

(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, value: Any, field_name: str) -> None

Parameters

name (Optional[str])
title (Optional[str])
description (Optional[str])
value (Any)
field_name (str)

steps.cell_set.value (property)

Value to be set in cell of the given field.

Signature

Any

steps.cell_set.field_name (property)

Specifies the field name where to set/replace the value.

Signature

str

Any Field »

« Row Steps