Edit page in Livemark
(2024-11-07 15:17)

Cell Steps

The Cell steps are responsible for cell operations like converting, replacing, or formating, along with others.

Convert Cells

Converts cell values of one or more fields using arbitrary functions, method invocations or dictionary translations.

Using Value

We can provide a value to be set as a value of all cells of this field. Take into account that the value type needs to conform to the field type otherwise it will lead to a validation error:

from frictionless import Resource, transform, steps

source = Resource(path="transform.csv")
target = transform(
    source,
    steps=[
        steps.cell_convert(field_name='population', value="100"),
    ],
)
print(target.to_view())
+----+-----------+------------+
| id | name      | population |
+====+===========+============+
|  1 | 'germany' |        100 |
+----+-----------+------------+
|  2 | 'france'  |        100 |
+----+-----------+------------+
|  3 | 'spain'   |        100 |
+----+-----------+------------+

Using Mapping

Another option to modify the field's cell is to provide a mapping. It's a translation table that uses literal matching to replace values. It's usually used for string fields:

from frictionless import Resource, transform, steps

source = Resource(path="transform.csv")
target = transform(
    source,
    steps=[
        steps.cell_convert(field_name='name', mapping = {'germany': 'GERMANY'}),
    ],
)
print(target.to_view())
+----+-----------+------------+
| id | name      | population |
+====+===========+============+
|  1 | 'GERMANY' |         83 |
+----+-----------+------------+
|  2 | 'france'  |         66 |
+----+-----------+------------+
|  3 | 'spain'   |         47 |
+----+-----------+------------+

Using Function

We can provide an arbitrary function to update the field cells. If you want to modify a non-string field it's really important to normalize the table first otherwise the function will be applied to a non-parsed value:

from frictionless import Resource, transform, steps

source = Resource(path="transform.csv")
target = transform(
    source,
    steps=[
        steps.table_normalize(),
        steps.cell_convert(field_name='population', function=lambda v: v*2),
    ],
)
print(target.to_view())
+----+-----------+------------+
| id | name      | population |
+====+===========+============+
|  1 | 'germany' |        166 |
+----+-----------+------------+
|  2 | 'france'  |        132 |
+----+-----------+------------+
|  3 | 'spain'   |         94 |
+----+-----------+------------+

Reference

steps.cell_convert (class)

steps.cell_convert (class)

Convert cell Converts cell values of one or more fields using arbitrary functions, method invocations or dictionary translations.

Signature

(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, value: Optional[Any] = None, mapping: Optional[Dict[str, Any]] = None, function: Optional[Any] = None, field_name: Optional[str] = None) -> None

Parameters
  • name (Optional[str])
  • title (Optional[str])
  • description (Optional[str])
  • value (Optional[Any])
  • mapping (Optional[Dict[str, Any]])
  • function (Optional[Any])
  • field_name (Optional[str])

steps.cell_convert.value (property)

Value to set in the field's cells

Signature

Optional[Any]

steps.cell_convert.mapping (property)

Mapping to apply to the column

Signature

Optional[Dict[str, Any]]

steps.cell_convert.function (property)

Function to apply to the column

Signature

Optional[Any]

steps.cell_convert.field_name (property)

Name of the field to apply the transform on

Signature

Optional[str]

Fill Cells

Example

from pprint import pprint
from frictionless import Package, Resource, transform, steps

source = Resource(path="transform.csv")
target = transform(
    source,
    steps=[
        steps.cell_replace(pattern="france", replace=None),
        steps.cell_fill(field_name="name", value="FRANCE"),
    ]
)
print(target.schema)
print(target.to_view())
{'fields': [{'name': 'id', 'type': 'integer'},
            {'name': 'name', 'type': 'string'},
            {'name': 'population', 'type': 'integer'}]}
+----+-----------+------------+
| id | name      | population |
+====+===========+============+
|  1 | 'germany' |         83 |
+----+-----------+------------+
|  2 | 'FRANCE'  |         66 |
+----+-----------+------------+
|  3 | 'spain'   |         47 |
+----+-----------+------------+

Reference

steps.cell_fill (class)

steps.cell_fill (class)

Fill cell Replaces missing values with non-missing values from the adjacent row/column.

Signature

(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, value: Optional[Any] = None, field_name: Optional[str] = None, direction: Optional[str] = None) -> None

Parameters
  • name (Optional[str])
  • title (Optional[str])
  • description (Optional[str])
  • value (Optional[Any])
  • field_name (Optional[str])
  • direction (Optional[str])

steps.cell_fill.value (property)

Value to replace in the field cell with missing value

Signature

Optional[Any]

steps.cell_fill.field_name (property)

Name of the field to replace the missing value cells

Signature

Optional[str]

steps.cell_fill.direction (property)

Directions to read the non missing value from(left/right/above)

Signature

Optional[str]

Format Cells

Example

from pprint import pprint
from frictionless import Package, Resource, transform, steps

source = Resource(path="transform.csv")
target = transform(
    source,
    steps=[
        steps.cell_format(template="Prefix: {0}", field_name="name"),
    ]
)
print(target.schema)
print(target.to_view())
{'fields': [{'name': 'id', 'type': 'integer'},
            {'name': 'name', 'type': 'string'},
            {'name': 'population', 'type': 'integer'}]}
+----+-------------------+------------+
| id | name              | population |
+====+===================+============+
|  1 | 'Prefix: germany' |         83 |
+----+-------------------+------------+
|  2 | 'Prefix: france'  |         66 |
+----+-------------------+------------+
|  3 | 'Prefix: spain'   |         47 |
+----+-------------------+------------+

Reference

steps.cell_format (class)

steps.cell_format (class)

Format cell Formats all values in the given or all string fields using the `template` format string.

Signature

(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, template: str, field_name: Optional[str] = None) -> None

Parameters
  • name (Optional[str])
  • title (Optional[str])
  • description (Optional[str])
  • template (str)
  • field_name (Optional[str])

steps.cell_format.template (property)

format string to apply to cells

Signature

str

steps.cell_format.field_name (property)

field name to apply template format

Signature

Optional[str]

Interpolate Cells

Example

from pprint import pprint
from frictionless import Package, Resource, transform, steps

source = Resource(path="transform.csv")
target = transform(
    source,
    steps=[
        steps.cell_interpolate(template="Prefix: %s", field_name="name"),
    ]
)
pprint(target.schema)
pprint(target.read_rows())
{'fields': [{'name': 'id', 'type': 'integer'},
            {'name': 'name', 'type': 'string'},
            {'name': 'population', 'type': 'integer'}]}
[{'id': 1, 'name': 'Prefix: germany', 'population': 83},
 {'id': 2, 'name': 'Prefix: france', 'population': 66},
 {'id': 3, 'name': 'Prefix: spain', 'population': 47}]

Reference

steps.cell_interpolate (class)

steps.cell_interpolate (class)

Interpolate cell Interpolate all values in a given or all string fields using the `template` string.

Signature

(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, template: str, field_name: Optional[str] = None) -> None

Parameters
  • name (Optional[str])
  • title (Optional[str])
  • description (Optional[str])
  • template (str)
  • field_name (Optional[str])

steps.cell_interpolate.template (property)

template string to apply to the field cells

Signature

str

steps.cell_interpolate.field_name (property)

field name to apply template string

Signature

Optional[str]

Replace Cells

Example

from pprint import pprint
from frictionless import Package, Resource, transform, steps

source = Resource(path="transform.csv")
target = transform(
    source,
    steps=[
        steps.cell_replace(pattern="france", replace="FRANCE"),
    ]
)
print(target.schema)
print(target.to_view())
{'fields': [{'name': 'id', 'type': 'integer'},
            {'name': 'name', 'type': 'string'},
            {'name': 'population', 'type': 'integer'}]}
+----+-----------+------------+
| id | name      | population |
+====+===========+============+
|  1 | 'germany' |         83 |
+----+-----------+------------+
|  2 | 'FRANCE'  |         66 |
+----+-----------+------------+
|  3 | 'spain'   |         47 |
+----+-----------+------------+

steps.cell_replace (class)

steps.cell_replace (class)

Replace cell Replace cell values in a given field or all fields using user defined pattern.

Signature

(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, pattern: str, replace: str, field_name: Optional[str] = None) -> None

Parameters
  • name (Optional[str])
  • title (Optional[str])
  • description (Optional[str])
  • pattern (str)
  • replace (str)
  • field_name (Optional[str])

steps.cell_replace.pattern (property)

Pattern to search for in single or all fields

Signature

str

steps.cell_replace.replace (property)

String to replace

Signature

str

steps.cell_replace.field_name (property)

field name to apply template string

Signature

Optional[str]

Set Cells

Example

from pprint import pprint
from frictionless import Package, Resource, transform, steps

source = Resource(path="transform.csv")
target = transform(
    source,
    steps=[
          steps.cell_set(field_name="population", value=100),
    ]
)
print(target.schema)
print(target.to_view())
{'fields': [{'name': 'id', 'type': 'integer'},
            {'name': 'name', 'type': 'string'},
            {'name': 'population', 'type': 'integer'}]}
+----+-----------+------------+
| id | name      | population |
+====+===========+============+
|  1 | 'germany' |        100 |
+----+-----------+------------+
|  2 | 'france'  |        100 |
+----+-----------+------------+
|  3 | 'spain'   |        100 |
+----+-----------+------------+

Reference

steps.cell_set (class)

steps.cell_set (class)

Set cell

Signature

(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, value: Any, field_name: str) -> None

Parameters
  • name (Optional[str])
  • title (Optional[str])
  • description (Optional[str])
  • value (Any)
  • field_name (str)

steps.cell_set.value (property)

Value to be set in cell of the given field.

Signature

Any

steps.cell_set.field_name (property)

Specifies the field name where to set/replace the value.

Signature

str