The Cell steps are responsible for cell operations like converting, replacing, or formating, along with others.
Converts cell values of one or more fields using arbitrary functions, method invocations or dictionary translations.
We can provide a value to be set as a value of all cells of this field. Take into account that the value type needs to conform to the field type otherwise it will lead to a validation error:
from frictionless import Resource, transform, steps
source = Resource(path="transform.csv")
target = transform(
source,
steps=[
steps.cell_convert(field_name='population', value="100"),
],
)
print(target.to_view())
+----+-----------+------------+
| id | name | population |
+====+===========+============+
| 1 | 'germany' | 100 |
+----+-----------+------------+
| 2 | 'france' | 100 |
+----+-----------+------------+
| 3 | 'spain' | 100 |
+----+-----------+------------+
Another option to modify the field's cell is to provide a mapping. It's a translation table that uses literal matching to replace values. It's usually used for string fields:
from frictionless import Resource, transform, steps
source = Resource(path="transform.csv")
target = transform(
source,
steps=[
steps.cell_convert(field_name='name', mapping = {'germany': 'GERMANY'}),
],
)
print(target.to_view())
+----+-----------+------------+
| id | name | population |
+====+===========+============+
| 1 | 'GERMANY' | 83 |
+----+-----------+------------+
| 2 | 'france' | 66 |
+----+-----------+------------+
| 3 | 'spain' | 47 |
+----+-----------+------------+
We can provide an arbitrary function to update the field cells. If you want to modify a non-string field it's really important to normalize the table first otherwise the function will be applied to a non-parsed value:
from frictionless import Resource, transform, steps
source = Resource(path="transform.csv")
target = transform(
source,
steps=[
steps.table_normalize(),
steps.cell_convert(field_name='population', function=lambda v: v*2),
],
)
print(target.to_view())
+----+-----------+------------+
| id | name | population |
+====+===========+============+
| 1 | 'germany' | 166 |
+----+-----------+------------+
| 2 | 'france' | 132 |
+----+-----------+------------+
| 3 | 'spain' | 94 |
+----+-----------+------------+
Convert cell Converts cell values of one or more fields using arbitrary functions, method invocations or dictionary translations.
(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, value: Optional[Any] = None, mapping: Optional[Dict[str, Any]] = None, function: Optional[Any] = None, field_name: Optional[str] = None) -> None
Value to set in the field's cells
Optional[Any]
Mapping to apply to the column
Optional[Dict[str, Any]]
Function to apply to the column
Optional[Any]
Name of the field to apply the transform on
Optional[str]
from pprint import pprint
from frictionless import Package, Resource, transform, steps
source = Resource(path="transform.csv")
target = transform(
source,
steps=[
steps.cell_replace(pattern="france", replace=None),
steps.cell_fill(field_name="name", value="FRANCE"),
]
)
print(target.schema)
print(target.to_view())
{'fields': [{'name': 'id', 'type': 'integer'},
{'name': 'name', 'type': 'string'},
{'name': 'population', 'type': 'integer'}]}
+----+-----------+------------+
| id | name | population |
+====+===========+============+
| 1 | 'germany' | 83 |
+----+-----------+------------+
| 2 | 'FRANCE' | 66 |
+----+-----------+------------+
| 3 | 'spain' | 47 |
+----+-----------+------------+
Fill cell Replaces missing values with non-missing values from the adjacent row/column.
(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, value: Optional[Any] = None, field_name: Optional[str] = None, direction: Optional[str] = None) -> None
Value to replace in the field cell with missing value
Optional[Any]
Name of the field to replace the missing value cells
Optional[str]
Directions to read the non missing value from(left/right/above)
Optional[str]
from pprint import pprint
from frictionless import Package, Resource, transform, steps
source = Resource(path="transform.csv")
target = transform(
source,
steps=[
steps.cell_format(template="Prefix: {0}", field_name="name"),
]
)
print(target.schema)
print(target.to_view())
{'fields': [{'name': 'id', 'type': 'integer'},
{'name': 'name', 'type': 'string'},
{'name': 'population', 'type': 'integer'}]}
+----+-------------------+------------+
| id | name | population |
+====+===================+============+
| 1 | 'Prefix: germany' | 83 |
+----+-------------------+------------+
| 2 | 'Prefix: france' | 66 |
+----+-------------------+------------+
| 3 | 'Prefix: spain' | 47 |
+----+-------------------+------------+
Format cell Formats all values in the given or all string fields using the `template` format string.
(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, template: str, field_name: Optional[str] = None) -> None
format string to apply to cells
str
field name to apply template format
Optional[str]
from pprint import pprint
from frictionless import Package, Resource, transform, steps
source = Resource(path="transform.csv")
target = transform(
source,
steps=[
steps.cell_interpolate(template="Prefix: %s", field_name="name"),
]
)
pprint(target.schema)
pprint(target.read_rows())
{'fields': [{'name': 'id', 'type': 'integer'},
{'name': 'name', 'type': 'string'},
{'name': 'population', 'type': 'integer'}]}
[{'id': 1, 'name': 'Prefix: germany', 'population': 83},
{'id': 2, 'name': 'Prefix: france', 'population': 66},
{'id': 3, 'name': 'Prefix: spain', 'population': 47}]
Interpolate cell Interpolate all values in a given or all string fields using the `template` string.
(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, template: str, field_name: Optional[str] = None) -> None
template string to apply to the field cells
str
field name to apply template string
Optional[str]
from pprint import pprint
from frictionless import Package, Resource, transform, steps
source = Resource(path="transform.csv")
target = transform(
source,
steps=[
steps.cell_replace(pattern="france", replace="FRANCE"),
]
)
print(target.schema)
print(target.to_view())
{'fields': [{'name': 'id', 'type': 'integer'},
{'name': 'name', 'type': 'string'},
{'name': 'population', 'type': 'integer'}]}
+----+-----------+------------+
| id | name | population |
+====+===========+============+
| 1 | 'germany' | 83 |
+----+-----------+------------+
| 2 | 'FRANCE' | 66 |
+----+-----------+------------+
| 3 | 'spain' | 47 |
+----+-----------+------------+
Replace cell Replace cell values in a given field or all fields using user defined pattern.
(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, pattern: str, replace: str, field_name: Optional[str] = None) -> None
Pattern to search for in single or all fields
str
String to replace
str
field name to apply template string
Optional[str]
from pprint import pprint
from frictionless import Package, Resource, transform, steps
source = Resource(path="transform.csv")
target = transform(
source,
steps=[
steps.cell_set(field_name="population", value=100),
]
)
print(target.schema)
print(target.to_view())
{'fields': [{'name': 'id', 'type': 'integer'},
{'name': 'name', 'type': 'string'},
{'name': 'population', 'type': 'integer'}]}
+----+-----------+------------+
| id | name | population |
+====+===========+============+
| 1 | 'germany' | 100 |
+----+-----------+------------+
| 2 | 'france' | 100 |
+----+-----------+------------+
| 3 | 'spain' | 100 |
+----+-----------+------------+
Set cell
(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, value: Any, field_name: str) -> None
Value to be set in cell of the given field.
Any
Specifies the field name where to set/replace the value.
str