(2022-09-19 18:33)

Schema Class

Table Schema is a core Frictionless Data concept: metadata that describes a tabular data source. See the Table Schema Standard for more information.
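To make the concept concrete, here is a minimal sketch of a Table Schema descriptor as plain data. It uses only the standard `json` module rather than Frictionless itself. The field names `id` and `name` are illustrative; `fields`, `missingValues`, and `primaryKey` are the descriptor keys that appear in the printed output later on this page.

```python
import json

# A Table Schema descriptor is plain, JSON-serializable metadata.
# The concrete field names here are illustrative.
descriptor = {
    "fields": [
        {"name": "id", "type": "integer"},
        {"name": "name", "type": "string"},
    ],
    "missingValues": [""],
    "primaryKey": ["id"],
}

# Serialize to JSON text and read it back unchanged.
text = json.dumps(descriptor, indent=2)
restored = json.loads(text)

# Field names can be read directly from the descriptor.
field_names = [field["name"] for field in restored["fields"]]
print(field_names)
```

Because the descriptor is ordinary data, it can be stored next to the table it describes and shared independently of any library.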

Creating Schema

Let's create a table schema:

from frictionless import Schema, fields, describe

schema = describe('table.csv', type='schema') # from a resource path
schema = Schema.from_descriptor('schema.json') # from a descriptor path
schema = Schema.from_descriptor({'fields': [{'name': 'id', 'type': 'integer'}]}) # from a descriptor

As you can see, a schema can be created from different kinds of sources, and the source type is detected automatically (e.g. whether it's a dict or a path). This step can be made more explicit:

from frictionless import Schema, fields

schema = Schema(fields=[fields.StringField(name='id')]) # from fields
schema = Schema.from_descriptor('schema.json') # from a descriptor
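The idea of accepting either a dict or a path can be sketched with the standard library alone. The helper `load_descriptor` below is hypothetical and is not part of Frictionless; it only illustrates one way to dispatch on the source type, assuming a path always points to a JSON file.

```python
import json
import os
import tempfile


def load_descriptor(source):
    """Hypothetical helper: return a descriptor dict from a dict or a JSON file path."""
    if isinstance(source, dict):
        return dict(source)  # shallow copy so the caller's dict is not shared
    if isinstance(source, (str, os.PathLike)):
        with open(source, encoding="utf-8") as file:
            return json.load(file)
    raise TypeError(f"unsupported source type: {type(source).__name__}")


descriptor = {"fields": [{"name": "id", "type": "integer"}]}

# Write the descriptor to a temporary file so the path branch can be exercised.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "schema.json")
    with open(path, "w", encoding="utf-8") as file:
        json.dump(descriptor, file)
    from_path = load_descriptor(path)

from_dict = load_descriptor(descriptor)
print(from_path == from_dict)
```

Both branches yield the same descriptor, which is why the two `from_descriptor` calls above are interchangeable from the caller's point of view.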

Describing Schema

The standard supports some additional schema metadata:

from frictionless import Schema, fields

schema = Schema(
    fields=[fields.StringField(name='id')],
    missing_values=['na'],
    primary_key=['id'],
    # foreign_keys
)
print(schema)
{'fields': [{'name': 'id', 'type': 'string'}],
 'missingValues': ['na'],
 'primaryKey': ['id']}
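What these two properties mean for the data can be illustrated without the library. The sketch below assumes the usual Table Schema reading: cells equal to a missing value are treated as empty, and primary key values must be unique across rows. The helpers `normalize` and `duplicate_keys` and the sample rows are made up for this example.

```python
MISSING_VALUES = ["na"]  # mirrors missing_values=['na'] above
PRIMARY_KEY = ["id"]     # mirrors primary_key=['id'] above

# Illustrative source rows, as they might be read from a CSV file.
rows = [
    {"id": "1", "name": "english"},
    {"id": "2", "name": "na"},
    {"id": "1", "name": "duplicate"},
]


def normalize(row):
    """Replace cells listed in MISSING_VALUES with None."""
    return {key: (None if value in MISSING_VALUES else value) for key, value in row.items()}


def duplicate_keys(rows):
    """Return primary key tuples that occur more than once."""
    seen, duplicates = set(), []
    for row in rows:
        key = tuple(row[name] for name in PRIMARY_KEY)
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return duplicates


normalized = [normalize(row) for row in rows]
print(normalized[1])
print(duplicate_keys(rows))
```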

If you have created a schema, for example from a descriptor, you can access these properties:

from frictionless import Schema

schema = Schema.from_descriptor('schema.json')
print(schema.missing_values)
# and others
['']

And edit them:

from frictionless import Schema

schema = Schema.from_descriptor('schema.json')
schema.missing_values.append('-')
# and others
print(schema)
{'fields': [{'name': 'id', 'type': 'integer'},
            {'name': 'name', 'type': 'string'}],
 'missingValues': ['', '-']}

Field Management

The Schema class provides useful methods to manage fields:

from frictionless import Schema, fields

schema = Schema.from_descriptor('schema.json')
print(schema.fields)
print(schema.field_names)
schema.add_field(fields.StringField(name='new-name'))
field = schema.get_field('new-name')
print(schema.has_field('new-name'))
schema.remove_field('new-name')
[{'name': 'id', 'type': 'integer'}, {'name': 'name', 'type': 'string'}]
['id', 'name']
True

Saving Descriptor

Like any of the Metadata classes, the Schema class can be saved as JSON or YAML:

from frictionless import Schema, fields
schema = Schema(fields=[fields.IntegerField(name='id')])
schema.to_json('schema.json') # Save as JSON
schema.to_yaml('schema.yaml') # Save as YAML

Reading Cells

While reading data, a resource uses a schema to convert the source cells:

from frictionless import Schema, fields

schema = Schema(fields=[fields.IntegerField(name='integer'), fields.StringField(name='string')])
cells, notes = schema.read_cells(['3', 'value'])
print(cells)
[3, 'value']
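The shape of this call, a list of cells going in and a pair of lists coming out, can be sketched in plain Python. This is an illustration and not the library's implementation: the reader functions and the `READERS` table are hypothetical, and the note format (a message string, or `None` on success) is an assumption.

```python
def read_integer(cell):
    """Cast one source cell to int; return (value, note)."""
    try:
        return int(cell), None
    except (TypeError, ValueError):
        return None, f"cannot read {cell!r} as integer"


def read_string(cell):
    """Accept only str cells; return (value, note)."""
    if isinstance(cell, str):
        return cell, None
    return None, f"cannot read {cell!r} as string"


# Hypothetical lookup from a field type name to its reader.
READERS = {"integer": read_integer, "string": read_string}


def read_cells(field_types, cells):
    """Read each cell with the reader of the matching field type."""
    results = [READERS[type_name](cell) for type_name, cell in zip(field_types, cells)]
    values = [value for value, _ in results]
    notes = [note for _, note in results]
    return values, notes


cells, notes = read_cells(["integer", "string"], ["3", "value"])
print(cells)
print(notes)
```

Returning notes next to the values lets a caller keep processing a row and report every problem cell at once, instead of stopping at the first failure.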

Writing Cells

While writing data, a resource uses a schema to convert the cells:

from frictionless import Schema, fields

schema = Schema(fields=[fields.IntegerField(name='integer'), fields.StringField(name='string')])
cells, notes = schema.write_cells([3, 'value'])
print(cells)
[3, 'value']

Creating Field

Let's create a field:

from frictionless import fields

field = fields.IntegerField(name='name')
print(field)
{'name': 'name', 'type': 'integer'}

Usually we work with fields which were already created by a schema:

from frictionless import describe

resource = describe('table.csv')
field = resource.schema.get_field('id')
print(field)
{'name': 'id', 'type': 'integer'}

Field Types

Frictionless Framework supports all the Table Schema Standard field types, along with the ability to create custom types.

For some types there are additional properties available:

from frictionless import describe

resource = describe('table.csv')
field = resource.schema.get_field('id') # it's an integer
print(field.bare_number)
True
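As an illustration of what a bare-number setting controls, the sketch below follows the Table Schema description of `bareNumber`: when it is false, leading and trailing non-numeric characters are stripped before parsing. The function `read_integer` is hypothetical and only approximates that rule; it is not the library's code.

```python
import re


def read_integer(cell, bare_number=True):
    """Hypothetical reader: parse text as int, optionally stripping surrounding non-digits."""
    text = cell.strip()
    if not bare_number:
        # Drop leading characters that are neither digits nor a sign,
        # and trailing characters that are not digits.
        text = re.sub(r"^[^\d+-]+|[^\d]+$", "", text)
    try:
        return int(text)
    except ValueError:
        return None  # the cell is not a valid integer under this setting


print(read_integer("3"))
print(read_integer("$3"))
print(read_integer("$3", bare_number=False))
print(read_integer("95%", bare_number=False))
```

With the default setting a value such as `$3` is rejected, while the relaxed setting accepts it as 3.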

See the complete reference at Tabular Fields.

Reading Cell

While reading data, a schema uses its fields internally. If needed, you can convert your own data using the same interface:

from frictionless import fields

field = fields.IntegerField(name='name')
cell, note = field.read_cell('3')
print(cell)
3

Writing Cell

While writing data, a schema uses its fields internally. As with reading, you can convert your own data using the same interface:

from frictionless import fields

field = fields.IntegerField(name='name')
cell, note = field.write_cell(3)
print(cell)
3

Reference

Schema (class)

Field (class)

Schema (class)

Schema representation

This class is one of the cornerstones of the Frictionless Framework. It allows working with a Table Schema and its fields.

```python
from frictionless import Schema, fields

schema = Schema.from_descriptor('schema.json')
schema.add_field(fields.StringField(name='name'))
```

Signature

(name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, fields: List[Field] = NOTHING, missing_values: List[str] = NOTHING, primary_key: List[str] = NOTHING, foreign_keys: List[dict] = NOTHING) -> None

Parameters

  • name (Optional[str])
  • title (Optional[str])
  • description (Optional[str])
  • fields (List[Field])
  • missing_values (List[str])
  • primary_key (List[str])
  • foreign_keys (List[dict])

schema.name (property)

NOTE: add docs

Signature

Optional[str]

schema.title (property)

NOTE: add docs

Signature

Optional[str]

schema.description (property)

NOTE: add docs

Signature

Optional[str]

schema.fields (property)

NOTE: add docs

Signature

List[Field]

schema.missing_values (property)

NOTE: add docs

Signature

List[str]

schema.primary_key (property)

NOTE: add docs

Signature

List[str]

schema.foreign_keys (property)

NOTE: add docs

Signature

List[dict]

schema.field_names (property)

List of field names

Signature

List[str]

schema.add_field (method)

Add new field to the schema

Signature

(field: Field, *, position: Optional[int] = None) -> None

Parameters

  • field (Field)
  • position (Optional[int])

schema.clear_fields (method)

Remove all the fields

Signature

() -> None

Schema.describe (method) (static)

Describe the given source as a schema

Signature

(source: Optional[Any] = None, **options)

Parameters

  • source (Optional[Any]): data source
  • options

schema.flatten (method)

Flatten the schema

Signature

(spec=['name', 'type'])

Parameters

  • spec (str[]): flatten specification
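A minimal sketch of what flattening could look like on a plain descriptor, assuming the result is one row per field containing the properties listed in `spec`. The standalone function `flatten` below is illustrative and works on a dict, not on a `Schema` instance.

```python
def flatten(descriptor, spec=("name", "type")):
    """Return one list per field with the values of the requested properties."""
    return [[field.get(key) for key in spec] for field in descriptor["fields"]]


descriptor = {
    "fields": [
        {"name": "id", "type": "integer"},
        {"name": "name", "type": "string", "title": "Name"},
    ]
}

print(flatten(descriptor))
print(flatten(descriptor, spec=("name", "title")))
```

Properties that a field does not define come back as `None` in this sketch, so every row has the same length as `spec`.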

Schema.from_jsonschema (method) (static)

Create a Schema from JSONSchema profile

Signature

(profile)

Parameters

  • profile : path or dict with JSONSchema profile
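To show the kind of mapping involved, here is a rough sketch that turns the `properties` of a JSON Schema object profile into Table Schema fields. It is an assumption-laden illustration, not the behaviour of `Schema.from_jsonschema`: the function name `fields_from_jsonschema`, the fallback to the `any` type, and the handling of `required` are all choices made for this example.

```python
# JSON Schema type names that are also Table Schema field types.
SUPPORTED_TYPES = {"string", "number", "integer", "boolean", "object", "array"}


def fields_from_jsonschema(profile):
    """Build a Table Schema style descriptor from a JSON Schema object profile."""
    required = set(profile.get("required", []))
    fields = []
    for name, prop in profile.get("properties", {}).items():
        json_type = prop.get("type")
        # Type unions (lists) and unknown types fall back to "any" in this sketch.
        known = isinstance(json_type, str) and json_type in SUPPORTED_TYPES
        field = {"name": name, "type": json_type if known else "any"}
        if name in required:
            field["constraints"] = {"required": True}
        fields.append(field)
    return {"fields": fields}


profile = {
    "type": "object",
    "required": ["id"],
    "properties": {
        "id": {"type": "integer"},
        "name": {"type": "string"},
        "extra": {"type": ["string", "null"]},
    },
}

print(fields_from_jsonschema(profile))
```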

schema.get_field (method)

Get field by name

Signature

(name: str) -> Field

Parameters

  • name (str)

schema.has_field (method)

Check if a field is present

Signature

(name: str) -> bool

Parameters

  • name (str)

schema.read_cells (method)

Read a list of cells (normalize/cast)

Signature

(cells)

Parameters

  • cells : list of cells

schema.remove_field (method)

Remove field by name

Signature

(name: str) -> Field

Parameters

  • name (str)

schema.set_field (method)

Set field by name

Signature

(field: Field) -> Optional[Field]

Parameters

  • field (Field)

schema.set_field_type (method)

Set field type

Signature

(name: str, type: str) -> Field

Parameters

  • name (str)
  • type (str)

schema.to_excel_template (method)

Export schema as an excel template

Signature

(path: str)

Parameters

  • path (str): path of excel file to create with ".xlsx" extension

schema.to_summary (method)

Summary of the schema in table format

Signature

() -> str

schema.update_field (method)

Update field

Signature

(name: str, descriptor: IDescriptor) -> Field

Parameters

  • name (str)
  • descriptor (IDescriptor)

schema.write_cells (method)

Write a list of cells (normalize/uncast)

Signature

(cells, *, types=[])

Parameters

  • cells : list of cells
  • types

Field (class)

Field representation

Signature

(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, format: str = default, missing_values: List[str] = NOTHING, constraints: dict = NOTHING, rdf_type: Optional[str] = None, example: Optional[str] = None, schema: Optional[Schema] = None) -> None

Parameters

  • name (Optional[str])
  • title (Optional[str])
  • description (Optional[str])
  • format (str)
  • missing_values (List[str])
  • constraints (dict)
  • rdf_type (Optional[str])
  • example (Optional[str])
  • schema (Optional[Schema])

field.type (property)

NOTE: add docs

Signature

ClassVar[str]

field.builtin (property)

NOTE: add docs

Signature

ClassVar[bool]

field.supported_constraints (property)

NOTE: add docs

Signature

ClassVar[List[str]]

field.name (property)

NOTE: add docs

Signature

Optional[str]

field.title (property)

NOTE: add docs

Signature

Optional[str]

field.description (property)

NOTE: add docs

Signature

Optional[str]

field.format (property)

NOTE: add docs

Signature

str

field.missing_values (property)

NOTE: add docs

Signature

List[str]

field.constraints (property)

NOTE: add docs

Signature

dict

field.rdf_type (property)

NOTE: add docs

Signature

Optional[str]

field.example (property)

NOTE: add docs

Signature

Optional[str]

field.schema (property)

NOTE: add docs

Signature

Optional[Schema]

field.required (property)

TODO: add docs

Signature

(bool) ->

This is a beta version of Frictionless Framework (v5). Read the Frictionless Framework (v4) docs for the version that pip currently installs by default.