(2022-09-19 18:33)

Package Class

The Data Package is a core Frictionless Data concept: a set of resources with additional metadata. See the Data Package Standard for more information.

Creating Package

Let's create a data package:

from frictionless import Package, Resource

package = Package('table.csv') # from a resource path
package = Package('tables/*') # from a resources glob
package = Package(['tables/chunk1.csv', 'tables/chunk2.csv']) # from a list
package = Package('package/datapackage.json') # from a descriptor path
package = Package({'resources': [{'path': 'table.csv'}]}) # from a descriptor
package = Package(resources=[Resource(path='table.csv')]) # from arguments

As you can see, a package can be created from different kinds of sources, and the source type (for example, a glob or a path) is detected automatically. It's possible to make this step more explicit:

from frictionless import Package, Resource

package = Package(resources=[Resource(path='table.csv')]) # from arguments
package = Package('datapackage.json') # from a descriptor
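
To make the idea of automatic source detection concrete, here is a stdlib-only sketch of how a source could be classified. The helper `classify_source` and its rules are hypothetical and deliberately simplified (for instance, it treats every `.json` or `.yaml` path as a descriptor); it is not the detection logic Frictionless actually uses.

```python
def classify_source(source):
    """Guess the kind of package source (hypothetical, simplified rules)."""
    if isinstance(source, dict):
        return "descriptor"
    if isinstance(source, (list, tuple)):
        return "list"
    if isinstance(source, str):
        # Wildcard characters mark the string as a glob pattern
        if any(char in source for char in "*?["):
            return "glob"
        # Assumption: descriptor files are recognized by their extension
        if source.endswith((".json", ".yaml", ".yml")):
            return "descriptor path"
        return "resource path"
    raise TypeError(f"unsupported source: {source!r}")


examples = [
    "table.csv",
    "tables/*",
    ["tables/chunk1.csv", "tables/chunk2.csv"],
    "package/datapackage.json",
    {"resources": [{"path": "table.csv"}]},
]
kinds = [classify_source(example) for example in examples]
print(kinds)
```

Passing `resources=[...]` as a keyword argument avoids this kind of guessing entirely, which is why the explicit form can be preferable in scripts.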

Describing Package

The standards support a wide range of package metadata, all of which can be set with Frictionless Framework too:

from frictionless import Package, Resource

package = Package(
    name='package',
    title='My Package',
    description='My Package for the Guide',
    resources=[Resource(path='table.csv')],
    # it's possible to provide all the official properties like homepage, version, etc
)
print(package)
{'name': 'package',
 'title': 'My Package',
 'description': 'My Package for the Guide',
 'resources': [{'path': 'table.csv'}]}

If you have created a package, for example from a descriptor, you can access these properties:

from frictionless import Package

package = Package('datapackage.json')
print(package.name)
# and others
test-tabulator

And edit them:

from frictionless import Package

package = Package('datapackage.json')
package.name = 'new-name'
package.title = 'New Title'
package.description = 'New Description'
# and others
print(package)
{'name': 'new-name',
 'title': 'New Title',
 'description': 'New Description',
 'resources': [{'name': 'first-resource',
                'path': 'table.xls',
                'schema': {'fields': [{'name': 'id', 'type': 'number'},
                                      {'name': 'name', 'type': 'string'}]}},
               {'name': 'number-two',
                'path': 'table-reverse.csv',
                'schema': {'fields': [{'name': 'id', 'type': 'integer'},
                                      {'name': 'name', 'type': 'string'}]}}]}

Resource Management

The core purpose of a package is to hold a set of resources. The Package class provides useful methods to manage them:

from frictionless import Package, Resource

package = Package('datapackage.json')
print(package.resources)
print(package.resource_names)
package.add_resource(Resource(name='new', data=[['key1', 'key2'], ['val1', 'val2']]))
resource = package.get_resource('new')
print(package.has_resource('new'))
package.remove_resource('new')
[{'name': 'first-resource',
 'path': 'table.xls',
 'schema': {'fields': [{'name': 'id', 'type': 'number'},
                       {'name': 'name', 'type': 'string'}]}}, {'name': 'number-two',
 'path': 'table-reverse.csv',
 'schema': {'fields': [{'name': 'id', 'type': 'integer'},
                       {'name': 'name', 'type': 'string'}]}}]
['first-resource', 'number-two']
True
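
The name-based semantics of these methods can be pictured with a toy model that uses plain dictionaries instead of `Resource` objects. The helper functions below are hypothetical stand-ins written for illustration; they only mirror the behaviour shown above and are not the library's implementation.

```python
resources = [
    {"name": "first-resource", "path": "table.xls"},
    {"name": "number-two", "path": "table-reverse.csv"},
]


def resource_names(resources):
    """List the names of all resources, in order."""
    return [resource["name"] for resource in resources]


def has_resource(resources, name):
    """Check whether a resource with this name is present."""
    return name in resource_names(resources)


def get_resource(resources, name):
    """Return the resource with this name or raise KeyError."""
    for resource in resources:
        if resource["name"] == name:
            return resource
    raise KeyError(name)


def remove_resource(resources, name):
    """Remove the resource with this name and return it."""
    resource = get_resource(resources, name)
    resources.remove(resource)
    return resource


resources.append({"name": "new", "data": [["key1", "key2"], ["val1", "val2"]]})
found = has_resource(resources, "new")
removed = remove_resource(resources, "new")
names_after = resource_names(resources)
print(found, names_after)
```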

Saving Descriptor

Like any of the Metadata classes, the Package class can be saved as JSON or YAML:

from frictionless import Package

package = Package('tables/*')
package.to_json('datapackage.json') # Save as JSON
package.to_yaml('datapackage.yaml') # Save as YAML
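
To get a feel for what a saved descriptor file contains, the following stdlib-only sketch serializes a hand-written descriptor (not the one produced from `tables/*`) with the `json` module. The indentation and key order are assumptions of this sketch and may differ from what `to_json` writes.

```python
import json

# A hand-written descriptor shaped like the one printed earlier in this guide
descriptor = {
    "name": "package",
    "title": "My Package",
    "description": "My Package for the Guide",
    "resources": [{"path": "table.csv"}],
}

# Serialize to text; ensure_ascii=False keeps non-ASCII characters readable
text = json.dumps(descriptor, indent=2, ensure_ascii=False)
print(text)

# Reading the text back yields the same descriptor
restored = json.loads(text)
```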

Reference

Package (class)

Manager (class)

Package (class)

Package representation

This class is one of the cornerstones of the Frictionless framework. It manages underlying resources and provides an ability to describe a package.

```python
package = Package(resources=[Resource(path="data/table.csv")])
package.get_resource('table').read_rows() == [
    {'id': 1, 'name': 'english'},
    {'id': 2, 'name': '中国人'},
]
```

Signature

(source: Optional[Any] = None, control: Optional[Control] = None, innerpath: Optional[str] = None, *, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, homepage: Optional[str] = None, profiles: List[Union[IProfile, str]] = [], licenses: List[dict] = [], sources: List[dict] = [], contributors: List[dict] = [], keywords: List[str] = [], image: Optional[str] = None, version: Optional[str] = None, created: Optional[str] = None, resources: List[Union[Resource, str]] = [], basepath: Optional[str] = None, detector: Optional[Detector] = None, dialect: Optional[Dialect] = None, catalog: Optional[Catalog] = None)

Parameters

  • source (Optional[Any])
  • control (Optional[Control])
  • innerpath (Optional[str])
  • name (Optional[str])
  • title (Optional[str])
  • description (Optional[str])
  • homepage (Optional[str])
  • profiles (List[Union[IProfile, str]])
  • licenses (List[dict])
  • sources (List[dict])
  • contributors (List[dict])
  • keywords (List[str])
  • image (Optional[str])
  • version (Optional[str])
  • created (Optional[str])
  • resources (List[Union[Resource, str]])
  • basepath (Optional[str])
  • detector (Optional[Detector])
  • dialect (Optional[Dialect])
  • catalog (Optional[Catalog])

package.name (property)

A short url-usable (and preferably human-readable) name. This MUST be lower-case and contain only alphanumeric characters along with “.”, “_” or “-” characters.

Signature

Optional[str]
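
The naming rule above can be checked with a regular expression. This is a minimal sketch that follows the wording of the rule; the pattern, the ASCII-only reading of "alphanumeric", and the helper name `is_valid_name` are assumptions, not the validation Frictionless performs.

```python
import re

# Lower-case letters, digits, and the characters ".", "_" and "-"
NAME_PATTERN = re.compile(r"[a-z0-9._-]+")


def is_valid_name(name):
    """Return True if the whole name matches the naming rule."""
    return bool(NAME_PATTERN.fullmatch(name))


checks = {
    name: is_valid_name(name)
    for name in ["test-tabulator", "new-name", "data_v1.2", "My Package", ""]
}
print(checks)
```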

package.title (property)

A human-readable title for the package.

Signature

Optional[str]

package.description (property)

A human-readable description of the package.

Signature

Optional[str]

package.homepage (property)

A URL for the home on the web that is related to this package.

Signature

Optional[str]

package.profiles (property)

A list of profiles that this package descriptor conforms to.

Signature

List[Union[IProfile, str]]

package.licenses (property)

The license(s) under which the package is provided.

Signature

List[dict]

package.sources (property)

The raw sources for this data package.

Signature

List[dict]

package.contributors (property)

The people or organizations who contributed to this package.

Signature

List[dict]

package.keywords (property)

A list of string keywords to assist users searching for the package.

Signature

List[str]

package.image (property)

An image to use for this data package, for example when showing the package in a listing.

Signature

Optional[str]

package.version (property)

A version string identifying the version of the package.

Signature

Optional[str]

package.created (property)

The datetime on which the package was created.

Signature

Optional[str]

package.resources (property)

A list of the package's resources.

Signature

List[Resource]

package.catalog (property)

The catalog that the package is part of, if any.

Signature

Optional[Catalog]

package.basepath (property)

The base path of the package. The normalized path of a resource is `basepath` joined with the resource's `path`.

Signature

Optional[str]
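
The joining of `basepath` and `path` can be sketched with the standard library. The helper `resolve_path` is hypothetical, and the sketch uses POSIX-style paths for predictable output; how Frictionless treats absolute paths or remote URLs is not covered here.

```python
import posixpath


def resolve_path(basepath, path):
    """Join a base path with a resource path and normalize the result."""
    return posixpath.normpath(posixpath.join(basepath, path))


simple = resolve_path("data", "table.csv")
messy = resolve_path("data/", "./tables/../table.csv")
print(simple, messy)
```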

package.resource_names (property)

Return names of resources

Signature

List[str]

package.resource_paths (property)

Return paths of resources

Signature

List[str]

package.add_resource (method)

Add new resource to the package

Signature

(resource: Union[Resource, str]) -> Resource

Parameters

  • resource (Union[Resource, str])

package.analyze (method)

Analyze the resources of the package. This feature is currently experimental, and its API may change without warning.

Signature

(*, detailed=False)

Parameters

  • detailed

package.clear_resources (method)

Remove all the resources

Package.describe (method) (static)

Describe the given source as a package

Signature

(source: Optional[Any] = None, *, stats: bool = False, **options)

Parameters

  • source (Optional[Any]): data source
  • stats (bool)
  • options

package.extract (method)

Extract package rows

Signature

(*, limit_rows: Optional[int] = None, process: Optional[IProcessFunction] = None, filter: Optional[IFilterFunction] = None, stream: bool = False)

Parameters

  • limit_rows (Optional[int])
  • process (Optional[IProcessFunction])
  • filter (Optional[IFilterFunction])
  • stream (bool)
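
How a row filter, a row processor, and a row limit can combine is sketched below with plain dictionaries. The helper `extract_rows`, the sample rows, and especially the order of operations (filter, then process, then limit) are assumptions of this sketch and may not match what `package.extract` does.

```python
from itertools import islice


def extract_rows(rows, *, limit_rows=None, process=None, filter=None):
    """Filter, transform, and limit rows (assumed order of operations)."""
    # Keep only the rows accepted by the filter function, if one is given
    selected = (row for row in rows if filter is None or filter(row))
    # Apply the process function to each remaining row, if one is given
    processed = (process(row) if process else row for row in selected)
    # islice with None as the stop value yields every row
    return list(islice(processed, limit_rows))


# Made-up sample rows for illustration
rows = [
    {"id": 1, "name": "english"},
    {"id": 2, "name": "中国人"},
    {"id": 3, "name": "deutsch"},
]

everything = extract_rows(rows)
names = extract_rows(
    rows,
    filter=lambda row: row["id"] > 1,
    process=lambda row: row["name"],
    limit_rows=2,
)
print(names)
```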

package.flatten (method)

Flatten the package

Signature

(spec=['name', 'path'])

Parameters

  • spec (str[]): flatten specification
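
One way to picture flattening is as a table with one row per resource and one column per property named in `spec`. The sketch below works on plain dictionaries; the standalone `flatten` function and the use of `None` for missing properties are assumptions made for illustration.

```python
def flatten(resources, spec=("name", "path")):
    """Return one list per resource with the values of the listed properties."""
    return [[resource.get(key) for key in spec] for resource in resources]


resources = [
    {"name": "first-resource", "path": "table.xls"},
    {"name": "number-two", "path": "table-reverse.csv"},
]

default_view = flatten(resources)
custom_view = flatten(resources, spec=["name", "format"])
print(default_view)
```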

Package.from_bigquery (method) (static)

Import package from Bigquery

Signature

(source, *, control=None)

Parameters

  • source : BigQuery `Service` object
  • control : BigQuery control

Package.from_ckan (method) (static)

Import package from CKAN

Signature

(source: Any, *, control: Optional[portals.CkanControl] = None)

Parameters

  • source (Any): CKAN instance url e.g. "https://demo.ckan.org"
  • control (Optional[portals.CkanControl]): CKAN control

Package.from_github (method) (static)

Import package from Github

Signature

(source: Any = None, *, control: Optional[portals.GithubControl] = None)

Parameters

  • source (Any): Github repo url e.g. "https://github.com/frictionlessdata/repository-demo"
  • control (Optional[portals.GithubControl]): Github control

Package.from_sql (method) (static)

Import package from SQL

Signature

(source, *, control=None)

Parameters

  • source : SQL connection string or engine
  • control : SQL control

Package.from_zip (method) (static)

Create a package from ZIP

Signature

(path, **options)

Parameters

  • path : file path
  • options

package.get_resource (method)

Get resource by name

Signature

(name: str) -> Resource

Parameters

  • name (str)

package.has_resource (method)

Check if a resource is present

Signature

(name: str) -> bool

Parameters

  • name (str)

package.infer (method)

Infer package's attributes

Signature

(*, sample=True, stats=False)

Parameters

  • sample
  • stats

package.publish (method)

Publish package to any supported data portal

Signature

(target: Any = None, *, control: Optional[portals.GithubControl] = None) -> Any

Parameters

  • target (Any): URL of the target portal (CKAN, GitHub, ...), e.g. "https://github.com/frictionlessdata/repository-demo"
  • control (Optional[portals.GithubControl]): Github control

package.remove_resource (method)

Remove resource by name

Signature

(name: str) -> Resource

Parameters

  • name (str)

package.set_resource (method)

Set resource by name

Signature

(resource: Resource) -> Optional[Resource]

Parameters

  • resource (Resource)

package.to_bigquery (method)

Export package to Bigquery

Signature

(target, *, control=None)

Parameters

  • target : BigQuery `Service` object
  • control : BigQuery control

package.to_ckan (method)

Export package to CKAN

Signature

(target, *, control=None)

Parameters

  • target : CKAN instance url e.g. "https://demo.ckan.org"
  • control : CKAN control

package.to_copy (method)

Create a copy of the package

package.to_er_diagram (method)

Generate an ERD (Entity Relationship Diagram) from the package resources and export it as a .dot file. Based on: https://github.com/frictionlessdata/frictionless-py/issues/1118

Signature

(path=None) -> str

Parameters

  • path : target path
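
For readers unfamiliar with the .dot format, the sketch below builds a small Graphviz description with one node per resource, listing its fields. The `to_dot` helper and the node layout are hypothetical; the file that `to_er_diagram` writes may look different, and relationships between resources are not drawn here.

```python
def to_dot(resources):
    """Build DOT text with one box per resource (illustrative layout)."""
    lines = ["digraph package {", "  node [shape=box];"]
    for resource in resources:
        name = resource["name"]
        fields = resource.get("schema", {}).get("fields", [])
        rows = [name] + [f"{field['name']}: {field['type']}" for field in fields]
        # A literal backslash followed by "n" is a line break inside a DOT label
        label = "\\n".join(rows)
        lines.append(f'  "{name}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


resources = [
    {
        "name": "first-resource",
        "schema": {
            "fields": [
                {"name": "id", "type": "number"},
                {"name": "name", "type": "string"},
            ]
        },
    },
    {"name": "number-two"},
]

dot = to_dot(resources)
print(dot)
```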

package.to_github (method)

Export package to Github

Signature

(target: Any = None, *, control: Optional[portals.GithubControl] = None)

Parameters

  • target (Any): Github instance url e.g. "https://github.com/frictionlessdata/repository-demo"
  • control (Optional[portals.GithubControl]): Github control

package.to_sql (method)

Export package to SQL

Signature

(target, *, control=None)

Parameters

  • target : SQL connection string or engine
  • control : SQL control

package.to_zip (method)

Save package to a zip

Signature

(path, *, encoder_class=None, compression=None)

Parameters

  • path : target path
  • encoder_class : json encoder class
  • compression : the ZIP compression method to use when writing the archive. Possible values are the ones supported by Python's `zipfile` module. Defaults to `zipfile.ZIP_DEFLATED`.
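
The effect of the `compression` option can be seen with Python's `zipfile` module directly. This sketch writes a descriptor and a data file into an in-memory archive using `zipfile.ZIP_DEFLATED`; the archive layout (a `datapackage.json` next to the data files) and the sample contents are assumptions of this sketch.

```python
import io
import json
import zipfile

descriptor = {"name": "package", "resources": [{"path": "table.csv"}]}

# Write the archive into memory with the DEFLATE compression method
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("datapackage.json", json.dumps(descriptor, indent=2))
    archive.writestr("table.csv", "id,name\n1,english\n")

# Read the archive back and inspect what was stored
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
    info = archive.getinfo("datapackage.json")
    restored = json.loads(archive.read("datapackage.json"))

print(names, info.compress_type == zipfile.ZIP_DEFLATED)
```

Other constants such as `zipfile.ZIP_STORED` (no compression) can be passed the same way.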

package.transform (method)

Transform package

Signature

(pipeline: Pipeline)

Parameters

  • pipeline (Pipeline)

package.update_resource (method)

Update resource

Signature

(name: str, descriptor: IDescriptor) -> Resource

Parameters

  • name (str)
  • descriptor (IDescriptor)

package.validate (method)

Validate package

Signature

(checklist: Optional[Checklist] = None, *, limit_errors: int = 1000, limit_rows: Optional[int] = None, parallel: bool = False)

Parameters

  • checklist (Optional[Checklist])
  • limit_errors (int)
  • limit_rows (Optional[int])
  • parallel (bool)

Manager (class)

Abstract base class for generic types.

A generic type is typically declared by inheriting from this class parameterized with one or more type variables. For example, a generic mapping type might be defined as:

    class Mapping(Generic[KT, VT]):
        def __getitem__(self, key: KT) -> VT:
            ...
        # Etc.

This class can then be used as follows:

    def lookup_name(mapping: Mapping[KT, VT], key: KT, default: VT) -> VT:
        try:
            return mapping[key]
        except KeyError:
            return default

Signature

(control: ControlType)

Parameters

  • control (ControlType)

manager.control (property)

NOTE: add docs

Signature

ControlType

This is a beta version of Frictionless Framework (v5). Read the Frictionless Framework (v4) docs for the version that pip currently installs by default.