(2024-11-07 15:17)

Catalog Class

A Catalog is a set of data packages.

Creating Catalog

We can create a catalog by providing a list of datasets, each of which wraps a data package:

from frictionless import Catalog, Dataset, Package

catalog = Catalog(datasets=[Dataset(name='name', package=Package('tables/*'))])

Describing Catalog

Usually, a Catalog is used to describe an external set of datasets, such as a CKAN instance, or a GitHub user or search. For example:

from frictionless import Catalog

catalog = Catalog('https://demo.ckan.org/dataset/')
print(catalog)

Dataset Management

The core purpose of a catalog is to hold a set of datasets. The Catalog class provides methods to manage them:

from frictionless import Catalog

catalog = Catalog('https://demo.ckan.org/dataset/')
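catalog.dataset_names   # property: names of the datasets
catalog.has_dataset     # method: check if a dataset is present
catalog.add_dataset     # method: add a new dataset to the catalog
catalog.get_dataset     # method: get a dataset by name
catalog.clear_datasets  # method: remove all the datasets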

Saving Descriptor

Like any of the Metadata classes, a Catalog can be saved as JSON or YAML:

from frictionless import Catalog

catalog = Catalog('https://demo.ckan.org/dataset/')
catalog.to_json('datacatalog.json') # Save as JSON
catalog.to_yaml('datacatalog.yaml') # Save as YAML

Reference

Catalog (class)

Dataset (class)

Catalog (class)

Catalog representation

Signature

(*, source: Optional[Any] = None, control: Optional[Control] = None, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, datasets: List[Dataset] = NOTHING, basepath: Optional[str] = None) -> None

Parameters

  • source (Optional[Any])
  • control (Optional[Control])
  • name (Optional[str])
  • title (Optional[str])
  • description (Optional[str])
  • datasets (List[Dataset])
  • basepath (Optional[str])

catalog.source (property)

# TODO: add docs

Signature

Optional[Any]

catalog.control (property)

# TODO: add docs

Signature

Optional[Control]

catalog.name (property)

A short url-usable (and preferably human-readable) name. This MUST be lower-case and contain only alphanumeric characters along with “.”, “_” or “-” characters.

Signature

Optional[str]
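As a rough illustration of the naming rule above, the check can be sketched with a regular expression. The pattern and the `is_valid_name` helper are assumptions derived from the wording of the rule; they are not the validation code frictionless itself uses, which may accept a slightly different character set.

```python
import re

# Assumed pattern, taken from the rule as worded above:
# lower-case alphanumerics plus ".", "_" and "-".
NAME_PATTERN = re.compile(r"[a-z0-9._-]+")


def is_valid_name(name: str) -> bool:
    """Return True if the whole string matches the assumed naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


ok = is_valid_name("ckan-demo.v2")        # lower-case, allowed punctuation only
has_upper = is_valid_name("My Catalog")   # upper-case letters and a space
empty = is_valid_name("")                 # an empty name never matches
```

Using `fullmatch` rather than `match` ensures that a valid prefix followed by a disallowed character is rejected.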

catalog.type (property)

Type of the object

Signature

ClassVar[Union[str, None]]

catalog.title (property)

A Catalog title according to the specs. It should be a human-oriented title of the catalog.

Signature

Optional[str]

catalog.description (property)

A Catalog description according to the specs. It should be a human-oriented description of the catalog.

Signature

Optional[str]

catalog.datasets (property)

A list of datasets. Each item in the list is a Dataset.

Signature

List[Dataset]

catalog.basepath (property)

A basepath of the catalog. The normpath of the resource is `basepath` and `/path` joined together.

Signature

Optional[str]
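For local paths, the joining described above can be pictured with the standard library. This is only a sketch: the `join_basepath` helper and the example paths are hypothetical, and frictionless may treat remote or absolute paths differently.

```python
import posixpath


def join_basepath(basepath: str, path: str) -> str:
    """Join a base path and a relative path, then normalise the result."""
    return posixpath.normpath(posixpath.join(basepath, path))


# Hypothetical values, used only for illustration.
normpath = join_basepath("data/catalog", "tables/chunk1.csv")
cleaned = join_basepath("data/catalog", "./tables/../tables/chunk1.csv")
```

`posixpath` is used instead of `os.path` so that the result uses forward slashes on every platform.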

catalog.dataset_names (property)

Return names of datasets

Signature

List[str]

catalog.add_dataset (method)

Add new dataset to the catalog

Signature

(dataset: Union[Dataset, str]) -> Dataset

Parameters

  • dataset (Union[Dataset, str])

catalog.clear_datasets (method)

Remove all the datasets

catalog.dereference (method)

Dereference underlying metadata. If some of the underlying metadata is provided as a string, it is replaced by the metadata object.
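The idea behind dereferencing can be sketched in plain Python, without frictionless: a field that holds either a metadata object or a string pointing to one is replaced by the loaded object. The `dereference` helper, the dict layout, and the file name below are hypothetical and only mimic the behavior described above.

```python
import json
import tempfile
from pathlib import Path
from typing import Union


def dereference(ref: Union[dict, str]) -> dict:
    """Return the metadata object, loading it from disk when given a path."""
    if isinstance(ref, str):
        return json.loads(Path(ref).read_text(encoding="utf-8"))
    return ref  # already an object, nothing to do


with tempfile.TemporaryDirectory() as tmp:
    # Write a small stand-in descriptor to a temporary file.
    path = Path(tmp) / "datapackage.json"
    path.write_text(json.dumps({"name": "example", "resources": []}), encoding="utf-8")

    dataset = {"name": "example", "package": str(path)}   # package given as a string
    dataset["package"] = dereference(dataset["package"])  # now a metadata object
    resolved_name = dataset["package"]["name"]
```

Calling `dereference` on a value that is already an object returns it unchanged, so the operation is safe to repeat.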

catalog.get_dataset (method)

Get dataset by name

Signature

(name: str) -> Dataset

Parameters

  • name (str)

catalog.has_dataset (method)

Check if a dataset is present

Signature

(name: str) -> bool

Parameters

  • name (str)

catalog.infer (method)

Infer catalog's metadata

Signature

(*, stats: bool = False)

Parameters

  • stats (bool)

catalog.remove_dataset (method)

Remove dataset by name

Signature

(name: str) -> Dataset

Parameters

  • name (str)

catalog.set_dataset (method)

Set dataset by name

Signature

(dataset: Dataset) -> Optional[Dataset]

Parameters

  • dataset (Dataset)

catalog.to_copy (method)

Create a copy of the catalog

Signature

(**options: Any)

Parameters

  • options (Any)

Dataset (class)

Dataset representation.

Signature

(*, name: str, title: Optional[str] = None, description: Optional[str] = None, package: Union[Package, str], basepath: Optional[str] = None, catalog: Optional[Catalog] = None) -> None

Parameters

  • name (str)
  • title (Optional[str])
  • description (Optional[str])
  • package (Union[Package, str])
  • basepath (Optional[str])
  • catalog (Optional[Catalog])

dataset.name (property)

A short url-usable (and preferably human-readable) name. This MUST be lower-case and contain only alphanumeric characters along with “.”, “_” or “-” characters.

Signature

str

dataset.type (property)

A short name (preferably human-readable) for the Dataset. This MUST be lower-case and contain only alphanumeric characters along with "-" or "_".

Signature

ClassVar[str]

dataset.title (property)

A human-readable title for the Dataset.

Signature

Optional[str]

dataset.description (property)

A detailed description for the Dataset.

Signature

Optional[str]

dataset._package (property)

# TODO: add docs

Signature

Union[Package, str]

dataset._basepath (property)

# TODO: add docs

Signature

Optional[str]

dataset.catalog (property)

# TODO: add docs

Signature

Optional[Catalog]

dataset.dereference (method)

Dereference underlying metadata. If some of the underlying metadata is provided as a string, it is replaced by the metadata object.

dataset.infer (method)

Infer dataset's metadata

Signature

(*, stats: bool = False)

Parameters

  • stats (bool)