
Resource Class

The Resource class is arguably the most important class in the whole Frictionless Framework. It is based on the Data Resource Standard and the Tabular Data Resource Standard.

Creating Resource

Let's create a data resource:

from frictionless import Resource

resource = Resource('table.csv') # from a resource path
resource = Resource('resource.json') # from a descriptor path
resource = Resource({'path': 'table.csv'}) # from a descriptor
resource = Resource(path='table.csv') # from arguments

As you can see, it's possible to create a resource from different kinds of sources, whose type (e.g. a descriptor or a path) is detected automatically. It's possible to make this step more explicit:

from frictionless import Resource

resource = Resource(path='data/table.csv') # from a path
resource = Resource('data/resource.json') # from a descriptor

Describing Resource

The standards support a great deal of resource metadata, all of which is available in Frictionless Framework as well:

from frictionless import Resource

resource = Resource(
    name='resource',
    title='My Resource',
    description='My Resource for the Guide',
    path='table.csv',
    # it's possible to provide all the official properties like mediatype, etc
)
print(resource)
{'name': 'resource',
 'title': 'My Resource',
 'description': 'My Resource for the Guide',
 'path': 'table.csv'}

If you have created a resource, for example, from a descriptor, you can access these properties:

from frictionless import Resource

resource = Resource('resource.json')
print(resource.name)
# and others
name

And edit them:

from frictionless import Resource

resource = Resource('resource.json')
resource.name = 'new-name'
resource.title = 'New Title'
resource.description = 'New Description'
# and others
print(resource)
{'name': 'new-name',
 'title': 'New Title',
 'description': 'New Description',
 'path': 'table.csv'}

Saving Descriptor

Like any of the Metadata classes, the Resource class can be saved as JSON or YAML:

from frictionless import Resource
resource = Resource('table.csv')
resource.to_json('resource.json') # Save as JSON
resource.to_yaml('resource.yaml') # Save as YAML
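
What gets written is plain JSON (or YAML). A stdlib-only sketch of a minimal descriptor file, assuming hypothetical content (the real to_json output also includes inferred properties such as scheme and format):

```python
import json
import tempfile
from pathlib import Path

# A minimal descriptor as a plain dictionary (hypothetical content;
# Frictionless writes additional inferred properties as well)
descriptor = {"name": "resource", "path": "table.csv"}

# Write it the way to_json would: indented JSON in a .json file
path = Path(tempfile.mkdtemp()) / "resource.json"
path.write_text(json.dumps(descriptor, indent=2))

# Reading the file back recovers the same mapping
assert json.loads(path.read_text()) == descriptor
```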

Resource Lifecycle

You might have noticed that we had to duplicate the `with Resource(...)` statement in some examples. The reason is that Resource is a streaming interface: once it has been read, you need to open it again. Let's show this in an example:

from pprint import pprint
from frictionless import Resource

resource = Resource('capital-3.csv')
resource.open()
pprint(resource.read_rows())
pprint(resource.read_rows())
# We need to re-open: there is no data left
resource.open()
pprint(resource.read_rows())
# We need to close manually: no context manager is used
resource.close()
[{'id': 1, 'name': 'London'},
 {'id': 2, 'name': 'Berlin'},
 {'id': 3, 'name': 'Paris'},
 {'id': 4, 'name': 'Madrid'},
 {'id': 5, 'name': 'Rome'}]
[]
[{'id': 1, 'name': 'London'},
 {'id': 2, 'name': 'Berlin'},
 {'id': 3, 'name': 'Paris'},
 {'id': 4, 'name': 'Madrid'},
 {'id': 5, 'name': 'Rome'}]
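
This one-pass behaviour mirrors plain Python iteration over a stream; a stdlib sketch of why a second read returns nothing until the source is re-opened:

```python
import csv
import io

data = "id,name\n1,London\n2,Berlin\n"

# The first pass consumes the stream, just like resource.read_rows()
stream = io.StringIO(data)
rows = list(csv.DictReader(stream))
leftover = list(csv.DictReader(stream))  # the stream is exhausted

# "Re-opening" means building a fresh stream over the same source
rows_again = list(csv.DictReader(io.StringIO(data)))

print(len(rows), len(leftover), len(rows_again))  # 2 0 2
```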

At the same time, you can read data from a resource without opening and closing it explicitly. In this case, Frictionless Framework will open and close the resource for you, so it is effectively a one-time operation:

from pprint import pprint
from frictionless import Resource

resource = Resource('capital-3.csv')
pprint(resource.read_rows())
[{'id': 1, 'name': 'London'},
 {'id': 2, 'name': 'Berlin'},
 {'id': 3, 'name': 'Paris'},
 {'id': 4, 'name': 'Madrid'},
 {'id': 5, 'name': 'Rome'}]

Reading Data

The Resource class is also a metadata class which provides various read and stream functions. The extract functions always read rows into memory; Resource can do the same, but it also gives you a choice of output data: rows, data, text, or bytes. Let's try reading all of them:

from pprint import pprint
from frictionless import Resource

resource = Resource('country-3.csv')
pprint(resource.read_bytes())
pprint(resource.read_text())
pprint(resource.read_cells())
pprint(resource.read_rows())
(b'id,capital_id,name,population\n1,1,Britain,67\n2,3,France,67\n3,2,Germany,8'
 b'3\n4,5,Italy,60\n5,4,Spain,47\n')
('id,capital_id,name,population\n'
 '1,1,Britain,67\n'
 '2,3,France,67\n'
 '3,2,Germany,83\n'
 '4,5,Italy,60\n'
 '5,4,Spain,47\n')
[['id', 'capital_id', 'name', 'population'],
 ['1', '1', 'Britain', '67'],
 ['2', '3', 'France', '67'],
 ['3', '2', 'Germany', '83'],
 ['4', '5', 'Italy', '60'],
 ['5', '4', 'Spain', '47']]
[{'id': 1, 'capital_id': 1, 'name': 'Britain', 'population': 67},
 {'id': 2, 'capital_id': 3, 'name': 'France', 'population': 67},
 {'id': 3, 'capital_id': 2, 'name': 'Germany', 'population': 83},
 {'id': 4, 'capital_id': 5, 'name': 'Italy', 'population': 60},
 {'id': 5, 'capital_id': 4, 'name': 'Spain', 'population': 47}]
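
The four granularities correspond to progressively more parsing; a stdlib sketch over an in-memory CSV (illustrative only, not the Frictionless implementation):

```python
import csv
import io

raw = b"id,name\n1,english\n2,\xe4\xb8\xad\xe5\x9b\xbd\xe4\xba\xba\n"

as_bytes = raw                                      # like read_bytes(): raw bytes
as_text = raw.decode("utf-8")                       # like read_text(): decoded text
as_cells = list(csv.reader(io.StringIO(as_text)))   # like read_cells(): string lists

# read_rows() additionally applies the schema's types (here: id -> integer)
header, *body = as_cells
as_rows = [{"id": int(i), "name": n} for i, n in body]

print(as_rows)  # [{'id': 1, 'name': 'english'}, {'id': 2, 'name': '中国人'}]
```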

It's really handy to read all your data into memory, but it's not always possible when a file is very big. For such cases, Frictionless provides streaming functions:

from pprint import pprint
from frictionless import Resource

with Resource('country-3.csv') as resource:
    pprint(resource.byte_stream)
    pprint(resource.text_stream)
    pprint(resource.cell_stream)
    pprint(resource.row_stream)
    for row in resource.row_stream:
      print(row)
<frictionless.system.loader.ByteStreamWithStatsHandling object at 0x7f0f9465b9a0>
<_io.TextIOWrapper name='country-3.csv' encoding='utf-8'>
<itertools.chain object at 0x7f0f946df130>
<generator object Resource.__prepare_row_stream.<locals>.row_stream at 0x7f0f94497e40>
{'id': 1, 'capital_id': 1, 'name': 'Britain', 'population': 67}
{'id': 2, 'capital_id': 3, 'name': 'France', 'population': 67}
{'id': 3, 'capital_id': 2, 'name': 'Germany', 'population': 83}
{'id': 4, 'capital_id': 5, 'name': 'Italy', 'population': 60}
{'id': 5, 'capital_id': 4, 'name': 'Spain', 'population': 47}

Indexing Data

Indexing a resource, in Frictionless terms, means loading a data table into a database, with or without metadata. Let's explore how this feature works in different modes.

All the examples use SQLite for simplicity.

Normal Mode

This mode is supported for any database supported by SQLAlchemy. Under the hood, Frictionless will infer a Table Schema and populate the data table as it normally reads data. This means that type errors are replaced by null values and, in general, indexing is guaranteed to finish successfully for any data, however invalid.

frictionless index table.csv --database sqlite:///index/project.db --table table
frictionless extract sqlite:///index/project.db --table table --json
Indexed 2 rows in 0.319 seconds
[
  {
    "id": 1,
    "name": "english"
  },
  {
    "id": 2,
    "name": "中国人"
  }
]
from frictionless import Resource, formats

resource = Resource('table.csv')
resource.index('sqlite:///index/project.db', table_name='table')
print(Resource('sqlite:///index/project.db', control=formats.sql.SqlControl(table='table')).extract())
[{'id': 1, 'name': 'english'}, {'id': 2, 'name': '中国人'}]
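
The null-on-type-error behaviour can be imitated with the stdlib: cast each cell against the inferred type and fall back to NULL on failure. A sketch against an in-memory SQLite database (hypothetical data, not the actual Frictionless implementation):

```python
import csv
import io
import sqlite3

raw = "id,name\n1,english\nbad,中国人\n"  # the second id is not an integer

def cast_int(value):
    # Normal mode replaces type errors with null instead of failing
    try:
        return int(value)
    except ValueError:
        return None

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data (id INTEGER, name TEXT)")
con.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [(cast_int(r["id"]), r["name"]) for r in csv.DictReader(io.StringIO(raw))],
)

print(con.execute("SELECT * FROM data").fetchall())  # [(1, 'english'), (None, '中国人')]
```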

Metadata Mode

In metadata mode, the indexing process is the same, but the metadata is stored in the database as well. This mode is highly experimental and currently not intended for use outside of Frictionless Software. Let's explore it with an example:

frictionless index table.csv --database sqlite:///index/project.db --metadata
frictionless extract sqlite:///index/project.db --table table --json
frictionless extract sqlite:///index/project.db --table _resources --json
Indexed 2 rows in 0.342 seconds
[
  {
    "_row_number": 2,
    "_row_valid": true,
    "id": 1,
    "name": "english"
  },
  {
    "_row_number": 3,
    "_row_valid": true,
    "id": 2,
    "name": "中国人"
  }
]
[
  {
    "path": "table.csv",
    "table_name": "table",
    "updated": "2023-01-25T11:57:10",
    "resource": "{\n  \"name\": \"table\",\n  \"type\": \"table\",\n  \"path\": \"table.csv\",\n  \"scheme\": \"file\",\n  \"format\": \"csv\",\n  \"encoding\": \"utf-8\",\n  \"mediatype\": \"text/csv\",\n  \"schema\": {\n    \"fields\": [\n      {\n        \"name\": \"id\",\n        \"type\": \"integer\"\n      },\n      {\n        \"name\": \"name\",\n        \"type\": \"string\"\n      }\n    ]\n  },\n  \"stats\": {\n    \"md5\": \"6c2c61dd9b0e9c6876139a449ed87933\",\n    \"sha256\": \"a1fd6c5ff3494f697874deeb07f69f8667e903dd94a7bc062dd57550cea26da8\",\n    \"bytes\": 30,\n    \"fields\": 2,\n    \"rows\": 2\n  }\n}",
    "report": "{\n  \"valid\": true,\n  \"stats\": {\n    \"tasks\": 1,\n    \"warnings\": 0,\n    \"errors\": 0,\n    \"seconds\": 0.004\n  },\n  \"warnings\": [],\n  \"errors\": [],\n  \"tasks\": [\n    {\n      \"valid\": true,\n      \"name\": \"table\",\n      \"type\": \"table\",\n      \"place\": \"table.csv\",\n      \"labels\": [\n        \"id\",\n        \"name\"\n      ],\n      \"stats\": {\n        \"md5\": \"6c2c61dd9b0e9c6876139a449ed87933\",\n        \"sha256\": \"a1fd6c5ff3494f697874deeb07f69f8667e903dd94a7bc062dd57550cea26da8\",\n        \"bytes\": 30,\n        \"fields\": 2,\n        \"rows\": 2,\n        \"warnings\": 0,\n        \"errors\": 0,\n        \"seconds\": 0.004\n      },\n      \"warnings\": [],\n      \"errors\": []\n    }\n  ]\n}"
  }
]
from frictionless import Resource, formats

resource = Resource('table.csv')
resource.index('sqlite:///index/project.db', with_metadata=True)
print(Resource('sqlite:///index/project.db', control=formats.sql.SqlControl(table='table')).extract())
print(Resource('sqlite:///index/project.db', control=formats.sql.SqlControl(table='_resources')).extract())
[{'_row_number': 2, '_row_valid': True, 'id': 1, 'name': 'english'}, {'_row_number': 3, '_row_valid': True, 'id': 2, 'name': '中国人'}]
[{'path': 'table.csv', 'table_name': 'table', 'updated': datetime.datetime(2023, 1, 25, 11, 57, 12, 671396), 'resource': '{\n  "name": "table",\n  "type": "table",\n  "path": "table.csv",\n  "scheme": "file",\n  "format": "csv",\n  "encoding": "utf-8",\n  "mediatype": "text/csv",\n  "schema": {\n    "fields": [\n      {\n        "name": "id",\n        "type": "integer"\n      },\n      {\n        "name": "name",\n        "type": "string"\n      }\n    ]\n  },\n  "stats": {\n    "md5": "6c2c61dd9b0e9c6876139a449ed87933",\n    "sha256": "a1fd6c5ff3494f697874deeb07f69f8667e903dd94a7bc062dd57550cea26da8",\n    "bytes": 30,\n    "fields": 2,\n    "rows": 2\n  }\n}', 'report': '{\n  "valid": true,\n  "stats": {\n    "tasks": 1,\n    "warnings": 0,\n    "errors": 0,\n    "seconds": 0.004\n  },\n  "warnings": [],\n  "errors": [],\n  "tasks": [\n    {\n      "valid": true,\n      "name": "table",\n      "type": "table",\n      "place": "table.csv",\n      "labels": [\n        "id",\n        "name"\n      ],\n      "stats": {\n        "md5": "6c2c61dd9b0e9c6876139a449ed87933",\n        "sha256": "a1fd6c5ff3494f697874deeb07f69f8667e903dd94a7bc062dd57550cea26da8",\n        "bytes": 30,\n        "fields": 2,\n        "rows": 2,\n        "warnings": 0,\n        "errors": 0,\n        "seconds": 0.004\n      },\n      "warnings": [],\n      "errors": []\n    }\n  ]\n}'}]

Fast Mode

Fast mode is supported for SQLite and PostgreSQL databases. It infers a Table Schema from a data sample and indexes the data using COPY in PostgreSQL and .import in SQLite. For big data files this mode is 10-30x faster than normal indexing, but the speed comes at a price: if there is invalid data, the indexing will fail.

frictionless index table.csv --database sqlite:///index/project.db --table table --fast
frictionless extract sqlite:///index/project.db --table table --json
Indexed 30 bytes in 0.607 seconds
[
  {
    "id": 1,
    "name": "english"
  },
  {
    "id": 2,
    "name": "中国人"
  }
]
from frictionless import Resource, formats

resource = Resource('table.csv')
resource.index('sqlite:///index/project.db', table_name='table', fast=True)
print(Resource('sqlite:///index/project.db', control=formats.sql.SqlControl(table='table')).extract())
[{'id': 1, 'name': 'english'}, {'id': 2, 'name': '中国人'}]

Solution 1: Fallback

To ensure that the data will be successfully indexed, it's possible to use the fallback option. If the fast indexing fails, Frictionless will start over in normal mode and finish the process successfully.

frictionless index table.csv --database sqlite:///index/project.db --table table --fast --fallback
from frictionless import Resource

resource = Resource('table.csv')
resource.index('sqlite:///index/project.db', table_name='table', fast=True, use_fallback=True)

Solution 2: QSV

Another option is to provide a path to a QSV binary. In this case, the initial schema inference is based on the whole data file, which guarantees that the table is valid type-wise:

frictionless index table.csv --database sqlite:///index/project.db --table table --fast --qsv qsv_path
from frictionless import Resource

resource = Resource('table.csv')
resource.index('sqlite:///index/project.db', table_name='table', fast=True, qsv_path='qsv_path')

Scheme

The scheme, also known as protocol, indicates which loader Frictionless should use to read or write data. It can be file (default), text, http, https, s3, and others.

from frictionless import Resource

with Resource(b'header1,header2\nvalue1,value2', format='csv') as resource:
  print(resource.scheme)
  print(resource.to_view())
buffer
+----------+----------+
| header1  | header2  |
+==========+==========+
| 'value1' | 'value2' |
+----------+----------+

Format

The format, also called extension, helps Frictionless choose a proper parser to handle the file. Popular formats are csv, xlsx, json, and others.

from frictionless import Resource

with Resource(b'header1,header2\nvalue1,value2.csv', format='csv') as resource:
  print(resource.format)
  print(resource.to_view())
csv
+----------+--------------+
| header1  | header2      |
+==========+==============+
| 'value1' | 'value2.csv' |
+----------+--------------+

Encoding

Frictionless automatically detects the encoding of files, but sometimes it can be inaccurate. It's possible to provide an encoding manually:

from frictionless import Resource

with Resource('country-3.csv', encoding='utf-8') as resource:
  print(resource.encoding)
  print(resource.path)
utf-8
country-3.csv
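
Manual override is useful because any automatic detection is a guess. A stdlib sketch of a simple fallback chain (a hypothetical helper; the real detector is statistical and more sophisticated):

```python
def detect_encoding(data: bytes, candidates=("utf-8", "latin-1")) -> str:
    # Return the first candidate that decodes the bytes cleanly.
    # Note: latin-1 accepts any byte sequence, so it acts as a catch-all.
    for encoding in candidates:
        try:
            data.decode(encoding)
            return encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("no candidate encoding matched")

print(detect_encoding("中国人".encode("utf-8")))  # utf-8
print(detect_encoding(b"\xff\xfe"))              # latin-1
```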

Innerpath

By default, Frictionless uses the first file found in a zip archive. It's possible to adjust this behaviour:

from frictionless import Resource

with Resource('table-multiple-files.zip', innerpath='table-reverse.csv') as resource:
  print(resource.compression)
  print(resource.innerpath)
  print(resource.to_view())
zip
table-reverse.csv
+----+-----------+
| id | name      |
+====+===========+
|  1 | '中国人'     |
+----+-----------+
|  2 | 'english' |
+----+-----------+

Compression

It's possible to adjust compression detection by providing the algorithm explicitly. For the example below it's not required as it would be detected anyway:

from frictionless import Resource

with Resource('table.csv.zip', compression='zip') as resource:
  print(resource.compression)
  print(resource.to_view())
zip
+----+-----------+
| id | name      |
+====+===========+
|  1 | 'english' |
+----+-----------+
|  2 | '中国人'     |
+----+-----------+

Dialect

Please read Table Dialect Guide for more information.

Schema

Please read Table Schema Guide for more information.

Checklist

Please read Checklist Guide for more information.

Pipeline

Please read Pipeline Guide for more information.

Stats

A resource's stats can be accessed with resource.stats:

from frictionless import Resource

resource = Resource('table.csv')
resource.infer(stats=True)
print(resource.stats)
{'md5': '6c2c61dd9b0e9c6876139a449ed87933',
 'sha256': 'a1fd6c5ff3494f697874deeb07f69f8667e903dd94a7bc062dd57550cea26da8',
 'bytes': 30,
 'fields': 2,
 'rows': 2}
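
The hash and size statistics can be reproduced with hashlib, while the field and row counts come from parsing; a stdlib sketch over in-memory bytes (illustrative, not how Frictionless computes them internally):

```python
import csv
import hashlib
import io

data = b"id,name\n1,english\n2,\xe4\xb8\xad\xe5\x9b\xbd\xe4\xba\xba\n"

cells = list(csv.reader(io.StringIO(data.decode("utf-8"))))
stats = {
    "md5": hashlib.md5(data).hexdigest(),
    "sha256": hashlib.sha256(data).hexdigest(),
    "bytes": len(data),
    "fields": len(cells[0]),  # columns in the header
    "rows": len(cells) - 1,   # data rows, header excluded
}
print(stats["bytes"], stats["fields"], stats["rows"])  # 30 2 2
```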

Reference

Resource (class)

Resource representation. This class is one of the cornerstones of the Frictionless framework. It loads a data source, and allows you to stream its parsed contents. At the same time, it's a metadata class for data description.

```python
with Resource("data/table.csv") as resource:
    resource.header == ["id", "name"]
    resource.read_rows() == [
        {'id': 1, 'name': 'english'},
        {'id': 2, 'name': '中国人'},
    ]
```

Signature

(source: Optional[Any] = None, control: Optional[Control] = None, *, name: Optional[str] = None, type: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, homepage: Optional[str] = None, profiles: List[Union[IProfile, str]] = [], licenses: List[dict] = [], sources: List[dict] = [], path: Optional[str] = None, data: Optional[Any] = None, scheme: Optional[str] = None, format: Optional[str] = None, encoding: Optional[str] = None, mediatype: Optional[str] = None, compression: Optional[str] = None, extrapaths: List[str] = [], innerpath: Optional[str] = None, dialect: Optional[Union[Dialect, str]] = None, schema: Optional[Union[Schema, str]] = None, checklist: Optional[Union[Checklist, str]] = None, pipeline: Optional[Union[Pipeline, str]] = None, stats: Optional[Stats] = None, basepath: Optional[str] = None, detector: Optional[Detector] = None, package: Optional[Package] = None)

Parameters

  • source (Optional[Any])
  • control (Optional[Control])
  • name (Optional[str])
  • type (Optional[str])
  • title (Optional[str])
  • description (Optional[str])
  • homepage (Optional[str])
  • profiles (List[Union[IProfile, str]])
  • licenses (List[dict])
  • sources (List[dict])
  • path (Optional[str])
  • data (Optional[Any])
  • scheme (Optional[str])
  • format (Optional[str])
  • encoding (Optional[str])
  • mediatype (Optional[str])
  • compression (Optional[str])
  • extrapaths (List[str])
  • innerpath (Optional[str])
  • dialect (Optional[Union[Dialect, str]])
  • schema (Optional[Union[Schema, str]])
  • checklist (Optional[Union[Checklist, str]])
  • pipeline (Optional[Union[Pipeline, str]])
  • stats (Optional[Stats])
  • basepath (Optional[str])
  • detector (Optional[Detector])
  • package (Optional[Package])

resource.name (property)

Resource name according to the specs. It should be a slugified name of the resource.

Signature

Optional[str]
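
A slugified name keeps only lowercase letters, digits, and a few safe separators; a hypothetical stdlib sketch of such a transformation (Frictionless's own naming rules may differ):

```python
import re

def slugify(name: str) -> str:
    # Lowercase, collapse runs of other characters into a single hyphen,
    # and trim leading/trailing hyphens.
    slug = re.sub(r"[^a-z0-9._-]+", "-", name.lower())
    return slug.strip("-")

print(slugify("My Resource (2023)"))  # my-resource-2023
```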

resource.type (property)

Type of the data e.g. "table"

Signature

Optional[str]

resource.title (property)

Resource title according to the specs. It should be a human-oriented title of the resource.

Signature

Optional[str]

resource.description (property)

Resource description according to the specs. It should be a human-oriented description of the resource.

Signature

Optional[str]

resource.homepage (property)

A URL for the home on the web that is related to this package. For example, a GitHub repository or a CKAN dataset address.

Signature

Optional[str]

resource.profiles (property)

Strings identifying the profile of this descriptor. For example, `tabular-data-resource`.

Signature

List[Union[IProfile, str]]

resource.licenses (property)

The license(s) under which the resource is provided. If omitted it's considered the same as the package's licenses.

Signature

List[dict]

resource.sources (property)

The raw sources for this data resource. It MUST be an array of Source objects. Each Source object MUST have a title and MAY have path and/or email properties.

Signature

List[dict]

resource.path (property)

Path to data source

Signature

Optional[str]

resource.data (property)

Inline data source

Signature

Optional[Any]

resource.scheme (property)

Scheme for loading the file (file, http, ...). If not set, it'll be inferred from `source`.

Signature

Optional[str]

resource.format (property)

File source's format (csv, xls, ...). If not set, it'll be inferred from `source`.

Signature

Optional[str]

resource.encoding (property)

Source encoding. If not set, it'll be inferred from `source`.

Signature

Optional[str]

resource.mediatype (property)

Mediatype/mimetype of the resource e.g. “text/csv”, or “application/vnd.ms-excel”. Mediatypes are maintained by the Internet Assigned Numbers Authority (IANA) in a media type registry.

Signature

Optional[str]

resource.compression (property)

Source file compression (zip, ...). If not set, it'll be inferred from `source`.

Signature

Optional[str]

resource.extrapaths (property)

List of paths to concatenate to the main path. It's used for multipart resources.

Signature

List[str]

resource.innerpath (property)

Path within the compressed file. It defaults to the first file in the archive (if the source is an archive).

Signature

Optional[str]

resource.detector (property)

File/table detector. For more information, please check the Detector documentation.

Signature

Detector

resource.package (property)

The package this resource belongs to. For more information, please check the Package documentation.

Signature

Optional[Package]

resource.basepath (property)

A basepath of the resource. The normpath of the resource is joined from `basepath` and `path`.

Signature

Optional[str]

resource.buffer (property)

File's bytes used as a sample. These buffer bytes are used to infer characteristics of the source file (e.g. encoding, ...).

Signature

IBuffer

resource.byte_stream (property)

Byte stream in form of a generator

Signature

IByteStream

resource.cell_stream (property)

Cell stream in form of a generator

Signature

ICellStream

resource.checklist (property)

Checklist object. For more information, please check the Checklist documentation.

Signature

(Optional[Union[Checklist, str]]) -> Optional[Checklist]

resource.closed (property)

Whether the table is closed

Signature

bool

resource.dialect (property)

File Dialect object. For more information, please check the Dialect documentation.

Signature

(Optional[Union[Dialect, str]]) -> Dialect

resource.fragment (property)

Table's lists used as fragment. These fragment rows are used internally to infer characteristics of the source file (e.g. schema, ...).

Signature

IFragment

resource.header (property)

Signature

Header

resource.labels (property)

Signature

ILabels

resource.lookup (property)

Signature

Lookup

resource.memory (property)

Whether resource is not path based

Signature

bool

resource.multipart (property)

Whether resource is multipart

Signature

bool

resource.normdata (property)

Normalized data, or raise an error if not set

Signature

Any

resource.normpath (property)

Normalized path of the resource, or raise an error if not set

Signature

str

resource.normpaths (property)

Normalized paths of the resource

Signature

List[str]

resource.paths (property)

All paths of the resource

Signature

List[str]

resource.pipeline (property)

Pipeline object. For more information, please check the Pipeline documentation.

Signature

(Optional[Union[Pipeline, str]]) -> Optional[Pipeline]

resource.place (property)

Stringified resource location

Signature

str

resource.remote (property)

Whether resource is remote

Signature

bool

resource.row_stream (property)

Row stream in form of a generator of Row objects

Signature

IRowStream

resource.sample (property)

Table's lists used as sample. These sample rows are used to infer characteristics of the source file (e.g. schema, ...).

Signature

ISample

resource.schema (property)

Table Schema object. For more information, please check the Schema documentation.

Signature

(Optional[Union[Schema, str]]) -> Schema

resource.stats (property)

Stats object. An object with the following possible properties: md5, sha256, bytes, fields, rows.

Signature

(Optional[Union[Stats, str]]) -> Stats

resource.tabular (property)

Whether resource is tabular

Signature

bool

resource.text_stream (property)

Text stream in form of a generator

Signature

ITextStream

resource.analyze (method)

Analyze the resource. This feature is currently experimental, and its API may change without warning.

Signature

(: Resource, *, detailed=False) -> dict

Parameters

  • detailed

resource.close (method)

Close the resource as "filelike.close" does

Signature

() -> None

Resource.describe (method) (static)

Describe the given source as a resource

Signature

(source: Optional[Any] = None, *, stats: bool = False, **options)

Parameters

  • source (Optional[Any]): data source
  • stats (bool)
  • options

resource.extract (method)

Extract resource rows

Signature

(: Resource, *, limit_rows: Optional[int] = None, process: Optional[IProcessFunction] = None, filter: Optional[IFilterFunction] = None, stream: bool = False)

Parameters

  • limit_rows (Optional[int])
  • process (Optional[IProcessFunction])
  • filter (Optional[IFilterFunction])
  • stream (bool)

Resource.from_petl (method) (static)

Create a resource from PETL view

Signature

(view, **options)

Parameters

  • view
  • options

resource.index (method)

Index resource into a database

Signature

(: Resource, database_url: str, *, table_name: Optional[str] = None, fast: bool = False, qsv_path: Optional[str] = None, on_progress: Optional[Callable[[str], None]] = None, use_fallback: bool = False, with_metadata: bool = False)

Parameters

  • database_url (str)
  • table_name (Optional[str])
  • fast (bool)
  • qsv_path (Optional[str])
  • on_progress (Optional[Callable[[str], None]])
  • use_fallback (bool)
  • with_metadata (bool)

resource.infer (method)

Infer metadata

Signature

(*, sample: bool = True, stats: bool = False) -> None

Parameters

  • sample (bool)
  • stats (bool)

resource.open (method)

Open the resource as "io.open" does

Signature

(*, as_file: bool = False)

Parameters

  • as_file (bool)

resource.read_bytes (method)

Read bytes into memory

Signature

(*, size: Optional[int] = None) -> bytes

Parameters

  • size (Optional[int])

resource.read_cells (method)

Read lists into memory

Signature

(*, size: Optional[int] = None) -> List[List[Any]]

Parameters

  • size (Optional[int])

resource.read_data (method)

Read data into memory

Signature

(*, size: Optional[int] = None) -> Any

Parameters

  • size (Optional[int])

resource.read_rows (method)

Read rows into memory

Signature

(*, size=None) -> List[Row]

Parameters

  • size

resource.read_text (method)

Read text into memory

Signature

(*, size: Optional[int] = None) -> str

Parameters

  • size (Optional[int])

resource.to_copy (method)

Create a copy from the resource

Signature

(**options)

Parameters

  • options

resource.to_inline (method)

Helper to export resource as inline data

Signature

(*, dialect=None)

Parameters

  • dialect

resource.to_pandas (method)

Helper to export resource as a Pandas dataframe

Signature

(*, dialect=None)

Parameters

  • dialect

resource.to_petl (method)

Export resource as a PETL table

Signature

(normalize=False)

Parameters

  • normalize

resource.to_snap (method)

Create a snapshot from the resource

Signature

(*, json=False)

Parameters

  • json : make data types compatible with JSON format

resource.to_view (method)

Create a view from the resource. See PETL's docs for more information: https://platform.petl.readthedocs.io/en/stable/util.html#visualising-tables

Signature

(type='look', **options)

Parameters

  • type : view's type
  • options

resource.transform (method)

Transform resource

Signature

(: Resource, pipeline: Optional[Pipeline] = None)

Parameters

  • pipeline (Optional[Pipeline])

resource.validate (method)

Validate resource

Signature

(: Resource, checklist: Optional[Checklist] = None, *, limit_errors: int = 1000, limit_rows: Optional[int] = None, on_row: Optional[ICallbackFunction] = None)

Parameters

  • checklist (Optional[Checklist])
  • limit_errors (int)
  • limit_rows (Optional[int])
  • on_row (Optional[ICallbackFunction])

resource.write (method)

Write this resource to the target resource

Signature

(target: Optional[Union[Resource, Any]] = None, *, control: Optional[Control] = None, **options) -> Resource

Parameters

  • target (Optional[Union[Resource, Any]]): target or target resource instance
  • control (Optional[Control])
  • options