(2024-01-29 13:37)

Inquiry Class

The Inquiry lets you create arbitrary validation jobs, each containing a set of individual validation tasks.

Creating Inquiry

Let's create an inquiry that validates two individual files:

from frictionless import Inquiry

inquiry = Inquiry.from_descriptor({'tasks': [
  {'path': 'capital-valid.csv'},
  {'path': 'capital-invalid.csv'},
]})
inquiry.to_yaml('capital.inquiry-example.yaml')
print(inquiry)
{'tasks': [{'path': 'capital-valid.csv'}, {'path': 'capital-invalid.csv'}]}

Validating Inquiry

Tasks in the Inquiry accept the same arguments as the corresponding validate functions, written in camelCase. As usual, let's run validation:

frictionless validate capital.inquiry-example.yaml
─────────────────────────────────── Dataset ────────────────────────────────────
                          dataset
┏━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ name            ┃ type  ┃ path                ┃ status  ┃
┡━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ capital-valid   │ table │ capital-valid.csv   │ VALID   │
│ capital-invalid │ table │ capital-invalid.csv │ INVALID │
└─────────────────┴───────┴─────────────────────┴─────────┘
──────────────────────────────────── Tables ────────────────────────────────────
                                capital-invalid
┏━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Row  ┃ Field ┃ Type            ┃ Message                                     ┃
┡━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ None │ 3     │ duplicate-label │ Label "name" in the header at position "3"  │
│      │       │                 │ is duplicated to a label: at position "2"   │
│ 10   │ 3     │ missing-cell    │ Row at position "10" has a missing cell in  │
│      │       │                 │ field "name2" at position "3"               │
│ 11   │ None  │ blank-row       │ Row at position "11" is completely blank    │
│ 12   │ 1     │ type-error      │ Type error in the cell "x" in row "12" and  │
│      │       │                 │ field "id" at position "1": type is         │
│      │       │                 │ "integer/default"                           │
│ 12   │ 4     │ extra-cell      │ Row at position "12" has an extra value in  │
│      │       │                 │ field at position "4"                       │
└──────┴───────┴─────────────────┴─────────────────────────────────────────────┘

At first sight, it's not clear why such a construct exists, but when your validation workflow gets complex, the Inquiry can provide a lot of flexibility and power. Last but not least, the Inquiry can use multiprocessing when more than one task is provided (see the `parallel` argument of `inquiry.validate` in the reference).
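The camelCase convention for task arguments can be illustrated with a small helper that turns Python-style keyword arguments into descriptor keys. This is only a sketch: `to_camel_case`, `to_task_descriptor`, and the option name `some_option` are hypothetical and not part of Frictionless.

```python
def to_camel_case(name: str) -> str:
    """Convert a snake_case argument name to camelCase."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def to_task_descriptor(**options) -> dict:
    """Build a task descriptor (a plain dict) with camelCase keys."""
    return {to_camel_case(key): value for key, value in options.items()}


# `some_option` is a made-up argument used only to show the key conversion
task = to_task_descriptor(path="capital-valid.csv", some_option=True)
print(task)
```

Single-word arguments such as `path` are unchanged by the conversion, so only multi-word arguments differ between the Python and descriptor spellings.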

Reference

Inquiry (class)

InquiryTask (class)

Inquiry (class)

Inquiry representation.

Signature

(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, tasks: List[InquiryTask] = NOTHING) -> None

Parameters

  • name (Optional[str])
  • title (Optional[str])
  • description (Optional[str])
  • tasks (List[InquiryTask])

inquiry.name (property)

A short url-usable (and preferably human-readable) name. This MUST be lower-case and contain only alphanumeric characters along with “_” or “-” characters.

Signature

Optional[str]
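The naming rule above can be checked with a regular expression. This is a minimal sketch of the rule as worded here, not the check Frictionless itself performs; `NAME_PATTERN` and `is_valid_name` are hypothetical names.

```python
import re

# Lower-case letters, digits, "_" and "-" only, at least one character
NAME_PATTERN = re.compile(r"[a-z0-9_-]+")


def is_valid_name(name: str) -> bool:
    """Return True if the whole name matches the rule stated above."""
    return NAME_PATTERN.fullmatch(name) is not None


print(is_valid_name("capital-valid"))
print(is_valid_name("Capital Valid"))
```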

inquiry.type (property)

Type of the object

Signature

ClassVar[Union[str, None]]

inquiry.title (property)

A human-oriented title for the Inquiry.

Signature

Optional[str]

inquiry.description (property)

A brief description of the Inquiry.

Signature

Optional[str]

inquiry.tasks (property)

List of underlying tasks to be validated.

Signature

List[InquiryTask]

inquiry.validate (method)

Validate inquiry

Signature

(*, parallel: bool = False)

Parameters

  • parallel (bool)
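To give an idea of what a `parallel` flag of this shape might control, here is a sketch that runs independent tasks either one by one or through a worker pool. It is not the Frictionless implementation: `check_task` and `run_tasks` are hypothetical, and a thread pool is swapped in for a process pool to keep the example short and portable.

```python
from concurrent.futures import ThreadPoolExecutor


def check_task(task: dict) -> dict:
    """Hypothetical stand-in for validating one task: only checks that a path is set."""
    return {"path": task.get("path"), "valid": bool(task.get("path"))}


def run_tasks(tasks: list, *, parallel: bool = False) -> list:
    """Run every task, using a pool only when asked to and when it can help."""
    if parallel and len(tasks) > 1:
        with ThreadPoolExecutor() as pool:
            # pool.map returns results in the same order as the input tasks
            return list(pool.map(check_task, tasks))
    return [check_task(task) for task in tasks]


tasks = [{"path": "capital-valid.csv"}, {"path": "capital-invalid.csv"}]
sequential = run_tasks(tasks)
concurrent = run_tasks(tasks, parallel=True)
print(sequential == concurrent)
```

Because the tasks do not depend on each other, both modes produce the same results in the same order; only the scheduling differs.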

InquiryTask (class)

Inquiry task representation.

Signature

(*, name: Optional[str] = None, type: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, path: Optional[str] = None, scheme: Optional[str] = None, format: Optional[str] = None, encoding: Optional[str] = None, mediatype: Optional[str] = None, compression: Optional[str] = None, extrapaths: Optional[List[str]] = None, innerpath: Optional[str] = None, dialect: Optional[Dialect] = None, schema: Optional[Schema] = None, checklist: Optional[Checklist] = None, resource: Optional[str] = None, package: Optional[str] = None) -> None

Parameters

  • name (Optional[str])
  • type (Optional[str])
  • title (Optional[str])
  • description (Optional[str])
  • path (Optional[str])
  • scheme (Optional[str])
  • format (Optional[str])
  • encoding (Optional[str])
  • mediatype (Optional[str])
  • compression (Optional[str])
  • extrapaths (Optional[List[str]])
  • innerpath (Optional[str])
  • dialect (Optional[Dialect])
  • schema (Optional[Schema])
  • checklist (Optional[Checklist])
  • resource (Optional[str])
  • package (Optional[str])
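Since every parameter is optional, a task descriptor usually carries only the options that are actually set. The sketch below builds such descriptors as plain dictionaries and serializes the resulting inquiry to JSON; `build_task` is a hypothetical helper, and the option values are illustrative.

```python
import json


def build_task(**options) -> dict:
    """Keep only the options that were given a value."""
    return {key: value for key, value in options.items() if value is not None}


inquiry_descriptor = {
    "tasks": [
        build_task(path="capital-valid.csv", format="csv", encoding=None),
        build_task(path="capital-invalid.csv", encoding="utf-8"),
    ]
}

# The JSON text could be saved to a file and passed to the validation command
text = json.dumps(inquiry_descriptor, indent=2)
print(text)
```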

inquiryTask.name (property)

A short url-usable (and preferably human-readable) name. This MUST be lower-case and contain only alphanumeric characters along with “_” or “-” characters.

Signature

Optional[str]

inquiryTask.type (property)

Type of the source to be validated such as "package", "resource" etc.

Signature

Optional[str]

inquiryTask.title (property)

A human-oriented title for the task.

Signature

Optional[str]

inquiryTask.description (property)

A brief description of the task.

Signature

Optional[str]

inquiryTask.path (property)

Path to the data source.

Signature

Optional[str]

inquiryTask.scheme (property)

Scheme for loading the file (file, http, ...). If not set, it'll be inferred from `path`.

Signature

Optional[str]

inquiryTask.format (property)

File source's format (csv, xls, ...). If not set, it'll be inferred from `path`.

Signature

Optional[str]
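As a rough illustration of how a scheme and a format could be inferred when they are not set, the sketch below derives both from the path string. It is an assumption about the general idea, not the library's actual inference logic; `infer_scheme_and_format` and the example URL are hypothetical.

```python
from os.path import splitext
from urllib.parse import urlparse


def infer_scheme_and_format(path: str) -> tuple:
    """Guess (scheme, format) from a local path or a URL."""
    parsed = urlparse(path)
    # A path without a URL scheme is treated as a local file
    scheme = parsed.scheme or "file"
    # The format is taken from the file extension, without the dot
    extension = splitext(parsed.path)[1]
    return scheme, extension.lstrip(".").lower()


print(infer_scheme_and_format("capital-valid.csv"))
print(infer_scheme_and_format("http://example.com/table.xls"))
```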

inquiryTask.encoding (property)

Source encoding. If not set, it'll be inferred from `path`.

Signature

Optional[str]

inquiryTask.mediatype (property)

Mediatype/mimetype of the resource e.g. “text/csv”, or “application/vnd.ms-excel”. Mediatypes are maintained by the Internet Assigned Numbers Authority (IANA) in a media type registry.

Signature

Optional[str]
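For reference, Python's standard `mimetypes` module can look up the registered media type for a file name. This sketch only shows what such values look like; whether Frictionless derives `mediatype` this way is not stated here.

```python
import mimetypes

# guess_type returns a (type, encoding) pair; only the type is used here
csv_type = mimetypes.guess_type("capital-valid.csv")[0]
xls_type = mimetypes.guess_type("table.xls")[0]

print(csv_type)
print(xls_type)
```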

inquiryTask.compression (property)

Source file compression (zip, ...). If not set, it'll be inferred from `path`.

Signature

Optional[str]

inquiryTask.extrapaths (property)

List of paths to concatenate to the main path. It's used for multipart resources.

Signature

Optional[List[str]]
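The idea of a multipart resource can be sketched as reading the main path first and then each extra path in order, treating the result as one file. The helper `read_multipart` and the chunk file names are hypothetical; the real reader may stream the parts rather than load them at once.

```python
import tempfile
from pathlib import Path


def read_multipart(path: str, extrapaths: list) -> bytes:
    """Concatenate the main file with the extra parts, in order."""
    return b"".join(Path(part).read_bytes() for part in [path, *extrapaths])


with tempfile.TemporaryDirectory() as folder:
    first = Path(folder) / "chunk1.csv"
    second = Path(folder) / "chunk2.csv"
    # Only the first part carries the header row
    first.write_bytes(b"id,name\n1,london\n")
    second.write_bytes(b"2,paris\n")
    content = read_multipart(str(first), [str(second)])

print(content.decode())
```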

inquiryTask.innerpath (property)

Path within the compressed file. It defaults to the first file in the archive (if the source is an archive).

Signature

Optional[str]
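The default described above (use the first file in the archive unless an inner path is given) can be sketched with the standard `zipfile` module. `resolve_innerpath` and the member names are hypothetical and only illustrate the selection rule.

```python
import io
import zipfile


def resolve_innerpath(archive: zipfile.ZipFile, innerpath=None) -> str:
    """Use the given inner path, or fall back to the first member of the archive."""
    return innerpath or archive.namelist()[0]


# Build a small archive in memory with two members
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("first.csv", "id\n1\n")
    archive.writestr("second.csv", "id\n2\n")

with zipfile.ZipFile(buffer) as archive:
    default_name = resolve_innerpath(archive)
    explicit_name = resolve_innerpath(archive, "second.csv")
    data = archive.read(explicit_name)

print(default_name, explicit_name)
```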

inquiryTask.dialect (property)

Specific set of formatting parameters applied while reading data source. The parameters are set as a Dialect class. For more information, please check the Dialect Class documentation.

Signature

Optional[Dialect]

inquiryTask.schema (property)

Schema descriptor. A string descriptor or path to schema file.

Signature

Optional[Schema]

inquiryTask.checklist (property)

Checklist class with a set of validation checks to be applied to the data source being read. For more information, please check the Validation Checks documentation.

Signature

Optional[Checklist]

inquiryTask.resource (property)

Resource descriptor. A string descriptor or path to resource file.

Signature

Optional[str]

inquiryTask.package (property)

Package descriptor. A string descriptor or path to package file.

Signature

Optional[str]