The Inquiry lets you create arbitrary validation jobs made up of individual validation tasks.
Let's create an inquiry that validates two individual files:
from frictionless import Inquiry
inquiry = Inquiry.from_descriptor({'tasks': [
{'path': 'capital-valid.csv'},
{'path': 'capital-invalid.csv'},
]})
inquiry.to_yaml('capital.inquiry-example.yaml')
print(inquiry)
{'tasks': [{'path': 'capital-valid.csv'}, {'path': 'capital-invalid.csv'}]}
Tasks in the Inquiry accept the same arguments, written in camelCase, as the corresponding validate
functions. As usual, let's run validation:
frictionless validate capital.inquiry-example.yaml
─────────────────────────────────── Dataset ────────────────────────────────────
                          dataset
┏━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ name            ┃ type  ┃ path                ┃ status  ┃
┡━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ capital-valid   │ table │ capital-valid.csv   │ VALID   │
│ capital-invalid │ table │ capital-invalid.csv │ INVALID │
└─────────────────┴───────┴─────────────────────┴─────────┘
──────────────────────────────────── Tables ────────────────────────────────────
                                capital-invalid
┏━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Row  ┃ Field ┃ Type            ┃ Message                                     ┃
┡━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ None │ 3     │ duplicate-label │ Label "name" in the header at position "3"  │
│      │       │                 │ is duplicated to a label: at position "2"   │
│ 10   │ 3     │ missing-cell    │ Row at position "10" has a missing cell in  │
│      │       │                 │ field "name2" at position "3"               │
│ 11   │ None  │ blank-row       │ Row at position "11" is completely blank    │
│ 12   │ 1     │ type-error      │ Type error in the cell "x" in row "12" and  │
│      │       │                 │ field "id" at position "1": type is         │
│      │       │                 │ "integer/default"                           │
│ 12   │ 4     │ extra-cell      │ Row at position "12" has an extra value in  │
│      │       │                 │ field at position "4"                       │
└──────┴───────┴─────────────────┴─────────────────────────────────────────────┘
At first sight, it's not clear why such a construct exists, but when your validation workflow gets complex, the Inquiry can provide a lot of flexibility and power. Last but not least, the Inquiry can use multiprocessing when more than one task is provided (see the `parallel` argument of `validate` in the reference).
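The same inquiry can also be run from Python rather than from the command line. The following is a minimal sketch, not the canonical usage: `build_inquiry_descriptor` and `validate_paths` are hypothetical helper names introduced here for illustration, while `Inquiry.from_descriptor` and the `parallel` keyword of `validate` are taken from the example and the signature shown on this page.

```python
from typing import Any, Dict, List


def build_inquiry_descriptor(paths: List[str]) -> Dict[str, Any]:
    """Build an inquiry descriptor with one task per file path (hypothetical helper)."""
    return {"tasks": [{"path": path} for path in paths]}


def validate_paths(paths: List[str], parallel: bool = False):
    """Validate all paths as a single inquiry and return the resulting report."""
    # Imported lazily so the descriptor helper above works without frictionless.
    from frictionless import Inquiry

    inquiry = Inquiry.from_descriptor(build_inquiry_descriptor(paths))
    # `parallel` is the keyword listed in the `validate` signature in the reference.
    return inquiry.validate(parallel=parallel)


descriptor = build_inquiry_descriptor(["capital-valid.csv", "capital-invalid.csv"])
print(descriptor)
```

Calling `validate_paths(["capital-valid.csv", "capital-invalid.csv"], parallel=True)` requires frictionless to be installed and both CSV files to be present in the working directory.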
Inquiry representation.
(*, name: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, tasks: List[InquiryTask] = NOTHING) -> None
A short url-usable (and preferably human-readable) name. This MUST be lower-case and contain only alphanumeric characters along with “_” or “-” characters.
Optional[str]
Type of the object
ClassVar[Union[str, None]]
A human-oriented title for the Inquiry.
Optional[str]
A brief description of the Inquiry.
Optional[str]
List of underlying tasks to be validated.
List[InquiryTask]
Validate inquiry
(*, parallel: bool = False)
Inquiry task representation.
(*, name: Optional[str] = None, type: Optional[str] = None, title: Optional[str] = None, description: Optional[str] = None, path: Optional[str] = None, scheme: Optional[str] = None, format: Optional[str] = None, encoding: Optional[str] = None, mediatype: Optional[str] = None, compression: Optional[str] = None, extrapaths: Optional[List[str]] = None, innerpath: Optional[str] = None, dialect: Optional[Dialect] = None, schema: Optional[Schema] = None, checklist: Optional[Checklist] = None, resource: Optional[str] = None, package: Optional[str] = None) -> None
A short url-usable (and preferably human-readable) name. This MUST be lower-case and contain only alphanumeric characters along with “_” or “-” characters.
Optional[str]
Type of the source to be validated such as "package", "resource" etc.
Optional[str]
A human-oriented title for the task.
Optional[str]
A brief description of the task.
Optional[str]
Path to the data source.
Optional[str]
Scheme for loading the file (file, http, ...). If not set, it'll be inferred from `path`.
Optional[str]
File source's format (csv, xls, ...). If not set, it'll be inferred from `path`.
Optional[str]
Source encoding. If not set, it'll be inferred from `path`.
Optional[str]
Mediatype/mimetype of the resource e.g. “text/csv”, or “application/vnd.ms-excel”. Mediatypes are maintained by the Internet Assigned Numbers Authority (IANA) in a media type registry.
Optional[str]
Source file compression (zip, ...). If not set, it'll be inferred from `path`.
Optional[str]
List of paths to concatenate to the main path. It's used for multipart resources.
Optional[List[str]]
Path within the compressed file. It defaults to the first file in the archive (if the source is an archive).
Optional[str]
Specific set of formatting parameters applied while reading data source. The parameters are set as a Dialect class. For more information, please check the Dialect Class documentation.
Optional[Dialect]
Schema descriptor. A string descriptor or path to schema file.
Optional[Schema]
Checklist class with a set of validation checks to be applied to the data source being read. For more information, please check the Validation Checks documentation.
Optional[Checklist]
Resource descriptor. A string descriptor or path to resource file.
Optional[str]
Package descriptor. A string descriptor or path to package file.
Optional[str]
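To make the field list above concrete, here is a small sketch that assembles a task descriptor as a plain dictionary and rejects names that do not appear in the `InquiryTask` signature. `TASK_FIELDS` and `make_task` are hypothetical names used only for illustration; frictionless performs its own descriptor validation, and this helper is not part of its API.

```python
from typing import Any, Dict

# Field names copied from the InquiryTask signature above.
TASK_FIELDS = frozenset({
    "name", "type", "title", "description", "path", "scheme", "format",
    "encoding", "mediatype", "compression", "extrapaths", "innerpath",
    "dialect", "schema", "checklist", "resource", "package",
})


def make_task(**fields: Any) -> Dict[str, Any]:
    """Return a task descriptor, rejecting names outside TASK_FIELDS."""
    unknown = sorted(set(fields) - TASK_FIELDS)
    if unknown:
        raise ValueError(f"Unknown task fields: {', '.join(unknown)}")
    # Drop unset (None) values so the descriptor stays minimal.
    return {key: value for key, value in fields.items() if value is not None}


task = make_task(path="capital-valid.csv", format="csv", encoding="utf-8", scheme=None)
print(task)
```

A list of such dictionaries can be passed as the `tasks` value of an inquiry descriptor, in the same way as the `{'path': ...}` entries in the example at the top of this page.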