By Shashi Gharti on 2022-09-07
We are happy to announce the GitHub plugin, which makes sharing data between Frictionless and GitHub easier without any extra work or configuration. All of the plugin's functionality is built on the PyGithub library. The main idea is to make the interaction between the framework and GitHub seamless, using read and write functions developed on top of the Frictionless Python library. Here is a short introduction with examples of the features.
Reading a package from a GitHub repository is made easy! The existing Package class can identify a GitHub URL and read the packages and resources from the repo. It can read packages from repos with or without package descriptors. If a package descriptor is not defined, it will create one from the resources that it finds in the repo.
from frictionless import Package
package = Package("https://github.com/fdtester/test-repo-with-datapackage-json")
print(package)
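To make the fallback described above more concrete, here is a small standalone sketch of the idea. It is not the plugin's actual code: the function name `descriptor_from_listing`, the check for a file called `datapackage.json`, and the set of data file extensions are all assumptions made for illustration, and the file listings are hard-coded rather than fetched from GitHub.

```python
import posixpath

# Assumption: extensions treated as data resources in this sketch only.
DATA_EXTENSIONS = {".csv", ".tsv", ".xlsx"}


def descriptor_from_listing(paths):
    """Return ("existing", path) if a datapackage.json is listed,
    otherwise ("generated", descriptor) built from the data files found."""
    for path in paths:
        if posixpath.basename(path) == "datapackage.json":
            return "existing", path

    resources = []
    for path in sorted(paths):
        stem, ext = posixpath.splitext(posixpath.basename(path))
        if ext.lower() in DATA_EXTENSIONS:
            # One resource per data file, named after the file stem.
            resources.append({"name": stem.lower(), "path": path})
    return "generated", {"resources": resources}


# Hypothetical repository listings, used only to exercise the sketch.
with_descriptor = ["README.md", "datapackage.json", "data/cities.csv"]
without_descriptor = ["README.md", "data/cities.csv", "data/countries.csv"]

print(descriptor_from_listing(with_descriptor))
print(descriptor_from_listing(without_descriptor))
```

In the first listing the existing descriptor wins. In the second, a descriptor is generated with one resource per data file, and the README is ignored.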
Writing and publishing can be done easily by passing the repository link to the publish function.
from frictionless import Package, portals
apikey = 'YOUR-GITHUB-API-KEY'
package = Package('data/datapackage.json')
response = package.publish(
    "https://github.com/fdtester/test-repo-write",
    control=portals.GithubControl(apikey=apikey),
)
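Since the publish call needs an API key, it may be safer to keep the key out of the source file. Below is a minimal sketch of one way to do that with the standard library. The helper name `load_apikey` and the `GITHUB_TOKEN` variable name are assumptions for illustration, not part of the plugin.

```python
import os


def load_apikey(env_var="GITHUB_TOKEN"):
    """Read a GitHub API key from an environment variable.

    Both the helper and the default variable name are hypothetical.
    """
    apikey = os.environ.get(env_var, "").strip()
    if not apikey:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return apikey


# Usage with the publish example from this post (left commented out because
# it needs frictionless installed and a real key):
# apikey = load_apikey()
# control = portals.GithubControl(apikey=apikey)
```

Failing early with a clear message is a design choice here; it avoids sending a request with an empty key.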
A catalog can be created from a single repository by using 'search' queries. Repositories can be searched using any combination of search text and GitHub qualifiers. A simple example of creating a catalog from a search is as follows:
from frictionless import Catalog, portals
catalog = Catalog(
    control=portals.GithubControl(search="user:fdtester", per_page=1, page=1),
)
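Because a search combines free text with GitHub qualifiers such as `user:fdtester`, a small helper can keep longer queries readable. The sketch below only builds the query string; `build_search_query` is a hypothetical helper, not part of Frictionless or PyGithub.

```python
def build_search_query(text="", **qualifiers):
    """Join free text and key:value qualifiers into one search string."""
    text = text.strip()
    parts = [text] if text else []
    # Qualifiers keep the order in which they were passed.
    parts += [f"{key}:{value}" for key, value in qualifiers.items()]
    return " ".join(parts)


print(build_search_query(user="fdtester"))
# user:fdtester
print(build_search_query("datapackage", user="fdtester", language="python"))
# datapackage user:fdtester language:python
```

The resulting string could then be passed as the `search` argument of `GithubControl`, as in the catalog example above.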
We will have more updates in the future and would love to hear from you about this new feature. Let's chat in our Slack if you have questions or just want to say hi.
Read the GitHub Plugin Docs for more information.