cubicweb-datacat #5160284 Harvest a fully DCAT compliant catalog [validation pending]

Given a URI referencing the RDF data of a DCAT catalog, import the required attributes into the local database.

A DCAT compliant catalog is described by the following quotation:

"A data catalog conforms to DCAT if:

  • It is organized into datasets and distributions.
  • An RDF description of the catalog itself and its datasets and distributions is available (but the choice of RDF syntaxes, access protocols, and access policies is not mandated by this specification).
  • The contents of all metadata fields that are held in the catalog, and that contain data about the catalog itself and its dataset and distributions, are included in this RDF description, expressed using the appropriate classes and properties from DCAT, except where no such class or property exists.
  • All classes and properties defined in DCAT are used in a way consistent with the semantics declared in this specification.
  • DCAT-compliant catalogs MAY include additional non-DCAT metadata fields and additional RDF data in the catalog's RDF description."
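
The harvesting step described above can be illustrated with a minimal sketch. It assumes the RDF data behind the URI has already been fetched and parsed into (subject, predicate, object) triples by an RDF library, and it only shows how resources could be grouped by DCAT class before import. The function name extract_entities, the ETYPES mapping and the sample triples are hypothetical; plain tuples stand in for the ExtEntity objects mentioned in the patches, and this is not the cube's actual implementation.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"

# Assumed mapping from DCAT classes to local entity type names
# (the names are taken from the patch titles, the mapping itself is a guess).
ETYPES = {
    DCAT + "Catalog": "DataCatalog",
    DCAT + "Dataset": "Dataset",
    DCAT + "Distribution": "Distribution",
}


def extract_entities(triples):
    """Group triples by subject and return (etype, uri, values) tuples.

    Only subjects typed with a known DCAT class are kept; `values` maps
    each predicate URI (rdf:type excluded) to the set of its objects.
    """
    by_subject = defaultdict(lambda: defaultdict(set))
    for subject, predicate, obj in triples:
        by_subject[subject][predicate].add(obj)

    entities = []
    for uri, values in by_subject.items():
        for rdf_class in sorted(values.get(RDF_TYPE, ())):
            etype = ETYPES.get(rdf_class)
            if etype is not None:
                attrs = {p: objs for p, objs in values.items() if p != RDF_TYPE}
                entities.append((etype, uri, attrs))
    return entities


# Hypothetical sample data: one catalog, one dataset, one distribution.
SAMPLE_TRIPLES = [
    ("http://example.org/catalog", RDF_TYPE, DCAT + "Catalog"),
    ("http://example.org/catalog", DCT + "title", "Example catalog"),
    ("http://example.org/catalog", DCAT + "dataset", "http://example.org/dataset/1"),
    ("http://example.org/dataset/1", RDF_TYPE, DCAT + "Dataset"),
    ("http://example.org/dataset/1", DCT + "title", "Example dataset"),
    ("http://example.org/dataset/1", DCAT + "distribution", "http://example.org/dataset/1/csv"),
    ("http://example.org/dataset/1/csv", RDF_TYPE, DCAT + "Distribution"),
    ("http://example.org/dataset/1/csv", DCT + "title", "CSV export"),
]
```

Calling extract_entities(SAMPLE_TRIPLES) yields one tuple per catalog, dataset and distribution, which matches the first conformance criterion (a catalog organized into datasets and distributions). Resources without a known DCAT type are skipped, on the grounds that a compliant catalog may carry additional non-DCAT data.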
done in: 0.3.0
load left: 0.000
closed by: #fc29e52aaa5b Even more robust RDF DCAT ext entities importer.
patch:
  • Refactor tests of dataimport to make them clearer. [rejected]
  • Even more robust RDF DCAT ext entities importer. [applied]
  • Add tools to clean and filter ext entities according to a schema. [applied]
  • Add tools to derive object types from predicate ranges in RDF graph. [applied]
  • Refactor dataimport tests. [applied]
  • [yams] Extend base converters and checkers to handle datetime. [applied]
  • More robust RDF DCAT ext entities importer. [applied]
  • Refactor tests of RDF DCAT import to make them clearer. [applied]
  • Catch errors while parsing incorrect URIs in RDF. [applied]
  • Make import_log parameter mandatory in import function. [applied]
  • Keep all dataimport code into one module. [applied]
  • Extract rdf mapping declaration into its own function. [applied]
  • Rename function. [applied]
  • [schema] Invert requirements on two Dataset attributes. [applied]
  • _ Check fetched ext entities attributes. [folded]
  • Add test: check number of fetched ExtEntities. [applied]
  • Add a DCAT feed parser which import a DCAT catalog. [applied]
  • Add in schema: DataCatalog entity & relation to Dataset. [applied]
  • Add opt props to Dataset & Distribution entities. [applied]
  • Fix, rename & inline rel between Dataset & Distribution. [applied]
  • Add support for some literal properties of DCAT Catalog ExtEntity. [applied]
  • Add support for Distribution ExtEntity & literal properties. [applied]
  • Add support for Dataset ExtEntity & its literal properties. [applied]
  • Add function to harvest DCAT catalog. [applied]
  • Add sample data for testing. [applied]
  • Add Catalog entity to schema. [rejected]
  • Rename module dcat_harvester -> dataimport. [rejected]
  • Add a DCAT feed parser which import a DCAT catalog. [rejected]
  • [test] Improve test coverage. [folded]
  • [dcat-harv] Add support for Distribution ExtEntity and literal properties. [folded]
  • [dcat-harv] Add support for CatalogRecord ExtEntity and literal properties. [rejected]
  • [test-data] Add sample catalog records. [rejected]
  • [test-data] Fix catalog reference in each dataset graph. [folded]
  • [dcat_harv] Add support for Dataset ExtEntity & literal properties. [folded]
  • [dcat_harv] Rename catgraph -> graph. [folded]
  • [test] Better handle number of found ExtEntities. [folded]
  • [test] Improve access to data file using tc.datapath() method. [folded]