To enable interoperability of genome annotations, we have developed the Genome Biology Ontology Language (GBOL) and an associated stack (the GBOL stack). GBOL is provenance-centered and provides a consistent representation of genome-derived automated predictions, linked to both the dataset-wise and element-wise provenance of the predicted elements. GBOL is modular in design, extendable, and integrated with existing ontologies. Interoperability of linked data can only be guaranteed by tools that continuously validate the generated linked data. The GBOL stack enforces consistency within and between the OWL and ShEx definitions. Genome-wide, large-scale functional analyses can then be performed using SPARQL queries. Additionally, modules have been developed to serialize the linked data (RDF) and to generate plain-text files with integrated support for data provenance that mimic the indentation structure of the GenBank and EMBL formats.
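
To make the idea of element-wise provenance concrete, the following is a minimal sketch in plain Python. It is not the GBOL API and does not use SPARQL itself: it stands in for a SPARQL-style triple pattern join with a simple in-memory matcher. All identifiers (the `ex:` prefix, `ex:function`, `ex:provenance`, `ex:predictedBy`, `ex:score`, the tool names and the scores) are hypothetical and chosen only for illustration; they are not actual GBOL terms or real results.

```python
# Toy triple set. Every identifier below is a hypothetical placeholder,
# NOT an actual GBOL class or property.
TRIPLES = [
    ("ex:gene1", "ex:function", "kinase"),
    ("ex:gene1", "ex:provenance", "ex:prov1"),
    ("ex:prov1", "ex:predictedBy", "ex:ToolA"),
    ("ex:prov1", "ex:score", 0.97),
    ("ex:gene2", "ex:function", "kinase"),
    ("ex:gene2", "ex:provenance", "ex:prov2"),
    ("ex:prov2", "ex:predictedBy", "ex:ToolB"),
    ("ex:prov2", "ex:score", 0.41),
]


def match(triples, s=None, p=None, o=None):
    """Yield triples matching a pattern; None acts as a wildcard."""
    for t in triples:
        if ((s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)):
            yield t


def genes_with_function(triples, function, min_score):
    """Return sorted (gene, tool, score) tuples for elements annotated with
    `function` whose linked provenance score is at least `min_score`."""
    results = []
    for gene, _, _ in match(triples, p="ex:function", o=function):
        # Follow the link from the predicted element to its provenance node.
        for _, _, prov in match(triples, s=gene, p="ex:provenance"):
            tools = [o for _, _, o in match(triples, s=prov, p="ex:predictedBy")]
            scores = [o for _, _, o in match(triples, s=prov, p="ex:score")]
            for tool in tools:
                for score in scores:
                    if score >= min_score:
                        results.append((gene, tool, score))
    return sorted(results)


if __name__ == "__main__":
    print(genes_with_function(TRIPLES, "kinase", 0.9))
```

Because each predicted element carries a link to its own provenance node, a query can filter on how a prediction was made (here, a hypothetical score threshold) as well as on what was predicted. In a real setting the same join would be written as a SPARQL query against the RDF data.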

Citing GBOL

Interoperable genome annotation with GBOL, an extendable infrastructure for functional data mining

Jesse C.J. van Dam, Jasper J. Koehorst, Jon Olav Vik, Peter J. Schaap, Maria Suarez-Diez

bioRxiv 184747; doi: