Python reader for Linked Data in HDF5 files

Overview

h5ld: HDF5 Linked Data

Linked Data is becoming more popular for user-created metadata in HDF5 files. This Python package provides readers for HDF5-based formats that carry such metadata. The entire linked data content is read in one operation and made available as an rdflib graph object.

Currently supported:

Installation

pip install git+https://github.com/HDFGroup/h5ld@{LABEL}

where {LABEL} is either master or a tag label.

Requirements:

  • Python >= 3.7
  • h5py >= 3.3.0
  • rdflib >= 5.0.0

License

This software is open source. See the license file in the repository for details.

Quick Start

This package can be used either as a command-line tool or programmatically. On the command line, the package dumps the linked data of an input HDF5 file into any of several popular RDF formats supported by the rdflib package. For example:

python -m h5ld -f json-ld -o output.json INPUT.h5

will dump the input file's RDF data to the file output.json in JSON-LD format. Omitting the output file prints the same content to stdout so it can be piped into another command-line tool. A full description is available from:

python -m h5ld --help
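
Because omitting -o writes the serialization to stdout, the output can be piped straight into another tool. A small sketch reusing the json-ld format shown above (python -m json.tool is simply the standard-library JSON pretty-printer):

python -m h5ld -f json-ld INPUT.h5 | python -m json.tool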

There is also a programmatic interface for integration into Python applications. Each h5ld reader provides the following methods and attributes; a combined end-to-end sketch follows the list.

  • File format name.

    print(f"Input file format is: {reader.name}")
  • Short name (usually an acronym) of the file format.

    print(f"File format acronym: {reader.short_name}")
  • Check if the reader is the right choice for the input file.

    with h5py.File("input.h5", mode="r") as f:
        if reader.verify_format(f):
            pass  # Do something...
        else:
            print("Sorry, but this is not the right h5ld reader.")
  • Check if there is linked data content in the input HDF5 file. Optionally, print an appropriate description of the data.

    with h5py.File("input.h5", mode="r") as f:
        reader.check_ld(f, report=True)
  • Read linked data and export it to a destination in the requested RDF format.

    with h5py.File("input.h5", mode="r") as f:
        reader(f).dump_ld("output.json", format="json-ld")
  • Read linked data and return either an rdflib.Graph or rdflib.ConjunctiveGraph object.

    with h5py.File("input.h5", mode="r") as f:
        graph = reader(f).get_ld()
  • A Python dictionary with the reader's namespace prefixes and their IRIs.

    with h5py.File("input.h5", mode="r") as f:
        rdr = reader(f)
        namespaces = rdr.namespaces
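
Putting these pieces together, a minimal end-to-end sketch could look as follows; SomeReader is a hypothetical placeholder, so import whichever h5ld reader class matches your file format.

    import h5py

    from h5ld import SomeReader as reader  # hypothetical name; substitute the reader for your format

    with h5py.File("input.h5", mode="r") as f:
        # Make sure this reader understands the file before doing any work.
        if not reader.verify_format(f):
            raise SystemExit("Not a file this h5ld reader can handle.")

        # Optionally print a short report of the file's linked data content.
        reader.check_ld(f, report=True)

        # Read all linked data into an rdflib graph and work with it in memory...
        rdr = reader(f)
        graph = rdr.get_ld()
        print(f"{reader.name}: {len(graph)} triples")
        print(f"Namespace prefixes: {list(rdr.namespaces)}")

        # ...or export it straight to a file in one call.
        rdr.dump_ld("output.json", format="json-ld")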