Module hdf5

The module hdf5 defines functions to save and validate Photon-HDF5 files. The two main functions in this module are save_photon_hdf5() and assert_valid_photon_hdf5().

This module also provides functions to save a free-form dict to HDF5 (dict_to_group()) and to read an HDF5 group into a dict (dict_from_group()). Finally, there are utility functions to easily print HDF5 nodes and attributes (print_children(), print_attrs()).

For more info see: Writing Photon-HDF5 files.

List of functions

Main functions to save and validate Photon-HDF5 files.

phconvert.hdf5.save_photon_hdf5(data_dict, h5_fname=None, h5file=None, user_descr=None, overwrite=False, compression={'complevel': 6, 'complib': 'zlib'}, close=True, validate=True, warnings=True, skip_measurement_specs=False, require_setup=True, debug=False)

Saves the dict data_dict in the Photon-HDF5 format.

The data to be saved must be passed as the data_dict argument and must have the hierarchical structure of a Photon-HDF5 file. For this purpose, a standard Python dictionary is used: each key is a Photon-HDF5 field name and each value contains either data (e.g. an array, a string, etc.) or another dictionary (which then represents an HDF5 sub-group). Sub-dictionaries, in turn, contain data or other dictionaries, as needed to represent the hierarchy of Photon-HDF5 files.

Features of this function:

  • Checks that all field names are valid Photon-HDF5 field names.
  • Checks that all field types match the Photon-HDF5 specs (scalar, array, or string).
  • Automatically populates the identity group with the file name, software, version and file creation date.
  • Automatically populates the provenance group with info on the original data file (if it can be found on disk): creation and modification dates, path.
  • Computes field acquisition_duration when not provided (single-spot data only).

Minimal fields required to create a Photon-HDF5 file:

  • /description (string)
  • /photon_data/timestamps (array)
  • /photon_data/timestamps_specs/timestamps_unit (scalar float)
  • /setup/num_pixels (int): number of detectors
  • /setup/num_spots (int): number of excitation/detection spots
  • /setup/num_spectral_ch (int): number of detection spectral bands
  • /setup/num_polarization_ch (int): number of detected polarization states
  • /setup/num_split_ch (int): number of beam split channels
  • /setup/modulated_excitation (bool): True if there is any form of intensity or polarization modulation or interleaved excitation (PIE or nsALEX). This field has become obsolete in version 0.5 and maintained only for compatibility.
  • /setup/excitation_alternated (array of bool): New in version 0.5. Values are True if the respective excitation source is intensity-modulated. In us-ALEX both sources are alternated, while in PAX measurements only one source is alternated.
  • /setup/lifetime (bool): True if dataset contains TCSPC data.

See also Writing Photon-HDF5 files.

As a side effect, data_dict is modified by adding the key ‘_data_file’, containing a reference to the pytables file.
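As a minimal sketch, the field list above translates into a nested dict like the following. All values are illustrative, and the save call itself is shown commented out since it requires phconvert to be installed:

```python
import numpy as np

# Minimal Photon-HDF5 structure as a nested dict.
# Field names are from the list above; all values are illustrative.
data = {
    'description': 'A minimal single-spot dataset (illustrative values)',
    'photon_data': {
        'timestamps': np.array([100, 250, 700, 1200], dtype='int64'),
        'timestamps_specs': {'timestamps_unit': 10e-9},  # 10 ns clock period
    },
    'setup': {
        'num_pixels': 1,
        'num_spots': 1,
        'num_spectral_ch': 1,
        'num_polarization_ch': 1,
        'num_split_ch': 1,
        'modulated_excitation': False,   # obsolete since 0.5, kept for compatibility
        'excitation_alternated': np.array([False]),
        'lifetime': False,               # no TCSPC data
    },
}

# With phconvert installed, saving is a single call:
#     import phconvert as phc
#     phc.hdf5.save_photon_hdf5(data, h5_fname='minimal.hdf5', overwrite=True)
```

Each nested dict (photon_data, setup, timestamps_specs) becomes an HDF5 sub-group in the saved file.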

Parameters:
  • data_dict (dict) – the dictionary containing the photon data. The keys must be strings matching valid Photon-HDF5 paths. The values must be scalars, arrays, strings or other dicts.
  • h5_fname (string or None) – file name for the output Photon-HDF5 file. If None and h5file is also None, the file name is taken from data_dict['_filename'] with extension changed to ‘.hdf5’.
  • h5file (pytables.File or None) – an already open and writable HDF5 file to use as container. This argument can be used to complete an HDF5 file already containing some arrays, or to update an already existing Photon-HDF5 file in-place. For more info see note below.
  • user_descr (dict or None) – dictionary of descriptions (strings) for user-defined fields. The keys must be strings representing the full HDF5 path of each field. The values must be binary (i.e. encoded) strings restricted to the ASCII set.
  • overwrite (bool) – if True, a pre-existing HDF5 file with the same name is overwritten. If False, save the new file by adding the suffix “_new_copy” (and if a “_new_copy” file is already present, overwrite it).
  • compression (dict) – a dictionary containing the compression type and level. Passed to pytables tables.Filters().
  • close (bool) – If True (default) the HDF5 file is closed before returning. If False the file is left open.
  • validate (bool) – if True, after saving perform a validation step raising an error if the specs are not followed.
  • warnings (bool) – if True, print warnings for important optional fields that are missing. If False, don’t print warnings.
  • skip_measurement_specs (bool) – if True don’t print any warning for missing measurement_specs group.
  • require_setup (bool) – if True, raises an error if some mandatory fields in /setup are missing. If False, allows missing setup fields (or missing setup altogether). Use False when saving only detectors’ dark counts.
  • debug (bool) – if True prints additional debug information.

For description and specs of the Photon-HDF5 format see: http://photon-hdf5.readthedocs.org/

Note

The argument h5file accepts an already open HDF5 file for storage. This allows completing a partially written file (for example containing only photon_data arrays) or correcting an already complete Photon-HDF5 file. When using h5file, you still need to pass the full data_dict structure as usual. If you don’t want to update an array, put in data_dict a reference to the existing pytables array (instead of a numpy array). Fields containing numpy arrays will be overwritten. Fields containing pytables Arrays (including CArray or EArray) will be left unmodified. In either case, the TITLE attribute is always updated.

phconvert.hdf5.assert_valid_photon_hdf5(datafile, warnings=True, verbose=False, strict_description=True, require_setup=True, skip_measurement_specs=False)

Asserts that datafile follows the Photon-HDF5 specs.

If the input datafile does not follow the specifications, it raises the Invalid_PhotonHDF5 exception, with a message indicating the cause of the error.

This function checks that:

  • all fields are valid Photon-HDF5 names
  • all fields have valid descriptions
  • all mandatory fields are present
  • if /setup/lifetime is True (i.e. 1), checks that nanotimes and nanotimes_specs are present

Parameters:
  • datafile (string or tables.File) – input data file to be validated
  • warnings (bool) – if True, print warnings for important optional fields that are missing. If False, don’t print warnings.
  • verbose (bool) – if True print details about the performed tests.
  • strict_description (bool) – if True consider a non-conforming description (TITLE) a specs violation.
  • require_setup (bool) – if True, raises an error if some mandatory fields in /setup are missing. If False, allows missing setup fields (or missing setup altogether).
  • skip_measurement_specs (bool) – if True don’t print any warning for missing measurement_specs group.

Utility functions

Utility functions to work with HDF5 files in pytables.

phconvert.hdf5.print_children(group)

Print all the sub-groups and leaf-nodes that are children of group.

Parameters:
  • group (pytables group) – the group to be printed.

phconvert.hdf5.print_attrs(node, which='user')

Print the HDF5 attributes of node.

Parameters:
  • node (pytables node) – node whose attributes will be printed. Can be either a group or a leaf-node.
  • which (string) – Valid values are ‘user’ for user-defined attributes, ‘sys’ for pytables-specific attributes and ‘all’ to print both groups of attributes. Default ‘user’.

phconvert.hdf5.dict_from_group(group, read=True)

Return a dict with the content of a PyTables group.

phconvert.hdf5.dict_to_group(group, dictionary)

Save the contents of dictionary into the HDF5 group group (nested dicts become sub-groups).
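A round trip through these two helpers (plus print_children()) might look like this. This is a sketch assuming pytables and phconvert are available; the imports are guarded so the snippet does nothing otherwise, and the file is created in a temporary directory:

```python
import os
import tempfile

try:
    import tables
    import phconvert as phc
except ImportError:
    tables = phc = None  # libraries not installed: nothing to demonstrate

if tables is not None:
    path = os.path.join(tempfile.mkdtemp(), 'scratch.hdf5')
    with tables.open_file(path, mode='w') as h5file:
        params = h5file.create_group('/', 'params')
        # Save a free-form dict (nested dicts become sub-groups):
        phc.hdf5.dict_to_group(params, {'gain': 1.5, 'offsets': [0, 2, 4]})
        # Inspect the resulting structure:
        phc.hdf5.print_children(h5file.root)
        # Read it back into a plain dict:
        print(phc.hdf5.dict_from_group(params))
```

The dict returned by dict_from_group() holds the data read from each leaf node under params.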