Version 1 (modified by zequi, 4 years ago)

ZFS in ESGF publication

Institutions store their datasets in different formats according to their own needs. Publishing to a project such as CORDEX, however, requires datasets to follow a common format. Here we present a use case of ZFS for preparing data for publication.

Background

Suppose that we have a ZFS file system like this:

someUser@someHost# zfs list -r tank/test
NAME                       USED  AVAIL  REFER  MOUNTPOINT
tank/test                  104M  66.2G    23K  /tank/test
tank/test/datasetA         104M  66.2G   104M  /tank/test/datasetA

Imagine that /tank/test/datasetA contains several .nc files whose metadata, for legacy reasons, differs from the metadata that CORDEX requires, so the files must be modified before they can be published in ESGF. How can we efficiently maintain two versions of the dataset?
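
Before modifying anything, it can help to see which global attributes the files currently carry. The following is a minimal sketch, assuming the ncdump utility from the NetCDF tools is installed; the function name show_global_attrs is hypothetical and used only for illustration.

```shell
# Print the global attributes of every .nc file in a directory.
# Assumes ncdump (NetCDF utilities) is on the PATH.
show_global_attrs() {
    dir="$1"
    for f in "$dir"/*.nc; do
        # Skip the unexpanded glob when the directory has no .nc files.
        [ -e "$f" ] || continue
        echo "== $f"
        # -h prints the header only; keep the part from the
        # "// global attributes:" marker to the end of the header.
        ncdump -h "$f" | sed -n '/global attributes:/,$p'
    done
}

show_global_attrs /tank/test/datasetA
```

Comparing this output with the attributes that CORDEX requires shows which files need to be changed.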

ZFS snapshots and clones