The Official Data Catalog

Image credit: https://undatacatalog.org

The DUNE Data Catalog, which started in late 2015 for use with the 35 ton prototype and later expanded to index Monte Carlo samples, got a makeover last summer.

Makeover by NOvA web tools. Because DUNE’s worth it.

DUNE uses SAM (Sequential Access via Metadata) as its data cataloguing tool. SAM maintains metadata and location information for data files. The DUNE data catalog lists the datasets that are defined in SAM, describes what these datasets contain, and offers tips on how to access the data. It also provides some convenient web-based SAM queries.
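
To make the idea concrete, here is a minimal toy sketch in Python of a metadata catalog in this spirit: each file carries metadata and a list of storage locations, and a dataset is a named selection on that metadata. This is not SAM's actual API. The class ToyCatalog, its method names, the metadata keys, and the file names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """One catalogued file: its name, metadata, and where copies live."""
    name: str
    metadata: dict
    locations: list = field(default_factory=list)


class ToyCatalog:
    """Hypothetical stand-in for a metadata catalog (not the SAM API)."""

    def __init__(self):
        self.files = {}        # file name -> FileRecord
        self.definitions = {}  # dataset name -> metadata criteria

    def declare_file(self, name, metadata, locations):
        # Register a file together with its metadata and storage locations.
        self.files[name] = FileRecord(name, dict(metadata), list(locations))

    def define_dataset(self, defname, **criteria):
        # A dataset is stored as a query, not as a fixed list of files.
        self.definitions[defname] = dict(criteria)

    def list_files(self, defname):
        # Evaluate the stored query against the files known right now.
        criteria = self.definitions[defname]
        return sorted(
            rec.name
            for rec in self.files.values()
            if all(rec.metadata.get(k) == v for k, v in criteria.items())
        )

    def locate_file(self, name):
        # Return the storage locations recorded for one file.
        return list(self.files[name].locations)


# Hypothetical example entries; names and metadata values are made up.
catalog = ToyCatalog()
catalog.declare_file(
    "mc_0001.root", {"campaign": "mcc9", "data_tier": "reco"}, ["enstore", "eos"]
)
catalog.declare_file(
    "mc_0002.root", {"campaign": "mcc9", "data_tier": "reco"}, ["enstore"]
)
catalog.declare_file(
    "raw_0001.root", {"campaign": "test", "data_tier": "raw"}, ["tape"]
)
catalog.define_dataset("mcc9_reco", campaign="mcc9", data_tier="reco")

mcc9_files = catalog.list_files("mcc9_reco")
```

Because the toy dataset is stored as a selection rather than a frozen file list, any file declared later with matching metadata shows up in it automatically. That is a design choice of this sketch, not a statement about how any particular SAM dataset behaves.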

Datasets are stored both at Fermilab in Enstore and at CERN in EOS.

The recently completed Monte Carlo Challenge 9 (MCC9) datasets are now available in both locations from the Data Catalog website. MCC7 and MCC8 Far Detector Monte Carlo samples are there, as well as raw data, stored on tape, from the ongoing 3x1x1 test of WA105.

Adam Aurisano and Steve Timm are the data catalog maintainers, and Anna Mazzacane and Ivan Furic are the current production leads.

Each MC challenge involves completing an entire workflow, from the generation of physics events through reconstruction, and producing data objects in a format ready for analysis, explained Mazzacane. MCC9 simulates interactions in the ProtoDUNE-SP detector, placed in the CERN test beam, for beams of both polarities at a variety of beam energies, with and without the space charge effect. Particles from numerous beam-target interactions are overlaid in the beam interface to LArSoft to achieve the nominal trigger rate. The beam particles are also overlaid with cosmics to study the effect of the cosmics on track reconstruction.

For each dataset, the Data Catalog website provides the status, the beam configuration, the software release used, and the time window associated with the data. In most cases, datasets from each phase of production are identified.

For information on how to access the datasets and on other DUNE computing topics, see the Computing How-To Documentation, linked under How To… from DUNE at Work.

The data catalog site will evolve to meet DUNE’s needs, especially those of the two ProtoDUNE detectors and the Monte Carlo needed for the Technical Design Report.

“We expect the data catalog site to be primarily a list of datasets and a front end to SAM. We hope to turn it into a source of helpful documentation for users wanting to access DUNE data,” said Tom Junk.

Because you’re worth it.