Third meeting (09102023)

File structure

Auger files

For now, files from Auger are split by run and subrun (e.g. td002015_f0001.root, td002015_f0002.root, etc.). The run number (e.g. 2015) is the same in each file, but the event number is reset for each subrun (i.e. it starts at 1 in each file). Merging all the files of a run into a single directory is tricky, because the events then need to be renumbered. As run_number and event_number come from the hardware, we should ask the hardware team to correct this. Moreover, the DAQ's file-splitting rules are not clear (some files have 16 events, others 9000). More homogeneous files would be welcome. We therefore ask the hardware team to (if possible):
- Split files on an event basis (e.g. every 1000 or 2000 events)
- Change run_number at each split, or do not reset the event number at each subrun
- Add site information to the binary blobs
- Create a checksum of each created file to verify transfer quality
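
Until the hardware side changes, the renumbering problem described above could be handled in software. The following is only a minimal sketch: the file name pattern is inferred from the two example names, and the function names (parse_name, event_offsets) are hypothetical. It assumes the number of events per file is already known and that local event numbers start at 1.

```python
import re
from typing import NamedTuple

# Pattern inferred from the examples td002015_f0001.root, td002015_f0002.root.
# Treating all digits after "td" as the run number is an assumption.
FILENAME_RE = re.compile(r"^td(?P<run>\d+)_f(?P<subrun>\d+)\.root$")


class SubrunFile(NamedTuple):
    run: int
    subrun: int
    name: str


def parse_name(name: str) -> SubrunFile:
    """Extract run and subrun numbers from a file name."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name}")
    return SubrunFile(int(match["run"]), int(match["subrun"]), name)


def event_offsets(event_counts: dict[str, int]) -> dict[str, int]:
    """Return, for each file, the offset to add to its local event numbers
    so that event numbers become unique within a run."""
    offsets: dict[str, int] = {}
    next_offset: dict[int, int] = {}  # running total of events, per run
    # Tuples sort by run first, then subrun, so files are visited in order.
    for entry in sorted(parse_name(name) for name in event_counts):
        offsets[entry.name] = next_offset.get(entry.run, 0)
        next_offset[entry.run] = offsets[entry.name] + event_counts[entry.name]
    return offsets


counts = {"td002015_f0002.root": 9000, "td002015_f0001.root": 16}
offsets = event_offsets(counts)
```

With these counts, local event 1 of td002015_f0002.root would become event 17 of run 2015 (offset 16 plus local number 1).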

These requests should also be part of the discussion to unify the DAQ code (NL vs CN version).
Olivier and Lech will discuss that.
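
The checksum request from the list above could look like the following sketch. The choice of SHA-256, the manifest name SHA256SUMS and the function names are assumptions, not something decided in the meeting. The manifest uses the "hash, two spaces, file name" layout, which the sha256sum command-line tool can also check.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large files do not fill the memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(directory: Path, manifest_name: str = "SHA256SUMS") -> Path:
    """Write one 'hash  name' line per .root file found in the directory."""
    lines = [f"{sha256_of(p)}  {p.name}" for p in sorted(directory.glob("*.root"))]
    manifest = directory / manifest_name
    manifest.write_text("\n".join(lines) + "\n")
    return manifest


def verify_manifest(directory: Path, manifest_name: str = "SHA256SUMS") -> list[str]:
    """Return the names of files that are missing or whose hash differs."""
    failed = []
    for line in (directory / manifest_name).read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines (e.g. an empty manifest)
        expected, name = line.split(maxsplit=1)
        target = directory / name
        if not target.is_file() or sha256_of(target) != expected:
            failed.append(name)
    return failed
```

The manifest would be written on the sending side and checked with verify_manifest after the transfer; an empty list means every listed file arrived intact.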

Tree file grouping

Data will be stored in directories. As far as possible, all Trun trees will go into a single file (this is not possible for simulations, where an additional trunvoltage file will be created later).
Event trees will be stored separately in different files, so that users can retrieve only the files they are interested in.
This policy may evolve in the future, depending on usage.

Versioning

For now, the Python code and the ROOT files created by root_trees.py will be versioned using the local git hook mechanism proposed by Francois. An additional tag will be created at each push to the repository.
Aligning the version numbers of root_trees.py and gtot is not straightforward. Lech will try to keep them compatible and to use the same version number.
Lech will also see whether he can store the version number as metadata at file level (not tree level).
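
As an illustration of what "compatible" version numbers could mean in practice, here is a minimal sketch. It assumes dotted numeric versions (optionally prefixed with "v") and treats two versions as compatible when their first two components match; both the version scheme and the rule are assumptions, and the function names are hypothetical.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn 'v1.2.3' or '1.2.3' into (1, 2, 3)."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def compatible(version_a: str, version_b: str, depth: int = 2) -> bool:
    """Compare only the first `depth` components (major.minor by default)."""
    return parse_version(version_a)[:depth] == parse_version(version_b)[:depth]
```

A check of this kind could run when gtot writes a file, comparing its own version with the one reported by root_trees.py.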

Having many files in different versions of the ROOT file format is likely to lead to problems. We will probably have to think about a way to keep or convert official files in a single file version (not urgent).

ROOT file format reference

For now, the reference document for the ROOT file format is this Google spreadsheet: https://docs.google.com/spreadsheets/d/1rKwZ-ReJBh_h1emLyPYtLc0gSi-fR2ULnV9RyeKD73Y/edit#gid=958343241
We need to keep it up to date and warn the collaboration (by Slack or email) whenever it is modified.
Francois will see whether this document can be transferred to an ONLYOFFICE document.
Lech would like to extract the ROOT file structure automatically from the code.
Once the ROOT file format is more stable, we could release an official document (PDF?).
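
Automatic extraction of the file structure from the code could work along the following lines, assuming the trees are declared as Python dataclasses. This is only a sketch: ExampleEventTree and its fields are hypothetical stand-ins, not the actual classes of root_trees.py, and the CSV columns are an arbitrary choice.

```python
import csv
import io
from dataclasses import dataclass, fields


@dataclass
class ExampleEventTree:
    """Hypothetical stand-in for a tree class; not the real definition."""
    run_number: int = 0
    event_number: int = 0
    site: str = ""


def describe(tree_cls) -> list[tuple[str, str, str]]:
    """List (tree, branch, type) for every field declared on the dataclass."""
    rows = []
    for f in fields(tree_cls):
        # f.type is a class here, but may be a string if annotations are deferred.
        type_name = f.type if isinstance(f.type, str) else getattr(f.type, "__name__", str(f.type))
        rows.append((tree_cls.__name__, f.name, type_name))
    return rows


def to_csv(rows: list[tuple[str, str, str]]) -> str:
    """Render the rows as CSV text, ready to paste into a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["tree", "branch", "type"])
    writer.writerows(rows)
    return buffer.getvalue()


table = to_csv(describe(ExampleEventTree))
```

Generating the table from the code would keep the reference document in step with the format, rather than relying on manual updates.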

Hosting

Ramesh asked about hosting a copy of the data in China. Pengxiong is OK with it, but more discussion is needed (Olivier and Lech should take this up during their trip to China).