Fourth meeting (27/11/2023)
Progress since last meeting
Concerning the naming of the raw files from Grand@auger, Lech started a discussion on Slack, but it has not produced any result so far. The hardware team is busy with the new firmware version, so adding information about the site (in the file name or in the binary blob) is not a priority. Olivier will raise this at the next hardware meeting.
Concerning the splitting of files into subruns, it seems reasonable to ask for time-based splitting (1 h should be a good compromise). The event number should not be reset between subruns. This should be achieved by dropping subruns and simply changing the run number (information about the run is already present in every subrun file, so this would not produce more data). Olivier will also request this at the next hardware meeting.
The ROOT file format reference is still in a Google Doc (Francois had no time to migrate it to OnlyOffice).
Naming of raw and root files and data structure
It seems very important that data from different sites share the same naming convention. We thus suggest that raw files be named following the pattern:
[site]_[date]_[time]_[run_number]_[extra].bin
and that they be transferred to CCIN2P3 into directories named
[site]/raw/[year]/[month]
The Grand Root files should then be stored in directories named following the same pattern:
[site]/GrandRoot/[year]/[month]
This will greatly ease automated handling of the data.
This suggestion tries to deal with various constraints:
- Limit the number of files/directories inside a directory. With 20 runs per day, this leads to about 600 files or directories per month, which is reasonable for a file system.
- Avoid an overly complex structure with dozens of subdirectories and keep the structure as simple as possible
- Ease automation, manual search and transfer processes
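To illustrate how the proposed convention could be used by automation scripts, here is a minimal Python sketch that builds and parses names following the patterns above. The minutes do not fix the encoding of the fields, so the date format (YYYYMMDD), the time format (HHMMSS), the plain-integer run number, the site tag "gaa" and all function names are assumptions made for this example only. The parser also assumes that the site tag itself contains no underscore.

```python
from datetime import datetime
from pathlib import PurePosixPath

# Assumed field encodings (NOT fixed by the minutes): YYYYMMDD and HHMMSS.
DATE_FMT = "%Y%m%d"
TIME_FMT = "%H%M%S"


def raw_file_name(site, start, run_number, extra):
    """Build a name following [site]_[date]_[time]_[run_number]_[extra].bin."""
    fields = [site, start.strftime(DATE_FMT), start.strftime(TIME_FMT),
              str(run_number), extra]
    return "_".join(fields) + ".bin"


def storage_dir(site, start, kind="raw"):
    """Build [site]/raw/[year]/[month] (or [site]/GrandRoot/[year]/[month])."""
    return PurePosixPath(site, kind, f"{start.year:04d}", f"{start.month:02d}")


def parse_raw_file_name(name):
    """Split a raw file name back into its fields.

    Assumes the site tag has no underscore; [extra] may contain some.
    """
    if not name.endswith(".bin"):
        raise ValueError(f"not a raw file name: {name!r}")
    stem = name[: -len(".bin")]
    site, date, time, run_number, extra = stem.split("_", 4)
    return {
        "site": site,
        "start": datetime.strptime(date + time, DATE_FMT + TIME_FMT),
        "run_number": int(run_number),
        "extra": extra,
    }


# Hypothetical example values, for illustration only.
start = datetime(2023, 11, 27, 9, 30, 0)
name = raw_file_name("gaa", start, 42, "test")
print(storage_dir("gaa", start) / name)
print(storage_dir("gaa", start, kind="GrandRoot"))
print(parse_raw_file_name(name))
```

Because the year and month are encoded in the file name itself, a transfer script can derive the target directory from the name alone, without opening the binary file.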