Sixth meeting (04102024)
Progress since last meeting
Naming conventions implemented both in China and Argentina.
Automation of transfer/conversion/referencing/archiving is done.
Implementation of level analysis in file naming almost finished (~90% done, ok in grandlib, implementation in gtot in progress).
Push usage of grandlib tools... not much progress (Marion and Xishui were very busy and Xishui is leaving) --> Maybe push this on newcomers!
Marion uses code by JM and scripts by Matias to read Root files... there is not much documentation about the Root format/branches...
There is a need for a "starter kit" to get into the code:
- Which git branch to use for what (reading data, simulation, ...)?
- Which scripts or code to use to navigate the data?
Also, there are too many branches in git and no reliable way to test functionalities (lack of tests for automatic builds, no time for testing and review before merging) ==> Merges are too slow (~1/year).
Location for simulations @ccin2p3
Simulations will go to /sps/grand/data/sims to be separated from observations.
We will start with a /GrandRoot directory for Root files.
Raw files may be packed (one tar per shower?) and pushed into a /raw directory (to be discussed by Francois and Matias).
Intermediate RawRoot files should be kept apart in /sps by the producer (to be used for reprocessing or for the generation of new libraries) but will not be moved into the "official" space.
Naming and grouping data into directories + reprocessing?
When different modes are produced with the same run number (e.g. UD, MD, CD), they will go into different directories.
If the extra field changes while the run number stays the same, the files will go into different directories. Lech will try to convince our colleagues to change the run number whenever extra changes.
When the data for a single run (and extra) are split (as in GP13) with a serial at the end of the bin file name, all Root files will go into the same directory. In that case, the date_time in the directory name will be that of the first file (beginning of the run). The files in the directory will thus contain the date_time corresponding to each file (and the serial number will be removed).
This means that the naming convention will change: files inside directories will contain date_time information.
Once the new directory format is available, all previous files will be reprocessed!
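The grouping rule described above can be sketched in a few lines of Python. This is only an illustration: the raw file pattern used here (`<site>_<date>_<time>_RUN<run>_<mode>_<extra>_<serial>.bin`) and the function name `plan_layout` are assumptions made for the example, not the official naming convention or an existing grandlib function.

```python
import re
from collections import defaultdict

# Hypothetical raw-file pattern (NOT the official convention):
#   <site>_<YYYYMMDD>_<HHMMSS>_RUN<run>_<mode>_<extra>_<serial>.bin
RAW_PATTERN = re.compile(
    r"^(?P<site>[A-Za-z0-9]+)_(?P<date>\d{8})_(?P<time>\d{6})"
    r"_RUN(?P<run>\d+)_(?P<mode>[A-Z]+)_(?P<extra>[A-Za-z0-9-]+)"
    r"_(?P<serial>\d+)\.bin$"
)


def plan_layout(raw_names):
    """Map each directory name to the list of Root file names it would hold."""
    groups = defaultdict(list)
    for name in raw_names:
        match = RAW_PATTERN.match(name)
        if match is None:
            raise ValueError(f"unrecognised file name: {name}")
        fields = match.groupdict()
        # Same run number, mode and extra -> same directory.
        key = (fields["site"], fields["run"], fields["mode"], fields["extra"])
        groups[key].append(fields)

    layout = {}
    for (site, run, mode, extra), files in groups.items():
        # The directory takes the date_time of the first file of the run.
        files.sort(key=lambda fields: (fields["date"], fields["time"]))
        first = files[0]
        directory = f"{site}_{first['date']}_{first['time']}_RUN{run}_{mode}_{extra}"
        # Each file keeps its own date_time; the serial number is dropped.
        layout[directory] = [
            f"{site}_{fields['date']}_{fields['time']}_RUN{run}_{mode}_{extra}.root"
            for fields in files
        ]
    return layout
```

With this sketch, two bin files of the same run, mode and extra end up in one directory named after the earlier file, while a file with a different mode gets its own directory.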
Fields not properly filled in data and simulations
For simulations: Francois will open a GitHub issue in sim2root.
For observations: many fields seem to be wrong or not properly defined!
Examples:
- Run mode --> int or str? What should it be?
- First event and first event time from headers are WRONG. Lech proposes to use the values directly from the events instead of the headers, but the goal is to have this correct in the raw files.
- du_id in GP13 is not du_id but du_feb! du_id should be the location of the DU in the field!
Francois will complete the inventory of fields with suspicious data with Olivier, and Lech will review it.
Also, trunvoltage needs to be built -> Lech will look at it.
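Lech's proposal to take the first event and its time from the events rather than from the headers could be sketched as follows. This is only an illustration: events are represented as plain (event_number, event_time) pairs, "first" is assumed to mean the earliest event time, and `first_event_from_events` and `header_matches_events` are hypothetical helpers, not grandlib or gtot functions.

```python
def first_event_from_events(events):
    """Return the (event_number, event_time) pair with the earliest time.

    `events` is an iterable of (event_number, event_time) pairs; this layout
    is an assumption made for the example, not the actual file structure.
    """
    events = list(events)
    if not events:
        raise ValueError("no events to inspect")
    return min(events, key=lambda event: event[1])


def header_matches_events(header_first, events):
    """Check a header's (first_event, first_event_time) against the events."""
    return tuple(header_first) == tuple(first_event_from_events(events))
```

A check of this kind could also help when building the inventory of suspicious fields, since it flags files whose header values disagree with the events.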
Fields and information to be saved into the database
Francois will send the list of data saved in the database to Matias and Lech, who will point out what is useless and can be removed.
Strategy to reduce size of files
Not easy to reduce traces to 2-byte floats.
We will gzip raw files.
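A minimal sketch of the gzip step, using only the Python standard library. The function name `gzip_file` and the choice to delete the uncompressed file by default are assumptions made for the example; the actual implementation in the transfer scripts may differ.

```python
import gzip
import shutil
from pathlib import Path


def gzip_file(path, keep_original=False):
    """Compress `path` into `path` + '.gz' and return the new path."""
    source = Path(path)
    target = source.with_name(source.name + ".gz")
    # Stream the content so that large raw files are not loaded into memory.
    with open(source, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    if not keep_original:
        source.unlink()
    return target
```

The compressed file can be read back transparently with `gzip.open(target, "rb")`, so no separate decompression step is needed before reprocessing.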
Data archiving (raw files)
Once archived into iRODS, raw files will be kept for 6 months in /sps before deletion.
No access through web to raw archives.
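The retention rule for raw files in /sps could be expressed as a small check like the one below. It is a sketch under stated assumptions: "6 months" is approximated as 183 days, `can_delete` is a hypothetical helper, and the `reprocessing_done` flag stands for the condition that all data have been reprocessed to the new directory format.

```python
from datetime import datetime, timedelta

# Assumption: "6 months" is approximated as 183 days.
RETENTION = timedelta(days=183)


def can_delete(archived_at, now, reprocessing_done=True):
    """Return True when a raw file archived at `archived_at` may leave /sps."""
    if archived_at is None:
        # Never delete a file that has not been archived yet.
        return False
    if not reprocessing_done:
        # Hypothetical flag: keep everything until reprocessing is finished.
        return False
    return now - archived_at >= RETENTION
```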
Todo
- Olivier will push newcomers to use grandlib, and Lech (and others) will try to write a very short intro of a few lines titled "How to start in grandlib?" that will be made clearly visible (GitHub README, Wiki software page, etc.).
- Francois and Matias will discuss the way to store the raw simulations into /sps/grand/data/sims
- Francois will prepare the transfer and referencing of sims in /sps/grand/data/sims/GrandRoot (but Root files must be corrected first... see next point)
- Francois will open a GitHub issue in sim2root to correct fields in simulation Root files.
- Lech will try to convince our colleagues to change the run number if extra changes.
- Francois will complete the inventory of fields with suspicious data with Olivier, and Lech will review it.
- Lech will implement generation of the trunvoltage tree.
- Francois will send the list of data saved in the database to Matias and Lech, who will point out what is useless and can be removed.
- Francois will implement gzip of raw files after conversion into GrandRoot format.
- Francois will implement retention of raw files in /sps for 6 months after archiving (but after reprocessing of all data to the new directory format).