Task #51438
Task #49925: WP 6: first level trigger (FLT)
Task #49928: WP 6.3: testbench
Test bench for DL: CNN
Added by Colley Jean-Marc over 2 years ago. Updated about 2 months ago.
90%
Description
Build a benchmark for S. Le Coz's CNN neural network as applied to TREND.
Updated by Colley Jean-Marc over 1 year ago
Information on Sandra Le Coz's CNN code¶
Reply from Sandra
"""
The data used to build the model:
/sps/grand/slecoz/MLP6_selected.bin
MLP6_transient.bin
The code to read the binary data:
https://github.com/lesandra/ML/blob/master/binreader.py
The code to run inference (it has to be adapted to your case):
https://github.com/lesandra/ML/blob/master/classpredict.py
And the model is here:
/sps/grand/slecoz/modelFFT
"""
Updated by Colley Jean-Marc over 1 year ago
At CCIN2P3¶
CPU¶
cat /proc/cpuinfo | grep 'model name'
model name : Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz
Environment¶
Conda environment with TensorFlow version 2.7
ccenv anaconda
conda activate /sps/grand/software/conda/tensor_27
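The TensorFlow messages about missing CUDA libraries in the console outputs of this task are expected on a CPU-only node. A minimal sketch, assuming the standard `TF_CPP_MIN_LOG_LEVEL` and `CUDA_VISIBLE_DEVICES` environment variables, of how to quiet them and pin the run to the CPU; this is not part of the original benchmark script:

```python
import os

# Both variables must be set BEFORE TensorFlow is imported to take effect.
# "2" filters out INFO and WARNING messages from the TensorFlow C++ backend.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"
# "-1" hides all GPUs, so the run is explicitly CPU-only.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

# TensorFlow would be imported here, after the variables are set:
# import tensorflow as tf
```

This only affects logging and device visibility; it should not change the timings reported here.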
One trace at a time¶
Description of the computation¶
The measurement includes the data preparation:
- centering and scaling
- FFT of the centered trace
and the network inference on the traces.
Not measured:
- import of the Python modules
- time to read the data file
- time to load the Keras model
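# Per-trace loop: each of the 10 traces is prepared and inferred individually
for id_t in range(10):
    # center the trace, then scale it
    nor_trace = traces[id_t] - np.mean(traces[id_t])
    nor_trace /= shared.quantization
    # FFT of the centered trace, written into the single-trace input buffer
    input_data[0, :, 0] = rfft(nor_trace)
    # inference on this one trace
    predictions = model_dl.predict(input_data)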
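To make explicit what is inside and outside the timed region, here is a minimal, self-contained sketch of such a per-trace benchmark. Everything in it is an assumption for illustration: `DummyModel` is a hypothetical stand-in for the Keras model, the traces are random, `quantization=1.0` is arbitrary, and the spectrum magnitude from `np.fft.fft` replaces the `rfft` call of the loop above, whose origin is not shown in this task.

```python
import time
import numpy as np


class DummyModel:
    """Hypothetical stand-in for the Keras model: only mimics the predict() shapes."""

    def predict(self, batch):
        # one row of two class scores per input trace
        return np.full((batch.shape[0], 2), 0.5)


def prepare(trace, quantization):
    """Center and scale one trace, then return a real spectrum of the same length."""
    centered = trace - np.mean(trace)
    centered = centered / quantization
    # stand-in for the original rfft call (assumption)
    return np.abs(np.fft.fft(centered))


def bench_one_by_one(model, traces, quantization, nb_trace=10):
    """Return the mean wall-clock time per trace, preparation plus inference."""
    input_data = np.zeros((1, traces.shape[1], 1), dtype=np.float32)
    t_start = time.perf_counter()  # imports, data and model loading are outside
    for id_t in range(nb_trace):
        input_data[0, :, 0] = prepare(traces[id_t], quantization)
        model.predict(input_data)
    return (time.perf_counter() - t_start) / nb_trace


rng = np.random.default_rng(0)
traces = rng.normal(size=(10, 1024))  # random data, not TREND traces
per_trace_s = bench_one_by_one(DummyModel(), traces, quantization=1.0)
print(f"{per_trace_s * 1e3:.3f} ms per trace")
```

With the real model, the cost of each `predict` call on a single trace is likely to dominate, which is what the one-by-one measurement is meant to expose.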
Console output¶
(/sps/grand/software/conda/tensor_27) cca008:/sps/grand/colley/bench/cnn>ipython
Python 3.9.16 (main, Mar 8 2023, 14:00:05)
Type 'copyright', 'credits' or 'license' for more information
IPython 8.11.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import classpredict_bench as cb
2023-03-21 14:26:22.788449: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /pbs/software/centos-7-x86_64/anaconda/3.6/lib:/pbs/throng/grand/soft/lib/:/pbs/software/centos-7-x86_64/xrootd/4.8.1/lib64:/pbs/software/centos-7-x86_64/oracle/12.2.0/instantclient/lib:
2023-03-21 14:26:22.788496: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
In [2]: import timeit
In [3]: m_dl=cb.keras.models.load_model('modelFFT')
2023-03-21 14:26:34.392873: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /pbs/software/centos-7-x86_64/anaconda/3.6/lib:/pbs/throng/grand/soft/lib/:/pbs/software/centos-7-x86_64/xrootd/4.8.1/lib64:/pbs/software/centos-7-x86_64/oracle/12.2.0/instantclient/lib:
2023-03-21 14:26:34.392952: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2023-03-21 14:26:34.393015: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (cca008): /proc/driver/nvidia/version does not exist
2023-03-21 14:26:34.393597: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
In [4]: d_tr = cb.read_trace_trend('MLP6_selected.bin')
In [5]: %timeit cb.bench_modelfft(m_dl, d_tr)
513 ms ± 36.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [7]: 0.513/10
Out[7]: 0.0513
Vectorized¶
Console output¶
(/sps/grand/software/conda/tensor_27) cca008:/sps/grand/colley/bench/cnn>time python ML/classpredict_bench.py $PWD/modelFFT $PWD/MLP6_selected.bin
2023-03-21 14:27:53.424889: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /pbs/software/centos-7-x86_64/anaconda/3.6/lib:/pbs/throng/grand/soft/lib/:/pbs/software/centos-7-x86_64/xrootd/4.8.1/lib64:/pbs/software/centos-7-x86_64/oracle/12.2.0/instantclient/lib:
2023-03-21 14:27:53.424936: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2023-03-21 14:27:56.734983: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /pbs/software/centos-7-x86_64/anaconda/3.6/lib:/pbs/throng/grand/soft/lib/:/pbs/software/centos-7-x86_64/xrootd/4.8.1/lib64:/pbs/software/centos-7-x86_64/oracle/12.2.0/instantclient/lib:
2023-03-21 14:27:56.735038: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2023-03-21 14:27:56.735071: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (cca008): /proc/driver/nvidia/version does not exist
2023-03-21 14:27:56.735392: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
0 1495
==================> 0/0
begin pred for 1495 traces
<class 'numpy.ndarray'>
(1495, 1024, 1)
[1.0000000e+00 1.0536264e-16]
real 0m10.665s
user 0m6.266s
sys 0m6.879s
In [1]: 6.25/1495
Out[1]: 0.004180602006688963
Description of the computation¶
The measurement includes the data preparation:
- centering and scaling
- FFT of the centered trace
- import of the modules
and the network inference on the traces.
Not measured:
- time to read the data file
- time to load the Keras model
=> we use the "user" time, which in principle counts only CPU time.
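The preparation steps listed above can be applied to all traces at once before a single inference call. A minimal sketch, under the same assumptions as any illustration here: random traces, an arbitrary `quantization=1.0`, and the spectrum magnitude from `np.fft.fft` as a stand-in for the FFT actually used; `prepare_batch` is a hypothetical helper, not a function of the benchmark script.

```python
import numpy as np


def prepare_batch(traces, quantization):
    """Center, scale and transform all traces at once; returns (n_trace, n_sample, 1)."""
    # subtract each trace's own mean (keepdims lets the mean broadcast per row)
    centered = traces - traces.mean(axis=1, keepdims=True)
    centered = centered / quantization
    # FFT along the sample axis; magnitude is a stand-in for the original transform
    spectra = np.abs(np.fft.fft(centered, axis=1))
    # add the trailing channel axis expected by the network input
    return spectra[:, :, np.newaxis].astype(np.float32)


rng = np.random.default_rng(0)
traces = rng.normal(size=(1495, 1024))  # random data, same shape as in the log
batch = prepare_batch(traces, quantization=1.0)
print(batch.shape)

# With the real model, one call would then cover every trace:
# predictions = model_dl.predict(batch)
```

The resulting array has the `(1495, 1024, 1)` shape printed in the console output, so the per-call overhead of `predict` is paid once instead of once per trace.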
Results¶
Between 50 ms per trace (one at a time: 51.3 ms) and 5 ms per trace (vectorized: about 4.2 ms)
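For reference, the per-trace figures can be recomputed from the values logged in the two console outputs. This is only a rough comparison: the first number is a wall-clock `%timeit` mean, the second is the "user" CPU time of a whole process (6.266 s in the log, rounded to 6.25 in the note), so the two runs do not measure exactly the same thing.

```python
# Values copied from the console outputs of this task
one_by_one_s = 0.513 / 10     # %timeit mean for 10 traces, one predict per trace
vectorized_s = 6.266 / 1495   # "user" time for 1495 traces, one predict call

# Indicative ratio only: wall-clock time versus user CPU time
ratio = one_by_one_s / vectorized_s

print(f"one at a time: {one_by_one_s * 1e3:.1f} ms per trace")
print(f"vectorized:    {vectorized_s * 1e3:.2f} ms per trace")
print(f"ratio:         {ratio:.1f}")
```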