Task #10061
Testing 'stability-fix' branch
Status: In progress
Priority: Normal
Assigned To: -
Category: -
Start date: 06/02/2015
Due date:
% Done: 0%
Estimated time:
Updated by Chambon Bernard over 9 years ago
Monday 2015/06/01
- Test_1000 : PASS
2 files remaining "Registered in Queue",
automatic restart => those 2 files staged => ok
- Test_5000 : FAIL
4397 files ok
237 Fail : Registered in Queue
366 Fail : File locked or HSM is currently unavailable.
Several (3 or 4) Queue_id with 'Staged files' and 'Registered in Queue'
Ex 179 : 330 files (also 330 files in queueMap), but only 162 'Staged files'
Ex 202 : 145 files (also 145 files in queueMap), but only 83 'Staged files'
Is this due to an HSM problem ?
For "File locked or HSM is currently unavailable"
Check the next files to be sure ...
/hpss/in2p3.fr/group/ccin2p3/treqs/RUN01/ccwl0100.11781_000010Mb_0029.dat
/hpss/in2p3.fr/group/ccin2p3/treqs/RUN01/ccwl0141.15044_000010Mb_0052.dat
Done, those 2 files (as examples) were not locked (= staged successfully) => was it a temporary HPSS unavailability ?
Yes, it's confirmed that tape JS088200 had I/O failures and HPSS then locked it
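As a cross-check of the Test_5000 figures above, here is a minimal Java sketch that adds up the three outcome counts and computes how many files of queues 179 and 202 were not staged. The class and method names are hypothetical; only the numbers come from the notes.

```java
// Hypothetical helper, not part of treqs: it only re-does the arithmetic from the notes.
class Test5000Tally {

    /** Sum of the three outcome counts reported for the run. */
    static int total(int staged, int stillQueued, int lockedOrUnavailable) {
        return staged + stillQueued + lockedOrUnavailable;
    }

    /** Files of a queue that did not reach the 'Staged files' state. */
    static int notStaged(int filesInQueue, int stagedFiles) {
        return filesInQueue - stagedFiles;
    }

    public static void main(String[] args) {
        // 4397 ok + 237 'Registered in Queue' + 366 'File locked or HSM ... unavailable'
        System.out.println("total files   : " + total(4397, 237, 366)); // 5000
        System.out.println("queue 179 left: " + notStaged(330, 162));    // 168
        System.out.println("queue 202 left: " + notStaged(145, 83));     // 62
    }
}
```

The three outcome counts add up to exactly 5000, so every file of the run is accounted for in one of the three states.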
Updated by Chambon Bernard over 9 years ago
Wednesday 2015/06/03
- Test_1000 : PASS
2 files remaining "Registered in Queue",
automatic restart => those 2 files staged => ok
- Test_5000 : PASS
- Test_5000 : PASS, but ...
- 9 restarts of the app on message "No staging since a while although there are submitted requests" (normal restart)
- several ConcurrentModificationException :
INFO | jvm 9 | 2015/06/04 16:37:44 | 2015-06-04 16:37:44,598 [Stager_Qn_JTI58200_QId102_stagerNo_1] ERROR toStart 259 f.i.c.s.treqs.control.stager.Stager - Stopping
INFO | jvm 9 | 2015/06/04 16:37:44 | java.util.ConcurrentModificationException: null
INFO | jvm 9 | 2015/06/04 16:37:44 | at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) ~[na:1.7.0_75]
INFO | jvm 9 | 2015/06/04 16:37:44 | at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) ~[na:1.7.0_75]
INFO | jvm 9 | 2015/06/04 16:37:44 | at fr.in2p3.cc.storage.treqs.model.Queue.getNextReading(Queue.java:787) ~[treqs-java-1.0-SNAPSHOT.jar:na]
INFO | jvm 9 | 2015/06/04 16:37:44 | at fr.in2p3.cc.storage.treqs.control.stager.Stager.stage(Stager.java:195) ~[treqs-java-1.0-SNAPSHOT.jar:na]
INFO | jvm 9 | 2015/06/04 16:37:44 | at fr.in2p3.cc.storage.treqs.control.stager.Stager.action(Stager.java:122) ~[treqs-java-1.0-SNAPSHOT.jar:na]
INFO | jvm 9 | 2015/06/04 16:37:44 | at fr.in2p3.cc.storage.treqs.control.stager.Stager.toStart(Stager.java:255) ~[treqs-java-1.0-SNAPSHOT.jar:na]
INFO | jvm 9 | 2015/06/04 16:37:44 | at fr.in2p3.cc.storage.treqs.control.process.AbstractProcess.run(AbstractProcess.java:214) [treqs-java-1.0-SNAPSHOT.jar:na]
DEBUG | wrapperp | 2015/06/04 16:37:46 | send a packet PING : ping
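The trace shows the exception raised by a TreeMap key iterator inside Queue.getNextReading. The sketch below is not the treqs code: it is a minimal, single-threaded Java reproduction of that failure mode, assuming the map is structurally modified while being iterated (in treqs this would presumably be another thread registering or removing files). All class, method and value names are hypothetical. It also shows that a ConcurrentSkipListMap, whose iterators are weakly consistent, tolerates the same modification; this is one possible option, not necessarily what the 'stability-fix' branch does.

```java
import java.util.ConcurrentModificationException;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical reproduction, not the treqs Queue class.
class QueueIterationSketch {

    /**
     * Fills the map, iterates over its keys and inserts a new entry half-way.
     * Returns true if the iteration failed with ConcurrentModificationException.
     */
    static boolean failsWhenModifiedDuringIteration(NavigableMap<Integer, String> files) {
        for (int position = 0; position < 4; position++) {
            files.put(position, "file_" + position);
        }
        try {
            for (Integer position : files.keySet()) {
                if (position == 1) {
                    // Structural modification while an iterator is open,
                    // standing in for what another thread might do.
                    files.put(100, "late_registration");
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // TreeMap iterators are fail-fast: the next call to next() throws.
        System.out.println("TreeMap fails              : "
                + failsWhenModifiedDuringIteration(new TreeMap<Integer, String>()));
        // ConcurrentSkipListMap iterators are weakly consistent: no exception.
        System.out.println("ConcurrentSkipListMap fails: "
                + failsWhenModifiedDuringIteration(new ConcurrentSkipListMap<Integer, String>()));
    }
}
```

Guarding both the iteration and the modifications with the same lock would be another way to avoid the exception while keeping a TreeMap.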
Updated by Chambon Bernard over 9 years ago
Thursday 2015/06/04
- Test_100 : PASS (checking jtreqs.log and wrapper.log)
No ConcurrentModificationException
No restart
- Test_1000 : PASS
No ConcurrentModificationException
Restart of app: 3 or 4 times
All files staged (see ES, June 5th)
file:///Users/bchambon/Documents/Logstash-Elasticsearch-Kibana/Kibana/Application4PreProd/index.html#dashboard/temp/ZWHOqkZyQTqBeM7fX7RlvQ
Some queues with no activation time, strange !
Updated by Chambon Bernard over 9 years ago
Monday 2015/06/08
- Test_5000 : PASS
ConcurrentModificationException ?
Restart of app ?
All files staged (see ES, June 8th)
Some queues with no activation_time nor end_time !
Updated by Chambon Bernard over 9 years ago
- Status changed from In progress to Feedback
Updated by Chambon Bernard over 9 years ago
Monday 2015/06/22
- test on test instance (=ccsvli10), by PEB
- Test_5000 : PASS
Restart of app: apparently none, cool
All files staged (see ES, June 22nd)
All queues have activation_time + end_time ok
Updated by Chambon Bernard over 9 years ago
- Status changed from Feedback to In progress