Bug #10498
openMysql Deadlock
0%
Description
Found in /var/log/mysql.log
150721 12:19:36 [Note] Event Scheduler: [root@localhost].[treqsjobs.cleaning_jqueues] event execution failed. 150722 16:39:36 [ERROR] Event Scheduler: [root@localhost][treqsjobs.cleaning_jqueues] Deadlock found when trying to get lock; try restarting transaction 150722 16:39:36 [Note] Event Scheduler: [root@localhost].[treqsjobs.cleaning_jqueues] event execution failed. 150722 16:46:36 [ERROR] Event Scheduler: [root@localhost][treqsjobs.cleaning_jqueues] Deadlock found when trying to get lock; try restarting transaction 150722 16:46:36 [Note] Event Scheduler: [root@localhost].[treqsjobs.cleaning_jqueues] event execution failed.
Updated by Chambon Bernard over 9 years ago
Deadlock seems related to mysql event scheduler, mainly 'cleaning_jqueues' which interacts with jqueues and also jrequests tables (see mysql.log)
More details with SHOW ENGINE INNODB STATUS gives :
------------------------ LATEST DETECTED DEADLOCK ------------------------ 150727 8:00:36 *** (1) TRANSACTION: TRANSACTION 0 101886833, ACTIVE 0 sec, process no 12317, OS thread id 140274448369408 starting index read mysql tables in use 2, locked 2 LOCK WAIT 5 lock struct(s), heap size 1216, 122 row lock(s) MySQL thread id 8416867, query id 22431805 event_scheduler preparing DELETE FROM jqueues WHERE ( (end_time is not null) and (TIMESTAMPDIFF(SECOND,end_time,now()) > NAME_CONST('old_normally_ended_var',172800)) and (status=230) and (id not in (SELECT queue_id FROM jrequests where queue_id is not null)) ) *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 0 page no 6096 n bits 1256 index `queue_id` of table `treqsjobs`.`jrequests` trx id 0 101886833 lock mode S waiting Record lock, heap no 894 PHYSICAL RECORD: n_fields 2; compact format; info bits 32 0: len 4; hex 800f8a64; asc d;; 1: len 4; hex 80474a5e; asc GJ^;; *** (2) TRANSACTION: TRANSACTION 0 101886834, ACTIVE 0 sec, process no 12317, OS thread id 140274446771968 updating or deleting, thread declared inside InnoDB 476 mysql tables in use 1, locked 1 5 lock struct(s), heap size 1216, 22 row lock(s), undo log entries 6 MySQL thread id 8416868, query id 22431802 event_scheduler updating delete from jrequests where ((end_time is not null) and (TIMESTAMPDIFF(SECOND,end_time,now()) > NAME_CONST('old_request_var',172800)) and ((status = 140) OR (status = 150) OR ((status = 160) AND (tries >= 3))) ) *** (2) HOLDS THE LOCK(S): RECORD LOCKS space id 0 page no 6096 n bits 1256 index `queue_id` of table `treqsjobs`.`jrequests` trx id 0 101886834 lock_mode X locks rec but not gap Record lock, heap no 894 PHYSICAL RECORD: n_fields 2; compact format; info bits 32 0: len 4; hex 800f8a64; asc d;; 1: len 4; hex 80474a5e; asc GJ^;; *** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 0 page no 6096 n bits 1256 index `queue_id` of table `treqsjobs`.`jrequests` trx id 0 101886834 lock_mode X locks rec but not gap waiting Record lock, heap no 884 PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 4; hex 800f8a60; asc `;; 1: len 4; hex 80474a70; asc GJp;; *** WE ROLL BACK TRANSACTION (1)
Such deadlock can leads to sql exception in treqs app
o Case SQL exceptions for Dispatcher, dispatcher stops and app restarts with help of JSW, => ok
o Case SQL exceptions Watchdog, watchdog stops (see bug # 10490), this implies an NAGIOS alert ! (event is treqs still running correctly)
Possible actions
1) Try to ensure that adhoc indexes are available (see clause where, used in event scheduler requests).
2) Try to improve event scheduler requests with a) events not running at the same time, b) ensure serialization of execution
3) other ?
Updated by Chambon Bernard over 9 years ago
- No deadlock found on cctreqs[xrootd]
- New deadlock found on cctreqsxrootd for jrequest tables between :
UPDATE jrequests SET status = 110, message = 'Registered in Queue 1042823', queue_id = 1042823, tape = 'JT113700', position = 459, position_offset = 0, level = 0, size = 644364039, errorcode = 0, submission_time = '2015-07-29 08:27:00' WHERE file = '/hpss/in2p3.fr/group/hess/Simul/Shower/gFixedEnergy_paris_0-8-8-8/KAS07010320145111.gz' AND status < 140 and cleaning_jrequest event scheduler
action :creating new index , checking usage via explain select ... => ok create index k2 on jrequests (`end_time`, `status`, `tries` ) show index from jrequests ;
Same operation made on jqueues table. Both operations on cctreqs en cctreqsxrootd
- No deadlock found on cctreqs[xrootd] (low activity)
- No deadlock found on cctreqs[xrootd]
Checking on 08/04,05
* No deadlock found on cctreqs[xrootd] ?????
Updated by Chambon Bernard over 9 years ago
- Status changed from New to In progress