背景
在计算程序中,调度系统会分页计算所有数据,每个分页是一个线程,所以整个计算会有多个线程操作。
计算又分为多个步骤,所有的操作最后都需要更新 detail
表,update
语句的条件是 a_id
和 b_id
。
detail
表主要字段如下:
create table srm_review_detail
(
id int(10) auto_increment primary key,
a_id int(10) not null,
b_id int(10) not null,
......
)
本地进行单元测试时,由于是单线程调用,整个计算过程没有报出异常。
异常
在测试环境使用调度系统进行计算时,频繁报出锁超时异常,导致整个任务失败。
Cause: java.sql.SQLException: Lock wait timeout exceeded;
try restarting transaction
为了调查锁超时原因,查看了 InnoDB
状态:
SHOW ENGINE INNODB STATUS
日志记录了最近发生的一次死锁,截取了关键部分:
------------------------
LATEST DETECTED DEADLOCK
------------------------
2019-01-15 15:54:12 0x7fe3180c3700
*** (1) TRANSACTION:
TRANSACTION 294999, ACTIVE 1 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 42 row lock(s)
MySQL thread id 2432, OS thread handle 140611048765184, query id 3037739 192.168.1.118 srmuser updating
UPDATE detail
SET type_id = 6
where a_id = 91
and b_id = 15
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 708 page no 3411 n bits 184 index PRIMARY of table `srm_dev`.`srm_review_detail` trx id 294999 lock_mode X locks rec but not gap waiting
Record lock, heap no 39 PHYSICAL RECORD: n_fields 31; compact format; info bits 0
0: len 4; hex 8006c59f; asc ;;
1: len 6; hex 00000004803e; asc >;;
2: len 7; hex 5f0000042d2b04; asc _ -+ ;;
3: len 4; hex 9b35b14f; asc 5 O;;
4: len 1; hex 85; asc ;;
......
*** (2) TRANSACTION:
TRANSACTION 294974, ACTIVE 5 sec starting index read
mysql tables in use 1, locked 1
59 lock struct(s), heap size 8400, 10175 row lock(s), undo log entries 5084
MySQL thread id 2445, OS thread handle 140613337757440, query id 3037741 192.168.1.118 srmuser updating
UPDATE detail
SET score = 0.0000,
where a_id = 91
and b_id = 423
......
从这段日志里可以看到,TRANSACTION 1
和 TRANSACTION 2
分别持有一定数量的行锁,然后又等待对方的锁,最后 MySQL
检测到 deadlock
。
说明多个线程参与计算在不同步骤时锁定了同一部分表数据,导致超时。
异常原因
InnoDB
支持行级锁,但是update
语句的条件 a_id
和 b_id
是没有索引的。
如果没有合适索引,MySQL
有可能锁定整个表或者多行:
If you have no indexes suitable for your statement and MySQL must scan the entire table to process the statement, every row of the table becomes locked, which in turn blocks all inserts by other users to the table. It is important to create good indexes so that your queries do not unnecessarily scan many rows.
如果有合适的二级索引
,MySQL会根据此索引设置锁,并在聚簇索引
记录上也加上锁:
If a secondary index is used in a search and index record locks to be set are exclusive, InnoDB also retrieves the corresponding clustered index records and sets locks on them.
解决
经过上述分析,由于在更新语句时没有合适的索引,导致多个线程在等待锁定行时超时,因此我们决定把表的主键
作为条件来更新,这样能确保只有一条数据会被锁定,避免多个线程之间的冲突。