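I have a query that I run against PostgreSQL every 30 minutes. Based on the size of the table, I expect PostgreSQL to answer it in about 1 second, and almost every time I do get the response within that time. But sometimes I get no answer at all: the query appears to wait forever. In other words, PostgreSQL seems to be waiting for something else to finish before it answers my query.
The question is: how can I find out why PostgreSQL is not answering my query?
I looked at pg_stat_activity, and the result is below (check the 4th row of the table):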
datid | datname | pid | usesysid | usename | backend_start | xact_start | query_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+----------+-------+----------+----------+----------------------------------+----------------------------------+----------------------------------+----------------------------------+-----------------+---------------------+--------+-------------+--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------
| | 25038 | | | 2018-07-26 17:42:31.417755+04:30 | | | | Activity | AutoVacuumMain | | | | | autovacuum launcher
| | 25040 | 10 | postgres | 2018-07-26 17:42:31.417821+04:30 | | | | Activity | LogicalLauncherMain | | | | | background worker
16385 | analysis | 36912 | 10 | postgres | 2018-07-26 20:16:13.393578+04:30 | 2018-07-26 20:48:37.738759+04:30 | 2018-07-26 20:48:37.738759+04:30 | 2018-07-26 20:48:37.738764+04:30 | | | active | | 116475 | select datid,datname,pid,usesysid,usename,backend_start,xact_start,query_start,state_change,wait_event_type,wait_event,state,backend_xid,backend_xmin,query,backend_type from pg_stat_activity; | client backend
16385 | analysis | 9427 | 10 | postgres | 2018-07-26 19:34:37.109833+04:30 | 2018-07-26 19:35:08.502592+04:30 | 2018-07-26 19:35:08.502592+04:30 | 2018-07-26 19:35:08.502596+04:30 | | | active | 116475 | 116470 | update dns as origdns set address = (select address from dns where name = origdns.value limit 1) where type = 'cname' and address is null; | client backend
| | 25036 | | | 2018-07-26 17:42:31.410811+04:30 | | | | Activity | BgWriterMain | | | | | background writer
| | 25035 | | | 2018-07-26 17:42:31.41015+04:30 | | | | Activity | CheckpointerMain | | | | | checkpointer
| | 25037 | | | 2018-07-26 17:42:31.411034+04:30 | | | | Activity | WalWriterMain | | | | | walwriter
(7 rows)
As shown above, the fourth query started at 19:35:08, and about 1 hour 20 minutes later I still have not received any response.
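As a quick check on how long the UPDATE had been running when the snapshot was taken, the two `query_start` values from the pg_stat_activity output above can be subtracted. This is only a sketch: it assumes the `query_start` of the monitoring SELECT (pid 36912) is a fair stand-in for "now" at snapshot time.

```python
from datetime import datetime

# Timestamps copied from the pg_stat_activity output above.
# pid 9427: the UPDATE that never returns.
update_query_start = datetime.fromisoformat("2018-07-26 19:35:08.502592+04:30")
# pid 36912: the SELECT on pg_stat_activity itself (assumed to mark snapshot time).
snapshot_time = datetime.fromisoformat("2018-07-26 20:48:37.738759+04:30")

elapsed = snapshot_time - update_query_start
print(elapsed)  # 1:13:29.236167
```

So at the moment of that snapshot the statement had already been active for a little over 1 hour 13 minutes.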
I also checked the PostgreSQL log for the period while the query was running:
2018-07-26 19:28:07.371 +0430 [32528] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 19:31:08.126 +0430 [19065] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 19:33:07.468 +0430 [8837] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 19:36:07.137 +0430 [36325] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 19:39:07.497 +0430 [22791] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 19:42:07.434 +0430 [9232] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 19:45:07.707 +0430 [36181] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 19:47:07.375 +0430 [22590] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 19:50:07.250 +0430 [9076] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 19:53:07.416 +0430 [36104] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 19:56:08.399 +0430 [23818] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 19:58:07.130 +0430 [10272] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:01:08.183 +0430 [37180] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:03:07.061 +0430 [23597] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:06:07.283 +0430 [10160] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:09:07.701 +0430 [37102] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:11:08.053 +0430 [23514] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:14:08.538 +0430 [9960] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:16:07.602 +0430 [36884] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:19:07.282 +0430 [23344] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:22:07.418 +0430 [9796] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:25:07.797 +0430 [36698] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:28:07.991 +0430 [23137] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:31:07.548 +0430 [9587] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:34:07.893 +0430 [36483] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:37:07.364 +0430 [22917] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:40:09.849 +0430 [9401] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:43:07.694 +0430 [36307] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:46:07.424 +0430 [22741] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:49:07.818 +0430 [9191] analysis@followdns WARNING: there is already a transaction in progress
2018-07-26 20:52:07.221 +0430 [36097] analysis@followdns WARNING: there is already a transaction in progress
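To see how the warnings are spaced and whether they come from one backend or many, the log lines can be parsed programmatically. The sketch below is illustrative only: it assumes the layout `<timestamp> <tz> [<pid>] <user>@<database> <LEVEL>: <message>`, which is a guess at the `log_line_prefix` in use, and the helper name `parse_log_line` is hypothetical.

```python
import re
from datetime import datetime

# Assumed layout: "<timestamp> <tz> [<pid>] <user>@<database> <LEVEL>: <message>".
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4}) "
    r"\[(?P<pid>\d+)\] (?P<user>[^@\s]+)@(?P<db>\S+) "
    r"(?P<level>[A-Z]+):\s+(?P<msg>.*)$"
)

def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    fields = m.groupdict()
    fields["ts"] = datetime.strptime(fields["ts"], "%Y-%m-%d %H:%M:%S.%f %z")
    fields["pid"] = int(fields["pid"])
    return fields

# First three lines of the log excerpt above.
sample = [
    "2018-07-26 19:28:07.371 +0430 [32528] analysis@followdns WARNING: there is already a transaction in progress",
    "2018-07-26 19:31:08.126 +0430 [19065] analysis@followdns WARNING: there is already a transaction in progress",
    "2018-07-26 19:33:07.468 +0430 [8837] analysis@followdns WARNING: there is already a transaction in progress",
]

entries = [e for e in map(parse_log_line, sample) if e is not None]
# Seconds between consecutive warnings.
gaps = [(b["ts"] - a["ts"]).total_seconds() for a, b in zip(entries, entries[1:])]
pids = sorted({e["pid"] for e in entries})
print(gaps)
print(pids)
```

Run over the full excerpt, this would show the interval between warnings and the set of backend pids that emitted them, which can then be compared with pid 9427 from pg_stat_activity.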
FYI:
Service status: (why are Tasks, Memory and CPU zero?)
mysystem:~$ sudo service postgresql status
[sudo] password for username:
● postgresql.service - PostgreSQL RDBMS
Loaded: loaded (/lib/systemd/system/postgresql.service; enabled; vendor preset: enabled)
Active: active (exited) since Thu 2018-07-26 17:42:33 +0430; 3h 12min ago
Process: 25054 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
Main PID: 25054 (code=exited, status=0/SUCCESS)
Tasks: 0
Memory: 0B
CPU: 0
CGroup: /system.slice/postgresql.service
Jul 26 17:42:33 User systemd[1]: Starting PostgreSQL RDBMS...
Jul 26 17:42:33 User systemd[1]: Started PostgreSQL RDBMS.
Listening ports:
mysystem:~$ sudo netstat -natp | grep -i post
extract_h 8990 root 9u IPv4 33879493 0t0 TCP localhost:50888->localhost:postgresql (ESTABLISHED)
postgres 8997 postgres 11u IPv6 31906717 0t0 UDP localhost:57751->localhost:57751
postgres 8997 postgres 12u IPv4 33881503 0t0 TCP localhost:postgresql->localhost:50888 (ESTABLISHED)
postgres 9427 postgres 11u IPv6 31906717 0t0 UDP localhost:57751->localhost:57751
postgres 25033 postgres 6u IPv6 31850266 0t0 TCP localhost:postgresql (LISTEN)
postgres 25033 postgres 7u IPv4 31850267 0t0 TCP localhost:postgresql (LISTEN)
postgres 25033 postgres 11u IPv6 31906717 0t0 UDP localhost:57751->localhost:57751
postgres 25035 postgres 11u IPv6 31906717 0t0 UDP localhost:57751->localhost:57751
postgres 25036 postgres 11u IPv6 31906717 0t0 UDP localhost:57751->localhost:57751
postgres 25037 postgres 11u IPv6 31906717 0t0 UDP localhost:57751->localhost:57751
postgres 25038 postgres 11u IPv6 31906717 0t0 UDP localhost:57751->localhost:57751
postgres 25039 postgres 11u IPv6 31906717 0t0 UDP localhost:57751->localhost:57751
postgres 25040 postgres 11u IPv6 31906717 0t0 UDP localhost:57751->localhost:57751
Note that I can run other queries against the same table in this database:
analysis=# select count(*) from dns;
count
---------
2073805
(1 row)
analysis=# select count(*) from dns where type = 'cname' and address is null;
count
-------
46539
(1 row)
analysis=#
PG_LOCKS:
analysis=# SELECT * FROM pg_locks;
locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted | fastpath
---------------+----------+----------+------+-------+------------+---------------+---------+--------+----------+--------------------+-------+---------------------+---------+----------
relation | 171525 | 212563 | | | | | | | | 6/1041 | 35517 | AccessShareLock | t | t
relation | 171525 | 212562 | | | | | | | | 6/1041 | 35517 | AccessShareLock | t | t
relation | 171525 | 212561 | | | | | | | | 6/1041 | 35517 | AccessShareLock | t | t
relation | 171525 | 212560 | | | | | | | | 6/1041 | 35517 | AccessShareLock | t | t
relation | 171525 | 212559 | | | | | | | | 6/1041 | 35517 | AccessShareLock | t | t
relation | 171525 | 212558 | | | | | | | | 6/1041 | 35517 | AccessShareLock | t | t
relation | 171525 | 212557 | | | | | | | | 6/1041 | 35517 | AccessShareLock | t | t
relation | 171525 | 212550 | | | | | | | | 6/1041 | 35517 | AccessShareLock | t | t
virtualxid | | | | | 6/1041 | | | | | 6/1041 | 35517 | ExclusiveLock | t | t
relation | 16385 | 11577 | | | | | | | | 4/1495 | 8794 | AccessShareLock | t | t
virtualxid | | | | | 4/1495 | | | | | 4/1495 | 8794 | ExclusiveLock | t | t
relation | 16385 | 213857 | | | | | | | | 5/850 | 9427 | AccessShareLock | t | t
relation | 16385 | 213857 | | | | | | | | 5/850 | 9427 | RowExclusiveLock | t | t
relation | 16385 | 213856 | | | | | | | | 5/850 | 9427 | AccessShareLock | t | t
relation | 16385 | 213856 | | | | | | | | 5/850 | 9427 | RowExclusiveLock | t | t
relation | 16385 | 213855 | | | | | | | | 5/850 | 9427 | AccessShareLock | t | t
relation | 16385 | 213855 | | | | | | | | 5/850 | 9427 | RowExclusiveLock | t | t
relation | 16385 | 213854 | | | | | | | | 5/850 | 9427 | AccessShareLock | t | t
relation | 16385 | 213854 | | | | | | | | 5/850 | 9427 | RowExclusiveLock | t | t
relation | 16385 | 213853 | | | | | | | | 5/850 | 9427 | AccessShareLock | t | t
relation | 16385 | 213853 | | | | | | | | 5/850 | 9427 | RowExclusiveLock | t | t
relation | 16385 | 213847 | | | | | | | | 5/850 | 9427 | AccessShareLock | t | t
relation | 16385 | 213847 | | | | | | | | 5/850 | 9427 | RowExclusiveLock | t | t
virtualxid | | | | | 5/850 | | | | | 5/850 | 9427 | ExclusiveLock | t | t
virtualxid | | | | | 3/5845 | | | | | 3/5845 | 35522 | ExclusiveLock | t | t
relation | 171525 | 215376 | | | | | | | | 6/1041 | 35517 | AccessExclusiveLock | t | f
transactionid | | | | | | 116819 | | | | 6/1041 | 35517 | ExclusiveLock | t | f
relation | 171525 | 215371 | | | | | | | | 6/1041 | 35517 | AccessShareLock | t | f
relation | 171525 | 215371 | | | | | | | | 6/1041 | 35517 | RowExclusiveLock | t | f
relation | 171525 | 215371 | | | | | | | | 6/1041 | 35517 | AccessExclusiveLock | t | f
transactionid | | | | | | 116475 | | | | 5/850 | 9427 | ExclusiveLock | t | f
object | 0 | | | | | | 1260 | 16384 | 0 | 6/1041 | 35517 | AccessShareLock | t | f
relation | 171525 | 215374 | | | | | | | | 6/1041 | 35517 | ShareLock | t | f
object | 171525 | | | | | | 2615 | 215369 | 0 | 6/1041 | 35517 | AccessShareLock | t | f
page | 171525 | 212562 | 822 | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
page | 171525 | 212562 | 814 | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
page | 171525 | 212562 | 812 | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
relation | 171525 | 212558 | | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
page | 171525 | 212562 | 825 | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
page | 171525 | 212562 | 824 | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
relation | 171525 | 212561 | | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
relation | 171525 | 212550 | | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
page | 171525 | 212562 | 816 | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
page | 171525 | 212562 | 826 | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
page | 171525 | 212562 | 818 | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
page | 171525 | 212562 | 828 | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
page | 171525 | 212562 | 827 | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
page | 171525 | 212562 | 823 | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
relation | 171525 | 212559 | | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
page | 171525 | 212562 | 821 | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
page | 171525 | 212562 | 815 | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
page | 171525 | 212562 | 817 | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
page | 171525 | 212562 | 829 | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
page | 171525 | 212562 | 819 | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
page | 171525 | 212562 | 813 | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
page | 171525 | 212562 | 820 | | | | | | | 6/1041 | 35517 | SIReadLock | t | f
(56 rows)
analysis=#
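In pg_locks, a lock request that is still waiting has `granted = f`. One way to check the pasted output for blocked requests is to parse the rows and filter on that column. This is a minimal sketch that works on psql's pipe-separated text, with two rows for pid 9427 copied from the output above; the function names are hypothetical.

```python
HEADER = ("locktype | database | relation | page | tuple | virtualxid | transactionid | "
          "classid | objid | objsubid | virtualtransaction | pid | mode | granted | fastpath")

def parse_psql_rows(header, rows):
    """Turn psql pipe-separated rows into dicts keyed by column name."""
    columns = [c.strip() for c in header.split("|")]
    parsed = []
    for row in rows:
        values = [v.strip() for v in row.split("|")]
        if len(values) != len(columns):
            continue  # skip separator lines, "(N rows)" footers, and the like
        parsed.append(dict(zip(columns, values)))
    return parsed

def waiting_locks(locks):
    """Locks that have been requested but not yet granted."""
    return [lock for lock in locks if lock["granted"] == "f"]

# Two rows for pid 9427, copied from the pg_locks output above.
rows = [
    "relation | 16385 | 213857 | | | | | | | | 5/850 | 9427 | RowExclusiveLock | t | t",
    "transactionid | | | | | | 116475 | | | | 5/850 | 9427 | ExclusiveLock | t | f",
    "(56 rows)",
]

locks = parse_psql_rows(HEADER, rows)
blocked = waiting_locks(locks)
print(len(locks), len(blocked))  # 2 0
```

In the output pasted above, every row shows `granted = t`, so a filter like this returns nothing for that snapshot.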
--------------------
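Update:
I checked the output of htop to see whether the process with pid = 9427 (the process running that strange query) is consuming CPU, and it has been using 100% of a CPU for about 3 hours!
I also used strace to inspect the process: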
mysystem:~$ sudo strace -p 9427
strace: Process 9427 attached
strace: [ Process PID=9427 runs in x32 mode. ]
strace: [ Process PID=9427 runs in 64 bit mode. ]
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
kill(25036, SIGUSR1) = 0
lseek(30, 102711296, SEEK_SET) = 102711296
read(30, "\0\0\0\0\0\0\0\0\0\0\0\0L\3p\6\360\37\4 \0\0\0\0p\2060\0\320\237@\0"..., 8192) = 8192
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
open("base/16385/229351_fsm", O_RDWR) = -1 ENOENT (No such file or directory)
lseek(30, 0, SEEK_END) = 109174784
kill(25036, SIGUSR1) = 0
write(30, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
lseek(30, 102719488, SEEK_SET) = 102719488
read(30, "\0\0\0\0\0\0\0\0\0\0\0\0\350\2\0\6\360\37\4 \0\0\0\0\0\206@\0\330\2370\0"..., 8192) = 8192
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
open("base/16385/229352_fsm", O_RDWR) = -1 ENOENT (No such file or directory)
lseek(31, 0, SEEK_END) = 63250432
kill(25036, SIGUSR1) = 0
write(31, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
open("base/16385/229351_fsm", O_RDWR) = -1 ENOENT (No such file or directory)
lseek(30, 0, SEEK_END) = 109182976
kill(25036, SIGUSR1) = 0
write(30, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
lseek(30, 102727680, SEEK_SET) = 102727680
read(30, "\0\0\0\0\0\0\0\0\0\0\0\0\300\2\370\5\360\37\4 \0\0\0\0\370\205P\0\320\237@\0"..., 8192) = 8192
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
lseek(24, 0, SEEK_END) = 233725952
kill(25036, SIGUSR1) = 0
write(24, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
open("base/16385/229349_fsm", O_RDWR) = -1 ENOENT (No such file or directory)
lseek(25, 0, SEEK_END) = 69246976
kill(25036, SIGUSR1) = 0
write(25, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
lseek(24, 0, SEEK_END) = 233734144
kill(25036, SIGUSR1) = 0
write(24, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
lseek(24, 0, SEEK_END) = 233742336
lseek(24, 0, SEEK_END) = 233742336
lseek(24, 0, SEEK_END) = 233742336
lseek(24, 0, SEEK_END) = 233742336
lseek(24, 0, SEEK_END) = 233742336
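Because the trace is long and repetitive, a per-syscall tally is easier to read than the raw lines. The sketch below counts calls by name; it assumes each syscall line starts with `name(`, as in the output above, and the function name `count_syscalls` is hypothetical. Only a handful of lines from the trace are included as sample input.

```python
import re
from collections import Counter

# A syscall line in strace output starts with the call name and "(".
SYSCALL = re.compile(r"^(\w+)\(")

def count_syscalls(strace_lines):
    """Count how often each syscall name appears; non-syscall lines are ignored."""
    counts = Counter()
    for line in strace_lines:
        m = SYSCALL.match(line.strip())
        if m:
            counts[m.group(1)] += 1
    return counts

# A few lines copied from the strace output above.
sample = [
    "strace: Process 9427 attached",
    "lseek(24, 0, SEEK_END) = 233725952",
    "lseek(24, 0, SEEK_END) = 233725952",
    "kill(25036, SIGUSR1) = 0",
    "lseek(30, 102711296, SEEK_SET) = 102711296",
    'open("base/16385/229351_fsm", O_RDWR) = -1 ENOENT (No such file or directory)',
]

counts = count_syscalls(sample)
print(counts.most_common())
```

Feeding the whole trace through the same function gives a summary of which calls dominate while the backend is busy.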