How to fix PostgreSQL server automatic restarts

Time: 2013-06-10 08:22:19

Tags: postgresql debian postgresql-9.1

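On some days the PostgreSQL server starts restarting periodically. The log file is below. It contains "orphan temporary table found" records, after which a restart is performed.

The server is accessed by a single application used over the internet by about 50 users, and the cluster contains about 10 databases.

How can I fix the restarts and those errors?

Using PostgreSQL 9.1.2 on x86_64-unknown-linux-gnu, compiled by gcc-4.4.real (Debian 4.4.5-8) 4.4.5, 64-bit

The log file contains: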

....

    2013-06-10 11:11:57 EEST   LOG:  server process (PID 25148) was terminated by signal 9: Killed
    2013-06-10 11:11:57 EEST   LOG:  terminating any other active server processes
...

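Update

The dmesg content is below. This is a VPS server running under Parallels Desktop at a hosting provider. Swap is probably disabled. The server also runs mono 2.6 and apache, which serve an ASP.NET application that accesses the same databases. The output of top is below. I have little Linux experience. I read the links from the answer but did not understand them. What is the best way to fix the problem, preferably without adding memory?

free returns: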

             total       used       free     shared    buffers     cached
Mem:       1048576    1046504       2072          0          0     230512
-/+ buffers/cache:     815992     232584
Swap:            0          0          0

dmesg:

[2083241.896072] OOM killed process 21849 (mono) vm:420164kB, rss:165296kB, swap:0kB
[2116348.398705] OOM killed process 4970 (postgres) vm:157948kB, rss:76236kB, swap:0kB
[2121711.560995] OOM killed process 5366 (postgres) vm:160348kB, rss:82340kB, swap:0kB
[2123522.901114] OOM killed process 5505 (postgres) vm:145272kB, rss:66840kB, swap:0kB
[2151490.026306] OOM killed process 362 (mono) vm:370636kB, rss:162272kB, swap:0kB
[2160560.103350] OOM killed process 13285 (postgres) vm:195468kB, rss:103792kB, swap:0kB
[2202499.040721] OOM killed process 19391 (postgres) vm:118792kB, rss:45116kB, swap:0kB
[2207881.033010] OOM killed process 19876 (postgres) vm:141356kB, rss:57004kB, swap:0kB
[2209677.336040] OOM killed process 20017 (postgres) vm:127360kB, rss:50764kB, swap:0kB
[2211481.827980] OOM killed process 20193 (postgres) vm:139560kB, rss:56112kB, swap:0kB
[2227779.349062] OOM killed process 12151 (mono) vm:346484kB, rss:142900kB, swap:0kB
[2233087.801652] OOM killed process 21250 (postgres) vm:111996kB, rss:38548kB, swap:0kB
[2236034.881167] OOM killed process 22622 (postgres) vm:111972kB, rss:37672kB, swap:0kB
[2237418.351794] OOM killed process 23868 (postgres) vm:114480kB, rss:40864kB, swap:0kB
[2237723.417347] OOM killed process 24460 (postgres) vm:112764kB, rss:37500kB, swap:0kB
[2238023.668780] OOM killed process 24583 (postgres) vm:112884kB, rss:36024kB, swap:0kB
[2238210.220733] OOM killed process 24773 (postgres) vm:105600kB, rss:22608kB, swap:0kB
[2238397.290829] OOM killed process 24812 (postgres) vm:106360kB, rss:28996kB, swap:0kB
[2238808.757086] OOM killed process 24973 (postgres) vm:109156kB, rss:28676kB, swap:0kB
[2239112.617356] OOM killed process 25148 (postgres) vm:105520kB, rss:26392kB, swap:0kB
[2239217.367104] OOM killed process 25298 (postgres) vm:105700kB, rss:31020kB, swap:0kB
[2239277.036465] OOM killed process 25417 (postgres) vm:106424kB, rss:26024kB, swap:0kB
[2239400.317380] OOM killed process 25479 (postgres) vm:106392kB, rss:18544kB, swap:0kB
[2239536.589647] OOM killed process 25561 (postgres) vm:108412kB, rss:18364kB, swap:0kB
[2239715.268972] OOM killed process 25602 (postgres) vm:111832kB, rss:35944kB, swap:0kB
[2239798.713414] OOM killed process 25701 (postgres) vm:124232kB, rss:37844kB, swap:0kB
[2239812.799232] OOM killed process 25746 (postgres) vm:135948kB, rss:34552kB, swap:0kB
[2239885.587583] OOM killed process 25752 (postgres) vm:113524kB, rss:36880kB, swap:0kB
[2240040.811768] OOM killed process 25789 (postgres) vm:109204kB, rss:33684kB, swap:0kB
[2240416.506723] OOM killed process 25870 (postgres) vm:109268kB, rss:34060kB, swap:0kB
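
A long listing like this is easier to read once it is tallied per process name. The following is a minimal Python sketch, not part of the original post, that counts OOM kills and records the largest rss per process. The regular expression assumes the exact line format shown above; other kernels word their OOM messages differently, and the function name `summarize_oom_kills` is hypothetical.

```python
import re
from collections import defaultdict

# Assumption: lines look like the dmesg output above, e.g.
# "OOM killed process 4970 (postgres) vm:157948kB, rss:76236kB, swap:0kB"
OOM_LINE = re.compile(
    r"OOM killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"vm:(?P<vm>\d+)kB, rss:(?P<rss>\d+)kB"
)


def summarize_oom_kills(lines):
    """Return {process name: (kill count, largest rss in kB)}."""
    counts = defaultdict(int)
    max_rss = defaultdict(int)
    for line in lines:
        match = OOM_LINE.search(line)
        if match is None:
            continue  # not an OOM line in the assumed format
        name = match.group("name")
        counts[name] += 1
        max_rss[name] = max(max_rss[name], int(match.group("rss")))
    return {name: (counts[name], max_rss[name]) for name in counts}


if __name__ == "__main__":
    # Three lines copied from the listing above.
    sample = [
        "[2083241.896072] OOM killed process 21849 (mono) vm:420164kB, rss:165296kB, swap:0kB",
        "[2116348.398705] OOM killed process 4970 (postgres) vm:157948kB, rss:76236kB, swap:0kB",
        "[2121711.560995] OOM killed process 5366 (postgres) vm:160348kB, rss:82340kB, swap:0kB",
    ]
    for name, (count, rss) in summarize_oom_kills(sample).items():
        print(f"{name}: {count} kill(s), largest rss {rss} kB")
```

To run it against the live kernel log, the output of dmesg can be passed in line by line, for example by iterating over `sys.stdin`.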

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
26679 postgres  20   0  101m  14m  10m S  5.0  1.4   0:00.20 postgres
26680 postgres  20   0  103m  29m  24m S  1.7  2.8   0:00.21 postgres
26135 www-data  20   0  265m  60m 3180 S  0.3  5.9   0:07.37 mono
26401 www-data  20   0  244m  49m 2912 S  0.3  4.8   0:03.17 mono
    1 root      20   0  8360  236  108 S  0.0  0.0   0:14.01 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd/893
    3 root      20   0     0    0    0 S  0.0  0.0   0:00.00 khelper/893
  460 root      20   0  5988  372  204 S  0.0  0.0   0:07.68 syslogd
  488 root      20   0 54568  540   44 S  0.0  0.1   0:00.00 saslauthd
  490 root      20   0 54568  496    0 S  0.0  0.0   0:00.00 saslauthd
  552 root      20   0 22432  440  192 S  0.0  0.0   0:02.05 cron
  563 messageb  20   0 23272  408  136 S  0.0  0.0   0:00.01 dbus-daemon
  594 postgres  20   0 99628 6740 5568 S  0.0  0.6   2:51.67 postgres
  613 root      20   0 19340  216   12 S  0.0  0.0   0:00.00 xinetd
  641 root      20   0 49180  776  220 S  0.0  0.1   0:11.05 sshd
  669 root      20   0 56036 1956  372 S  0.0  0.2   0:42.91 sendmail-mta
  728 postgres  20   0 65848 1388  172 S  0.0  0.1   0:36.99 postgres
 6547 www-data  20   0  252m  57m  280 S  0.0  5.6   1:03.18 mono
 6930 www-data  20   0  117m  43m  124 S  0.0  4.3   1:01.71 mono
 7489 www-data  20   0  122m  40m  124 S  0.0  4.0   1:00.75 mono
 8158 www-data  20   0  118m  38m  124 S  0.0  3.8   0:58.19 mono
 8311 www-data  20   0  120m  38m  124 S  0.0  3.8   1:16.12 mono
 9776 www-data  20   0  302m  85m  660 S  0.0  8.4   1:17.09 mono
12555 root      20   0  183m 2100  612 S  0.0  0.2   0:00.13 console-kit-dae
14887 root      20   0 74392 2544  908 S  0.0  0.2   0:23.99 apache2
14890 www-data  20   0 50000 9792  292 S  0.0  0.9   0:05.86 mono
14892 www-data  20   0  189m  51m  732 S  0.0  5.1   1:57.05 mono
14900 www-data  20   0  168m  34m  608 S  0.0  3.4  11:47.60 mono

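Update 2

The postgresql.conf settings are below. work_mem is commented out and shared_buffers is 24MB, so changing these settings should not affect the behavior. Memory is 1GB. After killing the mono process that was using 10% of memory, the restarts stopped. How can I fix this so that it does not happen again?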

#work_mem = 1MB                         # min 64kB
#maintenance_work_mem = 16MB            # min 1MB
shared_buffers = 24MB

1 Answer:

Answer 0: (score: 10)

2013-06-10 11:11:57 EEST   LOG:  server process (PID 25148) was terminated by signal 9: Killed

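Either you are running out of memory and the Linux kernel's OOM killer is acting, or a cron job or some other tool is killing PostgreSQL directly.

If it is the OOM killer, you will see it in dmesg or in your kernel log files, so check there first. If that is the problem, read the PostgreSQL documentation on Linux memory overcommit to learn how to stop it from happening.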
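
The PostgreSQL documentation on Linux memory overcommit discusses the kernel setting `vm.overcommit_memory`, so a first step is to see what it is currently set to. The sketch below only reads the value from `/proc/sys/vm/overcommit_memory` and describes it; it does not change anything. The helper names are hypothetical, and on a container-based VPS the setting may be controlled by the host, which is an assumption to verify with the provider.

```python
from pathlib import Path

# Meanings of vm.overcommit_memory as documented for the Linux kernel.
MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def describe_overcommit(raw):
    """Translate the raw file contents (e.g. "2\n") into a description."""
    value = int(raw.strip())
    return MODES.get(value, f"unknown mode {value}")


def read_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Describe the current mode, or return None if it cannot be read."""
    try:
        return describe_overcommit(Path(path).read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("vm.overcommit_memory:", read_overcommit())
```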

If you are running out of memory, make sure you are not setting work_mem or maintenance_work_mem too high, keep shared_buffers reasonable, and so on.
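
To judge whether these settings are "too high", a rough budget can help. The sketch below is a back-of-the-envelope estimate only: it assumes each connection uses at most `sorts_per_query` allocations of work_mem, and it ignores per-backend overhead and maintenance_work_mem. The function name is hypothetical, and `max_connections = 100` is an assumed value, since the question does not show that setting.

```python
def rough_postgres_budget_mb(shared_buffers_mb, work_mem_mb,
                             max_connections, sorts_per_query=1):
    """Rough upper estimate in MB: shared buffers plus per-connection work_mem.

    A single query can use work_mem more than once (one allocation per sort
    or hash step), which is what sorts_per_query stands in for.
    """
    return shared_buffers_mb + work_mem_mb * max_connections * sorts_per_query


if __name__ == "__main__":
    # shared_buffers = 24MB and work_mem = 1MB come from the question;
    # max_connections = 100 is an assumption.
    estimate = rough_postgres_budget_mb(24, 1, 100)
    print(f"rough PostgreSQL budget: {estimate} MB of 1024 MB RAM")
```

With the values from the question this estimate comes to 124 MB, well under the 1 GB of RAM, which fits the observation that the mono processes account for much of the memory pressure.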

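If the OOM killer is not at fault, then you need to find out what is sending SIGKILL to your PostgreSQL backends, because that never happens on a normal system unless the OOM killer is acting.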