Heroku pg:备份恢复失败,显然是因为Rails应用程序超时

时间:2015-08-09 23:51:39

标签: ruby-on-rails postgresql heroku

我遇到了在Heroku上恢复数据库的问题。我使用pg:pull以及pg:backups restore方法多次还原了数据库。但随着最近数据库大小的增加,它无法正常工作。我注意到使用转储文件从一台本地计算机恢复到另一台本地计算机是可行的。但它不适用于Heroku。

我的规格:

  1. Rails 4.2,Ruby 2.2.0,Postgres 9.4.1
  2. 使用Heroku的'Standard 0'Postgres服务(每月50美元)托管生产数据库。
  3. 2x web dyno,1x worker
  4. Unicorn服务器
  5. 为了让您了解我的数据库大小,在此最新恢复之前,有83个表,大约10毫米行,转储文件大小约为350MB。

    我使用本地数据库收集大型数据集。我将定期使用pg_restore将此数据替换为Heroku上的生产。

    我最近将很多数据下载到我的本地数据库,其中一个表格我将称之为rankings。该表从7毫米行增加到60毫米行。 pg_dump文件现在是1.24GB。

    按照惯例,我执行了以下操作将此新数据库传输到heroku:

    1. pg_dump
    2. 将文件上传到aws s3
    3. heroku pg:backups restore AWS_FILE_URL HEROKU_DATABASE_URL
    4. 它始终失败并显示以下消息(来自heroku pg:backups info r023):

      //after successfully restoring all tables and data prior to this://    
      2015-08-09 19:37:33 +0000: pg_restore: processing data for table "rankings"
      2015-08-09 19:41:49 +0000: 2015/08/09 19:41:49 aborting: could not write to output stream: Expected HTTP Status 206 or 200, received: "403 Forbidden"
      2015-08-09 19:41:49 +0000: waiting for restore to complete
      2015-08-09 19:41:49 +0000: pg_restore: [custom archiver] could not read from input file: end of file
      2015-08-09 19:41:49 +0000: restore done
      2015-08-09 19:41:49 +0000: waiting for download to complete
      2015-08-09 19:41:49 +0000: download done
      

      Heroku docs指出这是应用程序的超时,而不是数据库。

      但是,我注意到,经过多次尝试后,只有在处理新更新的rankings表的数据时才会发生此超时。

      我读过this SO post,也许这是独角兽的暂停。但我把我的独角兽超时设置为18000秒。

      我很难过,可以使用一些指导。

      以下是日志输出:

          015-08-09T17:52:34Z app[postgres.11]: [GREEN] checkpoint starting: xlog
          2015-08-09T17:52:39Z app[postgres.11]: [GREEN] checkpoint complete: wrote 53 buffers (0.2%); 0 transaction log file(s) added, 0 removed, 40 recycled; write=1.105 s, sync=4.111 s, total=5.224 s; sync files=3, longest=4.048 s, average=1.370 s
          2015-08-09T17:52:43.777815+00:00 heroku[web.1]: source=web.1 dyno=heroku.22448404.22d13b5c-8725-43f4-9f0b-1b8e219825a9 sample#load_avg_1m=0.00 sample#load_avg_5m=0.00 sample#load_avg_15m=0.01
          2015-08-09T17:52:43.777815+00:00 heroku[web.1]: source=web.1 dyno=heroku.22448404.22d13b5c-8725-43f4-9f0b-1b8e219825a9 sample#memory_total=546.99MB sample#memory_rss=529.55MB sample#memory_cache=2.30MB sample#memory_swap=15.13MB sample#memory_pgpgin=152151pages sample#memory_pgpgout=15996pages
          2015-08-09T17:52:31+00:00 app[heroku-postgres]: source=HEROKU_POSTGRESQL_GREEN sample#current_transaction=366264 sample#db_size=3561754388bytes sample#tables=83 sample#active-connections=6 sample#waiting-connections=0 sample#index-cache-hit-rate=0 sample#table-cache-hit-rate=0.66952 sample#load-avg-1m=1.245 sample#load-avg-5m=0.845 sample#load-avg-15m=0.62 sample#read-iops=1.2 sample#write-iops=2554.6 sample#memory-total=15405668kB sample#memory-free=171092kB sample#memory-cached=14145576kB sample#memory-postgres=490544kB
          2015-08-09T17:52:46Z app[postgres.11021]: [GREEN] canceling statement due to user request
          Query: COPY rankings (id, person_id, max, min, current, date, period_years_min_max, created_at, updated_at) FROM stdin;
          2015-08-09T17:52:46Z app[postgres.11021]: [GREEN] could not receive data from client: Connection reset by peer
          2015-08-09T17:52:46Z app[postgres.11021]: [GREEN] incomplete message from client
      

      下面是heroku logs -t的完整输出:

      015-08-10T02:55:04Z app[postgres.3977]: [GREEN] statement: DROP SCHEMA public CASCADE 
      2015-08-10T02:55:05Z app[postgres.3675]: [GREEN] canceling autovacuum task 
      2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] sending cancel to blocking autovacuum PID 3675 
      Detail: Process 3977 waits for AccessExclusiveLock on relation 35169 of database 16385. 
      Query: DROP SCHEMA public CASCADE 
      2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] process 3977 still waiting for AccessExclusiveLock on relation 35169 of database 16385 after 1000.367 ms 
      Detail: Process holding the lock: 3675. Wait queue: 3977. 
      Query: DROP SCHEMA public CASCADE 
      2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] process 3977 acquired AccessExclusiveLock on relation 35169 of database 16385 after 1000.779 ms 
      Query: DROP SCHEMA public CASCADE 
      2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] drop cascades to 97 other objects 
      Detail: drop cascades to extension hstore 
      drop cascades to type daily_box_input 
      //<--84 tables, materialized views and views redacted for brevity-->// 
      drop cascades to table users 
      drop cascades to extension pg_stat_statements 
      2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] statement: CREATE SCHEMA public 
      2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] statement: ALTER SCHEMA public OWNER TO uca46vrebkj2l9 
      2015-08-10T02:54:33+00:00 app[heroku-postgres]: source=HEROKU_POSTGRESQL_GREEN sample#current_transaction=368590 sample#db_size=4203753236bytes sample#tables=83 sample#active-connections=3 sample#waiting-connections=0 sample#index-cache-hit-rate=0 sample#table-cache-hit-rate=0.66674 sample#load-avg-1m=0.45 sample#load-avg-5m=0.71 sample#load-avg-15m=0.66 sample#read-iops=17.058 sample#write-iops=498.2 sample#memory-total=15405668kB sample#memory-free=283204kB sample#memory-cached=14065608kB sample#memory-postgres=477864kB 
      2015-08-10T02:55:57.359270+00:00 heroku[router]: at=info code=H80 desc="Maintenance mode" method=GET path="/" host=www.studiocomps.com request_id=e0d6bbd7-076f-4a56-93a1-2f80a45d28c4 fwd="74.86.158.110" dyno= connect= service= status=503 bytes= 
      2015-08-10T02:56:02Z app[postgres.4009]: [GREEN] statement: CREATE EXTENSION IF NOT EXISTS pg_stat_statements 
      2015-08-10T02:56:03.118589+00:00 heroku[router]: at=info code=H80 desc="Maintenance mode" method=GET path="/" host=test.studiocomps.com request_id=25fbbab0-8005-454c-b432-0f9b91cd5a75 fwd="69.64.56.47" dyno= connect= service= status=503 bytes= 
      2015-08-10T02:55:43+00:00 app[heroku-postgres]: source=HEROKU_POSTGRESQL_GREEN sample#current_transaction=368608 sample#db_size=9137940bytes sample#tables=0 sample#active-connections=2 sample#waiting-connections=0 sample#index-cache-hit-rate=0 sample#table-cache-hit-rate=0 sample#load-avg-1m=0.625 sample#load-avg-5m=0.7 sample#load-avg-15m=0.66 sample#read-iops=234.98 sample#write-iops=226.24 sample#memory-total=15405668kB sample#memory-free=4200392kB sample#memory-cached=10112136kB sample#memory-postgres=470496kB
      

      和heroku日志的尾端输出-t --ps postgres:

      2015-08-10T02:46:35Z app[postgres.2427]: [GREEN] duration: 3538.409 ms statement: COPY flows (id, created_at, updated_at, medium_id, value_subunit, timing_id, ultimate_ratio_id, sheet_id, company_id, value_to_date_subunit, period_start, period_end, currency_code, local_title, kind_id, theaters, screens, flowable_id, flowable_type) FROM stdin;
      2015-08-10T02:47:15Z app[postgres.11]: [GREEN] checkpoint starting: xlog 
      2015-08-10T02:47:53Z app[postgres.11]: [GREEN] checkpoint complete: wrote 14930 buffers (47.8%); 0 transaction log file(s) added, 0 removed, 55 recycled; write=30.562 s, sync=7.386 s, total=38.138 s; sync files=219, longest=5.843 s, average=0.033 s 
      2015-08-10T02:47:57Z app[postgres.2829]: [GREEN] statement: CREATE EXTENSION IF NOT EXISTS pg_stat_statements 
      2015-08-10T02:48:13Z app[postgres.11]: [GREEN] checkpoint starting: xlog 
      2015-08-10T02:48:42Z app[postgres.11]: [GREEN] checkpoint complete: wrote 3766 buffers (12.1%); 0 transaction log file(s) added, 0 removed, 38 recycled; write=26.669 s, sync=1.943 s, total=28.658 s; sync files=36, longest=1.118 s, average=0.053 s 
      2015-08-10T02:49:11Z app[postgres.11]: [GREEN] checkpoint starting: xlog 
      2015-08-10T02:49:25Z app[postgres.11]: [GREEN] checkpoint complete: wrote 80 buffers (0.3%); 0 transaction log file(s) added, 0 removed, 26 recycled; write=2.021 s, sync=11.500 s, total=13.529 s; sync files=3, longest=6.301 s, average=3.833 s 
      2015-08-10T02:50:15Z app[postgres.11]: [GREEN] checkpoint starting: xlog 
      2015-08-10T02:50:20Z app[postgres.11]: [GREEN] checkpoint complete: wrote 81 buffers (0.3%); 0 transaction log file(s) added, 0 removed, 38 recycled; write=1.586 s, sync=3.825 s, total=5.477 s; sync files=2, longest=3.823 s, average=1.912 s 
      2015-08-10T02:51:08Z app[postgres.2427]: [GREEN] canceling statement due to user request 
      Query: COPY rankings (id, person_id, max, min, current, date, period_years_min_max, created_at, updated_at) FROM stdin; 
      2015-08-10T02:51:08Z app[postgres.2427]: [GREEN] could not receive data from client: Connection reset by peer 
      2015-08-10T02:51:08Z app[postgres.2427]: [GREEN] incomplete message from client 
      2015-08-10T02:51:27Z app[postgres.11]: [GREEN] checkpoint starting: xlog 
      2015-08-10T02:54:17Z app[postgres.11]: [GREEN] checkpoint complete: wrote 2098 buffers (6.7%); 0 transaction log file(s) added, 0 removed, 58 recycled; write=169.538 s, sync=0.952 s, total=170.500 s; sync files=11, longest=0.887 s, average=0.086 s 
      2015-08-10T02:55:04Z app[postgres.3977]: [GREEN] statement: DROP SCHEMA public CASCADE 
      2015-08-10T02:55:05Z app[postgres.3675]: [GREEN] canceling autovacuum task 
      2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] sending cancel to blocking autovacuum PID 3675 
      Detail: Process 3977 waits for AccessExclusiveLock on relation 35169 of database 16385. 
      Query: DROP SCHEMA public CASCADE 
      2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] process 3977 still waiting for AccessExclusiveLock on relation 35169 of database 16385 after 1000.367 ms 
      Detail: Process holding the lock: 3675. Wait queue: 3977. 
      Query: DROP SCHEMA public CASCADE 
      2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] process 3977 acquired AccessExclusiveLock on relation 35169 of database 16385 after 1000.779 ms 
      Query: DROP SCHEMA public CASCADE 
      2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] drop cascades to 97 other objects 
      Detail: drop cascades to extension hstore 
      drop cascades to type daily_box_input 
      //<--84 tables, materialized views and views redacted for brevity-->// 
      drop cascades to table users 
      drop cascades to extension pg_stat_statements 
      2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] statement: CREATE SCHEMA public 
      2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] statement: ALTER SCHEMA public OWNER TO uca46vrebkj2l9 
      2015-08-10T02:56:02Z app[postgres.4009]: [GREEN] statement: CREATE EXTENSION IF NOT EXISTS pg_stat_statements
      

1 个答案:

答案 0 :(得分:2)

好的,我很难发帖,因为我感觉很愚蠢,但直接解决我的问题是我没有在Amazon S3上公开我的链接。这就是我收到403错误的原因。

潜在的混淆是您可以使用非公共链接,这是一长串编码字符。但它使用缓存的URL。由于缓存是临时的,因此在恢复过程完成之前缓存已被丢弃。

所以答案是公开Amazon S3链接,然后使用公共链接,例如:https://s3-us-west-2.amazonaws.com/your_bucket_name/your_file_name.dump

然后:heroku pg:backups restore 'https://s3-us-west-2.amazonaws.com/your_bucket_name/your_file_name.dump' HEROKU_DATABASE_URL

话虽如此,在这个问题的同时,我相信仍然会出现“死锁”的问题。我需要单独解决与AccessExclusiveLock相关的问题,这使得pg_restore需要足够长的时间才能使缓存超时,但这是一个单独的问题。