import_logs.py fails when trying to parse a large Apache access log (larger than 900 MB)

Asked: 2017-01-24 10:45:03

Tags: apache mariadb matomo

I am trying to parse a large Apache log, but import_logs.py fails with this error:

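2017-01-23 18:30:39,245: [INFO] Max number of attempts reached, server is unreachable!

Fatal error: HTTP Error 500 Internal Server Error, response: {"status":"error","tracked":0,"invalid":0,"invalid_indices":[]}

You can restart the import of "/awdata/piwik/cosmote/geratgweb04/www.cosmote.gr_with_ssl-access.log-20170123" from the point it failed by specifying --skip=78454 on the command line.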

I am using a virtual machine with 8 CPUs and 8 GB of RAM.

My OS is RHEL 7.3,

Apache is Apache/2.4.6 (Red Hat Enterprise Linux),

PHP is 7.0.14, MariaDB is 5.5.52, and Piwik is 3.0.1.

The command I am using is:

/var/www/html/zak/piwik/misc/log-analytics/import_logs.py --url=http://middlinf.ote.gr:81/piwik/ --idsite=6 --recorders=8 --enable-http-errors --enable-http-redirects --enable-static /awdata/piwik/cosmote/geratgweb04/www.cosmote.gr_with_ssl-access.log-20170123 --retry-max-attempts=20

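The Apache error log shows:

[Tue Jan 24 13:44:33.855134 2017] [:error] [pid 17456:tid 140544525190912] [client 172.18.20.26:16610] Error in Piwik (tracker): Error query: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction In query: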

UPDATE piwik_log_visit SET idvisitor = ?, user_id = ?,
    visit_last_action_time = ?, visit_exit_idaction_url = ?,
    visit_total_actions = visit_total_actions + 1 ,
    visit_total_interactions = visit_total_interactions + 1 ,
    visit_total_time = ?
    WHERE idsite = ? AND idvisit = ?

Parameters: array(0 => '\xa6\x96\xbc\xef\xb9\xde\xf5', 1 => '""', 2 => '2017-01-23 08:14:43', 3 => 47298, 4 => 0, 5 => 4, 6 => 11224, )

[Tue Jan 24 13:51:58.582401 2017] [:error] [pid 18419:tid 140544525190912] [client 172.18.20.26:18232] Error in Piwik (tracker): Error query: SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction In query:

UPDATE piwik_log_visit SET idvisitor = ?, user_id = ?,
    visit_last_action_time = ?, visit_exit_idaction_url = ?,
    visit_total_actions = visit_total_actions + 1 ,
    visit_total_interactions = visit_total_interactions + 1 ,
    visit_total_time = ?
    WHERE idsite = ? AND idvisit = ?

Parameters: array(0 => '\xa6\x96\xbc\xef\xb9\xde\xf5', 1 => '""', 2 => '2017-01-23 08:42:33', 3 => 49791, 4 => 242, 5 => 4, 6 => 11371, )

And the MariaDB log shows:

Time: 170123 18:00:46
User@Host: root[root] @ localhost [127.0.0.1]
Thread_id: 1691 Schema: piwik_db QC_hit: No
Query_time: 3.858223 Lock_time: 0.000060 Rows_sent: 1 Rows_examined: 1
     SET timestamp=1485187246;
     SELECT visit_last_action_time, visit_first_action_time, idvisitor,
 idvisit, user_id, visit_exit_idaction_url, visit_exit_idaction_name,
 visitor_returning, visitor_days_since_first, visitor_days_since_order,
 visitor_count_visits, visit_goal_buyer, location_country,
 location_region, location_city, location_latitude, location_longitude,
 referer_name, referer_keyword, referer_type, idsite,
 visit_entry_idaction_url, visit_total_actions,
 visit_total_interactions, visit_total_searches, config_device_brand,
 config_device_model, config_device_type, visit_total_events,
 visit_total_time, location_ip, location_browser_lang, custom_var_k1,
 custom_var_v1, custom_var_k2, custom_var_v2, custom_var_k3,
 custom_var_v3, custom_var_k4, custom_var_v4, custom_var_k5,
 custom_var_v5 FROM piwik_log_visit
WHERE visit_last_action_time >= '2017-01-22 05:14:21'
  AND visit_last_action_time <= '2017-01-22 06:14:21'
  AND idsite = '6' AND idvisitor = '
ORDER BY visit_last_action_time DESC

I did some research on the forums but did not find anything useful. Do you have any suggestions?

Thanks in advance.

-Thanassis

1 answer:

Answer 0 (score: 0)

Please provide SHOW CREATE TABLE piwik_log_visit. I suspect you are missing the composite

INDEX(idsite, idvisit)

(The columns can be in either order.)
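Once the SHOW CREATE TABLE output is available, checking it for such an index can be scripted. The following is only a rough sketch: the helper names index_columns and has_leading_columns are made up for illustration, and SAMPLE_DDL is an invented, simplified table definition, not the real Piwik schema. It assumes simple index definitions without prefix lengths such as `col`(10).

```python
import re

# Matches lines such as:  PRIMARY KEY (`a`)   or   KEY `name` (`a`,`b`)
INDEX_LINE = re.compile(
    r"^\s*(PRIMARY KEY|(?:UNIQUE\s+)?(?:KEY|INDEX)\s+`?(\w+)`?)\s*\(([^)]*)\)",
    re.IGNORECASE | re.MULTILINE,
)

def index_columns(ddl):
    """Map each index name found in a SHOW CREATE TABLE dump to its columns."""
    indexes = {}
    for match in INDEX_LINE.finditer(ddl):
        name = match.group(2) or "PRIMARY"
        indexes[name] = [c.strip().strip("`") for c in match.group(3).split(",")]
    return indexes

def has_leading_columns(ddl, wanted):
    """True if some index starts with exactly the wanted columns, in any order."""
    wanted_set = set(wanted)
    return any(
        set(cols[: len(wanted)]) == wanted_set
        for cols in index_columns(ddl).values()
    )

# Hypothetical DDL, invented for this example only.
SAMPLE_DDL = """
CREATE TABLE `example_log_visit` (
  `idvisit` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `idsite` int(10) unsigned NOT NULL,
  `idvisitor` binary(8) NOT NULL,
  `visit_last_action_time` datetime NOT NULL,
  PRIMARY KEY (`idvisit`),
  KEY `index_idsite_idvisitor` (`idsite`,`idvisitor`)
) ENGINE=InnoDB
"""

print(index_columns(SAMPLE_DDL))
print(has_leading_columns(SAMPLE_DDL, ["idsite", "idvisit"]))
print(has_leading_columns(SAMPLE_DDL, ["idvisitor", "idsite"]))
```

The order-insensitive comparison reflects the remark above: when both columns are tested with equality, either column order in the index serves the lookup.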

For

WHERE visit_last_action_time >= '2017-01-22 05:14:21'
  AND visit_last_action_time <= '2017-01-22 06:14:21'
  AND idsite = '6' AND idvisitor = '

you need INDEX(idsite, idvisitor, visit_last_action_time), making sure to put the range column last.
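To see the effect of that column order without touching the production database, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in. SQLite's planner is not MariaDB's, so treat this purely as an illustration; on the real server, EXPLAIN on the actual query is the thing to check. The table log_visit and the index name idx_site_visitor_time are made up for the example.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical stand-in for the visits table.
conn.execute("""
    CREATE TABLE log_visit (
        idvisit INTEGER PRIMARY KEY,
        idsite INTEGER NOT NULL,
        idvisitor BLOB NOT NULL,
        visit_last_action_time TEXT NOT NULL
    )
""")
# Equality columns first, range column last.
conn.execute("""
    CREATE INDEX idx_site_visitor_time
    ON log_visit (idsite, idvisitor, visit_last_action_time)
""")

QUERY = """
    SELECT idvisit FROM log_visit
    WHERE idsite = ? AND idvisitor = ?
      AND visit_last_action_time >= ?
      AND visit_last_action_time < ?
    ORDER BY visit_last_action_time DESC
"""
plan = query_plan(
    conn, QUERY, (6, b"\x00" * 8, "2017-01-22 05:14:21", "2017-01-22 06:14:21")
)
print(plan)
```

With the equality columns leading, the index can narrow the search to one site and one visitor and then walk only the requested time window.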

Did you notice that the range covers 3601 seconds? I recommend this pattern for various reasons:

WHERE visit_last_action_time >= '2017-01-22 05:14:21'
  AND visit_last_action_time  < '2017-01-22 05:14:21' + INTERVAL 1 HOUR
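The 3601 figure comes from both endpoints being included. A quick sketch of the arithmetic, assuming timestamps with whole-second resolution (the helper seconds_matched is just for illustration):

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

def seconds_matched(start, end, inclusive_end):
    """Count whole-second timestamps t with start <= t <= end (or t < end)."""
    span = int((end - start).total_seconds())
    return span + 1 if inclusive_end else span

start = datetime.strptime("2017-01-22 05:14:21", FMT)
closed_end = datetime.strptime("2017-01-22 06:14:21", FMT)
half_open_end = start + timedelta(hours=1)  # same instant, used as an exclusive bound

closed_count = seconds_matched(start, closed_end, inclusive_end=True)
half_open_count = seconds_matched(start, half_open_end, inclusive_end=False)
print(closed_count, half_open_count)
```

With the half-open form, consecutive one-hour windows tile the timeline without overlap, so a row that falls exactly on a boundary is counted in one window only.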