I have some log files that contain many entries like these:
[26-Nov-2010 07:33:08] query error: INSERT INTO members (id,name,member_login_key,email,mgroup,posts,joined,ip_address,time_offset,view_sigs,email_pm,view_img,view_avs,restrict_post,view_pop,msg_total,new_msg,coppa_user,language,dst_in_use,allow_admin_mails,hide_email,subs_pkg_chosen,members_l_username,members_l_display_name, item_id, members_display_name)
VALUES(8416961,'abc','3857b123a1a67ce1fc4a39fd7ae47355','test@email.com',1,0,1290756788,'127.0.0.1','',1,1,1,1,
0,1,0,0,0,'',0,1,0,0,'abc','abc',
'0', 'abc');|http://www.example.com/|Duplicate entry '8388607' for key 1
[26-Nov-2010 08:33:08] query error: INSERT INTO members (id,name,member_login_key,email,mgroup,posts,joined,ip_address,time_offset,view_sigs,email_pm,view_img,view_avs,restrict_post,view_pop,msg_total,new_msg,coppa_user,language,dst_in_use,allow_admin_mails,hide_email,subs_pkg_chosen,members_l_username,members_l_display_name, item_id, members_display_name)
VALUES(8416962,'abc','3857b123a1a67ce1fc4a39fd7ae47355','test@email.com',1,0,1290756788,'127.0.0.1','',1,1,1,1,
0,1,0,0,0,'',0,1,0,0,'abc','abc',
'0', 'abc');|http://www.example.com/|Duplicate entry '8388607' for key 1
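What I want to do is run a regular expression that matches all of the INSERT queries (ignoring the timestamp, the URL, and the duplicate-entry message).
So it should return: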
INSERT INTO members (id,name,member_login_key,email,mgroup,posts,joined,ip_address,time_offset,view_sigs,email_pm,view_img,view_avs,restrict_post,view_pop,msg_total,new_msg,coppa_user,language,dst_in_use,allow_admin_mails,hide_email,subs_pkg_chosen,members_l_username,members_l_display_name, item_id, members_display_name)
VALUES(8416961,'abc','3857b123a1a67ce1fc4a39fd7ae47355','test@email.com',1,0,1290756788,'127.0.0.1','',1,1,1,1,
0,1,0,0,0,'',0,1,0,0,'abc','abc',
'0', 'abc');
INSERT INTO members (id,name,member_login_key,email,mgroup,posts,joined,ip_address,time_offset,view_sigs,email_pm,view_img,view_avs,restrict_post,view_pop,msg_total,new_msg,coppa_user,language,dst_in_use,allow_admin_mails,hide_email,subs_pkg_chosen,members_l_username,members_l_display_name, item_id, members_display_name)
VALUES(8416962,'abc','3857b123a1a67ce1fc4a39fd7ae47355','test@email.com',1,0,1290756788,'127.0.0.1','',1,1,1,1,
0,1,0,0,0,'',0,1,0,0,'abc','abc',
'0', 'abc');
Can anyone help? Thanks in advance!
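Answer 0: (score: 0)
Do you want to extract part of it, or just match it?
Just matching is easy and does not need a regular expression at all, only the substring INSERT INTO.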
grep 'INSERT INTO' foo.log
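If you want to extract details or match something more specific, please provide more information.
If you also want the three lines that follow each match, you can do this: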
grep -A 3 'INSERT INTO' foo.log
If you want to trim the text before and after each statement (this is ugly, but it works for your example):
grep -A 3 'INSERT INTO' foo.log | sed -e 's/^.*INSERT INTO/INSERT INTO/' -e 's/);|.*/);/'
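The same line-oriented idea can be sketched in Python for cases where grep and sed are not available. This is only a rough equivalent of the pipeline above, not a drop-in replacement: the function name `iter_trimmed` and the shortened sample lines are made up for illustration, and it assumes, as the pipeline does, that every statement occupies the matching line plus the three lines after it.

```python
import re
from typing import Iterable, Iterator


def iter_trimmed(lines: Iterable[str], after: int = 3) -> Iterator[str]:
    """Yield each line containing INSERT INTO plus the `after` lines that follow, trimmed."""
    remaining = 0
    for line in lines:
        if "INSERT INTO" in line:
            remaining = after + 1  # the matching line itself plus its context lines
        if remaining == 0:
            continue  # outside any statement, so skip the line
        remaining -= 1
        line = line.rstrip("\n")
        # Drop the timestamp and "query error:" prefix in front of the statement.
        line = re.sub(r"^.*INSERT INTO", "INSERT INTO", line)
        # Drop the URL and error message that follow the closing ");".
        yield re.sub(r"\);\|.*", ");", line)


# Shortened, made-up sample in the same shape as the log in the question.
sample_lines = [
    "[26-Nov-2010 07:33:08] query error: INSERT INTO members (id,name)\n",
    "VALUES(8416961,'abc',\n",
    "0,1,\n",
    "'0', 'abc');|http://www.example.com/|Duplicate entry '8388607' for key 1\n",
    "an unrelated line that should be skipped\n",
]

for trimmed in iter_trimmed(sample_lines):
    print(trimmed)
```

Because it consumes lines one at a time, the generator can be fed an open file object directly, so a large log does not have to be read into memory first.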
Answer 1: (score: 0)
If every insert spans exactly 4 lines of the log file, then you can use this regular expression:
(.*)(INSERT INTO.*\n.*\n.*\n.*\))(;.*)
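with this replacement string (group 2 ends at the closing parenthesis, so the semicolon has to be added back; the newline after each match is not part of the match and stays in place):
\2;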
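As a sketch of how such a substitution could be run, here is a Python version that uses the pattern above unchanged. It appends the semicolon in the replacement because group 2 stops at the closing parenthesis, and it relies on `.` not matching newlines, so the line break after each entry is left alone. The names `extract_inserts` and `sample_log` are illustrative, and the sample is a shortened stand-in for the real log.

```python
import re

# Pattern from the answer: group 2 is the four-line INSERT statement up to ")".
PATTERN = re.compile(r"(.*)(INSERT INTO.*\n.*\n.*\n.*\))(;.*)")


def extract_inserts(log_text: str) -> str:
    """Replace each four-line log entry with just its INSERT statement."""
    # "\2" is the statement without its semicolon, so the ";" is appended here.
    return PATTERN.sub(r"\2;", log_text)


# Shortened, made-up sample in the same shape as the log in the question.
sample_log = (
    "[26-Nov-2010 07:33:08] query error: INSERT INTO members (id,name)\n"
    "VALUES(8416961,'abc',\n"
    "0,1,\n"
    "'0', 'abc');|http://www.example.com/|Duplicate entry '8388607' for key 1\n"
    "[26-Nov-2010 08:33:08] query error: INSERT INTO members (id,name)\n"
    "VALUES(8416962,'abc',\n"
    "0,1,\n"
    "'0', 'abc');|http://www.example.com/|Duplicate entry '8388607' for key 1\n"
)

print(extract_inserts(sample_log), end="")
```

Note that this only rewrites the entries that match; any other lines in the log pass through unchanged, so it suits a file that consists entirely of these four-line entries.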
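Answer 2: (score: 0)
This should be possible, but it depends heavily on whether the whole file follows the same format.
This only grabs the INSERT statements; if you want the full log entries, the regular expression needs to change slightly.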
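// Read the whole log into memory.
$logFile = file_get_contents('inserts.log');
$matches = array();
// The /s modifier lets "." match newlines, so the lazy ".+?" spans the
// multi-line statement and stops at its first semicolon.
preg_match_all("/(?P<insert>INSERT .+?;)/s", $logFile, $matches);
foreach ($matches['insert'] as $cQuery) {
    echo $cQuery . "\n";
}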
For details on this approach, see the preg_match_all documentation.
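For readers not using PHP, the same lazy, dot-matches-newline search can be sketched in Python with `re.findall`. The helper name `find_inserts` and the shortened sample are assumptions made for illustration; the pattern itself is the one from the PHP call, without the named group.

```python
import re
from typing import List

# re.DOTALL plays the role of PHP's /s modifier: "." also matches newlines.
INSERT_RE = re.compile(r"INSERT .+?;", re.DOTALL)


def find_inserts(log_text: str) -> List[str]:
    """Return every INSERT statement, from the keyword up to the first semicolon."""
    return INSERT_RE.findall(log_text)


# Shortened, made-up sample in the same shape as the log in the question.
sample_text = (
    "[26-Nov-2010 07:33:08] query error: INSERT INTO members (id,name)\n"
    "VALUES(8416961,'abc',\n"
    "0,1,\n"
    "'0', 'abc');|http://www.example.com/|Duplicate entry '8388607' for key 1\n"
    "[26-Nov-2010 08:33:08] query error: INSERT INTO members (id,name)\n"
    "VALUES(8416962,'abc',\n"
    "0,1,\n"
    "'0', 'abc');|http://www.example.com/|Duplicate entry '8388607' for key 1\n"
)

for statement in find_inserts(sample_text):
    print(statement)
```

One caveat applies to both versions: the lazy match ends at the first semicolon it sees, so a statement whose values contain a `;` would be cut short. If the log always follows a statement with `);|`, ending the pattern on that sequence instead is one way to tighten it.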