Need a regular expression to find lines like xxxx,,,,,," of yyyy",,,, and delete them

Time: 2017-09-20 19:37:04

Tags: mysql regex csv text

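I have been given a very large CSV data file that I need to import into a MySQL database. Unfortunately, the CSV file has a text footer after every 50 rows of data, like this: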

0,,,,,," of 2,401",,,,
10,,,,,," of 2,401",,,,
999,,,,,," of 2,401",,,,
"1,000",,,,,," of 2,401",,,,
"2,396",,,,,," of 2,401",,,,

...etc

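As you can see, the pattern changes once the number reaches 1,000 (the leading page number starts to be wrapped in double quotes). This is beyond my understanding of RegEx. I need a regular expression that identifies all of these lines so they can be deleted.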

1 answer:

Answer 0: (score: 0)

Try

    (\d+|"(\d+,\d+)+"),+" of (\d+|(\d+,\d+)+)",+(\n|$)

It will match all of the following cases:

0,,,,,," of 2,401",,,,
10,,,,,," of 2,401",,,,
999,,,,,," of 2,401",,,,
"1,000",,,,,," of 2,401",,,,
"2,396",,,,,," of 2,401",,,,
10,,,,,," of 2,401,000",,,,
"1,999,822",,,,,," of 2,401,000",,,,
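If it helps, here is a minimal Python sketch of one way to apply the pattern above before importing into MySQL. It reads the file line by line and keeps everything that is not a footer. A few things here are assumptions on my part and not from the question: the `^` and `$` anchors (added so only whole lines match, never the tail of a data row), the UTF-8 encoding, and the names `strip_footers` and `clean_file`.

```python
import re

# The pattern from the answer, with ^ and $ anchors added (an assumption)
# so that a line is dropped only when the whole line is a footer.
FOOTER = re.compile(r'^(\d+|"(\d+,\d+)+"),+" of (\d+|(\d+,\d+)+)",+$')


def strip_footers(lines):
    """Yield only the lines that are not page footers."""
    for line in lines:
        # Strip the line ending before matching so \n and \r\n files both work.
        if not FOOTER.match(line.rstrip("\r\n")):
            yield line


def clean_file(src_path, dst_path):
    """Copy src_path to dst_path, leaving out the footer lines.

    newline="" keeps the original line endings; UTF-8 is an assumption.
    """
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        dst.writelines(strip_footers(src))
```

Because the file is streamed one line at a time, this should cope with a very large CSV without loading it into memory. Note that a genuine data row that happens to have exactly the footer shape would also be dropped, so it is worth checking the removed line count against the expected number of footers (one per 50 rows).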