Regex to match a pattern across lines, then find and replace

Time: 2018-01-17 13:01:12

Tags: regex replace

Hi, I have a large number of files like this:

1234567890\t this is head row1 and some random text here... \r\n
12\t line1 some randomtexthere... \r\n
6549853695\t this is head row2 and some random text here... \r\n
6\t line1 some randomtexthere... \r\n
1\t line2 some randomtexthere... \r\n
54\t iine3 some randomtexthere... \r\n
2158965845\t this is head row3 and some random text here... \r\n
1\t line1 some randomtexthere... \r\n
25\t line2 some randomtexthere... \r\n

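Basically these are head rows and data rows. I need to add the head number to every following row, up to the next head. The number of data rows after a head row can vary from 1 to 200! Every data row starts with a number from 1 to 99 followed by a tab. Every head row starts with a ten-digit number followed by a tab.

Currently I have: Find: (^(([0-9]{10}\t).*\r\n)((([0-9]{1})|([0-9]{2}))\t.*\r\n))

Replace with: \2\3\4

Each pass only prefixes the first data row after a head, so my current option is to repeat the find and replace 200 times in EmEditor until everything has been replaced... but given the number of files and their size, that takes a lot of time...

Any ideas for a magic solution? The final result should look like this: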

1234567890\t this is head row1 and some random text here... \r\n
1234567890\t12\t line1 some randomtexthere... \r\n
6549853695\t this is head row2 and some random text here... \r\n
6549853695\t6\t line1 some randomtexthere... \r\n
6549853695\t1\t line2 some randomtexthere... \r\n
6549853695\t54\t iine3 some randomtexthere... \r\n
2158965845\t this is head row3 and some random text here... \r\n
2158965845\t1\t line1 some randomtexthere... \r\n
2158965845\t25\t line2 some randomtexthere... \r\n

1 answer:

Answer 0: (score: 0)

Use a perl one-liner:

perl -F'\t' -ape 'if ($F[0]=~/^\d{10}$/) { $id = $F[0]; } else { $_ = "$id\t$_"; }' <input_file >output_file

or, with -i, to modify the files in place:

perl -i.bak -F'\t' -ape 'if ($F[0]=~/^\d{10}$/) { $id = $F[0]; } else { $_ = "$id\t$_"; }' files

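From perl -h:

  • -i[extension]: edit <> files in place (makes backup if extension supplied)
  • -F/pattern/: split() pattern for -a switch (//'s are optional)
  • -a: autosplit mode with -n or -p (splits $_ into @F)
  • -p: assume loop like -n but print line also, like sed
  • -e program: one line of program (several -e's allowed, omit programfile)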