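Remove newlines only after digits

Time: 2016-11-04 13:24:08

Tags: regex perl shell awk sed

I captured some CSV data from a terminal, but every line was wrapped at 80 characters, so it does not import correctly.

Here are two rows of data: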

28,26166,25180,23645,22824,21257,20080,18921,17893,16702,15650,14647,13667,12691
,11971,11179,10393,9885,9294,8930,8390,8079,7660,7341,6907,6425,6120,5789,5588,5
267,4924,4581,4246,4025,3857,

3423,3567,3636,3633,3714,3844,4543,5887,7287,8499,9
746,10704,11658,12591,13379,13950,14679,14954,14756,14224,13921,13494,12849,1230
0,11970,12240,12867,13475,14310,15962,17624,19105,21075,

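I want to remove a newline if it comes right after a digit or a comma, but not if it sits on a line by itself, because that means a new CSV data series begins.

I cannot figure out how to do this with sed in the shell. If another program such as awk or perl suits this case better, feel free to show me a solution.

Expected output: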

28,26166,25180,23645,22824,21257,20080,18921,17893,16702,15650,14647,13667,12691,11971,11179,10393,9885,9294,8930,8390,8079,7660,7341,6907,6425,6120,5789,5588,5267,4924,4581,4246,4025,3857,

3423,3567,3636,3633,3714,3844,4543,5887,7287,8499,9746,10704,11658,12591,13379,13950,14679,14954,14756,14224,13921,13494,12849,12300,11970,12240,12867,13475,14310,15962,17624,19105,21075,

4 answers:

Answer 0: (score: 3)

Remove the newline if it is preceded by a digit or a comma:

perl -pe 'chomp if /[\d,]$/' input-file > output-file
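  • -p reads the input line by line and prints each line after the code has run
  • chomp removes the trailing newline, if there is one
  • \d matches a digit
  • $ matches at the end of the line, just before the trailing newline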

Answer 1: (score: 1)

Read the input with awk in paragraph mode (an empty RS) and delete every \n inside each record:

$ awk -v RS= '{gsub("\n","")} 1' ip.txt 
28,26166,25180,23645,22824,21257,20080,18921,17893,16702,15650,14647,13667,12691,11971,11179,10393,9885,9294,8930,8390,8079,7660,7341,6907,6425,6120,5789,5588,5267,4924,4581,4246,4025,3857,
3423,3567,3636,3633,3714,3844,4543,5887,7287,8499,9746,10704,11658,12591,13379,13950,14679,14954,14756,14224,13921,13494,12849,12300,11970,12240,12867,13475,14310,15962,17624,19105,21075,


To keep the blank line between series, set ORS to a double newline, although this also adds an extra newline at the very end:

$ awk -v RS= -v ORS='\n\n' '{gsub("\n","")} 1' ip.txt 
28,26166,25180,23645,22824,21257,20080,18921,17893,16702,15650,14647,13667,12691,11971,11179,10393,9885,9294,8930,8390,8079,7660,7341,6907,6425,6120,5789,5588,5267,4924,4581,4246,4025,3857,

3423,3567,3636,3633,3714,3844,4543,5887,7287,8499,9746,10704,11658,12591,13379,13950,14679,14954,14756,14224,13921,13494,12849,12300,11970,12240,12867,13475,14310,15962,17624,19105,21075,
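
If that extra trailing blank line matters, one possible variation (a sketch, not part of the original answer) prints the separator before every record except the first instead of setting ORS. The function name `join_series` and the sample input are made up for illustration:

```shell
# join_series: hypothetical helper that joins wrapped lines per paragraph.
join_series() {
  awk -v RS= '
    {
      gsub("\n", "")               # join the wrapped lines of this series
      if (NR > 1) printf "\n\n"    # blank line between series only
      printf "%s", $0
    }
    END { if (NR) print "" }       # single newline after the last series
  '
}

# Hypothetical sample input: two wrapped series separated by a blank line.
joined=$(printf '1,2,3\n,4,5\n6,\n\n7,8\n9,10,\n' | join_series)
printf '%s\n' "$joined"
```

The output should then end with exactly one newline, with a single blank line between the two series.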

Answer 2: (score: 0)

You can use this regular expression:

(?<!\n)\n(?!\n)

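and replace each match with an empty string. It matches a newline that is neither preceded nor followed by another newline.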

Answer 3: (score: 0)

perl -0pe 's/([\d,])\n([\d,])/$1$2/sg' (file)

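That should do it.

That is, it reads the file without splitting it at line separators, treats the whole file as one string, and removes every newline that has a digit or comma both before and after it.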