Automating text file editing

Time: 2015-07-08 07:31:23

Tags: text editing

I have a text file that looks like this:

+PhoneNumber          3/5/15 7:16 PM          us          Text is here
+PhoneNumber          3/5/15 7:16 PM          us          Text is here
+PhoneNumber          3/5/15 7:16 PM          us          Text is here
+PhoneNumber          3/5/15 7:16 PM          us          Text is here

The problem is that some lines wrap onto the next line like this:

+PhoneNumber          3/5/15 7:16 PM          us          Text is here
but runs down to here 
+PhoneNumber          3/5/15 7:16 PM          us          Text is here
+PhoneNumber          3/5/15 7:16 PM          us          Text is here
but runs down to here 
+PhoneNumber          3/5/15 7:16 PM          us          Text is here
but runs down to here or even
longer like this

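My lines vary in length and break as in the example above. I need every line to look like the first example, i.e. every line should begin with "+PhoneNumber" rather than with message text. Any stray text should be moved back up onto the line before it so that the sentence is complete. The result should look more like this: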

+PhoneNumber          3/5/15 7:16 PM          us          Text is here but runs down to here 
+PhoneNumber          3/5/15 7:16 PM          us          Text is here
+PhoneNumber          3/5/15 7:16 PM          us          Text is here but runs down to here 
+PhoneNumber          3/5/15 7:16 PM          us          Text is here but runs down to here or even longer like this

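I have no idea how to write a script or anything else to do this, so I'm asking for help. Googling hasn't turned up anything useful. Right now I'm editing each line by hand, but there are more than 30,000 lines of text, and doing them all manually would take forever. Any help would be greatly appreciated. Thanks, guys!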

TL;DR: I need a script that moves text back up to the previous line whenever the line it is on does not start with +.

2 answers:

Answer 0 (score: 1)

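I suggest two replacements: first replace every \r\n with a space, then, in regular expression mode, replace (.*?)\+ with $1\r\n+. The + in the search pattern has to be escaped so that it matches a literal plus sign. Note that this starts a new line at every +, so it assumes + never appears inside the message text.

This is quick to do in Notepad++.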
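For anyone who would rather script the two replacements than run them in an editor, here is a minimal Python sketch of the same idea. The function name join_with_two_replacements is made up for illustration, and it shares the assumption that + only ever appears at the start of a record. It also trims the trailing spaces that the first replacement leaves behind, which the editor version does not do.

```python
import re


def join_with_two_replacements(text: str) -> str:
    """Flatten the text, then break it again before every '+'.

    Assumption: '+' only ever marks the start of a record and never
    occurs inside the message text itself.
    """
    # Step 1: turn every line break (CRLF or LF) into a single space.
    flat = text.replace("\r\n", "\n").replace("\n", " ")
    # Step 2: put a line break in front of every '+'.
    broken = re.sub(r"(.*?)\+", r"\1\n+", flat)
    # Tidy up: drop the empty first line and trailing spaces on each record.
    records = [line.rstrip() for line in broken.split("\n") if line.strip()]
    return "\n".join(records) + "\n"
```

Calling it on the whole file contents (for example the string returned by a file object's read()) gives back the repaired text as one string.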

Answer 1 (score: 0)

Assuming you have access to awk:

~ $ cat test.awk
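# A line starting with "+" begins a new record: emit a newline, then the line.
/^\+/ { printf "\n%s", $0; }
# Any other non-empty line is a continuation: append it after a space.
/^[^+]/ { printf " %s", $0; }
# Terminate the last record with a newline.
END { print ""; }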

~ $ cat test.input
+PhoneNumber          3/5/15 7:16 PM          us          Text is here
but runs down to here
+PhoneNumber          3/5/15 7:16 PM          us          Text is here
+PhoneNumber          3/5/15 7:16 PM          us          Text is here
but runs down to here
+PhoneNumber          3/5/15 7:16 PM          us          Text is here
but runs down to here or even
longer like this

~ $ awk -f test.awk <test.input  | tail -n +2
+PhoneNumber          3/5/15 7:16 PM          us          Text is here but runs down to here
+PhoneNumber          3/5/15 7:16 PM          us          Text is here
+PhoneNumber          3/5/15 7:16 PM          us          Text is here but runs down to here
+PhoneNumber          3/5/15 7:16 PM          us          Text is here but runs down to here or even longer like this
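
If awk is not available, the same line-by-line logic can be sketched in Python. This is only an illustration under a few assumptions: the names merge_continuations and fix_file are made up, blank lines are skipped, and a continuation is joined to the previous record with a single space. Because each record is emitted only once it is complete, no leading blank line is produced and there is nothing to trim with tail.

```python
from typing import Iterable, Iterator


def merge_continuations(lines: Iterable[str], marker: str = "+") -> Iterator[str]:
    """Yield one record per marker line, with continuation lines appended."""
    current = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(marker):
            # A new record starts: emit the one collected so far.
            if current is not None:
                yield current
            current = line
        elif line.strip():
            # Continuation text: append it to the current record.
            piece = line.strip()
            current = piece if current is None else f"{current.rstrip()} {piece}"
    if current is not None:
        yield current


def fix_file(src: str, dst: str) -> None:
    """Read src, merge wrapped lines, and write the result to dst."""
    with open(src, encoding="utf-8") as fin, open(dst, "w", encoding="utf-8") as fout:
        for record in merge_continuations(fin):
            fout.write(record + "\n")
```

A call such as fix_file("messages.txt", "messages_fixed.txt") would process the file; both file names are placeholders. The input is read one line at a time, so 30,000 lines are not a problem.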