Question

I have a lot of text like the following:

'str_aaa_2012-0000.txt' 'http://weburl.com' 'SHA256' 'hashdata' 'str_aaa_2012-0001.txt' 'http://weburl.com' 'SHA256' 'hashdata' 'str_aaa_2012-0002.txt' 'http://weburl.com' 'SHA256' 'hashdata' 'str_aaa_2012-0003.txt' 'http://weburl.com' 'SHA256' 'hashdata' 'str_aaa_2012-0004.txt' 'http://weburl.com' 'SHA256' 'hashdata' 'str_aaa_2012-0005.txt' 'http://weburl.com' 'SHA256' 'hashdata'

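The first string, containing str_aaa, varies at the end; the second string, containing the URL, varies; the third string, containing SHA256, stays the same throughout the text; and the fourth string, containing hashdata, differs. I want to split the text so that each line starts at the string str_aaa, with output that looks like: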

'str_aaa_2012*

How can I split this text into separate lines?

Answer 1

You can use this GNU sed command:

sed -E "s/(('[^']*' ){3}'[^']*') +/\1\n/g" file
'str_aaa_2012-0000.txt' 'http://weburl.com' 'SHA256' 'hashdata'
'str_aaa_2012-0001.txt' 'http://weburl.com' 'SHA256' 'hashdata'
'str_aaa_2012-0002.txt' 'http://weburl.com' 'SHA256' 'hashdata'
'str_aaa_2012-0003.txt' 'http://weburl.com' 'SHA256' 'hashdata'
'str_aaa_2012-0004.txt' 'http://weburl.com' 'SHA256' 'hashdata'
'str_aaa_2012-0005.txt' 'http://weburl.com' 'SHA256' 'hashdata'
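
For readers unfamiliar with the pattern, here is a small sketch of the same substitution on a shortened, made-up two-record input (the values a.txt, u1, h1 and so on are placeholders, not data from the question). It assumes GNU sed; other sed implementations may not interpret \n in the replacement as a newline.

```shell
# Shortened sample: two records of four quoted fields on one line.
line="'a.txt' 'u1' 'SHA256' 'h1' 'b.txt' 'u2' 'SHA256' 'h2'"

# ('[^']*' ){3}   three quoted fields, each followed by one space
# '[^']*'         the fourth quoted field, closing capture group 1
# (space)+        the separator before the next record, replaced by \n
split_sed=$(printf '%s\n' "$line" | sed -E "s/(('[^']*' ){3}'[^']*') +/\1\n/g")
printf '%s\n' "$split_sed"
```

The last record has no trailing space, so it does not match and is left unchanged, which is why no empty line appears at the end.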

Answer 2

Perl to the rescue:

perl -aF\' -ne '
    shift @F,
        print join "\x27", "", splice(@F, 0, 7), "\n"
        while @F > 1;
' input-file

-aF\' splits the line on ' into the @F array. The elements are then removed from the array in groups and printed.

Answer 3

Using Bash parameter expansion:

$ str="'str_aaa_2012-0000.txt' 'http://weburl.com' 'SHA256' 'hashdata' 'str_aaa_2012-0001.txt' 'http://weburl.com' 'SHA256' 'hashdata' 'str_aaa_2012-0002.txt' 'http://weburl.com' 'SHA256' 'hashdata' 'str_aaa_2012-0003.txt' 'http://weburl.com' 'SHA256' 'hashdata' 'str_aaa_2012-0004.txt' 'http://weburl.com' 'SHA256' 'hashdata' 'str_aaa_2012-0005.txt' 'http://weburl.com' 'SHA256' 'hashdata'"
$ printf "%b\n" "${str// \'str_aaa/\\n\'str_aaa}"
'str_aaa_2012-0000.txt' 'http://weburl.com' 'SHA256' 'hashdata'
'str_aaa_2012-0001.txt' 'http://weburl.com' 'SHA256' 'hashdata'
'str_aaa_2012-0002.txt' 'http://weburl.com' 'SHA256' 'hashdata'
'str_aaa_2012-0003.txt' 'http://weburl.com' 'SHA256' 'hashdata'
'str_aaa_2012-0004.txt' 'http://weburl.com' 'SHA256' 'hashdata'
'str_aaa_2012-0005.txt' 'http://weburl.com' 'SHA256' 'hashdata'

The details of the expansion

printf "%b\n" "${str// \'str_aaa/\\n\'str_aaa}"

are:

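${str//old/new} replaces every occurrence of old in the parameter str with new (as opposed to ${str/old/new}, which replaces only the first occurrence).
Single quotes must be escaped, hence \'.
The backslash in the newline \n must be escaped as well, hence \\n.
Instead of inserting a newline in front of 'str_aaa', we replace the space before it, which avoids a) an empty line at the start and b) extra spaces at the line ends.
To print, we use printf with its %b format specification, which expands backslash escape sequences.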
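
The difference between the single-slash and double-slash forms, and between %s and %b, can be sketched on a made-up sample. The rec_ prefix and the field values below are hypothetical, and the sample deliberately contains no quotes so that only the substitution behaviour is on display. This requires bash; the ${var//old/new} form is not available in plain POSIX sh.

```shell
# Made-up sample: three records on one line, no quoting involved.
str="rec_1 u1 h1 rec_2 u2 h2 rec_3 u3 h3"

first_only="${str/ rec_/\\nrec_}"   # single slash: first match only
all="${str// rec_/\\nrec_}"         # double slash: every match

# $all holds a literal backslash followed by n, not a real newline yet.
printf '%s\n' "$all"

# %b expands the backslash escapes, producing one record per line.
expanded=$(printf '%b\n' "$all")
first_expanded=$(printf '%b\n' "$first_only")
printf '%s\n' "$expanded"
```

Because the space before each later record is what gets replaced, the first record needs no special handling and no line ends with a stray space.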

Split text starting from the beginning of a string
