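How can I use awk or sed to convert the following XML markup into a pipe-delimited text file? I tried awk, but it did not return the full text of the content type tag. Any help would be appreciated.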
Input_file.dat
<entry>
<updated>2014-05-17T16:34:00-07:00</updated>
<id>994568497</id>
<title>No longer usable</title>
<content type="text">I happen to like the new look, but it crashes with each attempt to use it to perform any real action. Fix it quickly please!.</content>
<im:contentType term="Application" label="Application"/>
<im:voteSum>0</im:voteSum>
<im:voteCount>0</im:voteCount>
<im:rating>1</im:rating>
<im:version>4.2.0.165</im:version>
<author><name>Arcdouble</name><uri>https://test.com/us/reviews/id199894255</uri></author>
</entry>
Expected format of output_file.csv:
|2014-05-17T16:34:00-07:00|994568497|No longer usable|I happen to like the new look, but it crashes with each attempt to use it to perform any real action. Fix it quickly please!.|1|Arcdouble|https://test.com/us/reviews/id199894255|
Answer 0 (score: 1)
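The following Perl one-liner should work for you. It prints the text of each element that opens and closes on one line, separated by pipes, with one output line per entry: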
perl -ne '/<\/entry>/ && print "\n"; />(.*?)</ && !/<name>/ && print $1."|"; /<name>/ && /name>?(.*?)<\/.*?(uri>?)(.*)?<\/uri/ && print $1."|".$3'
Input:
tiago@dell:~$ cat file
<entry>
<updated>2014-05-17T16:34:00-07:00</updated>
<id>994568497</id>
<title>No longer usable</title>
<content type="text">I happen to like the new look, but it crashes with each attempt to use it to perform any real action. Fix it quickly please!.</content>
<im:contentType term="Application" label="Application"/>
<im:voteSum>0</im:voteSum>
<im:voteCount>0</im:voteCount>
<im:rating>1</im:rating>
<im:version>4.2.0.165</im:version>
<author><name>Arcdouble</name><uri>https://test.com/us/reviews/id199894255</uri></author>
</entry>
<entry>
<updated>2014-05-17T16:34:00-07:00</updated>
<id>994568497</id>
<title>No longer usable</title>
<content type="text">I happen to like the new look, but it crashes with each attempt to use it to perform any real action. Fix it quickly please!.</content>
<im:contentType term="Application" label="Application"/>
<im:voteSum>0</im:voteSum>
<im:voteCount>0</im:voteCount>
<im:rating>1</im:rating>
<im:version>4.2.0.165</im:version>
<author><name>Arcdouble</name><uri>https://test.com/us/reviews/id199894255</uri></author>
</entry>
Execution:
tiago@dell:~$ cat file | perl -ne '/<\/entry>/ && print "\n"; />(.*?)</ && !/<name>/ && print $1."|"; /<name>/ && /name>?(.*?)<\/.*?(uri>?)(.*)?<\/uri/ && print $1."|".$3'
2014-05-17T16:34:00-07:00|994568497|No longer usable|I happen to like the new look, but it crashes with each attempt to use it to perform any real action. Fix it quickly please!.|0|0|1|4.2.0.165|Arcdouble|https://test.com/us/reviews/id199894255
2014-05-17T16:34:00-07:00|994568497|No longer usable|I happen to like the new look, but it crashes with each attempt to use it to perform any real action. Fix it quickly please!.|0|0|1|4.2.0.165|Arcdouble|https://test.com/us/reviews/id199894255
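Note that this output also contains the voteSum, voteCount and version values and has no leading or trailing pipe, so it differs slightly from the layout requested in the question. If the exact layout matters, one possible awk sketch is shown below. It rests on a few assumptions taken from the sample: every element sits on a single line, the field text contains no pipe character, and XML entities do not need decoding. The helper name `text` is made up for this illustration, and for arbitrary XML a real XML parser would be the safer choice.

```shell
#!/bin/sh
# Recreate the sample entry from the question so the sketch is self-contained.
cat > Input_file.dat <<'EOF'
<entry>
<updated>2014-05-17T16:34:00-07:00</updated>
<id>994568497</id>
<title>No longer usable</title>
<content type="text">I happen to like the new look, but it crashes with each attempt to use it to perform any real action. Fix it quickly please!.</content>
<im:contentType term="Application" label="Application"/>
<im:voteSum>0</im:voteSum>
<im:voteCount>0</im:voteCount>
<im:rating>1</im:rating>
<im:version>4.2.0.165</im:version>
<author><name>Arcdouble</name><uri>https://test.com/us/reviews/id199894255</uri></author>
</entry>
EOF

awk '
# text(line, tag): hypothetical helper that returns the text between
# <tag ...> and </tag> on one line, or "" when the pair is not found.
function text(line, tag,    start, rest, stop) {
    start = match(line, "<" tag "( [^>]*)?>")   # opening tag, attributes allowed
    if (start == 0) return ""
    rest = substr(line, start + RLENGTH)        # everything after the opening tag
    stop = index(rest, "</" tag ">")            # position of the closing tag
    if (stop == 0) return ""
    return substr(rest, 1, stop - 1)
}
# Reset all fields at the start of each entry.
/<entry>/     { updated = id = title = content = rating = name = uri = "" }
# Collect only the fields that appear in the expected output.
/<updated>/   { updated = text($0, "updated") }
/<id>/        { id      = text($0, "id") }
/<title>/     { title   = text($0, "title") }
/<content /   { content = text($0, "content") }
/<im:rating>/ { rating  = text($0, "im:rating") }
/<name>/      { name    = text($0, "name") }
/<uri>/       { uri     = text($0, "uri") }
# Emit one pipe-delimited record, with leading and trailing pipes, per entry.
/<\/entry>/   { printf "|%s|%s|%s|%s|%s|%s|%s|\n", updated, id, title, content, rating, name, uri }
' Input_file.dat > output_file.csv

cat output_file.csv
```

Collecting fields by tag name rather than by position means extra elements such as im:voteSum are simply ignored, and a missing element leaves an empty field instead of shifting the columns.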