This is my file:
<department name="/fighters" id="123879" group="channel" case="none" use="no">
<options index_name="index.html" listing="0" sum="no" allowed="no" />
<target prefix="ttp" suffix=".net" />
<type="effort">
<region="20491" readonly="fs1a" readwrite="fs1a" upload="yes" download="yes" repl="yes" hard="0" soft"0" prio="0" write="no" stage="yes" migrate="no" size="0" >
<read="content" readwrite="content" hard="215822106624" soft="237296943104" prio="5" write="yes" stage="yes" migrate="no" size="0" />
<overflow name="20491-set-writable" />
</replicate>
<region="20576" readonly="fs1a" readwrite="fs1a" upload="yes" download="yes" repl="yes" hard="0" soft"0" prio="0" write="no" stage="yes" migrate="no" size="0" >
<read="content" readwrite="content" hard="215822106624" soft="237296943104" prio="5" write="yes" stage="yes" migrate="no" size="0" />
<overflow name="20576-set-writable" />
</replicate>
</replication>
<user="T:106603" />
<user="T:123879" />
<user="test" />
<user="ele::123456" />
<user="company-temp" />
<user="companymw2" />
<user="bird" />
<user="coding11" />
<user="plazamedia" />
<allow go="123456=abcdefghijklmnopqrstuvwxyz" />
</department>
I wrote a bash one-liner like this:
awk < test.xml -Fuser= '{ print $2 }' | sed '/^$/d' | cut -d" " -f1
The result is:
"T:106603"
"T:123879"
"test"
"ele::123456"
"company-temp"
"companymw2"
"bird"
"coding11"
"plazamedia"
But the result I have in mind is:
"T:106603" />
"T:123879" />
"test" />
"ele::123456" />
"company-temp" />
"companymw2" />
"bird" />
"coding11" />
"plazamedia" />
First, how do I delete everything after the second " character?
Second, how do I get only the text between the " " characters?
I would prefer to use sed or awk.
Thanks in advance.
Answer 0 (score: 2)
Try this:
awk -F'"' '/<user=/{ print $2 }' file
Answer 1 (score: 1)
Try this with cut:
cut -d'"' -f 2 test.xml
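One caveat with the plain cut call: it processes every line of the file, so lines such as the target element also contribute their second field, and lines with no " at all are printed unchanged. A minimal sketch of one way around this, assuming the user lines always start with <user= as in the question (the temporary sample file is made up for illustration):

```shell
# Hypothetical sample: a few lines taken from the question's file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<target prefix="ttp" suffix=".net" />
<user="T:106603" />
<user="test" />
</department>
EOF

# Keep only the <user= lines, then take the text between the first pair of quotes.
users=$(grep '^<user=' "$sample" | cut -d'"' -f2)
printf '%s\n' "$users"

rm -f "$sample"
```

This prints the names without the surrounding quotes, one per line.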
Try this with sed.
With quotes ("):
sed 's/^.*\("[^"]\+"\).*/\1/g' test.xml
Without quotes ("):
sed 's/^.*"\([^"]\+\)".*/\1/g' test.xml
Update:
sed -e '/^<user/!{d}' -e '/^<user/s/^.*"\([^"]\+\)".*/\1/' test.xml
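Answer 2 (score: 1)
If you want to get rid of the sed and cut in your pipeline, there are many ways to do it, depending on the corner cases. The simplest, to me, seems to be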
awk -F'"' '/<user=/ { print "\"" $2 "\"" }' test.xml
As usual, here is the obligatory "don't parse XML with regex" link.
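The slightly interesting corner cases arise if double quotes can be quoted inside the string (though XML would usually use entities) or if an element can have multiple attributes. If there can be multiple <user=...> elements on one line, this quickly becomes more complicated than the proper solution.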
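To illustrate the last corner case, here is a rough sketch of what handling several user elements on one line might look like in POSIX awk. The input line is invented for the example, and the sketch still assumes that values never contain a " character:

```shell
# Hypothetical input: two user elements on the same line.
line='<user="bird" /><user="coding11" />'

users=$(printf '%s\n' "$line" | awk '{
  rest = $0
  # match() sets RSTART and RLENGTH for the leftmost occurrence in rest.
  while (match(rest, /<user="[^"]*"/)) {
    # Skip the 6 characters of <user= and keep the quoted value.
    print substr(rest, RSTART + 6, RLENGTH - 6)
    # Continue scanning after the current match.
    rest = substr(rest, RSTART + RLENGTH)
  }
}')
printf '%s\n' "$users"
```

Even this small loop is noticeably more involved than the one-liners above, which is the point the answer is making.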
Answer 3 (score: 1)
Using only sed:
$ sed 's/^<user=\(.*"\).*/\1/' test.xml # With quotes
$ sed 's/^<user="\(.*\)".*/\1/' test.xml # Without quotes
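Note that sed prints lines it did not substitute as well, so on the full file the other elements would pass through unchanged. A possible variant, sketched here on a made-up sample, uses -n together with the p flag so that only lines where the substitution succeeded are printed. It also uses [^"]* instead of .* so the match stops at the first closing quote:

```shell
# Hypothetical sample with both user and non-user lines.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<overflow name="20576-set-writable" />
</replication>
<user="bird" />
<user="ele::123456" />
EOF

# -n suppresses automatic printing; the trailing p prints only substituted lines.
with_quotes=$(sed -n 's/^<user=\("[^"]*"\).*/\1/p' "$sample")
without_quotes=$(sed -n 's/^<user="\([^"]*\)".*/\1/p' "$sample")

printf '%s\n' "$with_quotes"
printf '%s\n' "$without_quotes"

rm -f "$sample"
```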
Answer 4 (score: 1)
Try:
$ awk '/<user=/ && gsub(/<user=|\/>/,x)' file
"T:106603"
"T:123879"
"test"
"ele::123456"
"company-temp"
"companymw2"
"bird"
"coding11"
"plazamedia"
If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk.
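If the same script has to run on both kinds of systems, one option is to pick the awk binary at run time. This is only a sketch: the AWK variable name is an assumption, and the pattern is a slight variation on the command above that also removes the space before />.

```shell
# Fall back to plain awk unless one of the alternatives named above exists.
AWK=awk
for candidate in /usr/xpg4/bin/awk /usr/xpg6/bin/awk nawk; do
  if command -v "$candidate" >/dev/null 2>&1; then
    AWK=$candidate
    break
  fi
done

# gsub returns the number of replacements, so the line is printed only if something was removed.
result=$(printf '%s\n' '<user="bird" />' | "$AWK" '/<user=/ && gsub(/<user=| *\/>/, "")')
printf '%s\n' "$result"
```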
Answer 5 (score: 1)
Using GNU grep:
grep -Po 'user=\K"[^"]*"' file
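Here -P enables Perl-compatible regular expressions and \K drops everything matched so far from the reported match, so only the quoted value is printed. Where -P is not available, a rough equivalent might look like the sketch below. It assumes a grep that supports -o (GNU and BSD grep do, but the option is not in POSIX), and the sample file is made up for illustration:

```shell
# Hypothetical sample with one non-user line and two user lines.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<allow go="123456=abcdefghijklmnopqrstuvwxyz" />
<user="companymw2" />
<user="plazamedia" />
EOF

# grep -o prints only the matched part; sed then strips the leading <user= prefix.
users=$(grep -o '<user="[^"]*"' "$sample" | sed 's/^<user=//')
printf '%s\n' "$users"

rm -f "$sample"
```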