Question

Ideally, what I would like to be able to do is:

cat xhtmlfile.xhtml |
getElementViaXPath --path='/html/head/title' |
sed -e 's%(^<title>|</title>$)%%g' > titleOfXHTMLPage.txt

Answer 1

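This is really just an explanation of Yuzem's answer, but I didn't feel that this much editing should be done to someone else's answer, and comments don't allow formatting, so...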

rdom () { local IFS=\> ; read -d \< E C ;}

Let's call that "read_dom" instead of "rdom", space it out a bit and use longer variables:

read_dom () {
    local IFS=\>
    read -d \< ENTITY CONTENT
}

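Okay, so it defines a function called read_dom. The first line makes IFS (the input field separator) local to this function and changes it to >. That means that when you read data, instead of being split automatically on space, tab or newline, it gets split on '>'. The next line says to read input from stdin and, instead of stopping at a newline, to stop when you see a '<' character (the -d flag sets the delimiter). What was read is then split using IFS and assigned to the variables ENTITY and CONTENT. So take the following: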

<tag>value</tag>

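The first call to read_dom gets an empty string (since the '<' is the first character). That gets split by IFS into just '', since there isn't a '>' character. Read then assigns an empty string to both variables. The second call gets the string 'tag>value'. That gets split by IFS into the two fields 'tag' and 'value'. Read then assigns the variables like: ENTITY=tag and CONTENT=value. The third call gets the string '/tag>'. That gets split by IFS into the two fields '/tag' and ''. Read then assigns the variables like: ENTITY=/tag and CONTENT=. Because the input ends before another '<' is seen, however, this call returns a non-zero status, so a loop driven by read_dom stops here without processing the closing tag. Any further call also returns a non-zero status because we've reached the end of the file.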
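The call-by-call behaviour described above can be checked with a small trace. This is only a sketch: the temporary file, the `trace` and `leftover` variables and the bracketed output format are assumptions made for illustration, not part of the original answer. It relies on the fact that bash's read returns a non-zero status when it reaches end of input before finding the delimiter, while still assigning what it read.

```shell
#!/usr/bin/env bash
# Same definition as in the answer above.
read_dom () {
    local IFS=\>
    read -d \< ENTITY CONTENT
}

# Write the sample without a trailing newline so the trace stays simple.
sample=$(mktemp)
printf '%s' '<tag>value</tag>' > "$sample"

trace=""
while read_dom; do
    # Record what each successful call produced.
    trace+="[${ENTITY}|${CONTENT}]"
done < "$sample"
rm -f "$sample"

# ENTITY and CONTENT still hold what the final, failing call assigned.
leftover="[${ENTITY}|${CONTENT}]"

echo "successful calls: $trace"
echo "failed last call: $leftover"
```

With this input the loop body runs twice (an empty pair, then tag/value), and the closing tag only shows up in the variables left behind by the call that returned non-zero.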

Now his while loop, cleaned up a bit to match the above:

while read_dom; do
    if [[ $ENTITY = "title" ]]; then
        echo $CONTENT
        exit
    fi
done < xhtmlfile.xhtml > titleOfXHTMLPage.txt

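The first line just says, "while the read_dom function returns a zero status, do the following." The second line checks whether the entity we've just seen is "title". The next line echoes the content of the tag. The fourth line exits. If it wasn't the title entity, the loop repeats on the sixth line. We redirect "xhtmlfile.xhtml" into standard input (for the read_dom function) and redirect standard output to "titleOfXHTMLPage.txt" (the echo from earlier in the loop).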

Now given the following (similar to what you get from listing a bucket on S3) for input.xml:

<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>sth-items</Name>
  <IsTruncated>false</IsTruncated>
  <Contents>
    <Key>item-apple-iso@2x.png</Key>
    <LastModified>2011-07-25T22:23:04.000Z</LastModified>
    <ETag>&quot;0032a28286680abee71aed5d059c6a09&quot;</ETag>
    <Size>1785</Size>
    <StorageClass>STANDARD</StorageClass>
  </Contents>
</ListBucketResult>

and the following loop:

while read_dom; do
    echo "$ENTITY => $CONTENT"
done < input.xml

You should get:

 => 
ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/" => 
Name => sth-items
/Name => 
IsTruncated => false
/IsTruncated => 
Contents => 
Key => item-apple-iso@2x.png
/Key => 
LastModified => 2011-07-25T22:23:04.000Z
/LastModified => 
ETag => &quot;0032a28286680abee71aed5d059c6a09&quot;
/ETag => 
Size => 1785
/Size => 
StorageClass => STANDARD
/StorageClass => 
/Contents =>

So if we wrote a while loop like Yuzem's:

while read_dom; do
    if [[ $ENTITY = "Key" ]] ; then
        echo $CONTENT
    fi
done < input.xml

We'd get a listing of all the files in the S3 bucket.

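EDIT If for some reason local IFS=\> doesn't work for you and you set it globally, you should reset it at the end of the function like:

read_dom () {
    ORIGINAL_IFS=$IFS
    IFS=\>
    read -d \< ENTITY CONTENT
    IFS=$ORIGINAL_IFS
}

Otherwise, any line splitting you do later in the script will be messed up.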
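As an alternative to saving and restoring IFS by hand, bash lets an assignment be prefixed to a single command so that it applies only to that command. The sketch below is not from the original answer; it assumes bash and uses the hypothetical variables `ifs_before` and `ifs_after` only to show that the global IFS is untouched.

```shell
#!/usr/bin/env bash
read_dom () {
    # The IFS=\> prefix is in effect for this one read command only.
    IFS=\> read -d \< ENTITY CONTENT
}

ifs_before=$IFS
read_dom <<< 'tag>value<'
ifs_after=$IFS

echo "ENTITY=$ENTITY CONTENT=$CONTENT"
if [[ $ifs_before == "$ifs_after" ]]; then
    echo "IFS unchanged"
fi
```

This keeps the function to a single line of logic and leaves nothing to reset if read fails.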

EDIT 2 To split out attribute name/value pairs you can augment read_dom() like so:

read_dom () {
    local IFS=\>
    read -d \< ENTITY CONTENT
    local ret=$?
    TAG_NAME=${ENTITY%% *}
    ATTRIBUTES=${ENTITY#* }
    return $ret
}

Then write your function to parse and get the data you want like this:

parse_dom () {
    if [[ $TAG_NAME = "foo" ]] ; then
        eval local $ATTRIBUTES
        echo "foo size is: $size"
    elif [[ $TAG_NAME = "bar" ]] ; then
        eval local $ATTRIBUTES
        echo "bar type is: $type"
    fi
}

Then while you read_dom, call parse_dom:

while read_dom; do
    parse_dom
done

Then given the following example markup:

<example>
  <bar size="bar_size" type="metal">bars content</bar>
  <foo size="1789" type="unknown">foos content</foo>
</example>

You should get this output:

$ cat example.xml | ./bash_xml.sh
bar type is: metal
foo size is: 1789
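Note that eval runs whatever text the attribute string contains, so it is only safe on XML you trust. A possible sketch that avoids eval is shown below. It is not part of the original answer: the helper name `parse_attributes`, the `ATTR` array and the regular expression are assumptions for illustration, it needs bash 4 or later for associative arrays, and it only handles double-quoted attribute values.

```shell
#!/usr/bin/env bash
# ATTR is a global associative array filled by parse_attributes (hypothetical helper).
declare -A ATTR

parse_attributes () {
    local rest=$1
    # name="value" followed by whatever is left of the string
    local re='([A-Za-z_:][-A-Za-z0-9_:.]*)="([^"]*)"(.*)'
    ATTR=()
    while [[ $rest =~ $re ]]; do
        # Values are stored literally; nothing in them is executed.
        ATTR[${BASH_REMATCH[1]}]=${BASH_REMATCH[2]}
        rest=${BASH_REMATCH[3]}
    done
}

parse_attributes 'size="1789" type="unknown"'
echo "foo size is: ${ATTR[size]}"
```

Inside parse_dom you would then call parse_attributes "$ATTRIBUTES" and read ${ATTR[size]} or ${ATTR[type]} instead of relying on variables created by eval.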

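EDIT 3 Another user said they were having problems with it in FreeBSD and suggested saving the exit status from read and returning it at the end of read_dom like:

read_dom () {
    local IFS=\>
    read -d \< ENTITY CONTENT
    local RET=$?
    TAG_NAME=${ENTITY%% *}
    ATTRIBUTES=${ENTITY#* }
    return $RET
}

I don't see any reason why that shouldn't work.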

Answer 2

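This can be done pretty easily using only bash. You only have to add this function:

rdom () { local IFS=\> ; read -d \< E C ;}

Now you can use rdom like read, but for html documents. When called, rdom will assign the element to the variable E and the content to the variable C.

For example, to do what you wanted to do: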

while rdom; do
    if [[ $E = title ]]; then
        echo $C
        exit
    fi
done < xhtmlfile.xhtml > titleOfXHTMLPage.txt

Answer 3

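Command-line tools that can be called from shell scripts include:

4xpath - command-line wrapper around Python's 4Suite package
XMLStarlet
xpath - command-line wrapper around Perl's XPath library
Xidel - works with URLs as well as files. Also works with JSON

I also use xmllint and xsltproc with little XSL transform scripts to do XML processing from the command line or in shell scripts.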
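For the title extraction asked about in the question, an xmllint call might look like the sketch below. It assumes an xmllint build that supports the --xpath option; the helper name `extract_title` is hypothetical, and local-name() is used as one way to match the element whether or not the file declares the XHTML default namespace.

```shell
#!/usr/bin/env bash
# extract_title (hypothetical helper): print the text of the first <title> element.
extract_title () {
    xmllint --xpath 'string(//*[local-name()="title"])' "$1"
}

# Example use, with the file names from the question:
#   extract_title xhtmlfile.xhtml > titleOfXHTMLPage.txt
```

Unlike the pure-bash loop, this uses a real XML parser, so comments, CDATA sections and entity references are handled by libxml2 rather than by string splitting.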

Answer 4

You can use the xpath utility. It's installed with the Perl XML-XPath package.

Usage:

/usr/bin/xpath [filename] query

or XMLStarlet. To install it on opensuse use:

sudo zypper install xmlstarlet

or try cnf xml on other platforms.

Answer 5

This is sufficient...

xpath xhtmlfile.xhtml '/html/head/title/text()' > titleOfXHTMLPage.txt

Answer 6

Check out XML2 from http://www.ofb.net/~egnor/xml2/ which converts XML to a line-oriented format.

Answer 7

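Starting from chad's answer, here is the COMPLETE working solution to parse UML, with proper handling of comments, in just 2 little functions (there are more than 2, but you can merge them all). I'm not saying chad's one doesn't work at all, but it has too many issues with badly formatted XML files, so you have to be a bit more careful to handle comments and misplaced spaces/CR/TAB/etc.

The purpose of this answer is to give 2 ready-to-use, out-of-the-box bash functions to anyone who needs to parse UML without complex tools based on perl, python or anything else. As for me, I can't install cpan or perl modules on the old production OS I'm working on, and python isn't available.

First, a definition of the UML words used in this post: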

<!-- comment... -->
<tag attribute="value">content...</tag>

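EDIT: updated functions, which now handle:

Websphere xml (xmi and xmlns attributes)
a compatible terminal with 256 colors is required
24 shades of grey
compatibility added for IBM AIX bash 3.2.16(1)

The functions: first is xml_read_dom, which is called repeatedly by xml_read: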

xml_read_dom() {
# https://stackoverflow.com/questions/893585/how-to-parse-xml-in-bash
local ENTITY IFS=\>
if $ITSACOMMENT; then
  read -d \< COMMENTS
  COMMENTS="$(rtrim "${COMMENTS}")"
  return 0
else
  read -d \< ENTITY CONTENT
  CR=$?
  [ "x${ENTITY:0:1}x" == "x/x" ] && return 0
  TAG_NAME=${ENTITY%%[[:space:]]*}
  [ "x${TAG_NAME}x" == "x?xmlx" ] && TAG_NAME=xml
  TAG_NAME=${TAG_NAME%%:*}
  ATTRIBUTES=${ENTITY#*[[:space:]]}
  ATTRIBUTES="${ATTRIBUTES//xmi:/}"
  ATTRIBUTES="${ATTRIBUTES//xmlns:/}"
fi

# when comments sticks to !-- :
[ "x${TAG_NAME:0:3}x" == "x!--x" ] && COMMENTS="${TAG_NAME:3} ${ATTRIBUTES}" && ITSACOMMENT=true && return 0

# http://tldp.org/LDP/abs/html/string-manipulation.html
# INFO: oh wait it doesn't work on IBM AIX bash 3.2.16(1):
# [ "x${ATTRIBUTES:(-1):1}x" == "x/x" -o "x${ATTRIBUTES:(-1):1}x" == "x?x" ] && ATTRIBUTES="${ATTRIBUTES:0:(-1)}"
[ "x${ATTRIBUTES:${#ATTRIBUTES} -1:1}x" == "x/x" -o "x${ATTRIBUTES:${#ATTRIBUTES} -1:1}x" == "x?x" ] && ATTRIBUTES="${ATTRIBUTES:0:${#ATTRIBUTES} -1}"
return $CR
}

and the second one:

xml_read() {
# https://stackoverflow.com/questions/893585/how-to-parse-xml-in-bash
ITSACOMMENT=false
local MULTIPLE_ATTR LIGHT FORCE_PRINT XAPPLY XCOMMAND XATTRIBUTE GETCONTENT fileXml tag attributes attribute tag2print TAGPRINTED attribute2print XAPPLIED_COLOR PROSTPROCESS USAGE
local TMP LOG LOGG
LIGHT=false
FORCE_PRINT=false
XAPPLY=false
MULTIPLE_ATTR=false
XAPPLIED_COLOR=g
TAGPRINTED=false
GETCONTENT=false
PROSTPROCESS=cat
Debug=${Debug:-false}
TMP=/tmp/xml_read.$RANDOM
USAGE="${C}${FUNCNAME}${c} [-cdlp] [-x command <-a attribute>] <file.xml> [tag | \"any\"] [attributes .. | \"content\"]
${nn[2]}  -c = NOCOLOR${END}
${nn[2]}  -d = Debug${END}
${nn[2]}  -l = LIGHT (no \"attribute=\" printed)${END}
${nn[2]}  -p = FORCE PRINT (when no attributes given)${END}
${nn[2]}  -x = apply a command on an attribute and print the result instead of the former value, in green color${END}
${nn[1]}  (no attribute given will load their values into your shell; use '-p' to print them as well)${END}"

! (($#)) && echo2 "$USAGE" && return 99
(( $# < 2 )) && ERROR nbaram 2 0 && return 99
# getopts:
while getopts :cdlpx:a: _OPT 2>/dev/null
do
{
  case ${_OPT} in
    c) PROSTPROCESS="${DECOLORIZE}" ;;
    d) local Debug=true ;;
    l) LIGHT=true; XAPPLIED_COLOR=END ;;
    p) FORCE_PRINT=true ;;
    x) XAPPLY=true; XCOMMAND="${OPTARG}" ;;
    a) XATTRIBUTE="${OPTARG}" ;;
    *) _NOARGS="${_NOARGS}${_NOARGS+, }-${OPTARG}" ;;
  esac
}
done
shift $((OPTIND - 1))
unset _OPT OPTARG OPTIND
[ "X${_NOARGS}" != "X" ] && ERROR param "${_NOARGS}" 0

fileXml=$1
tag=$2
(( $# > 2 )) && shift 2 && attributes=$*
(( $# > 1 )) && MULTIPLE_ATTR=true

[ -d "${fileXml}" -o ! -s "${fileXml}" ] && ERROR empty "${fileXml}" 0 && return 1
$XAPPLY && $MULTIPLE_ATTR && [ -z "${XATTRIBUTE}" ] && ERROR param "-x command " 0 && return 2
# nb attributes == 1 because $MULTIPLE_ATTR is false
[ "${attributes}" == "content" ] && GETCONTENT=true

while xml_read_dom; do
  # (( CR != 0 )) && break
  (( PIPESTATUS[1] != 0 )) && break

  if $ITSACOMMENT; then
    # oh wait it doesn't work on IBM AIX bash 3.2.16(1):
    # if [ "x${COMMENTS:(-2):2}x" == "x--x" ]; then COMMENTS="${COMMENTS:0:(-2)}" && ITSACOMMENT=false
    # elif [ "x${COMMENTS:(-3):3}x" == "x-->x" ]; then COMMENTS="${COMMENTS:0:(-3)}" && ITSACOMMENT=false
    if [ "x${COMMENTS:${#COMMENTS} - 2:2}x" == "x--x" ]; then COMMENTS="${COMMENTS:0:${#COMMENTS} - 2}" && ITSACOMMENT=false
    elif [ "x${COMMENTS:${#COMMENTS} - 3:3}x" == "x-->x" ]; then COMMENTS="${COMMENTS:0:${#COMMENTS} - 3}" && ITSACOMMENT=false
    fi
    $Debug && echo2 "${N}${COMMENTS}${END}"
  elif test "${TAG_NAME}"; then
    if [ "x${TAG_NAME}x" == "x${tag}x" -o "x${tag}x" == "xanyx" ]; then
      if $GETCONTENT; then
        CONTENT="$(trim "${CONTENT}")"
        test ${CONTENT} && echo "${CONTENT}"
      else
        # eval local $ATTRIBUTES => eval test "\"\$${attribute}\"" will be true for matching attributes
        eval local $ATTRIBUTES
        $Debug && (echo2 "${m}${TAG_NAME}: ${M}$ATTRIBUTES${END}"; test ${CONTENT} && echo2 "${m}CONTENT=${M}$CONTENT${END}")
        if test "${attributes}"; then
          if $MULTIPLE_ATTR; then
            # we don't print "tag: attr=x ..." for a tag passed as argument: it's useful only for "any" tags so then we print the matching tags found
            ! $LIGHT && [ "x${tag}x" == "xanyx" ] && tag2print="${g6}${TAG_NAME}: "
            for attribute in ${attributes}; do
              ! $LIGHT && attribute2print="${g10}${attribute}${g6}=${g14}"
              if eval test "\"\$${attribute}\""; then
                test "${tag2print}" && ${print} "${tag2print}"
                TAGPRINTED=true; unset tag2print
                if [ "$XAPPLY" == "true" -a "${attribute}" == "${XATTRIBUTE}" ]; then
                  eval ${print} "%s%s\ " "\${attribute2print}" "\${${XAPPLIED_COLOR}}\"\$(\$XCOMMAND \$${attribute})\"\${END}" && eval unset ${attribute}
                else
                  eval ${print} "%s%s\ " "\${attribute2print}" "\"\$${attribute}\"" && eval unset ${attribute}
                fi
              fi
            done
            # this trick prints a CR only if attributes have been printed during the loop:
            $TAGPRINTED && ${print} "\n" && TAGPRINTED=false
          else
            if eval test "\"\$${attributes}\""; then
              if $XAPPLY; then
                eval echo "\${g}\$(\$XCOMMAND \$${attributes})" && eval unset ${attributes}
              else
                eval echo "\$${attributes}" && eval unset ${attributes}
              fi
            fi
          fi
        else
          echo eval $ATTRIBUTES >>$TMP
        fi
      fi
    fi
  fi
  unset CR TAG_NAME ATTRIBUTES CONTENT COMMENTS
done < "${fileXml}" | ${PROSTPROCESS}
# http://mywiki.wooledge.org/BashFAQ/024
# INFO: I set variables in a "while loop" that's in a pipeline. Why do they disappear? workaround:
if [ -s "$TMP" ]; then
  $FORCE_PRINT && ! $LIGHT && cat $TMP
  # $FORCE_PRINT && $LIGHT && perl -pe 's/[[:space:]].*?=/ /g' $TMP
  $FORCE_PRINT && $LIGHT && sed -r 's/[^\"]*([\"][^\"]*[\"][,]?)[^\"]*/\1 /g' $TMP
  . $TMP
  rm -f $TMP
fi
unset ITSACOMMENT
}

and lastly, the rtrim, trim and echo2 (to stderr) functions:

rtrim() {
local var=$@
var="${var%"${var##*[![:space:]]}"}"   # remove trailing whitespace characters
echo -n "$var"
}
trim() {
local var=$@
var="${var#"${var%%[![:space:]]*}"}"   # remove leading whitespace characters
var="${var%"${var##*[![:space:]]}"}"   # remove trailing whitespace characters
echo -n "$var"
}
echo2() { echo -e "$@" 1>&2; }
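The nested parameter expansions in rtrim and trim are dense, so here is the same idea unrolled into separate steps. This is an illustrative sketch only: the function name `trim_steps` and the intermediate variables `leading` and `trailing` are not part of the answer.

```shell
#!/usr/bin/env bash
trim_steps () {
    local var=$1
    # Delete the longest suffix that starts at the first non-space character:
    # what remains is exactly the leading whitespace.
    local leading=${var%%[![:space:]]*}
    var=${var#"$leading"}
    # Delete the longest prefix that ends at the last non-space character:
    # what remains is exactly the trailing whitespace.
    local trailing=${var##*[![:space:]]}
    var=${var%"$trailing"}
    printf '%s' "$var"
}

result=$(trim_steps $'  \t hello world \n ')
echo "[$result]"
```

Only the whitespace at both ends is removed; the space inside "hello world" is preserved, and an all-whitespace argument yields an empty string.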

Colorization:

Oh, and you will need some neat colorizing dynamic variables to be defined first, and exported too:

set -a
TERM=xterm-256color
case ${UNAME} in
AIX|SunOS)
  M=$(${print} '\033[1;35m')
  m=$(${print} '\033[0;35m')
  END=$(${print} '\033[0m')
;;
*)
  m=$(tput setaf 5)
  M=$(tput setaf 13)
  # END=$(tput sgr0)          # issue on Linux: it can produces ^[(B instead of ^[[0m, more likely when using screenrc
  END=$(${print} '\033[0m')
;;
esac
# 24 shades of grey:
for i in $(seq 0 23); do eval g$i="$(${print} \"\\033\[38\;5\;$((232 + i))m\")" ; done
# another way of having an array of 5 shades of grey:
declare -a colorNums=(238 240 243 248 254)
for num in 0 1 2 3 4; do nn[$num]=$(${print} "\033[38;5;${colorNums[$num]}m"); NN[$num]=$(${print} "\033[48;5;${colorNums[$num]}m"); done
# piped decolorization:
DECOLORIZE='eval sed "s,${END}\[[0-9;]*[m|K],,g"'

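How to load all that stuff:

Either you know how to create functions and load them via FPATH (ksh) or an emulation of FPATH (bash).

If not, just copy/paste everything on the command line.

How it works: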

xml_read [-cdlp] [-x command <-a attribute>] <file.xml> [tag | "any"] [attributes .. | "content"]
  -c = NOCOLOR
  -d = Debug
  -l = LIGHT (no \"attribute=\" printed)
  -p = FORCE PRINT (when no attributes given)
  -x = apply a command on an attribute and print the result instead of the former value, in green color
  (no attribute given will load their values into your shell as $ATTRIBUTE=value; use '-p' to print them as well)

xml_read server.xml title content     # print content between <title></title>
xml_read server.xml Connector port    # print all port values from Connector tags
xml_read server.xml any port          # print all port values from any tags

With debug mode (-d), comments and parsed attributes are printed to stderr.

Answer 8

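I am not aware of any pure shell XML parsing tool. So you will most likely need a tool written in another language.

My XML::Twig Perl module comes with such a tool: xml_grep, where you would probably write what you want as xml_grep -t '/html/head/title' xhtmlfile.xhtml > titleOfXHTMLPage.txt (the -t option gives you the result as text instead of xml)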

Answer 9

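Another command line tool is my new Xidel. It also supports XPath 2 and XQuery, contrary to the already mentioned xpath/xmlstarlet.

The title can be read like:

xidel xhtmlfile.xhtml -e /html/head/title > titleOfXHTMLPage.txt

And it also has a cool feature to export multiple variables to bash. For example

eval $(xidel xhtmlfile.xhtml -e 'title := //title, imgcount := count(//img)' --output-format bash )

sets $title to the title and $imgcount to the number of images in the file, which should be as flexible as parsing it directly in bash.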

Answer 10

Well, you can use the xpath utility. I guess perl's XML::Xpath contains it.

Answer 11

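After some research into translating between Linux and Windows formats of file paths in XML files, I found interesting tutorials and solutions on:

General information about XPaths
Amara - collection of Pythonic tools for XML
Develop Python/XML with 4Suite (2 parts)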

Answer 12

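While there are quite a few ready-made console utilities that might do what you want, it will probably take less time to write a couple of lines of code in a general-purpose programming language such as Python, which you can easily extend and adapt to your needs.

Here is a python script which uses lxml for parsing. It takes the name of a file or a URL as the first parameter and an XPath expression as the second parameter, and prints the strings/nodes matching the given expression.

Example 1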

#!/usr/bin/env python
import sys
from lxml import etree

tree = etree.parse(sys.argv[1])
xpath_expression = sys.argv[2]

#  a hack allowing to access the
#  default namespace (if defined) via the 'p:' prefix    
#  E.g. given a default namespaces such as 'xmlns="http://maven.apache.org/POM/4.0.0"'
#  an XPath of '//p:module' will return all the 'module' nodes
ns = tree.getroot().nsmap
if ns.keys() and None in ns:
    ns['p'] = ns.pop(None)
#   end of hack    

for e in tree.xpath(xpath_expression, namespaces=ns):
    if isinstance(e, str):
        print(e)
    else:
        print(e.text and e.text.strip() or etree.tostring(e, pretty_print=True, encoding="unicode"))

lxml can be installed with pip install lxml. On ubuntu you can use sudo apt install python-lxml.

Usage

python xpath.py myfile.xml "//mynode"

lxml also accepts a URL as input:

python xpath.py http://www.feedforall.com/sample.xml "//link"

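Note: If your XML has a default namespace with no prefix (e.g. xmlns=http://abc...) then you have to use the p prefix (provided by the 'hack') in your expressions, e.g. //p:module to get the modules from a pom.xml file. In case the p prefix is already mapped in your XML, you'll need to modify the script to use another prefix.

Example 2

A one-off script which serves the narrow purpose of extracting module names from an apache maven file. Note how the node name (module) is prefixed with the default namespace {http://maven.apache.org/POM/4.0.0}:

pom.xml: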

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modules>
        <module>cherries</module>
        <module>bananas</module>
        <module>pears</module>
    </modules>
</project>

module_extractor.py:

from lxml import etree
for _, e in etree.iterparse("pom.xml", tag="{http://maven.apache.org/POM/4.0.0}module"):
    print(e.text)

Answer 13

If you want XML attributes, this works:

$ cat alfa.xml
<video server="asdf.com" stream="H264_400.mp4" cdn="limelight"/>

$ sed 's.[^ ]*..;s./>..' alfa.xml > alfa.sh

$ . ./alfa.sh

$ echo "$stream"
H264_400.mp4
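The sed script above uses '.' as the substitution delimiter, which makes it hard to read. A sketch of the same two substitutions with '|' as the delimiter and explicit anchors follows; the variable names `line` and `assignments` are assumptions for illustration. Because the result is executed as shell code, this should only be used on XML you trust.

```shell
#!/usr/bin/env bash
line='<video server="asdf.com" stream="H264_400.mp4" cdn="limelight"/>'

# 1st expression: drop everything up to the first space (the "<video" part).
# 2nd expression: drop the closing "/>" at the end of the line.
assignments=$(printf '%s\n' "$line" | sed -e 's|^[^ ]*||' -e 's|/>$||')

# What is left is a list of shell assignments: server="..." stream="..." cdn="..."
eval "$assignments"
echo "$stream"
```

This only works for a single self-closing element on one line whose attribute values are valid double-quoted shell strings.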

Answer 14

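Yuzem's method can be improved by inverting the order of the < and > signs in the rdom function and the order of the variable assignments, so that: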

rdom () { local IFS=\> ; read -d \< E C ;}

becomes:

rdom () { local IFS=\< ; read -d \> C E ;}

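If the parsing is not done like this, the last tag in the XML file is never reached. This can be a problem if you intend to output another XML file at the end of the while loop.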
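The difference between the two variants can be seen on a tiny input. In this sketch the names `rdom_original`, `rdom_inverted`, `seen_original` and `seen_inverted` are made up so that both versions can live in one script. Note that in the inverted version C holds the text that precedes the tag stored in E.

```shell
#!/usr/bin/env bash
rdom_original () { local IFS=\> ; read -d \< E C ;}
rdom_inverted () { local IFS=\< ; read -d \> C E ;}

xml='<a>x</a>'

# Collect every non-empty tag the loop body gets to see.
seen_original=""
while rdom_original; do
    if [[ -n $E ]]; then seen_original+="$E "; fi
done <<< "$xml"

seen_inverted=""
while rdom_inverted; do
    if [[ -n $E ]]; then seen_inverted+="$E "; fi
done <<< "$xml"

echo "original: $seen_original"
echo "inverted: $seen_inverted"
```

The original version reports only the opening tag, because the read that picks up the closing tag hits end of input and returns non-zero. The inverted version reports both, since every tag ends with the '>' delimiter.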

Answer 15

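Introduction

Thank you very much for the earlier answers. The question title is very ambiguous, as it asks how to parse XML when the asker actually wants to parse XHTML. Though the two are similar, they are definitely not the same, and because they differ it was very hard to come up with a solution that does exactly what the asker wants. However, I hope the solution below will still do. I want to admit that I couldn't find out how to look specifically for /html/head/title.

With that said, I'm not satisfied with the earlier answers, since some of the answerers reinvent the wheel unnecessarily when the asker never said that downloading a package is forbidden. I don't understand the unnecessary coding at all. I specifically want to repeat what someone in this thread already said: just because you can write your own parser doesn't mean you should - @Stephen Niedzielski. Regarding programming: as a rule, prefer the easiest and shortest way, and never make anything more complex than it needs to be.

The solution has been tested with good results on Windows 10 > Windows Subsystem for Linux > Ubuntu. If another title element exists and gets selected instead, the result will be wrong, sorry about that possibility. Example: the <body> tag comes before the <head> tag and the <body> contains a <title> tag, but that's very, very unlikely.

TLDR / Solution

On the general solution, thank you @Grisha, @Nat, How to parse XML in Bash?

On removing xml tags, thank you @Johnsyweb, How to remove XML tags from Unix command line?

1. Install the "package" xmlstarlet

2. Execute in bash:

xmlstarlet sel -t -m "//_:title" -c . -n xhtmlfile.xhtml | head -1 | sed -e 's/<[^>]*>//g' > titleOfXHTMLPage.txt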

How to parse XML in Bash?

15 answers:
