Question

I am trying to use grep to capture a number inside a string, but I am having trouble.

echo "There are <strong>54</strong> cities" | grep -o "([0-9]+)"

How do I get it to return "54"? I tried the grep command above and it does not work.

echo "You have <strong>54</strong>" | grep -o '[0-9]' seems to sort of work, but it prints

5
4

instead of 54.

Answer 1

$ echo "There are <strong>54</strong> cities " |
    xmllint --html --xpath '//strong/text()' -

Answer 2

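You need to use the "-E" option for extended regular expression support (or use egrep). Without it, grep uses basic regular expressions, where the parentheses and "+" in your pattern are matched as literal characters. On my Mac OS X: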

$ echo "There are <strong>54</strong> cities" | grep -Eo "[0-9]+"
54
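
If you would rather not depend on "-E", here is a minimal sketch of the same match written as a basic regular expression. It assumes only that your grep supports "-o" (GNU and BSD grep both do); the variable name `num` is just for illustration.

```shell
# In a basic regular expression "+" is not a quantifier, so
# "one or more digits" is spelled as one digit followed by zero or more digits.
num=$(echo "There are <strong>54</strong> cities" | grep -o '[0-9][0-9]*')
echo "$num"
```

This should print 54 on a single line, the same as the "-E" version.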

You also need to consider whether more than one number can appear on the line. What should the behavior be then?
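
To make that question concrete, here is a small sketch with two numbers on the line. The sample string and the variable names `line`, `all` and `first` are made up for illustration; keeping only the first match with head is one possible policy, not the only one.

```shell
line='There are <strong>54</strong> 12 cities'

# grep -o prints every match on its own line, so two numbers give two lines.
all=$(echo "$line" | grep -Eo '[0-9]+')

# One possible policy: keep only the first match.
first=$(echo "$line" | grep -Eo '[0-9]+' | head -n 1)

echo "$all"
echo "$first"
```

If you need the number in a specific position, such as inside the tags, a plain digit match is not enough and the pattern has to include the surrounding context.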

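EDIT 1: Since you have now specified the requirement as the number between the <strong> tags, I would suggest using sed. On my platform, grep does not have the "-P" option for Perl-style regular expressions. On my other box, the version of grep says this is an experimental feature, so in this case I would go with sed.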

$  echo "There are <strong>54</strong> 12 cities" | sed  -rn 's/^.*<strong>\s*([0-9]+)\s*<\/strong>.*$/\1/p'
54

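Here "-r" enables extended regular expressions, and "-n" together with the "p" flag prints only the lines where the substitution succeeded.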
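
The "-r" flag and "\s" are GNU sed spellings. As a sketch of a more portable variant, assuming your sed accepts "-E" (recent GNU sed and BSD/macOS sed do), the same substitution can be written with the POSIX character class [[:space:]]. The variable name `num` is only for illustration.

```shell
# -E selects extended regular expressions in both GNU and BSD sed;
# [[:space:]] is the POSIX character class that stands in for \s.
num=$(echo "There are <strong>54</strong> 12 cities" |
  sed -En 's/^.*<strong>[[:space:]]*([0-9]+)[[:space:]]*<\/strong>.*$/\1/p')
echo "$num"
```

This should print 54 and ignore the 12 outside the tags.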

EDIT 2: If your version of grep has "PCRE" support, you can also use positive lookbehinds and lookaheads, as follows.

$  echo "There are <strong>54 </strong> 12 cities" | grep -o -P "(?<=<strong>)\s*([0-9]+)\s*(?=<\/strong>)"
54
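
Note that the "\s*" parts in the command above are inside the match, so any whitespace around the number is printed along with it. A possible variation, assuming a GNU grep built with PCRE support, is to use "\K" to drop everything matched so far and move the trailing whitespace into the lookahead. The variable name `num` is only for illustration.

```shell
# \K discards the part of the match before it (the tag and leading spaces);
# the lookahead checks for optional spaces and the closing tag without consuming them.
num=$(echo "There are <strong>54 </strong> 12 cities" |
  grep -oP '<strong>\s*\K[0-9]+(?=\s*</strong>)')
echo "$num"
```

This should print 54 with no surrounding whitespace. It will not work on a grep without "-P", in which case the sed approach is the safer choice.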