Question

Say I have a string:

random text before authentication_token = 'pYWastSemJrMqwJycZPZ', gravatar_hash = 'd74a97f

I want a shell command that extracts everything after "authentication_token = '" and before the next '.

Basically, I want to get back pYWastSemJrMqwJycZPZ.

How do I do this?

Answer 1

If your grep supports -P, you can use either of these PCRE regexes:

$ echo "random text before authentication_token = 'pYWastSemJrMqwJycZPZ', gravatar_hash = 'd74a97f" | grep -oP "authentication_token = '\K[^']*"
pYWastSemJrMqwJycZPZ

$ echo "random text before authentication_token = 'pYWastSemJrMqwJycZPZ', gravatar_hash = 'd74a97f" | grep -oP "authentication_token = '\K[^']*(?=')"
pYWastSemJrMqwJycZPZ

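\K discards the characters matched up to that point, so they are not printed as part of the match.
[^']* is a negated character class: it matches any character other than ', zero or more times.
(?=') is a positive lookahead asserting that the match must be followed by a single quote.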
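
To see what the lookahead in the second command changes, here is a small sketch that runs both patterns against a value whose closing quote is missing. The truncated sample input and the function names token_loose and token_strict are made up for illustration, and it assumes a grep built with PCRE support (-P), such as GNU grep.

```shell
# Without the lookahead: accepts a value even if the closing quote is missing.
token_loose() {
  printf '%s\n' "$1" | grep -oP "authentication_token = '\K[^']*"
}

# With the lookahead: only matches when a closing quote follows the value.
token_strict() {
  printf '%s\n' "$1" | grep -oP "authentication_token = '\K[^']*(?=')"
}

truncated="authentication_token = 'pYWastSem"   # hypothetical cut-off input
token_loose "$truncated"                         # prints pYWastSem
token_strict "$truncated" || echo "no match"     # prints no match
```

For well-formed input such as the string in the question, both functions print the same token.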

Answer 2

Use parameter expansion:

#!/bin/bash
text="random text before authentication_token = 'pYWastSemJrMqwJycZPZ', gravatar_hash = 'd74a97f"
token=${text##* authentication_token = \'}   # Remove the left part.
token=${token%%\'*}                          # Remove the right part.
echo "$token"

Note that it works even if the random text itself contains authentication token = '...'.
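
One caveat with this approach: if the marker is absent, the first expansion leaves the text unchanged and the second then cuts at whatever single quote comes first. A minimal sketch of a guarded version follows. The function name get_token is hypothetical, and unlike the pattern in the answer it does not require a space before authentication_token.

```shell
get_token() {
  local marker rest
  marker="authentication_token = '"
  case $1 in
    *"$marker"*) ;;         # marker present: carry on
    *) return 1 ;;          # marker absent: fail instead of returning garbage
  esac
  rest=${1##*"$marker"}     # drop everything up to and including the last marker
  rest=${rest%%\'*}         # drop everything from the next single quote onward
  printf '%s\n' "$rest"
}

get_token "random text before authentication_token = 'pYWastSemJrMqwJycZPZ', gravatar_hash = 'd74a97f"
# prints pYWastSemJrMqwJycZPZ
get_token "no marker here, just 'quotes'" || echo "not found"
# prints not found
```

Because ## removes the longest matching prefix, the value after the last occurrence of the marker is the one returned.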

Answer 3

IMO, grep -oP is the best solution. For completeness, here are two alternatives:

sed 's/.*authentication_token = '\''//; s/'\''.*//' <<<"$string"

awk -F "'" '{for (i=1; i<NF; i+=2) if ($i ~ /authentication_token = $/) {print $(i+1); break}}' <<< "$string"
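
Note that the sed command above prints every input line, so a line without the token comes back partly or wholly unchanged. A possible variant, sketched here under the hypothetical function name token_sed, uses -n together with the p flag so that only lines where the substitution succeeded are printed. It uses only POSIX basic regular expressions and, unlike the original, requires the closing quote to be present.

```shell
token_sed() {
  # -n suppresses the default output; the trailing p prints only lines
  # on which the substitution actually happened.
  sed -n "s/.*authentication_token = '\([^']*\)'.*/\1/p"
}

printf '%s\n' \
  "unrelated line with 'quotes'" \
  "random text before authentication_token = 'pYWastSemJrMqwJycZPZ', gravatar_hash = 'd74a97f" |
  token_sed
# prints pYWastSemJrMqwJycZPZ
```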

Answer 4

Use bash's regular-expression matching facility.

$ regex="_token = '([^']+)'"
$ string="random text before authentication_token = 'pYWastSemJrMqwJycZPZ', gravatar_hash = 'd74a97f'"
$ [[ $string =~ $regex ]] && hash=${BASH_REMATCH[1]}
$ echo "$hash"
pYWastSemJrMqwJycZPZ

Using a variable instead of a literal regex simplifies the quoting of the spaces and single quotes.
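
The same idea can be wrapped in a function that reports failure when nothing matches. This is only a sketch: extract_value is a made-up name, it anchors on the full key rather than the _token suffix, and the key is interpolated into the regex unescaped, so it is assumed to contain no regex metacharacters. It needs bash for [[ =~ ]] and BASH_REMATCH.

```shell
# Requires bash. extract_value KEY INPUT prints the quoted value that follows "KEY = ".
extract_value() {
  local key=$1 input=$2
  local regex="$key = '([^']*)'"          # capture group 1 is the quoted value
  [[ $input =~ $regex ]] || return 1      # non-zero status when nothing matches
  printf '%s\n' "${BASH_REMATCH[1]}"
}

string="random text before authentication_token = 'pYWastSemJrMqwJycZPZ', gravatar_hash = 'd74a97f'"
extract_value authentication_token "$string"             # prints pYWastSemJrMqwJycZPZ
extract_value gravatar_hash "$string"                    # prints d74a97f
extract_value missing_key "$string" || echo "no match"   # prints no match
```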

Answer 5

My simple version is

sed -r "s/(.*authentication_token = ')([^']*)(.*)/\2/"
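
The command reads standard input, so a usage sketch might look like the following. It assumes the text is held in a variable named string, as in the other answers, and it spells the option as -E, which GNU sed accepts as a synonym for -r and which BSD sed also understands.

```shell
string="random text before authentication_token = 'pYWastSemJrMqwJycZPZ', gravatar_hash = 'd74a97f"

# Group 2 holds the token; groups 1 and 3 swallow the rest of the line.
token=$(printf '%s\n' "$string" | sed -E "s/(.*authentication_token = ')([^']*)(.*)/\2/")
echo "$token"   # prints pYWastSemJrMqwJycZPZ
```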

Bash: extract the text after one substring and before another
