Question

I need to understand a shell script that uses the following command to fetch directions from a source to a destination with the Google Maps API:

wget --no-parent -O - https://maps.googleapis.com/maps/api/directions/json?origin=$begin\&destination=$finish\&sensor=false > new.txt

Next, we extract the following line from the output:

**"html_instructions" : "Head \u003cb\u003enorthwest\u003c/b\u003e"**

grep -n html_instructions  new.txt > new1.txt

Can someone tell me the meaning of

sed -e 's/\\u003cb//g'

and the other expressions in the following command:

sed -e 's/\\u003cb//g' -e 's/\\u003e//g' -e 's/\\u003c\/b//g' -e 's/\\u003c//g' -e 's/div.*div//g' -e 's/.*://g' -e 's/"//g' -e 's/ "//g' new1.txt > new2.txt

which outputs only Head northwest.

Thanks in advance!

Answer 1

sed -e 's/\\u003cb//g' -e 's/\\u003e//g' -e 's/\\u003c\/b//g' -e 's/\\u003c//g' -e 's/div.*div//g' -e 's/.*://g' -e 's/"//g' -e 's/ "//g' new1.txt > new2.txt

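Each string after -e is a sed command. The command s/\\u003cb//g searches for every occurrence of the literal text \u003cb and replaces it with nothing. In other words, it deletes that text from the line. In the regular expression, \\ matches a single literal backslash, so this does not refer to a Unicode code point 003CB: \u003c is the JSON escape for <, and the b after it is the tag name, so the command removes the <b of an opening <b> tag.

Thus, the first four commands remove every occurrence of the escaped text \u003cb, \u003e (the JSON escape for >), \u003c/b and \u003c from the lines of new1.txt, which strips the <b> and </b> tags. The output of the whole command is sent to new2.txt.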

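In addition, s/div.*div//g removes any text that begins and ends with div. The command s/.*://g removes everything from the start of the line through the last colon on the line. s/"//g removes every double-quote character. s/ "//g removes every occurrence of a space followed by a double quote; because the preceding command has already deleted all the double quotes, it has nothing left to match.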
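To try this without calling the API, the whole command can be run on the sample line from the question. This is only a minimal sketch: the 12: prefix stands in for the line number that grep -n would add and is an assumed value, and the variable names line and result are illustrative.

```shell
# Sample line as grep -n would emit it; the "12:" line number is an assumed value.
line='12:"html_instructions" : "Head \u003cb\u003enorthwest\u003c/b\u003e"'

# printf '%s\n' passes the backslashes in $line through unchanged.
result=$(printf '%s\n' "$line" | sed \
  -e 's/\\u003cb//g' \
  -e 's/\\u003e//g' \
  -e 's/\\u003c\/b//g' \
  -e 's/\\u003c//g' \
  -e 's/div.*div//g' \
  -e 's/.*://g' \
  -e 's/"//g' \
  -e 's/ "//g')

# Brackets make any leading or trailing whitespace visible.
printf '[%s]\n' "$result"
```

This prints [ Head northwest]. The leading space survives because s/.*://g stops at the last colon, and nothing afterwards removes the space that follows it.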

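In general, the sed command s/old/new/ searches for the first occurrence of old on each line and replaces it with new. With a g added at the end, as in s/old/new/g, the replacement is global: every occurrence of old is replaced with new. What gives these commands much of their power is that old may be a regular expression. Consider s/.*://g. The dot character has the special meaning of "any character at all". The star character means zero or more of the preceding character. Thus the regular expression .*: means zero or more of any character, followed by a colon.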
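The difference between the two forms, and the greedy behaviour of .*, can be checked with a few throwaway strings. The inputs and variable names below are made up for illustration.

```shell
# Without g, only the first match on the line is replaced.
first=$(printf '%s\n' 'cat cat cat' | sed 's/cat/dog/')

# With g, every match on the line is replaced.
all=$(printf '%s\n' 'cat cat cat' | sed 's/cat/dog/g')

# .* is greedy, so .*: matches through the LAST colon on the line.
greedy=$(printf '%s\n' 'a:b:c' | sed 's/.*://')

printf '%s\n' "$first" "$all" "$greedy"
```

The three lines printed are dog cat cat, dog dog dog and c.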

Answer 2

You can use awk to do it all in one go:

awk -F\" '/html_instructions/ {gsub(/(\\u003(c|cb|e)|\/b)/,x);print $4}'
Head northwest

So the whole line should be:

wget --no-parent -O - https://maps.googleapis.com/maps/api/directions/json?origin=$begin\&destination=$finish\&sensor=false | awk -F\" '/html_instructions/ {gsub(/(\\u003(c|cb|e)|\/b)/,x);print $4}'
Head northwest

To store the result in a variable:

d=$(wget --no-parent -O - https://maps.googleapis.com/maps/api/directions/json?origin=$begin\&destination=$finish\&sensor=false | awk -F\" '/html_instructions/ {gsub(/(\\u003(c|cb|e)|\/b)/,x);print $4}')
echo $d
Head northwest
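The awk command can also be checked offline by feeding it the sample line from the question instead of the wget output. This is a sketch under that assumption; line is an illustrative variable name, and the real API response may contain other lines that the /html_instructions/ pattern simply skips.

```shell
# Sample line copied from the question; no network access needed.
line='"html_instructions" : "Head \u003cb\u003enorthwest\u003c/b\u003e"'

# -F\" splits on double quotes, gsub deletes the escaped tag fragments
# from the whole line, and the instruction text ends up in field 4.
d=$(printf '%s\n' "$line" | awk -F\" '/html_instructions/ {gsub(/(\\u003(c|cb|e)|\/b)/,x);print $4}')

printf '%s\n' "$d"
```

Because the double quote is the field separator, the text is taken from between the quotes, so no leading space is left in the result.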

Understanding the sed command
