I'm trying to make a bash script that will tell me the latest stable version of the Linux kernel.
The problem is that, while I can remove everything after certain characters, I don't seem to be able to delete everything prior to certain characters.
#!/bin/bash
wget=$(wget --output-document - --quiet www.kernel.org | \grep -A 1 "latest_link")
wget=${wget##.tar.xz\">}
wget=${wget%</a>}
echo "${wget}"
Somehow the output "ignores" the wget=${wget##.tar.xz\">} line.
Answer 0 (score: 2)
You're trying to remove the longest match of the pattern .tar.xz\"> from the beginning of the string, but your string doesn't start with .tar.xz, so there is no match.
You have to add a leading * so that the pattern also matches everything before .tar.xz:
wget=${wget##*.tar.xz\">}
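To see the difference without hitting the network, here is a minimal sketch that runs both expansions on a canned string. The sample markup is an assumption about what grep -A 1 "latest_link" returned at the time; the real page may look different.

```shell
#!/bin/bash
# Hypothetical sample of the two lines returned by grep -A 1 "latest_link";
# the actual markup on kernel.org may differ.
html='<td id="latest_link">
<a href="./pub/linux/kernel/v4.x/linux-4.10.8.tar.xz">4.10.8</a>'

# Without a leading *, the pattern must match at the very start of the
# string. It does not, so nothing is removed.
unchanged=${html##.tar.xz\">}

# With a leading *, everything up to and including the last .tar.xz"> goes.
version=${html##*.tar.xz\">}

# Strip the trailing </a> from the end of what is left.
version=${version%</a>}

echo "$version"
```

Note that the result is stored in a variable called version rather than wget, which avoids reusing a command name.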
Then, because you're in a script and not an interactive shell, there shouldn't be any need to escape \grep (presumably to prevent usage of an alias), as aliases are disabled in non-interactive shells.
And, as pointed out, naming a variable the same as an existing command (a common example: test) is bound to lead to confusion.
If you want to use command line tools designed to deal with HTML, you could have a look at the W3C HTML-XML-utils (Ubuntu: apt install html-xml-utils). Using them, you could get the info you want as follows:
$ curl -sL www.kernel.org | hxselect 'td#latest_link' | hxextract a -
4.10.8
Or, in detail:
curl -sL www.kernel.org | # Fetch page
hxselect 'td#latest_link' | # Select td element with ID "latest_link"
hxextract a - # Extract link text ("-" for standard input)
Answer 1 (score: 1)
Whenever I need to extract a substring in bash, I first check whether I can brute-force it with a couple of cut(1) commands. In your case, the following appears to work:
wget=$(wget --output-document - --quiet www.kernel.org | \grep -A 1 "latest_link")
echo $wget | cut -d'>' -f3 | cut -d'<' -f1
I'm certain there's a more elegant way, but this has simple syntax that I never forget. Note that it will break if 'wget' gets extra ">" or "<" characters in the future.
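One detail worth spelling out: the cut pipeline relies on echo $wget being unquoted, so that word splitting joins the two grep lines into a single line before cut counts fields. A minimal offline sketch, using a made-up sample of the grep output (the real markup may differ):

```shell
#!/bin/bash
# Hypothetical sample of the two lines returned by grep -A 1 "latest_link".
sample='<td id="latest_link">
<a href="./pub/linux/kernel/v4.x/linux-4.10.8.tar.xz">4.10.8</a>'

# Deliberately unquoted: word splitting collapses the newline into a space,
# so cut sees one line. Field 3 (split on ">") is then "4.10.8</a", and the
# second cut keeps only what precedes the first "<".
version=$(echo $sample | cut -d'>' -f3 | cut -d'<' -f1)

echo "$version"
```

Quoting the variable here would hand cut two separate lines, and the field numbers would no longer line up.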
Answer 2 (score: 0)
Parsing HTML with shell tools such as grep, awk, or sed is not recommended.
However, if you want a quick one-liner, this awk command should do the job:
wget --output-document - --quiet www.kernel.org |
awk '/"latest_link"/ { getline; n=split($0, a, /[<>]/); print a[n-2] }'
4.10.8
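The awk logic can be checked offline against a canned string. This is only a sketch; the sample markup below is an assumption about the page at the time, not a copy of it.

```shell
#!/bin/bash
# Hypothetical sample of the relevant part of the page.
sample='<td id="latest_link">
<a href="./pub/linux/kernel/v4.x/linux-4.10.8.tar.xz">4.10.8</a>'

version=$(printf '%s\n' "$sample" |
  awk '/"latest_link"/ {
         getline                    # advance to the line after the match
         n = split($0, a, /[<>]/)   # split that line on < and >
         print a[n-2]               # third piece from the end is the link text
       }')

echo "$version"
```

Because the line ends in ">", the last piece is empty and the one before it is "/a", which is why the link text sits at index n-2.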
Answer 3 (score: 0)
A sed approach:
wget --output-document - --quiet www.kernel.org | \
sed -n '/latest_link/{n;s/^.*">//;s/<.*//p}'
Output:
4.10.8