Question

Inspired by another question, I wrote a script (or rather, a one-liner) to grab a random Wikipedia page.

Here is what I have so far:

# Grab the HTTP header response from Wikipedia's random page link
curl 'http://en.wikipedia.org/wiki/Special:Random' -sI

# Search STDIN for the Location header and grab its content
perl -wnl -e '/Location: (.*)/ and print $1;'

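This works well: it prints a random Wikipedia URL to the console. But I need to append "?printable=yes" to that URL to get the Wikipedia page without all the non-article content.

However, running: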

curl 'http://en.wikipedia.org/wiki/Special:Random' -sI | perl -wnl -e '/Location: (.*)/ and print $1 . "?printable=yes";'

outputs:

?printable=yespedia.org/wiki/James_Keene_(footballer)

Why is my concatenation not concatenating?

UPDATE:

For the curious, here is the finished one-liner:

curl `curl 'http://en.wikipedia.org/wiki/Special:Random' -sI | perl -wnl -e '/Location: ([^\r]*)/ and print $1 . "?printable=yes";'`
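The same idea can also be written without nested backticks. The following is a minimal sketch, not the asker's code: `printable_url` is a hypothetical helper name, and it swaps the Perl regex for `tr` and `sed`, on the assumption that the header is spelled exactly `Location: ` as in the commands above. The check uses a canned header block, so it runs without network access.

```shell
# Hypothetical helper: reads HTTP response headers on stdin and prints the
# redirect target with "?printable=yes" appended.
printable_url() {
    # tr drops the carriage returns left over from the "\r\n" line endings;
    # sed prints only the Location line, rewritten to the final URL.
    tr -d '\r' | sed -n 's/^Location: \(.*\)$/\1?printable=yes/p'
}

# Offline check with a canned header block instead of a live request.
sample_url=$(printf 'HTTP/1.1 302 Found\r\nLocation: http://en.wikipedia.org/wiki/Example\r\n\r\n' | printable_url)
printf '%s\n' "$sample_url"

# Live usage (needs network access), mirroring the one-liner above:
# curl "$(curl -sI 'http://en.wikipedia.org/wiki/Special:Random' | printable_url)"
```

Using `$(...)` rather than backticks makes the nesting easier to read and quote, and stripping every `\r` up front means no later step has to care about the line endings.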

Answer 1

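curl 'http://en.wikipedia.org/wiki/Special:Random' -sI | perl -wnl -e '/Location: ([^\r]*)/ and print $1 . "?printable=yes";'

The jump back to the start of the line is caused by a stray '\r' character at the end of the Location line. HTTP header lines end in "\r\n", and the -l switch strips only the "\n", so the original script prints the Wikipedia URL with the "\r" still attached. That returns the cursor to the start of the line, and ?printable=yes is then printed over the beginning of the URL. Capturing [^\r]* instead of .* keeps the '\r' out of $1.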

Why is the console output returning to the start of the line? Or: why is my concatenation not concatenating?
