Question

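I am trying to extract the HTTP status code from "curl", run in the context of "find -exec", into a variable. I need this to test for failure, then issue a report and prevent the script from running again. I can currently use --write-out to extract the code and print it to stdout, but I need to store it in the script for later use.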

I currently have something like this:

find . -cmin -140 -type f -iname '*.gz' -exec curl -T {} --write-out "%{http_code}\n" www.example.com/{} \;

Example output:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                             Dload  Upload   Total   Spent    Left  Speed
  101  7778    0     0  101  7778      0  17000 --:--:-- --:--:-- --:--:-- 17000
  000
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                             Dload  Upload   Total   Spent    Left  Speed
  101  7795    0     0  101  7795      0  17433 --:--:-- --:--:-- --:--:-- 17433
  000

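The '000' is the HTTP status code printed to the console. I want to keep the curl output in the console window for manual testing of this script, but I need to extract the status code for use later in the script.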

Answer 1

The -exec argument to find runs curl in a subprocess, which then exits. There is no simple way to smuggle the status back out to the calling script. A better approach is to use a loop.
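The loop approach could be sketched roughly as follows. This is a minimal bash sketch, not the answerer's original code: the function name upload_recent, the base_url parameter, and the failed array are illustrative names, and discarding the response body with --output /dev/null is an assumption made so that only the status code is captured. curl writes its progress meter to stderr, so it should still appear on the console while the --write-out text is captured from stdout.

```bash
#!/usr/bin/env bash
# Sketch: upload each recent .gz file and keep curl's HTTP status in a variable.
# upload_recent, base_url and failed are illustrative names (assumptions).

upload_recent() {
    local base_url=$1 file status
    failed=()   # global array: "file:status" for every non-2xx upload

    # -print0 with read -d '' keeps file names containing spaces intact.
    # Process substitution (instead of a pipe) keeps the loop in the
    # current shell, so "failed" is still populated after the loop.
    while IFS= read -r -d '' file; do
        # Only stdout is captured; the progress meter stays on stderr.
        status=$(curl -T "$file" --output /dev/null \
                      --write-out '%{http_code}' "$base_url/${file#./}")
        echo "$file -> $status"
        [[ $status == 2* ]] || failed+=("$file:$status")
    done < <(find . -cmin -140 -type f -iname '*.gz' -print0)

    # Return success only if no upload failed.
    (( ${#failed[@]} == 0 ))
}
```

A call such as `upload_recent www.example.com || printf 'failed: %s\n' "${failed[@]}"` would then report the failures, and the script can write a marker file or exit non-zero to stop later runs.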
