Question

I'm writing a function in Python that uses urllib2 to download a status page and loops until a condition is met.

The status page looks like this:

reportId:327686
reportName:report2
status:Running
percent_done:0

I need to:

Parse reportId and use this value
Loop until status is something other than "Running"

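Can I do this without using the re module? In the end I need to turn this into an exe with pyinstaller, so I'd like to avoid loading a lot of modules to keep the program small.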

Answer 1

This should do it:

import urllib2


def parse_data(raw_data):  # Name this better
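    # Split on the first ':' only and skip lines without one, so blank lines don't break dict()
    parsed_data = dict(line.split(':', 1) for line in raw_data.splitlines() if ':' in line)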
    parsed_data['reportId'] = int(parsed_data['reportId'])
    parsed_data['percent_done'] = int(parsed_data['percent_done'])
    return parsed_data


def get_parsed_data_from_url(url):  # Name this better
    raw_data = urllib2.urlopen(url).read()
    parsed_data = parse_data(raw_data)
    return parsed_data


parsed_data = get_parsed_data_from_url('http://example.com')

# And to loop until status != 'Running', you could do this..

while get_parsed_data_from_url('http://example.com')['status'] == 'Running':
    do_some_stuff()
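
The loop above re-requests the page as fast as it can. A minimal sketch of the same idea with a pause between requests and an upper bound on the number of polls follows. The names parse_status, wait_until_finished, fetch, delay_seconds and max_polls are made up for illustration; fetch is assumed to be any callable that returns the page text, for example a small wrapper around urllib2.urlopen(url).read().

```python
import time


def parse_status(raw_data):
    """Turn 'key:value' lines into a dict, ignoring lines without a colon."""
    fields = {}
    for line in raw_data.splitlines():
        # partition() splits on the first ':' only and never raises
        key, sep, value = line.partition(':')
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def wait_until_finished(fetch, delay_seconds=5, max_polls=100):
    """Call fetch() until status is no longer 'Running'; return the last parsed page.

    fetch is a hypothetical callable supplied by the caller that returns
    the status page as text.
    """
    for _ in range(max_polls):
        data = parse_status(fetch())
        if data.get('status') != 'Running':
            return data
        time.sleep(delay_seconds)  # wait before asking again
    raise RuntimeError('status was still Running after %d polls' % max_polls)
```

Only the time module is imported here, so this should not add much to a pyinstaller build. Passing fetch in as an argument also makes it possible to try the loop against canned text before pointing it at the real URL.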

Answer 2

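If this is your entire result and you don't have more HTML around it, as That1Guy commented, you could use startswith and endswith. Something like this (I'm skipping a lot of checks and defaults here!)...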

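for line in raw_data.splitlines():  # raw_data: the downloaded status page text (loading not shown)
    if line.startswith("reportId:"):
        report_id = line.split(":", 1)[1]

    if line.startswith("status:"):
        if not line.endswith("Running"):
            break  # abort processing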
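
As a rough sketch of how those prefix checks could be wrapped into one helper that returns both values the question asks for: the function name read_status and its return shape are assumptions for illustration, not part of the original answer.

```python
def read_status(page_text):
    """Return (report_id, running) from 'key:value' status text, using only str methods."""
    report_id = None
    running = False
    for line in page_text.splitlines():
        line = line.strip()
        if line.startswith("reportId:"):
            # Everything after the prefix is the numeric id
            report_id = int(line[len("reportId:"):])
        elif line.startswith("status:"):
            running = line[len("status:"):].strip() == "Running"
    return report_id, running
```

If reportId is missing from the page, report_id stays None, so the caller can check for that before using it.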

Answer 3

Since you have a stable pattern to match against, this is quite simple:

reportIds=[]
# not written here: load report into variable s

for line in s.splitlines():
    if 'reportId' in line:
        reportIds.append(line.split(':')[1])
    if 'status' in line:
        if line.split(':')[1] != 'Running':
            break
    # not written here: pause for some period of time

Searching text in Python without the re module
