Question

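I'm trying to pull data from a page (it's internal, so the link won't work outside my organisation). What I need to do is import the table up to the row where my unique criterion (from user input) is located.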

import requests

caseref = input('Case Ref:')

url = 'http://dfu-display/tv'
page = requests.get(url)
#print (page.status_code) SHOWS RESULT OF REQUEST IF FAILS EDIT THIS IN

from bs4 import BeautifulSoup

soup = BeautifulSoup(page.text, 'html.parser')
#print(soup.prettify())

for row in soup.findAll('table')[0].tbody.findAll('tr'):
    # assumes each data row holds its values in <td> cells
    cells = row.findAll('td')
    first_column = cells[0].contents
    second_column = cells[1].contents
    print (first_column)
    print (second_column)

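The data in column 1 is always a unique reference number, and column 2 is the person the task is assigned to. I need some way to find a point in column 1 (the caseref from user input) and then count how many times the user "Unassigned" appears above it.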

If possible, I'd like to stop parsing the table once the user input has been found, since that would speed up the code.

Answer 1

If I understand correctly, the first task is to find where the unique reference number appears in column 1.

urn = input("Unique Reference Number: ")
location = column1.index(urn)
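
The snippet above assumes column1 and column2 already exist as plain lists. Here is a minimal sketch of how they might be built and how a missing reference could be handled, since list.index raises ValueError when the value is not present. The sample rows and the find_location helper are made up for illustration; in the real script the pairs would come from the table loop in the question.

```python
# Hypothetical (reference, assignee) pairs standing in for the scraped table rows.
rows = [
    ("A100", "Unassigned"),
    ("A101", "Alice"),
    ("A102", "Unassigned"),
    ("A103", "Bob"),
]

# Split the pairs into the two column lists used in this answer.
column1 = [ref for ref, _ in rows]
column2 = [person for _, person in rows]


def find_location(refs, urn):
    """Return the index of urn in refs, or None if it is not there."""
    try:
        return refs.index(urn)
    except ValueError:
        # list.index raises ValueError for a missing value
        return None


location = find_location(column1, "A103")
```

Checking for None before slicing avoids a traceback when the user types a reference that is not in the table.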

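You then want to find every occurrence of the name before that position (i.e. up to location-1) so that you can count them. First slice the list and pick the name to count.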

names_to_search = column2[:location]
search_name = names_to_search[-1]  # last name before the matched row

Now we can filter the names in the search list and check the length of the result to see how many times search_name has occurred so far.

names_to_count = filter(lambda n: n == search_name, names_to_search)
occurrences = len(list(names_to_count))  # list to force evaluation of filter
other_people = len(names_to_search) - occurrences
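
If the name to count is fixed (the question asks for "Unassigned") and you want to stop as soon as the reference is found, a single pass may be enough. This is only a sketch under a few assumptions: count_above and sample_rows are hypothetical names, and rows is any iterable of (reference, assignee) string pairs, for example produced from the table loop with something like get_text(strip=True) on each cell.

```python
def count_above(rows, caseref, name="Unassigned"):
    """Count rows assigned to `name` that appear before `caseref`.

    Stops reading `rows` as soon as `caseref` is seen.
    Returns None if `caseref` never appears.
    """
    count = 0
    for ref, person in rows:
        if ref == caseref:
            return count  # stop here; later rows are never read
        if person == name:
            count += 1
    return None


# Hypothetical sample data standing in for the scraped table.
sample_rows = [
    ("A100", "Unassigned"),
    ("A101", "Alice"),
    ("A102", "Unassigned"),
    ("A103", "Bob"),
]

unassigned_before = count_above(sample_rows, "A103")
```

Because the function returns from inside the loop, passing a generator instead of a list means the rows after the match are never processed, which is the early stop the question asks about.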

Count occurrences of 'x' above position 'y' in a table (Python)
