Question

I have the code below, which uses BeautifulSoup to scrape data from a web page. I use two separate for loops to get two different sets of data: name and value.

from bs4 import BeautifulSoup
import requests
import csv

source = requests.get('https://finance.yahoo.com/quote/' + ticker + '/key-statistics?p=' + ticker).text
soup = BeautifulSoup(source, 'lxml')

csv_file = open('yahoo_key_stats_grab.csv', 'w')

csv_writer = csv.writer(csv_file)
csv_writer.writerow(['name', 'value'])

def yahoo_key_stats_grab(ticker):

    for stat in soup.find_all('span')[12:21]:
        name = stat.text
        print(name)
        csv_writer.writerow([name])

    for stat in soup.find_all('td', class_='Fz(s) Fw(500) Ta(end)'):

        if len(str(stat.text)) > 6:
            break

        else:
            print(stat.text)


    csv_file.close()

If I run yahoo_key_stats_grab('MIC'), I get the following output, which is exactly what I want:

Market Cap (intraday)
Enterprise Value
Trailing P/E
Forward P/E
PEG Ratio (5 yr expected)
Price/Sales
Price/Book
Enterprise Value/Revenue
Enterprise Value/EBITDA
3.23B
6.8B
6.95
16.04
1.64
1.73
1.04
3.65
10.80

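However, I want to save the scraped data in a csv file with two columns, name and value. I can get the name column, but I can't figure out how to add the second column, value, to the csv file.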

name                             value

Market Cap (intraday)   

Enterprise Value    

Trailing P/E    

Forward P/E 

PEG Ratio (5 yr expected)   

Price/Sales 

Price/Book  

Enterprise Value/Revenue    

Enterprise Value/EBITDA

Can anyone give me some suggestions? Thanks in advance.

Answer 1

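You can add columns to a csv file by passing a list to the csv writer's writerow() method. Each item in the list becomes one column of that row.

Example: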

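    import csv

    data = [["key1", "value1"], ["key2", "value2"]]

    # newline='' keeps the csv module from writing blank rows on Windows
    csv_file = open('testfile.csv', 'w', newline='')

    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(['name', 'value'])

    for row in data:
        # writerow takes a single list, one item per column
        csv_writer.writerow(row)

    csv_file.close()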

Update: in your case, since two different for loops produce the data, you can store the first set of data in a list:

from bs4 import BeautifulSoup
import requests
import csv

def yahoo_key_stats_grab(ticker):
    # Fetch and parse the page inside the function, where ticker is defined.
    source = requests.get('https://finance.yahoo.com/quote/' + ticker + '/key-statistics?p=' + ticker).text
    soup = BeautifulSoup(source, 'lxml')

    # Store the first set of data (the names) in a list.
    names = []
    for stat in soup.find_all('span')[12:21]:
        names.append(stat.text)

    # newline='' keeps the csv module from writing blank rows on Windows.
    with open('yahoo_key_stats_grab.csv', 'w', newline='') as csv_file:
        csv_writer = csv.writer(csv_file)
        csv_writer.writerow(['name', 'value'])

        for stat in soup.find_all('td', class_='Fz(s) Fw(500) Ta(end)'):
            if len(str(stat.text)) > 6:
                break
            # note that pop(0) raises IndexError if there are
            # more values than names!
            csv_writer.writerow([names.pop(0), stat.text])
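
If the two loops might return different numbers of items, one option is to pair the lists up before writing. The following is only a sketch: pair_rows and its fill argument are hypothetical names, not part of the code above, and padding the shorter list with an empty string is an assumption about what you would want in the file.

```python
from itertools import zip_longest

def pair_rows(names, values, fill=''):
    """Pair each name with its value, padding the shorter list with `fill`."""
    return [list(pair) for pair in zip_longest(names, values, fillvalue=fill)]

# Three names but only two values: the last row gets an empty value.
rows = pair_rows(
    ['Market Cap (intraday)', 'Enterprise Value', 'Trailing P/E'],
    ['3.23B', '6.8B'],
)
```

Each inner list can then be passed to csv_writer.writerow() as one row.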

Answer 2

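It may not be the best option, but you can append the values to lists inside the for loops you already run, then use the collected values to write out what you need. Something like: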

field = []
value = []
for stat in soup.find_all('span')[12:21]:
    name = stat.text
    print(name)
    field.append(name)

for stat in soup.find_all('td', class_='Fz(s) Fw(500) Ta(end)'):

    if len(str(stat.text)) > 6:
        break

    else:
        value.append(stat.text)

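Then write them out with a new for loop, using csv_writer to put each name and its value on one row, separated by whatever delimiter the csv needs.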
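
A minimal sketch of that final loop, assuming field and value have already been filled as above. The function name write_two_columns and its delimiter parameter are hypothetical, chosen only for illustration; the sample lists stand in for the scraped data.

```python
import csv

def write_two_columns(path, field, value, delimiter=','):
    """Write field[i] and value[i] side by side, one pair per row."""
    # newline='' keeps the csv module from writing blank rows on Windows.
    with open(path, 'w', newline='') as csv_file:
        csv_writer = csv.writer(csv_file, delimiter=delimiter)
        csv_writer.writerow(['name', 'value'])
        # zip stops at the shorter list, so unmatched items are dropped.
        for name, val in zip(field, value):
            csv_writer.writerow([name, val])

write_two_columns(
    'yahoo_key_stats_grab.csv',
    ['Market Cap (intraday)', 'Enterprise Value'],
    ['3.23B', '6.8B'],
)
```

Because zip silently stops at the shorter list, it is worth comparing len(field) and len(value) first if a mismatch should be treated as an error.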

Saving two columns to a csv file after running two separate for loops
