Question

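I'm using the code below, and it works well except for one thing: when I open the resulting CSV file in Excel, every other row is skipped (left blank). I have searched the csv module documentation and other examples on stackoverflow.com, and I found that I need to use DictWriter with the lineterminator set to '\n'. My own attempts to write this into the code have been foiled.

So I am wondering: is there a way to apply this (the lineterminator) to the whole file so that no rows are skipped? And if so, how?

Here is the code: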

import urllib2
from BeautifulSoup import BeautifulSoup
import csv

page = urllib2.urlopen('http://finance.yahoo.com/q/ks?s=F%20Key%20Statistics').read()

f = csv.writer(open("pe_ratio.csv","w"))
f.writerow(["Name","PE"])

soup = BeautifulSoup(page)
all_data = soup.findAll('td', "yfnc_tabledata1")
f.writerow([all_data[2].getText()])

Thanks in advance for your help.

Answer 1

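First, since Yahoo provides an API that returns CSV files, maybe you can solve your problem that way? For example, this URL returns a CSV file containing the price, market cap, P/E ratio, and other metrics for all stocks in that industry. There is some more information in this Google Code project.

Your code only produces a two-row CSV because there are only two calls to f.writerow(). If the only piece of data you want from that page is the P/E ratio, this is almost certainly not the best way to get it, but you should pass f.writerow() a tuple containing the value for each column. To be consistent with your header row, this would be: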

f.writerow( ('Ford', all_data[2].getText()) )

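Of course, this assumes that the P/E ratio will always be at index 2 of the list. If you want all the statistics provided on that page, you could try: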

# scrape the html for the name and value of each metric
metrics = soup.findAll('td', 'yfnc_tablehead1')
values = soup.findAll('td', 'yfnc_tabledata1')

# create a list of tuples for the writerows method
def stripTag(tag): return tag.text
data = zip(map(stripTag, metrics), map(stripTag, values))

# write to csv file
f.writerows(data)
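
To see the zip/writerows pattern above in isolation, here is a minimal Python 3 sketch that needs no network access. The metric names and values are hypothetical stand-ins for the scraped cell text, not real Yahoo data, and io.StringIO stands in for the output file.

```python
import csv
import io

# Hypothetical stand-ins for the scraped cell text; not real Yahoo data.
metric_names = ["Market Cap", "Trailing P/E", "Forward P/E"]
metric_values = ["n/a", "n/a", "n/a"]

# zip pairs the i-th name with the i-th value, giving one tuple per CSV row.
rows = list(zip(metric_names, metric_values))

# An in-memory buffer stands in for the file; newline="" means no translation.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(("Name", "Value"))  # header row
writer.writerows(rows)              # one CSV row per (name, value) tuple

csv_text = buffer.getvalue()
```

Each tuple becomes one row, so the file ends up with a header plus one line per metric.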

Answer 2

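You need to open your file with the right options for the csv.writer class to work correctly. The module has universal newline support internally, so you need to turn off Python's universal newline support at the file level.

For Python 2, the docs say:

If csvfile is a file object, it must be opened with the 'b' flag on platforms where that makes a difference.

For Python 3, they say:

If csvfile is a file object, it should be opened with newline=''.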
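
The following Python 3 sketch illustrates why the extra blank rows appear. It assumes the default csv dialect, whose line terminator is "\r\n", and it passes newline="\r\n" to open() to reproduce on any platform the translation that Windows text mode applies by default. The helper name written_bytes is made up for this example.

```python
import csv
import os
import tempfile


def written_bytes(newline):
    """Write one CSV row using the given open() newline setting; return the raw bytes."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        with open(path, "w", newline=newline) as handle:
            csv.writer(handle).writerow(["Name", "PE"])
        with open(path, "rb") as handle:
            return handle.read()
    finally:
        os.remove(path)


# newline="\r\n" turns every "\n" written into "\r\n", as Windows text mode does.
translated = written_bytes("\r\n")
# newline="" passes the csv module's own "\r\n" through untouched.
untouched = written_bytes("")

print(repr(translated))  # b'Name,PE\r\r\n'
print(repr(untouched))   # b'Name,PE\r\n'
```

The doubled "\r" in the first result is what Excel shows as an empty row after every record.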

Also, you should probably use a with statement to handle opening and closing your file, like this:

with open("pe_ratio.csv","wb") as f: # or open("pe_ratio.csv", "w", newline="") in Py3
    writer = csv.writer(f)

    # do other stuff here, staying indented until you're done writing to the file
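
Since the question asks about DictWriter and lineterminator specifically, here is a minimal Python 3 sketch that combines them with the with statement. The row contents are placeholders (the PE value is made up), and a temporary directory is used so the example leaves nothing behind.

```python
import csv
import os
import tempfile

# Placeholder row; the PE value is made up for illustration.
rows = [{"Name": "Ford", "PE": "0.0"}]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "pe_ratio.csv")

    # newline="" stops Python from translating line endings on write;
    # lineterminator="\n" applies to every row this writer emits.
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["Name", "PE"], lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)

    # Read the raw bytes back to inspect the exact line endings.
    with open(path, "rb") as handle:
        raw = handle.read()

print(repr(raw))  # b'Name,PE\nFord,0.0\n'
```

Setting lineterminator on the writer applies it to the whole file, but with newline="" in place the default "\r\n" terminator also opens cleanly in Excel, so the override is optional.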

Python DictWriter / n
