How do I save a scraped list to a CSV file?

Asked: 2016-10-01 16:18:36

Tags: python csv web-scraping

I wrote the code below. It scrapes words from the OED.com website by subject and date, then prints them.

import re
import urllib2
import os
import csv

year_search = 1550
subject_search = ['Law']

path = '/Applications/Python 3.5/Economic'
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
urllib2.install_opener(opener)

user_agent = 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
header = {'User-Agent':user_agent}
request = urllib2.Request('http://www.oed.com/', None, header)
f = opener.open(request)
data = f.read()
f.close()
print 'database first access was successful'

resultPath = os.path.join(path, 'OED_table.csv')
htmlPath = os.path.join(path, 'OED.html')
outputw = open(resultPath, 'w')
outputh = open(htmlPath, 'w')
request = urllib2.Request(
    'http://www.oed.com/search?browseType=sortAlpha&case-insensitive=true'
    '&dateFilter='+str(year_search)+'&nearDistance=1&ordered=false&page=1'
    '&pageSize=100&scope=ENTRY&sort=entry&subjectClass='
    + str(subject_search) + '&type=dictionarysearch', None, header)
page = opener.open(request)
urlpage = page.read()
outputh.write(urlpage)
new_word = re.findall(
    r'<span class=\"hwSect\"><span class=\"hw\">(.*?)</span>', urlpage)
print str(new_word)
outputw.write(str(new_word))
page.close()
outputh.close()
outputw.close()

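Now I want to write them to a CSV file so that each year I search for gets its own row, with that year's words filling the columns of the row.

Something like: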

1550| word1| word2| etc.|
1551| word1| word2| etc.|

Does anyone have any ideas?

2 Answers:

Answer 0: (score: 1)

I would suggest using csv.writer. Here is some sample code:

```python
import csv

with open('/Applications/Python 3.5/Economic/OED_table.csv', 'w') as csv_file:
    csv_writer = csv.writer(csv_file)
    year = ["1550"]
    new_word = ["apple", "banana"]
    complete_row = year + new_word
    csv_writer.writerow(complete_row)
    # writes the row 1550,apple,banana to OED_table.csv
```

You can adapt this with a for loop to write multiple rows.
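
A minimal sketch of such a loop, assuming Python 3. The dict `words_by_year`, the helper `write_rows`, and the output file name are hypothetical; in the question's script the word lists would come from running the search once per year.

```python
import csv
import os
import tempfile

# Hypothetical sample data: year -> words scraped for that year.
words_by_year = {
    1550: ["apple", "banana"],
    1551: ["cherry"],
}


def write_rows(csv_path, words_by_year):
    """Write one CSV row per year: the year first, then its words."""
    # newline='' is what the csv module docs recommend in Python 3;
    # under Python 2, open the file with mode 'wb' instead.
    with open(csv_path, "w", newline="") as csv_file:
        csv_writer = csv.writer(csv_file)
        for year in sorted(words_by_year):
            csv_writer.writerow([year] + words_by_year[year])


# Illustrative location only; any writable path works.
csv_path = os.path.join(tempfile.gettempdir(), "OED_table_example.csv")
write_rows(csv_path, words_by_year)
```

Each call to `writerow` produces one line in the file, so the year always lands in the first column and the words follow it.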

Answer 1: (score: 0)

After the line where you define new_word, you can do the following:

year_info = [str(year_search)] + new_word
print '|'.join(year_info)

This prints the year and the words on one line, separated by pipes:

1550|word1|word2|etc.
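
If the pipe-separated line should go into a file instead of being printed, one option is to let csv.writer use '|' as its delimiter. The following is only a sketch, assuming Python 3; the values in `new_word` are placeholders for the scraped words, and an in-memory buffer stands in for the real output file.

```python
import csv
import io

year_search = 1550
new_word = ["word1", "word2"]  # placeholder for the scraped words

# StringIO stands in for a file opened with open(path, 'w', newline='').
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="|")
writer.writerow([year_search] + new_word)

line = buffer.getvalue()
print(line.strip())
```

Unlike a plain '|'.join, the writer also quotes any word that happens to contain the delimiter.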