I need the output of this script, i.e. the pygoogle search results, to look like this:
name # of results
name # of results
name # of results
Here is what I have so far. How can I do this without overwriting the file each time?
import re
import pygoogle
import csv
from pygoogle import pygoogle
#creates list
with open('parse2.txt') as f:
    lines = [x.strip() for x in f.read().strip('\'"[]').split(' '*6)]
#googles each name in list
for line in lines:
    g = pygoogle(line)
    g.pages = 1
    names = [line + " " + "%s results" %(g.get_result_count())]
    if (g.get_result_count()) == 0:
        print "ERROR. SEARCH NOT SUCCESSFUL. TRY AGAIN IN A FEW MINUTES."
    elif (g.get_result_count()) > 0:
        print names
        for name in names:
            with open("output.txt", "wb+") as f:
                f.writelines(name)
When I run the script, the output file contains only the most recent result, because the file is overwritten on every iteration.
Answer 0 (score: 1)
The names variable is a list that holds only one item each time it is used. Do this instead:
import re
import csv
from pygoogle import pygoogle

# Read the list of names once
with open('parse2.txt') as fin:
    names = [x.strip() for x in fin.read().strip('\'"[]').split(' '*6)]

# Open the output file once, in write mode, outside the loop
with open("output.txt", "w") as fout:
    for name in names:
        g = pygoogle(name)
        g.pages = 1
        count = g.get_result_count()  # query once and reuse the value
        if count == 0:
            print "[Error]: could find no result for '{}'".format(name)
        else:
            fout.write("{} {} results\n".format(name, count))
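The open-once pattern above can be tried without pygoogle. The following is a minimal sketch, assuming Python 3 (unlike the Python 2 snippets in this thread); write_counts, fake_counts and the sample names are hypothetical stand-ins for illustration, not part of pygoogle.

```python
import os
import tempfile


def write_counts(names, count_fn, path):
    """Write one 'name N results' line per name; return the names with 0 results."""
    failed = []
    with open(path, "w") as fout:  # opened once, before the loop starts
        for name in names:
            count = count_fn(name)  # hypothetical lookup, called once per name
            if count == 0:
                failed.append(name)
            else:
                fout.write("{} {} results\n".format(name, count))
    return failed


# Hypothetical stand-in for a real search call.
fake_counts = {"alice": 120, "bob": 0, "carol": 7}

out_path = os.path.join(tempfile.mkdtemp(), "output.txt")
failed = write_counts(["alice", "bob", "carol"], fake_counts.get, out_path)

with open(out_path) as f:
    written = f.read()
```

Because the file handle stays open for the whole loop, every successful name ends up on its own line, and the names that returned no results are collected separately instead of being lost.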
Not overwriting previous queries
You need to swap the order of the with and for statements, so that the file is opened only once:
with open("output.txt", "wb+") as f:
    for line in lines:
        # Stuff...
        for name in names:
            f.write(name + "\n")  # one result per line
Alternatively, open the file in append mode:
for name in names:
    with open("output.txt", "a") as f:
        f.write(name + "\n")
In this case, the data is added at the end of the file, including whatever earlier runs left there.
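The difference between the modes is easy to check in isolation. This is a small sketch, assuming Python 3; write_each and the sample lines are made up for the demonstration. It reopens the file for every item, as the loop in the question does, once per mode.

```python
import os
import tempfile


def write_each(path, mode, items):
    """Reopen the file for every item, then return the final file contents."""
    for item in items:
        with open(path, mode) as f:
            f.write(item + "\n")
    with open(path) as f:
        return f.read()


tmp_dir = tempfile.mkdtemp()
items = ["a 1 results", "b 2 results", "c 3 results"]

# "w" and "w+" both truncate the file on every open; "a" keeps existing data.
with_w = write_each(os.path.join(tmp_dir, "w.txt"), "w", items)
with_w_plus = write_each(os.path.join(tmp_dir, "wplus.txt"), "w+", items)
with_a = write_each(os.path.join(tmp_dir, "a.txt"), "a", items)
```

Only the append-mode file keeps all three lines. The "+" in "w+" adds read access but does not prevent truncation.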
The steps to get what you want are as follows:
import csv
import re
from itertools import chain, izip  # Python 2; on Python 3 use the built-in zip

A = ["blah blah", "blah blah", "blah", "list"]

#
# from itertools doc page
#
def flatten(listOfLists):
    "Flatten one level of nesting"
    return list(chain.from_iterable(listOfLists))

def pairwise(t):
    "Group consecutive items into non-overlapping pairs"
    it = iter(t)
    return izip(it, it)

#
# Transform data
#
list_of_lists = [re.split("[ ,]", item) for item in A]
# [['blah', 'blah'], ['blah', 'blah'], ['blah'], ['list']]
a_words = flatten(list_of_lists)
a_pairs = pairwise(a_words)

with open("output.csv", "wb") as f:
    writer = csv.writer(f)
    writer.writerows(a_pairs)
This can be written more concisely as:
A_pairs = pairwise(flatten([re.split("[ ,]", item) for item in A]))
with open("output.csv", "wb") as f:
    csv.writer(f).writerows(A_pairs)
If you do not want commas in the output, simply define a custom dialect for csv.writer:
>>> csv.register_dialect('mydialect', delimiter=' ', quoting=csv.QUOTE_MINIMAL)
>>> csv.writer(open("try.csv", "w"), dialect="mydialect").writerows(a_pairs)
This gives you what you want:
➤ cat try.csv
blah blah
blah blah
blah list
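For readers on Python 3, where izip no longer exists, the same pipeline might look like the sketch below. It writes to an in-memory buffer instead of try.csv so it can be run without touching the disk; pairwise_chunks is a hypothetical name, and the explicit lineterminator is an assumption made to get plain "\n" line endings.

```python
import csv
import io
import re
from itertools import chain


def pairwise_chunks(items):
    """Group consecutive items into non-overlapping pairs: (s0, s1), (s2, s3), ..."""
    it = iter(items)
    return zip(it, it)  # both arguments share one iterator, so items are consumed in pairs


A = ["blah blah", "blah blah", "blah", "list"]

# Split every entry on spaces or commas and flatten one level of nesting.
words = list(chain.from_iterable(re.split("[ ,]", item) for item in A))

buf = io.StringIO()
writer = csv.writer(buf, delimiter=" ", lineterminator="\n")
writer.writerows(pairwise_chunks(words))
result = buf.getvalue()
```

Note that zip(it, it) silently drops a trailing item when the number of words is odd. When writing to a real file on Python 3, open it in text mode with newline="" rather than "wb".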
Answer 1 (score: 0)
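To append to the file instead of overwriting it, open it in append mode. Note that adding + to a write mode, as in "wb+", still truncates the file on every open:
for name in names:
    with open("output.txt", "ab") as f:
        writer = csv.writer(f)
        writer.writerows(A)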
On the other hand, for efficiency you can open the file just once and use the file object's own methods instead of the csv module:
with open("output.txt", "wb+") as f:
    f.writelines(A)
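One detail worth checking when using writelines: it does not add line separators itself. The sketch below, assuming Python 3 and using in-memory buffers rather than output.txt, shows the difference between passing the raw list and passing lines with an explicit newline.

```python
import io

A = ["blah blah", "blah blah", "blah", "list"]

# writelines() writes the strings back to back, with no separator.
raw = io.StringIO()
raw.writelines(A)
run_together = raw.getvalue()

# Appending "\n" to each item puts every entry on its own line.
separated = io.StringIO()
separated.writelines(item + "\n" for item in A)
one_per_line = separated.getvalue()
```

If each entry should appear on its own line in the output file, the newline has to be supplied by the caller.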
Answer 2 (score: 0)
Something like this:
>>> import csv
>>> A = ["blah blah", "blah blah", "blah", "list"]
>>> lis = [y for x in A for y in x.split()]
>>> lis
['blah', 'blah', 'blah', 'blah', 'blah', 'list']
>>> it = iter(lis)
>>> with open("output.csv", "wb") as f:
...     writer = csv.writer(f, delimiter=' ')
...     writer.writerows([[x, next(it)] for x in it])
...