Question

I need the output of this script, i.e. the pygoogle search results, to look like this:

name    # of results
name    # of results
name    # of results

Here is what I have so far. How can I do this without the file being rewritten on every iteration?

import re
import pygoogle
import csv
from pygoogle import pygoogle
#creates list
with open('parse2.txt') as f:
    lines = [x.strip() for x in f.read().strip('\'"[]').split(' '*6)]
#googles each name in list
for line in lines:
    g = pygoogle(line)
    g.pages = 1
    names = [line + "    " + "%s results" %(g.get_result_count())]
    if (g.get_result_count()) == 0:
        print "ERROR. SEARCH NOT SUCCESSFUL. TRY AGAIN IN A FEW MINUTES."
    elif (g.get_result_count()) > 0:
        print names
    for name in names:
        with open("output.txt", "wb+") as f:
            f.writelines(name)

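When I run the script, the output file only contains the most recent result, because the file is rewritten on every pass through the loop.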

Answer 1

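Clearing up the confusion about the loop's behaviour:

The names variable is a list that holds only a single item each time it is used, because it is rebuilt on every pass through the loop. Do this instead: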

import re
import csv
from pygoogle import pygoogle

# Build the list of names to search for
with open('parse2.txt') as fin:
    names = [x.strip() for x in fin.read().strip('\'"[]').split(' '*6)]

# Open the output file once, in write mode, and keep it open for the whole loop
with open("output.txt", "w") as fout:
    for name in names:
        g = pygoogle(name)
        g.pages = 1
        count = g.get_result_count()  # query once and reuse the value
        if count == 0:
            print "[Error]: could find no result for '{}'".format(name)
        else:
            fout.write("{}    {} results\n".format(name, count))
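The same open-once pattern can be tried without pygoogle or network access. This is only a sketch in Python 3: `fake_result_count` and `write_counts` are hypothetical names introduced here, and the fake counts are made up purely so the output is predictable.

```python
import os
import tempfile


def fake_result_count(name):
    # Hypothetical stand-in for pygoogle(name).get_result_count();
    # the returned number is invented for illustration only.
    return len(name) * 100


def write_counts(names, path, count_fn=fake_result_count):
    """Open the output file once and write one 'name    N results' line per name."""
    with open(path, "w") as fout:
        for name in names:
            count = count_fn(name)  # call once, reuse the value
            if count == 0:
                print("[Error]: could find no result for '{}'".format(name))
            else:
                fout.write("{}    {} results\n".format(name, count))


out_path = os.path.join(tempfile.mkdtemp(), "output.txt")
write_counts(["alice", "bob"], out_path)

with open(out_path) as f:
    written = f.read()
```

Because the file is opened before the loop starts, every name ends up in it, one per line.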

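Writing the file out once

Not overwriting previous queries

You need to swap the order of the with and for statements, so that the file is opened only once: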

with open("output.txt", "wb+") as f:
  for line in lines:
    # Stuff...
    for name in names:
      f.write(name + "\n")  # one result per line

Alternatively, open the file in append mode:

for name in names:
    with open("output.txt", "a") as f:
        f.write(name + "\n")  # one result per line

In this case, each write adds the data to the end of the file.
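To see the difference between the two modes side by side, here is a small Python 3 sketch. The helper names `write_each` and `read_back` and the sample names are made up for the illustration; it writes to a temporary directory rather than output.txt.

```python
import os
import tempfile


def write_each(names, path, mode):
    # Reopen the file for every name, as the loop in the question does.
    for name in names:
        with open(path, mode) as f:
            f.write(name + "\n")


def read_back(path):
    with open(path) as f:
        return f.read()


tmp_dir = tempfile.mkdtemp()
names = ["alice", "bob", "carol"]

w_path = os.path.join(tmp_dir, "w_mode.txt")
write_each(names, w_path, "w")   # every open truncates the file
truncated = read_back(w_path)    # only the last name survives

a_path = os.path.join(tmp_dir, "a_mode.txt")
write_each(names, a_path, "a")   # every open continues at the end
appended = read_back(a_path)     # all three names survive
```

One thing to keep in mind with append mode is that the file also keeps growing across runs of the script, so old results stay in it unless you delete the file first.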

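Transforming the data

The steps to get what you want:

Convert the original list into a list of words.
Group the list into pairs.
Write out the pairs.

As follows: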

import csv
import re
from itertools import chain, izip

A = ["blah blah", "blah blah", "blah", "list"]

#
# from itertools doc page
#
def flatten(listOfLists):
  "Flatten one level of nesting"
  return list(chain.from_iterable(listOfLists))

def pairwise(t):
  it = iter(t)
  return izip(it,it)

#
# Transform data
#
list_of_lists = [re.split("[ ,]", item) for item in A]
# [['blah', 'blah'], ['blah', 'blah'], ['blah'], ['list']]
a_words = flatten(list_of_lists)
a_pairs = pairwise(a_words)

with open("output.csv", "wb") as f:
    writer = csv.writer(f)
    writer.writerows(a_pairs)

Written more concisely:

A_pairs = pairwise(flatten([re.split("[ ,]", item) for item in A]))
with open("output.csv", "wb") as f:
    csv.writer(f).writerows(A_pairs)
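The code above is Python 2 (izip, and "wb" for the csv file). If you are on Python 3, a rough equivalent might look like the sketch below. It assumes the built-in zip in place of izip and writes into an in-memory io.StringIO buffer instead of output.csv so the result is easy to inspect; with a real file you would open it with newline="" in text mode.

```python
import csv
import io
import re
from itertools import chain


def flatten(list_of_lists):
    "Flatten one level of nesting"
    return list(chain.from_iterable(list_of_lists))


def pairwise(items):
    # Passing the same iterator twice makes zip pull two consecutive
    # items per tuple, giving non-overlapping pairs.
    it = iter(items)
    return zip(it, it)


A = ["blah blah", "blah blah", "blah", "list"]

a_words = flatten(re.split("[ ,]", item) for item in A)
a_pairs = list(pairwise(a_words))

buf = io.StringIO(newline="")
csv.writer(buf).writerows(a_pairs)
csv_text = buf.getvalue()
```

Note that zip stops at the shorter input, so with an odd number of words the last word is silently dropped.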

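Writing it out in the right format

If you don't want commas in the output, just define a custom dialect for the csv writer:

>>> csv.register_dialect('mydialect', delimiter=' ', quoting=csv.QUOTE_MINIMAL)
>>> csv.writer(open("try.csv", "w"), dialect="mydialect").writerows(a_pairs)

Which gives you what you want: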

➤ cat try.csv 
blah blah
blah blah
blah list
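One detail worth knowing about a space-delimited dialect: with QUOTE_MINIMAL, any field that itself contains a space gets wrapped in quotes. The Python 3 sketch below shows both cases. The dialect name `space_separated` and the sample row are invented for the example, and it writes to in-memory buffers rather than try.csv.

```python
import csv
import io

# Register a dialect that separates fields with a single space.
csv.register_dialect("space_separated", delimiter=" ", quoting=csv.QUOTE_MINIMAL)

pairs = [("blah", "blah"), ("blah", "blah"), ("blah", "list")]

buf = io.StringIO(newline="")
csv.writer(buf, dialect="space_separated").writerows(pairs)
lines = buf.getvalue().splitlines()

# A field containing the delimiter is quoted under QUOTE_MINIMAL.
quoted_buf = io.StringIO(newline="")
csv.writer(quoted_buf, dialect="space_separated").writerow(["John Smith", "42"])
quoted_line = quoted_buf.getvalue().strip()
```

So if the names being searched contain spaces, a tab delimiter or plain f.write calls may be a better fit for the "name    # of results" layout than a space-delimited csv.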

Answer 2

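Adding + to the "w" mode is not enough, because "wb+" still truncates the file every time it is opened. To append to the file without overwriting it, open it in append mode instead:

for name in names:
    with open("output.txt", "ab") as f:
        writer = csv.writer(f)
        writer.writerow([name])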

On the other hand, for efficiency you can open the file just once and use the file's own methods instead of the csv module:

with open("output.txt", "wb+") as f:
    f.writelines(A)
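One caveat with writelines: it does not add line separators, so the items run together unless each one already ends in a newline. A quick Python 3 sketch, using in-memory buffers and the sample list A from the other answers:

```python
import io

A = ["blah blah", "blah blah", "blah", "list"]

# writelines writes the items back to back, with nothing in between.
raw = io.StringIO()
raw.writelines(A)
joined = raw.getvalue()

# Appending "\n" to each item puts every entry on its own line.
with_newlines = io.StringIO()
with_newlines.writelines(item + "\n" for item in A)
separated = with_newlines.getvalue()
```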

Answer 3

Something like this:

>>> import csv
>>> A = ["blah blah", "blah blah", "blah", "list"]
>>> lis = [y for x in A for y in x.split()]
>>> lis
['blah', 'blah', 'blah', 'blah', 'blah', 'list']
>>> it = iter(lis)
>>> with open("output.csv", "wb") as f:
         writer = csv.writer(f, delimiter=' ')
         writer.writerows([ [x,next(it)] for x in it])
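The pairing trick works because the for loop and next(it) pull from the same iterator, so each pass consumes two words. With an odd number of words the bare next(it) raises StopIteration, so here is a sketch of a slightly more forgiving variant in Python 3. `pair_up` and its `fill` parameter are hypothetical names, not part of the answer above.

```python
def pair_up(words, fill=""):
    # The loop takes the first word of each pair; next(it, fill) takes the
    # second, falling back to `fill` when the words run out.
    it = iter(words)
    return [[x, next(it, fill)] for x in it]


A = ["blah blah", "blah blah", "blah", "list"]
lis = [y for x in A for y in x.split()]

even_pairs = pair_up(lis)
odd_pairs = pair_up(["a", "b", "c"])
```

Whether padding the last pair is the right behaviour depends on your data; raising an error may be preferable if an odd count means the input is malformed.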

Writing results into a .txt file using csv
