Question

import os.path

endofprogram=False
try:

   filename1=input("Enter input file: ")
   filename2=input("Enter output file: ")

   while os.path.isfile(filename2):
       filename2=input("File Exists! Enter new name for output file: ") 

except IOError:
   print("Error opening file - End of program")
   endofprogram=True

if(endofprogram == False):
   infile=open(filename1, "r")
   content=infile.read()
   lines=[]
   words=[]

   lines=content.split('\n')
   print("Total animals=",len(lines))

I have been working on this program that deals with files. I have a file:

#color     size    flesh     class
brown     large    hard      safe
green     large    hard      safe
red       large    soft      dangerous
green     large    soft      safe


red       small    hard      safe
red       small    hard      safe
brown     small    hard      safe
green     small    soft      dangerous
green     small    hard      dangerous
red       large    hard      safe
brown     large    soft      safe
green     small    soft      dangerous
red       small    soft      safe
red       large    hard      dangerous
red       small    hard      safe
green     small    hard      dangerous

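I am supposed to answer the following questions:

Total number of animals?
Total number of dangerous animals?
Number of large animals that are safe?

So far I can print the total number of animals, but the count includes the blank lines and the comment line, which I don't want. Currently it prints 19 when it should be 16. I also don't know where to begin with the other two questions.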

Answer 1

You should process the file line by line, which is easier than reading the whole file at once. For example:

animals = 0
safe_animals = 0        # large animals that are safe
dangerous_animals = 0

with open('entrada', 'r') as infile:
    for line in infile:
        line_components = line.strip().split()
        # Skip blank lines and the '#' comment (header) line
        if not line_components or line_components[0].startswith('#'):
            continue
        animals += 1

        if line_components[3] == 'dangerous':
            dangerous_animals += 1
        elif line_components[3] == 'safe' and line_components[1] == 'large':
            safe_animals += 1

print("%i animals" % animals)
print("%i large safe animals" % safe_animals)
print("%i dangerous animals" % dangerous_animals)

Answer 2

Instead of reading everything at once and then splitting content, you may want to use the readlines() method on the file object, like this:

lines = infile.readlines()

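Then you can do some filtering to remove comments and/or blank lines before parsing the columns.

One example is the list comprehension lines = [line for line in lines if len(line.strip()) > 0]. You can do something similar to get rid of the comment lines.

The split method is better suited to actually parsing each line.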
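The filtering steps described above can be put together roughly as follows. This is only a sketch: the helper name load_rows and the shortened SAMPLE text are illustrative assumptions, and io.StringIO stands in for the real file object so the example runs on its own.

```python
import io

# Illustrative sample in the same layout as the question's file (not the full data).
SAMPLE = """#color     size    flesh     class
brown     large    hard      safe

red       large    soft      dangerous
green     small    soft      dangerous
"""

def load_rows(infile):
    """Return one list of columns per animal, skipping blank and comment lines."""
    lines = infile.readlines()
    # Drop blank lines first, then lines whose first non-space character is '#'.
    lines = [line for line in lines if len(line.strip()) > 0]
    lines = [line for line in lines if not line.lstrip().startswith('#')]
    # split() with no argument splits on any run of whitespace.
    return [line.split() for line in lines]

rows = load_rows(io.StringIO(SAMPLE))
total = len(rows)
dangerous = sum(1 for row in rows if row[3] == 'dangerous')
large_safe = sum(1 for row in rows if row[3] == 'safe' and row[1] == 'large')
print(total, dangerous, large_safe)
```

With a real file, passing the object returned by open(filename1, "r") to load_rows should behave the same way.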

Answer 3

Here is a rather verbose way to do it:

color, size, flesh, clas = 0, 1, 2, 3  # column indices
animals = []
with open('animals.txt') as f:
    for line in f:
        # Skip blank (or whitespace-only) lines and '#' comment lines
        if not line.strip() or line.lstrip().startswith('#'):
            continue
        animals.append(line.split())
print(animals)
print(len(animals))  # total animals
print(sum(1 for animal in animals if animal[clas] == 'dangerous'))  # dangerous animals
print(sum(1 for animal in animals if animal[clas] == 'safe' and animal[size] == 'large'))  # large safe animals

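Explanation: iterate over all lines. If a line is empty or a comment, skip it. Otherwise split the line and append the result to the list of animals. Each animal is a list of four elements (hence the column indices on the first line). Then just filter and count the matching animals.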

Answer 4

Why not use a database, YAML, or JSON to store the data?

That is much better than a plain text file, and easier to parse and query.

json: http://docs.python.org/3.3/library/json.html
PyYAML: http://pyyaml.org/wiki/PyYAML
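As a rough illustration of the JSON suggestion, the same questions could be answered like this. The record layout (one dict per animal, keyed by the column names) is an assumption, and only three sample records are shown rather than the full data set.

```python
import json

# Hypothetical records in the shape a JSON file might hold; not the full data set.
animals = [
    {"color": "brown", "size": "large", "flesh": "hard", "class": "safe"},
    {"color": "red", "size": "large", "flesh": "soft", "class": "dangerous"},
    {"color": "green", "size": "small", "flesh": "soft", "class": "dangerous"},
]

# Round-trip through a JSON string; json.dump and json.load do the same with a file object.
text = json.dumps(animals, indent=2)
loaded = json.loads(text)

total = len(loaded)
dangerous = sum(1 for a in loaded if a["class"] == "dangerous")
large_safe = sum(1 for a in loaded if a["class"] == "safe" and a["size"] == "large")
print(total, dangerous, large_safe)
```

Because each record carries its field names, there is no header or blank line to filter out, which is the main convenience over the whitespace-separated file.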

Reading and writing files in Python 3
