Check to make sure that:
a) each line is 4 columns long
b) the check does not fail if there is a newline ('\n') at the end of the file
    def ask_for_filename():
        filename = raw_input("Please enter file name: ")
        return filename

    def read_data(filename):
        with open(filename) as f:
            data = f.readlines()
            i = 0
            for line in data:
                lineContains = line.split('\t')
                lineLength = len(lineContains) #calculate elements
                i = i+1
                if lineLength < 3 and i < len(data):
                    print "File is invalid format."
            f.close()
        return data
Could you please correct me where I went wrong? This part of the code does not work:
    i = 0
    for line in data:
        lineContains = line.split('\t')
        lineLength = len(lineContains) #calculate elements
        i = i+1
        if lineLength < 3 and i < len(data):
            print "File is invalid format."
Sample file contents:
Complete file:
AUTHOR(S) YEAR TITLE JOURNAL/CONFERENCE
Accot;Zhai 2001 Scale effects in steering law tasks Proc. ACM CHI
Acredolo 1977 Developmental Changes in the Ability to Coordinate Perspectives of a Large-Scale Space Developmental Psychology
Aginsky;Harris;Rensink;Beusmans 1997 Two strategies for learning a route in a driving simulator Journal of Environmental Psychology
Incomplete file (the code above is meant for this kind of file):
AUTHOR(S) YEAR TITLE JOURNAL/CONFERENCE
Accot;Zhai 2001 Scale effects in steering law tasks Proc. ACM CHI
Acredolo Developmental Changes in the Ability to Coordinate Perspectives of a Large-Scale Space Developmental Psychology
Aginsky;Harris;Rensink;Beusmans 1997 Two strategies for learning a route in a driving simulator Journal of Environmental Psychology
Agrawala;Beers;Frohlich;Hanrahan;McDowall;Bolas 1997 The two-user responsive workbench: Support for collaboration through individual views of a shared space Proc. ACM SIGGRAPH
Ahmadabadi;Eiji 1996 Cooperation strategy for a group of object lifting robots Proc. of IROS
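Answer 0 (score: 1)
You complain that your code "doesn't affect the rest of the program in any way".
Since nothing in the code in question modifies any data or changes any control flow, of course it doesn't affect the rest of the program. That is why read_data always returns every line in the file, valid or invalid.
Since you haven't explained how you want it to affect the rest of the program, it's hard to show you how to do what you want... but I can show you how to do something.
For example, instead of returning all of the lines, return only the valid ones: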
    i = 0
    result = []
    for line in data:
        lineContains = line.split('\t')
        lineLength = len(lineContains) #calculate elements
        i = i+1
        if lineLength < 3 and i < len(data):
            print "File is invalid format."
        else:
            result.append(line)
    return result
Or raise an exception instead of returning anything:
    i = 0
    for line in data:
        lineContains = line.split('\t')
        lineLength = len(lineContains) #calculate elements
        i = i+1
        if lineLength < 3 and i < len(data):
            raise ValueError("File is invalid format.")
    return data
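As a rough sketch of how a caller might deal with the exception-raising version: the names validate_lines and load_or_none are made up for illustration, the check works on a plain list of lines rather than a file, and the code is written for Python 3.

```python
def validate_lines(lines):
    """Raise ValueError on the first line with fewer than 3 tab-separated fields."""
    for number, line in enumerate(lines, start=1):
        if len(line.split('\t')) < 3:
            raise ValueError("File is invalid format (line %d)." % number)
    return lines


def load_or_none(lines):
    """Return the lines if they all pass validation, otherwise None."""
    try:
        return validate_lines(lines)
    except ValueError:
        return None


good = ["a\tb\tc\td\n", "e\tf\tg\th\n"]
bad = ["a\tb\tc\td\n", "only one field\n"]
```

Here load_or_none(good) hands back the list unchanged, while load_or_none(bad) swallows the ValueError and returns None. Whether to catch the exception or let it propagate depends on what the rest of the program should do with an invalid file.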
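Meanwhile, your code has a few other problems.
You shouldn't call f.close() after using f in a with block, because the with statement already closes the file for you. Usually you'll get lucky and it will be harmless, but "usually harmless and never useful" is not the kind of code you want.
If you want to number the lines as you loop over them, don't add an explicit i = i+1 to the loop; just use enumerate.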
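A minimal illustration of enumerate, using a hypothetical in-memory list in place of the file's lines (Python 3):

```python
lines = ["first\n", "second\n", "third\n"]

# enumerate yields (index, item) pairs; start=1 gives 1-based line numbers
# without a hand-maintained counter.
numbered = [(i, line.strip()) for i, line in enumerate(lines, start=1)]
```

Without the start argument the numbering begins at 0.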
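Also, I'm not sure what i < len(data) is supposed to do. It is true for every line except the last one, so all it does is exempt the final line from the check (presumably to tolerate a trailing blank line). I'll leave it out. (That means I could drop i entirely, since that is the only place you use it... but I'll keep it so I can show you enumerate.)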
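There is almost never a reason to call readlines(). A file is already an iterable of lines, just like the list that readlines returns. All you're doing is making your code slower and more memory-hungry by reading the whole file at once instead of on demand.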
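To illustrate, here is a small Python 3 sketch in which io.StringIO stands in for a real file opened with open(); the sample text is invented. Both kinds of object can be looped over directly, one line at a time.

```python
import io

# io.StringIO is a stand-in for a real file object in this demo.
f = io.StringIO("a\tb\tc\td\ne\tf\tg\th\n")

field_counts = []
for line in f:  # iterate the file itself; no readlines() needed
    field_counts.append(len(line.split('\t')))
```

Each line is read only when the loop asks for it, so the whole file never has to sit in memory at once.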
So, here is the version that skips bad lines:
    def read_data(filename):
        result = []
        with open(filename) as f:
            for i, line in enumerate(f):
                lineContains = line.split('\t')
                lineLength = len(lineContains) #calculate elements
                if lineLength < 3:
                    print "File is invalid format."
                else:
                    result.append(line)
        return result
Meanwhile, do you really want to print a warning for every invalid line when there could be 100000 of them? If not, you can make it even simpler:
    def read_data(filename):
        def bad_line(line):
            lineContains = line.split('\t')
            lineLength = len(lineContains) #calculate elements
            return lineLength < 3
        with open(filename) as f:
            return [line for line in f if not bad_line(line)]
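One thing worth noting: the < 3 threshold is carried over from the question, so a line with only three fields would still pass. If the goal is exactly four columns, with blank lines at the end of the file tolerated, a minimal sketch might look like the following. The name find_bad_lines and the sample data are hypothetical, and the code targets Python 3.

```python
def find_bad_lines(lines, expected_columns=4):
    """Return 1-based numbers of lines that lack exactly expected_columns fields."""
    lines = list(lines)
    # Ignore blank lines at the very end, e.g. from an extra trailing newline.
    while lines and not lines[-1].strip():
        lines.pop()
    return [
        number
        for number, line in enumerate(lines, start=1)
        if len(line.rstrip('\n').split('\t')) != expected_columns
    ]


complete = ["A\t2001\tTitle\tJournal\n", "B\t1977\tTitle\tJournal\n", "\n"]
incomplete = ["A\t2001\tTitle\tJournal\n", "B\tTitle\tJournal\n"]
```

Returning the offending line numbers, rather than printing inside the loop, leaves it to the caller to decide whether to warn, raise, or skip those lines.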
Answer 1 (score: 0)
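    def is_data_valid(filename):
        # Read every line; the with block closes the file when done.
        with open(filename) as f:
            data = f.readlines()
        lines = [x.split('\t') for x in data]
        # A blank line splits into a single field, so this drops blank lines
        # (and any other line that contains no tab at all).
        no_newlines = [line for line in lines if len(line) > 1]
        # Every remaining line must have exactly 4 tab-separated columns.
        return all(len(line) == 4 for line in no_newlines)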
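To try the function out, here is a small self-contained Python 3 sketch that writes two invented sample files to a temporary directory. The function body is repeated (using a with block) so the snippet runs on its own, and write_sample is a helper made up for this demo.

```python
import os
import tempfile


def is_data_valid(filename):
    with open(filename) as f:
        lines = [x.split('\t') for x in f]
    # Drop blank lines (a single field), then require exactly 4 columns.
    no_newlines = [line for line in lines if len(line) > 1]
    return all(len(line) == 4 for line in no_newlines)


def write_sample(directory, name, text):
    """Write text to directory/name and return the full path."""
    path = os.path.join(directory, name)
    with open(path, 'w') as f:
        f.write(text)
    return path


tmpdir = tempfile.mkdtemp()
# Four columns followed by an extra blank line at the end.
good_path = write_sample(tmpdir, 'good.txt', "A\t2001\tTitle\tJournal\n\n")
# Second line has only three columns.
bad_path = write_sample(tmpdir, 'bad.txt', "A\t2001\tTitle\tJournal\nB\tTitle\tJournal\n")

good_ok = is_data_valid(good_path)
bad_ok = is_data_valid(bad_path)
```

The first file passes despite its trailing blank line, and the second fails because of the three-column row.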