Parse a file and store the data in lists

Time: 2014-05-06 21:33:12

Tags: python

I have a file "as.txt" like this:

Sr.No.      Name        Enrollment Number   CGPA        Year        
1.          XYZ     1101111             7.1     2014        
2.          ZYX     1101113             8.2     2014        
3.          Abc     1010101             9.1     2014        

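I want to parse this file and store the data in lists. I want to extract every row and check its enrollment number: if the enrollment number starts with 11, save the row in secondyearlist, otherwise in firstyearlist.

This is what I tried, but I think I got it wrong.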

import struct

with open("as.txt") as f:
    # skip first two lines (containing header) and split on whitespace
    # this creates a nested list like: [[val1, i1, i2], [val2, i1, i2]]
    lines = [x.split() for x in f.readlines()[2:]
    # use the list to create the dict, using first item as key, last as values
    dict((x[0], x[1:])for x in lines)
f.close()

Please help me do this.

3 answers:

Answer 0: (score: 2)

A possible solution

There are many possible solutions; this one illustrates a few typical constructs.

fname = "as.txt"
with open(fname) as f:
    # skip first line (containing header)
    header = f.next() #this has just read one line (header)
    print "header", header # just to show, we have read the header line, not really necessary
    # this creates a list of records with each record being: [srno, name, enrolment, cgpa, year]
    records = [line.split() for line in f]
    # initialize resulting lists
    y_11 = []
    y_others = []
    # loop over records
    # we use value unpacking, each element of record is assigned to one variable
    for srno, name, enrolment, cgpa, year in records:
        if enrolment.startswith("11"):
            y_11.append([srno, name, enrolment, float(cgpa), int(year)])
        else:
            y_others.append([srno, name, enrolment, float(cgpa), int(year)])
# note, as we have left the `with` block, the `f.close()` was done automatically
assert f.closed # this assert would raise an exception if the `f.closed` would not be True

# print the results
print "y_11", y_11
print "y_other", y_others

Run it like this:

$ python file2lst.py 
header Sr.No.      Name        Enrollment Number   CGPA        Year        

y_11 [['1.', 'XYZ', '1101111', 7.1, 2014], ['2.', 'ZYX', '1101113', 8.2, 2014]]
y_other [['3.', 'Abc', '1010101', 9.1, 2014]]

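A few comments

f.next() - reading the next line of text

An open file object is iterable, so a loop can iterate over its lines. You therefore do not have to call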

lines = f.readlines()

but you can also do:

lines = list(f)

In both cases, a list of lines is returned.

When iterating in a for loop, the calls to the iterable's next() method are hidden:

lines = []
for line in f:
    lines.append(line)

Again, we end up with a filled list of lines.

We can achieve the same with explicit next() calls on the iterable, which in our case is the open file object.

with open(fname) as f:
    lines = []
    line = f.next()
    lines.append(line)
    line = f.next()
    lines.append(line)
    line = f.next()
    lines.append(line)
    line = f.next()
    lines.append(line)

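We were smart enough to stop in time; otherwise, once we ran out of lines in the file, a StopIteration exception would be raised. A for loop catches this exception automatically and stops iterating.

By now it should be clear that the call header = f.next() reads the first line. The next time f is used in an iteration, it does not start over; it continues with the following line and never returns the header again.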
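The snippets in this answer are Python 2 (f.next() and the print statement). In Python 3, file objects have no .next() method and the built-in next() function does the same job. The following is a minimal sketch, assuming Python 3; io.StringIO is used as a stand-in for the open file, and SAMPLE simply repeats the data from the question.

```python
import io

# Stand-in for open("as.txt"); the text mirrors the file from the question.
SAMPLE = (
    "Sr.No.      Name        Enrollment Number   CGPA        Year\n"
    "1.          XYZ     1101111             7.1     2014\n"
    "2.          ZYX     1101113             8.2     2014\n"
    "3.          Abc     1010101             9.1     2014\n"
)

f = io.StringIO(SAMPLE)
header = next(f)   # consumes the header line, like f.next() in Python 2
rows = list(f)     # iteration continues after the header; it is not returned again

# Once the lines are used up, next() raises StopIteration ...
try:
    next(f)
    exhausted = False
except StopIteration:
    exhausted = True

# ... unless a default value is passed as the second argument.
sentinel = next(f, None)

print(header.split()[0], len(rows), exhausted, sentinel)
```

Passing a default to next() is a convenient way to skip a header without risking an exception on an empty file.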

Unpacking values into variables

We assume that line.split() returns 5 elements.

We can assign all 5 elements to separate variables in one step.

record = ["a11", "b22", "c33", "d44", "e55"]
a, b, c, d, e, = record
print a
print b
# etc.

In our solution, we use this in the for loop.
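As an illustration of unpacking inside a for loop, here is a small sketch, assuming Python 3. The records list is written out by hand to match what line.split() gives for the data rows in the question, and names_by_prefix is a made-up name for this example only.

```python
# Records as produced by line.split() for the data rows in the question.
records = [
    ["1.", "XYZ", "1101111", "7.1", "2014"],
    ["2.", "ZYX", "1101113", "8.2", "2014"],
    ["3.", "Abc", "1010101", "9.1", "2014"],
]

names_by_prefix = {}
# Each 5-element record is unpacked into five loop variables at once.
for srno, name, enrolment, cgpa, year in records:
    names_by_prefix.setdefault(enrolment[:2], []).append(name)

print(names_by_prefix)

# A record with a different number of fields cannot be unpacked this way.
try:
    a, b, c, d, e = ["too", "short"]
except ValueError as exc:
    error = str(exc)
    print("unpacking failed:", error)
```

The ValueError at the end is the reason the solution relies on every data line having exactly five whitespace-separated fields.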

with: the context manager automatically calls close() on the created variable

Processing a file in the following way is a typical idiom:

fname = "something.txt"
with open(fname) as f:
    # process the file

# do not call `f.close()` as it gets closed at the moment inner `with` block is left.

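The with construct uses a so-called "context manager", which can provide a value when the block is entered (on the with line) and do something when the block is left. In our case, it calls close().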
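To make the enter/leave behaviour visible, here is a minimal sketch, assuming Python 3. Tracker is a hypothetical class written only for this illustration, and the temporary file path is likewise just for the demo.

```python
import os
import tempfile


class Tracker:
    """Hypothetical context manager that records when it is entered and left."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self              # this value is bound by `as`

    def __exit__(self, exc_type, exc, tb):
        self.events.append("exit")
        return False             # do not suppress exceptions


with Tracker() as t:
    t.events.append("body")

# File objects behave the same way: leaving the block closes the file.
path = os.path.join(tempfile.mkdtemp(), "something.txt")
with open(path, "w") as f:
    f.write("hello\n")
    closed_inside = f.closed
closed_after = f.closed

print(t.events, closed_inside, closed_after)
```

The __exit__ method also runs when the block is left because of an exception, which is why the with form is generally preferred over a manual f.close().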

Answer 1: (score: 0)

It seems you want the Sr.No. as the key; I have included both lists and dicts.

y_10=[]
y_11=[]
with open("as.txt",'r') as f: # no need for f.close() when you use "with open" as the file is automatically closed
   lines = [x.split() for x in f.readlines()[2:]]
   for line in lines: 
       if line[2].startswith("10"): # check if the 3rd element starts with "10"
           y_10.append(line) # if so add to year 10 list
       else:
           y_11.append(line) # else it starts with "11" so add to year eleven list
print  y_10,y_11
[['3.', 'Abc', '1010101', '9.1', '2014']] [['1.', 'XYZ', '1101111', '7.1', '2014'], ['2.', 'ZYX', '1101113', '8.2', '2014']]

# make dicts using zip, where the first element of each list is the key and the rest are the values
y_10_dict = dict(zip([x[0] for x in y_10], [y[1:] for y in y_10])) #
y_11_dict = dict(zip([x[0] for x in y_11], [y[1:] for y in y_11]))

print  y_10_dict,y_11_dict
{'3.': ['Abc', '1010101', '9.1', '2014']} {'2.': ['ZYX', '1101113', '8.2', '2014'], '1.': ['XYZ', '1101111', '7.1', '2014']}
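The dict(zip(...)) construction can also be written as a dict comprehension (available in Python 2.7 and Python 3), which walks each list only once. A short sketch, assuming Python 3 and a hand-written y_11 list matching the output shown; via_zip and via_comprehension are illustrative names.

```python
y_11 = [
    ["1.", "XYZ", "1101111", "7.1", "2014"],
    ["2.", "ZYX", "1101113", "8.2", "2014"],
]

# The form used in the answer: build keys and values separately, then zip them.
via_zip = dict(zip([x[0] for x in y_11], [y[1:] for y in y_11]))

# Equivalent dict comprehension: first element is the key, the rest the value.
via_comprehension = {row[0]: row[1:] for row in y_11}

print(via_zip == via_comprehension)
```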

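Answer 2: (score: 0)

It is not clear from the question how you want to store the variables.

This reads the file, skips the first two lines, and stores the data in 2 dictionaries: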

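fir_y = {}  # first-year students, keyed by name
sec_y = {}  # second-year students, keyed by name
with open("as.txt") as f:  # opens the file
    raw = f.read().splitlines()[2:]  # reads the file, splits it into lines, skips the first two
    for v in raw:
        var = v.split()  # split on any run of whitespace; split(" ") would yield empty strings
        if not var:  # skip blank lines
            continue
        if var[2][0:2] == "11":  # if the enrollment number starts with 11
            sec_y[var[1]] = [var[2], var[3], var[4]]
            # dict[key] = value
        else:
            fir_y[var[1]] = [var[2], var[3], var[4]]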


{'Abc': ['1010101', '9.1', '2014']}
{'XYZ': ['1101111', '7.1', '2014'], 'ZYX': ['1101113', '8.2', '2014']}

Alternatively, you can store the data in lists. It is almost the same; you just use .append():

fir_y = []
sec_y = []
with open("as.txt") as f:
    raw = f.read().splitlines()[2:]  # skip the first two lines
    for v in raw:
        var = v.split()  # split on any run of whitespace
        if not var:  # skip blank lines
            continue
        if var[2][0:2] == "11":
            sec_y.append([var[1], var[2], var[3], var[4]])
        else:
            fir_y.append([var[1], var[2], var[3], var[4]])

[['Abc', '1010101', '9.1', '2014']]
[['XYZ', '1101111', '7.1', '2014'], ['ZYX', '1101113', '8.2', '2014']]
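One detail worth checking when splitting lines like these: str.split(" ") splits at every single space, so a file whose columns are aligned with runs of spaces produces many empty strings, whereas str.split() with no argument treats any run of whitespace as one separator. A small sketch, assuming Python 3 and the data rows from the question; split_by_prefix is a hypothetical helper name, not something taken from the answers.

```python
row = "1.          XYZ     1101111             7.1     2014        "

by_single_space = row.split(" ")  # one split per space: many empty strings
by_whitespace = row.split()       # runs of whitespace count as one separator


def split_by_prefix(lines, prefix="11"):
    """Hypothetical helper: return (matching, others) lists of fields."""
    matching, others = [], []
    for line in lines:
        fields = line.split()
        if not fields:            # skip blank lines
            continue
        # fields[2] is the enrollment number column
        target = matching if fields[2].startswith(prefix) else others
        target.append(fields[1:])
    return matching, others


rows = [
    row,
    "2.          ZYX     1101113             8.2     2014        ",
    "3.          Abc     1010101             9.1     2014        ",
    "",
]
sec_y, fir_y = split_by_prefix(rows)

print(by_single_space[:3])
print(by_whitespace)
print(fir_y)
print(sec_y)
```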

Also, when you open a file with "with open("__", "_") as x:", you do not need to close it afterwards. It is closed automatically.