How does file handling work in this case?

Time: 2012-12-09 04:05:04

Tags: python

"Address file"

100 Main Street 
23  Spring Park Road 
2012 Sunny Lane 
4 Martin Luther King Drive 

"Address list"

[['100', 'Main', 'Street'],
 ['23', 'Spring Park', 'Road'],
 ['2012', 'Sunny', 'Lane'],
 ['4', 'Martin Luther King', 'Drive']]

numbers_file = open("address_file.txt", "r")

def load_addresses(numbers_file):
    addresses = []  # <-- Create a list to hold the sublists
    for line in numbers_file:
        address = []  # <-- Create a sublist
        parts = line.split()  # <-- Split the line into a list on whitespace
        address.append(parts[0])  # <-- I know this takes the first element of the list and appends it to the end of the sublist
        name = ''  # <-- Name used to join parts such as 'Spring' 'Park' into 'Spring Park'
        for i in range(1, len(parts) - 1):  # <-- Why is the range like this? Is it because we skip the first element since it is already in good form, and subtract 1 because it is an index?
            name += parts[i] + ' '  # <-- ??
        address.append(name.strip())  # <-- I guess this strips whitespace from the front and back
        address.append(parts[-1])  # <-- ???
        addresses.append(address)  # <-- Append the sublist to the list

    return addresses

The lines I marked with ??? are the confusing parts. Can someone clarify?

2 answers:

Answer 0 (score: 1)

def line_split(line):
    ls = line.split()
    return [ls[0], ' '.join(ls[1:-1]), ls[-1]]

with open(datafile) as fin:
    address_list = [line_split(line) for line in fin]
    # address_list = map(line_split, fin)  # would also work
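
As a hedged aside on the `map` alternative: in Python 3, `map` returns a lazy iterator rather than a list, so it needs to be wrapped in `list(...)` (and consumed while the file is still open) to behave like the list comprehension. A minimal sketch, using an in-memory list of lines as a stand-in for the file (the `lines` variable is an assumption for illustration):

```python
def line_split(line):
    ls = line.split()
    return [ls[0], ' '.join(ls[1:-1]), ls[-1]]

# Hypothetical stand-in for the lines read from the file.
lines = ["100 Main Street\n", "2012 Sunny Lane\n"]

with_comprehension = [line_split(line) for line in lines]
with_map = list(map(line_split, lines))  # list() forces evaluation in Python 3

print(with_comprehension == with_map)  # True
```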

To explain the lines you marked with questions:

for i in range(1, len(parts) - 1):

This loops over the indices of the list, but skips the first and last index. A more idiomatic way to write it is:

for part in parts[1:-1]:

You would then replace parts[i] with part in the loop body.
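
A small sketch of that equivalence, using one of the sample lines from the question (the variable names `by_index` and `by_slice` are made up for illustration):

```python
parts = "4 Martin Luther King Drive".split()

# Index-based loop, as in the question: visits indices 1 .. len(parts) - 2.
by_index = [parts[i] for i in range(1, len(parts) - 1)]

# Slice-based loop: parts[1:-1] drops the first and the last element.
by_slice = [part for part in parts[1:-1]]

print(by_index)  # ['Martin', 'Luther', 'King']
print(by_slice)  # ['Martin', 'Luther', 'King']
```

Both forms leave out the house number (index 0) and the street type (the last index), so only the words of the street name remain.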

name += parts[i] + ' ' # <--- ??

This takes name and appends parts[i] followed by a space (' ') to it. In other words, it is equivalent to any of the following:

name = name + parts[i] + ' '
name = "%s%s "%(name,parts[i])
name = "{0}{1} ".format(name,parts[i])
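
To check that these forms really agree, here is a minimal sketch with fixed values standing in for `name` and `parts[i]` (the values `'Spring '` and `'Park'` are just an example taken from the sample data):

```python
name = 'Spring '
part = 'Park'  # stands in for parts[i]

a = name + part + ' '
b = "%s%s " % (name, part)
c = "{0}{1} ".format(name, part)

name += part + ' '  # the form used in the question

print(a == b == c == name)  # True
print(repr(name))           # 'Spring Park '
```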

And the line:

address.append(parts[-1]) # <---???

appends the last element of the parts list to the address list.
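
Putting the pieces together, a rough sketch of the whole function might look like the following. It reuses the `load_addresses` name from the question, but feeds it an `io.StringIO` object instead of a real file; that stand-in and the blank-line check are assumptions added for illustration, not part of the original code.

```python
import io

def load_addresses(lines):
    addresses = []
    for line in lines:
        parts = line.split()
        if not parts:  # assumption: silently skip blank lines
            continue
        # parts[0] is the number, parts[-1] the last word,
        # and everything in between is the street name.
        addresses.append([parts[0], ' '.join(parts[1:-1]), parts[-1]])
    return addresses

# In-memory stand-in for address_file.txt, using the sample data.
sample = io.StringIO(
    "100 Main Street \n"
    "23  Spring Park Road \n"
    "2012 Sunny Lane \n"
    "4 Martin Luther King Drive \n"
)

address_list = load_addresses(sample)
for address in address_list:
    print(address)
```

Because `split()` with no arguments collapses runs of whitespace, the double space in the second line and the trailing spaces do not produce empty strings.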

Answer 1 (score: 0)

Maybe this will help:

>>> line = '100 Main Street'
>>> parts = line.split()
>>> name = ''
>>> len(parts)-1
2
>>> for i in range(1, 2):
...     print(parts[i] + ' ')
...     print(parts[-1])
...
Main # <-- there is an extra space here after 'Main'
Street
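
The same kind of trace for a longer street name shows why the question's code calls `strip()` afterwards. This is only a sketch; the `steps` list is a hypothetical helper added to record the intermediate values of `name`:

```python
line = '4 Martin Luther King Drive'
parts = line.split()

name = ''
steps = []  # hypothetical helper: records name after each iteration
for i in range(1, len(parts) - 1):
    name += parts[i] + ' '
    steps.append(name)

print(steps)          # ['Martin ', 'Martin Luther ', 'Martin Luther King ']
cleaned = name.strip()
print(repr(cleaned))  # 'Martin Luther King'
```

Every iteration leaves a trailing space, so the final `strip()` is what removes the extra space at the end of the name.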