Splitting a string on multiple delimiters

Time: 2016-04-05 22:34:05

Tags: python list

I am reading a file with the following contents:

87965164,Paris,Yu,6/27/1997
87965219,Heath,Moss,10/13/1996
87965187,Cale,Blankenship,10/22/1995
87965220,Terrence,Watkins,12/7/1996
87965172,Ansley,Padilla,3/30/1997

I need to split each line on "," and "/", and then strip the trailing "\n".

I want my output to look like this when it is put into a list:

[['87965164', 'Paris', 'Yu', 6, 27, 1997], ['87965219', 'Heath', 'Moss', 10, 13, 1996], ['87965187', 'Cale', 'Blankenship', 10, 22, 1995], ['87965220', 'Terrence', 'Watkins', 12, 7, 1996], ['87965172', 'Ansley', 'Padilla', 3, 30, 1997]]

4 answers:

Answer 0: (score: 2)

You're going to want regular expressions.

import re

results = []
# fl is assumed to be an already opened file object
for line in fl:
    # [,/] means "match if either a , or a / is present"
    results.append(re.split(r'[,/]', line.strip()))

If you have a particularly large file, you can wrap this in a generator:

import re

def splitter(fl):
    for line in fl:
        # By using a generator, you only access one line of the file at a time.
        yield re.split(r'[,/]', line.strip())
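
A minimal sketch of how such a generator might be consumed. Here `io.StringIO` stands in for a real open file (an assumption made so the example is self-contained), and the sample rows are taken from the question. Note that every field, including the date parts, is still a string at this point.

```python
import io
import re


def splitter(fl):
    """Yield each line of fl split on ',' or '/'."""
    for line in fl:
        yield re.split(r'[,/]', line.strip())


# io.StringIO stands in for an open file object (assumption for this sketch).
sample = io.StringIO(
    "87965164,Paris,Yu,6/27/1997\n"
    "87965219,Heath,Moss,10/13/1996\n"
)

# Rows are produced one at a time; list() is used here only to inspect them.
rows = list(splitter(sample))
print(rows[0])
```

With a real file, iterating over `splitter(fl)` in a `for` loop instead of calling `list()` would keep only one line in memory at a time.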

Answer 1: (score: 1)

Simpler than regular expressions:

# text is assumed to hold the whole file contents, e.g. text = f.read()
[line.replace('/', ',').split(',') for line in text.splitlines()]

Afterwards you can convert the numbers to int.
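
One way the conversion step might look, as a rough sketch. It assumes the date is always the last field, so only the final three items are converted and the ID stays a string, matching the output requested in the question. The `text` sample is taken from the question's data.

```python
# Sample input from the question; in practice this would come from f.read().
text = (
    "87965164,Paris,Yu,6/27/1997\n"
    "87965219,Heath,Moss,10/13/1996\n"
)

rows = []
for line in text.splitlines():
    fields = line.replace('/', ',').split(',')
    # Keep the ID and names as strings; convert only month, day, year.
    fields[-3:] = [int(part) for part in fields[-3:]]
    rows.append(fields)

print(rows)
```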

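However, I believe you are going about this the wrong way. The right approach is to split on commas and then give the special field its own dedicated handling.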

from datetime import datetime
from collections import namedtuple

Person = namedtuple('Person', ['idn', 'first', 'last', 'birth'])

def make_person(idn, first, last, birth):
    return Person(idn, first, last,
                  datetime.strptime(birth, "%m/%d/%Y"))

# splitlines() avoids an empty trailing entry when text ends with a newline
records = [make_person(*line.split(',')) for line in text.splitlines()]
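
A short sketch of what keeping the date as a `datetime` might buy you. The sample `text` comes from the question, and the `oldest` lookup is only an illustration, not part of the original answer.

```python
from collections import namedtuple
from datetime import datetime

Person = namedtuple('Person', ['idn', 'first', 'last', 'birth'])


def make_person(idn, first, last, birth):
    # Parse the month/day/year string into a datetime object.
    return Person(idn, first, last, datetime.strptime(birth, "%m/%d/%Y"))


# Sample input from the question; in practice this would come from f.read().
text = (
    "87965164,Paris,Yu,6/27/1997\n"
    "87965219,Heath,Moss,10/13/1996\n"
)

records = [make_person(*line.split(',')) for line in text.splitlines()]

# Fields are accessed by name, and dates compare directly.
print(records[0].first, records[0].birth.year)
oldest = min(records, key=lambda person: person.birth)
print(oldest.first)
```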

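Answer 2: (score: 1)

I would advise against storing heterogeneous data in a homogeneous data type; instead, use a dictionary or create a class.

Using a dictionary: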

results = {}
with open('in.txt') as f:
    for line in f:
        # Strip the trailing newline before splitting on commas
        id, first, last, date = line.strip().split(',')
        month, day, year = map(int, date.split('/'))
        results[id] = {'id':id, 'first':first, 'last':last,
                       'month':month, 'day':day, 'year':year}

Using a class:

class Person:
    def __init__(self, id, first, last, day):
        self.id = id
        self.first = first
        self.last = last
        self.month, self.day, self.year = map(int, day.split('/'))

results = {}
with open('in.txt') as f:
    for line in f:
        id, first, last, day = line.split(',')
        results[id] = Person(id, first, last, day)

Note that in each case I store each person's information as an entry in a dictionary, keyed by their ID number.
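
A minimal sketch of why keying by ID can be convenient, using the dictionary variant. The helper name `load_people` is hypothetical, and `io.StringIO` stands in for `open('in.txt')` so the example runs on its own.

```python
import io


def load_people(lines):
    """Build a dict of per-person dicts keyed by ID (hypothetical helper)."""
    results = {}
    for line in lines:
        id_, first, last, date = line.strip().split(',')
        month, day, year = map(int, date.split('/'))
        results[id_] = {'id': id_, 'first': first, 'last': last,
                        'month': month, 'day': day, 'year': year}
    return results


# io.StringIO stands in for open('in.txt') in this sketch.
sample = io.StringIO(
    "87965164,Paris,Yu,6/27/1997\n"
    "87965219,Heath,Moss,10/13/1996\n"
)

people = load_people(sample)

# A person can be looked up directly by ID without scanning a list.
print(people['87965219']['last'])
```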

Answer 3: (score: 0)

For each line:

parts = line.split(',')
parts[-1:] = map(int, parts[-1].split('/'))

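This correctly handles input that contains slashes in the non-date parts, and makes it easy to handle the conversion to integers at the same time.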
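
Wrapped in a function, this approach might look like the sketch below. The name `parse_line` is hypothetical, and the third sample row (with a slash inside a name) is invented purely to illustrate the point about non-date fields.

```python
def parse_line(line):
    """Split on commas, then expand the last field into int month/day/year."""
    parts = line.strip().split(',')
    # Replace the single date string with three integers.
    parts[-1:] = map(int, parts[-1].split('/'))
    return parts


lines = [
    "87965164,Paris,Yu,6/27/1997\n",
    "87965220,Terrence,Watkins,12/7/1996\n",
    # Invented row: the slash in the name must not be split.
    "1,A/B,C,1/2/2000\n",
]

result = [parse_line(line) for line in lines]
print(result)
```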