Split a string containing multiple parameters

Time: 2016-03-29 14:57:18

Tags: python arrays regex

My file contains many strings like these:

V004_2aB181500181559aB182000191659

V001_2a194300194359a203100203159

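Each letter represents a member and is followed by that member's attendance time.

This means that in line 1, members a and B were present from [18h15min00sec - 18h15min59sec] and again from [18h20min00sec - 19h16min59sec], and so on.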
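
To make the encoding concrete, here is a minimal sketch of how a single digit block could be decoded. It assumes every block is exactly 12 digits (HHMMSS for the start, then HHMMSS for the end); the names `decode_block` and `fmt` are hypothetical and not part of the original post.

```python
def fmt(hhmmss):
    """Render a 6-digit HHMMSS string as e.g. '18h15min00sec'."""
    return "%sh%smin%ssec" % (hhmmss[0:2], hhmmss[2:4], hhmmss[4:6])


def decode_block(block):
    """Split a 12-digit HHMMSSHHMMSS block into (start, end) strings."""
    # Assumption: a block always holds exactly one start and one end time
    if len(block) != 12 or not block.isdigit():
        raise ValueError("expected 12 digits, got %r" % block)
    return fmt(block[:6]), fmt(block[6:])


print(decode_block("181500181559"))
# ('18h15min00sec', '18h15min59sec')
```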

I want to build an array that summarizes all attendances, for example:

table = [Member,Start_hour,End_Hour]

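1 Answer:

Answer 0: (score: 0)

Based on your input, see whether the following works.

I am assuming the fields are static, so everything up to the underscore and the first alphanumeric character after it can be discarded. I am also assuming that each letter stands for one user and that each letter group is followed by its time range.

Demo code [updated]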

import re

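def mySplit(s):
    # Drop everything up to and including the "_2" prefix marker
    thisStr = s.rsplit('_2', 1)[1]

    # Split at runs of digits; the capture group keeps the digits in the result.
    # The list comprehension drops empty strings and works on Python 2 and 3.
    return [part for part in re.split(r'(\d+)', thisStr) if part]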


def prepMemberData(aList):
    # Pair every member letter with the time range that follows its group
    memData = []
    members = []

    for item in aList:
        if item.isalpha():
            # A letter group such as "aB" holds one member per letter
            members = list(item)

        elif item.isdigit():
            # 12 digits: HHMMSS (start) followed by HHMMSS (end)
            hr1, mi1, se1, hr2, mi2, se2 = re.findall(r'\d{2}', item)
            times = "%sh%sm%ss,%sh%sm%ss" % (hr1, mi1, se1, hr2, mi2, se2)

            # Every member of the preceding group shares this time range
            for member in members:
                memData.append("%s, %s" % (member, times))

    return memData


# Process first string
myStr = "V004_2aB181500181559aB182000191659"
myLst1 = mySplit(myStr)
print(prepMemberData(myLst1))

# Process second string
myStr2 = "V001_2a194300194359a203100203159"
myLst2 = mySplit(myStr2)
print(prepMemberData(myLst2))

Output [updated]

    Python 2.7.9 (default, Dec 10 2014, 12:24:55) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> ================================ RESTART ================================
>>> 
['a, 18h15m00s,18h15m59s', 'B, 18h15m00s,18h15m59s', 'a, 18h20m00s,19h16m59s', 'B, 18h20m00s,19h16m59s']
['a, 19h43m00s,19h43m59s', 'a, 20h31m00s,20h31m59s']
>>>
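
The question asked for rows of the form [Member, Start_hour, End_Hour] rather than formatted strings. As a possible variation, here is a minimal sketch that returns such rows with `datetime.time` values. It assumes each letter group is followed by exactly 12 digits and that the prefix ends one character after the underscore; `RECORD` and `attendance_table` are hypothetical names, not part of the answer above. It should run on both Python 2.7 and Python 3.

```python
import re
from datetime import time

# One letter group followed by exactly 12 digits (HHMMSS start, HHMMSS end)
RECORD = re.compile(r'([A-Za-z]+)(\d{12})')


def attendance_table(line):
    """Return [member, start, end] rows for one input line."""
    # Assumed layout: drop the prefix up to the underscore and the digit after it
    payload = line.split('_', 1)[1][1:]
    rows = []
    for members, digits in RECORD.findall(payload):
        # Cut the 12 digits into six two-digit integers
        nums = [int(digits[i:i + 2]) for i in range(0, 12, 2)]
        start, end = time(*nums[:3]), time(*nums[3:])
        # Each letter in the group is a separate member with the same range
        for member in members:
            rows.append([member, start, end])
    return rows


for row in attendance_table("V004_2aB181500181559aB182000191659"):
    print("%s %s %s" % (row[0], row[1].isoformat(), row[2].isoformat()))
```

Keeping the times as `datetime.time` objects instead of strings makes later steps such as sorting or comparing ranges simpler, at the cost of rejecting out-of-range values like hour 25 with a `ValueError`.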