My file contains many strings such as:
V004_2aB181500181559aB182000191659
or
V001_2a194300194359a203100203159
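Each letter represents a member and is followed by that member's attendance time.
This means that in line 1, members a and B were present from [18h15min00sec - 18h15min59sec]
and from [18h20min00sec - 19h16min59sec]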
......
I want to create an array that summarizes all attendances, for example:
table = [Member,Start_hour,End_Hour]
Answer 0 (score: 0)
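Based on your input, see whether the following works.
I am assuming that the fields are static, so that everything up to the underscore and before the first letter can be stripped, and that each string contains one user's data.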
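To make the prefix assumption concrete, here is a minimal sketch of the stripping step on its own. The helper name `strip_prefix` is hypothetical, and the pattern assumes the prefix is always "anything, an underscore, then digits"; the question does not say whether the digit after the underscore is always 2, so the sketch does not rely on that.

```python
import re

def strip_prefix(record):
    # Hypothetical helper: drop everything up to the underscore plus any
    # digits right after it, so the result starts at the first member letter.
    return re.sub(r'^[^_]*_\d*', '', record)

print(strip_prefix("V004_2aB181500181559aB182000191659"))
# aB181500181559aB182000191659
print(strip_prefix("V001_2a194300194359a203100203159"))
# a194300194359a203100203159
```

If the prefix can take other shapes in the real file, the pattern would need to be adjusted accordingly.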
Demo code [updated]
import re

def mySplit(s):
    # Filter first part: keep only what follows the last '_2'
    thisStr = s.rsplit('_2', 1)[1]
    # Split the string at runs of digits with re.split and \d+; the capturing
    # group keeps the digit runs, and empty strings are discarded
    return [part for part in re.split(r'(\d+)', thisStr) if part]

def prepMemberData(aList):
    memData = []
    members = []
    for item in aList:
        if item.isalpha():
            # Each letter is one member, e.g. 'aB' -> ['a', 'B']
            members = list(item)
        elif item.isdigit():
            # 12 digits: HHMMSS (start) followed by HHMMSS (end)
            hr1, mi1, se1, hr2, mi2, se2 = re.findall(r'\d{2}', item)
            times = hr1 + "h" + mi1 + "m" + se1 + "s" + "," + hr2 + "h" + mi2 + "m" + se2 + "s"
            # Every member of the preceding letter group shares this time range
            for member in members:
                memData.append(member + ", " + times)
    return memData

# Process first string
myStr = "V004_2aB181500181559aB182000191659"
myLst1 = mySplit(myStr)
print(prepMemberData(myLst1))
# Process second string
myStr2 = "V001_2a194300194359a203100203159"
myLst2 = mySplit(myStr2)
print(prepMemberData(myLst2))
Output [updated]
Python 2.7.9 (default, Dec 10 2014, 12:24:55) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> ================================ RESTART ================================
>>>
['a, 18h15m00s,18h15m59s', 'B, 18h15m00s,18h15m59s', 'a, 18h20m00s,19h16m59s', 'B, 18h20m00s,19h16m59s']
['a, 19h43m00s,19h43m59s', 'a, 20h31m00s,20h31m59s']
>>>
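If you want rows shaped like the `table = [Member,Start_hour,End_Hour]` from the question rather than one formatted string per member, a single regular expression can do the grouping. This is only a sketch under stated assumptions: every group is one or more member letters followed by exactly 12 digits (HHMMSS start, HHMMSS end), and the names `GROUP`, `fmt` and `parse_attendance` are made up for illustration. It is written for Python 3.

```python
import re

# Assumed layout of one group: member letters, start HHMMSS, end HHMMSS.
GROUP = re.compile(r'([A-Za-z]+)(\d{6})(\d{6})')

def fmt(hhmmss):
    # '181500' -> '18:15:00'
    return ":".join((hhmmss[0:2], hhmmss[2:4], hhmmss[4:6]))

def parse_attendance(record):
    # Hypothetical helper: returns a list of [member, start, end] rows.
    # Only the part after the first underscore is scanned.
    payload = record.partition('_')[2]
    rows = []
    for letters, start, end in GROUP.findall(payload):
        for member in letters:
            rows.append([member, fmt(start), fmt(end)])
    return rows

for row in parse_attendance("V004_2aB181500181559aB182000191659"):
    print(row)
# ['a', '18:15:00', '18:15:59']
# ['B', '18:15:00', '18:15:59']
# ['a', '18:20:00', '19:16:59']
# ['B', '18:20:00', '19:16:59']
```

To summarize a whole file, you could call `parse_attendance` on each line and extend one shared list with the returned rows.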