I have a list whose input comes in this form:
open_info = ['Cube 1, 9:30am to 10:00am, Thursday, March 3, 2016', 'Cube 2, 5:00pm to 5:30pm, Thursday, March 3, 2016']
I want to parse this information and create a new list of this form:
open_times = [[9, 30, 'am'],[5, 0, 'pm']]
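The first index is the hour, the second is the minute, and the third is am/pm. I only record the first time of each list element, because the intervals I work with are always 30 minutes long.
I did this with the following Python list comprehensions: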
open_times = [x.split(",")[1].replace(" ","").split("to") for x in open_info]
open_times = [x[0].split(":")+x[1].split(":") for x in open_times]
open_times = [[int(x[0]),int(x[1][:2]),x[1][2:]] for x in open_times]
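I would like to know whether all of this can be written as a single nested list comprehension. I have looked at the Python documentation and read a few blog posts on the topic, but I still could not get it to work.
Answer 0 (score: 2)
To answer the question of how to "nest" list comprehensions, you can do this to combine lines 1 and 2: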
open_times = [y[0].split(":")+y[1].split(":") for y in [x.split(",")[1].replace(" ","").split("to") for x in open_info]]
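...but that is really a mess. The three separate lines are easier to understand and cleaner here. You might also consider writing it as a series of loops, since there is a lot going on inside the comprehension that would be clearer when spelled out in a loop.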
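As a rough sketch of what the loop version might look like (the variable names `entry`, `time_range`, `start` and `rest` are only illustrative, and it assumes every entry has the same "name, start to end, date" layout as the sample data):

```python
open_info = [
    'Cube 1, 9:30am to 10:00am, Thursday, March 3, 2016',
    'Cube 2, 5:00pm to 5:30pm, Thursday, March 3, 2016',
]

open_times = []
for entry in open_info:
    # Second comma-separated field with spaces removed, e.g. '9:30amto10:00am'
    time_range = entry.split(",")[1].replace(" ", "")
    # Keep only the start time, e.g. '9:30am'
    start = time_range.split("to")[0]
    # Split into the hour and the 'minutes + am/pm' remainder, e.g. '9' and '30am'
    hour, rest = start.split(":")
    open_times.append([int(hour), int(rest[:2]), rest[2:]])

print(open_times)  # [[9, 30, 'am'], [5, 0, 'pm']]
```

Each intermediate value gets a name here, which makes it easier to see which step breaks if the input format ever changes.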
Answer 1 (score: 2)
You can use a regular expression to extract the time:
>>> import re
>>>
>>> [[int(val) if val.isdigit() else val for val in re.search(r'(\d+):(\d+)(am|pm)',item, re.I).groups()] for item in open_info]
[[9, 30, 'am'], [5, 0, 'pm']]
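However, if an item does not match the regex, re.search returns None and calling .groups() on it raises an AttributeError, so if you are not sure about the input you can use a try-except statement to handle the error: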
import re

times = []
for item in open_info:
    match = re.search(r'(\d+):(\d+)(am|pm)', item, re.I)
    try:
        h, m, b = match.groups()
    except (AttributeError, ValueError):
        pass  # or append a proper value to times, instead.
    else:
        times.append([int(h), int(m), b])
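An alternative to catching the exception is to test the match for None explicitly. The following is only a sketch: the names `TIME_RE` and `first_time` are made up for illustration, and returning None for an unparseable item is an assumption about what you want to happen in that case.

```python
import re

# Same pattern as above, compiled once and reused for every item.
TIME_RE = re.compile(r'(\d+):(\d+)(am|pm)', re.I)

def first_time(item):
    """Return [hour, minute, am/pm] for the first time in item, or None."""
    match = TIME_RE.search(item)
    if match is None:
        # No time found in this item.
        return None
    h, m, suffix = match.groups()
    return [int(h), int(m), suffix]

open_info = [
    'Cube 1, 9:30am to 10:00am, Thursday, March 3, 2016',
    'Cube 2, 5:00pm to 5:30pm, Thursday, March 3, 2016',
]

times = [first_time(item) for item in open_info]
print(times)                         # [[9, 30, 'am'], [5, 0, 'pm']]
print(first_time('Cube 3, closed'))  # None
```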
Answer 2 (score: 2)
You can use the following:
open_info = ['Cube 1, 9:30am to 10:00am, Thursday, March 3, 2016', 'Cube 2, 5:00pm to 5:30pm, Thursday, March 3, 2016']
answer = [[int(s.split(':', 1)[0][-2:]), int(s.split(':')[1][:2]),
           s.split(':')[1][2:4]] for s in open_info]
print(answer)
Output
[[9, 30, 'am'], [5, 0, 'pm']]
However, in cases like this, using map instead of a list comprehension may be more readable:
def func(s):
    hour = int(s.split(':')[0][-2:])
    minute = int(s.split(':')[1][:2])
    suffix = s.split(':')[1][2:4]
    return [hour, minute, suffix]

# In Python 3, map returns a lazy iterator, so wrap it in list() before printing.
answer = list(map(func, open_info))
print(answer)
Output
[[9, 30, 'am'], [5, 0, 'pm']]
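Answer 3 (score: 2)
You can simply write a processing function instead of pushing all the logic into the list comprehension expression.
I renamed some of the values for better readability.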
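def extract(s):
    # Second comma-separated field, split into start and end times.
    time_from, time_to = s.split(",")[1].replace(" ", "").split("to")
    hour, min_am_pm = time_from.split(":")
    minute = min_am_pm[:2]   # first two characters are the minutes
    am_pm = min_am_pm[2:]    # the rest is 'am' or 'pm'
    return [int(hour), int(minute), am_pm]
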
open_info = ['Cube 1, 9:30am to 10:00am, Thursday, March 3, 2016', 'Cube 2, 5:00pm to 5:30pm, Thursday, March 3, 2016']
open_times = [extract(x) for x in open_info]
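If you also want the time itself validated, a function like this could hand the start time to `datetime.strptime` instead of slicing the string by hand. This is a different technique from the one above, shown only as a sketch: `extract_with_strptime` is a hypothetical name, and matching "am"/"pm" with `%p` assumes the default C/English locale.

```python
from datetime import datetime

def extract_with_strptime(s):
    # Start time of the second comma-separated field, e.g. '9:30am'
    start = s.split(",")[1].split("to")[0].strip()
    # %I = 12-hour clock hour, %M = minutes, %p = AM/PM (matched case-insensitively)
    parsed = datetime.strptime(start, "%I:%M%p")
    # parsed.hour is on the 24-hour clock, so convert back to 12-hour form
    hour12 = parsed.hour % 12 or 12
    am_pm = 'am' if parsed.hour < 12 else 'pm'
    return [hour12, parsed.minute, am_pm]

open_info = [
    'Cube 1, 9:30am to 10:00am, Thursday, March 3, 2016',
    'Cube 2, 5:00pm to 5:30pm, Thursday, March 3, 2016',
]

open_times = [extract_with_strptime(x) for x in open_info]
print(open_times)  # [[9, 30, 'am'], [5, 0, 'pm']]
```

With this approach a malformed time raises a ValueError rather than silently producing a wrong entry.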
Answer 4 (score: -1)
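from csv import reader

# After splitting on ":", b looks like '30am to 10': the minutes followed by
# the start time's am/pm, so b[2:4] is the suffix of the first time.
answer = [[int(a), int(b[:2]), b[2:4]] for a, b, _ in (inf[1].split(":")
          for inf in reader(open_info, skipinitialspace=True))]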
This actually matches your expected output:
[[9, 30, 'am'], [5, 0, 'pm']]
A simple function is a better idea, and it also avoids repeatedly splitting the same line:
def spl(l):
    for inf in l:
        a, b, _ = inf.split(", ", 2)[1].split(":", 2)
        # b is e.g. '30am to 10': minutes, then the start time's am/pm
        yield [int(a), int(b[:2]), b[2:4]]

print(list(spl(open_info)))
Output:
[[9, 30, 'am'], [5, 0, 'pm']]
Or let the csv lib parse the items:
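from csv import reader

def spl(l):
    # skipinitialspace=True drops the space that follows each comma
    for inf in reader(l, skipinitialspace=True):
        a, b, _ = inf[1].split(":", 2)
        # b is e.g. '30am to 10': minutes, then the start time's am/pm
        yield [int(a), int(b[:2]), b[2:4]]

print(list(spl(open_info)))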