I have two numpy arrays holding daily values and their time stamps:
A = [[ 0.1 0.05 0.05 0.05 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 .......]]
T = [['19730101' '19730102' '19730103' '19730104' '19730105' '19730106' ....... '19931231']]
and I would like to split A into one sub-array per month, for example:
s = numpy.split(A,condition) # condition is when there is a change in month index in T
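I am not sure how to track where the month index changes. Any suggestions would be much appreciated.
Answer 0 (score: 2)
I think this should do it. There may be a faster or more concise way with numpy, but I think this is simple enough.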
A = [0.1, 0.05, 0.05, 0.05, 0.1, 0.1, 0.1]
T = ['19730101', '19730102', '19730103', '19730104', '19730105', '19730106', '19931231']
combined = zip(A, T)
combined = sorted(combined, key=lambda x: x[1])  # Sort on timestamp
splits = []
current_month = None
for a, t in combined:
    month = t[:6]  # Year + month, so the same month of different years is never merged
    print(month)
    if month != current_month:
        splits.append([a])  # Start a new split
        current_month = month
    else:
        splits[-1].append(a)  # Add to current split
print(splits)
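The same boundary-tracking idea can also be sketched with numpy alone, which is closer to the `numpy.split` call in the question. This is only a rough sketch under a few assumptions: the sample values below are made up for illustration, both arrays are 1-D (the arrays in the question look 2-D, so they would need something like `A.ravel()` first), and `T` is already sorted in time.

```python
import numpy as np

# Small illustrative inputs (made up for this sketch, not the asker's data).
A = np.array([0.1, 0.05, 0.05, 0.1, 0.1, 0.2])
T = np.array(['19730101', '19730102', '19730131',
              '19730201', '19730202', '19730301'])

# Year+month key for each day, e.g. '197301'. Assumes T is sorted in time.
months = np.array([t[:6] for t in T])

# Positions where the key differs from the previous one: a new month starts there.
boundaries = np.flatnonzero(months[1:] != months[:-1]) + 1

# np.split cuts A at those positions, giving one sub-array per month.
monthly = np.split(A, boundaries)

# Pair each sub-array with the key of its first element.
for key, chunk in zip(months[np.r_[0, boundaries]], monthly):
    print(key, chunk)
```

Here `boundaries` plays the role of the `condition` argument in the question: `np.split` accepts a sorted list of indices at which to cut.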
Answer 1 (score: 2)
You can do this easily with pandas:
>>> import pandas
>>> T = ['20140101', '20140102', '20140201', '20140202']
>>> A = [0.1, 0.2, 0.3, 0.4]
>>> s = pandas.Series(A, T)
>>> groups = s.groupby(lambda i: i[:6])  # group on the 'YYYYMM' prefix of each index label
>>> for month, group in groups:
...     print(month)
...     print(group)
201401
20140101    0.1
20140102    0.2
dtype: float64
201402
20140201    0.3
20140202    0.4
dtype: float64
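If plain numpy arrays per month are wanted rather than Series, one possible variation is to parse the date strings first and group on a monthly period. This is a sketch only; the name `by_month` and the choice of a dict keyed by strings like '2014-01' are assumptions made for illustration.

```python
import pandas as pd

T = ['20140101', '20140102', '20140201', '20140202']
A = [0.1, 0.2, 0.3, 0.4]

# Parse the 'YYYYMMDD' strings into real timestamps for the index.
s = pd.Series(A, index=pd.to_datetime(T, format='%Y%m%d'))

# Group by calendar month and keep each group's values as a numpy array.
by_month = {str(period): group.to_numpy()
            for period, group in s.groupby(s.index.to_period('M'))}

for month, values in by_month.items():
    print(month, values)
```

Working with a real datetime index also makes it straightforward to group by year, quarter, or week later without changing the string slicing.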
Or you can use pure Python, although it may be less efficient:
>>> groups = {}
>>> for t, a in zip(T, A):
...     month = t[:6]
...     groups.setdefault(month, []).append((t, a))  # keep (timestamp, value) pairs per month
>>> for month, group in sorted(groups.items()):  # sort so months print in order
...     print(month)
...     print(group)
201401
[('20140101', 0.1), ('20140102', 0.2)]
201402
[('20140201', 0.3), ('20140202', 0.4)]