Split a numpy array into subarrays based on a condition

Time: 2014-11-28 07:19:33

Tags: python arrays python-2.7 numpy

I have two numpy arrays, one of daily values and one of the corresponding time steps:

A = [[ 0.1   0.05  0.05  0.05  0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1 .......]]

T = [['19730101' '19730102' '19730103' '19730104' '19730105' '19730106' ....... '19931231']]

and I would like to split A into one subarray per month, for example:

s = numpy.split(A,condition) # condition is when there is a change in month index in T

I am not sure how to detect where the month changes in T. Any suggestions would be appreciated.

2 answers:

Answer 0 (score: 2)

I think this should do it. There may be a faster or more concise way using numpy, but I think this one is simple.

A = [0.1,   0.05,  0.05,  0.05,  0.1,   0.1,   0.1]
T = ['19730101', '19730102', '19730103', '19730104', '19730105', '19730106', '19931231']

# Pair each value with its timestamp and sort chronologically
combined = sorted(zip(A, T), key=lambda x: x[1])

splits = []
current_month = None
for a, t in combined:
    month = t[:6]  # Year + month, so the same month in different years stays separate
    if month != current_month:
        splits.append([a])  # Start a new split
        current_month = month
    else:
        splits[-1].append(a)  # Add to the current split
print(splits)
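
The numpy route mentioned above could look roughly like the sketch below. It is only one possible approach, assuming the timestamps are already in chronological order and are 'YYYYMMDD' strings. The helper name split_by_month and the sample dates are made up for illustration, and the arrays are flattened because the question shows them as 2-D.

```python
import numpy as np

def split_by_month(values, stamps):
    """Split values wherever the year-month prefix of stamps changes.

    Hypothetical helper; assumes stamps are sorted 'YYYYMMDD' strings.
    """
    values = np.asarray(values).ravel()              # flatten [[...]] to 1-D
    stamps = np.asarray(stamps, dtype=str).ravel()
    months = np.array([s[:6] for s in stamps])       # 'YYYYMM' key per day
    # Positions where the key differs from the previous one start a new month
    boundaries = np.flatnonzero(months[1:] != months[:-1]) + 1
    return np.split(values, boundaries)

# Illustrative data (not the asker's full series)
A = np.array([[0.1, 0.05, 0.05, 0.05, 0.1, 0.1, 0.1]])
T = np.array([['19730101', '19730102', '19730201', '19730202',
               '19730203', '19730301', '19931231']])
parts = split_by_month(A, T)
for part in parts:
    print(part)
```

numpy.split accepts a list of indices at which to cut, so the only real work is finding the positions where the month key changes.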

Answer 1 (score: 2)

You can do this easily with pandas:

>>> import pandas
>>> T = ['20140101', '20140102', '20140201', '20140202']
>>> A = [0.1, 0.2, 0.3, 0.4]
>>> s = pandas.Series(A, T)
>>> groups = s.groupby(lambda i: i[:6])
>>> for month, group in groups:
...     print(month)
...     print(group)
201401
20140101    0.1
20140102    0.2
dtype: float64
201402
20140201    0.3
20140202    0.4
dtype: float64

Or you can use pure Python, although it may be less efficient:

>>> groups = {}
>>> for t, a in zip(T, A):
...     month = t[:6]
...     groups.setdefault(month, []).append((t, a))
>>> for month, group in sorted(groups.items()):
...     print(month)
...     print(group)
201401
[('20140101', 0.1), ('20140102', 0.2)]
201402
[('20140201', 0.3), ('20140202', 0.4)]
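
If the timestamps are already sorted, the same pure-Python grouping can also be sketched with itertools.groupby from the standard library, which groups consecutive items sharing a key. The function name group_by_month is hypothetical; this is an alternative sketch, not part of the answer above.

```python
from itertools import groupby

def group_by_month(stamps, values):
    """Return [(yyyymm, [values...]), ...] for consecutive runs of a month.

    Hypothetical helper; assumes stamps are sorted 'YYYYMMDD' strings,
    because groupby only merges adjacent items with equal keys.
    """
    pairs = zip(stamps, values)
    return [(month, [a for _, a in chunk])
            for month, chunk in groupby(pairs, key=lambda p: p[0][:6])]

T = ['20140101', '20140102', '20140201', '20140202']
A = [0.1, 0.2, 0.3, 0.4]
result = group_by_month(T, A)
for month, group in result:
    print(month)
    print(group)
```

Unlike the dictionary version, this keeps the groups in input order and does not need a separate sort of the keys, but unsorted input would produce repeated month entries.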