How to iterate over the rows of a DataFrame and determine values

Time: 2019-11-20 10:35:51

Tags: python pandas loops dataframe iteration

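I have a DataFrame called resources.

The use of resource Xy ended on 6 December 2018 (06-12-2018). Until Per 1 has finished using resource Xy for P1, Per 8 cannot use it for P2. I want to copy the end value of resource Xy in project P1 into the start column of project P2 (used by Per 8), replacing the default value. The end column is (start time + time allocated to that person). I want to iterate over the rows repeatedly until all default values (1970-01-01 00:00:00) have been replaced.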

t_mx = t_m.groupby(['Resource']).agg({'end' : max }).reset_index()
t_mx['end'] = t_m.apply(lambda row: row['new']+row['Time'] if row['new'] != pd.to_datetime('1970-01-01 00:00:00') else pd.to_datetime('1970-01-01 00:00:00'),axis = 1)

I created the columns with the code above.


I cannot work out how to proceed from here.

Project  Resource  Person  Allocation Time  Start             End
P1       Xy        Per 1   04:00            06-12-2018 08:00  06-12-2018 11:00
P1       Z         Per 2   05:00            06-12-2018 08:00  06-12-2018 12:00
P1       Y         Per 3   07:00            06-12-2018 08:00  06-12-2018 18:00
P1       X         Per 1   03:50            06-12-2018 08:00  06-12-2018 12:00
P2       X         Per 6   02:20            01-01-1970 00:00  01-01-1970 00:00
P2       Y         Per 7   01:25            01-01-1970 00:00  01-01-1970 00:00
P2       Xy        Per 8   02:30            01-01-1970 00:00  01-01-1970 00:00
P2       Xy        Per 9   14:00            01-01-1970 00:00  01-01-1970 00:00
P2       X         Per 7   12:35            01-01-1970 00:00  01-01-1970 00:00
P2       Y         Per 6   11:10            01-01-1970 00:00  01-01-1970 00:00
P2       Z         Per 11  13:45            01-01-1970 00:00  01-01-1970 00:00
P2       Z         Per 12  10:00            01-01-1970 00:00  01-01-1970 00:00
P3       X         Per 5   07:30            01-01-1970 00:00  01-01-1970 00:00

1 Answer:

Answer 0: (score: 0)

First, we load the data into pandas and convert the columns to suitable types:

import pandas as pd
from io import StringIO

txt = """
Project Resource    Person  Allocation Time start   end
P1  Xy  Per 1   4:00    06-12-2018 8:00 06-12-2018 11:00
P1  Z   Per 2   5:00    06-12-2018 8:00 06-12-2018 12:00
P1  Y   Per 3   7:00    06-12-2018 8:00 06-12-2018 18:00
P1  X   Per 1   3:50    06-12-2018 8:00 06-12-2018 12:00
P2  X   Per 6   2:20    01-01-1970 0:00 01-01-1970 0:00
P2  Y   Per 7   1:25    01-01-1970 0:00 01-01-1970 0:00
P2  Xy  Per 8   2:30    01-01-1970 0:00 01-01-1970 0:00
P2  Xy  Per 9   14:00   01-01-1970 0:00 01-01-1970 0:00
P2  X   Per 7   12:35   01-01-1970 0:00 01-01-1970 0:00
P2  Y   Per 6   11:10   01-01-1970 0:00 01-01-1970 0:00
P2  Z   Per 11  13:45   01-01-1970 0:00 01-01-1970 0:00
P2  Z   Per 12  10:00   01-01-1970 0:00 01-01-1970 0:00
P3  X   Per 5   7:30    01-01-1970 0:00 01-01-1970 0:00
"""

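# sep='\t' expects the columns in txt to be separated by tabs; if the tabs
# were turned into spaces when copying the sample, restore them first.
t_m = pd.read_csv(StringIO(txt), sep='\t')

# Parse the timestamps and turn the "H:MM" allocations into timedeltas.
t_m = t_m.assign(
    start=lambda f: pd.to_datetime(f.start),
    end=lambda f: pd.to_datetime(f.end),
    alloc_time=lambda f: pd.to_timedelta(f["Allocation Time"] + ':00'))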
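
One detail worth checking at this step: the question describes 06-12-2018 as 6 December, while `pd.to_datetime` without extra arguments reads such strings month-first. The short sketch below only illustrates the difference on a single sample string; whether the real data is day-first is an assumption you would need to confirm against your own source.

```python
import pandas as pd

raw = "06-12-2018 8:00"

# Default parsing treats the first field as the month.
month_first = pd.to_datetime(raw)

# dayfirst=True treats the first field as the day instead.
day_first = pd.to_datetime(raw, dayfirst=True)

print(month_first)  # 2018-06-12 08:00:00
print(day_first)    # 2018-12-06 08:00:00
```

If the dates are day-first, passing `dayfirst=True` to both `pd.to_datetime` calls in the conversion step shifts every timestamp accordingly; the scheduling logic itself is unaffected.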

We then iterate over the resource groups, each time using the previous end as the new start:

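out = t_m.copy()
for resource, group in t_m.groupby('Resource'):
    # The first row of each resource group carries the known start time.
    start = group.start.iloc[0]
    for ix, row in group.iterrows():
        # Each task starts when the previous task on this resource ended.
        out.loc[ix, 'start'] = start
        start = start + row['alloc_time']
        out.loc[ix, 'end'] = start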

out = out.drop('Allocation Time', axis=1)
print(out)
   Project Resource  Person               start                 end alloc_time
0       P1       Xy   Per 1 2018-06-12 08:00:00 2018-06-12 12:00:00   04:00:00
1       P1        Z   Per 2 2018-06-12 08:00:00 2018-06-12 13:00:00   05:00:00
2       P1        Y   Per 3 2018-06-12 08:00:00 2018-06-12 15:00:00   07:00:00
3       P1        X   Per 1 2018-06-12 08:00:00 2018-06-12 11:50:00   03:50:00
4       P2        X   Per 6 2018-06-12 11:50:00 2018-06-12 14:10:00   02:20:00
5       P2        Y   Per 7 2018-06-12 15:00:00 2018-06-12 16:25:00   01:25:00
6       P2       Xy   Per 8 2018-06-12 12:00:00 2018-06-12 14:30:00   02:30:00
7       P2       Xy   Per 9 2018-06-12 14:30:00 2018-06-13 04:30:00   14:00:00
8       P2        X   Per 7 2018-06-12 14:10:00 2018-06-13 02:45:00   12:35:00
9       P2        Y   Per 6 2018-06-12 16:25:00 2018-06-13 03:35:00   11:10:00
10      P2        Z  Per 11 2018-06-12 13:00:00 2018-06-13 02:45:00   13:45:00
11      P2        Z  Per 12 2018-06-13 02:45:00 2018-06-13 12:45:00   10:00:00
12      P3        X   Per 5 2018-06-13 02:45:00 2018-06-13 10:15:00   07:30:00
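
The same chaining can also be written without an explicit row loop. The following is a minimal sketch, not part of the original answer: it assumes, as the loop does, that the first row of each resource holds the real start time and that rows are already in the order in which the resource is used. The small `df` here is a hypothetical cut-down version of the table with only two resources.

```python
import pandas as pd

# Hypothetical miniature of the resource table (two resources only).
df = pd.DataFrame({
    "Resource": ["Xy", "Z", "Xy", "Z", "Xy"],
    "start": pd.to_datetime([
        "2018-06-12 08:00", "2018-06-12 08:00",
        "1970-01-01 00:00", "1970-01-01 00:00", "1970-01-01 00:00",
    ]),
    "alloc_time": pd.to_timedelta(
        ["04:00:00", "05:00:00", "02:30:00", "13:45:00", "14:00:00"]),
})

# Known start of each resource: the first row of its group.
first_start = df.groupby("Resource")["start"].transform("first")

# Running total of allocated time within each resource, computed in
# seconds and converted back to timedeltas.
elapsed_seconds = (
    df["alloc_time"].dt.total_seconds().groupby(df["Resource"]).cumsum()
)
elapsed = pd.to_timedelta(elapsed_seconds, unit="s")

# A task ends at the group's start plus everything allocated so far,
# and starts one allocation earlier.
df["end"] = first_start + elapsed
df["start"] = df["end"] - df["alloc_time"]
print(df)
```

The cumulative sum is taken over seconds rather than directly over the timedelta column only as a conservative choice; on large tables this form is likely to be faster than `iterrows`, but for a handful of rows the explicit loop is just as good and arguably easier to read.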