Subsetting a dataframe: get the previous value from the original df that is not in the subset

Time: 2019-11-21 10:28:45

Tags: python python-3.x pandas dataframe

I have a dataframe like this:

                        A
2018-10-16 15:11:00     100
2018-10-16 15:11:07     101
2018-10-16 15:11:11     102
2018-10-16 15:11:12     101
2018-10-16 15:11:13     100
2018-10-16 15:11:17     110
2018-10-16 15:11:20     103
2018-10-16 15:11:41     99
2018-10-16 15:11:54     107

and the following sub-dataframe:

                        A
2018-10-16 15:11:11     102
2018-10-16 15:11:20     103
2018-10-16 15:11:41     99

I need to turn it into the following:

                        A       New
2018-10-16 15:11:11     102     101
2018-10-16 15:11:20     103     110
2018-10-16 15:11:41     99      110

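That is, for each row of the subset, take the previous value from the original dataframe that is not itself in the subset dataframe.

2 answers:

Answer 0: (score: 2)

Use concat with its default outer join on the index, combining the subset with the original values shifted by DataFrame.shift. Then replace the shifted values that match a subset value with missing values, forward-fill them, and finally drop the rows where column A is missing: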

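import pandas as pd
# Outer-join the subset with the original values shifted down by one row
df = pd.concat([df_subset['A'], df['A'].shift()], axis=1, keys=('A','new'), sort=True)
# Blank out shifted values that also appear in the subset, then forward-fill
df['new'] = df['new'].mask(df['new'].isin(df['A'])).ffill()
# Keep only the rows that belong to the subset
df = df.dropna(subset=['A'])
print(df)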
                         A    new
2018-10-16 15:11:11  102.0  101.0
2018-10-16 15:11:20  103.0  110.0
2018-10-16 15:11:41   99.0  110.0
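
For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the question's data and applies the same steps. The construction of `df` and `df_subset` and the name `out` are my own assumptions for illustration; the question does not show how the frames were created.

```python
import pandas as pd

# Rebuild the sample data from the question (construction is an assumption).
idx = pd.to_datetime([
    "2018-10-16 15:11:00", "2018-10-16 15:11:07", "2018-10-16 15:11:11",
    "2018-10-16 15:11:12", "2018-10-16 15:11:13", "2018-10-16 15:11:17",
    "2018-10-16 15:11:20", "2018-10-16 15:11:41", "2018-10-16 15:11:54",
])
df = pd.DataFrame({"A": [100, 101, 102, 101, 100, 110, 103, 99, 107]}, index=idx)
df_subset = df.loc[pd.to_datetime([
    "2018-10-16 15:11:11", "2018-10-16 15:11:20", "2018-10-16 15:11:41",
])]

# Same steps as the answer, written to a new name so df stays intact.
out = pd.concat([df_subset["A"], df["A"].shift()], axis=1, keys=("A", "new"), sort=True)
out["new"] = out["new"].mask(out["new"].isin(out["A"])).ffill()
out = out.dropna(subset=["A"])
print(out)
```

One thing to keep in mind: `isin` here compares values, not index labels, so a row outside the subset whose value happens to equal a subset value would be blanked as well. If values can repeat in your data, a label-based mask may be the safer choice.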

Answer 1: (score: 0)

I finally did it this way:

df:

    A
1   1000
2   1000
3   1001
4   1001
5   10
6   1000
7   1010
8   9
9   100000
10  6
11  999
12  10110
13  10111
14  1000

subnet_df:

    A
5   10
8   9
9   100000
10  6
12  10110
13  10111

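import numpy as np

new_column_name = 'NEW'
subnet_indexs = subnet_df.index
aux_df = df.copy()
# Copy A as float so the column can hold NaN
aux_df[new_column_name] = aux_df['A'].astype(float)
# Blank out the rows that belong to the subset, then forward-fill
aux_df.loc[subnet_indexs, new_column_name] = np.nan
aux_df[new_column_name] = aux_df[new_column_name].ffill()
subnet_df = aux_df.loc[subnet_indexs]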

    A       NEW
5   10      1001.0
8   9       1010.0
9   100000  1010.0
10  6       1010.0
12  10110   999.0
13  10111   999.0
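
The same label-based idea can be wrapped in a small function. This is only a sketch: `previous_outside_subset` is a hypothetical helper name of mine, and it assumes the index of `df` is unique and that every label in the subset also exists in `df`.

```python
import pandas as pd

def previous_outside_subset(df, subset, col="A", new_col="NEW"):
    """Hypothetical helper: for each subset row, return the last earlier
    value of `col` in `df` whose index label is not part of `subset`."""
    in_subset = df.index.isin(subset.index)
    # Hide the subset rows, then carry the last visible value forward.
    carried = df[col].mask(in_subset).ffill()
    out = subset.copy()
    out[new_col] = carried.reindex(subset.index)
    return out

# Sample data from this answer (construction is an assumption).
df = pd.DataFrame(
    {"A": [1000, 1000, 1001, 1001, 10, 1000, 1010, 9, 100000, 6, 999, 10110, 10111, 1000]},
    index=range(1, 15),
)
subnet_df = df.loc[[5, 8, 9, 10, 12, 13]]

result = previous_outside_subset(df, subnet_df)
print(result)
```

Because the mask is built from index labels rather than values, repeated values in column A (such as 1000 above) do not affect the outcome.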