Question

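I have a function that creates several pandas DataFrames with unsorted indices. I want to insert the values from these DataFrames into an existing column of another DataFrame, matching on the index.

To illustrate what I mean: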

import numpy as np
import pandas as pd

# original dataframe
df_original = pd.DataFrame({'a': range(8), 'b': range(8)})
df_original['c'] = np.nan

   a  b   c
0  0  0 NaN
1  1  1 NaN
2  2  2 NaN
3  3  3 NaN
4  4  4 NaN
5  5  5 NaN
6  6  6 NaN
7  7  7 NaN

My function returns DataFrames with unsorted indices, one at a time:

# first df that is returned
df1 = pd.DataFrame(index=range(1,8,2), data=range(4), columns=['c'])

   c
1  0
3  1
5  2
7  3

# second df that is returned
df2 = pd.DataFrame(index=range(0,8,2), data=range(4), columns=['c'])

   c
0  0
2  1
4  2
6  3

I want to insert column c of these two DataFrames into column c of the original DataFrame, matching on the index, so that I end up with:

# original dataframe in the end
    a   b   c
0   0   0   0
1   1   1   0
2   2   2   1
3   3   3   1
4   4   4   2
5   5   5   2
6   6   6   3
7   7   7   3

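How can I do this efficiently? My real original DataFrame has about 100k rows, and the function returns about 100 values per call. In the end, column c contains no np.nan.

At the moment I loop over each new DataFrame at the end of the function and change the values in the original DataFrame with df_original.set_value(). There must be a better way, right?

I was also thinking about running df_temp = pd.concat((df1, df2...), axis=0) on all the new DataFrames and finishing with pd.concat((df_original, df_temp), axis=1). How would you do it?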

Answer 1

In my opinion, the double concat solution is a good one.
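
For reference, here is a minimal sketch of that double concat using the frames from the question. One assumption on my side: the NaN placeholder column c is dropped before the second concat, since otherwise the result would carry two columns named c. The name df_result is just illustrative.

```python
import numpy as np
import pandas as pd

# Setup copied from the question.
df_original = pd.DataFrame({'a': range(8), 'b': range(8)})
df_original['c'] = np.nan
df1 = pd.DataFrame(index=range(1, 8, 2), data=range(4), columns=['c'])
df2 = pd.DataFrame(index=range(0, 8, 2), data=range(4), columns=['c'])

# First concat: stack all returned frames; the index is still unsorted here.
df_temp = pd.concat((df1, df2), axis=0)

# Second concat: align on the index, column-wise.
# Assumption: the NaN placeholder 'c' is dropped first to avoid a duplicate column.
df_result = pd.concat((df_original.drop(columns='c'), df_temp), axis=1)
print(df_result)
```

Because concat with axis=1 aligns on the index, the unsorted order of df_temp does not matter.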

Another option is to use join:

df_temp = pd.concat([df1, df2])
# drop the NaN placeholder 'c' first; join raises on overlapping column names
df = df_original.drop(columns='c').join(df_temp)
print(df)
   a  b  c
0  0  0  0
1  1  1  0
2  2  2  1
3  3  3  1
4  4  4  2
5  5  5  2
6  6  6  3
7  7  7  3

Answer 2

A simple assignment is enough to do this, because pandas aligns the assigned values on the index.
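
As a minimal sketch of what such an assignment might look like (this is an assumption, not necessarily the exact code from this answer), using the frames from the question:

```python
import numpy as np
import pandas as pd

# Setup copied from the question.
df_original = pd.DataFrame({'a': range(8), 'b': range(8)})
df_original['c'] = np.nan
df1 = pd.DataFrame(index=range(1, 8, 2), data=range(4), columns=['c'])
df2 = pd.DataFrame(index=range(0, 8, 2), data=range(4), columns=['c'])

# Assigning a Series to a column aligns it on df_original's index,
# so the unsorted index of the concatenated frames is not a problem.
df_original['c'] = pd.concat([df1, df2])['c']
print(df_original)
```

This overwrites the existing column c in place, so no join or second concat is needed.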

