What is the best way for me to combine the following two data frames? I want to:
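- use new_df's price for any (security, date) index that appears in both old_df and new_df (e.g., update stock1 below)
- keep all entries in old_df that do not appear in new_df (keep stock3)
- include all entries in new_df that do not appear in old_df (add stock2)

Here is an example of what I'm looking for: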
import pandas as pd

old_df = pd.DataFrame({'security': ['stock1', 'stock3'], 'date': ['2019-12-23', '2019-12-23'], 'price': [10, 9]}).set_index(['security', 'date'])
new_df = pd.DataFrame({'security': ['stock1', 'stock2'], 'date': ['2019-12-23', '2019-12-24'], 'price': [11, 12]}).set_index(['security', 'date'])
# stock3 keeps its old_df price of 9, since it does not appear in new_df
desired_df = pd.DataFrame({'security': ['stock1', 'stock2', 'stock3'], 'date': ['2019-12-23', '2019-12-24', '2019-12-23'], 'price': [11, 12, 9]}).set_index(['security', 'date'])
Answer 0 (score: 2)
IIUC, you can use combine_first:
desired_df = new_df.combine_first(old_df)
                     price
security date
stock1   2019-12-23   11.0
stock2   2019-12-24   12.0
stock3   2019-12-23    9.0
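To make the behaviour easier to check, here is a minimal self-contained sketch that rebuilds the two frames from the question, applies combine_first, and compares the result with a concat-based alternative. The names combined, stacked, and alt are illustrative and not from the original post, and the concat approach is an assumption about what you might prefer rather than part of the answer above.

```python
import pandas as pd

# Frames from the question, indexed by (security, date).
old_df = pd.DataFrame({
    "security": ["stock1", "stock3"],
    "date": ["2019-12-23", "2019-12-23"],
    "price": [10, 9],
}).set_index(["security", "date"])

new_df = pd.DataFrame({
    "security": ["stock1", "stock2"],
    "date": ["2019-12-23", "2019-12-24"],
    "price": [11, 12],
}).set_index(["security", "date"])

# new_df takes priority; old_df only fills index entries (or nulls)
# that new_df does not provide.
combined = new_df.combine_first(old_df)

# Alternative sketch: stack both frames, then keep the last row for
# each duplicated index, so rows from new_df win over old_df.
stacked = pd.concat([old_df, new_df])
alt = stacked[~stacked.index.duplicated(keep="last")].sort_index()

print(combined)
print(alt)
```

One difference worth keeping in mind: combine_first also fills null values inside new_df from old_df, so a NaN price in new_df would be replaced by the old price. The concat variant keeps the new_df row as it is, NaN included. The price dtype of the combine_first result (float or int) may also differ between pandas versions.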