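So, I have created three data frames from three separate files (csv and xls). I want to combine all three into a single data frame with 20 columns and 15 rows. I have managed to do this with the code at the bottom (this is the last part of my code, where I start merging all the existing data frames I created). However, something strange happens: the top-ranked country is repeated 3 times, and two of the 15 entries that should be there are missing, and I'm not sure why.
I have set the index to be the same in every data frame!
So essentially my problem is that, after merging the data frames, some values are duplicated while others are dropped.
If anyone could explain to me why this happens, I would be very grateful :)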
***merged = pd.merge(pd.merge(df_ScimEn, df_energy[ListEnergy], left_index=True, right_index=True), df_GDP[ListOfGDP], left_index=True, right_index=True)
merged = merged[ListOfColumns]
merged = merged.sort_values('Rank')
merged = merged[merged['Rank'] < 16]
final = pd.DataFrame(merged)***
***Example: a shorter version of what is happening
Expected:
  A B C D J K L R
1 x y z j a e c d
2 b c d l a l c d
3 j k e k a m c d
4 d k c k a n h d
5 d k j l a h c d
Generated after I run the code above (row 1 is repeated three times, and rows 2 and 3 are missing):
  A B C D J K L R
1 x y z j a b c d
1 x y z j a b c d
1 x y z j a b c d
4 d k c k a b h d
5 d k j l a h c d***
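One way the pattern above can arise is sketched below. It is only an illustration under assumptions I cannot confirm from my real data: the toy frames `left` and `right` are hypothetical, with the label 1 appearing three times in the index of `right` and the labels 2 and 3 absent from it. With the default inner join, an index-on-index merge then repeats row 1 and drops rows 2 and 3. The same snippet shows a few checks (`Index.duplicated`, `Index.difference`, and the `validate` argument of `pd.merge`) that could be run on the real frames.

```python
import pandas as pd

# Hypothetical toy frames, NOT the real data from the post.
# `left` has a unique index; `right` repeats label 1 and lacks labels 2 and 3.
left = pd.DataFrame({"A": ["x", "b", "j", "d", "d"]}, index=[1, 2, 3, 4, 5])
right = pd.DataFrame({"J": ["a", "a", "a", "a", "a"]}, index=[1, 1, 1, 4, 5])

# pd.merge defaults to how="inner": only labels present in both frames
# survive, and every matching pair of rows produces one output row.
merged = pd.merge(left, right, left_index=True, right_index=True)
print(merged)

# Check 1: which index labels occur more than once in `right`?
duplicated_labels = right.index[right.index.duplicated()].unique().tolist()

# Check 2: which labels of `left` have no partner in `right`?
missing_labels = left.index.difference(right.index).tolist()

# Check 3: ask pandas to refuse the merge unless the keys are unique on both sides.
try:
    pd.merge(left, right, left_index=True, right_index=True, validate="one_to_one")
    one_to_one_ok = True
except pd.errors.MergeError:
    one_to_one_ok = False

print(duplicated_labels, missing_labels, one_to_one_ok)
```

If the same checks on `df_ScimEn`, `df_energy` and `df_GDP` report duplicated or missing labels, that would point to the index rather than the merge call itself; if they report nothing, the cause lies elsewhere.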
***Example Input
df1 = {[1:A,B,C],[2:A,B,C],[3:A,B,C],[4:A,B,C],[5:A,B,C]}
df2 = {[1:J,K,L,M],[2:J,K,L,M],[3:J,K,L,M],[4:J,K,L,M],[5:J,K,L,M]}
df3 = {[1:R,E,T],[2:R,E,T],[3:R,E,T],[4:R,E,T],[5:R,E,T]}
The index is the same for each data frame. Some of the frames have a
different number of rows and columns, but I've trimmed them
to form the final data frame. Each capital letter stands for a column
name, with different values in each column.***
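For anyone who wants to run the example input, here is a minimal sketch of it as real pandas code. The cell values (`a1`, `b1`, ...) are placeholders I made up for illustration, and the column selections `["J", "K", "L"]` and `["R"]` stand in for my `ListEnergy` and `ListOfGDP` lists. With a unique index 1 to 5 in all three frames, the same nested merge gives exactly one row per index label.

```python
import pandas as pd

idx = [1, 2, 3, 4, 5]


def make_frame(columns):
    """Build a toy frame with placeholder values such as 'a1', 'b2', ..."""
    return pd.DataFrame(
        {col: [f"{col.lower()}{i}" for i in idx] for col in columns},
        index=idx,
    )


df1 = make_frame("ABC")   # columns A, B, C
df2 = make_frame("JKLM")  # columns J, K, L, M
df3 = make_frame("RET")   # columns R, E, T

# Same nested index-on-index merge as in the post, keeping a subset of columns.
merged = pd.merge(
    pd.merge(df1, df2[["J", "K", "L"]], left_index=True, right_index=True),
    df3[["R"]],
    left_index=True,
    right_index=True,
)
print(merged)
```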