Question

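I'm new to pandas and have no experience with it. I suspect there is an answer to this already, but I don't know enough to recognize it. I have two DataFrames. One has a single-level index with m rows. The other has a MultiIndex whose first level holds the same m unique values, while the second level can hold a variable number of entries for each first-level value.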

Example:

The DataFrame I want to add:

df_offset DataFrame

    0
0   0
1   21080064
2   42729472
3   65017856
4   86253568
...
49  311934976

df_epocs_idx, the DataFrame I want to add to:

        onset   offset
0   0   190722  923472
    1   2387988 3120738
    2   4585254 5318004
    3   6782520 7515270
    4   8979786 9712536
... ... ... ...
49  5   1289179 1313604
    6   1533320 1557745
    7   1777461 1801886
    8   2021602 2046027
    9   2265743 2290168

The result I want:

        onset   offset
0   0   190722  923472
    1   2387988 3120738
    2   4585254 5318004
    3   6782520 7515270
    4   8979786 9712536
... ... ... ...
49  5   313224155 313248580
    6   313468296 313492721
    7   313712437 313736862
    8   313956578 313981003
    9   314200719 314225144

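I tried df_epocs_idx.add(df_offset, axis='rows'), which raised ValueError: cannot join with no overlapping index names. I also tried df_epocs_idx.add(df_offset, axis='rows', level=0), which only seemed to add a new column in front and gave me NaNs. I would like to avoid a plain for loop, since I understand loops are usually inefficient on large DataFrames, and this data may grow large in the future.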

Answer 1

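I think you need to give df_offset the same column names as df_epocs_idx. So create two columns in df_offset, each repeating the values you currently have in the single column, and name them "onset" and "offset".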
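As a rough sketch of that renaming step, starting from a single-column df_offset like the one in the question: the small frames below are hypothetical stand-ins (three offsets taken from the question, made-up onset/offset values), and df_offset_wide is a name introduced only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the question's df_offset: one column labelled 0.
df_offset = pd.DataFrame({0: [0, 21080064, 42729472]})

# Hypothetical stand-in for df_epocs_idx with an unnamed two-level index.
index = pd.MultiIndex.from_tuples([(0, 0), (0, 1), (1, 0), (2, 0), (2, 1)])
df_epocs_idx = pd.DataFrame(
    {"onset": [10, 20, 30, 40, 50], "offset": [15, 25, 35, 45, 55]},
    index=index,
)

# Repeat the single offset column once per target column so the labels match.
df_offset_wide = pd.DataFrame(
    {col: df_offset[0] for col in df_epocs_idx.columns}
)

# Match df_offset_wide's index against level 0 of the MultiIndex and add.
result = df_epocs_idx.add(df_offset_wide, level=0)
print(result)
```

Building the columns from df_epocs_idx.columns keeps the two frames' column labels identical, so only the index needs aligning.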

I made up this example to reproduce your situation; it may help.

New dummy df_epocs_idx:

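import numpy as np
import pandas as pd

# Columns "1" and "2" become the two index levels; onset/offset are random values.
df_epocs_idx = pd.DataFrame({'1': [0, 0, 0, 1, 1, 1, 2, 2, 2], '2': [0, 1, 2, 0, 1, 2, 0, 1, 2],
                             'onset': np.random.rand(9), 'offset': np.random.rand(9)})

df_epocs_idx = df_epocs_idx.set_index(["1", "2"])
df_epocs_idx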

        onset    offset
1 2                    
0 0  0.100127  0.231690
  1  0.582593  0.209367
  2  0.598472  0.863339
1 0  0.079973  0.459830
  1  0.245197  0.874727
  2  0.717778  0.041785
2 0  0.750384  0.123909
  1  0.862120  0.169458
  2  0.056572  0.744763

New dummy df_offset:

df_offset = pd.DataFrame({"onset": [3,10,100],"offset":[3,10,100]})
df_offset

  onset offset
0   3   3
1   10  10
2   100 100

Add the two DataFrames, matching on index level 0:

df_epocs_idx.add(df_offset, level=0)

Gives:

          onset      offset
1 2                        
0 0    3.100127    3.231690
  1    3.582593    3.209367
  2    3.598472    3.863339
1 0   10.079973   10.459830
  1   10.245197   10.874727
  2   10.717778   10.041785
2 0  100.750384  100.123909
  1  100.862120  100.169458
  2  100.056572  100.744763
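An alternative that avoids duplicating the offset column, offered only as a sketch and not part of the approach above: look up one offset per row from the first index level, then add it along the rows. The frames are hypothetical stand-ins using a few values from the question, and row_offsets and shifted are illustrative names.

```python
import pandas as pd

# Hypothetical stand-ins for the question's two frames.
index = pd.MultiIndex.from_tuples([(0, 0), (0, 1), (1, 0), (2, 0), (2, 1)])
df_epocs_idx = pd.DataFrame(
    {"onset": [10, 20, 30, 40, 50], "offset": [15, 25, 35, 45, 55]},
    index=index,
)
df_offset = pd.DataFrame({0: [0, 21080064, 42729472]})

# One offset per row, chosen by the row's level-0 label.
level0 = df_epocs_idx.index.get_level_values(0)
row_offsets = pd.Series(
    df_offset[0].reindex(level0).to_numpy(), index=df_epocs_idx.index
)

# axis=0 aligns the Series with the rows and applies it to every column.
shifted = df_epocs_idx.add(row_offsets, axis=0)
print(shifted)
```

This stays vectorized, so it does not need a Python-level for loop over the rows.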

Pandas: add a single-index DataFrame to a multi-index DataFrame by index
