Question

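I need to fill a large pandas DataFrame.

Here is my code:

[code block missing]

My problem is that when this code finally finishes computing, my DataFrame df still contains only zeros. Not even a NaN has been inserted. I think my indexing is correct. Also, I have tested my binary_metric_train function separately, and it does return an array of length 35.

Can anyone spot what I am missing here?

Edit: For clarity, this DataFrame looks like this:

                   1  2  3  4  5 ...
trains tresholds
1      10
       20
       30
       40
       50
       60
2      10
       20
       30
       40
       50
       60
...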

Answer 1

As @EdChum said, you should take a look at pandas indexing. Here is some test data for illustration purposes, which should clear things up.

import numpy as np
import pandas as pd

trains     = [ 1,  1,  1,  2,  2,  2]
thresholds = [10, 20, 30, 10, 20, 30]
data       = [ 1,  0,  1,  0,  1,  0]
df = pd.DataFrame({
    'trains'     : trains,
    'thresholds' : thresholds,
    'C1'         : data,
    'C2'         : data
}).set_index(['trains', 'thresholds'])

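print(df)
# .ix was removed in pandas 1.0. The original `df.ix[(2, 30), 0] = 3`
# (row label plus column position) can be written with .loc as:
df.loc[(2, 30), df.columns[0]] = 3 # using column index
# or...
df.loc[(2, 30), 'C1'] = 3 # using column name
# but not...
df.loc[(2, 30), 1] = 3 # creates a new column labeled 1
print(df)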

Output of the DataFrame before and after the modification:

                   C1  C2
trains thresholds        
1      10           1   1
       20           0   0
       30           1   1
2      10           0   0
       20           1   1
       30           0   0
                   C1  C2   1
trains thresholds            
1      10           1   1 NaN
       20           0   0 NaN
       30           1   1 NaN
2      10           0   0 NaN
       20           1   1 NaN
       30           3   0   3
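The same label-based indexing extends to filling a whole row at once, which is what the question is after. The following is only a sketch: the asker's binary_metric_train is not shown, so the function below is a hypothetical stand-in that returns a fixed-length array, and the 5 columns and small index are assumptions chosen to keep the example short.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's binary_metric_train function;
# the real implementation is not shown in the question.
def binary_metric_train(train, threshold, n_cols=5):
    return np.full(n_cols, train * threshold)

trains = [1, 2]
thresholds = [10, 20, 30]
index = pd.MultiIndex.from_product(
    [trains, thresholds], names=['trains', 'thresholds']
)
# Start from an all-zero frame, as described in the question.
df = pd.DataFrame(0, index=index, columns=range(1, 6))

for train, threshold in df.index:
    # A tuple row label plus ':' for the columns writes the row in place.
    df.loc[(train, threshold), :] = binary_metric_train(train, threshold)

print(df)
```

Writing through a single .loc call with both the row label and the column selector assigns into df itself, whereas chained forms such as df.loc[(train, threshold)][:] = ... may operate on a temporary copy and leave the zeros untouched.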

Indexing a DataFrame with a MultiIndex
