Pandas: iterate over rows and add values to an empty column

Time: 2018-12-31 11:24:11

Tags: python pandas dataframe series

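Hello, I want to iterate over the rows of CPB% and add the computed result to a corresponding column named "Proba". My DataFrame looks like this: (image not shown)

Here is what I have tried so far: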

bins = np.linspace(0, 1, num=100)
dCPB = df['CPB%']
df['binnedB'] = pd.cut(dCPB, bins)
dfnew = pd.DataFrame(pd.cut(df['CPB%'], bins=bins).value_counts()).sort_index(ascending = True)
dfnew['binned'] = dfnew.index

total = dfnew['CPB%'].sum()
idx = total

for index,row in dfnew.iterrows():
  idx = idx - row['CPB%']
  row['Proba'] = float(idx) / float(total)

But my iteration does not update my empty Proba column. Why? Thanks!

2 answers:

Answer 0 (score: 2)

I think the problem is that you are assigning the result back to row, which is not stored anywhere. Instead, you can do this:

proba = []
for index, row in dfnew.iterrows():
    idx = idx - row['CPB%']
    proba.append(float(idx) / float(total))
dfnew['Proba'] = proba

However, this is not the best approach. You can use .apply with axis=1 to perform row-wise calculations on a DataFrame.
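To see why writing to row has no effect, here is a minimal sketch. The three counts are made-up values standing in for dfnew['CPB%'], which the question does not show. It contrasts assigning to the temporary row with assigning through dfnew.loc, which writes into the frame itself.

```python
import pandas as pd

# Hypothetical bin counts standing in for dfnew['CPB%'] (not from the question).
dfnew = pd.DataFrame({'CPB%': [5, 3, 2]})
total = dfnew['CPB%'].sum()

# Writing to `row` only changes the temporary Series that iterrows() yields.
for index, row in dfnew.iterrows():
    row['Proba'] = 0.0
print('Proba' in dfnew.columns)  # False: the frame is untouched

# Writing through dfnew.loc targets the frame itself.
idx = total
for index, row in dfnew.iterrows():
    idx = idx - row['CPB%']
    dfnew.loc[index, 'Proba'] = idx / total

print(dfnew['Proba'].tolist())  # [0.5, 0.2, 0.0]
```

Collecting the values in a list and assigning the whole column once, as in the answer, avoids the repeated .loc writes and is usually the cheaper of the two loop variants.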

Answer 1 (score: 2)

You can use pd.Series.cumsum to replace the iterative calculation:

total = dfnew['CPB%'].sum()
dfnew['Proba'] = 1 - dfnew['CPB%'].cumsum() / total
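As a rough end-to-end sketch of the same idea, the block below bins some made-up CPB% values, counts them per bin, and derives Proba from the cumulative counts. The sample values, the four-bin grid, and the column name 'count' are assumptions chosen for illustration. The counts are renamed explicitly because the default name of the value_counts() result differs between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical CPB% values in [0, 1]; the real data is not shown in the question.
df = pd.DataFrame({'CPB%': [0.05, 0.15, 0.15, 0.35, 0.35, 0.35, 0.75, 0.9]})

# Four equal-width bins instead of the question's 99, to keep the output small.
bins = np.linspace(0, 1, num=5)

# Count the observations per bin, ordered by bin.
counts = pd.cut(df['CPB%'], bins=bins).value_counts().sort_index()

# Name the counts column explicitly so the code does not depend on the pandas version.
dfnew = counts.rename('count').to_frame()

# Share of observations falling in strictly higher bins.
total = dfnew['count'].sum()
dfnew['Proba'] = 1 - dfnew['count'].cumsum() / total

print(dfnew)
```

With these sample values the per-bin counts are 3, 3, 1, 1, so Proba comes out as 0.625, 0.25, 0.125, 0.0, which matches what the loop in the question would compute step by step.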

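With Pandas, you should aim for vectorized algorithms, which usually involve column-wise operations rather than row-wise for loops. Here is a complete demonstration: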

import pandas as pd

df = pd.DataFrame({'A': list(range(1, 7))})

def jpp(df):
    total = df['A'].sum()
    df['Proba'] = 1 - df['A'].cumsum() / total
    return df

def yolo(df):
    total = df['A'].sum()
    idx = total

    proba = []
    for index, row in df.iterrows():
        idx = idx - row['A']
        proba.append(float(idx) / float(total))

    df['Proba'] = proba
    return df

# check results are the same
assert df.pipe(jpp).equals(df.pipe(yolo))

%timeit df.pipe(jpp)   # 691 µs
%timeit df.pipe(yolo)  # 840 µs
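One caveat about the check above: jpp and yolo both write into the DataFrame they receive and return that same object, so df.pipe(jpp) and df.pipe(yolo) hand back the same frame and the assertion compares it with itself. The sketch below is one possible stricter comparison. The helper names vectorized and looped are hypothetical, the helpers return a new Series instead of mutating their input, and np.allclose is used because the two formulas can differ in the last floating-point digit.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': list(range(1, 7))})

def vectorized(frame):
    # Column-wise version: returns a new Series, leaves `frame` unchanged.
    total = frame['A'].sum()
    return 1 - frame['A'].cumsum() / total

def looped(frame):
    # Row-wise version: same quantity, accumulated in a Python loop.
    total = frame['A'].sum()
    idx = total
    proba = []
    for _, row in frame.iterrows():
        idx = idx - row['A']
        proba.append(float(idx) / float(total))
    return pd.Series(proba, index=frame.index)

a = vectorized(df)
b = looped(df)

# Compare with a tolerance rather than exact equality.
assert np.allclose(a, b)
# Neither helper added a column to the original frame.
assert 'Proba' not in df.columns
```

The six-row frame is also too small for the timings to say much; the gap between the two approaches would be expected to widen as the number of rows grows.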