Pandas: Conditionally insert rows into a DataFrame while iterating over its rows

Time: 2018-11-10 10:37:17

Tags: python pandas

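While iterating over the rows of a specific column in a Pandas DataFrame, I would like to add a new row directly below the current row whenever the cell in that row meets a certain condition.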

For example:

import pandas as pd

df = pd.DataFrame(data = {'A': [0.15, 0.15, 0.7], 'B': [1500, 1500, 7000]})

DataFrame:

      A     B
0  0.15  1500
1  0.15  1500
2  0.70  7000

Attempt:

y = 100                             #An example scalar

i = 1

for x in df['A']:
    if x is not None:               #Values in 'A' are filled atm, but not necessarily.
        df.loc[i] = [None, x*y]     #Should insert None into 'A', and product into 'B'.
        df.index = df.index + 1     #Shift index? According to this S/O answer: https://stackoverflow.com/a/24284680/4909923
    i = i + 1

df.sort_index(inplace=True)         #Sort index?

So far I have not succeeded: I get a shifted index that no longer starts at 0, and the rows do not seem to be inserted in order:

      A     B
3  0.15  1500
4   NaN    70
5  0.70  7000

I have tried several variants, including applymap with a lambda function, but could not get any of them to work.

Desired result:

      A     B
0  0.15  1500
1  None    15
2  0.15  1500
3  None    15
4  0.70  7000
5  None    70

2 answers:

Answer 0: (score: 1)

I believe you can use:

import pandas as pd

df = pd.DataFrame(data = {'A': [0.15, 0.15, 0.7], 
                          'B': [1500, 1500, 7000],
                          'C': [100, 200, 400]})

v = 100
L = []
for i, x in df.to_dict('index').items():
    # append the original row as a dictionary
    L.append(x)
    # append the new row; the DataFrame constructor fills the missing keys ('B', 'C') with NaN
    L.append({'A':x['A'] * v})

df = pd.DataFrame(L)
print (df)
       A       B      C
0   0.15  1500.0  100.0
1  15.00     NaN    NaN
2   0.15  1500.0  200.0
3  15.00     NaN    NaN
4   0.70  7000.0  400.0
5  70.00     NaN    NaN
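
Note that the loop above writes the product into column A and leaves B empty, which is not quite the layout the question asks for. As a minimal sketch of the same list-of-dictionaries idea, the variant below puts the product into B, leaves A empty, and skips rows whose A is missing. The helper name `insert_rows_below` and its `factor` parameter are made up for illustration, and rounding the product to an integer is an assumption based on the integer values shown in the desired result.

```python
import pandas as pd


def insert_rows_below(df, factor):
    """Hypothetical helper: after every row whose 'A' is not null,
    add a row with an empty 'A' and 'B' set to A * factor."""
    rows = []
    for row in df.to_dict("records"):
        # keep the original row
        rows.append(row)
        # add the extra row only when 'A' holds a value
        if pd.notna(row["A"]):
            # rounding to an int is an assumption, see the note above
            rows.append({"A": None, "B": int(round(row["A"] * factor))})
    # rebuild the frame once; this is cheaper than growing it inside the loop
    return pd.DataFrame(rows, columns=df.columns)


df = pd.DataFrame({"A": [0.15, 0.15, 0.7], "B": [1500, 1500, 7000]})
print(insert_rows_below(df, 100))
```

Collecting the rows in a plain list and constructing the DataFrame once at the end also avoids modifying the frame while it is being iterated.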

Answer 1: (score: 1)

It seems you do not need a manual loop here:

import numpy as np
import pandas as pd

df = pd.DataFrame(data = {'A': [0.15, 0.15, 0.7], 'B': [1500, 1500, 7000]})

y = 100

# copy slice of dataframe
df_extra = df.loc[df['A'].notnull()].copy()

# assign A and B series values
df_extra = df_extra.assign(A=np.nan, B=(df_extra['A']*y).astype(int))

# increment index partially, required for sorting afterwards
df_extra.index += 0.5

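# concatenate, sort by index, then reset the index
# (DataFrame.append was removed in pandas 2.0; pd.concat is its replacement)
res = pd.concat([df, df_extra]).sort_index().reset_index(drop=True)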

print(res)

      A     B
0  0.15  1500
1   NaN    15
2  0.15  1500
3   NaN    15
4  0.70  7000
5   NaN    70
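
The fractional-index trick above assumes the original frame has a sorted numeric index, and `astype(int)` truncates rather than rounds. As a rough sketch of one way around both points, the function below orders the rows by position with a stable sort instead of by index label, and rounds the product before casting. The name `interleave_products` and its `factor` parameter are hypothetical, and rounding is an assumption about what the question intends.

```python
import numpy as np
import pandas as pd


def interleave_products(df, factor):
    """Hypothetical helper: place a row (A = NaN, B = A * factor)
    directly below every row whose 'A' is not null."""
    pos = np.arange(len(df))
    mask = df["A"].notna().to_numpy()

    # new rows, built only from the rows that satisfy the condition
    products = (df.loc[mask, "A"] * factor).round().astype(int).to_numpy()
    extra = pd.DataFrame({"A": np.nan, "B": products})

    combined = pd.concat([df, extra], ignore_index=True)

    # each new row reuses the position of its source row; a stable sort
    # keeps the original row first whenever two positions are equal
    keys = np.concatenate([pos, pos[mask]])
    order = np.argsort(keys, kind="stable")
    return combined.iloc[order].reset_index(drop=True)


df = pd.DataFrame({"A": [0.15, 0.15, 0.7], "B": [1500, 1500, 7000]})
res = interleave_products(df, 100)
print(res)
```

Because the ordering uses row positions, this should also behave the same when the input has a non-default index, although the result always comes back with a fresh 0-based index.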