Question

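I am trying to use a pandas DataFrame to insert new/duplicate rows into an Excel sheet when a particular column has a particular value. When the condition on that column is TRUE, I want to duplicate the row and change its value.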

For example:

Input

    A        B      C   D   
0   Red      111    A   2   
1   Blue     222    B   12  
2   Green    333    B   3
3   Black    111    A   2   
4   Yellow   222    D   12  
5   Pink     333    c   3
6   Purple   777    B   10

Output

    A        B      C   D   
0   Red      111    A   2   
1   Blue     222    Y   12  
2   Blue     222    Z   12
3   Green    333    Y   3
4   Green    333    Z   3
5   Black    111    A   2   
6   Yellow   222    D   12  
7   Pink     333    c   3
8   Purple   777    Y   10
9   Purple   777    Z   10

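As you can see in column C, I only want to duplicate a row when I encounter the specific value B, changing that value to Y in the original row and to Z in the duplicated row. (If I encounter anything other than B, the row should not be duplicated.)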
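
For reproducibility, the input table above could be built as a DataFrame roughly like this. It is a minimal sketch: the variable name df is an assumption, chosen to match what the answers below expect, and reading the data from Excel is left out.

```python
import pandas as pd

# Input table from the question; column names A-D as shown above.
df = pd.DataFrame({
    "A": ["Red", "Blue", "Green", "Black", "Yellow", "Pink", "Purple"],
    "B": [111, 222, 333, 111, 222, 333, 777],
    "C": ["A", "B", "B", "A", "D", "c", "B"],
    "D": [2, 12, 3, 2, 12, 3, 10],
})
print(df)
```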

Answer 1

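Replace B with Y in column C for the original rows, build the copies by filtering the B rows and assigning Z, and add 0.5 to the index of the copies so that sort_index places each copy directly after its original. Then concat both parts: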

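import pandas as pd  # df is the question's input DataFrame

# Originals: B becomes Y in column C
df1 = df.replace({'C': {'B': 'Y'}})
# Copies of the B rows: C becomes Z, index shifted by 0.5
df2 = df[df['C'].eq('B')].assign(C='Z').rename(lambda x: x + .5)

# Sorting puts each copy right after its original
df = pd.concat([df1, df2]).sort_index().reset_index(drop=True)
print(df)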
        A    B  C   D
0     Red  111  A   2
1    Blue  222  Y  12
2    Blue  222  Z  12
3   Green  333  Y   3
4   Green  333  Z   3
5   Black  111  A   2
6  Yellow  222  D  12
7    Pink  333  c   3
8  Purple  777  Y  10
9  Purple  777  Z  10
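
To see why the 0.5 shift works, here is a small sketch on a three-row toy frame. The data and the variable names (originals, copies, combined) are illustrative assumptions, not part of the question. It also assumes the default integer index, since adding 0.5 to string labels would fail.

```python
import pandas as pd

# Toy frame (illustrative only): one row with C == 'B'
df = pd.DataFrame({"A": ["Red", "Blue", "Green"], "C": ["A", "B", "A"]})

originals = df.replace({"C": {"B": "Y"}})
# rename with a function and no axis argument maps over the row index
copies = df[df["C"].eq("B")].assign(C="Z").rename(lambda x: x + 0.5)
print(copies.index.tolist())    # [1.5]

combined = pd.concat([originals, copies]).sort_index()
print(combined.index.tolist())  # [0.0, 1.0, 1.5, 2.0]

# Drop the fractional labels once the order is fixed
result = combined.reset_index(drop=True)
print(result)
```

The label 1.5 sorts between 1 and 2, so the copy lands immediately after the row it was made from.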

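Or create three small DataFrames (the rows without B, the B rows with C set to Y, and the B rows with C set to Z and the index shifted by 0.5) and concat them: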

mask = df['C'].eq('B')
df0 = df[~mask]
df1 = df[mask].assign(C = 'Y')
df2 = df[mask].assign(C = 'Z').rename(lambda x: x + .5)

df = pd.concat([df0, df1, df2]).sort_index().reset_index(drop=True)
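
The mask-based variant can be wrapped in a small reusable function. This is only a sketch: the helper name duplicate_matching_rows and its parameters are hypothetical, and it assumes a unique integer index (such as the default RangeIndex) so that the 0.5 shift is meaningful.

```python
import pandas as pd

def duplicate_matching_rows(df, column, match, first, second):
    """Hypothetical helper: duplicate rows where df[column] == match.

    The original row gets `first` and the copy gets `second` in `column`.
    Assumes a unique integer index.
    """
    mask = df[column].eq(match)
    untouched = df[~mask]
    firsts = df[mask].assign(**{column: first})
    seconds = df[mask].assign(**{column: second}).rename(lambda i: i + 0.5)
    return (pd.concat([untouched, firsts, seconds])
              .sort_index()
              .reset_index(drop=True))

# Usage with the question's data
df = pd.DataFrame({
    "A": ["Red", "Blue", "Green", "Black", "Yellow", "Pink", "Purple"],
    "B": [111, 222, 333, 111, 222, 333, 777],
    "C": ["A", "B", "B", "A", "D", "c", "B"],
    "D": [2, 12, 3, 2, 12, 3, 10],
})
out = duplicate_matching_rows(df, "C", "B", "Y", "Z")
print(out)
```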

Answer 2

An alternative approach.

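# First replace B with the combined value 'Y,Z' in column C
df.replace({'C': {'B': 'Y,Z'}}, inplace=True)

# Split C on ',' and use explode (available since pandas 0.25) to turn
# each list element into its own row; reset the repeated index labels
df = df.assign(C=df.C.str.split(",")).explode('C').reset_index(drop=True)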
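
A self-contained sketch of the explode approach on the question's data, written as a single chain without inplace. It assumes that no existing value in column C contains a comma, because the comma is used as the split marker. The reset_index(drop=True) call is included because explode repeats the original index label for every row it produces, whereas the expected output is numbered 0 to 9.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["Red", "Blue", "Green", "Black", "Yellow", "Pink", "Purple"],
    "B": [111, 222, 333, 111, 222, 333, 777],
    "C": ["A", "B", "B", "A", "D", "c", "B"],
    "D": [2, 12, 3, 2, 12, 3, 10],
})

out = (
    df.replace({"C": {"B": "Y,Z"}})                    # B -> 'Y,Z'
      .assign(C=lambda d: d["C"].str.split(","))       # 'Y,Z' -> ['Y', 'Z']
      .explode("C")                                    # one row per list element
      .reset_index(drop=True)                          # renumber 0..n-1
)
print(out)
```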

Add/duplicate rows for a specific column value using Python/pandas
