创建一个新的列,并在另一列中添加值

时间:2020-05-18 21:39:32

标签: python pandas

嗨,我有一个数据框,例如:

flushdb

,我想添加一个新列Groups Names COLs COLe G1 ABC_DEF.1:2-300():Canis_lupus 2 300 G1 SDDD1 NA NA G1 SKUD.2. NA NA G1 SEQUENCE3 NA NA G1 ABC_DEF.1:400-600():Canis_lupus 400 600 G1 IJK_LMN.1:20-200():Bos_taurus 20 200 G2 OP_D:500-1000():Felis_catus 500 1000 G2 JDJDJ99 NA NA 并将所有不包含Names2的{​​{1}}放入组内,而每个Names都带有()在其内容中:

输出为:

Names

有人对使用熊猫有想法吗?

1 个答案:

答案 0 :(得分:0)

df1 = df[df.Names.str.contains('()', regex=False)]
df2 = df[~df.Names.str.contains('()', regex=False)][['Groups', 'Names']]

print( pd.merge(left=df1, right=df2, on='Groups').rename(columns={"Names_x": "Names", "Names_y": "Names2"}) )

打印:

  Groups                            Names   COLs    COLe     Names2
0     G1    ABC_DEF.1:2-300():Canis_lupus    2.0   300.0      SDDD1
1     G1    ABC_DEF.1:2-300():Canis_lupus    2.0   300.0    SKUD.2.
2     G1    ABC_DEF.1:2-300():Canis_lupus    2.0   300.0  SEQUENCE3
3     G1  ABC_DEF.1:400-600():Canis_lupus  400.0   600.0      SDDD1
4     G1  ABC_DEF.1:400-600():Canis_lupus  400.0   600.0    SKUD.2.
5     G1  ABC_DEF.1:400-600():Canis_lupus  400.0   600.0  SEQUENCE3
6     G1    IJK_LMN.1:20-200():Bos_taurus   20.0   200.0      SDDD1
7     G1    IJK_LMN.1:20-200():Bos_taurus   20.0   200.0    SKUD.2.
8     G1    IJK_LMN.1:20-200():Bos_taurus   20.0   200.0  SEQUENCE3
9     G2      OP_D:500-1000():Felis_catus  500.0  1000.0    JDJDJ99