Create a duplicate row and replace cell values

Time: 2019-03-15 13:11:21

Tags: python-3.x pandas

I have a CSV file that contains the following data:

  NAME    | AGE  | COLLEGE  | BRANCH  | Qualification
------------------------------------------------------- 
  sai     | 21   |   FG     |   CSE   |   B.Tech
  Kiran   | 22   |   FG     |   EEE   |   M.Tech
  Anil    | 21   |   FG     |   CSE   |   B.Tech
  Ram     | 22   |   KL     |   EEE   |   B.Tech

The code I used to create the CSV file:

import pandas as pd

Name=['sai', 'Kiran', 'Anil', 'Ramj']
Age=[21, 22, 21, 22]
college=['FG', 'FG', 'FG', 'KL']
branch=['CSE', 'EEE', 'CSE', 'EEE']
Qualification=['B.Tech', 'M.Tech', 'B.Tech', 'B.Tech']

# Named `data` rather than `dict` so the built-in dict type is not shadowed
data = {'NAME': Name, 'AGE': Age, 'COLLEGE': college, 'BRANCH': branch,
        'Qualification': Qualification}

df = pd.DataFrame(data)
df.to_csv('TESTINGFILE.csv',index=False) 

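The following steps need to be performed:


Step 1:

Based on a condition, I need to create a duplicate row.

Condition: COLLEGE = FG and BRANCH = CSE

If the condition is met, a duplicate of the row should be created with ECE as its branch name.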

  NAME    | AGE  | COLLEGE  | BRANCH  | Qualification
------------------------------------------------------- 
  sai     | 21   |   FG     |   CSE   |   B.Tech
  sai     | 21   |   FG     |   ECE   |   B.Tech
  Kiran   | 22   |   FG     |   EEE   |   M.Tech
  Anil    | 21   |   FG     |   CSE   |   B.Tech
  Anil    | 21   |   FG     |   ECE   |   B.Tech
  Ram     | 22   |   KL     |   EEE   |   B.Tech

Step 2:

Now, with the same condition (COLLEGE = FG and BRANCH = CSE), if it is met, change the branch of the original row from CSE to IT.

Final expected output:

  NAME    | AGE  | COLLEGE  | BRANCH  | Qualification
------------------------------------------------------- 
  sai     | 21   |   FG     |   IT    |   B.Tech
  sai     | 21   |   FG     |   ECE   |   B.Tech
  Kiran   | 22   |   FG     |   EEE   |   M.Tech
  Anil    | 21   |   FG     |   IT    |   B.Tech
  Anil    | 21   |   FG     |   ECE   |   B.Tech
  Ram     | 22   |   KL     |   EEE   |   B.Tech

Could someone help me write the code for this using pandas?

Thanks for your help!

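2 Answers:

Answer 0: (score: 1)

First build a boolean mask from the condition and use it to replace the values, then duplicate the matching rows with concat, setting their branch with DataFrame.assign, and finally restore the row order with DataFrame.sort_index: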

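# Rows where both conditions hold (computed before BRANCH is modified)
mask = (df.COLLEGE == 'FG') & (df.BRANCH == 'CSE')
# Step 2: change CSE to IT in the matching rows
df.loc[mask, 'BRANCH'] = 'IT'
# Step 1: append a copy of the matching rows with BRANCH set to ECE;
# a stable sort keeps each copy directly after its source row
df = pd.concat([df, df[mask].assign(BRANCH='ECE')]).sort_index(kind='mergesort').reset_index(drop=True)
print(df)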
    NAME  AGE COLLEGE BRANCH Qualification
0    sai   21      FG     IT        B.Tech
1    sai   21      FG    ECE        B.Tech
2  Kiran   22      FG    EEE        M.Tech
3   Anil   21      FG     IT        B.Tech
4   Anil   21      FG    ECE        B.Tech
5   Ramj   22      KL    EEE        B.Tech
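
The order of operations matters in this approach: the mask is computed once, while BRANCH still holds the original values, so it keeps selecting the same rows after they have been overwritten with IT. A minimal self-contained sketch of that idea, rebuilding the sample frame from the question (the names `duplicates` and `result` are only illustrative, and the stable `kind='mergesort'` sort is used so that each copy stays directly after its source row):

```python
import pandas as pd

df = pd.DataFrame({
    'NAME': ['sai', 'Kiran', 'Anil', 'Ramj'],
    'AGE': [21, 22, 21, 22],
    'COLLEGE': ['FG', 'FG', 'FG', 'KL'],
    'BRANCH': ['CSE', 'EEE', 'CSE', 'EEE'],
    'Qualification': ['B.Tech', 'M.Tech', 'B.Tech', 'B.Tech'],
})

# Compute the mask once, while BRANCH still holds the original values.
mask = (df['COLLEGE'] == 'FG') & (df['BRANCH'] == 'CSE')

# Step 2: overwrite CSE with IT in the matching rows.
df.loc[mask, 'BRANCH'] = 'IT'

# Step 1: the mask is a stored boolean Series, so it still selects the
# same rows even though they no longer contain 'CSE'.
duplicates = df[mask].assign(BRANCH='ECE')

# The copies keep the index labels of their source rows; a stable sort on
# the index therefore places each copy right after its original.
result = (
    pd.concat([df, duplicates])
    .sort_index(kind='mergesort')
    .reset_index(drop=True)
)
print(result['BRANCH'].tolist())
# ['IT', 'ECE', 'EEE', 'IT', 'ECE', 'EEE']
```

Recomputing the condition after the overwrite would match nothing, because no row contains CSE any more, which is why the mask is stored in a variable first.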

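Answer 1: (score: 1)

You can do the following:
1. First create a subset by filtering
2. Change the value in the subset to ECE
3. Concatenate the dataframes
4. Use np.where to conditionally change the value to IT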

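import numpy as np

# .copy() avoids a SettingWithCopyWarning when assigning to the subset
df_dup = df[(df.COLLEGE == 'FG') & (df.BRANCH == 'CSE')].copy()
df_dup['BRANCH'] = 'ECE'

df = pd.concat([df, df_dup])

# Change the original CSE rows (not the new ECE copies) to IT
df['BRANCH'] = np.where((df.COLLEGE == 'FG') & (df.BRANCH == 'CSE'), 'IT', df.BRANCH)

# A stable sort keeps each copy directly after its source row
df = df.sort_index(kind='mergesort').reset_index(drop=True)

print(df)
    NAME  AGE COLLEGE BRANCH Qualification
0    sai   21      FG     IT        B.Tech
1    sai   21      FG    ECE        B.Tech
2  Kiran   22      FG    EEE        M.Tech
3   Anil   21      FG     IT        B.Tech
4   Anil   21      FG    ECE        B.Tech
5   Ramj   22      KL    EEE        B.Tech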
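
Since the question starts from a CSV file, the same logic can be wrapped in a small function and applied to the file directly. The following is only a sketch under a few assumptions: the function name `add_branch_copy`, its parameters, and the output file name `TESTINGFILE_out.csv` are hypothetical, and the file is assumed to have the COLLEGE and BRANCH columns shown in the question. A temporary directory is used only so the example runs on its own.

```python
import os
import tempfile

import pandas as pd


def add_branch_copy(df, college='FG', old='CSE', new='IT', extra='ECE'):
    """Hypothetical helper: rename `old` to `new` in the matching rows
    and add a copy of each matching row with BRANCH set to `extra`."""
    mask = (df['COLLEGE'] == college) & (df['BRANCH'] == old)
    out = df.copy()                    # leave the caller's frame untouched
    out.loc[mask, 'BRANCH'] = new
    copies = out[mask].assign(BRANCH=extra)
    return (
        pd.concat([out, copies])
        .sort_index(kind='mergesort')  # stable: source row stays before its copy
        .reset_index(drop=True)
    )


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'TESTINGFILE.csv')
    dst = os.path.join(tmp, 'TESTINGFILE_out.csv')  # hypothetical output name

    # Recreate the file from the question so the sketch is self-contained.
    pd.DataFrame({
        'NAME': ['sai', 'Kiran', 'Anil', 'Ramj'],
        'AGE': [21, 22, 21, 22],
        'COLLEGE': ['FG', 'FG', 'FG', 'KL'],
        'BRANCH': ['CSE', 'EEE', 'CSE', 'EEE'],
        'Qualification': ['B.Tech', 'M.Tech', 'B.Tech', 'B.Tech'],
    }).to_csv(src, index=False)

    df = pd.read_csv(src)
    result = add_branch_copy(df)
    result.to_csv(dst, index=False)
    written = pd.read_csv(dst)

print(result)
```

Returning a new frame instead of modifying `df` in place keeps the original data available for comparison, at the cost of one extra copy.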