I have a data frame:
import pandas as pd
ifa_num=[0.1,0.2,0.3,0.4,0.5]
ak_num=[0.6,0.7,0.8,0.9,0.11]
ch_dist=['if','ak','if','if','ak']
df=pd.DataFrame()
df['if_num']=ifa_num
df['ak_num']=ak_num
df['ch_dist']=ch_dist
(Image: the data frame as it currently looks)
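I need to add another column, if_ak, that takes its value from if_num when ch_dist is 'if' and from ak_num otherwise.
(Image: the resulting data frame after adding the if_ak column)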
I wrote a naive implementation with a for loop, but a for loop becomes inefficient as the data grows, and I can't figure out how to optimize it.
li=[]
for x in range(df.shape[0]):
    # pick the column named by ch_dist for this row
    if df.loc[x,'ch_dist']=='if':
        li.append(df.loc[x,'if_num'])
    else:
        li.append(df.loc[x,'ak_num'])
df['if_ak']=li
Answer 0 (score: 1)
Try the following code
in place of the for loop:
import pandas as pd
ifa_num=[0.1,0.2,0.3,0.4,0.5]
ak_num=[0.6,0.7,0.8,0.9,0.11]
ch_dist=['if','ak','if','if','ak']
df=pd.DataFrame()
df['if_num']=ifa_num
df['ak_num']=ak_num
df['ch_dist']=ch_dist
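One common vectorized replacement for this kind of row-by-row loop is numpy.where, which chooses between two columns element-wise based on a condition. The following is a minimal sketch under that assumption, reusing the question's data; it is one possible approach, not necessarily the only or fastest one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'if_num': [0.1, 0.2, 0.3, 0.4, 0.5],
    'ak_num': [0.6, 0.7, 0.8, 0.9, 0.11],
    'ch_dist': ['if', 'ak', 'if', 'if', 'ak'],
})

# Element-wise choice: if_num where ch_dist == 'if', ak_num otherwise.
df['if_ak'] = np.where(df['ch_dist'] == 'if', df['if_num'], df['ak_num'])
print(df)
```

The condition is evaluated once for the whole column, so no Python-level loop over rows is needed.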
Answer 1 (score: 0)
You can multiply each column by a boolean mask (and by its complement) and add the results:
import pandas as pd
ifa_num=[0.1,0.2,0.3,0.4,0.5]
ak_num=[0.6,0.7,0.8,0.9,0.11]
ch_dist=['if','ak','if','if','ak']
df=pd.DataFrame({'if_num': ifa_num,
'ak_num': ak_num,
'ch_dist': ch_dist})
m_if = df['ch_dist'] == 'if'
df['if_ak'] = m_if * df['if_num'] + (1-m_if) * df['ak_num']
df
   if_num  ak_num ch_dist  if_ak
0     0.1    0.60      if   0.10
1     0.2    0.70      ak   0.70
2     0.3    0.80      if   0.30
3     0.4    0.90      if   0.40
4     0.5    0.11      ak   0.11
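One caveat with the mask-multiplication approach: because 0 * NaN is NaN, a missing value in the column that was not selected still ends up in the result. The sketch below uses a small hypothetical data frame (the NaN values are not from the question) to show the effect, and Series.where as one alternative that selects values instead of blending them arithmetically.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values, for illustration only.
df = pd.DataFrame({
    'if_num': [0.1, np.nan, 0.3],
    'ak_num': [np.nan, 0.7, 0.8],
    'ch_dist': ['if', 'ak', 'if'],
})
m_if = df['ch_dist'] == 'if'

# Arithmetic blend: NaN in the unselected column propagates (0 * NaN is NaN).
blended = m_if * df['if_num'] + (1 - m_if) * df['ak_num']

# Series.where keeps if_num where the mask is True and takes ak_num elsewhere,
# so the unselected column never enters the computation.
selected = df['if_num'].where(m_if, df['ak_num'])

print(blended.isna().tolist())
print(selected.tolist())
```

When neither column contains missing values, as in the question's data, both forms give the same if_ak column.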