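I have a function that looks for certain strings in two different columns and returns the original row value when the function's conditions are met: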
import numpy as np

def functionator(row):
    # Keep the value in '0_c' only when 'A0' contains a matching code
    if 'J44' in row['0_c']:
        if 'J44' in row['A0']:
            return row['0_c']
        else:
            return np.nan
    elif 'I50' in row['0_c']:
        if 'I50' in row['A0']:
            return row['0_c']
        else:
            return np.nan
    elif 'I51' in row['0_c']:
        # ('I50' or 'I51') evaluates to just 'I50', so test each code explicitly
        if any(code in row['A0'] for code in ('I50', 'I51')):
            return row['0_c']
        else:
            return np.nan
    elif 'F03X' in row['0_c']:
        # Same pitfall as above: check every candidate code, not only the first
        if any(code in row['A0'] for code in ('F00', 'F01', 'F02')):
            return row['0_c']
        else:
            return np.nan
    elif 'N18' in row['0_c']:
        if 'N18' in row['A0']:
            return row['0_c']
        else:
            return np.nan
    else:
        return np.nan

df['0_c'] = df.apply(functionator, axis=1)
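However, I want to apply this function across a range of columns: instead of checking only row['0_c'], I want to check row['i_c'] for i in range(n), and check it against row['Ai'] for i in range(m).
Thanks in advance for your help!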
Answer 0 (score: 2)
This kind of code is very inefficient. You should first restructure it to operate on vectors rather than scalars:
def vectorator(df, col1, col2):
    # Bound str.contains methods for the two columns being compared
    col_0_c = df[col1].str.contains
    col_A0 = df[col2].str.contains
    # Each mask is True where both columns contain the expected code(s)
    J44 = col_0_c('J44') & col_A0('J44')
    I50 = col_0_c('I50') & col_A0('I50')
    I51 = col_0_c('I51') & col_A0('I5[01]')
    F03X = col_0_c('F03X') & col_A0('F0[012]')
    N18 = col_0_c('N18') & col_A0('N18')
    matches = J44 | I50 | I51 | F03X | N18
    # Use .loc instead of chained indexing so the assignment modifies df itself
    df.loc[~matches, col1] = np.nan
Then it is easy to run it several times, once for each pair of columns.
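A minimal sketch of running the vectorized check over several column pairs. It assumes the columns pair up by index (`0_c` with `A0`, `1_c` with `A1`, and so on), which the question does not confirm. The `RULES` table, the sample data, and the value of `n` are hypothetical, and `vectorator` is restated here in a table-driven form with `na=False` only so that the block runs on its own.

```python
import numpy as np
import pandas as pd

# Hypothetical rule table: code looked for in the i_c column -> regex
# that the paired A column must contain for the value to be kept.
RULES = {
    'J44': 'J44',
    'I50': 'I50',
    'I51': 'I5[01]',
    'F03X': 'F0[012]',
    'N18': 'N18',
}


def vectorator(df, col1, col2):
    """Set df[col1] to NaN wherever df[col2] lacks the matching code."""
    matches = pd.Series(False, index=df.index)
    for code, pattern in RULES.items():
        # na=False treats missing values as "no match"
        in_col1 = df[col1].str.contains(code, na=False)
        in_col2 = df[col2].str.contains(pattern, na=False)
        matches |= in_col1 & in_col2
    df.loc[~matches, col1] = np.nan


# Made-up sample data with two column pairs
df = pd.DataFrame({
    '0_c': ['J44.1', 'I51.9', 'N18.3'],
    'A0': ['J44.0', 'I50.1', 'E11.9'],
    '1_c': ['F03X', 'I50.0', 'J44.9'],
    'A1': ['F01.1', 'J44.1', 'J44.1'],
})

n = 2  # assumed number of column pairs
for i in range(n):
    vectorator(df, f'{i}_c', f'A{i}')
```

If every `i_c` column should instead be compared against all of the `Ai` columns, one option would be to OR the `in_col2` masks across those columns before combining them with `in_col1`; which behavior is wanted depends on details the question leaves open.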