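I have a df with strings and dates in separate columns: one column holds a string and another holds the associated date. I'm trying to search the first column for a specific string and, if it is found, return the value from the second column. The problem is that there are 15 such columns in a df with 85 columns. Can this be done?
Thanks!
So far I have: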
df['New'] = df.apply(lambda row: row.astype(str).str.contains('Delayed').any(), axis=1)
which searches each row for the string.
Answer 0 (score: 1)
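This is what an MCVE looks like. It includes data roughly matching the description in the question. It is a scratch script you can use to see what happens when the lambda is applied, and it shows how to get things working.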
import pandas as pd
# pandas.compat.StringIO is gone from later pandas releases; io.StringIO works on Python 3
from io import StringIO
print(pd.__version__)
csvdata = StringIO("""date,LASTA,LASTB,LASTC
1999-03-15,2.5597,8.20145,16.900
1999-03-31,delayed,7.73057,16.955
1999-04-01,2.8321,7.63714,17.500
1999-04-06,2.8537,delayed,delayed""")
df = pd.read_csv(csvdata)
# debugging complex lambdas is sometimes better done by
# passing in a function to see what is going on
def row(x):
    # with axis=1, each x is one row passed in as a Series
    print(type(x))
    match = x.str.contains('delayed').any()
    return match
df['function_match'] = df.apply(row, axis=1)
df['lambda_match'] = df.apply(lambda row: row.str.contains('delayed').any(), axis=1)
# use the match column as a boolean mask, and select the preferred column with .loc
print(df.loc[df['lambda_match'], 'LASTA'])
This produces
0.20.3
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
1    delayed
3     2.8537
Name: LASTA, dtype: object
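To get back to the original question (return the associated date, and only look at some of the columns), here is a minimal sketch that builds on the same sample data. The `search_cols` list and the `delayed_date` column name are assumptions made for illustration; in the real df you would list the 15 columns of interest and use whatever column holds the associated date. It also swaps the row-wise `apply` for a column-wise one, which is a different approach from the code above, not a correction of it.

```python
from io import StringIO

import pandas as pd

csvdata = StringIO("""date,LASTA,LASTB,LASTC
1999-03-15,2.5597,8.20145,16.900
1999-03-31,delayed,7.73057,16.955
1999-04-01,2.8321,7.63714,17.500
1999-04-06,2.8537,delayed,delayed""")
df = pd.read_csv(csvdata)

# Hypothetical: the subset of columns to search (15 of 85 in the question).
search_cols = ['LASTA', 'LASTB', 'LASTC']

# Run str.contains once per column instead of once per row.
# astype(str) guards against numeric columns, which have no .str accessor.
mask = df[search_cols].apply(
    lambda col: col.astype(str).str.contains('delayed', regex=False)
).any(axis=1)

# Return the associated date for the rows where the string was found.
delayed_dates = df.loc[mask, 'date']
print(delayed_dates.tolist())  # ['1999-03-31', '1999-04-06']

# Or keep it as a new column: the date where matched, NaN elsewhere.
df['delayed_date'] = df['date'].where(mask)
```

Restricting the search to `search_cols` keeps the other columns out of the match entirely, so a stray 'delayed' in an unrelated column cannot produce a false hit.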