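Search for a value and return the string next to it

Date: 2019-04-16 19:36:19

Tags: python pandas

I have a df that holds strings and dates in separate columns: one column contains a string and another contains the date associated with it. I am trying to search the first column for a specific string and, if it is found, return the value from the second column. The catch is that there are 15 such columns in a df with 85 columns. Can this be done?

Thanks!

So far I have: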

df['New'] = df.apply(lambda row: row.astype(str).str.contains('Delayed').any(), axis=1)

This searches for the text within the strings.

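1 answer:

Answer 0: (score: 1)

This is what an MCVE looks like. It includes data roughly as described in the question. It is a scratch example that you can use to see what happens when the lambda is applied, and it shows how to get things working.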

import pandas as pd
from io import StringIO  # pandas.compat.StringIO was removed in later pandas releases

print(pd.__version__)

csvdata = StringIO("""date,LASTA,LASTB,LASTC
1999-03-15,2.5597,8.20145,16.900
1999-03-31,delayed,7.73057,16.955
1999-04-01,2.8321,7.63714,17.500
1999-04-06,2.8537,delayed,delayed""")

df = pd.read_csv(csvdata)

# debugging a complex lambda is often easier if you pass a named
# function instead, so you can inspect what each call receives
def row(x):
    # with axis=1, x is one row of the DataFrame, passed in as a Series
    print(type(x))
    match = x.str.contains('delayed').any()
    return match

df['function_match'] = df.apply(row, axis=1)
df['lambda_match'] = df.apply(lambda row: row.str.contains('delayed').any(), axis=1)

# use the match column as a boolean mask, and then index by preferred column
print(df.loc[df['lambda_match'], 'LASTA'])

This produces

0.20.3
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
1    delayed
3     2.8537
Name: LASTA, dtype: object
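
The example above returns a value from a fixed column (LASTA) for the matching rows. The question asks for the date that sits next to each matching string, so here is a minimal sketch of that lookup. The column names (STATUS_A, DATE_A, ...), the sample data, and the naming convention that pairs them are all assumptions made for illustration, since the question does not show the real layout.

```python
from io import StringIO

import pandas as pd

# Hypothetical data: every STATUS_x column has a matching DATE_x column.
csvdata = StringIO("""id,STATUS_A,DATE_A,STATUS_B,DATE_B
1,On time,2019-01-05,Delayed,2019-02-10
2,Delayed,2019-01-07,On time,2019-02-11
3,On time,2019-01-09,On time,2019-02-12""")

df = pd.read_csv(csvdata)

# Assumed convention: derive each date column name from its status column.
status_cols = [c for c in df.columns if c.startswith("STATUS_")]
date_cols = [c.replace("STATUS_", "DATE_") for c in status_cols]

# Start with an all-NaN result and fill it pair by pair.
result = pd.Series(index=df.index, dtype=object)
for status_col, date_col in zip(status_cols, date_cols):
    # True where this status column contains the search text
    hit = df[status_col].astype(str).str.contains("Delayed")
    # Keep a date already found; otherwise take this pair's date on a hit
    result = result.where(result.notna(), df[date_col].where(hit))

df["delayed_date"] = result
print(df[["id", "delayed_date"]])
```

Rows without a match keep NaN in delayed_date. If several status columns match in one row, this sketch keeps the date from the first matching pair. Note that str.contains is case-sensitive by default; passing case=False would match both 'Delayed' and 'delayed'.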