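Extracting content from a pandas DataFrame

Time: 2020-02-03 05:08:23

Tags: python pandas

The DataFrame "name" contains the names of each person's first 10 employers.

I want to retrieve all employer names that contain "foundation".

My goal is to get a better understanding of the employer names that contain "foundation".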

Here is the code I tried, which fails with an error:

name=employ[['nameCurrentEmployer',
       'name2ndEmployer', 'name3thEmployer',
       'name4thEmployer', 'name5thEmployer',
       'name6thEmployer', 'name7thEmployer',
       'name8thEmployer', 'name9thEmployer',
       'name10thEmployer']]
print(name.loc[name.str.contains('foundation', case=False)][['Answer.nameCurrentEmployer',
       'Answer.nameEighthEmployer', 'Answer.nameFifthEmployer',
       'Answer.nameFourthEmployer', 'Answer.nameNinethEmployer',
       'Answer.nameSecondEmployer', 'Answer.nameSeventhEmployer',
       'Answer.nameSixthEmployer', 'Answer.nameTenthEmployer',
       'Answer.nameThirdEmployer']])

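Thanks!

2 answers:

Answer 0 (score: 1)

You get AttributeError: 'DataFrame' object has no attribute 'str' because str is an accessor of Series, not of DataFrame.

From the docs:

Series.str can be used to access the values of the series as strings and apply several methods to it. These can be accessed like Series.str.<function/property>.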
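To make the difference concrete, here is a minimal sketch. The DataFrame `df` and its two rows are made-up sample data, not taken from the question; only the column name is borrowed from it. Selecting one column gives a Series, which has `.str`, while selecting with a list of columns gives a DataFrame, which does not.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"nameCurrentEmployer": ["Ford Foundation", "Acme Corp"]})

# A single column is a Series, so the .str accessor is available.
single = df["nameCurrentEmployer"].str.contains("foundation", case=False)

# Selecting with a list of columns returns a DataFrame, which has no .str.
try:
    df[["nameCurrentEmployer"]].str
    raised = False
except AttributeError:
    raised = True

print(single.tolist(), raised)
```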

So if you have multiple columns, e.g. ["name6thEmployer", "name7thEmployer"] and so on, in your DataFrame called name, the simplest approach is:

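columns = ["name6thEmployer", "name7thEmployer", ...]
for column in columns:
    # for example, if you just want to count them up;
    # case=False also matches "Foundation", na=False treats missing names as non-matches
    matches = name[column].str.contains("foundation", case=False, na=False)
    print(name.loc[matches, column].value_counts())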
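If you would rather avoid the explicit loop, one possible alternative is sketched below. It is only an illustration under assumptions: the sample rows are invented, and the names `mask`, `rows_with_match`, `long`, and `matching` are hypothetical. It applies the Series-level accessor to each column with `DataFrame.apply`, then uses `DataFrame.melt` to flatten all employer columns into one column before filtering.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
name = pd.DataFrame({
    "nameCurrentEmployer": ["Ford Foundation", "Acme Corp", None],
    "name2ndEmployer": ["Globex", "Gates foundation", "Initech"],
})

# Apply .str.contains column by column to build a boolean mask.
mask = name.apply(
    lambda col: col.str.contains("foundation", case=False, na=False)
)

# Rows where at least one employer name matches.
rows_with_match = name[mask.any(axis=1)]

# Flatten every column into a single "employer" column and drop missing names.
long = name.melt(var_name="slot", value_name="employer").dropna(subset=["employer"])

# Keep only the matching employer names and count them.
matching = long[long["employer"].str.contains("foundation", case=False)]
print(matching["employer"].value_counts())
```

The melted form is convenient when you care about the names themselves rather than which of the 10 columns they came from.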

Answer 1 (score: 0)

Try:

foundation_serie=df['name'].str.contains('foundation', regex=True)   
print(df[foundation_serie.values])