Question

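The DataFrame "name" contains the names of each person's first 10 employers.

I want to retrieve all employer names that contain "foundation".

My goal is to get a better understanding of the employer names that contain "foundation".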

Here is my broken code, which raises AttributeError: 'DataFrame' object has no attribute 'str':

name=employ[['nameCurrentEmployer',
       'name2ndEmployer', 'name3thEmployer',
       'name4thEmployer', 'name5thEmployer',
       'name6thEmployer', 'name7thEmployer',
       'name8thEmployer', 'name9thEmployer',
       'name10thEmployer']]
print(name.loc[name.str.contains('foundation', case=False)][['Answer.nameCurrentEmployer',
       'Answer.nameEighthEmployer', 'Answer.nameFifthEmployer',
       'Answer.nameFourthEmployer', 'Answer.nameNinethEmployer',
       'Answer.nameSecondEmployer', 'Answer.nameSeventhEmployer',
       'Answer.nameSixthEmployer', 'Answer.nameTenthEmployer',
       'Answer.nameThirdEmployer']])

Thanks!

Answer 1

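You get AttributeError: 'DataFrame' object has no attribute 'str' because str is an accessor of Series, not of DataFrame.

From the docs:

Series.str can be used to access the values of the series as strings and apply several methods to it. These can be accessed like Series.str.<function/property>.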
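To make the difference concrete, here is a minimal sketch. The sample rows are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Made-up sample data; only the column names come from the question.
name = pd.DataFrame({
    "nameCurrentEmployer": ["Ford Foundation", "Acme Corp", None],
    "name2ndEmployer": ["Initech", "Gates Foundation", "Globex"],
})

# A DataFrame has no .str accessor, so this raises AttributeError.
try:
    name.str.contains("foundation", case=False)
    raised = False
except AttributeError as exc:
    raised = True
    print(exc)

# A single column is a Series, so .str works there.
# na=False treats missing employer names as non-matches.
mask = name["nameCurrentEmployer"].str.contains("foundation", case=False, na=False)
print(mask.tolist())
```

Selecting one column with name["..."] gives a Series, which is why the second call succeeds where the first one fails.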

So if you have multiple columns, such as ["name6thEmployer", "name7thEmployer"] and so on, in a DataFrame called name, the simplest approach is:

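columns = ["name6thEmployer", "name7thEmployer", ...]
for column in columns:
    # for example, if you just want to count them up;
    # case=False also matches "Foundation", na=False skips missing names
    mask = name[column].str.contains("foundation", case=False, na=False)
    print(name.loc[mask, column].value_counts())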
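If you would rather keep whole rows than loop column by column, one possible variation is to apply the same Series-level check to every column and combine the results. This is only a sketch with made-up sample data, and it assumes every column holds strings (a column with nothing but missing values may not support .str at all).

```python
import pandas as pd

# Made-up sample data; only the column names come from the question.
name = pd.DataFrame({
    "nameCurrentEmployer": ["Ford Foundation", "Acme Corp", None],
    "name2ndEmployer": ["Initech", "Gates Foundation", "Globex"],
})

# Apply the Series-level .str.contains to every column in turn.
# The result is a boolean DataFrame with the same shape as `name`.
matches = name.apply(
    lambda col: col.str.contains("foundation", case=False, na=False)
)

# Keep the rows where at least one employer column matched.
rows_with_match = name[matches.any(axis=1)]
print(rows_with_match)
```

Swapping any(axis=1) for all(axis=1) would instead keep only the rows where every employer column matches.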

Answer 2

Try:

foundation_serie=df['name'].str.contains('foundation', regex=True)   
print(df[foundation_serie.values])
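Note that this snippet assumes a single column called 'name'. For the ten-column layout in the question, one way to get a single column is to reshape the frame first. A rough sketch, with made-up sample data and hypothetical column names source_column and employer:

```python
import pandas as pd

# Made-up sample data; only the original column names come from the question.
employ = pd.DataFrame({
    "nameCurrentEmployer": ["Ford Foundation", "Acme Corp", None],
    "name2ndEmployer": ["Initech", "Gates Foundation", "Globex"],
})

# Reshape wide -> long: one row per (original column, employer name).
# "source_column" and "employer" are hypothetical names chosen for this sketch.
long_df = employ.melt(var_name="source_column", value_name="employer")

# Boolean mask on the single long column; na=False drops missing names.
foundation_mask = long_df["employer"].str.contains("foundation", case=False, na=False)
foundation_names = long_df.loc[foundation_mask, "employer"]
print(foundation_names.tolist())
```

From here, foundation_names.value_counts() would give a quick overview of which "foundation" employers appear most often.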

Extracting content from a pandas DataFrame
