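I have a DataFrame of names and a list of dictionaries holding names with a count for each name. I need to create a new column based on whether each name from results appears in a row. The catch is that the match is not exact: it is based on only part of the name. Every solution I have tried so far has been very clunky, so I am hoping this wonderful community can suggest something more elegant.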
import pandas as pd

dic = {"IDs": ['a','b','c','d','e','f','g','k','l','m'],
       "names": ['Ailbhe Yowa',
                 'Hannah Kirst',
                 'Morris Hunt',
                 'Flavia Quor in the UK',
                 'Sarah Smith and Alexandra Libman',
                 'Flavia Morris, Mark Torre, Ann Moor',
                 'Rowena Freez',
                 'Adam Lion in USA',
                 'Mahmood Jade in Europe',
                 'Morris Tool and Francois Lopin']
       }
test = pd.DataFrame(dic)
results = [[{'name': 'Ailbhe', 'count': 17}],
           [{'name': 'Mahmood', 'count': 2818}],
           [{'name': 'Debbie', 'count': 11493}],
           [{'name': 'Arthur', 'count': 20587}],
           [{'name': 'Clive', 'count': 2703}],
           [{'name': 'Flavia', 'count': 10166}],
           [{'name': 'Alexandra', 'count': 1939}],
           [{'name': 'Sarah', 'count': 88388}],
           [{'name': 'Morris', 'count': 3194}],
           [{'name': 'Cameron', 'count': 3334}]]
The desired output should look like this:
   IDs  names                                results
0  a    Ailbhe Yowa                          [{'name': 'Ailbhe', 'count': 17}]
1  b    Hannah Kirst
2  c    Morris Hunt                          [{'name': 'Morris', 'count': 3194}]
3  d    Flavia Quor in the UK                [{'name': 'Flavia', 'count': 10166}]
4  e    Sarah Smith and Alexandra Libman     [{'name': 'Sarah', 'count': 88388}, {'name': 'Alexandra', 'count': 1939}]
5  f    Flavia Morris, Mark Torre, Ann Moor  [{'name': 'Flavia', 'count': 10166}]
6  g    Rowena Freez
7  k    Adam Lion in USA
8  l    Mahmood Jade in Europe               [{'name': 'Mahmood', 'count': 2818}]
9  m    Morris Tool and Francois Lopin       [{'name': 'Morris', 'count': 3194}]
Answer 0: (score: 1)
One way, using pandas.Series.str.findall:
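import re
# Map each name to its dict so regex hits can be looked up directly.
name_dict = {l[0]["name"]: l[0] for l in results}
# Alternation of all names; re.escape guards against regex metacharacters.
reg = "(%s)" % "|".join(map(re.escape, name_dict))
# findall returns every hit per row; rows with no hit get an empty list.
test["results"] = test["names"].str.findall(reg).apply(lambda x: [name_dict[i] for i in x])
print(test)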
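The pattern above matches a name anywhere in the string, so "Morris" would also hit inside a longer word such as "Morrison". If whole-word matching is wanted, a variant along these lines might help. It is only a sketch: match_names is a hypothetical helper, not part of pandas, and the data is a shortened copy of the question's results.

```python
import re

# Shortened copy of the question's `results` list.
results = [[{'name': 'Flavia', 'count': 10166}],
           [{'name': 'Sarah', 'count': 88388}],
           [{'name': 'Alexandra', 'count': 1939}],
           [{'name': 'Morris', 'count': 3194}]]

name_dict = {group[0]["name"]: group[0] for group in results}

# \b anchors make each name match only as a whole word;
# (?:...) groups the alternation without capturing.
pattern = re.compile(r"\b(?:%s)\b" % "|".join(map(re.escape, name_dict)))

def match_names(text):
    """Hypothetical helper: dicts for every listed name found in `text`."""
    seen = []
    for name in pattern.findall(text):
        if name not in seen:  # drop repeats, keep first-seen order
            seen.append(name)
    return [name_dict[name] for name in seen]
```

With the question's DataFrame this could be applied as test["results"] = test["names"].map(match_names). Rows without a match still receive an empty list rather than a blank cell.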
Answer 1: (score: 0)
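resultsToAdd = []
for index, row in test.iterrows():
    # Scan the result groups and stop at the first one whose name
    # occurs as a substring of this row's names.
    for result_row in results:
        isFound = False
        for result in result_row:
            if result['name'] in row['names']:
                isFound = True
                break
        if isFound:
            break
    # Only the first matching group is kept; rows without a match get " ".
    if isFound:
        resultsToAdd.append(result_row)
    else:
        resultsToAdd.append(" ")
test["results"] = resultsToAdd
print(test)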
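The loop above stops at the first hit, so a row such as 'Sarah Smith and Alexandra Libman' ends up with a single entry. If every match should be kept, the same substring test can be written as one comprehension. This is a sketch only: all_matches is a hypothetical helper, the data is a shortened copy of the question's, and the matches come back in the order of results rather than in the order they appear in the text.

```python
def all_matches(text, results):
    """Hypothetical helper: every dict in `results` whose name is a substring of `text`."""
    return [entry for group in results for entry in group
            if entry["name"] in text]

# Shortened copy of the question's data.
results = [[{'name': 'Alexandra', 'count': 1939}],
           [{'name': 'Sarah', 'count': 88388}],
           [{'name': 'Morris', 'count': 3194}]]
names = ['Hannah Kirst', 'Morris Hunt', 'Sarah Smith and Alexandra Libman']

# One list of matches per name; an empty list means no match.
column = [all_matches(name, results) for name in names]
print(column)
```

On the question's DataFrame the equivalent assignment would be test["results"] = [all_matches(n, results) for n in test["names"]].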