I need to get the value names from the index. My dataset is as follows:
import pandas as pd
try_test = pd.DataFrame({'word': ['apple', 'orange', 'diet', 'energy', 'fire', 'cake'],
                         'name': ['dog', 'cat', 'mad cat', 'good dog', 'bad dog', 'chicken']})
     word      name
0   apple       dog
1  orange       cat
2    diet   mad cat
3  energy  good dog
4    fire   bad dog
5    cake   chicken
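Using this function:
from fuzzywuzzy import fuzz  # assumption: the question does not show where fuzz is imported from
def func(name):
    # True for every row whose name has a partial-ratio score of at least 85 against the given name
    matches = try_test.apply(lambda row: (fuzz.partial_ratio(row['name'], name) >= 85), axis=1)
    return [i for i, x in enumerate(matches) if x]
try_test.apply(lambda row: func(row['name']), axis=1)
I get the following values: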
0 [0, 3, 4]
1 [1, 2]
2 [1, 2]
3 [0, 3]
4 [0, 4]
5 [5]
I would like to get the values of the word column instead of the indices.
Expected output:
0 [apple, energy, fire]
1 [orange, diet]
2 [orange, diet]
3 [apple, energy]
4 [apple, fire]
5 [cake]
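Any suggestions would be appreciated.
Answer 0 (score: 0)
Once you have the indices, simply indexing the DataFrame again with them should solve your problem. In my opinion, you can do this either inside the function or outside it: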
In [2]: import pandas as pd
In [3]: try_test = pd.DataFrame({'word': ['apple', 'orange', 'diet', 'energy', 'fire', 'cake'],
   ...:                          'name': ['dog', 'cat', 'mad cat', 'good dog', 'bad dog', 'chicken']})
In [4]: try_test
Out[4]:
     word      name
0   apple       dog
1  orange       cat
2    diet   mad cat
3  energy  good dog
4    fire   bad dog
5    cake   chicken
In [5]: rows = [0,3,4]
In [6]: try_test.loc[rows, 'word']
Out[6]:
0     apple
3    energy
4      fire
Name: word, dtype: object
In [7]: try_test.loc[rows, 'word'].values.tolist()
Out[7]: ['apple', 'energy', 'fire']
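To apply the same lookup to every row "outside the function", something like the following might work. This is only a sketch: the `index_lists` Series is typed in by hand from the output shown in the question (in practice it would be the result of the original `apply` call), and the names `index_lists` and `words` are made up for illustration.

```python
import pandas as pd

try_test = pd.DataFrame({'word': ['apple', 'orange', 'diet', 'energy', 'fire', 'cake'],
                         'name': ['dog', 'cat', 'mad cat', 'good dog', 'bad dog', 'chicken']})

# Assumption: index lists copied from the question's output, standing in for
# the result of try_test.apply(lambda row: func(row['name']), axis=1)
index_lists = pd.Series([[0, 3, 4], [1, 2], [1, 2], [0, 3], [0, 4], [5]])

# For each list of row labels, select the matching 'word' values as a plain list
words = index_lists.apply(lambda rows: try_test.loc[rows, 'word'].tolist())
print(words)
```

With the data from the question, this should reproduce the expected output listed there.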
Answer 1 (score: 0)
In the function, change i to try_test.word[i]:
def func(name):
    matches = try_test.apply(lambda row: (fuzz.partial_ratio(row['name'], name) >= 85), axis=1)
    # Look up the word for each matching row instead of returning its position
    return [try_test.word[i] for i, x in enumerate(matches) if x]
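One caveat: `try_test.word[i]` looks rows up by index label, while `enumerate` yields positions, so the two only line up when the DataFrame has the default 0..n-1 index. A possible alternative is to use the boolean result directly as a mask. The sketch below is not from the answers above: `matching_words` and `contains_score` are hypothetical names, and `contains_score` is a crude stand-in scorer so the example runs without a fuzzy-matching library. In real use you would presumably pass `fuzz.partial_ratio` as the `scorer`.

```python
import pandas as pd

try_test = pd.DataFrame({'word': ['apple', 'orange', 'diet', 'energy', 'fire', 'cake'],
                         'name': ['dog', 'cat', 'mad cat', 'good dog', 'bad dog', 'chicken']})

def contains_score(a, b):
    """Hypothetical stand-in scorer: 100 if the shorter string occurs in the longer one, else 0."""
    shorter, longer = sorted((a, b), key=len)
    return 100 if shorter in longer else 0

def matching_words(name, scorer=contains_score, threshold=85):
    # Boolean Series aligned with try_test's index: True where the score reaches the threshold
    mask = try_test['name'].apply(lambda other: scorer(other, name) >= threshold)
    # Boolean indexing selects rows by alignment, so it does not rely on 0..n-1 labels
    return try_test.loc[mask, 'word'].tolist()

result = try_test['name'].apply(matching_words)
print(result)
```

With the stand-in scorer and the sample data, this gives the same word lists as the expected output in the question. A real fuzzy scorer may of course match differently on other data.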