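Given the input above, how can I produce the following output? I want to create a file of words and their scores by removing duplicate words and putting each word on its own line. When a word appears more than once, keep the occurrence with the higher score.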
Score SynsetTerms
1 soft#18 mild#3 balmy#2
1 love#2 enjoy#3
0.625 love#1
Note that "love" with a score of 0.625 is removed: a higher score exists, so only "love" with a score of 1 is kept.
Answer 0 (score: 0)
import re

best = {}  # word -> highest score seen so far
with open('C:/Users/10648879/Documents/python_prog/data/test.csv', 'r') as fhand:
    next(fhand, None)  # skip the header line
    for line in fhand:
        # Drop the "#<sense number>" suffixes, then split on whitespace.
        fields = re.sub(r'#[0-9]*', '', line).split()
        if not fields:
            continue  # skip blank lines
        # Compare scores as numbers; as strings, '10' would sort below '9'.
        score = float(fields[0])
        for word in fields[1:]:
            # Keep only the highest score for each word.
            if word not in best or score > best[word]:
                best[word] = score

print('Score SynsetTerms')
for word, score in best.items():
    print('%g %s' % (score, word))
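To check the logic without the CSV file, here is a minimal sketch of the same idea as two small functions, run on the sample rows from the question. It assumes those rows are what the input looks like (a score followed by whitespace-separated `term#n` entries). The names `best_scores` and `format_rows` are made up for illustration and are not part of the original code.

```python
import re


def best_scores(lines):
    """Map each word to the highest score it appears with."""
    best = {}
    for line in lines:
        # Strip the "#<sense number>" suffixes, then split on whitespace.
        fields = re.sub(r'#\d+', '', line).split()
        if len(fields) < 2:
            continue  # blank line, or a score with no terms
        try:
            score = float(fields[0])
        except ValueError:
            continue  # non-numeric first field, e.g. the header line
        for word in fields[1:]:
            if word not in best or score > best[word]:
                best[word] = score
    return best


def format_rows(best):
    """Return a header row followed by one 'score word' row per word."""
    rows = ['Score SynsetTerms']
    rows += ['%g %s' % (score, word) for word, score in best.items()]
    return rows


# Sample rows copied from the question (assumed input format).
sample = [
    "Score SynsetTerms",
    "1 soft#18 mild#3 balmy#2",
    "1 love#2 enjoy#3",
    "0.625 love#1",
]
result = best_scores(sample)
print('\n'.join(format_rows(result)))
```

Since the question asks for a file rather than printed output, the rows from `format_rows` could be written out with something like `open(path, 'w').write('\n'.join(rows) + '\n')`, where `path` is whatever output location you choose.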