I have an array of words, stored as key-value pairs. Now I am trying to count the frequency of words in an array of strings, tokens. My attempt does not find the index of x, because x is only a string, and I do not have the corresponding value, if any, of x in the tokens array. Is there any way to access it directly, rather than adding one more loop to find it first?
Answer 0 (score: 3)
To count the frequency of words in an array of strings, you can use Counter from collections:
In [89]: from collections import Counter
In [90]: s=r'So I have an array of words, stored as key value pairs. Now I am trying to count the frequency of words in an array of strings, tokens. I have tried the following but this doesnt find the index of x as it is only a string. I do not have the corresponding value, if any, of x in tokens array. Is there any way to directly access it rather than adding one more loop to find it first?'
In [91]: tokens=s.split()
In [92]: c=Counter(tokens)
In [93]: print(c)
Counter({'of': 5, 'I': 4, 'the': 4, 'it': 3, 'have': 3, 'to': 3, 'an': 2, 'as': 2, 'in': 2, 'array': 2, 'find': 2, 'x': 2, 'value,': 1, 'words': 1, 'do': 1, 'there': 1, 'is': 1, 'am': 1, 'frequency': 1, 'if': 1, 'string.': 1, 'index': 1, 'one': 1, 'directly': 1, 'tokens.': 1, 'any': 1, 'access': 1, 'only': 1, 'array.': 1, 'way': 1, 'doesnt': 1, 'Now': 1, 'words,': 1, 'more': 1, 'a': 1, 'corresponding': 1, 'tried': 1, 'than': 1, 'adding': 1, 'strings,': 1, 'but': 1, 'tokens': 1, 'So': 1, 'key': 1, 'first?': 1, 'not': 1, 'trying': 1, 'pairs.': 1, 'count': 1, 'this': 1, 'Is': 1, 'value': 1, 'rather': 1, 'any,': 1, 'stored': 1, 'following': 1, 'loop': 1})
In [94]: c['of']
Out[94]: 5
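As a minimal sketch of the direct lookup the question asks about (the sample sentence and variable names below are hypothetical, chosen only for illustration): a Counter behaves like a dict, so a count can be read by key without a search loop, and a word that never occurred simply reads as 0.

```python
from collections import Counter

# Hypothetical sample sentence, used only for illustration.
tokens = "the cat and the dog and the bird".split()
counts = Counter(tokens)

the_count = counts["the"]        # direct lookup by key, no extra loop
missing_count = counts["fish"]   # absent words read as 0 instead of raising KeyError
top_two = counts.most_common(2)  # the two most frequent words with their counts

print(the_count, missing_count, top_two)
```

Reading a missing key does not add it to the Counter, so lookups of unseen words leave the counts unchanged.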
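If there is an outer loop and tokens changes on every iteration, counting the words manually, as @Alexander does, is a good approach. In addition, Counter supports the + operator, which makes accumulating counts easier: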
In [30]: (c+c)['of']
Out[30]: 10
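The accumulation idea can be sketched as follows, assuming the outer loop yields a fresh token list on each pass. The batches list here is a hypothetical stand-in for that loop; Counter.update adds counts in place, and summing with + gives the same totals.

```python
from collections import Counter

# Hypothetical batches standing in for an outer loop where `tokens` changes.
batches = [
    "a b a".split(),
    "b c".split(),
    "a c c".split(),
]

# Option 1: keep one running Counter and update it in place.
total = Counter()
for tokens in batches:
    total.update(tokens)

# Option 2: build one Counter per batch and add them with +.
summed = sum((Counter(tokens) for tokens in batches), Counter())

print(total["a"], total["b"], total["c"])
```

Updating in place avoids creating a new Counter on every iteration, which may matter when there are many batches.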
Answer 1 (score: 3)
You definitely want to use Counter, as @zhangzaochen suggested.
However, here is a more efficient way to write your own loop:
words = {}
for x in tokens:
    if x in words:
        words[x] += 1
    else:
        words[x] = 1
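The same manual count can be written without the explicit if/else. This is only a sketch of two common standard-library alternatives, not the answerer's code; the sample sentence is an assumption for illustration.

```python
from collections import defaultdict

# Sample tokens, assumed here only for illustration.
tokens = "I wish I went".split()

# Variant 1: defaultdict(int) creates missing keys with a count of 0.
words = defaultdict(int)
for x in tokens:
    words[x] += 1

# Variant 2: a plain dict, using get() to supply 0 for unseen words.
words_get = {}
for x in tokens:
    words_get[x] = words_get.get(x, 0) + 1

print(dict(words), words_get)
```

Both variants perform a single dictionary access pattern per token and produce the same counts as the if/else loop.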
You can also use a list comprehension:
tokens = "I wish I went".split()
words = {}
_ = [words.update({word: 1 if word not in words else words[word] + 1})
     for word in tokens]
>>> words
{'I': 2, 'went': 1, 'wish': 1}