I have a list of strings, as follows:
string_list=['philadelphia court excessive disappointed court hope','hope jurisdiction obscures acquittal court','mention hope maryland signal held problem internal reform life bolster level grievance']
and a list of words, such as:
words=['hope','court','mention','maryland']
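Now I want to count how many times each word from the word list occurs in each string, and store the result in a separate dictionary. Each key should be 'doc_(index)' and each value a nested dictionary that maps every word that occurs to its count. The expected output is: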
words_dict={'doc_1':{'court':2,'hope':1},'doc_2':{'court':1,'hope':1},'doc_3':{'mention':1,'hope':1,'maryland':1}}
The first thing I did was:
docs_dict = {}
count = 0
for i in string_list:
    count += 1
    docs_dict['doc_' + str(count)] = i
print(docs_dict)
{'doc_1': 'philadelphia court excessive disappointed court hope', 'doc_2': 'hope jurisdiction obscures acquittal court', 'doc_3': 'mention hope maryland signal held problem internal reform life bolster level grievance'}
After this, I don't know how to get the word counts. This is what I have tried so far:
docs = {}
for k, v in docs_dict.items():
    split_words = v.split()
    for i in words:
        if i in split_words:
            docs[k][i] += 1
        else:
            docs[k][i] = 0
Answer 0: (score: 1)
You can use the string method count in Python to count how many times a word occurs in a sentence.
Check this code:
words_dict = {}
string_list = ['philadelphia court excessive disappointed court hope','hope jurisdiction obscures acquittal court','mention hope maryland signal held problem internal reform life bolster level grievance']
words_list = ['hope','court','mention','maryland']
for i in range(len(string_list)):  # iterate over string list
    helper = {}  # temporary dictionary
    for word in words_list:  # iterate over word list
        x = string_list[i].count(word)  # count no. of occurrences of word in sentence
        if x > 0:
            helper[word] = x
    words_dict["doc_" + str(i + 1)] = helper  # add temporary dictionary into final dictionary
# Print dictionary contents
for i in words_dict:
    print(i + ": " + str(words_dict[i]))
The output of the above code is:
doc_3: {'maryland': 1, 'mention': 1, 'hope': 1}
doc_2: {'court': 1, 'hope': 1}
doc_1: {'court': 2, 'hope': 1}
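One caveat about the count approach: str.count matches substrings, not whole words, so a listed word that is embedded in a longer word is counted too. The sketch below uses a made-up sentence (not from the question) to show the difference between counting on the raw string and counting on the split tokens; the variable names are only for illustration.

```python
# Illustrative sentence (an assumption, not taken from the question) in which
# the target words also appear inside longer words.
sentence = 'courtroom hopeful court hope'
words_list = ['hope', 'court']

# str.count counts substring matches, so 'courtroom' also counts for 'court'.
substring_counts = {w: sentence.count(w) for w in words_list}

# Splitting first and using list.count compares whole tokens only.
tokens = sentence.split()
token_counts = {w: tokens.count(w) for w in words_list}

print(substring_counts)  # {'hope': 2, 'court': 2}
print(token_counts)      # {'hope': 1, 'court': 1}
```

For the data in the question both variants give the same result, because none of the listed words appears inside a longer word there.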
Answer 1: (score: 0)
Use Counter to get the word counts in each document.
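As a rough sketch of what a Counter-based approach could look like (an illustration under my own assumptions, not necessarily this answer's exact code): the helper name count_listed_words is hypothetical, and it assumes words are separated by whitespace and matched case-sensitively.

```python
from collections import Counter


def count_listed_words(string_list, words):
    """Hypothetical helper: per-document counts of the listed words only."""
    wanted = set(words)  # set membership is cheaper than scanning the list
    result = {}
    # start=1 so that the keys are doc_1, doc_2, ... as in the question
    for index, text in enumerate(string_list, start=1):
        counts = Counter(text.split())  # count every whitespace-separated token
        result[f'doc_{index}'] = {w: c for w, c in counts.items() if w in wanted}
    return result


string_list = ['philadelphia court excessive disappointed court hope',
               'hope jurisdiction obscures acquittal court',
               'mention hope maryland signal held problem internal reform life bolster level grievance']
words = ['hope', 'court', 'mention', 'maryland']

words_dict = count_listed_words(string_list, words)
print(words_dict)
```

Words that do not occur in a document are simply left out of its nested dictionary, which matches the expected output in the question.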
Answer 2: (score: 0)
The question here seems to cover this.
Below is my attempt at the code you need.
from collections import Counter
string_list=['philadelphia court excessive disappointed court hope','hope jurisdiction obscures acquittal court','mention hope maryland signal held problem internal reform life bolster level grievance']
words=['hope','court','mention','maryland']
result_dict = {}
for index, value in enumerate(string_list):
    string_split = value.split(" ")
    cntr = Counter(string_split)
    # keep only the listed words that actually occur in this document
    result = {key: cntr[key] for key in words if key in cntr}
    # keys are 1-based ('doc_1', 'doc_2', ...) to match the expected output
    result_dict['doc_' + str(index + 1)] = result
Hope you find it useful.
Answer 3: (score: 0)
Try this:
from collections import Counter
string_list = ['philadelphia court excessive disappointed court hope',
               'hope jurisdiction obscures acquittal court',
               'mention hope maryland signal held problem internal reform life bolster level grievance']
words = ['hope', 'court', 'mention', 'maryland']
result = {
    f'doc_{i + 1}': {key: value
                     for key, value in Counter(string_list[i].split()).items()
                     if key in words}
    for i in range(len(string_list))
}
print(result)
Output:
{'doc_1': {'court': 2, 'hope': 1}, 'doc_2': {'hope': 1, 'court': 1}, 'doc_3': {'mention': 1, 'hope': 1, 'maryland': 1}}