Need to convert a dictionary into a 2-column DataFrame
Here is the code I have so far:
keywords = ["big", "hat", "dress", "fabric", "color"]
def keyword(value):
    # Start every keyword at zero, then count whitespace-separated matches
    keyword_counts = {key: 0 for key in keywords}
    strings = value.split()
    for word in strings:
        if word in keyword_counts:
            keyword_counts[word] += 1
    return keyword_counts
key_words_mo
result = keyword(key_words_mo)
print(result)
{'big': 0, 'hat': 0, 'dress': 26, 'fabric': 13, 'color': 9}
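Here is my problem: I need the df below to show the correct values for the keywords, but they all show zero. For example, "dress" should show 26 instead of 0, and "fabric" should show 13 instead of 0. The two columns should be named "keyword_term" and "quantity".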
import pandas as pd
from ast import literal_eval
df = pd.DataFrame.from_dict(result, orient='index')
df
0
big 0
hat 0
dress 0
fabric 0
color 0
while 0
Answer 0 (score: 0)
Try this:
import pandas as pd
d = {'big': 0, 'hat': 0, 'dress': 26, 'fabric': 13, 'color': 9}
df = pd.DataFrame(list(d.items()), columns=['keyword_term', 'quantity'])
This should give you what you want.
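If you would rather start from the dictionary as an index-keyed object, as the question's `from_dict(..., orient='index')` attempt does, one possible alternative is to go through a `Series` and then reset the index. This is only a sketch: `result` is hard-coded here with the values printed in the question, and it does not explain why the original DataFrame showed zeros.

```python
import pandas as pd

# Assumed input: the dictionary printed in the question
result = {'big': 0, 'hat': 0, 'dress': 26, 'fabric': 13, 'color': 9}

df = (
    pd.Series(result, name='quantity')   # values become the 'quantity' column
      .rename_axis('keyword_term')       # name the index after the keys
      .reset_index()                     # turn the index into a regular column
)
print(df)
```

Both approaches should yield the same two columns; the `Series` route simply avoids building the intermediate list of tuples.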
Answer 1 (score: 0)
You can use the list .count method to count how often each keyword occurs in the text:
import pandas as pd
def create_df(text, keywords):
    words = text.split()
    # One count per keyword, in the same order as the keywords list
    count = [words.count(key) for key in keywords]
    d = {'keyword_term': keywords, 'quantity': count}
    return pd.DataFrame.from_dict(d)
txt = "I was big and had a hat that dress dress fabric and not"
keywords = ["big", "hat", "dress", "fabric", "color"]
df = create_df(txt, keywords)
print(df)
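Calling `words.count` once per keyword rescans the word list each time. For longer texts, a variant that tallies all words in a single pass with `collections.Counter` may be worth considering. The function name `create_df_counter` below is hypothetical and not part of the answer above; treat this as a sketch under the same whitespace-splitting assumption.

```python
from collections import Counter

import pandas as pd

def create_df_counter(text, keywords):
    # Count every whitespace-separated word once
    counts = Counter(text.split())
    # A Counter returns 0 for missing keys, so absent keywords get quantity 0
    return pd.DataFrame({
        'keyword_term': keywords,
        'quantity': [counts[key] for key in keywords],
    })

txt = "I was big and had a hat that dress dress fabric and not"
keywords = ["big", "hat", "dress", "fabric", "color"]
df_counter = create_df_counter(txt, keywords)
print(df_counter)
```

Like the original, this matches exact tokens only, so "dress," with trailing punctuation or "Dress" with a capital letter would not be counted.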