I have a question about data frames in Python.
First, this is the data frame:
index id keyword
0 @aaa_24 @bbb_2 ["@max", "@travel", ... ,"@food"]
1 @aa_1 @c_5 ["@animal", "@weather", ... ,"@coco"]
2 @ab_7 @ba_3 ...
3 @ccc_1 ...
... ... ...
And I want to convert it like this:
index id keyword
0 @aaa @bbb unique.keyword() --> (count value)
1 @aa @c unique.keyword() --> (count value)
2 @ab @ba unique.keyword() --> (count value)
3 @ccc unique.keyword() --> (count value)
... ...
Please take a look at this question.
Answer 0 (score: 0)
I'm not sure exactly what the goal is here. As I understand it, you want the frequency of each keyword in the "keyword" column:
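def f(x):
    # Count how many times each keyword occurs in one row's list.
    dic = {}
    for val in x:
        if val not in dic:
            dic[val] = 0
        dic[val] += 1
    # Return the counts as a list of (keyword, count) tuples.
    ans = []
    for key in dic.keys():
        ans.append((key, dic[key]))
    return ans

# The column is named 'keyword' in the question's data frame.
df['new_keyword'] = df['keyword'].apply(f)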
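The same counting can probably be written more compactly with collections.Counter from the standard library. The sketch below is only one reading of the question: it assumes each cell of the keyword column holds a Python list, and that the id conversion means dropping the trailing "_<number>" part (for example "@aaa_24" becomes "@aaa"). The names strip_suffix and keyword_counts, and the sample values, are made up for illustration.

```python
import re
from collections import Counter

def strip_suffix(tag):
    """Drop a trailing _<digits> suffix, e.g. '@aaa_24' -> '@aaa'."""
    return re.sub(r"_\d+$", "", tag)

def keyword_counts(keywords):
    """Return (keyword, count) tuples, most frequent first."""
    return Counter(keywords).most_common()

# Hypothetical sample row; the real data is elided in the question.
sample_ids = ["@aaa_24", "@bbb_2"]
sample_keywords = ["@max", "@travel", "@max", "@food"]

clean_ids = [strip_suffix(t) for t in sample_ids]
counts = keyword_counts(sample_keywords)

print(clean_ids)  # ['@aaa', '@bbb']
print(counts)     # [('@max', 2), ('@travel', 1), ('@food', 1)]
```

With a data frame, keyword_counts could be passed to apply in the same way as f above, for example df['keyword'].apply(keyword_counts), again assuming the cells are lists rather than strings.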
Let me know if I have misunderstood your question.