Question

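When a string contains one of a few specific words, I want to build a vector that counts the other words in that string and adds those counts to the vector.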

Below is an example of what I want.

word_list = ['a','b']  # this list holds what I call the 'specific words'

If any of the words above is found in one of the inner lists below, that inner list is one to extract.

[
 ['a', 'b', 'c'],
 ['b', 'c', 'b'],
 ['r', 'b', 'h'],
 ['q', 'w', 'r'],
 ['j', 'a', 'd'],
 ['b', 'd', 'a']
]

And this is the result I want.

word |  a  |  b
-----------------
  a  |  0  |  2
  b  |  2  |  0
  c  |  0  |  2
  d  |  2  |  0
  h  |  0  |  1
  j  |  1  |  0
  r  |  0  |  1

I tried writing some code, but my skills are not good enough to handle all of the data.

Below is the code I tried...

import pandas as pd
from konlpy.tag import Kkma
import numpy as np

test = pd.DataFrame(['a b c','b c b','r b h','q w r','j a d','b d a'],columns = ['txt'])
operater = Kkma()  # tokenizer instance; the loop below calls its morphs() method
test_vec = []

for i in range(len(test)):
    test_vec.append(operater.morphs(test['txt'][i]))

ext = ['a','b']
word = ['word']                         
result = pd.DataFrame([],columns = word + ext)
locate = 0

for i in range(len(test_vec)):
    for j in range(len(ext)):
        print('step0')
        if ext[j] in test_vec[i]:
            print('step1')
            for k in range(len(test_vec[i])):
                if test_vec[i][k] != ext[j]:
                    print('step2')
                    result.loc[locate] = np.nan
                    if np.size(np.where(result['word'] == result[ext[j]].loc[locate])) == 0: 
                        result[ext[j]].loc[locate] = 1
                        result['word'].loc[locate] = test_vec[i][k] 
                    else:
                        result[ext[j]].loc[locate] = result[ext[j]].loc[locate] + 1
                    locate = locate + 1

If you know a fast and efficient solution, please let me know.
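
One possible reading of the request can be sketched with the standard library alone: for every inner list that contains a specific word, count each other token in that list once per occurrence. This is only a sketch under assumptions. The helper name `cooccurrence_counts` is made up for illustration, and the texts are split on whitespace instead of being passed through Kkma. Because the counts are computed mechanically from the six example lists, a few cells may not match the hand-written table above.

```python
from collections import Counter


def cooccurrence_counts(sentences, targets):
    """For each target word, count the other tokens in every sentence that contains it.

    Hypothetical helper, not part of any library.
    """
    counts = {target: Counter() for target in targets}
    for tokens in sentences:
        present = set(tokens)  # set lookup keeps the membership test cheap
        for target in targets:
            if target in present:
                # Count every token except the target itself, once per occurrence.
                counts[target].update(t for t in tokens if t != target)
    return counts


texts = ['a b c', 'b c b', 'r b h', 'q w r', 'j a d', 'b d a']
# Assumption: plain whitespace tokenization stands in for the Kkma tokenizer.
sentences = [text.split() for text in texts]
word_list = ['a', 'b']

counts = cooccurrence_counts(sentences, word_list)

# Row labels: every word that co-occurred with at least one target, sorted.
rows = sorted(set().union(*counts.values()))

print('word | ' + ' | '.join(word_list))
for word in rows:
    # A Counter returns 0 for a missing key, so empty cells need no special case.
    print(f'{word:>4} | ' + ' | '.join(str(counts[t][word]) for t in word_list))
```

Each sentence is visited once, so the work grows with the total number of tokens times the number of specific words, with no row-by-row DataFrame updates. If a DataFrame is needed at the end, passing the dictionary to `pd.DataFrame(counts)` and then calling `.fillna(0).astype(int)` should give a comparable table.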

How to compute a count vector - Python

0 answers: