Question

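I have a list and a dataframe column. I want to check the list against the dataframe column and create a new dataframe column where each row holds the list values contained in the corresponding row of the original column. The dataframe column is texts_excerpts, and the list holds the tokens I want to find and keep track of. Any ideas?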

Answer 1

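@acodejdatam Assuming you need to count how many of the words in the list appear in the text_excerpt column, you can try the code below. If it still doesn't solve your problem, please provide an example so that we can help you better.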

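  import pandas as pd

  # sample dataframe (df)
  df = pd.DataFrame({'index': [1, 2, 3],
                     'text': ['I am A', 'My name is', 'Who are you']})

  # sample list (l)
  l = ['My', 'is', 'are']

  def find_match(series, l):
      # split the row's text into words, then keep the list entries found among them
      words = series['text'].split()
      found_words = []
      for word in l:
          if word in words:
              found_words.append(word)
      return found_words

  # axis=1 passes each row to find_match as a Series
  df['words_contained'] = df.apply(find_match, args=(l,), axis=1)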

The sample code above changes df to the following:

Out[16]: 
          index         text      words_contained
          0      1       I am A      []
          1      2   My name is      [My, is]
          2      3  Who are you      [are]
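
If the number of matches is what is actually needed, it can be derived from the list column. The following is a minimal sketch under that assumption; the `count` column name is only illustrative, and `find_match` is rewritten here to take the text string directly instead of the whole row.

```python
import pandas as pd

df = pd.DataFrame({"index": [1, 2, 3],
                   "text": ["I am A", "My name is", "Who are you"]})
l = ["My", "is", "are"]

def find_match(text, tokens):
    # keep the tokens that occur as whole words in the text
    words = text.split()
    return [tok for tok in tokens if tok in words]

# apply to the single column, so each call receives one string
df["words_contained"] = df["text"].apply(find_match, args=(l,))

# illustrative extra column: how many tokens were found in each row
df["count"] = df["words_contained"].apply(len)
```

Because the text is split on whitespace, matching is case-sensitive and whole-word only; punctuation attached to a word would prevent a match.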

Answer 2

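Edit to the original question:

This is exactly what I want to do, except that I want the column to list the actual words found in the series column, as in the example below: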

      index         text      words_contained
      0      1       I am A      ['I']
      1      2   My name is      ['My', 'name']
      2      3  Who are you      ['are', 'you']
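
One way to get a column shaped like this is to walk the words of each text and keep those that are in the token collection, so the result follows the order of the text. This is only a sketch: the `tokens` set below is hypothetical, chosen so that it reproduces the example rows, and `words_in_text_order` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({"index": [1, 2, 3],
                   "text": ["I am A", "My name is", "Who are you"]})

# hypothetical token collection; a set gives fast membership checks
tokens = {"I", "My", "name", "are", "you"}

def words_in_text_order(text, tokens):
    # iterate over the text's words so matches keep their order in the text
    return [word for word in text.split() if word in tokens]

df["words_contained"] = df["text"].apply(words_in_text_order, args=(tokens,))
```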

Answer 3

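Another update to the question:

What if, instead of the list l = ['My', 'is', 'are'], we had a dictionary? mydict = {'My': -21, 'is': -12, 'is': 1}. How would you do something similar to the above, but add up the dictionary values so that each row gets a "score" based on the words it contains? I'd like to attach a weight (the value in the dictionary) to each key (the word in the dictionary).

I'm trying something like this: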

  def find_match(series, mydict):
      words = series['text'].split()
      found_words = []
      for word in mydict.keys():
          if word in words:
              found_words.append(mydict.value().sum)
      return found_words

  df['words_contained'] = df.apply(find_match, args=(l,), axis=1)

I keep getting the error: AttributeError: ("'list' object has no attribute 'keys'", 'occurred at index 0')

Thanks so much for all the help so far. It has been really useful. :)
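
The traceback is consistent with the list `l` still being passed through `args` where the function expects the dictionary. Separately, dictionaries have `values()` rather than `value()`. A minimal sketch of a per-row score follows, assuming each dictionary word should count once per row. The weights are hypothetical: the third key is assumed to be 'are' purely for illustration, since a dict literal that repeats 'is' keeps only the last value. `score_text` and the `score` column are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({"index": [1, 2, 3],
                   "text": ["I am A", "My name is", "Who are you"]})

# hypothetical weights; the 'are' key is an assumption, not from the question
mydict = {"My": -21, "is": -12, "are": 1}

def score_text(text, weights):
    # sum the weight of every dictionary word that occurs in the text
    words = text.split()
    return sum(weight for word, weight in weights.items() if word in words)

# pass the dictionary, not the list, as the extra argument
df["score"] = df["text"].apply(score_text, args=(mydict,))
```

With these assumed weights, a row containing both 'My' and 'is' scores -33, and a row with no dictionary words scores 0.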

New pandas column holding the list values found in an existing column
