Question

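I want to use GCP's sentiment analysis to evaluate a list of texts. For this I have already written all the modules that prepare the texts for sentiment analysis (cleaning, splitting, etc.).

As a result, I now have a Pandas DataFrame in which each text has its own row, and each row has a cell containing, for every sentence in the text, a list of its words.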

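For example, the text "I like stackoverflow. It is very helpful." looks like this (for simplicity I have left out all of the cleaning):

I like stackoverflow. It is very helpful. | [["I", "like", "stackoverflow"], ["It", "is", "very", "helpful"]]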

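The goal of my code is now to iterate over every row in the DataFrame and send one request to GCP per text (the sentences are simply joined with ". " and the words with spaces). The texts are already split into sentences and words here because I need them in that form; the split comes from the earlier preprocessing step.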
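As a small illustration of that joining step, here is a sketch in plain Python. The helper name `build_text` is made up for this example, and the nested list mirrors the example row above.

```python
def build_text(sentences):
    """Join the tokens of each sentence with spaces and the sentences with '. '."""
    return '. '.join(' '.join(tokens) for tokens in sentences)


# One cell of the DataFrame: a list of sentences, each a list of tokens
processed = [["I", "like", "stackoverflow"], ["It", "is", "very", "helpful"]]
text = build_text(processed)
print(text)  # I like stackoverflow. It is very helpful
```

Note that the joined string has no final period, because the separator is only placed between sentences.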

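It already works with synchronous processing, but since I have never really used asyncio before, I am missing how to design the nested loops so that not only the sentences of a single row are queried in parallel, but all rows.

I am using Python 3.7 and have working access to GCP. The synchronous code below is fully functional, but I would like to reduce the runtime by making it asynchronous.

I have already tried wrapping the code from the last loop in a method and calling it asynchronously. Of course, that only made the processing of the individual sentences within a row asynchronous. I also experimented with async, but I could not pass a list as an argument and could not get it to produce executable code.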

```python
import logging

from google.cloud import language
from google.cloud.language import enums, types

import my_pandas  # my own helper module (not shown here)


def on_document_level(df):
    # Call a helper method to create the new result columns in the df
    df = my_pandas.add_column_to_df(df, 'sentiments_per_sents')
    df = my_pandas.add_column_to_df(df, 'magnitudes_per_sents')

    # Instantiate a client
    client = language.LanguageServiceClient()

    # Run over each row in the df
    for index, row in df.iterrows():
        # Join the tokens of each sentence with spaces
        all_sents_in_row_list = []
        for sentence in row['processed']:
            all_sents_in_row_list.append(' '.join(sentence))

        # Join the sentences into the string that is sent to GCP
        all_sent_in_row_str = '. '.join(all_sents_in_row_list)
        logging.info(all_sent_in_row_str)

        document = types.Document(
            content=all_sent_in_row_str,
            type=enums.Document.Type.PLAIN_TEXT)

        # Detect the sentiment of the text (one blocking request per row)
        sentiment = client.analyze_sentiment(document=document).document_sentiment

        df.loc[index, 'sentiments_per_sents'] = sentiment.score
        df.loc[index, 'magnitudes_per_sents'] = sentiment.magnitude

    return df
```

How can I run the nested for loops asynchronously?
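One direction that might work, sketched under assumptions: because each row produces exactly one request, the inner loop over sentences only builds a string, and the loop over rows is the part worth running concurrently. The sketch below pushes a blocking call into a thread pool via `loop.run_in_executor` and collects the results with `asyncio.gather`. `fake_analyze` is a hypothetical placeholder for the blocking `client.analyze_sentiment` call and returns dummy values; `build_text` and `analyze_all` are names invented for this example.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def fake_analyze(text):
    """Hypothetical stand-in for the blocking GCP request; returns dummy values."""
    time.sleep(0.1)  # simulate network latency
    return {"score": 0.0, "magnitude": float(len(text.split()))}


def build_text(sentences):
    """Join tokens with spaces and sentences with '. '."""
    return '. '.join(' '.join(tokens) for tokens in sentences)


async def analyze_all(rows, max_workers=8):
    """Issue one request per row concurrently; results keep the row order."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            loop.run_in_executor(pool, fake_analyze, build_text(sentences))
            for sentences in rows
        ]
        # gather returns the results in the same order as the futures
        return await asyncio.gather(*futures)


rows = [
    [["I", "like", "stackoverflow"], ["It", "is", "very", "helpful"]],
    [["Second", "text"]],
]
results = asyncio.run(analyze_all(rows))
```

Since the results come back in row order, they could be written to the two result columns by position after the call. Whether a single GCP client instance can be shared across worker threads is an assumption that would need to be checked before swapping the placeholder for the real call.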

0 answers: