Question

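For search, I use a Tweepy stream filter to track multiple search terms. The stream code ran for 5 hours during working hours today and produced an 80 MB results file. I loaded the 80 MB file into an R data frame and then used grepl("yield") to search the data frame (the stream file) for a single word. I ran grepl() for the search term "yield" against all columns, but the result was a data frame with 0 columns and 193510 rows. I also tried dplyr's select(contains()). Both approaches found zero results from the Twitter filter.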

agfarm[,grepl("yield", colnames(agfarm$Value3))] #I tried all columns
agfarm %>% select(contains('yield'))
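
A note on the two R calls above: `colnames(agfarm$Value3)` returns `NULL` for an ordinary vector column, so `grepl` yields an empty logical vector and the subset keeps 0 columns and every row. `select(contains('yield'))` likewise matches column names, not cell values. Searching the tweet text itself is a different operation. Below is a minimal Python sketch of that check. It assumes (this is not stated in the question) that the stream file holds one JSON object per line with a `text` field; `tweets_mentioning` and `sample_lines` are hypothetical names used only for illustration.

```python
import json


def tweets_mentioning(lines, term):
    """Return the text of every tweet whose `text` field contains `term`.

    `lines` is any iterable of strings, one JSON object per line
    (an assumed file layout). Matching is case-insensitive.
    """
    term = term.lower()
    hits = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank keep-alive lines
        try:
            tweet = json.loads(line)
        except ValueError:
            continue  # skip lines that are not valid JSON
        text = tweet.get("text", "") if isinstance(tweet, dict) else ""
        if isinstance(text, str) and term in text.lower():
            hits.append(text)
    return hits


# Made-up sample data standing in for the real stream file.
sample_lines = [
    '{"text": "Corn yield is up this season"}',
    '{"text": "Unrelated tweet"}',
    '',
    'not json',
]
matches = tweets_mentioning(sample_lines, "yield")
```

With a real file, the same function could be called as `tweets_mentioning(open(path, encoding="utf-8"), "yield")`, where `path` is wherever the stream listener wrote its output.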

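The Tweepy filter results file does not seem to find and deliver even a single search term. Are the search terms (for example "nutrient yield", "crops", or "food yield") invalid? Or is the Tweepy filter unable to find such terms? Does Tweepy only work with mentions and hashtags: @, #?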

my_stream_listener = PrintingStreamListener()
my_stream = tweepy.Stream(auth = api.auth, listener=my_stream_listener)

searchTermsFilter = '"soil yield" OR "nutrient yield" OR "managing crops" OR "food yield" OR "nutrient uptake" OR' \
          '"high yielding crop" OR "fertilizer" OR "soil health" OR "crop yield" OR "acre yield" OR' \
        '"nutrient management" OR "imbalance soil" OR "increase yield" OR "micronutrient" OR' \
        '"sustainability" OR "corn yield" OR "farmers management practices"'

my_stream.filter(track=searchTermsFilter)
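
One possible explanation, offered as a sketch rather than a confirmed diagnosis: Twitter's streaming `track` parameter is documented as a comma-separated list of phrases, where commas act as OR and spaces inside a phrase act as AND, so `OR` and double quotes are not operators there. In Tweepy 3.x, `filter` also appears to join whatever is passed as `track` with commas, so a single string may end up split into individual characters. The helper below (`quoted_phrases` is a hypothetical name) turns an OR-joined string like the one above into a list of plain phrases.

```python
import re


def quoted_phrases(query):
    """Extract the double-quoted phrases from an OR-joined query string."""
    found = re.findall(r'"([^"]+)"', query)
    return [phrase.strip() for phrase in found if phrase.strip()]


# Shortened version of the string from the question, including the
# line continuation that leaves no space between OR and the next quote.
search_terms_filter = '"soil yield" OR "nutrient yield" OR' \
                      '"crop yield" OR "fertilizer"'

track_list = quoted_phrases(search_terms_filter)

# What joining a plain string with commas does: one entry per character.
char_join = ",".join("soil")

# Assumed usage with the stream object from the question:
# my_stream.filter(track=track_list)
```

Because exact phrase matching is not part of the `track` semantics, a phrase such as `soil yield` would match tweets containing both words in any order, so some filtering on the saved text may still be needed afterwards.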

Tweepy streaming filter: why does the filter not find the search terms?

0 answers: