How to use TextBlob

Time: 2019-04-15 17:16:23

Tags: python json

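I'm using some code I found online that uses TextBlob in Python to analyze the sentiment of tweets. The JSON it generates uses single quotes, and I need double quotes. I can't figure out how to change this in the code, so I'm hoping someone with more knowledge than me can help.

I've tried replacing the single quotes with double quotes in Notepad++, but that is tricky, because I don't want to replace the actual quotation marks and apostrophes written in the tweets.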

"""
Author: Stephen W. Thomas
Perform sentiment analysis using TextBlob to do the heavy lifting.
"""
from textblob import TextBlob
import csv
import re
import operator

tweets = []

def strip_non_ascii(string):
    stripped = (c for c in string if 0 < ord(c) < 127)
    return ''.join(stripped)

#LOAD AND CLEAN DATA
with open("bachelormonday_tweets.csv", "rt") as csvfile:
    reader = csv.reader(csvfile, delimiter=",")
    next(reader)
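    for row in reader:
        tweet = dict()
        tweet["orig"] = row[0]
        # The "clean" key was never assigned in the posted code; the sample
        # output suggests ASCII-only, lowercased text, so that is assumed here.
        tweet["clean"] = strip_non_ascii(tweet["orig"]).lower()
        tweet["TextBlob"] = TextBlob(tweet["clean"])
        tweets.append(tweet)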

# DEVELOP MODELS
for tweet in tweets:
    tweet["polarity"] = float(tweet["TextBlob"].sentiment.polarity)
    tweet["subjectivity"] = float(tweet["TextBlob"].sentiment.subjectivity)

    if tweet["polarity"] >= 0.1:
        tweet["sentiment"] = 'positive'
    elif tweet["polarity"] <= -0.1:
        tweet["sentiment"] = 'negative'
    else:
        tweet["sentiment"] = 'neutral'

tweets_sorted = sorted(tweets, key=lambda k: k["polarity"])
print(tweets)

What I want is text output with double quotes around the elements, but what I get looks like this:

{
    'orig': 'Who else is waiting for that fence jump from #TheBachelor?? Show us the goods already! @chrisbharrison @coltonpic.twitter.com/x2sMwgmVxg',
    'clean': 'who else is waiting for that fence jump from #thebachelor?? show us the goods already! @chrisbharrison @coltonpic.twitter.com/x2smwgmvxg',
    'TextBlob': TextBlob("who else is waiting for that fence jump from #thebachelor?? show us the goods already! @chrisbharrison @coltonpic.twitter.com/x2smwgmvxg"),
    'polarity': 0.0,
    'subjectivity': 0.0,
    'sentiment': 'neutral'
  },

1 answer:

Answer 0 (score: 0)

Use the json module. You will probably have to omit the TextBlob element, since it has no JSON representation.
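As a rough sketch of that suggestion: print(tweets) shows Python's repr of the dicts, which is why the quotes are single, while json.dumps from the standard library emits valid JSON with double-quoted keys and strings and leaves apostrophes inside the tweet text alone. The helper name to_json and its skip_keys parameter are made up for this illustration, and object() stands in for a real TextBlob instance so the snippet runs without textblob installed.

```python
import json


def to_json(tweets, skip_keys=("TextBlob",)):
    """Return JSON text for a list of tweet dicts.

    Hypothetical helper: keys listed in skip_keys are dropped first,
    because their values (e.g. TextBlob objects) are not JSON-serializable.
    """
    serializable = [
        {key: value for key, value in tweet.items() if key not in skip_keys}
        for tweet in tweets
    ]
    return json.dumps(serializable, indent=4)


# Sample data shaped like the question's output. object() is only a
# placeholder for the TextBlob instance stored under the "TextBlob" key.
tweets = [
    {
        "orig": "Don't wait! Show us the goods already!",
        "clean": "don't wait! show us the goods already!",
        "TextBlob": object(),
        "polarity": 0.0,
        "subjectivity": 0.0,
        "sentiment": "neutral",
    }
]

print(to_json(tweets))
```

In the question's script, this would replace the final print(tweets) call. To write a file instead of printing, json.dump takes the same data plus an open file object.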