Unable to save a dataset to a CSV file using Python

Time: 2015-12-07 13:51:49

Tags: python csv save dataset

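I have a dataset in a CSV file. I want to save some columns of this CSV file, together with the sentiment score of each row, to a new CSV file. Unfortunately, when I try this, the only output appears on the console and the new file contains nothing. Does anyone know why this happens?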

with  open('semevalSenti80.csv', 'wb' ) as fileOutput:
    writer = csv.writer(fileOutput)
    inpTweets = csv.reader(open('semeval80.csv', 'rb'), delimiter='"', quotechar='|')
    stopWords = getStopWordList('stopwords.txt')
    featureList = []
    tweetsTrain = []

    for row in inpTweets:
        if len(row) != 0:
            score = 0
            tweet = row[1]
            processedTweet = processTweet(tweet)
            featureVector = getFeatureVector(processedTweet, stopWords)
            featureList.extend(featureVector)
            for ft in featureVector:
                score = score + get_scores("SentiWordNet_3.0.0_20130122.txt", ft)
                print score, row
                writer.writerow([row[1], score])
                if score > 0:
                    tweetsTrain.append((featureVector, "positive"))
                elif score < 0:
                    tweetsTrain.append((featureVector, "negative"))
                else:
                    tweetsTrain.append((featureVector, "neutral"))

1 answer:

Answer 0: (score: 0)

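Split the larger task into simpler parts. The last two lines compose those smaller parts into a processing pipeline that completes the larger task. There is still room for improvement, but this should give you the idea.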

import csv

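# getStopWordList, processTweet, getFeatureVector and get_scores are the
# helper functions from the question; they must be defined or imported first.
stopWords = getStopWordList('stopwords.txt')

def get_tweets(csv_file):
    "return list of tweet items"
    # Python 2 opens CSV files in binary mode ('rb'); Python 3 would need
    # open(csv_file, newline='') instead.
    with open(csv_file, 'rb') as f:
        tweets = csv.reader(f, delimiter='"', quotechar='|')
        # Read every row while the file is still open; returning the lazy
        # reader itself would fail once the with block has closed the file.
        return list(tweets)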

def process_score(tweet):
    "compute score for given tweet text"
    score = 0
    processedTweet = processTweet(tweet)
    featureVector = getFeatureVector(processedTweet, stopWords)
    for feature in featureVector:
        score = score + get_scores("SentiWordNet_3.0.0_20130122.txt", feature)                

    return score

def save_scores(csv_file, tweet_scores):
    "save given iterable to csv file"
    with open(csv_file, 'wb' ) as f:
        writer = csv.writer(f)
        writer.writerows(tweet_scores)


tweet_scores = [(tweet[1],process_score(tweet[1])) for tweet in get_tweets('semeval80.csv')]
save_scores('semevalSenti80.csv', tweet_scores)
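
For readers on Python 3, the same read, score, save pipeline can be sketched in a self-contained form. This is only an illustration under stated assumptions: `TOY_LEXICON`, `toy_score`, `read_rows`, `save_rows` and `score_file` are hypothetical names that do not appear in the thread, the toy lexicon stands in for the SentiWordNet lookup done by `get_scores`, and the default comma delimiter is used instead of the `delimiter='"'` and `quotechar='|'` settings from the question.

```python
import csv

# Hypothetical stand-in lexicon; the thread's get_scores() reads SentiWordNet instead.
TOY_LEXICON = {"good": 1.0, "great": 1.5, "bad": -1.0, "awful": -1.5}


def read_rows(csv_path):
    """Return all non-empty rows as a list, read while the file is still open."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return [row for row in csv.reader(f) if row]


def toy_score(text):
    """Sum the toy lexicon scores of the whitespace-separated words in text."""
    return sum((TOY_LEXICON.get(word, 0.0) for word in text.lower().split()), 0.0)


def save_rows(csv_path, rows):
    """Write the given rows to csv_path; the with block flushes and closes the file."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


def score_file(in_path, out_path, text_column=1):
    """Score the text column of every row and save one (text, score) row per input row."""
    scored = [
        (row[text_column], toy_score(row[text_column]))
        for row in read_rows(in_path)
        if len(row) > text_column  # skip rows too short to contain the text column
    ]
    save_rows(out_path, scored)
    return scored
```

A call such as `score_file('semeval80.csv', 'semevalSenti80.csv')` would then write exactly one output row per input row, after the full score for that row has been computed. On Python 3 the `csv` module expects text-mode files opened with `newline=""`, which is why the `'rb'` and `'wb'` modes used elsewhere in this thread do not carry over unchanged.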