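First post here, so I hope I have provided the information needed to point me in the right direction.
I'm new to Python and I'm working on a script that takes a string from a CSV file, sends it to an API via a URL, and stores the JSON that the API returns in another CSV file.
The problem is that only 10% of the rows sent to the API come back as the expected JSON. The rest raise an error, which the exception handler in my code stores as "-----" in output.csv.
Python version: 2.7.10
Length of input.csv: 74826 rows
Here is the code: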
import csv
import urllib2
import json
import datetime
import re
# Sociallytic API-key - reeeaally secret :-)
api_key = 'xxxxxxx' # Not the real key
# File destinations
input_file = 'input.csv'
output_file = 'output.csv'
def prepareRequest(comment):
    # Preparing string for parsing to URL
    space = '%20'
    comment = re.sub('[^A-ZÆØÅa-zæøå0-9]+', ' ', comment)
    comment = comment.replace('\t', ' ')
    comment = comment.replace('\n', ' ')
    comment = comment.replace(' ', space)
    return comment
def getAPIFeedback(ready_comment):
    # Construct the request
    preparedRequest = urllib2.quote('"""http://api.sociallytic.dk/?key=xxxxxx&txt=' + ready_comment + '"""')
    request = urllib2.Request(preparedRequest)
    # Making the request
    json_reply = urllib2.urlopen(request).read()
    loaded_json = json_reply
    loaded_json = json.loads(json_reply)
    return loaded_json
def processAPIFeeback(loaded_json):
    # Preparing comment word count
    word_count = loaded_json['count_of_words']
    # Preparing comment sentiment score - SUM OF INDIVIDUAL WORD SENTIMENT SCORES
    sent_score = loaded_json['sentiment_score']
    # Preparing comment sentiment score words - SENTIMENT SCORE / SQ(COUNT OF WORDS WITH SENTIMENT SCORE)
    sent_score_words = loaded_json['sentiment_score_words']
    # Preparing brand sentiment - NEGATIVE, NEUTRAL OR POSITIVE
    brand_sentiment = loaded_json['sentiment']
    return (word_count, sent_score, sent_score_words, brand_sentiment)
def mainFunction(input_file, output_file, api_key):
    # Create new CSV file
    with open(output_file, 'wb') as file:
        w = csv.writer(file, delimiter=';')
        w.writerow(['status_id', 'word_count', 'sent_score', 'sent_score_words', 'brand_sentiment'])
        # Counter for returning status to user
        counter = 0
        # Open and read the CSV file
        f = open(input_file, 'r')
        csv_f = csv.reader(f, delimiter=';')
        # Skip status_id header
        next(csv_f, None)
        # Displayed loading text
        print 'Processing... Please wait...'
        # Getting data from CSV file
        for row in csv_f:
            # For each iteration the counter increases with 1
            counter += 1
            # Storing status_id
            id = row[0]
            # Storing comment
            comment = row[1]
            ready_comment = prepareRequest(comment)
            # Making the API-request and writing result to output.csv
            try:
                # Parsing comment to API
                loaded_json = getAPIFeedback(ready_comment)
                # Writing status_id and API feedback to CSV file
                id_feedback = (id,) + processAPIFeeback(loaded_json)
                w.writerow(id_feedback)
            except Exception:
                w.writerow('-----')
            # Output counter to user for each 100 comments processed
            if counter % 100 == 0:
                print counter, 'comments processed.'
        # Closing the CSV files
        f.close()
if __name__ == '__main__':
    mainFunction(input_file, output_file, api_key)
However, when I enter the request directly in the browser, the result is returned as expected.
What am I doing wrong here?
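One thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis: calling quote on the entire URL string (including the literal quote characters around it) also escapes the scheme separator, so the result no longer looks like an http URL. The snippet below shows that effect and one possible alternative that escapes only the parameter values. The names API_BASE and build_url are hypothetical, and it is an assumption that the API accepts "+" for spaces, which is what urlencode produces.

```python
# Works on Python 2 and Python 3; the import location differs between them.
try:
    from urllib import quote, urlencode          # Python 2
except ImportError:
    from urllib.parse import quote, urlencode    # Python 3

# Base URL taken from the script above; the name API_BASE is hypothetical.
API_BASE = 'http://api.sociallytic.dk/'


def build_url(api_key, comment):
    """Hypothetical helper: escape only the query values, not the whole URL."""
    # A list of pairs keeps the parameter order stable on every Python version.
    query = urlencode([('key', api_key), ('txt', comment)])
    return API_BASE + '?' + query


# Quoting the whole string escapes the ':' after 'http' and the literal quotes.
mangled = quote('"""http://api.sociallytic.dk/?key=xxxxxx&txt=hello"""')

# Escaping only the values leaves the scheme and host intact.
clean = build_url('xxxxxx', 'hello world')
```

Printing the exception inside the except block (for example with `except Exception as e: print e` on Python 2) would also show which error is actually being raised for the failing rows.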