Question

I'm trying to remove the blank lines from a text file, but when I use this code:

import io
with open("outprint6.csv", "r") as f:
    for line in f:
        cleanedLine = line.strip()
        if cleanedLine: # is not empty
            print(cleanedLine)

            f = io.open('eliminado', 'a')
            f.write(unicode(cleanedLine, 'ascii'))
            f.write(u'\n')
            f.close()

I get this error:

'utf8' codec can't decode byte 0xfa in position 21: invalid start byte.

How can I fix this? I found some answers here, but they don't work in this case. (I'm really new to programming...)

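The code does remove the blank lines, but I can't write the processed text to a new csv file. The text is in Spanish, and the error appears when writing accented letters (í, ó, etc.).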
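
For reference, byte 0xfa is "ú" in Latin-1 (and cp1252) but is not a valid start byte in UTF-8, which would be consistent with the error above. The following is only a sketch of one possible approach, assuming the input file really is Latin-1 encoded (that is a guess, not something confirmed here). The helper name remove_blank_lines and its src_encoding parameter are hypothetical. It decodes the input with an explicit encoding and writes the non-blank lines back out as UTF-8:

```python
import io


def remove_blank_lines(src_path, dst_path, src_encoding="latin-1"):
    """Copy src_path to dst_path, dropping empty or whitespace-only lines.

    src_encoding is an assumption; change it if the file was saved differently.
    Returns the number of lines that were kept.
    """
    kept = 0
    # io.open with an explicit encoding yields text (unicode) lines
    # on both Python 2 and Python 3.
    with io.open(src_path, "r", encoding=src_encoding) as src, \
            io.open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            cleaned = line.strip()
            if cleaned:  # skip blank lines
                dst.write(cleaned + u"\n")
                kept += 1
    return kept
```

It could be called as remove_blank_lines("outprint6.csv", "eliminado"). Opening the output file once, under a different name than the input handle, also avoids reopening it for every line.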

I retrieved the Twitter data with the following code:

import tweepy
import json
import io

# Authentication details. To  obtain these visit dev.twitter.com
consumer_key = ''
consumer_secret = ''
access_token = ''
access_token_secret = ''

# This is the listener, responsible for receiving data
class StdOutListener(tweepy.StreamListener):
  def on_data(self, data):
    print '1'
    # Twitter returns data in JSON format - we need to decode it first
    decoded = json.loads(data)

    if  not decoded['text'].startswith('RT'):

        try:
            # Also, we convert UTF-8 to ASCII ignoring all bad characters sent by users
            tweet = '@%s; %s; %s; %s; %s; %s; %s; %s; %s; %s; ""[%s]""; %s' % (decoded['user']['id'], decoded['user']['location'], decoded['user']['followers_count'], decoded['user']['created_at'], decoded['user']['utc_offset'], decoded['user']['time_zone'], decoded['coordinates'], decoded['place'], decoded['id'], decoded['created_at'], decoded['text'].encode('ascii', 'ignore'), decoded['retweet_count'])

            print tweet
            f = io.open('outprint6.csv', 'a')
            f.write(tweet)
            f.write(u'\n')
            f.close()           

        except:
            pass    

  def on_error(self, status):
    print status       

# if __name__ == '__main__':
l = StdOutListener()
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

print "Showing all new tweets for"

# There are different kinds of streams: public stream, user stream, multi-user streams
# In this example, filter tweets by a geographic bounding box
# For more details refer to https://dev.twitter.com/docs/streaming-apis
stream = tweepy.Stream(auth, l)
stream.filter(locations=[-81.397882,-4.972829,-75.288231,0.762316])

The text in the 'text' field is encoded as 'ascii', but I run into problems when I use it to write to a new csv file...
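
On the writing side, one option might be to keep every field as text and let the file object do the encoding, rather than stripping characters with encode('ascii', 'ignore'). This is a rough sketch only: append_row and its sep parameter are hypothetical names, and it assumes the text fields are already unicode (as json.loads returns them) rather than raw byte strings.

```python
import io


def append_row(path, fields, sep=u"; "):
    """Append one delimited line to path, encoded as UTF-8."""
    # Convert every field to text; None becomes an empty string.
    cells = [u"" if value is None else u"%s" % (value,) for value in fields]
    # The explicit encoding means accented letters are stored as UTF-8
    # instead of depending on the platform default.
    with io.open(path, "a", encoding="utf-8") as out:
        out.write(sep.join(cells) + u"\n")
```

A file written this way would then need to be read back with encoding="utf-8". Note that this sketch does not quote fields that contain the separator; the standard csv module handles that case.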

Python - Error when writing data to a csv file

0 answers: