I have been learning Python and Django recently, and today I wanted to write something fun, so I wrote a script that downloads some data from this site. Here is my code:
# -*- coding:utf-8 -*-
import sys
reload(sys)
sys.setdefaultencoding('utf8')
import chardet
from django.http import HttpResponse
from bs4 import BeautifulSoup
import urllib2
from getdata.models import trel

class Rel(object):
    def __init__(self):
        self.result = None
        self.codenu = None
        self.title = None
        self.user = None

def getdata(request):
    page = 1
    while page <= 1:
        url = "http://pythontip.sinaapp.com/coding/code_record?page=" + str(page)
        html = urllib2.urlopen(url)
        content = html.read()
        soup = BeautifulSoup(content).find('tbody')
        for tr in soup.find_all('tr'):
            r = trel()
            td_list = tr.find_all('td')
            r.codenu = td_list[0].get_text()
            r.title = td_list[1].get_text()
            r.user = td_list[2].get_text()
            r.result = td_list[3].get_text()
            print "The encoding is %s and Type is %s" % (chardet.detect(r.codenu),type(r.codenu))
            r.save()
        page += 1
Note the print statement. On my machine it outputs this:
The encoding is {'confidence': 1.0, 'encoding': 'ascii'} and Type is <type 'unicode'>
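I find this very strange, because the reported encoding (ascii) and the type (unicode) seem to contradict each other. Can anyone tell me what is going wrong?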
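One way to read that output: `unicode` is the type of already-decoded text, while an encoding describes bytes. For text made only of ASCII characters, the ASCII and UTF-8 encodings produce identical bytes, so a detector that reports `ascii` for such a value does not contradict the `unicode` type. Below is a minimal sketch of that idea, using only built-ins rather than chardet; the sample values are hypothetical stand-ins for the scraped cells.

```python
# Hypothetical sample values; the real ones come from the scraped table cells.
codenu = u"12345"          # ASCII-only text, e.g. a record number
title = u"\u4f60\u597d"    # non-ASCII text (two Chinese characters)


def is_ascii_only(text):
    """Return True if every character of `text` fits in 7-bit ASCII."""
    return all(ord(ch) < 128 for ch in text)


# ASCII-only text yields the same bytes under ASCII and UTF-8.
same_bytes = codenu.encode("ascii") == codenu.encode("utf-8")

# Non-ASCII text cannot be encoded as ASCII at all.
try:
    title.encode("ascii")
    title_fits_ascii = True
except UnicodeEncodeError:
    title_fits_ascii = False
```

Here `same_bytes` ends up true and `title_fits_ascii` false: the type says "this is text", and the encoding question only arises once the text is turned into bytes.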
Here is the error traceback:
Environment:
Request Method: GET
Request URL: http://127.0.0.1:8000/getdata/
Django Version: 1.5.1
Python Version: 2.7.3
Installed Applications:
('django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.sites',
'django.contrib.messages',
'django.contrib.staticfiles',
'django.contrib.admin',
'getdata')
Installed Middleware:
('django.middleware.common.CommonMiddleware',
'django.contrib.sessions.middleware.SessionMiddleware',
'django.middleware.csrf.CsrfViewMiddleware',
'django.contrib.auth.middleware.AuthenticationMiddleware',
'django.contrib.messages.middleware.MessageMiddleware')
Traceback:
File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py" in get_response
  115.                         response = callback(request, *callback_args, **callback_kwargs)
File "/home/leo/challenge/challenge/getdata/views.py" in getdata
  25.         for tr in soup.find_all('tr'):
Exception Type: AttributeError at /getdata/
Exception Value: 'NoneType' object has no attribute 'find_all'
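The traceback above says that `soup` is `None` when `soup.find_all('tr')` runs, which means `find('tbody')` found no `<tbody>` element in the downloaded page. One possible reason (an assumption, not confirmed by the post) is that the raw HTML has table rows without an explicit `<tbody>`, or that the request returned a different page. The sketch below collects rows without depending on `<tbody>` and returns an empty list when there are none. It swaps in the Python 3 standard library's `HTMLParser` in place of BeautifulSoup so that it runs without third-party packages; `RowCollector` and `extract_rows` are hypothetical names.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the <tr> being read, or None
        self._cell = None   # text pieces of the <td> being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return a list of rows; an empty list if the page has no table rows."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    return parser.rows
```

With BeautifulSoup itself, the same idea is to test whether `find('tbody')` returned `None` before calling `find_all` on it.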
Answer 0 (score: 0)
Could you post the full error traceback?
This error may be caused by your database encoding configuration (MySQL).
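Answer 1 (score: 0)
I tried dropping my database and creating it again, and that solved the problem. After changing MySQL's character set, the existing database should be dropped and recreated, because a database created earlier keeps the character set it was created with.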
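The drop-and-recreate step can be sketched as follows. This is only an illustration under stated assumptions: the helper `recreate_statements`, the database name `challenge_db`, and the choice of `utf8` with `utf8_general_ci` are all hypothetical, since the post does not say which names or character set were used. The helper only builds the SQL text; running it against MySQL is left to the reader.

```python
import re


def recreate_statements(db_name, charset="utf8", collation="utf8_general_ci"):
    """Build the SQL that drops a database and recreates it with `charset`."""
    # Identifiers cannot be bound as query parameters, so accept plain names only.
    for value in (db_name, charset, collation):
        if not re.match(r"[A-Za-z0-9_]+\Z", value):
            raise ValueError("unexpected characters in %r" % value)
    return [
        "DROP DATABASE IF EXISTS `%s`" % db_name,
        "CREATE DATABASE `%s` CHARACTER SET %s COLLATE %s"
        % (db_name, charset, collation),
    ]


# Print the statements so they can be pasted into the mysql client.
for statement in recreate_statements("challenge_db"):
    print(statement + ";")
```

Dropping the database deletes its data, so the tables have to be created again afterwards, for example with `python manage.py syncdb` on Django 1.5.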