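I'm not entirely sure what I need to do about this error. I think it has to do with needing to add .encode('utf-8'), but I'm not sure that is what I need to do, nor where I should apply it.
The error is: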
This is the basis of my Python script.
line 40, in <module>
    writer.writerows(list_of_rows)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 17: ordinal not in range(128)
Answer 0: (score: 20)
The Python 2.x csv library is broken. You have three options, in order of complexity:
1. Edit: see below. Use the fixed library https://github.com/jdunck/python-unicodecsv (pip install unicodecsv). It works as a drop-in replacement. Example:
import unicodecsv
with open("myfile.csv", 'rb') as my_file:
    r = unicodecsv.DictReader(my_file, encoding='utf-8')
2. Read the CSV manual regarding Unicode: https://docs.python.org/2/library/csv.html (see the examples at the bottom).
3. Manually encode each item as UTF-8:
for cell in row.findAll('td'):
    text = cell.text.replace('[','').replace(']','')
    list_of_cells.append(text.encode("utf-8"))
Edit: I found that python-unicodecsv is also broken when reading UTF-16. It complains about any 0x00 bytes.
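As a small illustration of why 0x00 bytes come up at all: UTF-16 stores every ASCII character as two bytes, one of which is zero, so a parser that works on raw bytes sees NUL bytes throughout the file. The sample string below is made up for the demonstration.

```python
# Hypothetical sample line; any ASCII text behaves the same way.
sample = u"a,b\n"

# Encode as little-endian UTF-16 without a byte order mark.
utf16_bytes = sample.encode("utf-16-le")

# Each of the four ASCII characters contributes one 0x00 byte.
nul_count = utf16_bytes.count(b"\x00")
print(repr(utf16_bytes), nul_count)
```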
Instead, use https://github.com/ryanhiebert/backports.csv, which more closely resembles the Python 3 implementation and uses the io module.
Install:
pip install backports.csv
Usage:
from backports import csv
import io
with io.open(filename, encoding='utf-8') as f:
    r = csv.reader(f)
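Since the question is about writing rather than reading, here is a minimal sketch of the writing side under the same approach. The sample rows and the temporary output path are made up for illustration, and the import falls back to the standard csv module on Python 3, where backports.csv should not be needed.

```python
import io
import os
import tempfile

try:
    from backports import csv  # Python 2, after `pip install backports.csv`
except ImportError:
    import csv  # Python 3: the standard module already works on text

# Hypothetical sample data; u"\u2013" is the en dash from the traceback.
list_of_rows = [[u"name", u"years"], [u"Example", u"2001\u20132005"]]

# Hypothetical output path inside a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "output.csv")

# Open in text mode with an explicit encoding; newline="" leaves
# line endings to the csv writer.
with io.open(out_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerows(list_of_rows)

# Read the file back to check that the en dash survived the round trip.
with io.open(out_path, newline="", encoding="utf-8") as f:
    rows_read = list(csv.reader(f))
```

The rows have to be unicode text (not byte strings) for this to work, which is why the literals carry the u prefix.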
Answer 1: (score: 0)
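In addition to Alastair's excellent suggestion, I found the simplest option was to use Python 3 instead of Python 2. All I had to change in my script was the wb in the open statement to just w, in accordance with Python 3's syntax.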
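A rough sketch of why that one-letter change matters on Python 3: the csv writer hands text to the file object, so a binary target rejects it while a text target accepts it. In-memory buffers stand in for files opened with wb and w here, and the sample row is made up.

```python
import csv
import io

# Hypothetical row containing the en dash from the traceback.
row = ["2001\u20132005"]

# A binary buffer plays the role of a file opened with 'wb'.
binary_buf = io.BytesIO()
try:
    csv.writer(binary_buf).writerow(row)
    binary_ok = True
except TypeError:
    # Python 3's csv writer produces str, which a binary target rejects.
    binary_ok = False

# A text buffer plays the role of a file opened with 'w' and newline="".
text_buf = io.StringIO(newline="")
csv.writer(text_buf).writerow(row)
text_output = text_buf.getvalue()

print(binary_ok, repr(text_output))
```

With a real file, the equivalent call would be open(path, "w", newline="", encoding="utf-8"), where path is whatever output file the script uses.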
Answer 2: (score: 0)
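The problem is in the csv library in Python 2. From the unicodecsv project page:
Python 2's csv module doesn't easily deal with unicode strings, leading to the dreaded "'ascii' codec can't encode character ... in position ..." exception.
If you can, just install unicodecsv: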
pip install unicodecsv