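Please, please help. I have been struggling with this for a while and keep running into one problem after another. All I want is a loop that opens every csv file in a folder. This is the loop: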
folder = '/Users/jolijttamanaha/Documents/Senior/Thesis/Python/TextAnalysis/datedmatchedngrams2/'
for file in os.listdir(folder):
    with codecs.open(file, mode='rU', encoding='utf-8') as f:
        m=min(int(line[1]) for line in csv.reader(f))
        f.seek(0)
        for line in csv.reader(f):
            if int(line[1])==m:
                print line
This is the error:
Traceback (most recent call last):
File "findfirsttrigram.py", line 11, in <module>
m=min(int(line[1]) for line in csv.reader(f))
File "findfirsttrigram.py", line 11, in <genexpr>
m=min(int(line[1]) for line in csv.reader(f))
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 684, in next
return self.reader.next()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 615, in next
line = self.readline()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 530, in readline
data = self.read(readsize, firstline=True)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 477, in read
newchars, decodedbytes = self.decode(data, self.errors)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x87 in position 0: invalid start byte
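I got here because I first had a "Null Byte" error, which I solved with this post: "Line contains NULL byte" in CSV reader (Python)
Then I got an integer error, which I solved with the post "an integer is required" when open()'ing a file as utf-8?
Then I got an error that said 'UnicodeException: UTF-16 stream does not start with BOM', which I solved with this post: utf-16 file seeking in python. how?
Then I realized that the csv module needs utf-8, so here I am.
But I have finally hit the limit of the existing questions. I can't figure out what is going on. Please help.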
Answer 0 (score: 1)
I'm not sure why, but this finally worked:
import csv
import os
import unicodecsv  # third-party package

folder = '/Users/jolijttamanaha/Documents/Senior/Thesis/Python/TextAnalysis/datedmatchedngrams3/'
for file in os.listdir(folder):
    # os.listdir returns bare file names, so join each one with the folder
    with open(os.path.join(folder, file), mode='rU') as f:
        try:
            # errors='replace' substitutes undecodable bytes instead of raising
            m = min(int(line[1]) for line in unicodecsv.reader(f, encoding='utf-8', errors='replace'))
        except Exception:
            print "one no work"
            continue
        f.seek(0)
        for line in unicodecsv.reader(f):
            if int(line[1]) == m:
                print line
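The thread does not establish why this version works, but two differences from the question's loop are visible: each name from os.listdir is joined with the folder, and undecodable bytes are replaced rather than raised. Below is a minimal sketch of the same idea in Python 3 using only the standard library. The function names and the `.csv` filter are assumptions added for illustration, not part of the answer.

```python
import csv
import os


def rows_with_min_second_column(path, encoding="utf-8"):
    """Return the rows of one CSV file whose second column holds the smallest integer."""
    # errors="replace" turns undecodable bytes into U+FFFD instead of raising
    # UnicodeDecodeError; newline="" is what the csv module expects.
    with open(path, encoding=encoding, errors="replace", newline="") as f:
        rows = [row for row in csv.reader(f) if len(row) > 1]
    if not rows:
        return []
    smallest = min(int(row[1]) for row in rows)
    return [row for row in rows if int(row[1]) == smallest]


def scan_folder(folder):
    """Map each .csv file name in `folder` to its minimum rows (hypothetical helper)."""
    results = {}
    for name in sorted(os.listdir(folder)):
        if not name.endswith(".csv"):
            continue  # assumption: only files named *.csv are wanted
        # os.listdir returns bare names, so build the full path explicitly
        path = os.path.join(folder, name)
        try:
            results[name] = rows_with_min_second_column(path)
        except (ValueError, csv.Error) as exc:
            # a non-integer second column or a malformed file: report and move on
            print("skipping %s: %s" % (name, exc))
    return results
```

Reading all rows into a list avoids the f.seek(0) second pass. That is fine for small files, but for very large ones the two-pass approach from the answer uses less memory.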
Answer 1 (score: 0)
Maybe try using os.walk and looping over the files it returns?
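import codecs
import os

folder = '/Users/jolijttamanaha/Documents/Senior/Thesis/Python/TextAnalysis/datedmatchedngrams2/'
for subdir, dirs, files in os.walk(folder):
    for file in files:
        # os.walk yields bare file names; join them with the directory being walked
        with codecs.open(os.path.join(subdir, file), mode='rU', encoding='utf-16-be') as f:
            pass  # Your code here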
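As a rough sketch of the os.walk suggestion, the helper below collects full paths so that a later open() call does not depend on the current working directory. The name `iter_csv_paths` and the `.csv` filter are assumptions for illustration; the answer itself does not filter by extension.

```python
import os


def iter_csv_paths(folder):
    """Yield the full path of every .csv file under `folder`, subfolders included."""
    for subdir, dirs, files in os.walk(folder):
        for name in sorted(files):
            if name.endswith(".csv"):
                # join with the directory currently being walked, not the top folder
                yield os.path.join(subdir, name)
```

Unlike os.listdir, os.walk also descends into subfolders, so this only matches the question's loop when the folder is flat.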
Answer 2 (score: 0)
Clearly, your file is not encoded in UTF-8. Try other encodings. If you are on Windows, 'mbcs' will use the default encoding of your Windows version.
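One way to "try other encodings" systematically is to attempt each candidate in turn and keep the first that decodes the whole file. This is only a sketch in Python 3: the function name and the candidate list are assumptions, and a successful decode does not prove the guess is right (latin-1, for instance, accepts any byte sequence).

```python
def first_working_encoding(path, candidates=("utf-8", "utf-16", "latin-1")):
    """Return the first candidate that decodes the whole file without error, else None."""
    with open(path, "rb") as f:
        data = f.read()
    for encoding in candidates:
        try:
            data.decode(encoding)
        except UnicodeError:
            continue  # this candidate cannot decode the bytes; try the next one
        return encoding
    return None
```

Order matters: put strict encodings such as utf-8 first and permissive ones last. On Windows, 'mbcs' could be added to the candidates, but that codec is not available on other platforms.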