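I'm a Python beginner. I'm trying to append (concatenate) the text from all 8 text files into a single text file to form a corpus. However, I get the error UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 7311: character maps to <undefined>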
import glob2

filenames = glob2.glob('Final_Corpus_SOAs/*.txt')  # list of all .txt files in the directory
print(filenames)
Output: ['Final_Corpus_SOAs\\1.txt', 'Final_Corpus_SOAs\\2.txt', 'Final_Corpus_SOAs\\2018 SOA Muir.txt', 'Final_Corpus_SOAs\\3.txt', 'Final_Corpus_SOAs\\4.txt', 'Final_Corpus_SOAs\\5.txt', 'Final_Corpus_SOAs\\6.txt', 'Final_Corpus_SOAs\\7.txt', 'Final_Corpus_SOAs\\8.txt']
with open('output.txt', 'w', encoding="utf-8") as outfile:
    for fname in filenames:
        with open(fname) as infile:
            for line in infile:
                outfile.write(line)
Output: UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 7311: character maps to <undefined>
Thanks for your help.
Answer 0 (score: 0)
Answer 1 (score: 0)
If you are sure of the encoding, you should declare it when opening the files, both for reading and for writing:
encoding = 'utf8'  # or 'latin1' or 'cp1252' or...
with open('output.txt', 'w', encoding=encoding) as outfile:
    for fname in filenames:
        with open(fname, encoding=encoding) as infile:
            for line in infile:
                outfile.write(line)
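To see why the explicit encoding matters, here is a small sketch of one likely cause, assuming the input files are UTF-8 and contain typographic quotes (the question does not confirm this). The closing curly quote U+201D ends in byte 0x9d when encoded as UTF-8, and that byte has no mapping in cp1252, a common default "charmap" codec on Windows. The `try_decode` helper and the sample string are hypothetical, made up for illustration.

```python
# Assumption: the files are UTF-8 with curly quotes; this is a guess, not confirmed by the question.
sample = "He said \u201chello\u201d".encode("utf-8")  # the closing quote ends in byte 0x9d


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are not valid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


cp1252_result = try_decode(sample, "cp1252")  # fails: 0x9d is undefined in cp1252
utf8_result = try_decode(sample, "utf-8")     # succeeds
print(cp1252_result)
print(utf8_result)
```

If decoding with 'utf8' fails on your real files too, they are probably in a different encoding, and you would need to find out which one before reading them as text.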
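If you are unsure of the encoding, or don't want to be bothered with it, you can copy the files at the byte level by reading and writing them in binary mode: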
with open('output.txt', 'wb') as outfile:
    for fname in filenames:
        with open(fname, 'rb') as infile:
            for line in infile:
                outfile.write(line)
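A variant of the binary approach, as a rough sketch: `shutil.copyfileobj` from the standard library copies each file in chunks, so the line loop is not needed. The `concatenate_binary` function and the throwaway demo files are hypothetical names used only for illustration. Note that a byte-level copy only gives a consistent result if all input files share the same encoding.

```python
import shutil
import tempfile
from pathlib import Path


def concatenate_binary(filenames, output_path):
    """Append the raw bytes of each input file to output_path, in order."""
    with open(output_path, "wb") as outfile:
        for fname in filenames:
            with open(fname, "rb") as infile:
                shutil.copyfileobj(infile, outfile)  # chunked copy, no decoding


# Demo on temporary files so nothing in the real corpus directory is touched.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "1.txt").write_bytes("first \u201cquoted\u201d\n".encode("utf-8"))
    (tmp / "2.txt").write_bytes(b"second\n")
    out = tmp / "output.txt"
    concatenate_binary([tmp / "1.txt", tmp / "2.txt"], out)
    combined = out.read_bytes()

print(combined)
```

If an input file does not end with a newline, the first line of the next file will be joined to its last line, which is also true of the loop-based version.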