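I am trying to open a text file and read through it, replacing certain strings with strings stored in a dictionary. Based on the answers to Replacing words in text file using a dictionary and How to search and replace text in a file using Python?, I wrote something like this: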
# edit print line to print (line)
import fileinput

text = "sample file.txt"
fields = {"pattern 1": "replacement text 1", "pattern 2": "replacement text 2"}

for line in fileinput.input(text, inplace=True):
    line = line.rstrip()
    for field in fields:
        if field in line:
            line = line.replace(field, fields[field])
    print (line)
My file is encoded in utf-8.
When I run this, the console shows this error:
UnicodeDecodeError: 'charmap' codec can't decode byte X in position Y: character maps to <undefined>
Adding encoding = "utf8" to fileinput.FileInput() gives this error:
TypeError: __init__() got an unexpected keyword argument 'encoding'
Adding openhook=fileinput.hook_encoded("utf8") to fileinput.FileInput() gives this error:
ValueError: FileInput cannot use an opening hook in inplace mode
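I do not want to suppress the errors by passing the 'ignore' error handler.
I have a file and a dictionary, and I want to substitute the dictionary's values into the file, as I would when printing to stdout.
Source file in utf-8: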
Plain text on the line in the file.
This is a greeting to the world.
Hello world!
Here's another plain text.
And here too!
I want to replace world with earth.
In the dictionary: {"world": "earth"}
Modified file in utf-8:
Plain text on the line in the file.
This is a greeting to the earth.
Hello earth!
Here's another plain text.
And here too!
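Answer 0 (score: 0)
The fileinput library has a few issues that I addressed in the past in a blog post; one of them is that you cannot set an encoding and use in-place file rewriting at the same time.
The following code can do this, but you will have to replace your print() calls with writes to the outgoing file object: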
from contextlib import contextmanager
import io
import os


@contextmanager
def inplace(filename, mode='r', buffering=-1, encoding=None, errors=None,
            newline=None, backup_extension=None):
    """Allow for a file to be replaced with new content.

    yields a tuple of (readable, writable) file objects, where writable
    replaces readable.

    If an exception occurs, the old file is restored, removing the
    written data.

    mode should *not* use 'w', 'a' or '+'; only read-only-modes are supported.

    """
    # move existing file to backup, create new file with same permissions
    # borrowed extensively from the fileinput module
    if set(mode).intersection('wa+'):
        raise ValueError('Only read-only file modes can be used')

    backupfilename = filename + (backup_extension or os.extsep + 'bak')
    try:
        os.unlink(backupfilename)
    except os.error:
        pass
    os.rename(filename, backupfilename)
    readable = io.open(backupfilename, mode, buffering=buffering,
                       encoding=encoding, errors=errors, newline=newline)
    try:
        perm = os.fstat(readable.fileno()).st_mode
    except OSError:
        writable = open(filename, 'w' + mode.replace('r', ''),
                        buffering=buffering, encoding=encoding, errors=errors,
                        newline=newline)
    else:
        os_mode = os.O_CREAT | os.O_WRONLY | os.O_TRUNC
        if hasattr(os, 'O_BINARY'):
            os_mode |= os.O_BINARY
        fd = os.open(filename, os_mode, perm)
        writable = io.open(fd, "w" + mode.replace('r', ''), buffering=buffering,
                           encoding=encoding, errors=errors, newline=newline)
        try:
            if hasattr(os, 'chmod'):
                os.chmod(filename, perm)
        except OSError:
            pass
    try:
        yield readable, writable
    except Exception:
        # move backup back
        try:
            os.unlink(filename)
        except os.error:
            pass
        os.rename(backupfilename, filename)
        raise
    finally:
        readable.close()
        writable.close()
        try:
            os.unlink(backupfilename)
        except os.error:
            pass
So your code would look like this:
# relies on the inplace() context manager defined above;
# the fileinput import is no longer needed
text = "sample file.txt"
fields = {"pattern 1": "replacement text 1", "pattern 2": "replacement text 2"}

with inplace(text, encoding='utf8') as (infh, outfh):
    for line in infh:
        for field in fields:
            if field in line:
                line = line.replace(field, fields[field])
        outfh.write(line)
Note that you no longer need to strip the newline.
Answer 1 (score: 0)
I tried using this:
with open(fileName1, "r+", encoding = "utf8", newline='') as fileIn, open(fileName1, "r+", encoding = "utf8", newline='') as fileOut:
    for line in fileIn:
        for field in fields:
            if field in line:
                line = line.replace(field, fields[field])
        fileOut.write(line)
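Note: when a single file is used, garbage gets pushed to the end of the file. So far I have not figured out why. It does not reflect the number of replacements. (The number of replacements is greater than the amount of garbage.)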
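One plausible explanation, offered as a guess rather than a confirmed diagnosis: both handles start at offset 0, the writer overwrites the file from the beginning, and nothing truncates the file afterwards. If the replaced text is shorter than the original, the tail of the old content survives past the end of the new content. The leftover would then equal the total number of characters saved, not the number of replacements. The sketch below reproduces this on a tiny file; rewrite_same_file and the world to sun mapping are hypothetical names chosen for the demo.

```python
import os
import tempfile


def rewrite_same_file(path, fields):
    """Replace text using two "r+" handles on the same file (no truncation)."""
    with open(path, "r+", encoding="utf8", newline="") as file_in, \
         open(path, "r+", encoding="utf8", newline="") as file_out:
        for line in file_in:
            for old, new in fields.items():
                line = line.replace(old, new)
            file_out.write(line)


if __name__ == "__main__":
    fd, demo_path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    with open(demo_path, "w", encoding="utf8", newline="") as f:
        f.write("Hello world!\n")          # 13 characters
    rewrite_same_file(demo_path, {"world": "sun"})
    with open(demo_path, encoding="utf8", newline="") as f:
        # The new text is 11 characters long, so the last 2 characters
        # of the old content remain: 'Hello sun!\n!\n'
        print(repr(f.read()))
    os.remove(demo_path)
```

For larger files, or when replacements make the text longer, the writer could also overwrite data the reader has not consumed yet, so reading and writing the same file through two handles seems best avoided.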
Pseudo-math:
oriA I am ready to fix it. Edit: when I use two files, everything works. Change fileName1 in the second open() to fileName2, and change the mode argument to "w+".
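The two-file variant described in the edit can be sketched as follows. This is an illustration under assumptions, not the exact code that was run: the function name replace_to_new_file and its parameters are invented here, and it uses plain "r" and "w" modes because the source is only read and the destination is only written.

```python
def replace_to_new_file(src_path, dst_path, fields, encoding="utf8"):
    """Copy src_path to dst_path, replacing each key of `fields` with its value."""
    # newline="" keeps the original line endings untouched on both sides.
    with open(src_path, "r", encoding=encoding, newline="") as file_in, \
         open(dst_path, "w", encoding=encoding, newline="") as file_out:
        for line in file_in:
            for old, new in fields.items():
                line = line.replace(old, new)
            file_out.write(line)
```

Because the destination is opened with "w", it is truncated on open and starts empty, so no leftover content from a previous version can remain at the end.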