Question

When I write a string to a file opened in binary mode on Windows, the newlines are not encoded correctly for Windows.

someText = "some\ntext with\nnew lines in\nit"

with open("newFile.txt", "ab") as f:
    f.write(someText.encode("utf-8"))

gives me a file that contains only \n as the line ending instead of the \r\n that Windows requires.

Previously I used the following:

someText = "some\ntext with\nnew lines in\nit"

with open("newFile.txt", "a", encoding = "utf-8") as f:
    f.write(someText)

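which successfully wrote the file with \r\n line endings on Windows. Unfortunately I can't use this approach, because I ran into encoding problems in the past that led me to open the file in binary mode instead. Is there a way to solve this without using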

someText = someText.replace('\n', '\r\n')

before encoding the string, which would break the line endings on Unix systems?

Answer 1

You can use os.linesep to get the native newline sequence:

import os

someText = "some\ntext with\nnew lines in\nit"

with open("newFile.txt", "ab") as f:
    # Replace on the str first, then encode: bytes.replace() rejects str arguments.
    f.write(someText.replace("\n", os.linesep).encode("utf-8"))

If your text may contain both \n and \r\n, it is better to use a regular expression for the replacement:

import os
import re

someText = "some\r\ntext with\nnew lines in\r\nit"
# Matches both \n and \r\n
rgx = re.compile(r"\r?\n")

with open("newFile.txt", "ab") as f:
    # Substitute on the str, then encode: a str pattern cannot be applied to bytes.
    f.write(rgx.sub(os.linesep, someText).encode("utf-8"))
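If the data is already bytes by the time it reaches the write call, the same normalization can be done at the bytes level. This is only a minimal sketch: to_native_newlines is a hypothetical helper name, not something from the answer above, and it assumes the data uses an ASCII-compatible encoding such as UTF-8.

```python
import os
import re

# Bytes pattern matching either "\r\n" or a bare "\n".
_NEWLINE = re.compile(rb"\r?\n")


def to_native_newlines(data: bytes, linesep: str = os.linesep) -> bytes:
    """Hypothetical helper: rewrite every line ending in `data` to `linesep`."""
    sep = linesep.encode("ascii")
    # A function replacement avoids any escape processing of the separator.
    return _NEWLINE.sub(lambda match: sep, data)


if __name__ == "__main__":
    encoded = "some\r\ntext with\nnew lines in\r\nit".encode("utf-8")
    with open("newFile.txt", "ab") as f:
        f.write(to_native_newlines(encoded))
```

Because the pattern consumes an existing \r together with the \n, text that already uses \r\n is not turned into \r\r\n.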

Answer 2

To write a string to a file in append mode with UTF-8 encoding and \r\n line endings, regardless of the system the code runs on, open it as:

f = open('filename', mode='a', encoding='utf-8', newline='\r\n')

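You have already determined that the encoding should be UTF-8, and that is something you can specify when opening the file. Opening the file in binary mode does nothing to help with that; it looks as though you had already settled on a solution and asked about it, rather than about the problem that actually needs solving.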
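A quick way to check what the newline argument does is to write through a text-mode handle and read the raw bytes back. This is only a sketch: append_with_crlf is a hypothetical helper name, and the temporary directory stands in for wherever the real file lives.

```python
import os
import tempfile


def append_with_crlf(path: str, text: str) -> None:
    """Hypothetical helper: append `text` as UTF-8, writing each "\n" as "\r\n"."""
    with open(path, mode="a", encoding="utf-8", newline="\r\n") as f:
        f.write(text)


# Write into a throwaway directory so the check leaves no files behind elsewhere.
path = os.path.join(tempfile.mkdtemp(), "newFile.txt")
append_with_crlf(path, "some\ntext with\nnew lines in\nit")

# Read back in binary mode to see exactly which bytes reached the disk.
with open(path, "rb") as f:
    raw = f.read()
```

Two caveats: text that already contains \r\n would come out as \r\r\n, since only the \n is translated, and leaving newline at its default (None) writes each \n as os.linesep, which gives platform-native endings instead of forcing \r\n everywhere.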

Newlines when opening a file in binary mode on Windows
