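I have a string like text = u'\xd7\nRecord has been added successfully, record id: 92'. I am trying to remove the escape characters \xd7 and \n from the string so that I can use it for other purposes.
I tried str(text), but it does not remove the \xd7 character and instead raises:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xd7' in position 0: ordinal not in range(128)
Is there any way to remove escape characters like these from the string? Thanks.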
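Answer 0 (score: 2)
You can try the following, using replace:
text = u'\xd7\nRecord has been added successfully, record id: 92'
# u'' literals keep replace() from raising UnicodeDecodeError on Python 2
bad_chars = [u'\xd7', u'\n', u'\x99m', u'\xf0']
for ch in bad_chars:
    text = text.replace(ch, u'')
text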
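The loop above can be wrapped in a small helper so the list of unwanted characters is passed in rather than fixed. This is only a sketch assuming Python 3; the name remove_chars is made up for illustration and does not come from the answer.

```python
def remove_chars(text, bad_chars):
    """Return text with every string in bad_chars removed (hypothetical helper)."""
    for ch in bad_chars:
        text = text.replace(ch, "")
    return text

text = "\xd7\nRecord has been added successfully, record id: 92"
cleaned = remove_chars(text, ["\xd7", "\n"])
print(cleaned)
```

Note that this removes the listed characters everywhere in the string, not just at the start.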
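Answer 1 (score: 1)
You can do this by slicing the string. This assumes the unwanted characters are always exactly the first two:
string = '\xd7\nRecord has been added successfully, record id: 92'
text = string[2:]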
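If the number of leading unwanted characters can vary, a fixed slice index will not work. One possible variation, which is not part of the answer and assumes Python 3, is str.lstrip, which removes any leading run of the given characters:

```python
string = "\xd7\nRecord has been added successfully, record id: 92"

# lstrip treats its argument as a set of characters to drop from the left
# and stops at the first character that is not in that set.
text = string.lstrip("\xd7\n")
print(text)
```

Unlike slicing, this leaves the string untouched when it does not start with any of those characters.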
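Answer 2 (score: 1)
It looks like you have a unicode string, as in Python 2.x, such as
inp_str = u'\xd7\nRecord has been added successfully, record id: 92'
If you want to remove the escape characters, which here essentially means the special characters, this is one way to keep only the ASCII characters without using a regular expression or hard-coding anything.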
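First I encode the string. Since it is already unicode, encoding it to ASCII with errors ignored drops any character outside the ASCII range. After that you only need to strip the '\n'.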
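The encode-then-strip steps described here can be sketched as follows. This is a reconstruction from the description, not the answerer's original code, and the variable names ascii_bytes and cleaned are assumptions made for illustration.

```python
inp_str = u"\xd7\nRecord has been added successfully, record id: 92"

# Encoding to ASCII with the "ignore" error handler silently drops
# every character that has no ASCII representation (here, \xd7).
ascii_bytes = inp_str.encode("ascii", "ignore")

# Decode back to text and strip the newline left at the start.
cleaned = ascii_bytes.decode("ascii").strip("\n")
print(cleaned)
```

Because "ignore" discards all non-ASCII characters, this also removes accented letters and other legitimate text, so it only fits input where ASCII is all you need.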
Hope this helps :)
Answer 3 (score: 0)
I believe a regular expression can help:
import re
text = u'\xd7\nRecord has been added successfully, record id: 92'
res = re.sub('[^A-Za-z0-9]+', ' ', text).strip()
Result:
'Record has been added successfully record id 92'
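Answer 4 (score: 0)
You can use the built-in regular expression library.
import re
text = u'\xd7\nRecord has been added successfully, record id: 92'
result = re.sub('[^A-Za-z0-9]+', ' ', text)
print(result)
That prints Record has been added successfully record id 92, preceded by a single space because the leading \xd7\n is replaced by one space.
If you can live without punctuation, this seems to pass your test case.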
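If the punctuation matters, a narrower character class is one option. The sketch below is a variation on the pattern above rather than anything the answers propose; it assumes Python 3 and removes only characters outside the printable ASCII range (space through tilde).

```python
import re

text = "\xd7\nRecord has been added successfully, record id: 92"

# [^\x20-\x7e] matches anything that is not printable ASCII,
# which covers both \xd7 and the newline but keeps ',' and ':'.
result = re.sub(r"[^\x20-\x7e]+", "", text)
print(result)
```

This also removes tabs and newlines anywhere in the string, so it suits single-line messages like the one in the question.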
Answer 5 (score: 0)
Try a regex.
import re
def escape_ansi(line):
    # Matches either the \xd7 character or a newline
    ansi_escape = re.compile(r'(\xd7|\n)')
    return ansi_escape.sub('', line)
text = u'\xd7\nRecord has been added successfully, record id: 92'
print(escape_ansi(text))