Question

I need to extract the description from a file that looks like this: "TES4!\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0f\x00\x00\x00HEDR\x0c\x00\xd7\xa3p?h\x03\x00\x00\x00\x08\x00\xffCNAM\t\x00Martigen\x00SNAM\xaf\x00Mart's Mutant Mod - RC4\n\nDiverse creatures & NPCs, new creatures & NPCs, dynamic size and stat scaling, increased spawns, improved AI, improved factions, and much more.\n\n\x00MAST\r\x00Fallout3.esm\x00DATA\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00MAST\x16\x00Mart's Mutant Mod.esm\x00DATA\x08"

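I already know how to get the part I need, but it still contains some unwanted data that I don't know how to get rid of: \xaf\x00Mart's Mutant Mod - RC4\n\nDiverse creatures & NPCs, new creatures & NPCs, dynamic size and stat scaling, increased spawns, improved AI, improved factions, and much more.\n\n\x00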

It should become: Mart's Mutant Mod - RC4\n\nDiverse creatures & NPCs, new creatures & NPCs, dynamic size and stat scaling, increased spawns, improved AI, improved factions, and much more.\n\n

Basically, I need a way to get rid of the \x## stuff (which, if left in, ends up as weird characters in the GUI), but I haven't managed to remove it successfully.

[In case you're wondering, it's FO3's .esp files I'm messing with.]

Answer 1

You could try:

import string

cleaneddata = ''.join(c for c in data if c in string.printable)

This assumes you already have the data in a string named data.

Here is how it worked for me:

>>> s = """TES4!\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0f\x00\x00\x00HEDR\x0c\x00\xd7\xa3p?h\x03\x00\x00\x00\x08\x00\xffCNAM\t\x00Martigen\x00SNAM\xaf\x00Mart's Mutant Mod - RC4\n\nDiverse creatures & NPCs, new creatures & NPCs, dynamic size and stat scaling, increased spawns, improved AI, improved factions, and much more.\n\n\x00MAST\r\x00Fallout3.esm\x00DATA\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00MAST\x16\x00Mart's Mutant Mod.esm\x00DATA\x08"""
>>> print(''.join(c for c in s if c in string.printable))
TES4!HEDR
         p?hCNAM    MartigenSNAMMart's Mutant Mod - RC4

Diverse creatures & NPCs, new creatures & NPCs, dynamic size and stat scaling, increased spawns, improved AI, improved factions, and much more.

Fallout3.esmDATAMASTMart's Mutant Mod.esmDATA
>>>

As you can see, it's not ideal, but it may at least be a good first step.
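Part of the reason the result is not ideal is that string.printable also contains the whitespace control characters \t, \r, \x0b and \x0c, so the form feed and carriage return in the sample survive the filter. A minimal sketch of a stricter allow-list, assuming you only want visible ASCII, spaces and newlines (ALLOWED and strip_unprintable are names made up for this example):

```python
import string

# Visible ASCII plus space and newline. Unlike string.printable, this
# leaves out \t, \r, \x0b and \x0c.
ALLOWED = set(string.digits + string.ascii_letters + string.punctuation + " \n")

def strip_unprintable(text):
    """Return text with every character outside ALLOWED removed."""
    return "".join(c for c in text if c in ALLOWED)

cleaned = strip_unprintable("SNAM\xaf\x00Mart's\n\x0cMod\r\x00")
print(repr(cleaned))
```

This still leaves record names such as SNAM glued to the text, so it only helps once you have already cut out the part you want.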

Answer 2

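The first thing we do is pull up some docs. Toward the bottom, they show how to handle the SNAM subrecord. So we use struct to read the length, and then we take that many bytes from the string (I'm guessing you forgot to open the file in binary mode, since the count in your example is off), which are null-terminated. After that there is nothing left to do, because we have the text we were after.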
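A rough sketch of those steps, assuming each subrecord is a 4-byte type followed by a little-endian unsigned 16-bit length and then the payload. The sample bytes in the question fit this: CNAM is followed by \t\x00, i.e. 9, and Martigen\x00 is 9 bytes long. The function names, the naive find() search and the cp1252 decoding are assumptions made for illustration, not something taken from the docs:

```python
import struct

def read_subrecord(data, signature):
    """Return the payload of the first subrecord with this 4-byte signature, or None.

    Assumed layout: signature (4 bytes), length (uint16, little-endian), payload.
    Searching with find() is naive: it can be fooled if the signature bytes
    happen to occur inside some other payload.
    """
    start = data.find(signature)
    if start == -1:
        return None
    (length,) = struct.unpack_from("<H", data, start + 4)
    return data[start + 6:start + 6 + length]

def read_description(data):
    """Return the SNAM text without its null terminator, or None if absent."""
    payload = read_subrecord(data, b"SNAM")
    if payload is None:
        return None
    text = payload.split(b"\x00", 1)[0]  # keep everything before the terminator
    # cp1252 is an assumption; use whatever encoding the file really has.
    return text.decode("cp1252", errors="replace")

# Small hand-built sample so the sketch runs on its own.
desc = b"A short description\n\x00"
sample = (b"CNAM" + struct.pack("<H", 9) + b"Martigen\x00"
          + b"SNAM" + struct.pack("<H", len(desc)) + desc)
print(repr(read_description(sample)))
```

For a real plugin you would read the bytes with open(path, "rb"), where path is wherever your .esp lives. Binary mode matters: with \r\n line endings the description in the question comes to exactly 0xaf (175) bytes, while the string as pasted, with bare \n, is four bytes short.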

Answer 3

If you have gotten as far as

\xaf\x00Mart's Mutant Mod - RC4\n\nDiverse creatures & NPCs, new creatures & NPCs, dynamic size and stat scaling, increased spawns, improved AI, improved factions, and much more.\n\n\x00

you can remove the remaining unwanted \x## sequences by doing the following:

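import re
exp = re.compile(r"\\x\w")  # a literal backslash-x escape in the text, such as \xaf
# Split on the literal text \x00, then drop every chunk that still contains an escape.
newStr = [s for s in text.split("\\x00") if not exp.search(s)]  # text: the string above
newStr = "".join(newStr)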
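Note that this only applies when the string holds the escapes as literal text (a backslash, an x and two hex digits), and that it throws away a whole chunk whenever an escape is found in it. A possible alternative, sketched under the same literal-text assumption, deletes just the escapes themselves (strip_hex_escapes is a made-up name):

```python
import re

# A literal backslash, the letter x, then exactly two hex digits.
HEX_ESCAPE = re.compile(r"\\x[0-9a-fA-F]{2}")

def strip_hex_escapes(text):
    """Remove every literal \\xNN sequence and leave the rest of the text alone."""
    return HEX_ESCAPE.sub("", text)

print(strip_hex_escapes(r"\xaf\x00Mart's Mutant Mod - RC4\n\n\x00"))
```

With this version, text that sits in the same chunk as an escape is kept: for the input \xafMart\x00 it gives Mart, where the split-based snippet would give an empty string.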

Getting rid of \x## in strings (Python)
