Question

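I'm trying to apply control characters to a string, e.g. '\x08 \x08', which should remove the preceding character (move back, write a space, move back).
For example, when I type this into the Python console: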

s = "test\x08 \x08"
print(s)
print(repr(s))

I get this in my terminal:

tes
'test\x08 \x08'

I'm looking for a function, let's call it "function", that would "apply" the control characters to my string:

v = function("test\x08 \x08")
sys.stdout.write(v)
sys.stdout.write(repr(v))

so that I get a "clean" string, free of control characters:

tes
tes

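As far as I understand, in a terminal this part is handled by the client, so there may be a way to get the displayed string using core Unix functions: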

echo -e 'test\x08 \x08' > file.out
cat file.out # control char are here handled by the client
>> tes
cat -v file.out # which prints the "actual" content of the file
>> test^H ^H
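For comparison, the "actual content" view that `cat -v` gives can be approximated in Python. This is only a rough sketch: `caret_notation` is a hypothetical helper name, and it covers ASCII control characters only (the `M-` notation that `cat -v` uses for non-ASCII bytes is left out).

```python
def caret_notation(text):
    """Show ASCII control characters in ^X form, roughly like `cat -v`."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch in "\n\t":
            # cat -v leaves newlines and tabs untouched
            out.append(ch)
        elif code < 32:
            # 0x08 (backspace) becomes ^H, 0x1b (escape) becomes ^[, etc.
            out.append("^" + chr(code + 64))
        elif code == 127:
            out.append("^?")
        else:
            out.append(ch)
    return "".join(out)


print(caret_notation("test\x08 \x08"))  # test^H ^H
```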

Answer 1

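Actually, the answer turned out to be more complicated than simple formatting.

Every character a process sends to the terminal can be seen as a transition in a finite state machine (FSM). The state of this FSM roughly corresponds to the displayed text and the cursor position, but there are many other variables, such as the dimensions of the terminal, the control sequence currently being entered*, the terminal mode (e.g. VI mode / classic BASH console), etc.

A good implementation of this FSM can be seen in the pexpect source code.

To answer my own question: there is no core Unix "function" that can format a string into what is displayed in the terminal, because such a function is specific to the terminal that renders the process's output. You would have to rewrite a complete terminal to handle every possible character and control sequence.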

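However, we can implement a simple one ourselves. We need to define an FSM with an initial state:

displayed string: "" (empty string)
cursor position: 0

and transitions (input characters):

any alphanumeric/space character: replaces the character at the cursor position with itself (or appends it if there is none) and increments the cursor position
\x08 hex code: decrements the cursor position

and feed it the string.

Python solution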

def decode(input_string):

    # Initial state
    # String is stored as a list because
    # python forbids the modification of
    # a string
    displayed_string = [] 
    cursor_position = 0

    # Loop on our input (transitions sequence)
    for character in input_string:

        # Alphanumeric transition
        if character.isalnum() or character.isspace():
            # Overwrite the character under the cursor (or append at the end)
            displayed_string[cursor_position:cursor_position + 1] = [character]
            # Move the cursor forward
            cursor_position += 1

        # Backward transition
        elif character == "\x08":
            # Move the cursor backward, but never before the start of the string
            cursor_position = max(0, cursor_position - 1)
        else:
            print("{} is not handled by this function".format(repr(character)))

    # We transform our "list" string back to a real string
    return "".join(displayed_string)

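An example

>>> decode("test\x08 \x08")
'tes '

Note the trailing space: the space overwrites the final "t" rather than deleting it, which is also what happens on the terminal. Call .rstrip() on the result if you want it removed.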
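The same state machine extends to other single-character controls. The sketch below is an illustration rather than part of the original solution: `apply_controls` is a hypothetical name, the carriage-return transition (`\r` moves the cursor to position 0) is an added assumption about the desired behaviour, and unhandled characters are silently dropped.

```python
def apply_controls(text):
    """Replay backspace and carriage return on a single-line 'screen'."""
    cells = []   # displayed characters, one per cell
    cursor = 0   # index of the cell the next character is written to
    for ch in text:
        if ch == "\x08":
            # Backspace: move left, stopping at the start of the line
            cursor = max(0, cursor - 1)
        elif ch == "\r":
            # Carriage return (assumed transition): back to the first cell
            cursor = 0
        elif ch.isprintable():
            # Overwrite the cell under the cursor, or append at the end
            cells[cursor:cursor + 1] = [ch]
            cursor += 1
        # Any other character is ignored in this sketch
    return "".join(cells)


print(repr(apply_controls("test\x08 \x08")))   # 'tes '
print(repr(apply_controls("abc\x08\x08X")))    # 'aXc'
print(repr(apply_controls("loading\rdone")))   # 'doneing'
```

The second and third calls show why a cursor is needed at all: characters written after moving back overwrite what is there instead of deleting it.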

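A note on control sequences

An ANSI control sequence is a group of characters that acts as a transition on the state of the terminal (display / cursor / terminal mode / ...). It can be seen as a refinement of our FSM's states and transitions, with more sub-states and sub-transitions.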

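For example: when you press the UP key in a classic Unix terminal (e.g. a VT100), you actually enter the control sequence ESC [ A (or ESC O A in application cursor-key mode), where ESC is the hex code \x1b. ESC switches to ESCAPE mode, and the terminal returns to normal mode after the A.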
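The mode switching described above can be sketched as a small tokenizer. This is a simplified illustration, not a complete parser: the state names and the `tokenize` function are made up for the example, and it assumes only the `ESC [ ...` (CSI) and `ESC O x` (SS3) forms plus two-character escapes.

```python
ESC = "\x1b"


def tokenize(stream):
    """Split a stream into plain characters and escape sequences."""
    tokens = []
    state = "NORMAL"
    pending = ""  # characters of the sequence currently being read
    for ch in stream:
        if state == "NORMAL":
            if ch == ESC:
                state, pending = "ESCAPE", ch
            else:
                tokens.append(("char", ch))
        elif state == "ESCAPE":
            pending += ch
            if ch == "[":
                state = "CSI"   # parameters follow, then a final byte
            elif ch == "O":
                state = "SS3"   # exactly one more character follows
            else:
                tokens.append(("sequence", pending))
                state = "NORMAL"
        elif state == "CSI":
            pending += ch
            # A CSI sequence ends on a final byte in the range 0x40-0x7e
            if "\x40" <= ch <= "\x7e":
                tokens.append(("sequence", pending))
                state = "NORMAL"
        else:  # SS3
            pending += ch
            tokens.append(("sequence", pending))
            state = "NORMAL"
    return tokens


print(tokenize("a\x1b[Ab"))
# [('char', 'a'), ('sequence', '\x1b[A'), ('char', 'b')]
```

A fuller version would then dispatch on each sequence (cursor movement, erasing, colours) instead of just collecting it.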

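Some processes interpret this sequence as a vertical cursor movement (VI), others as a step back in the history (BASH): it depends entirely on the program handling the input.

However, a process can also send the same sequence on its output, in which case it will most likely move the cursor up on the screen: that depends on the terminal's implementation.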
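As a small illustration of the output side, a program can emit the sequence itself. The helper names below are hypothetical; the sketch assumes an ANSI-compatible terminal, where `ESC [ n A` moves the cursor up n lines and `ESC [ 2 K` erases the current line.

```python
import sys

ESC = "\x1b"


def cursor_up(lines=1):
    """Return the sequence that moves the cursor up `lines` lines."""
    return "{}[{}A".format(ESC, lines)


def erase_line():
    """Return the sequence that erases the whole current line."""
    return ESC + "[2K"


# Only emit control sequences when writing to a real terminal
if sys.stdout.isatty():
    sys.stdout.write("first line\nsecond line\n")
    # Go back up one line, clear it, and write something else in its place
    sys.stdout.write(cursor_up(1) + erase_line() + "replaced line\n")
```

On a terminal that does not implement these sequences the raw bytes would be shown instead, which is exactly the dependence on the terminal described above.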

There is a good list of ANSI control sequences here.

Apply control characters to a string - Python