Merging lines in Python based on character position

Date: 2018-01-15 13:22:33

Tags: python

My file has alternating lines, chords followed by lyrics:

C       G                 Am
See the stone set in your eyes,
         F                   C
see the thorn twist in your side,
  G         Am F
I wait for you

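How can I merge each chord line with the lyric line that follows it, keeping track of character positions, to produce output like this: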

(C)See the (G)stone set in your (Am)eyes,
see the t(F)horn twist in your s(C)ide,
I (G)wait for y(Am)ou(F)

As shown in "How do I read two lines from a file at a time using python", iterating over a file two lines at a time can be done with:

with open('lyrics.txt') as f:
    for line1, line2 in zip(f, f):
        ...  # process lines
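
As a quick illustration of why this pairs up lines: both arguments to `zip` are the same iterator, so each tuple consumes two consecutive lines. The sketch below uses `io.StringIO` as a stand-in for the file, and the sample text is shortened for the demo.

```python
import io

# io.StringIO stands in for an open file here (an assumption for the demo).
f = io.StringIO("C       G\nSee the stone\n  F\nsee the thorn\n")

# Both arguments are the same iterator, so each tuple takes two consecutive lines.
pairs = [(chords.rstrip("\n"), words.rstrip("\n")) for chords, words in zip(f, f)]
print(pairs)
```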

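But how do I merge the lines so that line 2 is split according to the character positions of the (variable-length) chords in line 1? A simple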

chords = line1.split()

has no position information, and

for i, c in enumerate(line1):
    ...

gives individual characters, not chords.

1 Answer:

Answer 0: (score: 1)

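You can use regexp match objects to extract the positions and contents of the chords from line 1. Care must be taken at the edges: a chord from the previous line can carry over into the start of a line whose first chord does not begin at position 0, and a line can contain chords with no matching lyrics. Both cases occur in the example data.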
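
To see what the match objects provide, here is a minimal sketch that lists the start position and text of every chord on one chord line. `chord_positions` is a hypothetical helper name used only for this illustration; the pattern is the same `\S+` used in the answer's code.

```python
import re

# A chord is a run of non-whitespace characters (same pattern as the answer).
CHORD = re.compile(r"\S+")

def chord_positions(chord_line):
    """Return (start_index, chord) pairs for one chord line (hypothetical helper)."""
    return [(m.start(), m.group(0)) for m in CHORD.finditer(chord_line)]

# Built explicitly so the column positions are easy to verify.
line = " " * 2 + "G" + " " * 9 + "Am" + " " + "F"
print(chord_positions(line))  # [(2, 'G'), (12, 'Am'), (15, 'F')]
```

These positions are what the lyric line gets sliced at.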

import io
import re

# A chord is one or more consecutive non-whitespace characters
CHORD = re.compile(r'\S+')

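def inline_chords(lyrics):
    # Pair each chord line with the lyric line after it. iter() makes this
    # work for lists as well as files, since zip needs one shared iterator.
    lines = iter(lyrics)
    for chords, words in zip(lines, lines):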
        # Produce a list of (position, chord) tuples
        cs = [
            # Handles chords that continue to next line.
            (0, None),
            # Unpack found chords with their positions.
            *((m.start(), m[0]) for m in CHORD.finditer(chords)),
            # Pair for the last chord. Slices rest of the words string.
            (None, None)
        ]
        # Remove the trailing newline, if any (the last line may lack one).
        words = words.rstrip("\n")

        # Zip chords in order to get ranges for slicing lyrics.
        for (start, chord), (end, _) in zip(cs, cs[1:]):
            if start == end:
                continue

            # Extract the relevant lyrics.
            ws = words[start:end]

            if chord:
                yield f"({chord})"

            yield ws

        yield "\n"

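The edges could be handled differently, for example by testing before the loop whether the first chord starts at 0, but I feel a single for loop keeps the code clearer.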
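
For comparison, one way that alternative might look: a sketch that checks for lyrics ahead of the first chord before looping, instead of using the `(0, None)` and `(None, None)` sentinel pairs. `merge_line` is a hypothetical helper name, not part of the code above, and it handles a single chord/lyric pair.

```python
import re

CHORD = re.compile(r"\S+")

def merge_line(chords, words):
    """Inline the chords of one chord line into one lyric line (hypothetical helper)."""
    found = [(m.start(), m[0]) for m in CHORD.finditer(chords)]
    words = words.rstrip("\n")
    parts = []

    # Explicit edge test: lyrics before the first chord are copied unchanged.
    first = found[0][0] if found else len(words)
    if first > 0:
        parts.append(words[:first])

    # Each chord's slice ends where the next chord starts; the last runs to the end.
    ends = [pos for pos, _ in found[1:]] + [None]
    for (start, chord), end in zip(found, ends):
        parts.append(f"({chord}){words[start:end]}")

    return "".join(parts)

print(merge_line(" " * 2 + "G" + " " * 9 + "Am" + " " + "F", "I wait for you\n"))
```

The extra branch is the price of dropping the sentinels, which is the trade-off described above.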

Trying it out:

test = """\
C       G                 Am
See the stone set in your eyes,
         F                   C
see the thorn twist in your side,
  G         Am F
I wait for you
"""

if __name__ == '__main__':
    with io.StringIO(test) as f:
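        # join() accepts the generator directly; end="" avoids a second
        # newline, since the output already ends with one.
        print("".join(inline_chords(f)), end="")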

This produces the desired format:

(C)See the (G)stone set in your (Am)eyes,
see the t(F)horn twist in your s(C)ide,
I (G)wait for y(Am)ou(F)