My file has alternating lines, chords followed by lyrics:
C       G                 Am
See the stone set in your eyes,
         F                   C
see the thorn twist in your side,
  G         Am  F
I wait for you
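How can I merge each pair of consecutive lines to produce output like the following, keeping track of the character positions: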
(C)See the (G)stone set in your (Am)eyes,
see the t(F)horn twist in your s(C)ide,
I (G)wait for y(Am)ou(F)
As shown in "How do I read two lines from a file at a time using python", iterating over a file two lines at a time can be done with
with open('lyrics.txt') as f:
    for line1, line2 in zip(f, f):
        ...  # process lines
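But how do I merge the lines so that line 2 is split at the character positions of the chords in line 1? A simple
chords = line1.split()
carries no position information, and
for i, c in enumerate(line1):
    ...
gives individual characters rather than chords.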
Answer 0 (score: 1)
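You can use regexp match objects to extract both the position and the content of each chord from line 1. Care must be taken at the edges: the same chord can continue onto the next line, so a lyric line may begin before its first chord, and a line can contain a chord with no matching lyrics. Both cases appear in the sample data.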
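As a quick illustration of what match objects provide, here is a minimal sketch that lists the start offset and text of each chord on a single line. The `chord_line` value stands in for the first chord line of the sample data; its spacing is an assumption reconstructed from the desired output, and `positions` is a name introduced only for this example.

```python
import re

# A chord is a run of non-whitespace characters.
CHORD = re.compile(r'\S+')

# First chord line of the sample. The column spacing is an assumption
# derived from the desired output, built explicitly to avoid miscounting.
chord_line = "C" + " " * 7 + "G" + " " * 17 + "Am"

# Each match object knows where it starts and what it matched.
positions = [(m.start(), m.group()) for m in CHORD.finditer(chord_line)]
print(positions)  # prints [(0, 'C'), (8, 'G'), (26, 'Am')]
```

The full function that follows builds its lyric slices from the same kind of (position, chord) pairs.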
import io
import re

# A chord is one or more consecutive non-whitespace characters.
CHORD = re.compile(r'\S+')

def inline_chords(lyrics):
    # zip(lyrics, lyrics) consumes two lines per step, so lyrics must be
    # an iterator such as an open file, not a list.
    for chords, words in zip(lyrics, lyrics):
        # Produce a list of (position, chord) tuples.
        cs = [
            # Sentinel for lyrics that precede the first chord
            # (a chord continuing from the previous line).
            (0, None),
            # Unpack found chords with their positions.
            *((m.start(), m[0]) for m in CHORD.finditer(chords)),
            # Pair for the last chord. Slices the rest of the words string.
            (None, None)
        ]
        # Remove the trailing newline, if there is one.
        words = words.rstrip("\n")
        # Zip chords with their successors to get ranges for slicing lyrics.
        for (start, chord), (end, _) in zip(cs, cs[1:]):
            # The first chord starts at 0, so the sentinel range is empty.
            if start == end:
                continue
            # Extract the relevant lyrics.
            ws = words[start:end]
            if chord:
                yield f"({chord})"
            yield ws
        yield "\n"
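The edges could be handled differently, for example by testing before the loop whether the first chord starts at 0, but I feel that a single for loop keeps the code cleaner.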
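For comparison, here is a rough sketch of that alternative, where the lyrics before the first chord are handled before the inner loop. The name `inline_chords_explicit` and the two-line `sample` are hypothetical and only serve this illustration; the chord spacing in `sample` is assumed from the desired output.

```python
import re

# A chord is a run of non-whitespace characters.
CHORD = re.compile(r'\S+')

def inline_chords_explicit(lyrics):
    """Hypothetical variant: leading lyrics are handled before the loop."""
    # lyrics must be an iterator so that zip pulls two lines per step.
    for chords, words in zip(lyrics, lyrics):
        words = words.rstrip("\n")
        matches = list(CHORD.finditer(chords))
        # Lyrics sung before the first chord of this line, if any.
        first = matches[0].start() if matches else len(words)
        out = [words[:first]]
        # Each chord owns the lyrics up to the start of the next chord.
        starts = [m.start() for m in matches]
        ends = starts[1:] + [None]
        for m, end in zip(matches, ends):
            out.append(f"({m.group()})" + words[m.start():end])
        yield "".join(out) + "\n"

# Second chord/lyric pair of the sample data (spacing is an assumption).
sample = [
    " " * 9 + "F" + " " * 19 + "C\n",
    "see the thorn twist in your side,\n",
]
# iter() matters: zip over a plain list would pair each line with itself.
print("".join(inline_chords_explicit(iter(sample))), end="")
```

This version needs a separate expression for the leading lyrics and a second list for the end offsets, which is the extra bookkeeping the sentinel tuples avoid. The rest of this answer uses the original `inline_chords`.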
Trying it out:
test = """\
C       G                 Am
See the stone set in your eyes,
         F                   C
see the thorn twist in your side,
  G         Am  F
I wait for you
"""

if __name__ == '__main__':
    with io.StringIO(test) as f:
        print("".join(inline_chords(f)))
This produces the desired format:
(C)See the (G)stone set in your (Am)eyes,
see the t(F)horn twist in your s(C)ide,
I (G)wait for y(Am)ou(F)