Suppose I have a string (not a file) that spans multiple lines:
multiline_string = '''I met a traveller from an antique land
Who said: Two vast and trunkless legs of stone
Stand in the desert... near them, on the sand,
Half sunk, a shattered visage lies, whose frown,
And wrinkled lip, and sneer of cold command,
Tell that its sculptor well those passions read
Which yet survive, stamped on these lifeless things,
The hand that mocked them and the heart that fed;
And on the pedestal these words appear:
'My name is Ozymandias, king of kings;
Look on my works, ye Mighty, and despair!'
Nothing beside remains. Round the decay
Of that colossal wreck, boundless and bare
The lone and level sands stretch far away.'''
I want to get only certain lines of the string, as a single string (not as a list of strings). One way to do this is:
pedestal_lines = "\n".join(multiline_string.splitlines()[8:11])
print(pedestal_lines)
Output:
And on the pedestal these words appear:
'My name is Ozymandias, king of kings;
Look on my works, ye Mighty, and despair!'
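But this approach isn't very good: it has to split the string into a list of strings, index that list, and then join the list back together with the str.join() method. Not to mention that it looks ugly and isn't very readable. Is there a more elegant/Pythonic way to achieve this?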
Answer 0 (score: 4)
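If you don't want to split the string, you could do one of the following.
Please forgive any off-by-one errors I may have made in the code below.
Regular expression: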
import re
print(re.sub(r"^(.*\n){8}((?:.*\n){3})([\s\S]*)", r"\2", multiline_string), end="")
(This captures a group of 8 lines, then a group of 3 lines, then the rest; the whole match is replaced by the second group.)
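As a rough sketch of the same regex idea with the line numbers as parameters: the helper name extract_lines_regex and its arguments are made up for this illustration. It uses re.match with a capture group instead of re.sub, so the rest of the string never has to be matched.

```python
import re

def extract_lines_regex(text, start, count):
    """Return `count` lines starting at 0-based line `start`, as one string."""
    # Skip `start` complete lines, then capture up to `count` lines.
    # The last captured line may or may not end with a newline.
    pattern = r"(?:.*\n){%d}((?:.*\n?){%d})" % (start, count)
    match = re.match(pattern, text)
    if match is None:
        return ""  # the text has fewer than `start` complete lines
    captured = match.group(1)
    # Drop a single trailing newline so the result matches the join() version.
    return captured[:-1] if captured.endswith("\n") else captured

sample = "zero\none\ntwo\nthree\nfour"
print(extract_lines_regex(sample, 1, 2))
```

With the poem as written in the question, the pedestal lines would be start=8, count=3.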
Position extraction + slicing:
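# Offsets of every newline; linefeed_pos[k] is the newline that ends line k (0-based).
linefeed_pos = [i for i, c in enumerate(multiline_string) if c == "\n"]
# Start just after the newline that ends line 7, stop at the newline that ends line 10.
print(multiline_string[linefeed_pos[7] + 1:linefeed_pos[10]])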
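(This uses a list comprehension to extract the positions of the linefeed characters in the original string, then slices using those line-indexed positions.) The drawback of this approach is that it computes all the indices, not just those up to the upper bound. That is easily fixed by driving a generator expression from a list comprehension, so the scan stops as soon as no more indices are needed. The generator has to be created once, outside the list comprehension, so that each next() call resumes where the previous one stopped:
newline_iter = (i for i, c in enumerate(multiline_string) if c == "\n")
linefeed_pos = [next(newline_iter) for _ in range(12)]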
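For what it's worth, the position-based idea can also be sketched with str.find instead of enumerate, which avoids looping over every character in Python. This is a swapped-in variant rather than the code above, and the name slice_lines is hypothetical:

```python
def slice_lines(text, start, stop):
    """Return lines [start:stop) of `text` (0-based) as a single string."""
    if stop <= start:
        return ""
    # Walk forward to the first character of line `start`.
    begin = 0
    for _ in range(start):
        nl = text.find("\n", begin)
        if nl == -1:
            return ""  # the text has no line number `start`
        begin = nl + 1
    # Walk forward past the newline that ends line `stop - 1`.
    end = begin
    for _ in range(stop - start):
        nl = text.find("\n", end)
        if nl == -1:
            return text[begin:]  # ran out of lines; return what is left
        end = nl + 1
    return text[begin:end - 1]  # exclude the final newline

sample = "zero\none\ntwo\nthree\nfour"
print(slice_lines(sample, 1, 3))
```

It never scans past the last requested line, and it builds no list at all.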
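Maybe slicing/extracting is better than splitting and joining (I understand that watching a big list go to waste just to pick 3 lines is hard to accept), but I wouldn't call it Pythonic.
If performance/memory matters and you have a lot of lines, both methods above should be faster than yours. If not, stick to your solution.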
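If you want to check the performance claim on your own data rather than take my word for it, something like the following timeit sketch might do. The synthetic input and the function names by_split and by_positions are assumptions made up for this example, and the numbers will vary with the input and the Python version:

```python
import timeit

# Synthetic input (an assumption for illustration): many short numbered lines.
text = "\n".join("line %d" % i for i in range(10000))

def by_split(s, start, stop):
    # The question's approach: split everything, index the list, re-join.
    return "\n".join(s.splitlines()[start:stop])

def by_positions(s, start, stop):
    # Position extraction: collect only the first `stop` newline offsets, then slice.
    # Assumes 0 <= start < stop and that `s` contains at least `stop` newlines.
    newlines = (i for i, c in enumerate(s) if c == "\n")
    pos = [next(newlines) for _ in range(stop)]
    begin = pos[start - 1] + 1 if start else 0
    return s[begin:pos[stop - 1]]

# Both approaches should agree before their timings are compared.
assert by_split(text, 8, 11) == by_positions(text, 8, 11)

for fn in (by_split, by_positions):
    seconds = timeit.timeit(lambda: fn(text, 8, 11), number=200)
    print("%-13s %.4f s" % (fn.__name__, seconds))
```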