Question

This seems like it should be simple, but for some reason I can't get Python to split correctly in the following code.

f = open('text', 'r')
x = f.read()
f.close()
result = x.split('^ggggg', 1)[0]

The file "text" has the following contents:

aaaaa1234
bbbbb1234
ccccc1234
ggggg1234
hhhhh1234

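I expected "result" to contain everything before the ggggg line, but it contains the entire text. How can I get Python to split at the point where a line starts with "ggggg"?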

Answer 1

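First, str.split() splits only on literal text or, when the separator is None (the default), on arbitrary whitespace. Regular expressions are not supported, so '^ggggg' is treated as the literal characters ^ggggg, which never occur in the file. You can split the file contents on \nggggg instead: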

x.split('\nggggg', 1)[0]
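To see the difference concretely, here is a minimal sketch that uses an in-memory string standing in for the contents of the file 'text'. The name x follows the question; the name unchanged is only illustrative.

```python
# Contents of the file 'text' from the question, held in one string.
x = "aaaaa1234\nbbbbb1234\nccccc1234\nggggg1234\nhhhhh1234\n"

# '^' has no special meaning for str.split(): the literal text '^ggggg'
# never occurs, so the list has one element, the whole string.
unchanged = x.split('^ggggg', 1)[0]

# Splitting on a newline followed by ggggg cuts just before the ggggg line.
result = x.split('\nggggg', 1)[0]

print(unchanged == x)
print(repr(result))
```

Note that the newline after ccccc1234 is part of the separator, so it is not included in result.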

If you must use a regular expression, use the re.split() function.
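A minimal sketch of the regular-expression route, again assuming the file contents are already in a string x. The re.MULTILINE flag is needed so that ^ matches at the start of each line and not only at the start of the string.

```python
import re

# Contents of the file 'text' from the question, held in one string.
x = "aaaaa1234\nbbbbb1234\nccccc1234\nggggg1234\nhhhhh1234\n"

# With re.MULTILINE, '^' matches at the beginning of every line.
# maxsplit=1 stops after the first line that starts with ggggg.
result = re.split(r'^ggggg', x, maxsplit=1, flags=re.MULTILINE)[0]

print(repr(result))
```

Unlike the '\nggggg' separator, this pattern also matches when the ggggg line is the first line of the file, and it keeps the newline that ends the preceding line.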

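For efficiency, you could instead loop over the file line by line, test whether each line starts with ggggg, and stop iterating when one does: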

result = []

with open('text', 'r') as f:
    for line in f:
        if line.startswith('ggggg'):
            break
        result.append(line)

This way you don't have to read the whole file. You could also use itertools.takewhile():

from itertools import takewhile
with open('text', 'r') as f:
    result = list(takewhile(lambda l: not l.startswith('ggggg'), f))

Both options produce a list of strings, one per line, each keeping its trailing newline.
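If a single string is wanted rather than a list, the lines can be joined afterwards. The sketch below uses io.StringIO as a stand-in for the open file so that it runs without 'text' on disk; the names lines and text are illustrative.

```python
import io
from itertools import takewhile

# io.StringIO stands in for open('text', 'r'); it iterates line by line too.
f = io.StringIO("aaaaa1234\nbbbbb1234\nccccc1234\nggggg1234\nhhhhh1234\n")

# Collect lines until the first one that starts with ggggg.
lines = list(takewhile(lambda l: not l.startswith('ggggg'), f))

# Each line still ends with '\n', so join with an empty separator.
text = ''.join(lines)

print(lines)
print(repr(text))
```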

Answer 2

str.split() does not accept regular expressions.

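However, you can use the string '\nggggg', which includes the preceding \n and therefore matches as long as the ggggg line is not at the very top of the file.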
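One possible way around the top-of-file case is to prepend a newline before splitting and strip it again afterwards. This is only a sketch; the helper name before_marker and its marker parameter are hypothetical, not part of any library.

```python
def before_marker(x, marker='ggggg'):
    """Return the text before the first line that starts with marker.

    before_marker is a hypothetical helper written for this example.
    """
    # Prepending '\n' lets the separator '\n' + marker match on line 1 as well;
    # the [1:] slice removes that extra newline from the result.
    return ('\n' + x).split('\n' + marker, 1)[0][1:]


x = "aaaaa1234\nbbbbb1234\nccccc1234\nggggg1234\nhhhhh1234\n"
print(repr(before_marker(x)))
print(repr(before_marker("ggggg1234\nhhhhh1234\n")))
```

If no line starts with the marker, the function returns the input unchanged.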

Another possibility is to use the regular expression functions in the re module.

Answer 3

It's better not to read the whole file, but for general knowledge, here is how to handle your problem easily with plain string operations:

result = x[0:x.find("ggggg")]
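Two caveats apply to the slice above: str.find() returns -1 when the text is absent, which makes the slice silently drop the last character, and it matches "ggggg" anywhere in the text, not only at the start of a line. A guarded sketch of the first point follows; the helper name text_before is hypothetical.

```python
def text_before(x, marker="ggggg"):
    """Return everything before the first occurrence of marker.

    text_before is a hypothetical helper written for this example.
    """
    index = x.find(marker)
    if index == -1:
        # Marker not found: return the text unchanged instead of x[:-1].
        return x
    return x[:index]


x = "aaaaa1234\nbbbbb1234\nccccc1234\nggggg1234\nhhhhh1234\n"
print(repr(text_before(x)))
print(repr(text_before("no marker here")))
```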

Answer 4

If I understand your question correctly, you want to set result to everything before the ggggg line?

You could try the following:

result = ''
with open('text', 'r') as f:  # open the file 'text' read-only
    for line in f:  # for each line...
        if not line.startswith('ggggg'):  # 'ggggg' is not at the start of the line
            result += line  # append the line (it already ends with '\n')
        else:
            break  # stop at the first ggggg line
# no f.close() needed: the with block closes the file

If you would rather just skip the ggggg line and keep everything else, try:

result = ''
with open('text', 'r') as f:  # open the file 'text' read-only
    for line in f:  # for each line...
        if not line.startswith('ggggg'):  # 'ggggg' is not at the start of the line
            result += line  # append the line (it already ends with '\n')
        else:
            continue  # skip the ggggg line and keep reading
# no f.close() needed: the with block closes the file

Answer 5

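There is no need for Python's split function at all. I got the same result with simple string functions. Apologies if you strictly need an answer that uses lists and split.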

#!/usr/bin/env python3
with open('text', 'r') as fh:
    for line in fh:
        if line.startswith('ggggg'):  # the prefix must be a quoted string
            break
        print(line, end='')  # line already ends with a newline

print("DONE")

Python: How do you split a string at a line that starts with "ggggg"?
