Extract a specific part from text

Date: 2018-04-09 08:27:28

Tags: python

Given the following strings as input:

n [time%1:28:03::] [time] clock time
n [year%1:28:01::] [year] twelvemonth, yr
v [lily%1:20:00::] [lily] flower
v [man%1:05:01::] [man] homo, human being, human
a [government%1:14:00::] [government] authorities, regime

The expected output is:

[time] clock time
[year] twelvemonth, yr
[lily] flower
[man] homo, human being, human
[government] authorities, regime

I have tried this code to split the text and pick out the part I need:

def space_split(a):
    if a.count(" ") == 1:
        return a.split(" ")[0]
    else:
        return " ".join(a.split(" ", 2)[2:])


print(space_split("v [wind%2:35:00::] [wind] wind up, coil the spring of a mechanism"))

The output I get is:

[wind] wind up, coil the spring of a mechanism

Now, how do I run this for multiple inputs? Can anyone help?

2 answers:

Answer 0 (score: 3)

If the input is a single multi-line string, you can use:

for line in list_input.split('\n'):
    print(space_split(line))
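
As a self-contained sketch of that loop, assume the five sample lines from the question are held in one multi-line string. The name `list_input` comes from this answer, but its contents here are an assumption. `splitlines()` is used in place of `split('\n')` so that a trailing newline in the input would not produce an empty last item:

```python
def space_split(a):
    # Same function as in the question: drop the first two
    # space-separated tokens and keep the rest of the line.
    if a.count(" ") == 1:
        return a.split(" ")[0]
    else:
        return " ".join(a.split(" ", 2)[2:])


# Assumed input: the question's sample lines in one string.
list_input = """n [time%1:28:03::] [time] clock time
n [year%1:28:01::] [year] twelvemonth, yr
v [lily%1:20:00::] [lily] flower
v [man%1:05:01::] [man] homo, human being, human
a [government%1:14:00::] [government] authorities, regime"""

results = [space_split(line) for line in list_input.splitlines()]
for result in results:
    print(result)
```

Collecting the results in a list first keeps them available for later use, for example writing them to a file.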

If the input comes from a file:

with open('file_path', 'r') as your_file:
    for line in your_file:
        # Strip the trailing newline so print() does not add a blank line.
        print(space_split(line.rstrip('\n')))

To write the results to a file:

with open('output_file', 'w') as your_file:
    for line in list_input.split('\n'):
        # Write through the handle opened above (your_file).
        your_file.write(space_split(line) + '\n')
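
The file-reading and file-writing snippets can also be combined into one step. The following is only a sketch: the helper name `convert_file` and its two path parameters are hypothetical and are not part of the original answer.

```python
def space_split(a):
    # Same function as in the question.
    if a.count(" ") == 1:
        return a.split(" ")[0]
    else:
        return " ".join(a.split(" ", 2)[2:])


def convert_file(input_path, output_path):
    # Hypothetical helper: read each line from input_path, keep the
    # part after the first two tokens, and write it to output_path.
    with open(input_path, 'r') as source, open(output_path, 'w') as target:
        for line in source:
            target.write(space_split(line.rstrip('\n')) + '\n')
```

It would be called as `convert_file('file_path', 'output_file')`, where both paths are placeholders for your own files.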

Answer 1 (score: 0)

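I believe you are looking for a way to partition the string rather than split it. Partitioning at the last "[" means you do not have to work out which piece of a split holds the part you need.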

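def getPartitionResult(string_input):
    # rpartition splits at the last "[" and returns (head, separator, tail).
    _, partition_parameter, needed_result = string_input.rpartition("[")
    # Put the "[" back in front of the tail.
    needed_result = partition_parameter + needed_result
    return needed_result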

if __name__ == "__main__":
    list_input = [
        "n [time%1:28:03::] [time] clock time",
        "n [year%1:28:01::] [year] twelvemonth, yr",
        "v [lily%1:20:00::] [lily] flower",
        "v [man%1:05:01::] [man] homo, human being, human",
        "a [government%1:14:00::] [government] authorities, regime"
        ]
    for input_arg in list_input:
        print(getPartitionResult(input_arg))
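
To make the behaviour of `str.rpartition` concrete, here is a small illustration using one of the sample lines from the question. The variable names are chosen for this sketch only:

```python
line = "n [time%1:28:03::] [time] clock time"

# rpartition splits at the LAST "[" and returns (head, separator, tail).
head, sep, tail = line.rpartition("[")
print((head, sep, tail))
# → ('n [time%1:28:03::] ', '[', 'time] clock time')

# Joining separator and tail restores the bracketed part.
print(sep + tail)
# → [time] clock time

# If the separator is missing, head and sep are empty strings and
# the whole input ends up in tail.
print("no brackets here".rpartition("["))
# → ('', '', 'no brackets here')
```

Note that this approach relies on the last "[" in the line being the one that opens the wanted part. If the description text itself could contain a "[", the partition would land in the wrong place.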