Question

I'm new to Python and I'm looking to parse several text files (~5000) with data like this:

Random text......
   ID: ABC123456

Random text......

Title

Contained Text

End

Random text......

Each file is about 3000 lines long. I want to extract the ID and the text between Title and End into a CSV file that looks like this:

ID         Text

ABC123456  Contained Text 1

ABC123457  Contained Text 2

Any help is much appreciated!

Answer 1

Try something like this, with the checks placed in the while loop after the readline call:

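record_id = None
in_section = False  # True only between the "Title" and "End" lines
with open("test.txt", "r") as f:
    while True:
        line = f.readline()
        if not line:  # readline() returns "" at end of file
            break
        text = line.strip()  # remove indentation and the trailing newline
        if text.startswith("ID:"):
            record_id = text[3:].strip()
        elif text == "Title":
            in_section = True
        elif text == "End":
            in_section = False
        elif in_section and text and record_id is not None:
            print(record_id + " " + text)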

This should print all the lines you need (formatting aside).

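Writing those lines to another file boils down to replacing print(...) with other_file.write(...), where other_file is a handle to a second file opened for writing. Note that write() does not append a newline, so add "\n" yourself.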
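
For the full task in the question (many files, one CSV), the same idea could be combined with the standard csv module. The following is only a minimal sketch under several assumptions: each file holds a single ID and a single Title/End block, the files are UTF-8 text matching "*.txt" in one directory, and the marker lines read exactly "Title" and "End". The names extract_record and write_csv are hypothetical helpers invented for this illustration.

```python
import csv
from pathlib import Path


def extract_record(lines):
    """Return (record_id, text) for one file's lines; either may be empty."""
    record_id = ""
    collected = []
    in_section = False  # True only between the "Title" and "End" lines
    for line in lines:
        text = line.strip()
        if text.startswith("ID:"):
            record_id = text[3:].strip()
        elif text == "Title":
            in_section = True
        elif text == "End":
            in_section = False
        elif in_section and text:
            collected.append(text)
    # Assumption: multi-line text is joined with spaces into one CSV cell.
    return record_id, " ".join(collected)


def write_csv(input_dir, output_path, pattern="*.txt"):
    """Write a header row plus one row per matching file in input_dir."""
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["ID", "Text"])
        for path in sorted(Path(input_dir).glob(pattern)):
            with open(path, encoding="utf-8") as f:
                writer.writerow(extract_record(f))
```

Calling write_csv("data", "output.csv") would then scan a hypothetical data directory. Using csv.writer rather than plain write() calls means text that contains commas or quotes is escaped correctly.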
