Parsing text with Python

Time: 2017-06-19 18:13:31

Tags: python-3.x split

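I have data in a text file that resembles the example data below. What I want to do is search the text file and return everything between "SpecialStuff" and the next ";", as I have done in the example output. I am pretty new to Python, so any tips are much appreciated. Would something like .split() work?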

Example Data:

stuff:
    1
    1
    1
    23

];

otherstuff:
    do something
    23
    4
    1

];

SpecialStuff
    select
        numbers
        ,othernumbers
        words
;

MoreOtherStuff
randomstuff
@#123


Example Output:

select
        numbers
        ,othernumbers
        words

3 answers:

Answer 0 (score: 1)

You could try this:

seenSpecialStuff = False  # Tracks whether the 'SpecialStuff' line has been seen.
# Open the original file for reading and a new file for writing.
# The with statement closes both files automatically.
with open("filename.txt", "r") as file, open("result.txt", "w") as output:
    for line in file:
        if ";" in line:
            seenSpecialStuff = False  # Stop copying when a semicolon is seen.
        if seenSpecialStuff:
            output.write(line)  # Copy the line while the tracker is active.
        if "SpecialStuff" in line:
            seenSpecialStuff = True  # Start copying after the SpecialStuff line.

This produces a file named result.txt containing:

  select
    numbers
    ,othernumbers
    words

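This code can be improved! Since this may be homework, you may want to do some more research on how to make it more efficient. Hopefully it is a useful starting point for you!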

Cheers!

Edit

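If you want the code to react only to a line that is exactly "SpecialStuff" (rather than any line that contains "SpecialStuff"), you can easily change the "if" statements to be more specific: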

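seenSpecialStuff = False  # Tracks whether the exact 'SpecialStuff' line has been seen.
# The with statement closes both files automatically.
with open("my.txt", "r") as file, open("result.txt", "w") as output:
    for line in file:
        if line.replace("\n", "") == ";":
            seenSpecialStuff = False  # Stop copying at a line that is exactly ';'.
        if seenSpecialStuff:
            output.write(line)
        if line.replace("\n", "") == "SpecialStuff":
            seenSpecialStuff = True  # Start copying after the exact marker line.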
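
As a rough sketch of the same line-by-line idea, the loop can be wrapped in a function so it is easy to try on a list of lines without touching any files. The name extract_block and its parameters start_marker and end_marker are made up for this illustration and are not part of the answer above. It also assumes the markers sit on lines of their own (surrounding whitespace is ignored) and, unlike the loops above, it stops after the first block:

```python
def extract_block(lines, start_marker="SpecialStuff", end_marker=";"):
    """Collect the lines between a start_marker line and the next end_marker line.

    Hypothetical helper: the name and parameters are assumptions for this sketch.
    """
    collected = []
    inside = False  # True once the start marker line has been seen
    for line in lines:
        stripped = line.strip()  # ignore indentation and the trailing newline
        if inside:
            if stripped == end_marker:
                break  # stop at the first end marker after the start marker
            collected.append(line)
        elif stripped == start_marker:
            inside = True
    return collected


# A shortened version of the example data from the question.
sample = [
    "otherstuff:\n",
    "    23\n",
    "];\n",
    "\n",
    "SpecialStuff\n",
    "    select\n",
    "        numbers\n",
    "        ,othernumbers\n",
    "        words\n",
    ";\n",
    "\n",
    "MoreOtherStuff\n",
]
block = extract_block(sample)
print("".join(block), end="")
```

Because an open file object can be iterated line by line, the same function should also accept one directly, for example extract_block(open_file) inside a with block.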

Answer 1 (score: 0)

git config --global http.sslverify "false"

Answer 2 (score: 0)

Don't use str.split(); str.find() is more than enough:

parsed = None
with open("example.dat", "r") as f:
    data = f.read()  # load the file into memory for convenience
    start_index = data.find("SpecialStuff")  # find the beginning of your block
    if start_index != -1:
        end_index = data.find(";", start_index)  # find the end of the block
        if end_index != -1:
            parsed = data[start_index + len("SpecialStuff"):end_index]  # grab everything in between
if parsed is None:
    print("`SpecialStuff` Block not found")
else:
    print(parsed)

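Keep in mind that this captures everything between the two markers, including newlines and other whitespace. If you don't want that, you can also call parsed.strip() to remove the leading and trailing whitespace.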
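
To illustrate the strip() remark, here is a minimal sketch of the same str.find() approach as a function that works on a string already in memory. The name between and its parameters are assumptions made for this example, not something from the answer itself:

```python
def between(text, start_marker="SpecialStuff", end_marker=";"):
    """Return the stripped text between start_marker and the next end_marker.

    Hypothetical helper for illustration; returns None if either marker is missing.
    """
    start = text.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)  # skip past the marker itself
    end = text.find(end_marker, start)  # first end marker after the start marker
    if end == -1:
        return None
    return text[start:end].strip()  # drop leading and trailing whitespace


# A shortened version of the example data from the question.
sample_text = (
    "otherstuff:\n    23\n];\n\n"
    "SpecialStuff\n    select\n        numbers\n;\n\n"
    "MoreOtherStuff\n"
)
result = between(sample_text)
print(result)
```

Note that strip() only trims the two ends of the block; the indentation of the inner lines is left as it was in the file.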