Deleting multiple lines from a text file

Time: 2018-11-05 22:09:14

Tags: python python-3.x

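I've done some research on this, and so far I've found that I need to read the file line by line rather than load it all at once, because this file will eventually get very large. For each line I check for the string I don't need and carry on reading/writing from there.

My program searches a text file by date, reads the lines below the date, and stops when it reaches "end". I need to be able to delete the table that runs from the date to "end" and replace it with another table, in the same format, that is stored in a dictionary.

This is what I have so far.

This is the text file: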

05/11/18
test1 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
test2 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A 
test3 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A 
test4 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A 
end

06/11/18
test1 N/A N/A 09:30 18:00 09:30 18:00 09:30 18:00 09:30 18:00
test2 08:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00
test3 09:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00
test4 10:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00
end

This is the dictionary with the new table:

  {'test1': ['N/A', 'N/A', '09:30', '18:00', '09:30', '18:00', '09:30', '18:00', '09:30', '18:00'], 
'test2': ['08:30', '18:00', '10:30', '18:00', '10:30', '18:00', '10:30', '18:00', '10:30', '18:00'], 
'test3': ['09:30', '18:00', '07:30', '18:00', '07:30', '18:00', '07:30', '18:00', '07:30', '18:00'], 
'test4': ['10:30', '18:00', '08:30', '18:00', '08:30', '18:00', '08:30', '18:00', '08:30', '18:00']}

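By the way, I'm trying to replace the table for the date 05/11/18.

This is the code that reads each line in the file and looks for the line that starts with the date.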

received="05/11/18"
with open("StaffTimes.txt","r+") as file:
    new_f=file.readlines()
    file.seek(0) #Puts pointer to start of file
    for line in new_f: #For every line in the file

        if received not in line: #If the date is not in the line
            file.write(line) #Re-write the line into the file

        if received in line:
            while True:
                nextLine=next(file, "").strip() #Stores the next line in nextLine
                if nextLine=="end": #Loops until end is found
                    next(file, "") #Now pointer is at line after end
                    break

This is the code that writes the dictionary back to the text file. (This is not the problem; it is only here for context.)

    file.write(received)
    file.write("\n")
    usernameList=["test1", "test2", "test3", "test4"] #This will be received from client
    for username in usernameList:
        file.write(username)
        file.write(" ")
        workTimes=times.get(username)
        for time in workTimes:
            file.write(time)
            file.write(" ")
        file.write("\n")
    file.write("end")
    file.write("\n")
    file.write("\n")

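Overall, my problem is that I only seem to be able to delete the date line, not the table beneath it. It also ends up rewriting the whole contents anyway, including the new table both with and without the date.

I need the text file to look like this after the rewrite: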

05/11/18
test1 N/A N/A 09:30 18:00 09:30 18:00 09:30 18:00 09:30 18:00 
test2 08:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00 
test3 09:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00 
test4 10:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00 
end

06/11/18
test1 N/A N/A 09:30 18:00 09:30 18:00 09:30 18:00 09:30 18:00
test2 08:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00
test3 09:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00
test4 10:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00
end

3 Answers:

Answer 0: (score: 0)

Perhaps a better approach would be:

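# received (the date string) and times (the dict of new rows) come from the question
def write_replacement_lines_to_new_file(new_file):
    # write the date line, one row per user, then the "end" marker
    new_file.write(received + "\n")
    for username, work_times in times.items():
        new_file.write(" ".join([username] + work_times) + "\n")
    new_file.write("end\n")

with open("/path/to/old_file.txt", "r") as old_file, \
     open("/path/to/new_file.txt", "w") as new_file:
    for line in old_file:
        if received in line:
            write_replacement_lines_to_new_file(new_file)

            # now skip lines in old_file until we get to the "end" marker.
            # the inner "for" loop continues reading from the current
            # position in old_file
            for line in old_file:
                if line.strip() == "end":
                    break
        else:
            new_file.write(line)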

Then, finally, just copy new_file over old_file (perhaps using os.rename()).
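For the final swap, here is a minimal sketch under some assumptions. `rewrite_in_place` and its `transform` callback are hypothetical names, not part of the answer. The sketch uses os.replace() rather than os.rename() because os.replace() overwrites an existing destination on every platform, whereas os.rename() raises an error on Windows when the destination already exists.

```python
import os
import tempfile


def rewrite_in_place(path, transform):
    """Stream `path` through `transform` into a temp file, then swap it in.

    `transform(src, dst)` is a hypothetical callback that reads lines from
    `src` and writes the desired output to `dst`.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # create the temp file next to the original so the swap stays on one filesystem
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as dst, open(path, "r") as src:
            transform(src, dst)
        os.replace(tmp_path, path)  # overwrites the original file
    except BaseException:
        # something failed before the swap completed: drop the partial temp file
        os.remove(tmp_path)
        raise
```

Because the original file is only replaced after the new content has been written in full, a failure part-way through leaves the original untouched.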

Answer 1: (score: 0)

First make a simple iterator that gives you each block:

import re

def iter_dates_in_file(filehandle_in):
    """Yield each date block (date line through "end") as one string."""
    for line in filehandle_in:
        if re.match(r"\d{1,2}/\d{1,2}/\d{2,4}", line.strip()):
            matched = [line]
            # keep reading until the "end" marker (or the file runs out)
            while matched[-1].strip() != "end":
                nxt = next(filehandle_in, None)
                if nxt is None:
                    break
                matched.append(nxt)
            yield ''.join(matched)

You can then test each block it yields, for example by checking whether it starts with the date you want to replace.
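One way to test each block is sketched below. It is self-contained, so it repeats a version of the iterator; `format_block` and `replace_block` are hypothetical helpers, and the sample data lives in an `io.StringIO` instead of a real file. Note that this iterator yields only the date blocks, so lines between blocks (such as blank separators) are dropped, and the sketch writes a blank line after each block itself.

```python
import io
import re

DATE_RE = re.compile(r"\d{1,2}/\d{1,2}/\d{2,4}")


def iter_dates_in_file(filehandle_in):
    """Yield each block, from its date line through "end", as one string."""
    for line in filehandle_in:
        if DATE_RE.match(line.strip()):
            matched = [line]
            while matched[-1].strip() != "end":
                nxt = next(filehandle_in, None)
                if nxt is None:  # file ended before an "end" marker
                    break
                matched.append(nxt)
            yield "".join(matched)


def format_block(date, times):
    """Build a replacement block in the file's layout (hypothetical helper)."""
    rows = [" ".join([user] + values) for user, values in times.items()]
    return "\n".join([date] + rows + ["end"]) + "\n"


def replace_block(src, dst, date, times):
    """Copy every block from src to dst, swapping in the new table for `date`."""
    for block in iter_dates_in_file(src):
        if block.startswith(date):
            block = format_block(date, times)
        dst.write(block + "\n")  # blank line after each block


src = io.StringIO(
    "05/11/18\ntest1 N/A N/A\nend\n\n"
    "06/11/18\ntest1 09:30 18:00\nend\n"
)
dst = io.StringIO()
replace_block(src, dst, "05/11/18", {"test1": ["09:30", "18:00"]})
print(dst.getvalue())
```

With real files, `src` and `dst` would be two open file handles, and the output file would then be moved over the original.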

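Answer 2: (score: 0)

  • You read the file until you find the date you are looking for
    • until then, you simply copy all lines (including the one with the date you are looking for) to a second file.
  • Once you hit the date, you skip lines until you reach the next "end" (using a boolean).
  • Write the new data and "end" to the new file, reset the bool and continue

Creating a test file: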

t = """ Some other data

05/11/18 
test1 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A 
test2 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A  
test3 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A  
test4 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A  
end

06/11/18 
test1 N/A N/A 09:30 18:00 09:30 18:00 09:30 18:00 09:30 18:00 
test2 08:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00 
test3 09:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00 
test4 10:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00 
end"""

with open ("file.txt","w") as f: 
    f.write(t)

Code to read the file and write the new one:

look_for ="05/11/18"     
data = { 'test1': ['N/A', 'N/A', '09:30', '18:00', '09:30', '18:00', '09:30', '18:00', '09:30', '18:00'],  
'test2': ['08:30', '18:00', '10:30', '18:00', '10:30', '18:00', '10:30', '18:00', '10:30', '18:00'],  
'test3': ['09:30', '18:00', '07:30', '18:00', '07:30', '18:00', '07:30', '18:00', '07:30', '18:00'],  
'test4': ['10:30', '18:00', '08:30', '18:00', '08:30', '18:00', '08:30', '18:00', '08:30', '18:00']}

with open("file.txt","r") as f, open("file_2.txt","w") as f_new:
    # remember if we found it
    found_it = False
    for line in f:  # iterate lazily instead of loading every line into memory

        # handles the case we are currently in the region we need to skip lines till end
        if found_it:
            if line.startswith("end"):
                found_it = False

                # write replacement data and add end
                for k in data:
                    f_new.write(' '.join( [k] + data[k] +["\n"] ) )
                f_new.write(line) # add the end
                continue # "end" is already written; don't copy it a second time below

            else:

                # found it but still reading its data: 
                # skip line from output
                continue

        # not in the critical region, just transfer lines
        if not line.startswith( look_for ):
            f_new.write(line)
            continue
        else:
            found_it = True
            f_new.write(line) # still need the date

Test code:

with  open("file_2.txt","r") as f:
    print(f.read())

Output:

Some other data

05/11/18
test1 N/A N/A 09:30 18:00 09:30 18:00 09:30 18:00 09:30 18:00 
test2 08:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00 
test3 09:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00 
test4 10:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00 
end

06/11/18
test1 N/A N/A 09:30 18:00 09:30 18:00 09:30 18:00 09:30 18:00
test2 08:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00
test3 09:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00
test4 10:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00
end

Rename the new file to the old file's name and have fun.