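I wrote some code that reads a text file in full and looks for text marked with specific string tags. I did this with the threading library: after reading, I collect all of the tagged text, append it to string arrays, write those arrays to a CSV file, and finally use the pandas library to convert the result for Excel.
Here is my code: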
import csv
import io
import threading
import time

import pandas as pd


class TimeThread(threading.Thread):
    def run(self):
        t = []
        with io.open('output.txt', "r", encoding="utf-8") as infile:
            for line in infile:
                if line.startswith('Warmateba: '):
                    # Take the line after the tag and drop its first 5 characters.
                    s = next(infile)
                    p = s[5:]
                    print(p)
                    t.append(p)
                    time.sleep(10)  # the delay I tried in order to stop the skipping
        # Append everything this thread collected as one CSV row.
        with io.open("userInfo.csv", "a+", newline="") as csv_file:
            writer = csv.writer(csv_file)
            writer.writerow(t)


class NameThread(threading.Thread):
    def run(self):
        n = []
        with io.open('output.txt', "r", encoding="utf-8") as infile:
            for line in infile:
                if line.startswith('Dro: '):
                    # Take the line after the tag and drop its first 6 characters.
                    s = next(infile)
                    p = s[6:]
                    print(p)
                    n.append(p)
                    time.sleep(10)
        with io.open("userInfo.csv", "a+", newline="") as csv_file:
            writer = csv.writer(csv_file)
            writer.writerow(n)


class GelThread(threading.Thread):
    def run(self):
        g = []
        with io.open('output.txt', "r", encoding="utf-8") as infile:
            for line in infile:
                if line.startswith('Name: '):
                    # Take the line after the tag and drop its first 5 characters.
                    s = next(infile)
                    p = s[5:]
                    print(p)
                    g.append(p)
                    time.sleep(10)
        with io.open("userInfo.csv", "a+", newline="") as csv_file:
            writer = csv.writer(csv_file)
            writer.writerow(g)


time_thread = TimeThread()
name_thread = NameThread()
gel_thread = GelThread()

time_thread.start()
name_thread.start()
gel_thread.start()

# Wait for all three threads before post-processing the CSV.
time_thread.join()
name_thread.join()
gel_thread.join()

# Transpose the CSV so that each collected list becomes a column.
pd.read_csv('userInfo.csv', header=None).T.to_csv('userInfo.csv', header=False, index=False)
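The problem I am facing is that when I run this code, some of the tagged text is skipped and never read. I thought I needed some kind of delay so that the file would be read item by item, so I tried using the time library to slow the reading down, but it did not help. I have run out of ideas and cannot find anything similar to my problem. Has anyone run into a similar problem and knows how to solve it?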
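To make the symptom concrete, here is a minimal sketch that reproduces one way lines can end up skipped. It assumes, and this is only a guess about my data that I have not confirmed, that two tag lines can appear directly after one another. The sample text and the helper `collect_after_tag` are made up for illustration and are not part of my program.

```python
import io


def collect_after_tag(lines, tag):
    """Mimic the shape of the loop in each thread: on a tag line, take the next line."""
    found = []
    it = iter(lines)
    for line in it:
        if line.startswith(tag):
            # next() advances the same iterator the for loop is using, so the
            # line consumed here is never itself checked against the tag.
            following = next(it, None)  # None instead of StopIteration at end of file
            if following is not None:
                found.append(following.strip())
    return found


# Hypothetical input: the first two tag lines are back to back.
sample = io.StringIO("Name: \nName: \nalice\nName: \nbob\n")
result = collect_after_tag(sample, "Name: ")
print(result)
```

With this made-up input, the second `Name: ` line is swallowed as the "value" of the first one, so `alice` is never collected. If that is what happens with my real file, a `time.sleep` delay would not change it, since the skipping would come from the order of the lines and not from timing.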
Thank you very much.