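So, I'm trying to build my own string.find() method/function in Python. I'm doing this for a computer science class I'm taking.
Basically, this program opens a text file, asks the user for the text they want to search for in the file, and outputs the line number where the string is located, or outputs 'not found' if the string does not exist in the file.
However, it takes about 34 seconds to get through 250,000 lines of XML.
Where is the bottleneck in my code? I also wrote this in C# and C++, and those versions run in about 0.3 seconds over 4.5 million lines. I also ran the same search using Python's built-in string.find(), and that takes about 4 seconds for 250,000 lines of XML. So I'm trying to understand why my version is so slow.
Thanks,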
https://github.com/zach323/Python/blob/master/XML_Finder.py
fhand = open('C:\\Users\\User\\filename')
import time
str = input('Enter string you would like to locate: ')  # string to be located in file
start = time.time()
delta_time = 0

def find(str):
    time.sleep(0.01)
    found_str = ''  # placeholder for the characters matched so far
    next_index = 0  # index of the next character of str to compare
    line_count = 0  # incremented before each line is scanned, so the first line is 1
    for line in fhand:  # each line in file
        line_count = line_count + 1
        for letter in line:  # each letter in line
            if letter == str[next_index]:  # compare current letter to the next expected character of str
                found_str += letter  # if a match, concatenate to string placeholder
                # print(found_str)  # print for visualization of inline search per iteration
                next_index = next_index + 1
                if found_str == str:  # if complete match is found, report it and return
                    print('Result is: ', found_str, ' on line %s ' % (line_count))
                    print(line)
                    return found_str  # return string to function caller
            else:
                # if a match was found but the next_index match was False, reset the indexes and try again.
                next_index = 0  # reset index back to zero
                found_str = ''  # reset string back to empty
        if found_str == str:
            print(line)

if str != "":
    result = find(str)
    delta_time = time.time() - start
    print(result)
    print('Seconds elapsed: ', delta_time)
else:
    print('sorry, empty string')
Answer 0 (score: 0)
Try this:
with open(filename) as f:
    for row in f:
        if string in row:
            print(row)
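A slightly fuller sketch of the same idea, assuming the goal stated in the question (report the line number, or 'not found'). The function name `find_line` and its parameters `path` and `needle` are hypothetical, chosen only for illustration:

```python
def find_line(path, needle):
    """Return the 1-based number of the first line containing needle, or None."""
    with open(path) as f:
        # enumerate(..., start=1) yields line numbers that start at 1
        for line_number, line in enumerate(f, start=1):
            if needle in line:  # substring test done by the built-in str type
                return line_number
    return None
```

A caller could then print the returned number, or 'not found' when the result is `None`.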
Answer 1 (score: 0)
The following code ran on a text file comparable in size to yours. Your code ran too slowly on my computer.
fhand = open('test3.txt')
import time
string = input('Enter string you would like to locate: ')  # string to be located in file
start = time.time()
delta_time = 0

def find(string):
    next_index_to_match = 0
    sl = len(string)
    ct = 0
    for line in fhand:  # each line in file
        ct += 1
        for letter in line:  # each letter in line
            if letter == string[next_index_to_match]:  # compare current letter to the next expected character of string
                # print(line)
                next_index_to_match += 1
                if sl == next_index_to_match:  # if complete match is found, report it and return
                    print('Result is: ', string, ' on line %s ' % (ct))
                    print(line)
                    return True
            else:
                # if a match was found but the next_index match was False, reset the indexes and try again.
                next_index_to_match = 0  # reset index back to zero
    return False

if string != "":
    find(string)
    delta_time = time.time() - start
    print('Seconds elapsed: ', delta_time)
else:
    print('sorry, empty string')
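One caveat with the character-by-character scan above: when a partial match fails, the index is reset to zero without re-checking the current character, so a search for 'ab' in 'aab' finds nothing, and a partial match can carry over from one line to the next. Below is a minimal sketch of a hand-written search that avoids both, assuming matches should not span lines. The names `find_in_line` and `find_in_lines` are hypothetical, and this is not a claim about how the built-in string.find() is implemented:

```python
def find_in_line(line, needle):
    """Return the index of the first occurrence of needle in line, or -1."""
    n = len(needle)
    # Try every possible starting position and compare a slice of the
    # line against needle, instead of tracking a partial match.
    for start in range(len(line) - n + 1):
        if line[start:start + n] == needle:
            return start
    return -1


def find_in_lines(lines, needle):
    """Return the 1-based number of the first line containing needle, or None."""
    for line_number, line in enumerate(lines, start=1):
        if find_in_line(line, needle) != -1:
            return line_number
    return None
```

An open file object can be passed as `lines`, since iterating over a file yields its lines one at a time. This sketch targets correctness only; per the timings in the question, the built-in search is still the faster option.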