Python: searching for a string in an XML file

Time: 2018-02-07 19:57:12

Tags: python xml linear

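So, I'm trying to build my own string.find() method/function in Python. I'm doing this for a computer science class I'm taking.

Basically, the program opens a text file, takes user input for the text to search for in the file, and outputs the line number where the string is found, or outputs 'not found' if the string does not exist in the file.

However, this takes about 34 seconds to get through 250,000 lines of XML.

Where is the bottleneck in my code? I also wrote this in C# and C++, and those versions run in about 0.3 seconds on 4.5 million lines. I also ran the same search using Python's built-in string.find(), and that takes about 4 seconds for 250,000 lines of XML. So I'm trying to understand why my version is so slow.

Thanks,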

https://github.com/zach323/Python/blob/master/XML_Finder.py

fhand = open('C:\\Users\\User\\filename')

import time

str = input('Enter string you would like to locate: ') #string to be located in file
start = time.time()
delta_time = 0


def find(str):
    time.sleep(0.01)
    found_str = '' #initialize placeholder for found string
    next_index = 0 #index for comparison checking
    line_count = 1

    for line in fhand: #each line in file
        line_count = line_count + 1
        for letter in line: #each letter in line
            if letter == str[next_index]: #compare current letter index to beginning index of string you want to find
                found_str += letter #if a match, concatenate to string placeholder
                #print(found_str) #print for visualization of inline search per iteration
                next_index = next_index + 1

                if found_str == str: #if complete match is found, break out of loop.
                    print('Result is: ', found_str, ' on line %s '%(line_count))
                    print (line)
                    return found_str #return string to function caller
                    break
            else:
                #if a match was found but the next_index match was False, reset the indexes and try again.
                next_index = 0 #reset index back to zero
                found_str = '' #reset string back to empty

        if found_str == str:
            print(line)


if str != "":
    result = find(str)
    delta_time = time.time() - start

    print(result)
    print('Seconds elapsed: ', delta_time)

else:
    print('sorry, empty string')

2 answers:

Answer 0 (score: 0)

Try this:

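# `filename` is the path to your file and `string` is the text to search for
with open(filename) as f:
    for row in f:
        if string in row:  # built-in substring test
            print(row)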
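
The question also asks for the line number of the match, or a "not found" message. A minimal sketch of the same `in`-based approach that reports a line number is given here; the function name `find_first_line` and the sample data are hypothetical and used only for illustration:

```python
import io


def find_first_line(lines, needle):
    """Return the 1-based number of the first line containing needle, or None."""
    # enumerate(..., start=1) counts lines the way editors usually do
    for line_number, row in enumerate(lines, start=1):
        if needle in row:  # built-in substring test
            return line_number
    return None


# io.StringIO stands in for an open file here; with a real file you would pass
# the file object from `with open(path) as f:` instead.
sample = io.StringIO("<root>\n<item>apple</item>\n<item>pear</item>\n</root>\n")
hit = find_first_line(sample, "pear")
print(hit if hit is not None else "not found")
```

Because a file object is iterated lazily, this reads one line at a time and stops at the first match.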

Answer 1 (score: 0)

The following code runs on a text file comparable in size to yours. Your code ran too slowly on my machine.

fhand = open('test3.txt')

import time
string = input('Enter string you would like to locate: ') #string to be located in file
start = time.time()
delta_time = 0


def find(string):
    next_index_to_match = 0 
    sl = len(string)
    ct = 0

    for line in fhand: #each line in file
        ct += 1
        for letter in line: #each letter in line
            if letter == string[next_index_to_match]: #compare current letter index to beginning index of string you want to find
                # print(line)
                next_index_to_match += 1

                if sl == next_index_to_match: #if complete match is found, break out of loop.
                    print('Result is: ', string, ' on line %s '%(ct))
                    print (line)
                    return True

            else:
                #if a match was found but the next_index match was False, reset the index and try again.
                next_index_to_match = 0 #reset index back to zero
    return False

if string != "":   
    find(string)
    delta_time = time.time() - start
    print('Seconds elapsed: ', delta_time)  
else:
    print('sorry, empty string')
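
One caveat about the character-by-character approach: when a mismatch occurs, the index is reset to zero, but the current character is not re-tested against the first character of the search string, and a partial match can carry over from one line to the next. As a result, a search for "ab" in the line "aab" reports no match. Below is a hedged sketch of a plain naive search that tries every starting position instead; the names `naive_find` and `find_line_naive` are hypothetical, and this is not a replacement for the built-in `str.find()`, which will normally be much faster:

```python
def naive_find(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    # Try each position where the pattern could still fit
    for start in range(n - m + 1):
        offset = 0
        # Advance while the characters keep matching
        while offset < m and text[start + offset] == pattern[offset]:
            offset += 1
        if offset == m:  # every character of the pattern matched
            return start
    return -1


def find_line_naive(lines, pattern):
    """Return the 1-based number of the first line containing pattern, or None."""
    # Each line is searched independently, so no match state leaks across lines
    for line_number, line in enumerate(lines, start=1):
        if naive_find(line, pattern) != -1:
            return line_number
    return None


print(naive_find("aab", "ab"))
print(find_line_naive(["<a>\n", "<b>aab</b>\n"], "ab"))
```

The first call finds the match at index 1, the case the reset-to-zero loop misses. The inner loop still runs in the interpreter for every character, so this version is meant for the class exercise rather than for speed.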