Finding a particular block in a text file

Date: 2012-06-12 12:52:38

Tags: python tkinter readline

So far I have the following code:

import sys
from Tkinter import *
import tkFileDialog
from tkFileDialog import askopenfile # Open dialog box


fen1 = Tk()                              # Create window
fen1.title("Optimisation")               # Window title

menu1 = Menu(fen1)

def open():
    filename = askopenfile(filetypes=[("Text files","*.txt")], mode='r')

    filename.seek(0)
    numligne = 0
    line     = []
    ok       = 0
    k        = -1

    while (ok == 0)  &  (k == -1):
        line = filename.readline()
        k    = line.find( "*load" )
        if k == 0 :
            l = filename.readlines()

fen1.mainloop()

The text file I am searching is formatted like this:

*test
1 2 3 4

*load
2 7 200
3 7 150

*stiffness
2 9 8 7

etc..

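So far I have managed to find the line that starts with '*load', but I want to assign the values between '*load' and '*stiffness' to variables such as a, b, c. My problem is that the load section may contain several lines, and I need to detect each line, split the values in it and give them names. If someone could explain a loop or something similar that would do the trick, I would be very grateful! Thanks!

Update: The problem I now have is that I want to find SEVERAL separate sections in the same text file. How can I create a loop that also finds the lines between '*geo' and '*house', and between '*name' and '*surname'? I tried to create a completely separate definition, but I would like to minimise the number of lines of code I use... Thanks! The code I have been using has a similar structure (as provided for my original question, thanks to mgilson!), so I would like to edit this kind of code.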

def parse_file(ff):
    out=[]
    append=False
    for line in ff:
        if(append and line.strip()):
            out.append(line)
        if(line.startswith('*load')):
            append=True
        elif(line.startswith('*stiffness')):
            return [map(int,x.split()) for x in out[:-1] ]

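5 answers:

Answer 0: (score: 1)

Let's assume your "blocks" of data are separated by headers (e.g. *header). The most intuitive way to store the data in each block is a list of lists, e.g. [ row1, row2, ...] (where row1=[elem1,elem2,elem3,...]). You can then store the blocks in a dictionary, so that a block can be accessed via block=dictionary['headername'].

This will do what you want (this version is untested).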

import sys

def convert_type(ss):
    try:
        return int(ss)
    except ValueError:
        try:
            return float(ss)
        except ValueError:
            return ss

def parse_file(ff):
    out={}
    block=None
    for i,line in enumerate(ff):
        #Allow for comments to start with '#'.  We break off anything after a '#'
        #and ignore it.  After that, we strip surrounding whitespace.
        data=line.split('#',1)
        line=data[0]  #comments (if in line) are in data[1] ... ignore those.
        line=line.strip() #remove whitespace from front and back of line.
        if(line.startswith('*')):
            #python supports multiple assignment.  
            #e.g. out['header'] is the same object as block.  
            #     changing block also changes out['header']
            block=out[line.strip()[1:]]=[]
        elif (block is not None) and line: #checks to make sure there is an active block and the line wasn't empty.
            #If the file could also have floats, you should use float instead of int
            #We also put the parsing in a try/except block.  If parsing fails (e.g. a
            #element can't be converted to a float, you'll know it and you'll know the
            #line which caused the problem.)
            try:
                #block.append(map(int,line.split()))
                block.append([convert_type(x) for x in line.split()])
            except Exception:
                sys.stderr.write("Parsing datafile choked on line %d '%s'\n"%(i+1,line.rstrip()))
                raise
    return out

with open('textfile.txt','r') as f:
    data_dict=parse_file(f)

#get information from '*load' block:
info=data_dict['load']
for row in info:
    a,b,c=row
    ##same as:
    #a=row[0]
    #b=row[1]
    #c=row[2]
    ##as long as row only has 3 elements.

    #Do something with that particular row. 
    #(each row in the 'load' block will be visited once in this loop)

#get info from stiffness block:
info=data_dict['stiffness']
for row in info:
    pass #Do something with this particular row.

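Note that if you are guaranteed that every line under a given header in the data file has the same number of entries, you can think of the variable info as a two-dimensional array indexed as element=info[row_number][column_number], but you can also get an entire row with info[row_number].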
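As a small illustration of that indexing, here is a sketch that uses hand-written values shaped like the '*load' block from the question. The data is hypothetical and is not read from a file.

```python
# Hypothetical data, shaped like the '*load' block in the question.
info = [[2, 7, 200],
        [3, 7, 150]]

element = info[1][2]   # row 1, column 2
row = info[0]          # the whole first row

# A column has to be collected row by row, because info is a list of lists.
column = [r[2] for r in info]
```

Rows come for free with this layout; columns need a comprehension like the one above.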

Answer 1: (score: 1)

Maybe something like this:

line = filename.readline()
if line.find("*load") == 0:
    line = filename.readline()
    while line != "\n" and line != "":
        vars = line.split(" ")
        line = filename.readline()  # advance, otherwise the loop never ends

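vars is just an example; after this code runs it holds the values ['2', '7', '200'] (so you will need to convert them to floats or ints). You can then append these to an array or rename them as needed.

Edit: a working program derived from the above.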
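The conversion step mentioned above could look roughly like this. The helper name to_number is made up for illustration, and the sketch assumes every field on the line is numeric.

```python
def to_number(text):
    """Return text as an int if possible, otherwise as a float."""
    try:
        return int(text)
    except ValueError:
        return float(text)

line = "2 7 200\n"   # one data line from the '*load' section
# split() with no argument also drops the trailing newline and repeated spaces.
a, b, c = [to_number(p) for p in line.split()]
```

Unpacking into a, b, c only works when the line has exactly three fields; otherwise keep the list as it is.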

filename = open("fff.txt", 'r')
values = {}

line = filename.readline()
while line:
    # Skip ahead to the next section header, stopping at end of file.
    while line and line.find("*") != 0:
        line = filename.readline()
    if not line:
        break

    sectionheader = line.strip()[1:]
    values[sectionheader] = []
    line = filename.readline()
    while line != "\n" and line != "":
        vals = [float(i) for i in line.split(" ")]
        values[sectionheader].append(vals)
        line = filename.readline()

print values

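Answer 2: (score: 0)

Although I can't help you with the syntax, it is probably best to use a self-calling (recursive) function.

Write a function that detects the line you need and stores its byte offset. Next, have that function call itself to find the next line (the one that ends the section), store that offset as well and compare it with the previously saved value.

Now you have enough data to determine which bytes need to be changed.

Self-calling functions are quite efficient when used properly; they can improve performance and are easy to reuse.

In PHP I built a stream writer, similar to the one in .NET, that works this way. So I know the theory works, but this appears to be Python.

Unfortunately, I don't know enough about that language. Good luck with your project!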
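For readers who want to see the byte-offset idea in Python, here is a minimal sketch. It uses a plain loop rather than the recursion proposed above, the function name section_offsets is invented for illustration, and it assumes the file is opened in binary mode so that tell() returns byte positions.

```python
import io

def section_offsets(f):
    """Map each '*header' name to the (start, end) byte offsets of its data lines."""
    offsets = {}
    name = None
    start = 0
    while True:
        pos = f.tell()                  # offset of the line about to be read
        line = f.readline()
        if not line or line.startswith(b"*"):
            if name is not None:
                # The section ends where the next header (or end of file) begins.
                offsets[name] = (start, pos)
            if not line:
                break
            name = line.strip()[1:].decode("ascii")
            start = f.tell()            # data starts right after the header line
    return offsets

# An in-memory stand-in for a file opened with mode 'rb'.
sample = io.BytesIO(b"*load\n2 7 200\n3 7 150\n\n*stiffness\n2 9 8 7\n")
offsets = section_offsets(sample)

start, end = offsets["load"]
sample.seek(start)
load_text = sample.read(end - start).decode("ascii")
```

With the offsets in hand, seek() and read() fetch a single section without rescanning the file.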

Answer 3: (score: 0)

Here is what I did with your code:

import sys
from Tkinter import *
import tkFileDialog
from tkFileDialog import askopenfile # Open dialog box


fen1 = Tk()                              # Create window
fen1.title("Optimisation")               # Window title

menu1 = Menu(fen1)

def do_open(interesting_parts=[]):
    thefile = askopenfile(filetypes=[("Text files","*.txt")], mode='r')

    data = {} # Create a dictionary to store all the data stored per part name
    part = None
    for line in thefile:
        if line.startswith("*"):
            # A * in the beginning signals a new "part"
            # And another one means the end.
            part = line[1:].strip() # Slice off the leading * and strip the newline to get the bare name.
            if part in interesting_parts:
                data[part] = [] # Create a list inside the dictionary, so that you can add data to it later.
                # Only do this if we are interested in storing anything for this part
        elif part in interesting_parts:
            # Add an a condition to check if the part name is in the list of "parts" you are interested in having.
            line_data = get_something_from_this(part, line) # just a function that returns something based on the line and the part
            if line_data is not None: # Ignore it if it's None (just an option, as there might be newlines that you want to ignore)
                data[part].append(line_data)

    # Here, we return the dictionary to act on it.
    return data

def get_something_from_this(part, line):
    try:
        ints = [int(p) for p in line.split()]
    except ValueError:
        print "Ignoring Line", repr(line), "in", part
        return None # We don't care about this line!
    else:
        print "in", part, ints
        return ints # Store this line's data

data = do_open(["test", "egg"]) # Pass as an argument a list of the interesting "parts"

print data # this is a dictionary

# How do you access it?
print data["test"] # to get all the lines' data in a list
print data["egg"][0] # to get the first line of the data

for part, datalines in data.items():
    print part, datalines # datalines is the list of all the data, part is the dictionary's key which is the part name
    # Remember data[part] = ... <- Here part is the key.

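fen1.mainloop()

  1. Don't name a variable filename when it holds a file rather than a file name.
  2. You can loop over the file line by line with a for loop.
  3. Use split to split a string.
  4. Use startswith to check whether a string begins with another string.
  5. Keep track of whether you are inside the '*load' section in a variable.
  6. Update: Don't use open as a function name; it is already one of Python's built-in functions. Also, to avoid parsing the *load and *stiffness lines themselves, I modified the code a bit: each line is now parsed inside the elif branch.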

    Update 2

    Updated the code to match the OP's needs. Tested with this file:

    *test
    1 2 3
    
    *load
    2 7 200
    3 7 150
    
    *stiffness
    2 9 8
    
    *egg
    1 2 3
    2 4 6
    
    *plant
    23 45 78
    

    Update 3: heavily commented :)
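The numbered tips above can be condensed into a short sketch that runs without Tkinter. The function name read_sections and the in-memory sample are hypothetical, and the sketch assumes every data line contains only integers.

```python
def read_sections(lines, wanted):
    """Collect the integer rows found under each '*name' header listed in wanted."""
    data = {}
    current = None                          # tip 5: remember which section we are in
    for line in lines:                      # tip 2: loop line by line
        line = line.strip()
        if line.startswith("*"):            # tip 4: detect a header
            current = line[1:]
            if current in wanted:
                data[current] = []
        elif line and current in wanted:    # skip blank lines and unwanted sections
            data[current].append([int(p) for p in line.split()])  # tip 3
    return data

sample = """*test
1 2 3

*load
2 7 200
3 7 150

*stiffness
2 9 8
""".splitlines()

result = read_sections(sample, ["load", "stiffness"])
```

Because the function accepts any iterable of lines, an open file object can be passed in place of the sample list.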

Answer 4: (score: -2)

Something like this should do:

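# fid is assumed to be an already opened file object.
data = []
check = False
for line in fid:
    if line.startswith("*stiffness"):
        break                      # end of the load section
    if check and line.strip():     # inside the section, skipping blank lines
        data.append([float(x) for x in line.split()])
    if line.startswith("*load"):
        check = True               # data starts on the next line

fid.close()
for i in data:
    a = i[0]
    b = i[1]
    c = i[2]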

Take this code as a rough suggestion... (I think the exceptions are fixed now; if not, I don't care either...)