Python, modifying a text file, trouble with search terms

Asked: 2011-08-05 15:14:54

Tags: python text-parsing

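I posted a question here yesterday: Finding and adding to a .kml file using python

I have since read a bunch of tutorials and now understand Python better, which is good. However, I still can't get my script right, although I know I am very close. Basically, I want to add a bunch of jpgs to a .kml file, which is essentially the .xml format used by Google Earth. I want my program to find a Google Earth "placemarker" in the xml file named TO-XXX,

where XXX matches TO-XXX.jpg. I already have a folder with a bunch of .jpgs whose file names match the name of each placemark. I need the program to find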

<name> (for example <name>TO-101</name>) 

and add a line directly below the name line:

<description> <img src=TO-101.jpg></description>. 

So, I have written the code, but I can't get it to find the name line, which is always written as:

"\t\t\t<name>TO-XXX</name>\n".

So, here is the code:

import os

infile = 'TO-Hand-Holes2.kml' # the file I am reading
outfile = 'TO-Hand-Holes-Output.kml' # the file I plan to write to, using print for now
images = os.listdir("./images") # the images folder, all image names match names

source = open(infile, 'r')
target = open(outfile, 'w')

x = 0 #an incrementer
i = 0 # an incrementer

readxml = source.readline
while x < 20000:   #There are about 17,000 lines in the .kml file
    readxml = source.readline()
    while i < len(images):
        word = images[i]
        if readxml == "\t\t\t<name>%s</name>\n" % word[:6]: #!!!!!!!!! the problem is here
            print readxml #output the <name>
            print word[:6] #output the <description>
            hit = 'true'
            i = i + 1
        else:
            hit = 'false'
            #print "test%s" % word[:6]
            i = i + 1
    if hit == 'false':
        print ("%s") % readxml
    x = x + 1

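I can't seem to get it to recognize the lines. Any suggestions?

3 Answers:

Answer 0: (score: 1)

Because indentation is syntax in Python, you really need to watch where things are placed. This may be closer. It is not 100% complete, but it should point you in the right direction: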

with open(infile) as fhi, open(outfile, 'w') as fho:
  for line in fhi:
    if line == 'myMatchString':
      fho.write(line.replace('this', 'that'))

This uses the multi-file form of the with statement introduced in 2.7. Before 2.7, you have to nest a second with to open the second file.
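For interpreters older than 2.7, the nested form mentioned above might look like the following minimal sketch. The helper name `copy_with_replacement` and the sample file contents are made up for illustration; they are not from the question.

```python
import os
import tempfile


def copy_with_replacement(infile, outfile, old, new):
    """Copy infile to outfile, replacing `old` with `new` on every line."""
    # Nested form: one `with` per file, for versions without the
    # multi-file `with` syntax.
    with open(infile) as fhi:
        with open(outfile, "w") as fho:
            for line in fhi:
                fho.write(line.replace(old, new))


# Small demonstration on throwaway files (hypothetical contents).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "in.txt")
dst = os.path.join(workdir, "out.txt")
with open(src, "w") as fh:
    fh.write("this line\nother line\n")

copy_with_replacement(src, dst, "this", "that")

with open(dst) as fh:
    result = fh.read()
```

Both files are closed when the blocks exit, even if an exception is raised while copying.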

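Answer 1: (score: 1)

I made a few changes, but I don't have a file to test them with. Some of the changes are Python idioms, for your learning purposes. I think you should test whether the information you want is contained in the string rather than checking for equality. If you do want to check equality, use line.strip(), because the line will contain tabs, newlines, and so on that you may not have accounted for (and that you don't want to spell out explicitly, tbh).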

import os

infile = 'TO-Hand-Holes2.kml' # the file I am reading
outfile = 'TO-Hand-Holes-Output.kml' # the file I plan to write to, using print for now
images = os.listdir("./images") # the images folder, all image names match names

source = open(infile, 'r')
target = open(outfile, 'w')

for line in source:   # iterate over the source file line by line
    hit = False  # use the built-in Python boolean type
    for image in images:  # you can iterate over a list like this
        word = image[:6]  # slice once per image
        if "<name>" in line and word in line:
            hit = True
            break  # stop here so `image` stays bound to the matching file name
    target.write(line)  # write every source line exactly once
    if hit:
        target.write("<description> <img src={0}></description>\n".format(image))

source.close()
target.close()
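The difference between the three kinds of comparison discussed in this answer can be seen in a small sketch. The sample line is hypothetical: it assumes the file is indented with spaces rather than the three tabs the question's comparison expects.

```python
# Hypothetical line from the .kml file, indented with spaces instead of tabs.
raw_line = "      <name>TO-101</name>\n"
name = "TO-101"

# Exact equality, as in the question: fails because the whitespace differs.
exact_match = raw_line == "\t\t\t<name>%s</name>\n" % name

# Equality after strip(): leading and trailing whitespace no longer matters.
stripped_match = raw_line.strip() == "<name>%s</name>" % name

# Substring test, as in this answer: also indifferent to whitespace.
substring_match = "<name>" in raw_line and name in raw_line
```

One trade-off to keep in mind: the substring test would also accept a longer name that merely contains "TO-101", such as "TO-1010", whereas the stripped equality test would not.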

Answer 2: (score: 1)

This would be better:

import os

infile = 'TO-Hand-Holes2.kml' # the file I am reading
outfile = 'TO-Hand-Holes-Output.kml' # the file I plan to write to, using print for now
images = os.listdir("./images") # the images folder, all image names match names


with open(infile, 'r') as source:
    with open(outfile, 'w') as target:
        for readxml in source:
            for word in images:
                hit = readxml == "\t\t\t<name>%s</name>\n" % word[:6]
                if hit: #!!!!!!!!! the problem is here
                    print readxml #output the <name>
                    print >> target, word[:6] #output the <description>
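For comparison, here is one way the whole task could be sketched, under a few assumptions that are not confirmed by the thread: each `<name>` element sits on a single line, and every image file name is the placemark name plus an extension. The function `add_descriptions` and the pattern `NAME_RE` are illustrative names. The description line is emitted exactly as written in the question; whether Google Earth accepts that markup unescaped is not checked here.

```python
import os
import re

# Captures the text between <name> and </name> on one line (assumption:
# the element never spans several lines).
NAME_RE = re.compile(r"<name>(.*?)</name>")


def add_descriptions(lines, image_files):
    """Yield every input line, followed by a <description> line whenever
    the placemark name on that line has a matching image file."""
    # Map "TO-101" -> "TO-101.jpg" so each line needs a single lookup
    # instead of a loop over all images.
    by_name = dict((os.path.splitext(f)[0], f) for f in image_files)
    for line in lines:
        yield line
        match = NAME_RE.search(line)
        if match is None:
            continue
        image = by_name.get(match.group(1).strip())
        if image is not None:
            # Reuse the indentation of the <name> line.
            indent = line[:len(line) - len(line.lstrip())]
            yield "%s<description> <img src=%s></description>\n" % (indent, image)


# Demonstration on a hypothetical three-line fragment.
sample = [
    "\t\t<Placemark>\n",
    "\t\t\t<name>TO-101</name>\n",
    "\t\t</Placemark>\n",
]
output = list(add_descriptions(sample, ["TO-101.jpg", "TO-102.jpg"]))
```

To apply this to real files, the open source file can be passed as `lines` and the result of `os.listdir("./images")` as `image_files`, with each yielded line written to the output file.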