What is wrong with this code written in Python?

Time: 2013-09-07 17:07:11

Tags: python string file

Given that infile contains:

aaaaaaa"pic01.jpg"bbbwrtwbbbsize 110KB
aawerwefrewqa"pic02.jpg"bbbertebbbsize 100KB
atyrtyruraa"pic03.jpg"bbbwtrwtbbbsize 190KB

how can I get this outfile:

pic01.jpg 110KB
pic02.jpg 100KB
pic03.jpg 190KB

My code is:

with open ('test.txt', 'r') as infile, open ('outfile.txt', 'w') as outfile:
    for line in infile:
        lines_set1 = line.split ('"')
        lines_set2 = line.split (' ')
        for item_set1 in lines_set1:
            for item_set2 in lines_set2:
                if item_set1.endswith ('.jpg'):
                    if item_set2.endswith ('KB'):
                            outfile.write (item_set1 + ' ' + item_set2 + '\n')  

But the code produces an empty file. What is wrong?

5 answers:

Answer 0 (score: 3)

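Your code has only one major problem: the if item_set2.endswith ('KB') check never succeeds, because the last item of each line still carries the trailing newline character. Replace it with the following (note the strip() call):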

if item_set2.strip().endswith('KB'):
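
The failure can be reproduced without any files. This is a minimal sketch that uses one sample line from the question as a string literal; the variable names are illustrative only:

```python
# A line read from a file keeps its trailing newline.
line = 'aaaaaaa"pic01.jpg"bbbwrtwbbbsize 110KB\n'

# Splitting on a single space leaves the newline attached to the last item.
last_item = line.split(' ')[-1]

# The newline is the final character, so the suffix test fails.
raw_check = last_item.endswith('KB')

# Stripping surrounding whitespace first lets the test succeed.
stripped_check = last_item.strip().endswith('KB')

print(repr(last_item), raw_check, stripped_check)
```

Because the check is False for every line, the original loop never reaches outfile.write, which is why the output file stays empty.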

Also, if you write item_set2 without stripping it, you do not need + '\n', because item_set2 already ends with a newline:

outfile.write (item_set1 + ' ' + item_set2)

FYI, you can use a regular expression with capturing groups to extract the data:

import re


with open('test.txt', 'r') as infile, open('outfile.txt', 'w') as outfile:
    for line in infile:
        match = re.search(r'"(.*)"\w+\s(\w+)', line)
        if match:  # skip lines that do not fit the pattern, e.g. blank lines
            outfile.write(' '.join(match.groups()) + "\n")

Contents of outfile.txt after running the code:

pic01.jpg 110KB
pic02.jpg 100KB
pic03.jpg 190KB

Answer 1 (score: 3)

A solution that does not need to import re. The two conditions can be combined into a single-line condition.

with open('test.txt', 'r') as infile, open('outfile.txt', 'w') as outfile:
    for line in infile:
        filename = line.strip().split('"')[1]
        size = line.rsplit(None, 1)[-1]
        if filename.endswith('.jpg') and size.endswith('KB'):
            outfile.write('%s %s\n' % (filename, size))
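
The same split-based idea can be wrapped in a small helper that also tolerates lines without a quoted name. This is only a sketch: the function name extract_pair and the choice to return None for unusable lines are assumptions, not part of the answer above.

```python
def extract_pair(line):
    """Return (filename, size) for one input line, or None if the line does not fit."""
    # Splitting on the double quote yields [prefix, filename, rest] for a well-formed line.
    parts = line.strip().split('"')
    if len(parts) < 3:
        return None  # no quoted file name on this line
    filename = parts[1]
    # rsplit(None, 1) splits once on whitespace from the right and ignores the
    # trailing newline, so the last element is the size token.
    size = line.rsplit(None, 1)[-1]
    if filename.endswith('.jpg') and size.endswith('KB'):
        return filename, size
    return None


print(extract_pair('aaaaaaa"pic01.jpg"bbbwrtwbbbsize 110KB\n'))
```

Returning None instead of raising lets the caller skip blank or malformed lines with a plain if test.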

Answer 2 (score: 2)

You should use a regular expression, which will simplify your code. Something like:

import re
with open ('test.txt', 'r') as infile, open ('outfile.txt', 'w') as outfile:
    for line in infile:
        # Raw string so that \. \s and \d reach the regex engine unchanged.
        obj = re.match(r'.+"(.+\.jpg)".+\s(\d+KB)', line)
        if obj:
            outfile.write(obj.group(1) + ' ' + obj.group(2) + '\n')

This script produces outfile.txt:

pic01.jpg 110KB
pic02.jpg 100KB
pic03.jpg 190KB

Answer 3 (score: 2)

First, split the line on whitespace and take the second item (index 1 in a 0-based list); this gives the size.

Next, split the first item on the double-quote character (") and take the second item. This gives the file name.

Check the online demo if you want to see how the line is split.

with open ('test.txt', 'r') as infile, open ('outfile.txt', 'w') as outfile:
    for line in infile:
        Parts = line.split()
        outfile.write (Parts[0].split('"')[1] + " " + Parts[1] + "\n")

Output:

pic01.jpg 110KB
pic02.jpg 100KB
pic03.jpg 190KB

Online demo:

http://ideone.com/EOcuXL
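
The two splitting steps described in this answer can also be traced locally on one sample line. This is a sketch with illustrative variable names, not the code behind the demo link:

```python
# One sample line from the question, including the newline a file read would keep.
line = 'aaaaaaa"pic01.jpg"bbbwrtwbbbsize 110KB\n'

# Step 1: split on whitespace; index 1 is the size, and the newline is discarded.
parts = line.split()
size = parts[1]

# Step 2: split the first item on the double quote; index 1 is the file name.
quoted = parts[0].split('"')
filename = quoted[1]

print(parts)
print(quoted)
print(filename, size)
```

Calling split() with no argument splits on any run of whitespace and drops the trailing newline, which is why this version needs no strip() call.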

Answer 4 (score: 0)

Using sed:

$ sed 's/.*"\(.*\)".*size \(.*\)/\1 \2/' foo.txt
pic01.jpg 110KB
pic02.jpg 100KB
pic03.jpg 190KB
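
For readers who want to stay in Python, the same substitution can be approximated with re.sub. This is a sketch, not an exact port of sed: the helper name rewrite is hypothetical, and the pattern uses Python's ( ) group syntax in place of sed's \( \).

```python
import re

# Same idea as the sed expression: capture the quoted name and the text after "size ".
PATTERN = re.compile(r'.*"(.*)".*size (.*)')


def rewrite(line):
    """Replace a matching line with '<name> <size>'; other lines pass through unchanged."""
    return PATTERN.sub(r'\1 \2', line)


print(rewrite('aaaaaaa"pic01.jpg"bbbwrtwbbbsize 110KB'))
```

As with the sed command, a line that does not match the pattern is returned as it is rather than dropped.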