Question

When I run the following program, I don't get the expected output.

import os
import re

f = open('outputFile','w')


#flag set to 1 when we are processing the required diffs
diff_flag=0   #Initialized to 0 in beginning

#with open('Diff_output.txt') as fp:
with open('testFile') as fp:
    for line in fp:
        if re.match('diff --git',line):
                #fileExtension = os.path.splitext(line)[1]
                words=line.split(".")   
                diff_flag=0
#               print fileExtension
                str=".rtf"      

                print words[-1]

                if words[-1] != "rtf":
                        print "Not a text file.."       
                        diff_flag = 1
                        f.write(line)
                        print "writing -> " + line      

        elif diff_flag == 1:
                f.write(line)
        else:
                continue

I get the following output:

python read.py 
rtf

Not a text file..
writing -> diff --git a/archived-output/NEW/action-core[best].rtf b/archived-output/NEW/action-core[best].rtf

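This is a text file, so the if condition should evaluate to false. When I print words[-1] or fileExtension, I get the correct extension, so I can't understand why the check fails. Is there something wrong with the contents of these two variables, given that the condition evaluates to true (not equal)? I am trying to read a file line by line and extract the file name's extension from each line.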

Answer 1

When you iterate over a file the way you do, each line still contains its trailing newline character "\n". What you should do is:

words = line.strip().split(".")

or

if words[-1].strip() != "rtf":

But if I were you, what I would do is:

if line.strip().endswith(".rtf"):

instead of splitting the line.

By the way, the proof that the newline is there is in your own output:

rtf
 <-- empty line here.
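
To make the effect of the newline concrete, here is a minimal sketch. The sample line and its file name (notes.rtf) are made up for illustration and are not taken from the question's input file.

```python
# A sample line as the file iterator would yield it, trailing newline included.
# The file name "notes.rtf" is a hypothetical example.
line = "diff --git a/notes.rtf b/notes.rtf\n"

raw_ext = line.split(".")[-1]            # last piece still carries the "\n"
clean_ext = line.strip().split(".")[-1]  # newline removed before splitting

print(repr(raw_ext))       # repr() makes the hidden newline visible
print(raw_ext != "rtf")    # True: "rtf\n" is not equal to "rtf"
print(clean_ext != "rtf")  # False: the comparison now behaves as intended

# The endswith() variant avoids splitting altogether.
print(line.strip().endswith(".rtf"))  # True
```

Printing with repr() is a handy habit when two strings look identical on screen but compare as unequal.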

Answer 2

Two points:

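1. re.match() only tries to match the pattern at the beginning of the line. If you want to find a match anywhere in the string, use re.search() instead. (See also search() vs. match().)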

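2. words=line.split(".") does not give you a clean list of words, because the pieces still include any leading or trailing whitespace from the line, such as the \n at the end. You need to strip() the line first. Your line should be: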

words=line.strip().split(".")
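
Both points can be combined in a small sketch of the filtering loop. The function name keep_non_rtf_diffs and the sample lines are hypothetical, introduced only for illustration. It is meant to mirror the question's logic (keep every diff whose header does not end in .rtf), not to be a definitive implementation.

```python
import re


def keep_non_rtf_diffs(lines):
    """Return the lines of every diff whose header does not end in .rtf.

    Hypothetical helper: the name and structure are assumptions made for
    this example, not code from the question.
    """
    kept = []
    keeping = False  # plays the role of diff_flag in the question
    for line in lines:
        # re.match() anchors at the start of the line, which is where a
        # "diff --git" header is expected to begin.
        if re.match("diff --git", line):
            # Strip the trailing newline before checking the extension.
            keeping = not line.strip().endswith(".rtf")
        if keeping:
            kept.append(line)
    return kept


# Made-up input standing in for the lines of the diff file.
sample = [
    "diff --git a/notes.rtf b/notes.rtf\n",
    "+change inside the rtf file\n",
    "diff --git a/tool.py b/tool.py\n",
    "+change inside the python file\n",
]
result = keep_non_rtf_diffs(sample)
print("".join(result))

# match() versus search() on a header that does not start the string:
indented = "    diff --git a/tool.py b/tool.py\n"
matched = re.match("diff --git", indented) is not None    # False
searched = re.search("diff --git", indented) is not None  # True
```

With a real file, the same function could be fed the open file object directly, since iterating over it yields lines with their newlines intact.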

Unable to extract the file extension from a line of a file
