I am reading a file with a parser in Python, but it does not work

Asked: 2015-11-17 17:05:58

Tags: python python-2.7 parsing

This is part of my parser. The problem is that it does not read and print DeltaE and Intensity. Could someone help?

    if (line[1:24] == "Mulliken atomic charges" or
        line[1:22] == "Lowdin Atomic Charges"):
        if not hasattr(self, "atomcharges"):
            self.atomcharges = {}
        ones = inputfile.next()
        charges = []
        nline = inputfile.next()
        while not "Sum of" in nline:
            charges.append(float(nline.split()[2]))
            nline = inputfile.next()
        if "Mulliken" in line:
            self.atomcharges["mulliken"] = charges
        else:
            self.atomcharges["lowdin"] = charges

    if line[0:6] == 'DeltaE':
        while not "DeltaE" in nline:
            deltae = []
            intensity = []
            line=line.strip()
            #then we have a line like: DeltaE =    13.5423 | TDMI^2 = 0.6670E-01, Intensity =  6553.
            self.deltae = float(line.split('|')[0].strip().split('=')[1].strip())
            line = inputfile.next()
            self.intensity = float(line.split('|')[1].strip().split(',')[1].strip().split('=')[1].strip())
            line = inputfile.next()
            print deltae, ',', intensity

Part of the output.log file (the full output.log is very large, about 15 MB):

   Initial state: <0|
   Final state: |1^1>
   DeltaE =    13.5423 | TDMI^2 = 0.6670E-01, Intensity =  6553.    
   ........................................
   Initial state: <0|
   Final state: |2^1>
   DeltaE =    17.9918 | TDMI^2 = 0.2693    , Intensity = 0.2668E+05
   ........................................
   Initial state: <0|
   Final state: |3^1>
   DeltaE =    22.4523 | TDMI^2 = 0.4740E-01, Intensity =  4644.    
   ........................................

I want to print DeltaE and Intensity after calling the parse method, but nothing works. I can get the other values, but not DeltaE and Intensity:

 >>> mylogfile.parse()
[Gaussian BChla.out INFO] Creating attribute charge: 0
[Gaussian BChla.out INFO] Creating attribute mult: 1
[Gaussian BChla.out INFO] Creating attribute natom: 82
[Gaussian BChla.out INFO] Creating attribute atommasses[]
[Gaussian BChla.out INFO] Creating attribute atomnos[]
[Gaussian BChla.out INFO] Creating attribute vibsyms[]
[Gaussian BChla.out INFO] Creating attribute vibfreqs[]
[Gaussian BChla.out INFO] Creating attribute vibirs[]
[Gaussian BChla.out INFO] Creating attribute vibdisps[]
[Gaussian BChla.out INFO] Creating attribute temperature: 298.15
[Gaussian BChla.out INFO] Creating attribute enthaply: -2225.475525
[Gaussian BChla.out INFO] Creating attribute freeenergy: -2225.601048
[Gaussian BChla.out INFO] Creating attribute grads[]
[Gaussian BChla.out INFO] Creating attribute entropy: 0.000421006204931
[Gaussian BChla.out INFO] Creating attribute atomcoords[]
[Gaussian BChla.out INFO] Creating attribute coreelectrons[]
<cclib.parser.data.ccData object at 0x02FA1890>
>>>

1 Answer:

Answer 0 (score: 1)

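If you want to keep the same approach, I think all you need to do is remove the extra code that you mistakenly copied from the atomic charges parsing. This works: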

    if line[0:6] == 'DeltaE':
        line=line.strip()
        #then we have a line like: DeltaE =    13.5423 | TDMI^2 = 0.6670E-01, Intensity =  6553.
        self.deltae = float(line.split('|')[0].strip().split('=')[1].strip())
        self.intensity = float(line.split('|')[1].strip().split(',')[1].strip().split('=')[1].strip())
        print(self.deltae, self.intensity)
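
One caveat, offered as a hedged sketch rather than a confirmed diagnosis: in the output.log excerpt in the question, the DeltaE lines appear to start with spaces. If the real file is indented the same way, the slice line[0:6] contains leading whitespace and never equals 'DeltaE'. The helper below (parse_deltae_line is a hypothetical name, not part of cclib) strips the line before testing it and otherwise uses the same split-based logic. It assumes every DeltaE line contains a '|' and a ', Intensity =' part, as in the excerpt.

```python
def parse_deltae_line(line):
    """Return (deltae, intensity) for a 'DeltaE = ...' line, or None otherwise.

    Hypothetical helper; assumes the line layout shown in the log excerpt:
    DeltaE =    13.5423 | TDMI^2 = 0.6670E-01, Intensity =  6553.
    """
    stripped = line.strip()  # tolerate leading/trailing whitespace
    if not stripped.startswith("DeltaE"):
        return None
    # Left of '|' holds DeltaE, right of '|' holds TDMI^2 and Intensity.
    left, right = stripped.split("|", 1)
    deltae = float(left.split("=")[1])
    # The Intensity field is the second comma-separated item on the right.
    intensity = float(right.split(",")[1].split("=")[1])
    return deltae, intensity


result = parse_deltae_line(
    "   DeltaE =    13.5423 | TDMI^2 = 0.6670E-01, Intensity =  6553.    ")
print(result)
```

float() accepts both number styles that appear in the log, "6553." and "0.2668E+05", so no extra conversion is needed.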

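Here is an example of a different way to parse the line: specify a regular expression that matches the structure of the line. You need to add import re somewhere at the top of the file.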

        match = re.search(r"DeltaE =\s+(\S+).* Intensity =\s+(\S+)", line)
        if match is not None:
            self.deltae = float(match.group(1))
            self.intensity = float(match.group(2))
            print(self.deltae, self.intensity)

Here is a complete example for testing the code:

import re

class Parser:
    def parseline(self, line):
        match = re.search(r"DeltaE =\s+(\S+).* Intensity =\s+(\S+)", line)
        if match is not None:
            self.deltae = float(match.group(1))
            self.intensity = float(match.group(2))

p = Parser()
p.parseline("DeltaE =    17.9918 | TDMI^2 = 0.2693    , Intensity = 0.2668E+05")
print(p.deltae, p.intensity)

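Output (on Python 3; on Python 2.7, print with parentheses around two values prints the tuple (17.9918, 26680.0) instead):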

17.9918 26680.0
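
Note that self.deltae and self.intensity are overwritten on every matching line, so after parsing the whole file only the last transition remains. The question's code initialises deltae and intensity as lists, which suggests the goal may be to keep every value. A minimal sketch under that assumption follows; collect_transitions and DELTAE_RE are illustrative names, and the sample text is copied from the log excerpt in the question.

```python
import io
import re

# Same pattern as in the answer, compiled once for reuse.
DELTAE_RE = re.compile(r"DeltaE =\s+(\S+).* Intensity =\s+(\S+)")


def collect_transitions(lines):
    """Collect every DeltaE and Intensity value from an iterable of lines."""
    deltaes, intensities = [], []
    for line in lines:
        match = DELTAE_RE.search(line)  # search() ignores leading indentation
        if match is not None:
            deltaes.append(float(match.group(1)))
            intensities.append(float(match.group(2)))
    return deltaes, intensities


# Stand-in for the real file; any iterable of lines (such as an open file) works.
sample = io.StringIO(u"""\
   Initial state: <0|
   Final state: |1^1>
   DeltaE =    13.5423 | TDMI^2 = 0.6670E-01, Intensity =  6553.
   ........................................
   Initial state: <0|
   Final state: |2^1>
   DeltaE =    17.9918 | TDMI^2 = 0.2693    , Intensity = 0.2668E+05
   ........................................
   Initial state: <0|
   Final state: |3^1>
   DeltaE =    22.4523 | TDMI^2 = 0.4740E-01, Intensity =  4644.
   ........................................
""")

deltaes, intensities = collect_transitions(sample)
print(deltaes)
print(intensities)
```

Inside the parser, the same idea would amount to appending to list attributes (created once with hasattr, as the atomcharges code does) instead of assigning a single float.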