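Searching a list of string tuples for matches, using the results of an XML parse

Date: 2012-06-03 01:38:16

Tags: python xml list tuples string-matching

After a number of operations, I end up with a list of the following form: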

[(u'DAY1 KWH', u'300.000000'), 
(u'DAY2 KWH', u'300.000000'), 
(u'DAY3 KWH', u'300.000000'), 
(u'DAY4 KWH', u'300.000000'), 
(u'DAY5 KWH', u'300.000000'), 
(u'DAY6 KWH', u'300.000000'), 
(u'DAY7 KWH', u'300.000000'), 
(u'DAY8 KWH', u'300.000000'), 
(u'DAY9 KWH', u'300.000000'), 
(u'DAY10 KWH', u'300.000000'), 
(u'DAY11 KWH', u'300.000000'), 
(u'DAY12 KWH', u'300.000000'), 
(u'DAY13 KWH', u'300.000000'), 
(u'DAY14 KWH', u'300.000000'),
...

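Not all elements contain the words "day" or "kwh". They may just as well be measurements of this month's natural gas usage, this week's water usage, and so on. What I want to do is extract all the "day" values into one list, all the "week" values into another list, and all the "month" values into a third list. The end goal is to be able to plot daily, weekly, and monthly utility usage. Note that these are test values.

None of the five different approaches I've commented out actually works, but each attempt came from reading a different Stack Overflow thread.

I know the code could certainly be more efficient/faster, so if you have any optimization suggestions, feel free to add them. Thank you very much for your help!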

import urllib
import urllib2
from xml.dom import minidom
import matplotlib.pyplot as plt

def main(): 

    path = "http://128.226.6.214/bacrest/bacnet_device_70200/"
    BACrest = 'urn:BACrestService'
    xlink = 'http://www.w3.org/1999/xlink'

    dom = minidom.parse(urllib.urlopen(path)) 

    values = [] 
    descriptions = []
    for node in dom.getElementsByTagNameNS(BACrest, 'ChildNode'):  

        href = node.getAttributeNS(xlink, 'href')

        descriptionDomain = href + '/Description'
        descriptionSubDom = minidom.parse(urllib.urlopen(descriptionDomain))
        descriptionElements = descriptionSubDom.getElementsByTagNameNS(BACrest, 'return')
        descriptions.append(descriptionElements) 

        valueDomain = href + '/Value'
        valueSubDom = minidom.parse(urllib.urlopen(valueDomain))
        valueElements = valueSubDom.getElementsByTagNameNS(BACrest, 'return')[0].firstChild.data
        values.append(valueElements)

    combination = zip(descriptions,values)

Attempted solutions:

    #print filter(lambda x: 'DAY1 ' in x, combination)

    #dayInfo = []
    #for sublist in combination:
        #if 'DAY1 ' in sublist:
            #dayInfo.append(sublist)
    #print dayInfo

    #dayInfo = [s for s in combination if 'DAY1 ' in s]
    #print dayInfo

    #dayInfo = [i for i,v in combination if i.startswith('DAY1') in i]
    #print dayInfo

    #dayInfo = []
    #if any('DAY1 ' in x for x in combination):
        #dayInfo.append(x)
    #print dayInfo

main()

1 Answer:

Answer 0 (score: 0)

If data is a list of the form [(u'DAY1 KWH', u'300.000000'), (u'DAY2 KWH', u'300.000000'), ...], as you describe, then the following code should work:

for desc, val in data:
    if 'DAY' in desc:
        pass  # do something with val
    elif 'WEEK' in desc:
        pass  # do something else with val
    # etc...
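
As a concrete illustration of the loop above, here is a minimal sketch that fills the three lists the question asks for. The sample data, the WEEK and MONTH labels, and the names day_values, week_values, and month_values are assumptions made for illustration; only the DAY... KWH labels appear in the question.

```python
# Sample data in the shape described in the question. The WEEK and MONTH
# entries are invented for illustration.
data = [
    (u'DAY1 KWH', u'300.000000'),
    (u'DAY2 KWH', u'310.500000'),
    (u'WEEK1 KWH', u'2100.000000'),
    (u'MONTH1 KWH', u'9000.000000'),
]

day_values, week_values, month_values = [], [], []

for desc, val in data:
    # desc is the label string; val is the reading stored as a string,
    # so convert it to float before it is used for plotting.
    if 'DAY' in desc:
        day_values.append(float(val))
    elif 'WEEK' in desc:
        week_values.append(float(val))
    elif 'MONTH' in desc:
        month_values.append(float(val))

print(day_values)
print(week_values)
print(month_values)
```

The substring test is applied to desc, not to the whole tuple. An expression such as 'DAY1 ' in (u'DAY1 KWH', u'300.000000') is False, because `in` on a tuple compares whole elements rather than searching inside each string.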

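Note that this is not particularly robust, since it will fail on differences in capitalization and the like. Unless you can be absolutely certain that the input will always be uppercase, it would probably be far better to use regular-expression matching, or to convert desc to uppercase on each iteration before doing the string search.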
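
A case-insensitive variant could be sketched with the standard re module as follows. The pattern, the helper name group_by_period, and the sample labels are hypothetical. The sketch also assumes that the period word (day, week, or month) appears at the start of each description, which the question does not guarantee.

```python
import re
from collections import defaultdict

# Hypothetical pattern: the period word at the start of the label,
# matched regardless of case.
PERIOD_RE = re.compile(r'^\s*(day|week|month)', re.IGNORECASE)


def group_by_period(pairs):
    """Group (description, value) pairs by day/week/month, ignoring case."""
    groups = defaultdict(list)
    for desc, val in pairs:
        match = PERIOD_RE.match(desc)
        if match:
            # Normalise the key so 'DAY', 'Day' and 'day' share one list.
            groups[match.group(1).lower()].append(float(val))
        # Labels that match no period word are skipped.
    return groups


# Invented sample labels with mixed capitalisation.
sample = [
    (u'DAY1 KWH', u'300.0'),
    (u'day2 kwh', u'250.0'),
    (u'Week1 Gas', u'1200.0'),
    (u'TOTAL KWH', u'5000.0'),
]
groups = group_by_period(sample)
print(dict(groups))
```

Anchoring the pattern at the start of the string avoids treating a label that merely mentions one of the words later on as a match. If the real descriptions put the period elsewhere, the pattern would need to be adjusted.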