Pythonic way to strip a string

Asked: 2014-10-08 00:33:42

Tags: python xml regex python-2.7

I'm trying to implement a Python equivalent of the following bash command:

VERSION=$( curl --silent "http://nexus:8080/nexus/service/local/lucene/search?g=com.xxx.yyy&a=zzz" | sed -n 's|<latestRelease>\(.*\)</latestRelease>|\1|p' | sed -e 's/^[ \t]*//' | tail -1 ) 

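I came up with the snippet below, which partially works and prints a bunch of <latestRelease>1.0.11</latestRelease> lines, as expected. However, I'm stuck: I want the script to output only the version, 1.0.11. The value 1.0.11 will also change with whatever the latest release in Nexus is, so it would be great if someone could suggest a dynamic, Pythonic way to do what the sed and tail parts of the bash command do.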

#!/usr/bin/env python

import os;
import subprocess;
import re
import string;

proc = subprocess.Popen(["curl", "--silent", "http://nexus:8080/nexus/service/local/lucene/search?g=com.xxx.yyy&a=zzz"], stdout=subprocess.PIPE)
out = proc.communicate()[0]
search_string = "<latestRelease>"
for line in out.splitlines():
    if search_string in line:
        re.sub(r'\s*latestRelease\s*', '', line)
        print line

Output:

  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>
  <latestRelease>1.0.11</latestRelease>

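Expected output: 1.0.11

1 Answer:

Answer 0 (score: 2)

Python has libraries that can help with this: xml.etree.ElementTree for parsing the XML and urllib2 for fetching the URL.

Example (it should work in your case, but I couldn't test it):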

import xml.etree.ElementTree as ET
from urllib2 import urlopen

url = 'http://nexus:8080/nexus/service/local/lucene/search?g=com.xxx.yyy&a=zzz'
tree = ET.parse(urlopen(url))
print tree.findtext('.//latestRelease')
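
Note that findtext returns the first match, while the bash pipeline keeps the last one (tail -1). The following is a minimal sketch of both routes, assuming the response has already been read into a string. The SAMPLE_XML payload and the function names last_release_xml and last_release_regex are hypothetical, made up for illustration; the real Nexus response may be shaped differently.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample payload; the real Nexus response may look different.
SAMPLE_XML = """<response>
  <data>
    <artifact>
      <latestRelease>1.0.10</latestRelease>
    </artifact>
    <artifact>
      <latestRelease>1.0.11</latestRelease>
    </artifact>
  </data>
</response>"""


def last_release_xml(xml_text):
    """Parse the XML and return the text of the last <latestRelease>, or None."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree in document order, at any depth.
    releases = [el.text.strip() for el in root.iter('latestRelease') if el.text]
    return releases[-1] if releases else None


def last_release_regex(text):
    """Rough regex counterpart of the sed | tail -1 pipeline, or None."""
    # The capture group keeps only the version; \s* trims surrounding whitespace.
    matches = re.findall(r'<latestRelease>\s*(.*?)\s*</latestRelease>', text)
    return matches[-1] if matches else None


if __name__ == '__main__':
    print(last_release_xml(SAMPLE_XML))
    print(last_release_regex(SAMPLE_XML))
```

The regex version also shows what the question's loop is missing: re.sub returns a new string rather than modifying line in place, so its result has to be captured and printed. On Python 3, urlopen lives in urllib.request instead of urllib2, and print is a function.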