Extract a string between 2 strings in Python?

Time: 2016-08-03 02:07:35

Tags: python

I need to get the string between the ~ and ^ delimiters, using Python.

I want to do this because I am trying to extract text from an HTML page. Like this example:

:::ABC???,:::DEF???

2 Answers:

Answer 0 (score: 1)

You can use the isalpha() method in a generator expression, then use join() to combine the characters into a single string.

def extract_string(s):
    return ''.join(i for i in s if i.isalpha())

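Sample output:

print(extract_string(':::ABC???,:::DEF???'))
# ABCDEF

However, this simply keeps every alphabetic character in the string. If you only want to extract the characters between ~...^, use a regular expression: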

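import re

def extract_string(s):
    # Capture each run of ASCII letters enclosed by '~' and '^'.
    # [a-zA-Z] is used because [a-zA-z] would also match [ \ ] ^ _ and `.
    return re.findall(r"~([a-zA-Z]*)\^", s)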

Sample output:

s = '&nbsp;~ABC^,~DEF^'
print(extract_string(s))
# ['ABC', 'DEF']

Please note: if you are parsing HTML with regular expressions and/or string manipulation, then, as the famous S.O. reply suggests, please use an HTML parser library instead :D!
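As a rough illustration of that advice, here is a minimal sketch that first pulls the text out of the markup with the standard library's html.parser and only then applies the regex. The names TextCollector and extract_between are hypothetical helpers invented for this example, not part of any library, and the sample markup is assumed to resemble the page in question.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: gathers the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for text between tags; entities such as &nbsp; are
        # already decoded because convert_charrefs defaults to True.
        self.chunks.append(data)


def extract_between(html, start="~", end="^"):
    """Return every substring found between start and end in the page text."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks)
    # re.escape keeps delimiters such as '^' or '?' from acting as regex syntax.
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    return re.findall(pattern, text)


html = '<td class="cell-1"><div><span class="value-frame">&nbsp;~ABC^,~DEF^</span></div></td>'
print(extract_between(html))  # ['ABC', 'DEF']
```

Because the delimiters are parameters, the same helper also covers other markers, for example extract_between(page, ':::', '???').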

Answer 1 (score: 1)

It looks like you want ABC and DEF, so you need a non-greedy group, (.*?), like this:

import re
target = ' <td class="cell-1"><div><span class="value-frame">&nbsp;~ABC^,~DEF^</span></div></td>'
matchObj = re.findall(r'~(.*?)\^', target)
print(matchObj)
# ['ABC', 'DEF']
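
To show why the non-greedy (.*?) matters here, the following small sketch compares it with the greedy (.*) on a shortened version of the same input. The variable names are chosen only for this illustration.

```python
import re

s = '~ABC^,~DEF^'

# Greedy: .* runs to the last '^' in the string, swallowing the middle markers.
greedy = re.findall(r'~(.*)\^', s)

# Non-greedy: .*? stops at the first '^' after each '~'.
lazy = re.findall(r'~(.*?)\^', s)

print(greedy)  # ['ABC^,~DEF']
print(lazy)    # ['ABC', 'DEF']
```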

You can learn more about the re module in its documentation.