I have a list like this:
x=['hello@thepowerhouse.group', 'ThePowerHouse\xa0 is a part of the House of ElektroCouture', 'Our Studio is located at Bikini Berlin Terrace Level, 2nd floor Budapester Str. 46 10787 Berlin', '\xa0', 'Office:\xa0+49 30 20837551', '\xa0', '\xa0']
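I want to extract this element: 'Our Studio is located at Bikini Berlin Terrace Level, 2nd floor Budapester Str. 46 10787 Berlin'
Because I am doing this for multiple websites, I would like to pick out the element with a regular expression so that the same approach works for the other sites as well. I think I can identify the element by checking whether it contains lowercase and uppercase letters, digits, a comma, and sometimes a period. This is what I tried, but it did not work.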
import re
for element in x:
    if re.findall("([A-Za-z0-9,])",element)==True:
        print("match")
Answer 0 (score: 1)
Instead of building one monster expression, you can split the rule into several simple regular expressions and test them in turn.
import re
def is_location(text):
    """Returns True if text contains digits, uppercase and lowercase characters."""
    patterns = r'[0-9]', r'[a-z]', r'[A-Z]'
    return all(re.search(pattern, text) for pattern in patterns)
x = [
'hello@thepowerhouse.group',
'ThePowerHouse\xa0 is a part of the House of ElektroCouture',
'Our Studio is located at Bikini Berlin Terrace Level, 2nd floor Budapester Str. 46 10787 Berlin',
'\xa0', 'Office:\xa0+49 30 20837551', '\xa0', '\xa0'
]
print(next(filter(is_location, x)))
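Two details may be worth spelling out. First, the attempt in the question can never print "match", because re.findall returns a list and a list never compares equal to True. Second, the three patterns above also accept the 'Office: ...' phone line (it has digits plus both letter cases), so next() only returns the address because it happens to come first. The sketch below is one possible variant, not the only way to do it: it adds the comma mentioned in the question as a fourth required pattern and collects every match. The names REQUIRED and looks_like_address are made up for this illustration, and the comma rule is an assumption that may not hold for addresses on other sites.

```python
import re

x = [
    'hello@thepowerhouse.group',
    'ThePowerHouse\xa0 is a part of the House of ElektroCouture',
    'Our Studio is located at Bikini Berlin Terrace Level, 2nd floor Budapester Str. 46 10787 Berlin',
    '\xa0', 'Office:\xa0+49 30 20837551', '\xa0', '\xa0'
]

# re.findall returns a list of matches, so comparing it to True is always False.
print(re.findall(r"[A-Za-z0-9,]", "a1,") == True)  # False

# Hypothetical stricter rule (an assumption): digit, lowercase, uppercase AND a comma.
REQUIRED = (r'[0-9]', r'[a-z]', r'[A-Z]', r',')

def looks_like_address(text):
    """Return True only if every required pattern occurs somewhere in text."""
    return all(re.search(pattern, text) for pattern in REQUIRED)

# Collect every matching element instead of only the first one.
matches = [element for element in x if looks_like_address(element)]
print(matches)
```

With this sample list, only the 'Our Studio ...' element satisfies all four patterns. If some sites write addresses without commas, the fourth pattern would need to be dropped or replaced by a different signal.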