I'm new to Python and would like to know how to do a string comparison.
Suppose I have a list of strings containing state names, such as
states = ["New York", "California", "Nebraska", "Idaho"]
I also have another string containing an address, such as
postal_addr = "1234 1st E St San Jose California 95112"
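How can I parse this address string and find a match against the items in the states list? In the example above, California would be the match. Then, once it is matched, how do I extract "California"
and store it as a separate string?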
Answer 0: (score: 1)
>>> states = ["New York", "California", "Nebraska", "Idaho"]
>>> postal_addr = "1234 1st E St San Jose California 95112"
>>> first_match = next(state for state in states if state in postal_addr)
>>> first_match
'California'
However, if you need to match on word boundaries, a regular expression is the better choice.
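A minimal sketch of the word-boundary approach, assuming a state name should only count when it appears as a whole word. The helper name `find_state` is hypothetical and not part of the answer above; `re.escape` is used so that a name containing regex metacharacters is treated literally.

```python
import re

states = ["New York", "California", "Nebraska", "Idaho"]

def find_state(address, names):
    """Return the leftmost state name that appears as a whole word, or None."""
    # \b anchors the alternation at word boundaries, so "Idahoba" does not match "Idaho".
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b")
    match = pattern.search(address)
    return match.group(0) if match else None

print(find_state("1234 1st E St San Jose California 95112", states))  # California
print(find_state("12 Idahoba Lane", states))                          # None
```

Unlike the `next(...)` generator expression, this returns `None` rather than raising `StopIteration` when nothing matches.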
Answer 1: (score: 1)
I would do:
matches = [ s for s in states if s in postal_addr ]
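Then, if you want to get the string from the postal address:
import re
if matches:
    # re.escape treats the state name literally, even if it contains regex metacharacters
    extracted = re.findall(re.escape(matches[0]), postal_addr)[0]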
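EDIT: ...but this does not work for city/state combinations where the city name contains a different state, e.g. postal_addr = '1 Arrowhead Dr, Kansas City, Missouri 64129'
and states = ["New York", "California", "Nebraska", "Idaho", "Missouri", "Kansas"]
and so on. In that case, take the match that starts furthest to the right:
import re
if matches:
    # Pair each match with the position of its first occurrence in the address
    extracted = [(re.search(re.escape(m), postal_addr).start(), m) for m in matches]
    # Sorting by position puts the rightmost match last; keep its name
    extracted = sorted(extracted)[-1][1]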
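The same idea can be sketched as a single function, assuming the state is the name that appears furthest to the right in the address. The helper name `last_state` is hypothetical, and `str.rfind` (last occurrence) is swapped in for `re.search(...).start()` (first occurrence), so this is a variation rather than the exact code above.

```python
states = ["New York", "California", "Nebraska", "Idaho", "Missouri", "Kansas"]
postal_addr = "1 Arrowhead Dr, Kansas City, Missouri 64129"

def last_state(address, names):
    """Return the name whose last occurrence starts furthest right, or None."""
    # str.rfind returns -1 when the name is absent, so those entries are dropped.
    positions = {name: address.rfind(name) for name in names}
    found = {name: pos for name, pos in positions.items() if pos != -1}
    return max(found, key=found.get) if found else None

print(last_state(postal_addr, states))  # Missouri
```

This still relies on the state coming after the city in the address, which is an assumption about the address format.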
Answer 2: (score: 0)
states = ["New York", "California", "Nebraska", "Idaho"]
postal_addr = "1234 1st E St San Jose California 95112"
result = None
for state in states:
    if state in postal_addr:
        result = state
print(result)
Unfortunately, this will also match words that merely contain a state name, such as Idahoba.
Answer 3: (score: 0)
Here is another alternative that uses a regular expression:
import re
states = ["New York", "California", "Nebraska", "Idaho"]
pattern = re.compile(r'.*(' + r'|'.join(states) + ').*')
postal_addr = "1234 1st E St San Jose California 95112"
match = pattern.match(postal_addr)
if match:
    state = match.group(1)
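One detail worth knowing about a pattern of this shape: the leading `.*` is greedy, so when several state names occur in the address the group captures the rightmost one. The sketch below illustrates this with the Kansas City address from another answer; the variable names are illustrative only, and `re.escape` is an addition that the pattern above does not use.

```python
import re

names = ["Missouri", "Kansas"]
addr = "1 Arrowhead Dr, Kansas City, Missouri 64129"
alternation = "|".join(re.escape(n) for n in names)

# Greedy prefix: consumes as much as possible, so the group lands on the last name.
greedy = re.match(r".*(" + alternation + r")", addr)
# Lazy prefix: consumes as little as possible, so the group lands on the first name.
lazy = re.match(r".*?(" + alternation + r")", addr)

print(greedy.group(1))  # Missouri
print(lazy.group(1))    # Kansas
```

For addresses where the state follows the city, the greedy form is the behavior you probably want.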
Answer 4: (score: 0)
You can try it like this,
Answer 5: (score: -1)
To find all matches in the string, you can do the following:
matches = [m for m in postal_addr.split() if m in states]
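Note that `split()` breaks the address on whitespace, so a multi-word name such as "New York" is never compared as a whole and will not be found. A sketch that collects every match, including multi-word names, assuming whole-word matching is what is wanted; the helper name `all_states` is hypothetical:

```python
import re

states = ["New York", "California", "Nebraska", "Idaho"]

def all_states(address, names):
    """Return every whole-word occurrence of a state name, in address order."""
    # Longer names go first so a longer name wins over a shorter one at the same position.
    ordered = sorted(names, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, ordered)) + r")\b")
    return pattern.findall(address)

addr = "5 Main St New York New York 10001"
print(all_states(addr, states))                      # ['New York', 'New York']
print([w for w in addr.split() if w in states])      # []
```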