I'm trying to write a regular expression that extracts the state from an address.
1- 1234 Bellaire Blvd, Suite 123, Houston, TX 77036
2- 1234 BELLAIRE BL #123, HOUSTON, TX 77036
I have this for the state:
\w{2}(?=\s\d{1,5})
And this for the ZIP:
(?<=\w{2}\s)\d{5}
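FOR STATE
In the first case, the regex returns "te" from "Suite" in addition to TX, which is the correct state.
In the second case, however, it returns nothing.
FOR ZIP
The first case returns 77036; the second case returns null.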
Answer 0: (score: 1)
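I don't think a regular expression is the best approach here. Instead, I would use an API that parses the address into its components. You would then read the state_abbreviation field and you're set. Sample response: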
[
  {
    "input_index": 0,
    "candidate_index": 0,
    "delivery_line_1": "1 Santa Claus Ln",
    "last_line": "North Pole AK 99705-9901",
    "delivery_point_barcode": "997059901010",
    "components": {
      "primary_number": "1",
      "street_name": "Santa Claus",
      "street_suffix": "Ln",
      "city_name": "North Pole",
      "state_abbreviation": "AK",
      "zipcode": "99705",
      "plus4_code": "9901",
      "delivery_point": "01",
      "delivery_point_check_digit": "0"
    },
    "metadata": {
      "record_type": "S",
      "zip_type": "Standard",
      "county_fips": "02090",
      "county_name": "Fairbanks North Star",
      "carrier_route": "C004",
      "congressional_district": "AL",
      "rdi": "Commercial",
      "elot_sequence": "0001",
      "elot_sort": "A",
      "latitude": 64.75233,
      "longitude": -147.35297,
      "precision": "Zip8",
      "time_zone": "Alaska",
      "utc_offset": -9,
      "dst": true
    },
    "analysis": {
      "dpv_match_code": "Y",
      "dpv_footnotes": "AABB",
      "dpv_cmra": "N",
      "dpv_vacant": "N",
      "active": "Y",
      "footnotes": "L#"
    }
  },
  {
    "input_index": 1,
    "candidate_index": 0,
    "addressee": "Apple Inc",
    "delivery_line_1": "1 Infinite Loop",
    // truncated for brevity
  }
]
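Once you have a response like this, pulling out the state is just a dictionary lookup. Here is a rough Python sketch, assuming the response body arrives as a JSON string shaped like the sample; `response_text` is a trimmed stand-in for a real response and `extract_state_zip` is a hypothetical helper name, neither is part of any API:

```python
import json

# Trimmed stand-in for the sample response (assumption: a real body
# would come back from the address API call as a JSON string).
response_text = """
[
  {
    "input_index": 0,
    "components": {
      "city_name": "North Pole",
      "state_abbreviation": "AK",
      "zipcode": "99705"
    }
  }
]
"""

def extract_state_zip(text):
    """Return a (state, zip) pair for every candidate that has components."""
    candidates = json.loads(text)
    return [
        (c["components"].get("state_abbreviation"), c["components"].get("zipcode"))
        for c in candidates
        if "components" in c
    ]

print(extract_state_zip(response_text))
```

Using `.get` means a candidate with a missing field yields None for that field instead of raising a KeyError.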
Hope that helps.
Answer 1: (score: 0)
You can match on ', ([A-Z]{2}) '. The state will be the sub-pattern captured by the parentheses. In Python it looks like this.
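import re

s1 = "1- 1234 Bellaire Blvd, Suite 123, Houston, TX 77036"
s2 = "2- 1234 BELLAIRE BL #123, HOUSTON, TX 77036"

for s in (s1, s2):
    m = re.search(r", ([A-Z]{2}) ", s)  # group 1 captures the two-letter state
    if m:  # re.search returns None when nothing matches
        print(m.group(1))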
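If you also want the ZIP, one option is to anchor the pattern at the end of the string so that nothing earlier in the address (like the "te" in "Suite 123") can match. This is only a sketch, and it assumes every address ends in a two-letter uppercase state followed by a 5-digit ZIP with an optional +4 suffix; `STATE_ZIP` and `state_and_zip` are names made up for the example:

```python
import re

# Assumption: the address ends with "<STATE> <ZIP>" or "<STATE> <ZIP>-<plus4>".
# The trailing $ keeps earlier text such as "Suite 123" from matching.
STATE_ZIP = re.compile(r"\b([A-Z]{2})\s+(\d{5})(?:-\d{4})?\s*$")

def state_and_zip(address):
    """Return (state, zip) taken from the end of the address, or None."""
    m = STATE_ZIP.search(address)
    return (m.group(1), m.group(2)) if m else None

addresses = [
    "1- 1234 Bellaire Blvd, Suite 123, Houston, TX 77036",
    "2- 1234 BELLAIRE BL #123, HOUSTON, TX 77036",
]
for a in addresses:
    print(state_and_zip(a))
```

Because the pattern uses `\b` rather than a literal ", ", it does not depend on whether there is a space after the last comma.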