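I am using the Google Vision API to scan images and extract text from them. Now I am stuck on extracting names and addresses from the scanned text. With a few regular expressions I can detect the street code and the ZIP code in the text, but not the whole address or the name.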
import re

def find_between_r(s, first, last):
    # Return the part of s between first and last, or "" if either is missing.
    try:
        start = s.index(first) + len(first)
        end = s.index(last, start)
        return s[start:end]
    except ValueError:
        return ""
text=""" 17000 AJHshkjadj dakd ext ESTES RICHMOND VA 23230 On Coll UNIFORM STRAIGHT BILL OF LADING - Original - Not Negotiable - Short Form (EXLA, 3901 WEST BROAD STREET Date 10/12/2017 OBOL No W093556Shippers No P.0. No 16846 very shipments, the letters 'COD appear befo For Payment Bill To Bill being paid by Shipper Consignee ENCANTADA RESORT Ken Smith 407-997-3731 Sp tio 3070 SECRET LAKE DR Ruwes Turiff EXLA 105. KISSIMMEE FL 34747 Shipper WINDWARD DESIGN NO ACCESSORIAL SERVICESADDED WITHOUT PRIOR APPROVAL FROM WINDWARD 941-359-0890 1130 COMMERCE BLVD N SARASOTA FL 34243 ird Is M trial E cy P le # O00 000 0000 NOTE: Liability Limitation for loss or damage on this shipment may be applicable. See 49 U.S.C. 14706 (c)(1XA) and (B No. Pkgs HMI Kind or Peak ange. Description of Articles, Special Marks and Exceptions NMFC Declared Valius TW. (Sub Com) | Chass/Rate Ohk22 CT PATIO FURNITURE 1400 200.0 22 1400 Quote# 4867496 APPOINTMENT CHARGE LIFTGATE DELIVERY CHARGE Rade doled val jeclared Excess WARNING Additional dam Mc LiabiRef IOS the d rges Advanced S Received $. to apply NOTE: Wh JOTE: Commodi requnng speci Subject to Sect 7 of Condit this shipmentrequired specific handling marked and to be delivered to the consignee without recourse wning the agreed or declared ith ordinary ignor, the ignor shall sign the property. The agreed or declared value See Sec NMFC 360. follohereby sp Sally stated by the The fibe booxes used thms shipment Innke del shipp 6pecif forth in the box Lake itbour payme freight and all other lawfuland all other repair Consolidated US NMFC | charge the shi PT is byBill of Lacing shall in the prepayment of the charges on the property described hereof. igned, destination RECEIVED. ly determined have been agroced upo icable otherwise the classif and rules (EstaExpros Linehave been castablished by shippes. stThe property described abovein apparent good ords, and codi of packages unknown) rked, ared destined otion said Juleotherwis Ily agreed. 
property 1y porton desertion and as to cocb party a serty, that every performed thereunder shall be sult all the 1d oCodhi Bill of Lacing the National Motor Freight Cassifit 100-X and also agreed liable or any consequental damages arising from the delaysery dates (Subject of any app! Gold M Service Ageroement) SHIPPER CERTIFICATION CARRIER CERTIFICATION Gignatueits agreement to all o orching to the applicable regulations Express Lines-EXLA ized Signature Date (Dae Iolel who ret TPMLD Colon coDfee & Shipper O Collect On Delivery C.OD, Amount Certified Check Freight Charges are PREPAID unless marked collect CHECK BOX to be paid by { Consignee Consignee Check Accepted IF COLLECT Mark Ig the PLTS STC PC and Loose Place Guaranteed Sticker Here Tsos ITIL TS Elections or AO eControl IDOT- Pro# 000000028388396 PAGE 624318 O489b24AL8"""
data=[]
# Keep everything from the first shipper/consignee keyword to the end of each line.
Ship_Cons = re.findall(r'\b(?=SHIP|Ship|SHIPPER|Shipper|ONSIGNEE|Onsignee|CONSIGNEE|Consignee|FROM|TO).*',text)
val=" ".join(map(str,Ship_Cons))
zip_code = re.findall(
    r"((?=AL|AK|AS|AZ|AR|CA|CO|CT|DE|DC|FM|FL|GA|GU|HI|ID|IL|IN|IA|KS|KY|LA|ME|MH|MD|MA|MI|MN|MS|MO|MT|NE|NV|NH|NJ|NM|NY|"
    r"NC|ND|MP|OH|OK|OR|PW|PA|PR|RI|SC|SD|TN|TX|UT|VT|VI|VA|WA|WV|WI|WY)[A-Z]{2}[, ])"
    r"(\d{5}(?:-\d{4})?|\d{4}(?:-\d{4})?|\d{3}(?:\s\d{2})|\d{3}(?:\s\d{1}\s\d{1})"
    r"|\d{2,5}(?:\s\d{2,5})(?:-\d{4})?)",val)
# print(zip_code)
for item in zip_code:
    data.append("".join(item))
address = re.findall(r"\s\d{4}\s|\w*[a-z]\s\w*[a-z]\s\d{4}\s|\s\d{5}\s",val)
print("Address",address)
print(find_between_r(val,address[0],data[0]))
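I am getting
SECRET LAKE DR Ruwes Turiff EXLA 105. KISSIMMEE
as the output of the code above. How can I avoid unwanted values such as "Turiff EXLA 105"? Also, this approach does not give me the name along with the address. Can anyone help me solve this? Thanks.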
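One possible direction, offered only as a sketch: instead of slicing everything between two matches with find_between_r, match the street line and the city/state/ZIP part separately, so the OCR noise between them is never captured. The names STREET_SUFFIXES, STREET_RE, CITY_STATE_ZIP_RE and extract_addresses are hypothetical and introduced here for illustration; the excerpt is cut from the OCR text above. The sketch assumes addresses are printed in uppercase and that the street ends in a known suffix, and it does not attempt to find names.

```python
import re

# Hypothetical, incomplete suffix list; extend it for your own documents.
STREET_SUFFIXES = r"(?:STREET|ST|DRIVE|DR|BLVD|AVENUE|AVE|ROAD|RD|LANE|LN)"

# House number, one or more uppercase words, a street suffix,
# and an optional one-letter direction (N, S, E, W).
STREET_RE = re.compile(
    r"\b(\d{1,5}(?:\s+[A-Z]+)+?\s+" + STREET_SUFFIXES + r"(?:\s+[NSEW])?)\b"
)

# City (one or two uppercase words), two-letter state, ZIP or ZIP+4.
# Simplification: any two uppercase letters are accepted as a state.
CITY_STATE_ZIP_RE = re.compile(
    r"\b([A-Z]+(?:\s[A-Z]+)?)\s+([A-Z]{2})\s+(\d{5}(?:-\d{4})?)\b"
)


def extract_addresses(text):
    """Pair each street match with the first city/state/ZIP found after it."""
    results = []
    for street in STREET_RE.finditer(text):
        tail = CITY_STATE_ZIP_RE.search(text, street.end())
        if tail:
            results.append((street.group(1),) + tail.groups())
    return results


excerpt = (
    "Consignee ENCANTADA RESORT Ken Smith 407-997-3731 Sp tio "
    "3070 SECRET LAKE DR Ruwes Turiff EXLA 105. KISSIMMEE FL 34747 "
    "Shipper WINDWARD DESIGN 941-359-0890 1130 COMMERCE BLVD N SARASOTA FL 34243"
)

for address in extract_addresses(excerpt):
    print(address)
# ('3070 SECRET LAKE DR', 'KISSIMMEE', 'FL', '34747')
# ('1130 COMMERCE BLVD N', 'SARASOTA', 'FL', '34243')
```

Because the city pattern allows up to two uppercase words, a stray uppercase OCR token directly in front of the city would be captured as part of it, so the result still needs checking against real scans.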