Identify similar string values in two columns

Time: 2016-08-12 18:43:49

Tags: sql sql-server tsql ssis

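For example, I have the following table with two columns, Address1 and refAddr.

Some sample data from the table is shown below.

(image: sample data from the table)

I want to compare the two columns for matches. Clearly, in this table 5235 JFK BLVD & 5235 John F Kennedy are a pair, and 424 N 2ND ST & 424 NORTH SECOND are a pair.

Is there any way in SQL or SSIS to get rid of the non-paired results and keep only the pairs?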

1 Answer:

Answer 0: (score: 3)

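One option is to geocode the addresses with the Google API and parse the JSON results to get more standardized values. This can be time-consuming, but you will have more confidence in your data.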
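A minimal sketch of that idea in Python, assuming a hypothetical `geocode` callable that turns a raw address into one standardized string (or `None` when nothing is found). The names `keep_pairs` and `fake_geocode`, and the canned lookup values, are made up for illustration and are not real API output; in practice `geocode` would wrap the actual web request.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical signature: raw address in, standardized address (or None) out.
Geocoder = Callable[[str], Optional[str]]


def keep_pairs(rows: Iterable[Tuple[str, str]], geocode: Geocoder) -> List[Tuple[str, str]]:
    """Keep only the rows whose two addresses standardize to the same value."""
    cache: Dict[str, Optional[str]] = {}

    def standardize(raw: str) -> Optional[str]:
        # Cache by trimmed, upper-cased text so a repeated address is looked up once.
        key = raw.strip().upper()
        if key not in cache:
            cache[key] = geocode(raw)
        return cache[key]

    paired = []
    for address1, ref_addr in rows:
        left = standardize(address1)
        right = standardize(ref_addr)
        # A row counts as a pair only when both sides resolved to the same value.
        if left is not None and left == right:
            paired.append((address1, ref_addr))
    return paired


# Illustrative stand-in for the real geocoder; these values are invented.
_CANNED = {
    "5235 JFK BLVD": "5235 JOHN F KENNEDY BLVD",
    "5235 JOHN F KENNEDY": "5235 JOHN F KENNEDY BLVD",
    "424 N 2ND ST": "424 NORTH 2ND ST",
    "424 NORTH SECOND": "424 NORTH 2ND ST",
}


def fake_geocode(raw: str) -> Optional[str]:
    return _CANNED.get(raw.strip().upper())


rows = [
    ("5235 JFK BLVD", "5235 John F Kennedy"),
    ("424 N 2ND ST", "424 NORTH SECOND"),
    ("5235 JFK BLVD", "424 NORTH SECOND"),
]
pairs = keep_pairs(rows, fake_geocode)
```

The surviving rows could then be written back to a table, or the standardized values stored in extra columns so the comparison itself happens in SQL.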

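The API allows (I believe) 2,500 hits per day, but you can purchase more.

For example, I took 5232 JFK Blvd and added the ZIP code 72116 to narrow the search. Without the ZIP code, it returned multiple addresses (NY, NJ, AR, etc.).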

https://maps.googleapis.com/maps/api/geocode/json?address=5232%20JFK%20Blvd&72116sensor=false
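A request URL like this can be assembled with the standard library rather than by hand, which avoids encoding mistakes. The sketch below is only an illustration: putting the ZIP code inside the `address` value is one assumed way to narrow the search, the `key` parameter is an assumption about how the service is authenticated, and `build_geocode_url` is a hypothetical helper name. Check the current API documentation before relying on any of it.

```python
from typing import Optional
from urllib.parse import quote, urlencode

GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(street: str,
                      postal_code: Optional[str] = None,
                      api_key: Optional[str] = None) -> str:
    """Build a geocoding request URL with a percent-encoded query string."""
    # Assumption: the ZIP code is simply appended to the free-text address.
    address = street if postal_code is None else f"{street} {postal_code}"
    params = {"address": address}
    if api_key is not None:
        # Assumption: the service expects the API key in a `key` parameter.
        params["key"] = api_key
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return f"{GEOCODE_ENDPOINT}?{urlencode(params, quote_via=quote)}"


url = build_geocode_url("5232 JFK Blvd", "72116")
```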

The key elements would probably be:

formatted_address: "5232 J.F.K. Blvd, North Little Rock, AR 72116, USA",
or
long_name: "John F. Kennedy Boulevard",

Returns

{
  results: [
    {
      address_components: [
        {
          long_name: "5232",
          short_name: "5232",
          types: [
            "street_number"
          ]
        },
        {
          long_name: "J.F.K. Boulevard",
          short_name: "J.F.K. Blvd",
          types: [
            "route"
          ]
        },
        {
          long_name: "North Little Rock",
          short_name: "North Little Rock",
          types: [
            "locality",
            "political"
          ]
        },
        {
          long_name: "Hill Township",
          short_name: "Hill Township",
          types: [
            "administrative_area_level_3",
            "political"
          ]
        },
        {
          long_name: "Pulaski County",
          short_name: "Pulaski County",
          types: [
            "administrative_area_level_2",
            "political"
          ]
        },
        {
          long_name: "Arkansas",
          short_name: "AR",
          types: [
            "administrative_area_level_1",
            "political"
          ]
        },
        {
          long_name: "United States",
          short_name: "US",
          types: [
            "country",
            "political"
          ]
        },
        {
          long_name: "72116",
          short_name: "72116",
          types: [
            "postal_code"
          ]
        }
      ],
      formatted_address: "5232 J.F.K. Blvd, North Little Rock, AR 72116, USA",
      geometry: {
        bounds: {
          northeast: {
            lat: 34.8032656,
            lng: -92.2538364
          },
          southwest: {
            lat: 34.8032599,
            lng: -92.2538538
          }
        },
        location: {
          lat: 34.8032599,
          lng: -92.2538364
        },
        location_type: "RANGE_INTERPOLATED",
        viewport: {
          northeast: {
            lat: 34.8046117302915,
            lng: -92.2524961197085
          },
          southwest: {
            lat: 34.8019137697085,
            lng: -92.2551940802915
          }
        }
      },
      place_id: "EjI1MjMyIEouRi5LLiBCbHZkLCBOb3J0aCBMaXR0bGUgUm9jaywgQVIgNzIxMTYsIFVTQQ",
      types: [
        "route",
        "street_address"
      ]
    },
    {
      address_components: [
        {
          long_name: "5232",
          short_name: "5232",
          types: [
            "street_number"
          ]
        },
        {
          long_name: "John F. Kennedy Boulevard",
          short_name: "John F. Kennedy Blvd",
          types: [
            "route"
          ]
        },
        {
          long_name: "West New York",
          short_name: "West New York",
          types: [
            "locality",
            "political"
          ]
        },
        {
          long_name: "Hudson County",
          short_name: "Hudson County",
          types: [
            "administrative_area_level_2",
            "political"
          ]
        },
        {
          long_name: "New Jersey",
          short_name: "NJ",
          types: [
            "administrative_area_level_1",
            "political"
          ]
        },
        {
          long_name: "United States",
          short_name: "US",
          types: [
            "country",
            "political"
          ]
        },
        {
          long_name: "07093",
          short_name: "07093",
          types: [
            "postal_code"
          ]
        }
      ],
      formatted_address: "5232 John F. Kennedy Blvd, West New York, NJ 07093, USA",
      geometry: {
        bounds: {
          northeast: {
            lat: 40.78574,
            lng: -74.0231416
          },
          southwest: {
            lat: 40.7857366,
            lng: -74.0231598
          }
        },
        location: {
          lat: 40.78574,
          lng: -74.0231416
        },
        location_type: "RANGE_INTERPOLATED",
        viewport: {
          northeast: {
            lat: 40.78708728029149,
            lng: -74.02180171970849
          },
          southwest: {
            lat: 40.7843893197085,
            lng: -74.0244996802915
          }
        }
      },
      place_id: "Ejc1MjMyIEpvaG4gRi4gS2VubmVkeSBCbHZkLCBXZXN0IE5ldyBZb3JrLCBOSiAwNzA5MywgVVNB",
      types: [
        "route",
        "street_address"
      ]
    }
  ],
  status: "OK"
}
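Pulling the key elements out of a response like this takes only a few lines once the body has been parsed (for example with `json.loads`). The sketch below is illustrative rather than a definitive implementation: `route_long_name` and `key_elements` are hypothetical helper names, and `sample` is a trimmed copy of the two results shown above.

```python
from typing import Any, Dict, List, Optional


def route_long_name(result: Dict[str, Any]) -> Optional[str]:
    """Return the long_name of the component typed "route", if there is one."""
    for component in result.get("address_components", []):
        if "route" in component.get("types", []):
            return component.get("long_name")
    return None


def key_elements(response: Dict[str, Any]) -> List[Dict[str, Optional[str]]]:
    """Collect formatted_address and route name for every result."""
    if response.get("status") != "OK":
        return []
    return [
        {
            "formatted_address": result.get("formatted_address"),
            "route": route_long_name(result),
        }
        for result in response.get("results", [])
    ]


# Trimmed copy of the response shown above (only the fields used here).
sample = {
    "results": [
        {
            "address_components": [
                {"long_name": "5232", "short_name": "5232", "types": ["street_number"]},
                {"long_name": "J.F.K. Boulevard", "short_name": "J.F.K. Blvd", "types": ["route"]},
            ],
            "formatted_address": "5232 J.F.K. Blvd, North Little Rock, AR 72116, USA",
        },
        {
            "address_components": [
                {"long_name": "5232", "short_name": "5232", "types": ["street_number"]},
                {"long_name": "John F. Kennedy Boulevard",
                 "short_name": "John F. Kennedy Blvd", "types": ["route"]},
            ],
            "formatted_address": "5232 John F. Kennedy Blvd, West New York, NJ 07093, USA",
        },
    ],
    "status": "OK",
}

elements = key_elements(sample)
```

Because a single query can return several candidates, as it does here, it is worth deciding up front whether to take the first result, filter by ZIP code, or treat an ambiguous lookup as unmatched.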