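I have a C# script task in an SSIS job that calls an API to do geocoding. The API is proprietary and works like this: it receives a request, takes the address string, and tries to match that string against a huge list of addresses (millions). If it can't find a match, it falls back to another service, such as Google, to get the geo data.
As you can imagine, this string matching takes up a lot of time per request. Sometimes it is as slow as one request per minute, and I have 4M addresses to process. Doing any development work on the API side is not an option. To give a better picture of the process, here is what I'm currently doing:
I pull the list of addresses (about 4M) from the database, put it into a DataTable, and set the variables: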
// Fill the C# DataTable with the query results
sdagetGeoData.Fill(dtGeoData);
// Check that the DataTable has rows
if (dtGeoData.Rows.Count > 0)
{
// For every row in the DataTable, set the variables
foreach (System.Data.DataRow row in dtGeoData.Rows)
{
localID = row[0].ToString();
address = row[1].ToString();
city = row[2].ToString();
state = row[3].ToString();
zip = row[4].ToString();
country = row[5].ToString();
// After the variables are set, run this method to POST, read the response, and insert the row
GetGLFromAddress();
}
}
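GetGLFromAddress() works as follows:
It takes the variables set above and builds the JSON, sends the JSON with a "POST" via httpWebRequest, waits for the response (the time-consuming part), and reads the response. It then sets new variables from the returned values, uses those variables to update/insert into the database, and the loop moves on to the next row of the original DataTable.
Understanding this flow is important, because I need to keep the localID variable for each request so that I can update the correct record in the database.
Here is GetGLFromAddress():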
private void GetGLFromAddress()
{
// Request JSON data with Payload
var httpWebRequest = (HttpWebRequest)WebRequest.Create("http:");
httpWebRequest.Headers.Add("Authorization", "");
httpWebRequest.ContentType = "application/json";
httpWebRequest.Method = "POST";
using (var streamWriter = new StreamWriter(httpWebRequest.GetRequestStream()))
{
// Take the variables set from the C# DataTable and format them for the JSON POST
var jS = new JavaScriptSerializer();
var newJson = jS.Serialize(new SeriesPost()
{
AddressLine1 = address,
City = city,
StateCode = state,
CountryCode = country,
PostalCode = zip,
CreateSiteIfNotFound = true
});
// Write the JSON to the debug output so you can see what is sent
System.Diagnostics.Debug.WriteLine(newJson);
streamWriter.Write(newJson);
streamWriter.Flush();
streamWriter.Close();
}
try
{
var httpResponse = (HttpWebResponse)httpWebRequest.GetResponse();
using (var streamReader = new StreamReader(httpResponse.GetResponseStream()))
{
var result = streamReader.ReadToEnd();
// Deserialize the returned JSON so the variables used for the insert string can be set
var p1 = new JavaScriptSerializer();
// After this line, obj holds the fully deserialized JSON. Note how obj[x].FieldName is referenced below. If you ever want to change the fields or bring more in,
// this is how you do it.
var obj = p1.Deserialize<List<RootObject>>(result);
// Make sure the returned values are not null before setting each variable. When a value is missing, the variable is set to the string "null".
if (string.IsNullOrWhiteSpace(obj[0].MasterSiteId))
{
retGLMID = "null";
}
else
{
retGLMID = obj[0].MasterSiteId.ToString();
}
if (string.IsNullOrWhiteSpace(obj[0].PrecisionName))
{
retAcc = "null";
}
else
{
retAcc = obj[0].PrecisionName.ToString();
}
if (string.IsNullOrWhiteSpace(obj[0].PrimaryAddress.AddressLine1Combined))
{
retAddress = "null";
}
else
{
retAddress = obj[0].PrimaryAddress.AddressLine1Combined.ToString();
}
if (string.IsNullOrWhiteSpace(obj[0].Latitude))
{
retLat = "null";
}
else
{
retLat = obj[0].Latitude.ToString();
}
if (string.IsNullOrWhiteSpace(obj[0].Longitude))
{
retLong = "null";
}
else
{
retLong = obj[0].Longitude.ToString();
}
retNewRecord = obj[0].IsNewRecord.ToString();
// Build the insert string from the variables set above
// string insertStr = retGLMID + ", '" + retAcc + "', '" + retAddress + "', '" + retLat + "', '" + retLong + "', '" + localID;
string insertStr = "insert into table " +
"(ID,GLM_ID,NEW_RECORD_IND,ACCURACY) " +
" VALUES " +
"('" + localID + "', '" + retGLMID + "', '" + retNewRecord + "', '" + retAcc + "')";
string connectionString = "Data Source=; Initial Catalog=; Trusted_Connection=Yes";
using (SqlConnection connection = new SqlConnection(connectionString))
{
SqlCommand cmd = new SqlCommand(insertStr);
cmd.CommandText = insertStr;
cmd.CommandType = CommandType.Text;
cmd.Connection = connection;
connection.Open();
cmd.ExecuteNonQuery();
connection.Close();
}
}
}
catch
{
string insertStr2 = "insert into table " +
"(ID,GLM_ID,NEW_RECORD_IND,ACCURACY) " +
" VALUES " +
"('" + localID + "', null, null, 'Not_Found')";
string connectionString2 = "Data Source=; Initial Catalog=; Trusted_Connection=Yes";
using (SqlConnection connection = new SqlConnection(connectionString2))
{
SqlCommand cmd = new SqlCommand(insertStr2);
cmd.CommandText = insertStr2;
cmd.CommandType = CommandType.Text;
cmd.Connection = connection;
connection.Open();
cmd.ExecuteNonQuery();
connection.Close();
}
}
}
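When I try to use Parallel.ForEach, I run into problems with the variables. I want to run multiple requests at once but keep a separate instance of the variables for each request, if that makes sense. I can't pass localID to the API and have it returned, otherwise that would be ideal.
Is this even possible?
How would I need to structure this call to achieve what I'm after?
Basically, I want to be able to send multiple calls at once to speed up the whole process.
Edit: added the code for GetGLFromAddress(). Yes, I'm a newbie, so please be kind :)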
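Answer 0 (score: 0)
Put all the data into an array so that you can issue multiple requests at once. It is best to use multiple tasks or async methods to call the API.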
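A minimal sketch of that idea, under some assumptions: the types AddressRow and GeoResult, the helper ParallelGeocoder.RunAsync, and the stand-in FakeGeocodeAsync are all hypothetical names made up for illustration, and the real HTTP POST and database insert from the question are left out. The point it tries to show is that each request gets its own local copy of the row (including localID) instead of sharing class-level fields, so concurrent requests cannot overwrite each other's values, while a SemaphoreSlim caps how many calls are in flight.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical input shape; the fields mirror the variables in the question.
public sealed class AddressRow
{
    public string LocalId;
    public string Address;
    public string City;
    public string State;
    public string Zip;
    public string Country;
}

// Hypothetical output shape; LocalId travels with the result.
public sealed class GeoResult
{
    public string LocalId;
    public string GlmId;
    public string Accuracy;
}

public static class ParallelGeocoder
{
    // Calls geocode for every row, with at most maxConcurrency calls in flight.
    public static async Task<List<GeoResult>> RunAsync(
        IEnumerable<AddressRow> rows,
        Func<AddressRow, Task<GeoResult>> geocode,
        int maxConcurrency)
    {
        var results = new ConcurrentBag<GeoResult>();
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            List<Task> tasks = rows.Select(async row =>
            {
                await gate.WaitAsync();
                try
                {
                    // row and result are locals of this invocation, not shared fields.
                    GeoResult result = await geocode(row);
                    result.LocalId = row.LocalId;
                    results.Add(result);
                }
                catch (Exception)
                {
                    // Mirrors the question's fallback of recording 'Not_Found'.
                    results.Add(new GeoResult { LocalId = row.LocalId, Accuracy = "Not_Found" });
                }
                finally
                {
                    gate.Release();
                }
            }).Cast<Task>().ToList();
            await Task.WhenAll(tasks);
        }
        return results.ToList();
    }
}

public static class Program
{
    // Stand-in for the real API call (an assumption); it only simulates latency.
    private static async Task<GeoResult> FakeGeocodeAsync(AddressRow row)
    {
        await Task.Delay(50);
        return new GeoResult { GlmId = "GLM-" + row.Zip, Accuracy = "Street" };
    }

    public static void Main()
    {
        AddressRow[] rows = Enumerable.Range(1, 20)
            .Select(i => new AddressRow { LocalId = i.ToString(), Zip = (10000 + i).ToString() })
            .ToArray();

        List<GeoResult> results =
            ParallelGeocoder.RunAsync(rows, FakeGeocodeAsync, 4).GetAwaiter().GetResult();

        foreach (GeoResult item in results.OrderBy(x => int.Parse(x.LocalId)))
        {
            Console.WriteLine(item.LocalId + " -> " + item.GlmId + " (" + item.Accuracy + ")");
        }
    }
}
```

In the real script task, the geocode delegate would wrap the POST and JSON deserialization from GetGLFromAddress(), taking the row as a parameter and returning the parsed values rather than writing to shared fields. With 4M rows it would probably be wiser to feed RunAsync in batches (say, a few thousand rows at a time) and insert each batch's results into the database before starting the next, since this sketch creates one task per row up front. The right value for maxConcurrency depends on how much load the API tolerates, which is not known here.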