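I wrote a function that reads a file from disk and, for each record, calls an external API to fetch some data. I want to optimize the code for large files (35,000 records). Could you suggest how to do that?
Here is my code.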
public void readCSVFile() {
    try {
        br = new BufferedReader(new FileReader(getFileName()));
        while ((line = br.readLine()) != null) {
            String[] splitLine = line.split(cvsSplitBy);
            String campaign = splitLine[0];
            String adGroup = splitLine[1];
            String url = splitLine[2];
            long searchCount = getSearchCount(url);
            StringBuilder sb = new StringBuilder();
            sb.append(campaign + ",");
            sb.append(adGroup + ",");
            sb.append(searchCount + ",");
            writeToFile(sb, getNewFileName());
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
private long getSearchCount(String url) {
    long recordCount = 0;
    try {
        DefaultHttpClient httpClient = new DefaultHttpClient();
        HttpGet getRequest = new HttpGet(
                "api.com/querysearch?q="
                + url);
        getRequest.addHeader("accept", "application/json");
        HttpResponse response = httpClient.execute(getRequest);
        if (response.getStatusLine().getStatusCode() != 200) {
            throw new RuntimeException("Failed : HTTP error code : "
                    + response.getStatusLine().getStatusCode());
        }
        BufferedReader br = new BufferedReader(new InputStreamReader(
                (response.getEntity().getContent())));
        String output;
        while ((output = br.readLine()) != null) {
            try {
                JSONObject json = (JSONObject) new JSONParser()
                        .parse(output);
                JSONObject result = (JSONObject) json.get("result");
                recordCount = (long) result.get("count");
                System.out.println(url + "=" + recordCount);
            } catch (Exception e) {
                System.out.println(e.getMessage());
            }
        }
        httpClient.getConnectionManager().shutdown();
    } catch (Exception e) {
        // was e.getStackTrace(), which silently discards the error
        e.printStackTrace();
    }
    return recordCount;
}
Answer 0 (score: 1)
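Because remote calls are much slower than local disk access, you need to either parallelize or batch the remote calls. If the remote API does not offer a batch call but does allow multiple concurrent reads, you may want to use something like a thread pool to make the remote calls: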
public void readCSVFile() {
    // exception handling ignored for space
    br = new BufferedReader(new FileReader(getFileName()));
    List<Future<String>> futures = new ArrayList<Future<String>>();
    ExecutorService pool = Executors.newFixedThreadPool(5);
    while ((line = br.readLine()) != null) {
        final String[] splitLine = line.split(cvsSplitBy);
        // submit one remote lookup per line; up to 5 run concurrently
        futures.add(pool.submit(new Callable<String>() {
            public String call() {
                long searchCount = getSearchCount(splitLine[2]);
                return new StringBuilder()
                        .append(splitLine[0] + ",")
                        .append(splitLine[1] + ",")
                        .append(searchCount + ",")
                        .toString();
            }
        }));
    }
    // futures are read in submission order, so output keeps the input order
    for (Future<String> fs : futures) {
        writeToFile(fs.get(), getNewFileName());
    }
    pool.shutdown();
}
Ideally, though, you would fetch everything from the remote API in a single batch read, if the API makes that possible.
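To illustrate the batching idea, here is a minimal sketch. It assumes the API has some bulk endpoint that accepts several URLs and returns a count for each; nothing in the question confirms that such an endpoint exists, so the `batchApi` parameter is a hypothetical stand-in for it, and `BatchLookupSketch`, `partition` and `lookupCounts` are names made up for this example. The point is only that 35,000 records with a batch size of, say, 500 would need 70 round trips instead of 35,000.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class BatchLookupSketch {

    // Splits the URLs into consecutive chunks of at most batchSize elements.
    static List<List<String>> partition(List<String> urls, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < urls.size(); i += batchSize) {
            int end = Math.min(i + batchSize, urls.size());
            batches.add(new ArrayList<>(urls.subList(i, end)));
        }
        return batches;
    }

    // Makes one call per batch instead of one call per URL.
    // batchApi is a HYPOTHETICAL bulk lookup (assumption, not part of the
    // question's API): it takes a list of URLs and returns url -> count.
    static Map<String, Long> lookupCounts(
            List<String> urls,
            int batchSize,
            Function<List<String>, Map<String, Long>> batchApi) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (List<String> batch : partition(urls, batchSize)) {
            counts.putAll(batchApi.apply(batch));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Fake bulk API for demonstration only: the "count" is the URL length.
        Function<List<String>, Map<String, Long>> fakeApi = batch -> {
            Map<String, Long> result = new LinkedHashMap<>();
            for (String url : batch) {
                result.put(url, (long) url.length());
            }
            return result;
        };
        List<String> urls = Arrays.asList("a.com", "bb.com", "ccc.com");
        // Three URLs with a batch size of 2 result in two calls to fakeApi.
        System.out.println(lookupCounts(urls, 2, fakeApi));
    }
}
```

With this shape you would read all URLs from the CSV first, call `lookupCounts` once, and then write the output lines by looking each URL up in the returned map. If the real API limits request size, the batch size would have to follow that limit.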