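I am writing a Java program to fetch some lists from the LTA DataMall API. The problem is that each call to the LTA DataMall API returns at most 50 records. I want to fetch all the records (only 50 can be retrieved at a time) and write them to a CSV file, not just the first 50.
For this example I will use the BusStops API.
According to the user guide, API responses normally return at most 50 records per call.
Suppose I am calling the BusStops API, where the GET request looks like http://datamall2.mytransport.sg/ltaodataservice/BusStops. The response looks like this: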
{
"odata.metadata": "http://datamall2.mytransport.sg/ltaodataservice/$metadata#BusStops",
"value": [
{
"BusStopCode": "01012",
"RoadName": "Victoria St",
"Description": "Hotel Grand Pacific",
"Latitude": 1.29684825487647,
"Longitude": 103.85253591654006
},
{
"BusStopCode": "01013",
"RoadName": "Victoria St",
"Description": "St. Joseph's Ch",
"Latitude": 1.29770970610083,
"Longitude": 103.8532247463225
},
{
"BusStopCode": "01019",
"RoadName": "Victoria St",
"Description": "Bras Basah Cplx",
"Latitude": 1.29698951191332,
"Longitude": 103.85302201172507
},
{
"BusStopCode": "01029",
"RoadName": "Nth Bridge Rd",
"Description": "Cosmic Insurance Bldg",
"Latitude": 1.2966729849642,
"Longitude": 103.85441422464267
}
// and so on, up till ...
{
"BusStopCode": "02099",
"RoadName": "Raffles Blvd",
"Description": "Marina Ctr Ter",
"Latitude": 1.29101661693418,
"Longitude": 103.86255772172497
},
{
"BusStopCode": "02101",
"RoadName": "Raffles Ave",
"Description": "Bef Temasek Ave",
"Latitude": 1.28939197625331,
"Longitude": 103.8618029276249
}
]
}
Keep in mind that each request returns 50 records, so to get the next 50 I have to append $skip to the URL. To retrieve the 51st to 100th records, the URL would look like this: http://datamall2.mytransport.sg/ltaodataservice/BusStops?$skip=49.
FYI, there are 5296 BusStop records in total.
With that in mind, I am trying to write the above response to a CSV file. The following method was created for this purpose:
public static void writeBusStops() {
    String file = "./csv/bus_stops.csv";
    // mainQueryItems is to get the JsonObjects. They will also be written to the CSV file as headers.
    String[] mainQueryItems = new String[]{"BusStopCode", "RoadName",
        "Description", "Latitude", "Longitude"};
    Iterator<JsonElement> jIter;
    int count = 0;
    try (CSVWriter writer = new CSVWriter(new FileWriter(file))) {
        writer.writeNext(mainQueryItems); // write the header
        // ApiLTA.getBusStops(0) just gets the raw JSON from LTA's DataMall API from the 1st to 50th record.
        // To get the 51st to 100th record, change '0' to '49', i.e. ApiLTA.getBusStops(49).
        // "value" is the JsonArray I want to get, all others like "odata.metadata" are redundant.
        jIter = jsonIterator(ApiLTA.getBusStops(0),"value");
        while (jIter.hasNext()) { // each bus stop
            JsonObject jObject = jIter.next().getAsJsonObject();
            // Get the individual items
            String[] svcItems = new String[mainQueryItems.length];
            for (int i = 0; i < mainQueryItems.length; i++) {
                svcItems[i] = getStringItem(jObject, mainQueryItems[i]);
            }
            // Write the items to the CSV file
            writer.writeNext(svcItems);
            count++;
            if ((count + 1) % 50 == 0) {
                System.out.println("new " + count); // for debug
                jIter = jsonIterator(ApiLTA.getBusStops(count), "value");
            }
        }
        System.out.println("done"); // for debug
    } catch (IOException ex) {
        Logger.getLogger(ApiLTADeserialiser.class.getName()).log(Level.SEVERE, null, ex);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
EDIT: For those interested in the jsonIterator method...
private static Iterator<JsonElement> jsonIterator(String jsonToParse, String k)
        throws JsonSyntaxException, IllegalStateException {
    // Parse the raw JSON
    JsonParser parser = new JsonParser();
    JsonElement raw = parser.parse(jsonToParse); // The jsonToParse is simply, the URL of the GET request
    // Get the array we need directly and pass it to an iterator
    JsonArray jArray = raw.getAsJsonObject().getAsJsonArray(k);
    return jArray.iterator();
}
I am using gson-2.7 and opencsv-3.8 as external packages.
The expected output in the CSV file is:
"BusStopCode","RoadName","Description","Latitude","Longitude"
"01012","Victoria St","Hotel Grand Pacific","1.29684825487647","103.85253591654006"
"01013","Victoria St","St. Joseph's Ch","1.29770970610083","103.8532247463225"
"01019","Victoria St","Bras Basah Cplx","1.29698951191332","103.85302201172507"
"01029","Nth Bridge Rd","Cosmic Insurance Bldg","1.2966729849642","103.85441422464267"
// and so on
"99009","Changi Village Rd","Changi Village Ter","1.38969812175274","103.98762553601895"
"99011","Loyang Ave","Bef Sch Of Commando","1.38328110134472","103.97812168830427"
// and so on
However, the actual output is different: basically, the first 50 records repeat themselves.
Is there a better approach for making multiple API calls so that all the records are populated instead of the first 50 being repeated? And how can that approach accommodate a call that returns fewer than 50 records?
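Answer 0 (score: 0)
I do not have access to that API and did not look into how to get an API key, but I made a simple Spring-based mock that I hope behaves exactly the way your API does.
First, $skip=49 looks a bit odd. Having no experience with the API, I would think it should be $skip=50.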
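To make the offset arithmetic concrete, here is a minimal sketch. It assumes that $skip=N means "skip the first N records", which is how the option is usually understood; the class and method names are made up for illustration and are not part of the API.

```java
import java.util.ArrayList;
import java.util.List;

class SkipOffsets {
    // Returns the $skip values needed to page through `total` records, `pageSize` at a time.
    // Assumption: $skip=N skips the first N records, so page k starts at k * pageSize.
    static List<Integer> offsets(int total, int pageSize) {
        List<Integer> result = new ArrayList<>();
        for (int skip = 0; skip < total; skip += pageSize) {
            result.add(skip);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> all = offsets(5296, 50);
        System.out.println(all.size());              // number of requests: 106
        System.out.println(all.get(1));              // offset for records 51 to 100: 50
        System.out.println(all.get(all.size() - 1)); // offset of the last, partial page: 5250
    }
}
```

Under that assumption the last request ($skip=5250) would return only 46 records, which is the "fewer than 50" case asked about.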
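Second, you might want to redesign your reading and writing routines so that their responsibilities are separated. What if you redesigned the data-producing and data-consuming methods to make them more flexible?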
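Before getting to the more flexible design, the basic paging idea can be sketched as a plain loop. This is only an illustration under the assumption that the service returns a short (or empty) page once the data runs out; PagedLoop, fetchAll and fakePage are hypothetical names, and fakePage merely stands in for the real HTTP call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

class PagedLoop {
    // Calls fetchPage with skip = 0, pageSize, 2 * pageSize, ... and collects every record.
    // Stops as soon as a page comes back with fewer than pageSize records (including zero).
    static <T> List<T> fetchAll(int pageSize, IntFunction<List<T>> fetchPage) {
        List<T> all = new ArrayList<>();
        int skip = 0;
        while (true) {
            List<T> page = fetchPage.apply(skip);
            all.addAll(page);
            if (page.size() < pageSize) {
                break; // short or empty page: nothing left to fetch
            }
            skip += pageSize;
        }
        return all;
    }

    // Hypothetical stand-in for the remote call: pretends the service holds the numbers 0..total-1.
    static List<Integer> fakePage(int total, int skip, int pageSize) {
        List<Integer> page = new ArrayList<>();
        for (int i = skip; i < Math.min(skip + pageSize, total); i++) {
            page.add(i);
        }
        return page;
    }

    // Fetches everything from the fake service and reports how many records arrived.
    static int countAll(int total, int pageSize) {
        List<Integer> all = fetchAll(pageSize, skip -> fakePage(total, skip, pageSize));
        return all.size();
    }

    public static void main(String[] args) {
        System.out.println(countAll(5296, 50)); // last page has 46 records
        System.out.println(countAll(100, 50));  // exact multiple: one extra, empty page is fetched
    }
}
```

The loop keeps reading and collecting in one place, which is exactly the coupling the design below tries to avoid.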
produce()
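The following method creates a special iterable that produces an iterator over any kind of object and checks for the out-of-bounds state itself, encapsulating the iteration state. It accepts three arguments: the slice size (50 in your case), a function that accepts the "skip" state and must produce a new value for the upcoming iterator next() call, and a predicate that checks whether the producing iterator can be considered finished.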
private static <T> Iterable<T> produce(final int slice, final IntFunction<T> mapper, final Predicate<? super T> predicate) {
    return () -> new Iterator<T>() {
        private boolean hasMore = true;
        private int i;

        @Override
        public boolean hasNext() {
            return hasMore;
        }

        @Override
        public T next()
                throws NoSuchElementException {
            if ( !hasMore ) {
                throw new NoSuchElementException();
            }
            final T next = mapper.apply(i * slice);
            hasMore = predicate.test(next);
            i++;
            return next;
        }
    };
}
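To see how the iterator behaves, here is a small self-contained sketch that feeds produce() from an in-memory list instead of the network. The produce() body is copied from above so the example compiles on its own; ProduceDemo, fakePage and collectPages are hypothetical helpers added purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.IntFunction;
import java.util.function.Predicate;

class ProduceDemo {
    // Copy of produce() from above, repeated only to keep the example self-contained.
    static <T> Iterable<T> produce(final int slice, final IntFunction<T> mapper, final Predicate<? super T> predicate) {
        return () -> new Iterator<T>() {
            private boolean hasMore = true;
            private int i;

            @Override
            public boolean hasNext() {
                return hasMore;
            }

            @Override
            public T next() {
                if (!hasMore) {
                    throw new NoSuchElementException();
                }
                final T next = mapper.apply(i * slice);
                hasMore = predicate.test(next);
                i++;
                return next;
            }
        };
    }

    // Hypothetical stand-in for the remote API: at most `slice` items starting at `skip`.
    static List<Integer> fakePage(List<Integer> data, int skip, int slice) {
        if (skip >= data.size()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(data.subList(skip, Math.min(skip + slice, data.size())));
    }

    // Collects every page the iterable yields, including the final empty one.
    static List<List<Integer>> collectPages(List<Integer> data, int slice) {
        Iterable<List<Integer>> source = produce(slice, skip -> fakePage(data, skip, slice), page -> !page.isEmpty());
        List<List<Integer>> pages = new ArrayList<>();
        for (List<Integer> page : source) {
            pages.add(page);
        }
        return pages;
    }

    public static void main(String[] args) {
        // prints [[1, 2], [3, 4], [5], []]
        System.out.println(collectPages(Arrays.asList(1, 2, 3, 4, 5), 2));
    }
}
```

Note the trailing empty page: the iterator hands back the value that failed the predicate before it stops, so the consuming side has to filter it out.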
getValuesFrom() and hasNotEmptyValues()
These two methods are very simple and merely inspect the incoming JSON object.
private static JsonArray getValuesFrom(final JsonElement root) {
    final JsonObject rootObject = root.getAsJsonObject();
    // The mock server below names this array "values"; the real response shown in the question uses "value".
    return rootObject.get("values").getAsJsonArray();
}

private static boolean hasNotEmptyValues(final JsonElement root) {
    return getValuesFrom(root).size() != 0;
}
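consumeJsonValues()
This is a special, not very generic method that accepts whatever was produced and delegates its items to a consumer according to a few special rules:
- filter, because the produce() method returns an empty value as the "finished" marker.
- flatMap, to flatten the values stored in the response root object.
- map, to narrow each element down to a JsonObject.
- forEach, where each JsonObject is delegated to the given consumer passed as the second argument.
private static void consumeJsonValues(final Iterable<? extends JsonElement> jsonElements, final Consumer<? super JsonObject> consumer) {
    // stream(...) is StreamSupport.stream, used via a static import
    stream(jsonElements.spliterator(), false)
            .filter(LtaMockControllerClient::hasNotEmptyValues)
            .flatMap(root -> stream(getValuesFrom(root).spliterator(), false))
            .map(JsonElement::getAsJsonObject)
            .forEach(consumer);
}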
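The same filter / flatMap / forEach shape can be tried without Gson. The following is a rough sketch that uses plain lists of strings in place of JSON responses; FlattenDemo, consumeAll and flatten are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.StreamSupport;

class FlattenDemo {
    // Mirrors consumeJsonValues(): drop empty pages, flatten the rest, hand each item to the consumer.
    static void consumeAll(Iterable<List<String>> pages, Consumer<? super String> consumer) {
        StreamSupport.stream(pages.spliterator(), false)
                .filter(page -> !page.isEmpty())   // skip the empty "finished" marker
                .flatMap(page -> page.stream())    // one stream of items instead of a stream of pages
                .forEach(consumer);
    }

    // Convenience wrapper that collects the consumed items into a list.
    static List<String> flatten(List<List<String>> pages) {
        List<String> out = new ArrayList<>();
        consumeAll(pages, out::add);
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> pages = Arrays.asList(
                Arrays.asList("01012", "01013"),
                Arrays.asList("01019"),
                Collections.<String>emptyList());
        System.out.println(flatten(pages)); // prints [01012, 01013, 01019]
    }
}
```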
readBusStops()
private static JsonElement readBusStops(final Gson gson, final int skip) {
    try ( final InputStream inputStream = new URL("http://127.0.0.1:9000/ltaodataservice/BusStops?$skip=" + skip).openStream();
          final Reader reader = new InputStreamReader(inputStream) ) {
        return gson.fromJson(reader, JsonElement.class);
    } catch ( final IOException ex ) {
        throw new RuntimeException(ex);
    }
}
All of the above lives in LtaMockControllerClient. Consider the following main() method:
private static final Gson gson = new Gson();

public static void main(final String... args)
        throws IOException {
    // "out" is System.out, used via a static import
    try ( final CSVWriter writer = new CSVWriter(new PrintWriter(new OutputStreamWriter(out))) ) {
        writer.writeNext(new String[]{ "BusStopCode", "RoadName", "Description", "Latitude", "Longitude" });
        consumeJsonValues(
                produce(50, skip -> readBusStops(gson, skip), LtaMockControllerClient::hasNotEmptyValues),
                v -> writer.writeNext(new String[]{
                        v.getAsJsonPrimitive("BusStopCode").getAsString(),
                        v.getAsJsonPrimitive("RoadName").getAsString(),
                        v.getAsJsonPrimitive("Description").getAsString(),
                        v.getAsJsonPrimitive("Latitude").getAsString(),
                        v.getAsJsonPrimitive("Longitude").getAsString()
                })
        );
    }
}
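The code above is simple and literally means:
- Create a CSV writer that writes to the standard output (System.out).
- Read the bus stops in slices of 50 (which is what the produce() method accepts), and delegate the parsed results to a consumer that only writes the parsed properties to the resulting CSV file.
The code above refers to localhost:9000 because it is just a local mock server that behaves like the LTA service. It is simply a regular REST controller written with Spring MVC: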
@RestController
@RequestMapping("/ltaodataservice")
public class LtaMockController {

    private static final Gson gson = new Gson();

    private static final List<BusStopDto> busStops = generateBusStops(5296);

    @RequestMapping(method = GET, value = "/BusStops", produces = "application/json")
    public String get(
            @RequestParam(value = "$skip", defaultValue = "0") final int from
    ) {
        final int size = busStops.size();
        final List<BusStopDto> values = from <= size
                ? busStops.subList(from, min(from + 50, size))
                : emptyList();
        return gson.toJson(new ResponseDto("http://datamall2.mytransport.sg/ltaodataservice/$metadata#BusStops", values));
    }

    private static List<BusStopDto> generateBusStops(final int count) {
        final List<BusStopDto> busStops = new ArrayList<>();
        for ( int i = 1; i <= count; i++ ) {
            busStops.add(new BusStopDto("code:" + i, "road:" + i, "description:" + i, i, i));
        }
        return unmodifiableList(busStops);
    }

    private static final class ResponseDto {

        @SerializedName("odata.metadata")
        @SuppressWarnings("unused")
        private final String odataMetadata;

        @SerializedName("values")
        @SuppressWarnings("unused")
        private final List<BusStopDto> values;

        private ResponseDto(final String odataMetadata, final List<BusStopDto> values) {
            this.odataMetadata = odataMetadata;
            this.values = values;
        }

    }

    private static final class BusStopDto {

        @SerializedName("BusStopCode")
        @SuppressWarnings("unused")
        private final String busStopCode;

        @SerializedName("RoadName")
        @SuppressWarnings("unused")
        private final String roadName;

        @SerializedName("Description")
        @SuppressWarnings("unused")
        private final String description;

        @SerializedName("Latitude")
        @SuppressWarnings("unused")
        private final double latitude;

        @SerializedName("Longitude")
        @SuppressWarnings("unused")
        private final double longitude;

        private BusStopDto(final String busStopCode, final String roadName, final String description, final double latitude, final double longitude) {
            this.busStopCode = busStopCode;
            this.roadName = roadName;
            this.description = description;
            this.latitude = latitude;
            this.longitude = longitude;
        }

    }

}
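The slicing rule inside get() is the part that makes the last page shorter than 50, so it may help to look at it in isolation. This is a minimal sketch with plain integers instead of DTOs; SliceDemo, slice and range are hypothetical names, and the guard against a negative offset is an addition that the controller above does not have.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SliceDemo {
    // Returns up to pageSize elements starting at index `from`, or an empty list when out of range.
    static <T> List<T> slice(List<T> all, int from, int pageSize) {
        int size = all.size();
        if (from < 0 || from > size) { // the negative check is an assumption added for this sketch
            return Collections.emptyList();
        }
        return all.subList(from, Math.min(from + pageSize, size));
    }

    // Builds the list 0, 1, ..., n - 1 as stand-in data.
    static List<Integer> range(int n) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            result.add(i);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> data = range(120);
        System.out.println(slice(data, 0, 50).size());   // full page: 50
        System.out.println(slice(data, 100, 50).size()); // last, partial page: 20
        System.out.println(slice(data, 150, 50).size()); // past the end: 0
    }
}
```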
Together, the client and server components produce the following output (5296 fake bus stops, 5297 lines in total):
"BusStopCode","RoadName","Description","Latitude","Longitude"
"code:1","road:1","description:1","1.0","1.0"
"code:2","road:2","description:2","2.0","2.0"
"code:3","road:3","description:3","3.0","3.0"
"code:4","road:4","description:4","4.0","4.0"
"code:5","road:5","description:5","5.0","5.0"
"code:6","road:6","description:6","6.0","6.0"
"code:7","road:7","description:7","7.0","7.0"
...
"code:5291","road:5291","description:5291","5291.0","5291.0"
"code:5292","road:5292","description:5292","5292.0","5292.0"
"code:5293","road:5293","description:5293","5293.0","5293.0"
"code:5294","road:5294","description:5294","5294.0","5294.0"
"code:5295","road:5295","description:5295","5295.0","5295.0"
"code:5296","road:5296","description:5296","5296.0","5296.0"