How do I get all records from multiple API calls?

Time: 2016-12-27 03:56:04

Tags: java json gson opencsv

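I am writing a Java program to fetch some lists from the LTA DataMall API. The problem is that each call to the LTA DataMall API returns at most 50 records. I want to fetch all the records (which can only be done 50 at a time) and write them to a CSV file, rather than only the first 50.

For this example, I will use the BusStops API.

A typical API call from LTA DataMall

According to the user guide, an API response typically returns at most 50 records per call.

Suppose I am calling the BusStops API, where the GET request looks like http://datamall2.mytransport.sg/ltaodataservice/BusStops. The response would look like this: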

{
    "odata.metadata": "http://datamall2.mytransport.sg/ltaodataservice/$metadata#BusStops",
    "value": [
        {
            "BusStopCode": "01012",
            "RoadName": "Victoria St",
            "Description": "Hotel Grand Pacific",
            "Latitude": 1.29684825487647,
            "Longitude": 103.85253591654006
        },
        {
            "BusStopCode": "01013",
            "RoadName": "Victoria St",
            "Description": "St. Joseph's Ch",
            "Latitude": 1.29770970610083,
            "Longitude": 103.8532247463225
        },
        {
            "BusStopCode": "01019",
            "RoadName": "Victoria St",
            "Description": "Bras Basah Cplx",
            "Latitude": 1.29698951191332,
            "Longitude": 103.85302201172507
        },
        {
            "BusStopCode": "01029",
            "RoadName": "Nth Bridge Rd",
            "Description": "Cosmic Insurance Bldg",
            "Latitude": 1.2966729849642,
            "Longitude": 103.85441422464267
        }
        // and so on, up till ...
        {
            "BusStopCode": "02099",
            "RoadName": "Raffles Blvd",
            "Description": "Marina Ctr Ter",
            "Latitude": 1.29101661693418,
            "Longitude": 103.86255772172497
        },
        {
            "BusStopCode": "02101",
            "RoadName": "Raffles Ave",
            "Description": "Bef Temasek Ave",
            "Latitude": 1.28939197625331,
            "Longitude": 103.8618029276249
        }
    ]
}

Remember that each request returns 50 records, so to get the next 50 I have to append $skip to the URL. To retrieve the 51st to 100th records, the URL would look like this: http://datamall2.mytransport.sg/ltaodataservice/BusStops?$skip=49.

FYI, there are 5296 BusStop records in total.

The problem

With this in mind, I am trying to write the above response to a CSV file. The following method was created for that purpose:

public static void writeBusStops() {
    String file = "./csv/bus_stops.csv";
    // mainQueryItems is to get the JsonObjects. They will also be written to the CSV file as headers.
    String[] mainQueryItems = new String[]{"BusStopCode", "RoadName", "Description", "Latitude", "Longitude"};
    Iterator<JsonElement> jIter;
    int count = 0;
    try (CSVWriter writer = new CSVWriter(new FileWriter(file))) {
        writer.writeNext(mainQueryItems); // write the header
        // ApiLTA.getBusStops(0) just gets the raw JSON from LTA's DataMall API from the 1st to 50th record.
        // To get the 51st to 100th record, change '0' to '49', i.e. ApiLTA.getBusStops(49).
        // "value" is the JsonArray I want to get, all others like "odata.metadata" are redundant.
        jIter = jsonIterator(ApiLTA.getBusStops(0),"value");
        while (jIter.hasNext()) { // each bus stop
            JsonObject jObject = jIter.next().getAsJsonObject();
            // Get the individual items
            String[] svcItems = new String[mainQueryItems.length];
            for (int i = 0; i < mainQueryItems.length; i++) {
                svcItems[i] = getStringItem(jObject, mainQueryItems[i]);
            }
            // Write the items to the CSV file
            writer.writeNext(svcItems);
            count++;
            if ((count + 1) % 50 == 0) {
                System.out.println("new " + count); // for debug
                jIter = jsonIterator(ApiLTA.getBusStops(count), "value");
            }
        }
        System.out.println("done"); // for debug
    } catch (IOException ex) {
        Logger.getLogger(ApiLTADeserialiser.class.getName()).log(Level.SEVERE, null, ex);
    } catch (Exception e) {
        e.printStackTrace();
    }
}

EDIT: For those interested in the jsonIterator method...

private static Iterator<JsonElement> jsonIterator(String jsonToParse, String k) throws JsonSyntaxException, IllegalStateException {
    // Parse the raw JSON
    JsonParser parser = new JsonParser();
    JsonElement raw = parser.parse(jsonToParse); // The jsonToParse is simply, the URL of the GET request
    // Get the array we need directly and pass it to an iterator
    JsonArray jArray = raw.getAsJsonObject().getAsJsonArray(k);
    return jArray.iterator();
}

I am using gson-2.7 and opencsv.3.8 as external packages.

The expected output in the CSV file is:

"BusStopCode","RoadName","Description","Latitude","Longitude"
"01012","Victoria St","Hotel Grand Pacific","1.29684825487647","103.85253591654006"
"01013","Victoria St","St. Joseph's Ch","1.29770970610083","103.8532247463225"
"01019","Victoria St","Bras Basah Cplx","1.29698951191332","103.85302201172507"
"01029","Nth Bridge Rd","Cosmic Insurance Bldg","1.2966729849642","103.85441422464267"
// and so on
"99009","Changi Village Rd","Changi Village Ter","1.38969812175274","103.98762553601895"
"99011","Loyang Ave","Bef Sch Of Commando","1.38328110134472","103.97812168830427"
// and so on

However, the output turns out differently: basically, it just repeats the first 50 records.

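What would be a better way to make multiple API calls so that all records are fetched instead of the first 50 being repeated? And how can I adapt that approach to handle a call that returns fewer than 50 records?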

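1 Answer:

Answer 0: (score: 0)

I don't have access to that API and didn't look into how to get an API key, but I built a simple Spring-based mock that I hope behaves the way the real API does.

First, $skip=49 looks a bit odd. I have no experience with this API, but I think it should be $skip=50.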
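To make the $skip arithmetic concrete, here is a small sketch. It assumes that $skip has the usual OData meaning (skip the first N records) and that a page holds 50 records, as stated in the question. The names SkipOffsets and offsetsFor are made up for illustration and are not part of any API.

```java
import java.util.ArrayList;
import java.util.List;

final class SkipOffsets {

    // Assumption taken from the question: one call returns at most 50 records.
    static final int PAGE_SIZE = 50;

    // Returns the $skip value of every call needed to cover `total` records.
    static List<Integer> offsetsFor(final int total) {
        final List<Integer> offsets = new ArrayList<>();
        for (int skip = 0; skip < total; skip += PAGE_SIZE) {
            offsets.add(skip);
        }
        return offsets;
    }

    public static void main(final String... args) {
        final List<Integer> offsets = offsetsFor(5296);
        System.out.println(offsets.size());                  // prints 106 (number of calls)
        System.out.println(offsets.get(1));                  // prints 50 (records 51 to 100)
        System.out.println(offsets.get(offsets.size() - 1)); // prints 5250 (last, partial page)
    }
}
```

Under these assumptions the offsets are multiples of 50, and the last call (offset 5250) returns only 46 records.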

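Second, you might want to redesign your reading and writing routines so that their responsibilities are separated. What if you redesigned the data producing and consuming methods to make them more flexible?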

produce()

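The following method creates a special iterable that produces an iterator over objects of any type and tracks the out-of-bounds condition itself, encapsulating the iteration state. It accepts three parameters: the slice size (50 in your case), a function that accepts the "skip" state and must produce a new value for the iterator's upcoming next() call, and a predicate that tests the value just produced and returns false once the iterator should be considered finished.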

private static <T> Iterable<T> produce(final int slice, final IntFunction<T> mapper, final Predicate<? super T> predicate) {
    return () -> new Iterator<T>() {
        private boolean hasMore = true;
        private int i;

        @Override
        public boolean hasNext() {
            return hasMore;
        }

        @Override
        public T next()
                throws NoSuchElementException {
            if ( !hasMore ) {
                throw new NoSuchElementException();
            }
            final T next = mapper.apply(i * slice);
            hasMore = predicate.test(next);
            i++;
            return next;
        }
    };
}
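To see how produce() behaves without any network access, here is a minimal sketch that repeats the method and drives it with an in-memory page source. The fakePage and pageSizes helpers and the ProduceDemo class are hypothetical stand-ins for the remote call, not part of the answer's client.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.IntFunction;
import java.util.function.Predicate;

final class ProduceDemo {

    // Same logic as produce() above, repeated so that this sketch compiles on its own.
    static <T> Iterable<T> produce(final int slice, final IntFunction<T> mapper, final Predicate<? super T> predicate) {
        return () -> new Iterator<T>() {
            private boolean hasMore = true;
            private int i;

            @Override
            public boolean hasNext() {
                return hasMore;
            }

            @Override
            public T next() {
                if (!hasMore) {
                    throw new NoSuchElementException();
                }
                final T next = mapper.apply(i * slice);
                hasMore = predicate.test(next);
                i++;
                return next;
            }
        };
    }

    // Hypothetical stand-in for the remote call: at most `slice` numbers starting at `skip`.
    static List<Integer> fakePage(final int total, final int slice, final int skip) {
        if (skip >= total) {
            return Collections.emptyList();
        }
        final List<Integer> page = new ArrayList<>();
        for (int n = skip; n < Math.min(skip + slice, total); n++) {
            page.add(n);
        }
        return page;
    }

    // Collects the size of every page the iterable yields, including the final empty one.
    static List<Integer> pageSizes(final int total, final int slice) {
        final Iterable<List<Integer>> pages =
                produce(slice, skip -> fakePage(total, slice, skip), page -> !page.isEmpty());
        final List<Integer> sizes = new ArrayList<>();
        for (final List<Integer> page : pages) {
            sizes.add(page.size());
        }
        return sizes;
    }

    public static void main(final String... args) {
        System.out.println(pageSizes(120, 50)); // prints [50, 50, 20, 0]
    }
}
```

Note that the iterator still hands out the last, empty page before it stops, which is why whatever consumes it has to skip empty pages.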

getValuesFrom() and hasNotEmptyValues()

These two methods are very simple and merely inspect the incoming JSON object.

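private static JsonArray getValuesFrom(final JsonElement root) {
    final JsonObject rootObject = root.getAsJsonObject();
    // Note: the mock server in this answer names the array "values";
    // the real DataMall response shown in the question names it "value".
    return rootObject.get("values").getAsJsonArray();
}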

private static boolean hasNotEmptyValues(final JsonElement root) {
    return getValuesFrom(root).size() != 0;
}

consumeJsonValues()

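This is a special, not very generic method that takes whatever was produced and delegates its content to a consumer according to a few specific rules:

  • The given iterable is converted to a stream so that the Java 8 Stream API can be used. (Google Guava alternatives would work just as well and can be used in non-Java 8 environments.)
  • The stream is filtered for empty values, because the produce() method returns an empty value as the "finished" marker.
  • The filtered content is then flatMapped to flatten the values stored in the response root objects.
  • Then each element is mapped to a JsonObject.
  • Finally, forEach JsonObject is delegated to the given consumer passed as the second argument.

private static void consumeJsonValues(final Iterable<? extends JsonElement> jsonElements, final Consumer<? super JsonObject> consumer) {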
    stream(jsonElements.spliterator(), false)
            .filter(LtaMockControllerClient::hasNotEmptyValues)
            .flatMap(root -> stream(getValuesFrom(root).spliterator(), false))
            .map(JsonElement::getAsJsonObject)
            .forEach(consumer);
}
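The same filter/flatMap pipeline can be tried without Gson. The sketch below uses plain lists of strings in place of JSON responses; the FlattenPagesDemo class and its consumePages, collect and samplePages helpers are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.StreamSupport;

final class FlattenPagesDemo {

    // Mirrors the shape of consumeJsonValues(): skip empty pages, flatten the rest,
    // and hand every record to the consumer. forEachOrdered is used here only to
    // make the order of the demo output explicit.
    static <T> void consumePages(final Iterable<? extends List<T>> pages, final Consumer<? super T> consumer) {
        StreamSupport.stream(pages.spliterator(), false)
                .filter(page -> !page.isEmpty())
                .flatMap(page -> page.stream())
                .forEachOrdered(consumer);
    }

    // Gathers all records of all pages into one list.
    static List<String> collect(final Iterable<? extends List<String>> pages) {
        final List<String> out = new ArrayList<>();
        consumePages(pages, out::add);
        return out;
    }

    // Two non-empty pages followed by the empty "finished" page.
    static List<List<String>> samplePages() {
        final List<List<String>> pages = new ArrayList<>();
        pages.add(Arrays.asList("01012", "01013"));
        pages.add(Arrays.asList("01019"));
        pages.add(Collections.<String>emptyList());
        return pages;
    }

    public static void main(final String... args) {
        System.out.println(collect(samplePages())); // prints [01012, 01013, 01019]
    }
}
```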

readBusStops()

private static JsonElement readBusStops(final Gson gson, final int skip) {
    try ( final InputStream inputStream = new URL("http://127.0.0.1:9000/ltaodataservice/BusStops?$skip=" + skip).openStream();
          final Reader reader = new InputStreamReader(inputStream) ) {
        return gson.fromJson(reader, JsonElement.class);
    } catch ( final IOException ex ) {
        throw new RuntimeException(ex);
    }
}

How does it all look in the end?

All of the above lives in LtaMockControllerClient. Consider the following main() method:

private static final Gson gson = new Gson();

public static void main(final String... args)
        throws IOException {
    try ( final CSVWriter writer = new CSVWriter(new PrintWriter(new OutputStreamWriter(out))) ) {
        writer.writeNext(new String[]{ "BusStopCode", "RoadName", "Description", "Latitude", "Longitude" });
        consumeJsonValues(
                produce(50, skip -> readBusStops(gson, skip), LtaMockControllerClient::hasNotEmptyValues),
                v -> writer.writeNext(new String[]{
                        v.getAsJsonPrimitive("BusStopCode").getAsString(),
                        v.getAsJsonPrimitive("RoadName").getAsString(),
                        v.getAsJsonPrimitive("Description").getAsString(),
                        v.getAsJsonPrimitive("Latitude").getAsString(),
                        v.getAsJsonPrimitive("Longitude").getAsString()
                })
        );
    }
}

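The code above is simple and literally does the following:

  • Open a CSV writer (here, one that writes to System.out).
  • Write the header row.
  • Set up a producer/consumer pipeline that reads from the remote service (see what the produce() method accepts) and delegates the parsed results to a consumer that simply writes the parsed properties to the resulting CSV output.

The code above points at localhost:9000 (127.0.0.1:9000) because that is just a local mock server that behaves like the LTA service. It is just a regular REST controller written with Spring MVC: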

@RestController
@RequestMapping("/ltaodataservice")
public class LtaMockController {

    private static final Gson gson = new Gson();

    private static final List<BusStopDto> busStops = generateBusStops(5296);

    @RequestMapping(method = GET, value = "/BusStops", produces = "application/json")
    public String get(
            @RequestParam(value = "$skip", defaultValue = "0") final int from
    ) {
        final int size = busStops.size();
        final List<BusStopDto> values = from <= size
                ? busStops.subList(from, min(from + 50, size))
                : emptyList();
        return gson.toJson(new ResponseDto("http://datamall2.mytransport.sg/ltaodataservice/$metadata#BusStops", values));
    }

    private static List<BusStopDto> generateBusStops(final int count) {
        final List<BusStopDto> busStops = new ArrayList<>();
        for ( int i = 1; i <= count; i++ ) {
            busStops.add(new BusStopDto("code:" + i, "road:" + i, "description:" + i, i, i));
        }
        return unmodifiableList(busStops);
    }

    private static final class ResponseDto {

        @SerializedName("odata.metadata")
        @SuppressWarnings("unused")
        private final String odataMetadata;

        @SerializedName("values")
        @SuppressWarnings("unused")
        private final List<BusStopDto> values;

        private ResponseDto(final String odataMetadata, final List<BusStopDto> values) {
            this.odataMetadata = odataMetadata;
            this.values = values;
        }

    }

    private static final class BusStopDto {

        @SerializedName("BusStopCode")
        @SuppressWarnings("unused")
        private final String busStopCode;

        @SerializedName("RoadName")
        @SuppressWarnings("unused")
        private final String roadName;

        @SerializedName("Description")
        @SuppressWarnings("unused")
        private final String description;

        @SerializedName("Latitude")
        @SuppressWarnings("unused")
        private final double latitude;

        @SerializedName("Longitude")
        @SuppressWarnings("unused")
        private final double longitude;

        private BusStopDto(final String busStopCode, final String roadName, final String description, final double latitude, final double longitude) {
            this.busStopCode = busStopCode;
            this.roadName = roadName;
            this.description = description;
            this.latitude = latitude;
            this.longitude = longitude;
        }

    }

}
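The paging rule the mock applies can be isolated into a few lines, which also shows what a client should expect for the last, partial page. This is only a sketch of the slicing logic under the 50-records-per-page assumption; PageSliceDemo, page and numbers are hypothetical names, and the real service may behave differently at the boundaries.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class PageSliceDemo {

    // Assumption taken from the question: one call returns at most 50 records.
    static final int PAGE_SIZE = 50;

    // Returns the records in [from, from + PAGE_SIZE), clipped to the list bounds.
    static <T> List<T> page(final List<T> all, final int from) {
        if (from < 0 || from >= all.size()) {
            return Collections.emptyList();
        }
        return all.subList(from, Math.min(from + PAGE_SIZE, all.size()));
    }

    // Generates the numbers 1..count as stand-ins for bus stop records.
    static List<Integer> numbers(final int count) {
        final List<Integer> all = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            all.add(i);
        }
        return all;
    }

    public static void main(final String... args) {
        final List<Integer> stops = numbers(5296);
        System.out.println(page(stops, 0).size());    // prints 50
        System.out.println(page(stops, 50).get(0));   // prints 51 (the 51st record)
        System.out.println(page(stops, 5250).size()); // prints 46 (last, partial page)
        System.out.println(page(stops, 5300).size()); // prints 0 (past the end)
    }
}
```

A page with fewer than 50 records needs no special handling on the client side: the client keeps requesting pages and stops as soon as one comes back empty.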

Together, the client and server components produce the following output (5296 fake bus stops, 5297 lines in total including the header):

"BusStopCode","RoadName","Description","Latitude","Longitude"
"code:1","road:1","description:1","1.0","1.0"
"code:2","road:2","description:2","2.0","2.0"
"code:3","road:3","description:3","3.0","3.0"
"code:4","road:4","description:4","4.0","4.0"
"code:5","road:5","description:5","5.0","5.0"
"code:6","road:6","description:6","6.0","6.0"
"code:7","road:7","description:7","7.0","7.0"
...
"code:5291","road:5291","description:5291","5291.0","5291.0"
"code:5292","road:5292","description:5292","5292.0","5292.0"
"code:5293","road:5293","description:5293","5293.0","5293.0"
"code:5294","road:5294","description:5294","5294.0","5294.0"
"code:5295","road:5295","description:5295","5295.0","5295.0"
"code:5296","road:5296","description:5296","5296.0","5296.0"