Gson: Parsing a non-standard JSON format

Date: 2017-04-20 19:24:14

Tags: java json gson

Is there a way for Gson to read a non-standard JSON file?

Instead of a typical file like this:

[{obj1},{objN}]

I have a file like this:

{obj1}
{objN}

There are no square brackets or commas, and the objects are separated by newlines.

3 answers:

Answer 0 (score: 1)

Yes, there is. Gson supports lenient reading. For example, take the following JSON document (non-standard.json):

{
    "foo": 1
}
{
    "bar": 1
}

You can read it like this:

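// END_DOCUMENT is statically imported from com.google.gson.stream.JsonToken.
// getPackageResourceReader is the answer author's own helper (not shown here);
// presumably it opens a resource located next to the given class as a Reader.
private static final Gson gson = new Gson();
private static final TypeAdapter<JsonElement> jsonElementTypeAdapter = gson.getAdapter(JsonElement.class);

public static void main(final String... args)
        throws IOException {
    try ( final Reader reader = getPackageResourceReader(Q43528208.class, "non-standard.json") ) {
        final JsonReader jsonReader = new JsonReader(reader);
        jsonReader.setLenient(true); // this makes it work
        // Keep reading top-level values until the end of the stream
        while ( jsonReader.peek() != END_DOCUMENT ) {
            final JsonElement jsonElement = jsonElementTypeAdapter.read(jsonReader);
            System.out.println(jsonElement);
        }
    }
}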

Output:

{"foo":1}  
{"bar":1}  

I'm not sure whether you can write a robust deserializer this way, though.
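
As a possible alternative to driving JsonReader by hand, Gson also ships a JsonStreamParser class that iterates over consecutive top-level JSON elements in a single stream. The sketch below is not part of the original answer; the class name StreamParserSketch, the method readAll, and the inline input string are made up for illustration.

```java
import com.google.gson.JsonElement;
import com.google.gson.JsonStreamParser;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not from the original answer.
final class StreamParserSketch {

    // Collects every top-level JSON value found in the given text.
    static List<JsonElement> readAll(final String json) {
        final List<JsonElement> elements = new ArrayList<>();
        final JsonStreamParser parser = new JsonStreamParser(json);
        while ( parser.hasNext() ) {
            elements.add(parser.next());
        }
        return elements;
    }

    public static void main(final String... args) {
        // Same content as non-standard.json above, inlined for the sketch
        final String input = "{\n    \"foo\": 1\n}\n{\n    \"bar\": 1\n}\n";
        for ( final JsonElement element : readAll(input) ) {
            System.out.println(element);
        }
    }
}
```

This only yields JsonElement values; mapping them to concrete types would still need gson.fromJson(element, type) or a type adapter.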

Update

To make this easier to use with Gson, we can implement a few convenience reading methods:

// A shortcut method for the below implementation: aggregates the whole result into a single list
private static <T> List<T> parseToListLenient(final JsonReader jsonReader, final IMapper<? super JsonReader, ? extends T> mapper)
        throws IOException {
    final List<T> list = new ArrayList<>();
    parseLenient(jsonReader, in -> list.add(mapper.map(in)));
    return list;
}

// A convenience method that accepts a strategy: it makes the JsonReader lenient, reads until the end of the document,
// and restores the original leniency afterwards. The consumer decides what to do at the current JsonReader token.
private static void parseLenient(final JsonReader jsonReader, final IConsumer<? super JsonReader> consumer)
        throws IOException {
    final boolean isLenient = jsonReader.isLenient();
    try {
        jsonReader.setLenient(true);
        while ( jsonReader.peek() != END_DOCUMENT ) {
            consumer.accept(jsonReader);
        }
    } finally {
        jsonReader.setLenient(isLenient);
    }
}

// The Java 8 Consumer interface does not allow checked exceptions to be thrown, hence these custom interfaces
private interface IConsumer<T> {

    void accept(T value)
            throws IOException;

}

private interface IMapper<T, R> {

    R map(T value)
            throws IOException;

}

Simple reading then becomes straightforward, since we can just use the methods above:

final Gson gson = new Gson();
final TypeToken<Map<String, Integer>> typeToken = new TypeToken<Map<String, Integer>>() {
};
final TypeAdapter<Map<String, Integer>> typeAdapter = gson.getAdapter(typeToken);
try ( final JsonReader jsonReader = getPackageResourceJsonReader(Q43528208.class, "non-standard.json") ) {
    final List<Map<String, Integer>> maps = parseToListLenient(jsonReader, typeAdapter::read);
    System.out.println(maps);
}

Deserializing directly through Gson requires a more complicated implementation:

// This is just a marker not meant to be instantiated but to create a sort of "gateway" to dispatch types in Gson
@SuppressWarnings("unused")
private static final class LenientListMarker<T> {
    private LenientListMarker() {
        throw new AssertionError("must not be instantiated");
    }
}

private static void doDeserialize()
        throws IOException {
    final Gson gson = new GsonBuilder()
            .registerTypeAdapterFactory(new TypeAdapterFactory() {
                @Override
                public <T> TypeAdapter<T> create(final Gson gson, final TypeToken<T> typeToken) {
                    // Check if the given type is the lenient list marker class
                    if ( !LenientListMarker.class.isAssignableFrom(typeToken.getRawType()) ) {
                        // Not the case? Just delegate the job to Gson
                        return null;
                    }
                    final Type listElementType = getTypeParameter0(typeToken.getType());
                    final TypeAdapter<?> listElementAdapter = gson.getAdapter(TypeToken.get(listElementType));
                    @SuppressWarnings("unchecked")
                    final TypeToken<List<?>> listTypeToken = (TypeToken<List<?>>) TypeToken.getParameterized(List.class, listElementType);
                    final TypeAdapter<List<?>> listAdapter = gson.getAdapter(listTypeToken);
                    final TypeAdapter<List<?>> typeAdapter = new TypeAdapter<List<?>>() {
                        @Override
                        public void write(final JsonWriter out, final List<?> value)
                                throws IOException {
                            // Always write a well-formed list
                            listAdapter.write(out, value);
                        }

                        @Override
                        public List<?> read(final JsonReader in)
                                throws IOException {
                            // Delegate the job to the reading method - we only have to tell how to obtain the list values
                            return parseToListLenient(in, listElementAdapter::read);
                        }
                    };
                    @SuppressWarnings("unchecked")
                    final TypeAdapter<T> castTypeAdapter = (TypeAdapter<T>) typeAdapter;
                    return castTypeAdapter;
                }

                // A simple method to resolve actual type parameter
                private Type getTypeParameter0(final Type type) {
                    if ( !(type instanceof ParameterizedType) ) {
                        // List or List<?>
                        return Object.class;
                    }
                    return ((ParameterizedType) type).getActualTypeArguments()[0];
                }
            })
            .create();
    // This type declares a marker specialization to be used during deserialization
    final Type type = new TypeToken<LenientListMarker<Map<String, Integer>>>() {
    }.getType();
    try ( final JsonReader jsonReader = getPackageResourceJsonReader(Q43528208.class, "non-standard.json") ) {
        // This is where we are sort of cheating:
        // We tell Gson to deserialize LenientListMarker<Map<String, Integer>> but the type adapter above will return a list
        final List<Map<String, Integer>> maps = gson.fromJson(jsonReader, type);
        System.out.println(maps);
    }
}

The output is now a list of Map<String, Integer> values rather than JsonElement values:

[{foo=1}, {bar=1}]

Update 2

A workaround if TypeToken.getParameterized is not available in your Gson version:

@SuppressWarnings("unchecked")
final TypeToken<List<?>> listTypeToken = (TypeToken<List<?>>) TypeToken.get(new ParameterizedType() {
    @Override
    public Type getRawType() {
        return List.class;
    }

    @Override
    public Type[] getActualTypeArguments() {
        return new Type[]{ listElementType };
    }

    @Override
    public Type getOwnerType() {
        return null;
    }
});
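
For reference, the same idea can be sketched as a small standalone helper that uses only java.lang.reflect, which makes it easier to see what the anonymous ParameterizedType reports. The names ListTypes and listOf are hypothetical and belong neither to Gson nor to the original answer.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

// Hypothetical helper class, for illustration only.
final class ListTypes {

    private ListTypes() {
    }

    // Builds a reflective description of List<elementType>.
    static ParameterizedType listOf(final Type elementType) {
        return new ParameterizedType() {
            @Override
            public Type getRawType() {
                return List.class;
            }

            @Override
            public Type[] getActualTypeArguments() {
                // A fresh array on each call, so callers cannot mutate shared state
                return new Type[]{ elementType };
            }

            @Override
            public Type getOwnerType() {
                // List is a top-level type, so it has no owner
                return null;
            }
        };
    }

    public static void main(final String... args) {
        final ParameterizedType type = listOf(String.class);
        System.out.println(type.getRawType());
        System.out.println(type.getActualTypeArguments()[0]);
    }
}
```

The result of listOf(listElementType) could then be passed to TypeToken.get in place of the inline anonymous class shown above.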

Answer 1 (score: 0)

We can add one more program that inserts the commas (,) and constructs a well-formed JSON document.
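
A minimal sketch of such a preprocessing step, using only the Java standard library, might look like the following. It is an assumption about what this answer has in mind: besides the commas it also adds the enclosing square brackets, it tracks string literals so that braces inside strings are not miscounted, and it does not otherwise validate the JSON. The class and method names are hypothetical.

```java
// Hypothetical helper class, for illustration only.
final class JsonConcatToArray {

    private JsonConcatToArray() {
    }

    // Wraps whitespace-separated top-level JSON objects (or arrays) into one JSON array.
    static String toJsonArray(final String input) {
        final StringBuilder out = new StringBuilder("[");
        int depth = 0;            // current nesting level of {} and []
        boolean inString = false; // true while inside a string literal
        boolean escaped = false;  // true right after a backslash inside a string
        boolean first = true;     // no comma before the first top-level value
        for ( int i = 0; i < input.length(); i++ ) {
            final char c = input.charAt(i);
            if ( depth == 0 ) {
                if ( Character.isWhitespace(c) ) {
                    continue; // skip the separators between top-level values
                }
                if ( c != '{' && c != '[' ) {
                    throw new IllegalArgumentException("Unexpected character at index " + i);
                }
                if ( !first ) {
                    out.append(',');
                }
                first = false;
            }
            out.append(c);
            if ( inString ) {
                if ( escaped ) {
                    escaped = false;
                } else if ( c == '\\' ) {
                    escaped = true;
                } else if ( c == '"' ) {
                    inString = false;
                }
            } else if ( c == '"' ) {
                inString = true;
            } else if ( c == '{' || c == '[' ) {
                depth++;
            } else if ( c == '}' || c == ']' ) {
                depth--;
            }
        }
        if ( depth != 0 || inString ) {
            throw new IllegalArgumentException("Unbalanced input");
        }
        return out.append(']').toString();
    }

    public static void main(final String... args) {
        System.out.println(toJsonArray("{\"foo\":1}\n{\"bar\":1}\n"));
    }
}
```

The resulting string can be handed to any standard JSON parser. For large files this approach holds the whole document in memory, so the streaming approach from Answer 0 may be preferable there.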

Answer 2 (score: 0)

With Spark 2, we can set multiline as a read option.

spark.read.option("multiline", "true").json("data.json")