I have a very large (> 1GB) JSON file containing an array. (It is confidential, but here is a stand-in:)
[
  {
    "date": "August 17, 2015",
    "hours": 7,
    "minutes": 10
  },
  {
    "date": "August 19, 2015",
    "hours": 4,
    "minutes": 46
  },
  {
    "date": "August 19, 2015",
    "hours": 7,
    "minutes": 22
  },
  {
    "date": "August 21, 2015",
    "hours": 4,
    "minutes": 48
  },
  {
    "date": "August 21, 2015",
    "hours": 6,
    "minutes": 1
  }
]
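I have used JSON2POJO to generate a "Sleep" object definition.
Now, one could use Jackson's ObjectMapper to convert the file to an array and then call Arrays.stream(ARRAY). Except that this crashes (yes, it is a big file).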
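For reference, the array-based approach described above presumably looks something like the following sketch. The `Sleep` class here is a hypothetical stand-in for the JSON2POJO-generated one, and `NaiveSleepReader` is an illustrative name. The point is that `readValue` has to build the entire `Sleep[]` before the stream yields its first element, which is why memory use grows with the file.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.stream.Stream;

// Hypothetical stand-in for the JSON2POJO-generated class.
class Sleep {
    public String date;
    public int hours;
    public int minutes;
}

class NaiveSleepReader {
    // Reads the whole top-level array into memory, then streams over it.
    static Stream<Sleep> readAll(Reader source) {
        try {
            Sleep[] all = new ObjectMapper().readValue(source, Sleep[].class);
            return Arrays.stream(all);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String json = "[{\"date\":\"August 17, 2015\",\"hours\":7,\"minutes\":10}]";
        readAll(new StringReader(json))
                .forEach(s -> System.out.println(s.date + ": " + s.hours + "h " + s.minutes + "m"));
    }
}
```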
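The obvious alternative is Jackson's Streaming API. But that is very low-level. In particular, I still want Sleep objects.
How can I use the Jackson streaming JSON reader together with my Sleep.java class to produce a Java 8 Stream of Sleep objects?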
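Answer 0 (score: 3)
I could not find a good solution to this, and I needed one for a specific case: I had a > 1GB JSON file (a top-level JSON array containing thousands of large objects), and using the ordinary Jackson mapper led to crashes when accessing the resulting array of Java objects.
The examples I found for the Jackson Streaming API gave up the very appealing object mapping, and certainly did not allow access to the objects through the (obviously appropriate) Java 8 Streams API.
Here is a simple example of using the wrapper class I ended up with: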
// Use the JSON file included as a resource
ClassLoader classLoader = SleepReader.class.getClassLoader();
File dataFile = new File(classLoader.getResource("example.json").getFile());
// Simple example of getting the Sleep objects from that JSON
new JsonArrayStreamDataSupplier<>(dataFile, Sleep.class)
        .getStream() // got the Stream
        .forEach(nightsRest -> {
            System.out.println(nightsRest.toString());
        });
Here is some of the JSON from example.json:
[
  {
    "date": "August 17, 2015",
    "hours": 7,
    "minutes": 10
  },
  {
    "date": "August 19, 2015",
    "hours": 4,
    "minutes": 46
  },
  {
    "date": "August 19, 2015",
    "hours": 7,
    "minutes": 22
  },
  {
    "date": "August 21, 2015",
    "hours": 4,
    "minutes": 48
  },
  {
    "date": "August 21, 2015",
    "hours": 6,
    "minutes": 1
  }
]
And, in case you don't want to go to GitHub (you should), here is the wrapper class itself:
/**
 * @license APACHE LICENSE, VERSION 2.0 http://www.apache.org/licenses/LICENSE-2.0
 * @author Michael Witbrock
 */
package com.michaelwitbrock.jacksonstream;

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class JsonArrayStreamDataSupplier<T> implements Iterator<T> {
    /*
     * This class wraps the Jackson streaming API for arrays (a common kind of
     * large JSON file) in a Java 8 Stream. The initial motivation was that
     * use of a default ObjectMapper to a Java array was crashing for me on
     * a very large JSON file (> 1GB). And there didn't seem to be good example
     * code for handling Jackson streams as Java 8 streams, which seems natural.
     */
    private static final ObjectMapper mapper = new ObjectMapper();
    private final JsonFactory factory = new JsonFactory();
    private final JsonParser parser;
    private final Class<T> type;
    // True while the parser sits on a START_OBJECT that next() has not consumed yet
    private boolean objectPending = false;
    // True once the end of the array has been reached
    private boolean finished = false;

    public JsonArrayStreamDataSupplier(File dataFile, Class<T> type) {
        this.type = type;
        try {
            // Set up and get into a state to start iterating
            parser = factory.createParser(dataFile);
            parser.setCodec(mapper);
            JsonToken token = parser.nextToken();
            if (token == null) {
                throw new RuntimeException("Can't get any JSON token from "
                        + dataFile.getAbsolutePath());
            }
            // The first token is supposed to be the start of the array '['
            if (token != JsonToken.START_ARRAY) {
                throw new RuntimeException("Can't get the JSON token for array start from "
                        + dataFile.getAbsolutePath());
            }
        } catch (IOException e) {
            // Fail loudly instead of silently producing an empty stream
            throw new UncheckedIOException(e);
        }
    }

    /*
     * This method returns the stream, and is the only method other
     * than the constructor that should be used.
     */
    public Stream<T> getStream() {
        return StreamSupport.stream(Spliterators.spliteratorUnknownSize(this, 0), false);
    }

    /*
     * The remaining methods are what enables this to be passed to the
     * spliterator generator, since they make it an Iterator.
     */
    @Override
    public boolean hasNext() {
        if (finished) {
            return false;
        }
        if (objectPending) {
            return true; // already looked ahead; do not advance the parser again
        }
        try {
            objectPending = (parser.nextToken() == JsonToken.START_OBJECT);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        finished = !objectPending;
        return objectPending;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        try {
            objectPending = false;
            // The Class<T> token stands in for T, which is erased at runtime
            return parser.readValueAs(type);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
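Wrapping the iterator with StreamSupport keeps the pipeline lazy: elements are pulled from the iterator only as the stream needs them, which is what allows a > 1GB file to be processed without holding every object at once. The following self-contained sketch shows that property using only the JDK. `CountingSource` and `LazyStreamDemo` are hypothetical stand-ins for the Jackson-backed iterator, not part of the wrapper class.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical stand-in for a parser-backed iterator: it produces 0, 1, 2, ...
// and counts how many elements have actually been pulled from it.
class CountingSource implements Iterator<Integer> {
    int pulled = 0;

    @Override
    public boolean hasNext() {
        return true; // pretend the input is effectively unbounded
    }

    @Override
    public Integer next() {
        return pulled++;
    }
}

class LazyStreamDemo {
    // Same wrapping technique as getStream(): an Iterator becomes a sequential Stream.
    static <T> Stream<T> asStream(Iterator<T> source) {
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(source, Spliterator.ORDERED), false);
    }

    public static void main(String[] args) {
        CountingSource source = new CountingSource();
        // limit(3) short-circuits, so the rest of the source is never read.
        List<Integer> firstThree = asStream(source).limit(3).collect(Collectors.toList());
        System.out.println(firstThree + " after pulling " + source.pulled + " elements");
    }
}
```

With the real wrapper, the same idea means operations such as `limit`, `filter`, or `mapToInt(...).sum()` can run over the file while only a small number of Sleep objects are live at any moment.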
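Answer 1 (score: 0)
I think you can get rid of the whole Iterator implementation by using Jackson's own API.
The catch here is that readValuesAs can return an Iterator; the one thing I have not quite figured out is why I have to consume the start of the JSON array first before I can let Jackson do its job.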
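A minimal sketch of that idea, under stated assumptions: `JsonArrayStreams` is an illustrative name, and the `Sleep` class is a hypothetical stand-in for the generated POJO. It uses `JsonParser.readValuesAs(Class)`, which returns an `Iterator<T>`, after stepping past the opening `[`. As far as I can tell, the need to consume the array start matches the Javadoc for `ObjectMapper.readValues(JsonParser, ...)`, which appears to require that the parser not point at the surrounding START_ARRAY when a caller-supplied parser is used.

```java
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical stand-in for the JSON2POJO-generated class.
class Sleep {
    public String date;
    public int hours;
    public int minutes;
}

class JsonArrayStreams {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Lazily streams the elements of a top-level JSON array as objects of the given type.
    // Closing the Reader is left to the caller in this sketch.
    static <T> Stream<T> stream(Reader source, Class<T> type) {
        try {
            // A parser created from the mapper's factory already has the mapper as its codec.
            JsonParser parser = MAPPER.getFactory().createParser(source);
            if (parser.nextToken() != JsonToken.START_ARRAY) {
                throw new IllegalStateException("Expected a top-level JSON array");
            }
            // Forget the '[' token so the parser no longer points at the enclosing array.
            parser.clearCurrentToken();
            Iterator<T> values = parser.readValuesAs(type);
            return StreamSupport.stream(
                    Spliterators.spliteratorUnknownSize(values, Spliterator.ORDERED), false);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String json = "[{\"date\":\"August 17, 2015\",\"hours\":7,\"minutes\":10},"
                + "{\"date\":\"August 19, 2015\",\"hours\":4,\"minutes\":46}]";
        stream(new StringReader(json), Sleep.class)
                .forEach(s -> System.out.println(s.date + ": " + s.hours + "h " + s.minutes + "m"));
    }
}
```

For a file on disk, the Reader could come from something like `Files.newBufferedReader(path)` inside a try-with-resources block, so that the underlying file is closed once the stream has been consumed.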