I am trying to pass TableRows that I generate between pipeline stages, and I get the following error:
Exception in thread "main"
com.google.cloud.dataflow.sdk.Pipeline$PipelineExecutionException:
java.lang.IllegalArgumentException: Forbidden IOException when writing to OutputStream
[... exception propagation ...]
Caused by: com.fasterxml.jackson.databind.JsonMappingException:
Infinite recursion (StackOverflowError) (through reference chain:
com.google.protobuf.Descriptors$Descriptor["file"]
->com.google.protobuf.Descriptors$FileDescriptor["messageTypes"]
->java.util.Collections$UnmodifiableRandomAccessList[0]->
[... many, many lines of this ...]
at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:733)
at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:155)
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serializeContentsUsing(IndexedListSerializer.java:142)
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serializeContents(IndexedListSerializer.java:88)
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serialize(IndexedListSerializer.java:79)
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serialize(IndexedListSerializer.java:18)
at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:727)
at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:717)
at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:155)
at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:727)
at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:717)
at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:155)
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serializeContentsUsing(IndexedListSerializer.java:142)
[... many, many lines of this ...]
Caused by: java.lang.StackOverflowError
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:736)
at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:155)
at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:727)
at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:717)
at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:155)
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serializeContentsUsing(IndexedListSerializer.java:142)
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serializeContents(IndexedListSerializer.java:88)
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serialize(IndexedListSerializer.java:79)
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serialize(IndexedListSerializer.java:18)
[... snip ...]
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serialize(IndexedListSerializer.java:79)
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serialize(IndexedListSerializer.java:18)
at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:727)
at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:717)
at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:155)
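I build my TableRow recursively from a Google protobuf via its descriptor: I recursively walk the descriptor depth (since protobufs may have nested definitions) and build the TableRow as I go. Here is an excerpt from the class that creates the TableRow: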
public void processElement(ProcessContext c) throws Exception {
    TableRow row = getTableRow(c.element());
    LOG.info(row.toPrettyString());
    c.output(row);
}

private TableRow getTableRow(TMessage message) throws Exception {
    TableRow row = new TableRow();
    encode(message, row);
    return row;
}

private TableCell getTableCell(TMessage message) throws Exception {
    TableCell cell = new TableCell();
    encode(message, cell);
    return cell;
}

private void encode(TMessage message, GenericJson row) throws Exception {
    Descriptors.Descriptor descriptor = message.getDescriptorForType();
    List<Descriptors.FieldDescriptor> fields = descriptor.getFields();
    for (Descriptors.FieldDescriptor fieldDescriptor : fields) {
        Descriptors.FieldDescriptor.Type fieldType = fieldDescriptor.getType();
        switch (fieldType) {
            case DOUBLE:
            case FLOAT:
            case INT64:
            case UINT64:
            case INT32:
            case FIXED64:
            case FIXED32:
            case UINT32:
            case SFIXED32:
            case SFIXED64:
            case SINT32:
            case SINT64:
            case BOOL:
            case STRING:
            case BYTES:
            case ENUM:
                if (fieldDescriptor.isRepeated()) {
                    List<Object> tableCells = new ArrayList<>();
                    tableCells.addAll((List<?>) message.getField(fieldDescriptor));
                    row.set(fieldDescriptor.getName(), tableCells);
                }
                else {
                    row.set(fieldDescriptor.getName(), message.getField(fieldDescriptor));
                }
                break;
            case MESSAGE:
                if (fieldDescriptor.isRepeated()) {
                    List<TableRow> tableRows = new ArrayList<>();
                    for (Object o : (List<?>) message.getField(fieldDescriptor)) {
                        TMessage nestedMessage = (TMessage) o;
                        TableRow tableRow = getTableRow(nestedMessage);
                        tableRows.add(tableRow);
                    }
                    row.set(fieldDescriptor.getName(), tableRows);
                }
                else {
                    row.set(fieldDescriptor.getName(), getTableCell((TMessage) message.getField(fieldDescriptor)));
                }
                break;
            case GROUP:
                throw new Exception("groups are deprecated");
        }
    }
}
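I believe the TableRow is being created correctly: I have tested this DoFn with some simple dummy data and inspected the TableRows created for a subset of my dataset (see the snippet above, where I LOG.info the encoded TableRow), and the resulting TableRow appears to contain all the data I expect, with no extra fields.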
Answer 0 (score: 2)
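Based on the stack trace and the code, it looks like something in the protocol buffer message may be self-referential. JSON encoding fails when it follows those references.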
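Looking at the code, my guess is that you are hitting an enum. If you look at the protocol buffer documentation for getField, it returns an EnumValueDescriptor for enum fields.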
Looking at EnumValueDescriptor, it has a link to a FileDescriptor, which contains a link to an EnumDescriptor, which contains a link to a FileDescriptor, which contains a list of all EnumDescriptors, which contain links to the FileDescriptor, and so on.
If you handle the ENUM case specially (in particular, to keep protos from showing up as values in the JSON map), it should solve your problem.
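To make the cycle and the suggested fix concrete, here is a minimal sketch that uses only the Java standard library. FileDesc, EnumDesc, EnumValueDesc, naiveEncode and toRowValue are hypothetical stand-ins invented for illustration: they only mirror the back-references described above and are not the real com.google.protobuf.Descriptors classes or the Jackson serializer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical stand-ins that mirror only the descriptor back-references;
// they are NOT the real protobuf classes.
class EnumCycleSketch {

    static final class FileDesc {
        final List<EnumDesc> enumTypes = new ArrayList<>();
    }

    static final class EnumDesc {
        final FileDesc file;
        EnumDesc(FileDesc file) {
            this.file = file;
            file.enumTypes.add(this); // the file lists the enum that points back at it
        }
    }

    static final class EnumValueDesc {
        final String name;
        final EnumDesc type;
        EnumValueDesc(String name, EnumDesc type) {
            this.name = name;
            this.type = type;
        }
    }

    // Naive encoder that follows every reference and has no cycle detection,
    // so file -> enum -> file never terminates.
    static String naiveEncode(Object value) {
        if (value instanceof String) {
            return "\"" + value + "\"";
        }
        if (value instanceof EnumValueDesc) {
            EnumValueDesc v = (EnumValueDesc) value;
            return "{\"name\":" + naiveEncode(v.name) + ",\"type\":" + naiveEncode(v.type) + "}";
        }
        if (value instanceof EnumDesc) {
            return "{\"file\":" + naiveEncode(((EnumDesc) value).file) + "}";
        }
        if (value instanceof FileDesc) {
            StringJoiner items = new StringJoiner(",", "{\"enumTypes\":[", "]}");
            for (EnumDesc e : ((FileDesc) value).enumTypes) {
                items.add(naiveEncode(e));
            }
            return items.toString();
        }
        if (value instanceof Map) {
            StringJoiner fields = new StringJoiner(",", "{", "}");
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                fields.add("\"" + entry.getKey() + "\":" + naiveEncode(entry.getValue()));
            }
            return fields.toString();
        }
        return String.valueOf(value);
    }

    // The fix: store only the enum's name, so no descriptor reaches the row.
    static Object toRowValue(Object fieldValue) {
        if (fieldValue instanceof EnumValueDesc) {
            return ((EnumValueDesc) fieldValue).name;
        }
        return fieldValue;
    }

    static EnumValueDesc sampleEnumValue() {
        return new EnumValueDesc("RED", new EnumDesc(new FileDesc()));
    }

    // Reports whether encoding the value exhausts the stack.
    static boolean overflows(Object value) {
        try {
            naiveEncode(value);
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        EnumValueDesc red = sampleEnumValue();

        Map<String, Object> broken = new LinkedHashMap<>();
        broken.put("color", red); // descriptor object stored as the value

        Map<String, Object> fixed = new LinkedHashMap<>();
        fixed.put("color", toRowValue(red)); // plain string stored instead

        System.out.println("broken row overflows: " + overflows(broken));
        System.out.println("fixed row encodes to: " + naiveEncode(fixed));
    }
}
```

In the question's encode method, the equivalent change would presumably be to split ENUM out of the shared case list and store a plain value, for example the result of EnumValueDescriptor.getName() or getNumber(), applied to each element when the field is repeated.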