StreamingFileSink fails to rename an in-progress file

Date: 2019-11-12 01:33:59

Tags: google-cloud-storage apache-flink flink-streaming

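I occasionally run into a problem when StreamingFileSink tries to rename an in-progress file. See the exception below.

The FS is mounted on a Google Cloud Storage bucket. It works fine most of the time and then suddenly fails.

I have verified that the in-progress file exists. Any hints or suggestions would be very helpful. Thanks!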


2019-11-12 01:26:44.908 [flink-akka.actor.default-dispatcher-77] INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph - Job kafka-ingestion-avro-enriched-structured-trace-event-staging (b2f5e85f57049e0841a88498fcda868d) switched from state RUNNING to FAILING.
java.nio.file.NoSuchFileException: /var/data/rawdata/2019-11-11--19/.part-0-0.inprogress.a99cb596-8e96-4b88-ae60-e5b64875bb74 -> /var/data/rawdata/2019-11-11--19/part-0-0
    at sun.nio.fs.UnixException.translateToIOException(Unknown Source) ~[?:?]
    at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source) ~[?:?]
    at sun.nio.fs.UnixCopyFile.move(Unknown Source) ~[?:?]
    at sun.nio.fs.UnixFileSystemProvider.move(Unknown Source) ~[?:?]
    at java.nio.file.Files.move(Unknown Source) ~[?:?]
    at org.apache.flink.core.fs.local.LocalRecoverableFsDataOutputStream$LocalCommitter.commitAfterRecovery(LocalRecoverableFsDataOutputStream.java:181) ~[flink-core-1.7.0.jar:1.7.0]
    at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.commitRecoveredPendingFiles(Bucket.java:137) ~[flink-streaming-java_2.11-1.7.0.jar:1.7.0]
    at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.<init>(Bucket.java:119) ~[flink-streaming-java_2.11-1.7.0.jar:1.7.0]
    at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.restore(Bucket.java:346) ~[flink-streaming-java_2.11-1.7.0.jar:1.7.0]
    at org.apache.flink.streaming.api.functions.sink.filesystem.DefaultBucketFactoryImpl.restoreBucket(DefaultBucketFactoryImpl.java:64) ~[flink-streaming-java_2.11-1.7.0.jar:1.7.0]
    at org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.handleRestoredBucketState(Buckets.java:177) ~[flink-streaming-java_2.11-1.7.0.jar:1.7.0]
    at org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.initializeActiveBuckets(Buckets.java:165) ~[flink-streaming-java_2.11-1.7.0.jar:1.7.0]
    at org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.initializeState(Buckets.java:149) ~[flink-streaming-java_2.11-1.7.0.jar:1.7.0]
    at org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink.initializeState(StreamingFileSink.java:334) ~[flink-streaming-java_2.11-1.7.0.jar:1.7.0]
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178) ~[flink-streaming-java_2.11-1.7.0.jar:1.7.0]
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160) ~[flink-streaming-java_2.11-1.7.0.jar:1.7.0]
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96) ~[flink-streaming-java_2.11-1.7.0.jar:1.7.0]
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:278) ~[flink-streaming-java_2.11-1.7.0.jar:1.7.0]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:738) ~[flink-streaming-java_2.11-1.7.0.jar:1.7.0]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:289) ~[flink-streaming-java_2.11-1.7.0.jar:1.7.0]
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:704) ~[flink-runtime_2.11-1.7.0.jar:1.7.0]




1 answer:

Answer 0 (score: 0)

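I ran into the same problem, and I would bet it is related to eventual consistency or to emulating a filesystem on top of object storage. I switched to a gcs://-based path, and it now seems to work fine.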
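
For anyone reading the trace: the failing frame is a plain `java.nio.file.Files.move` from the hidden in-progress file to the final part file, called while pending files are committed after a restore. Below is a minimal sketch, using only the JDK, of how that call raises `NoSuchFileException` when the source is not visible at the moment of the rename. The class name, the `commit` helper, the file names, and the use of `ATOMIC_MOVE` are assumptions for illustration, not Flink's actual code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical demo class; it only imitates the rename step seen in the trace.
class RenameInProgressDemo {

    // Renames the in-progress file to its final name.
    // Returns true on success, false if the source file could not be found.
    static boolean commit(Path inProgress, Path target) throws IOException {
        try {
            // ATOMIC_MOVE is an assumption made for this sketch.
            Files.move(inProgress, target, StandardCopyOption.ATOMIC_MOVE);
            return true;
        } catch (NoSuchFileException e) {
            // Same exception type as in the question's stack trace.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("rawdata");
        Path inProgress = dir.resolve(".part-0-0.inprogress.demo");
        Path target = dir.resolve("part-0-0");

        Files.write(inProgress, "record\n".getBytes(StandardCharsets.UTF_8));

        // First commit: the in-progress file exists, so the rename goes through.
        System.out.println("first commit succeeded: " + commit(inProgress, target));

        // Second commit: the source has already been moved, so the rename fails.
        System.out.println("second commit succeeded: " + commit(inProgress, target));
    }
}
```

The second call fails only because the first one already moved the file, which is one way a repeated commit after recovery could end up with this error. On a mounted bucket, a stale or delayed view of the directory could plausibly have a similar effect, though I have not confirmed that this is what happens here.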