For a Spark program that had already run successfully dozens of times, an interesting file-system error showed up in the following logic that sets the checkpoint dir:

val tempDir = s"alsTest"
sc.setCheckpointDir(tempDir)
Here is the error:
org.apache.hadoop.fs.FileSystem: Provider tachyon.hadoop.TFS could not be instantiated
Here is the full stack trace:
Exception in thread "main" java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider tachyon.hadoop.TFS could not be instantiated
at java.util.ServiceLoader.fail(ServiceLoader.java:232)
at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2400)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2411)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
at org.apache.spark.SparkContext$$anonfun$setCheckpointDir$2.apply(SparkContext.scala:2076)
at org.apache.spark.SparkContext$$anonfun$setCheckpointDir$2.apply(SparkContext.scala:2074)
at scala.Option.map(Option.scala:145)
at org.apache.spark.SparkContext.setCheckpointDir(SparkContext.scala:2074)
at com.blazedb.spark.ml.AlsTest$.main(AlsTest.scala:331)
at com.blazedb.spark.ml.AlsTest.main(AlsTest.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
Caused by: java.lang.ExceptionInInitializerError
at tachyon.Constants.<clinit>(Constants.java:328)
at tachyon.hadoop.AbstractTFS.<clinit>(AbstractTFS.java:63)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at java.lang.Class.newInstance(Class.java:442)
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
... 21 more
Caused by: java.lang.RuntimeException: java.net.ConnectException: Permission denied (connect failed)
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at tachyon.util.network.NetworkAddressUtils.getLocalIpAddress(NetworkAddressUtils.java:398)
at tachyon.util.network.NetworkAddressUtils.getLocalHostName(NetworkAddressUtils.java:320)
at tachyon.conf.TachyonConf.<init>(TachyonConf.java:122)
at tachyon.conf.TachyonConf.<init>(TachyonConf.java:111)
at tachyon.Version.<clinit>(Version.java:27)
... 29 more
Caused by: java.net.ConnectException: Permission denied (connect failed)
at java.net.Inet6AddressImpl.isReachable0(Native Method)
at java.net.Inet6AddressImpl.isReachable(Inet6AddressImpl.java:77)
at java.net.InetAddress.isReachable(InetAddress.java:502)
at java.net.InetAddress.isReachable(InetAddress.java:461)
at tachyon.util.network.NetworkAddressUtils.isValidAddress(NetworkAddressUtils.java:414)
at tachyon.util.network.NetworkAddressUtils.getLocalIpAddress(NetworkAddressUtils.java:382)
... 33 more
Note that the relative path alsTest had been working fine until now. Our RDD storage level is set to MEMORY_AND_SER (not OFF_HEAP). We can also verify the directory by looking at hdfs:
$hdfs dfs -lsr
drwxr-xr-x - boescst supergroup 0 2016-12-13 12:43 alsTest/78081dc9-06f5-43d6-bcfb-1cfea7b4f015
drwxr-xr-x - boescst supergroup 0 2016-12-13 12:19 alsTest/e2dd272b-19fe-4ee8-87d0-2a9afe141c9e
So why is the Spark FileSystem class now trying to access OFF_HEAP (tachyon) at all?
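One likely reason Tachyon gets touched at all (an inference from the stack trace, not something stated in the question): FileSystem.loadFileSystems enumerates every org.apache.hadoop.fs.FileSystem implementation registered on the classpath via java.util.ServiceLoader and instantiates each one as the iterator advances, so a single provider whose static initializer throws (here tachyon.hadoop.TFS) aborts the whole lookup with ServiceConfigurationError, even when the caller only wanted hdfs://. A minimal JDK-only sketch of the same mechanism, using the standard java.nio FileSystemProvider service so no Hadoop jars are assumed:

```scala
import java.util.ServiceLoader
import scala.collection.mutable.ListBuffer

object ProviderScan {
  // Collect the class names of all providers registered for a service
  // interface under META-INF/services/<interface>. If any single
  // provider's constructor or static initializer throws, iteration
  // aborts with ServiceConfigurationError -- which is exactly how a
  // broken tachyon.hadoop.TFS can break an unrelated hdfs:// lookup.
  def registeredProviders[T](service: Class[T]): List[String] = {
    val it  = ServiceLoader.load(service).iterator()
    val buf = ListBuffer.empty[String]
    while (it.hasNext) buf += it.next().getClass.getName
    buf.toList
  }

  def main(args: Array[String]): Unit =
    // The JDK's own FileSystemProvider service uses the same
    // ServiceLoader mechanism as Hadoop's FileSystem lookup.
    registeredProviders(classOf[java.nio.file.spi.FileSystemProvider])
      .foreach(println)
}
```

Under this reading, removing the tachyon jar from the classpath (or fixing whatever makes its initializer fail) would make the hdfs:// path resolve again even though Tachyon is never used.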
Update: this gets more interesting: even explicitly specifying an hdfs:// URL results in the same Tachyon error:

val tempDir = s"hdfs://$host:8020:alsTest/"
sc.setCheckpointDir(tempDir)
<same error as above>
Answer 0 (score: 3)
The problem turned out to be new VPN software that had been enabled on my system for the first time the day before. Once the VPN software was suspended, the HDFS URLs resolved correctly again.
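This fits the innermost frames of the trace: TachyonConf's constructor calls NetworkAddressUtils.getLocalIpAddress, which probes local addresses with InetAddress.isReachable, and a VPN that intercepts that traffic can make the probe throw "ConnectException: Permission denied", which then surfaces as the ExceptionInInitializerError above. A hedged sketch of that kind of probe (illustrative only; the names LocalAddressProbe and firstReachableAddress are made up here, not Tachyon's actual API):

```scala
import java.net.{InetAddress, NetworkInterface}
import scala.util.Try

object LocalAddressProbe {
  // Walk the local network interfaces and return the first address
  // that answers a reachability probe, roughly the shape of what
  // NetworkAddressUtils.getLocalIpAddress does at TachyonConf init.
  def firstReachableAddress(timeoutMs: Int): Option[InetAddress] = {
    val ifaces = NetworkInterface.getNetworkInterfaces
    while (ifaces.hasMoreElements) {
      val addrs = ifaces.nextElement().getInetAddresses
      while (addrs.hasMoreElements) {
        val addr = addrs.nextElement()
        // isReachable can throw under a restrictive network stack
        // (the VPN case here); wrap it so one bad probe does not
        // abort the whole scan the way Tachyon's unguarded call did.
        if (Try(addr.isReachable(timeoutMs)).getOrElse(false))
          return Some(addr)
      }
    }
    None
  }

  def main(args: Array[String]): Unit =
    println(firstReachableAddress(200))
}
```

The broader lesson is that a filesystem provider doing network discovery inside a static initializer makes every classpath neighbor fragile to local network conditions.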