From HDFS

Time: 2018-04-04 19:26:17

Tags: scala apache-spark hadoop hdfs

On a Spark cluster, I am running a loop and want to delete the files that were created two iterations earlier. Here is my code:

import org.apache.hadoop.fs.{FileSystem, Path}

val numIter = 10
var iteration = 0
while (iteration < numIter) {
  val b = iteration - 2
  if (b >= 0) {
    // Delete the output written two iterations ago
    val fs = FileSystem.get(sc.hadoopConfiguration)
    val edgpath = "Mypath_" + b
    fs.delete(new Path(edgpath), true)
  }

  graph.vertices.saveAsObjectFile("Mypath_" + iteration)
  iteration += 1
}

When I run this code, I get this exception:

Caused by: java.io.NotSerializableException: 
org.apache.hadoop.hdfs.DistributedFileSystem
Serialization stack:
    - object not serializable (class: 
org.apache.hadoop.hdfs.DistributedFileSystem, value: 
DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_668411888_1, 
ugi=vadivel (auth:SIMPLE)]])
    - field (class: $iw, name: fs, type: class org.apache.hadoop.fs.FileSystem)
    - object (class $iw, $iw@49637e15)
    - field (class: $iw, name: $iw, type: class $iw)
    - object (class $iw, $iw@71a504e5)
    - field (class: $iw, name: $iw, type: class $iw)
    - object (class $iw, $iw@7bc7d6dc)
    - field (class: $iw, name: $iw, type: class $iw)
    - object (class $iw, $iw@329e5b56)
    - field (class: $iw, name: $iw, type: class $iw)
    - object (class $iw, $iw@cf66fb8)
    - field (class: $iw, name: $iw, type: class $iw)
    - object (class $iw, $iw@77abf7b7)
    - field (class: $iw, name: $iw, type: class $iw)
    - object (class $iw, $iw@17b267fc)
    - field (class: $iw, name: $iw, type: class $iw)
    - object (class $iw, $iw@74e14791)
    - field (class: $line17.$read, name: $iw, type: class $iw)
    - object (class $line17.$read, $line17.$read@56f66c33)
    - field (class: $iw, name: $line17$read, type: class $line17.$read)
    - object (class $iw, $iw@45674a70)
    - field (class: $iw, name: $outer, type: class $iw)
    - object (class $iw, $iw@47d9f051)
    - field (class: $anonfun$3, name: $outer, type: class $iw)
    - object (class $anonfun$3, <function3>)
    - field (class: org.apache.spark.graphx.GraphOps$$anonfun$13, name: mapFunc$1, type: interface scala.Function3)
    - object (class org.apache.spark.graphx.GraphOps$$anonfun$13, <function3>)
    - field (class: org.apache.spark.graphx.impl.VertexRDDImpl$$anonfun$3, name: f$4, type: interface scala.Function3)
    - object (class org.apache.spark.graphx.impl.VertexRDDImpl$$anonfun$3, <function2>)
at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
at  org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:295)
... 57 more

Please share your thoughts on how to resolve this error.
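Not an official answer, but one common way this exception arises in the spark-shell: a `val fs = FileSystem.get(...)` defined at the top level of the REPL ends up as a field of the wrapper object (`$iw` in the stack trace), so any closure sent to executors, such as one passed to a GraphX operation, drags the non-serializable `DistributedFileSystem` along with it. A minimal sketch of a workaround, assuming the `sc` and `graph` values from the question (the `deleteIfExists` helper name is made up for illustration):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical helper: obtains the FileSystem on demand inside a method body
// instead of holding it in a REPL-level val, so task closures never capture it.
def deleteIfExists(conf: Configuration, pathStr: String): Boolean = {
  val fs = FileSystem.get(conf)   // created and used on the driver only
  val p  = new Path(pathStr)
  if (fs.exists(p)) fs.delete(p, true) else false
}

val numIter = 10
var iteration = 0
while (iteration < numIter) {
  val b = iteration - 2
  if (b >= 0) {
    deleteIfExists(sc.hadoopConfiguration, "Mypath_" + b)
  }
  graph.vertices.saveAsObjectFile("Mypath_" + iteration)
  iteration += 1
}
```

Alternatively, marking the field `@transient` (`@transient val fs = ...`) tells the serializer to skip it; the method-scoped version above avoids the capture entirely, which is usually the safer choice in the REPL.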

0 Answers:

There are no answers.