How to prevent Hadoop's HDFS API from creating parent directories?

Posted: 2017-12-17 17:04:33

Tags: scala hadoop hdfs mkdirs

I want the HDFS call to fail if the parent directory does not exist when I create a child directory. When I use FileSystem#mkdirs, I find that no exception is raised; instead, the missing parent directories are silently created:

import java.util.UUID
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val conf = new Configuration()
conf.set("fs.defaultFS", s"hdfs://$host:$port")

val fileSystem = FileSystem.get(conf)
val cwd = fileSystem.getWorkingDirectory

// Guarantee non-existence by appending two UUIDs.
val dirToCreate = new Path(cwd, new Path(UUID.randomUUID.toString, UUID.randomUUID.toString))

fileSystem.mkdirs(dirToCreate)

Without the overhead of an explicit existence check, how can I force HDFS to throw an exception if the parent directory does not exist?

1 Answer:

Answer 0 (score: 0)

The FileSystem API does not support this kind of behavior. Instead, use FileContext#mkdir, which accepts an explicit createParent flag; for example:

import java.util.UUID
import org.apache.hadoop.fs.{FileContext, Path}
import org.apache.hadoop.fs.permission.FsPermission

val files = FileContext.getFileContext()
val cwd = files.getWorkingDirectory
val permissions = new FsPermission("644")
val createParent = false

// Guarantee non-existence by appending two UUIDs.
val dirToCreate = new Path(cwd, new Path(UUID.randomUUID.toString, UUID.randomUUID.toString))

files.mkdir(dirToCreate, permissions, createParent)

The example above will throw:

java.io.FileNotFoundException: Parent directory doesn't exist: /user/erip/f425a2c9-1007-487b-8488-d73d447c6f79
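If a missing parent should be treated as a recoverable condition rather than an unhandled error, the FileNotFoundException can be caught around the FileContext#mkdir call. A minimal sketch, assuming a FileContext has already been obtained as above; the tryMkdirNoParents helper name is my own, not part of the Hadoop API:

```scala
import java.io.FileNotFoundException
import org.apache.hadoop.fs.{FileContext, Path}
import org.apache.hadoop.fs.permission.FsPermission

// Hypothetical helper: returns true if the directory was created,
// false if its parent directory did not exist.
def tryMkdirNoParents(files: FileContext, dir: Path): Boolean =
  try {
    // createParent = false makes mkdir fail instead of creating parents.
    files.mkdir(dir, new FsPermission("644"), false)
    true
  } catch {
    case _: FileNotFoundException => false
  }
```

This keeps the failure local to the caller, which can then decide whether to create the parent explicitly or abort.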