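I am trying to count how many files exist in an S3 bucket. I have a list of paths as a Seq that I want to check. I tried filtering the paths and counting the result, but I keep getting an error.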
import java.net.URI
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.fs.Path
val files: Seq[String] = Vector("s3://dv-service-prod-na/output/sample/test/data/2016/12/01/*/*", "s3://dv-service-prod-na/output/sample/test/data/2016/12/02/*/*", "s3://dv-service-prod-na/output/sample/test/data/2016/12/03/*/*", "s3://dv-service-prod-na/output/sample/test/data/2016/12/04/*/*", "s3://dv-service-prod-na/output/sample/test/data/2016/12/05/*/*")
val filePath = files.map(x=> x.split("/\\*/\\*"))
val input = "s3n://dv-service-prod-na"
val missingPath = filePath.filter(x => (FileSystem.get(new URI(input), sc.hadoopConfiguration).exists(new Path(x))).equals(false)).count
Error:
<console>:92: error: overloaded method constructor Path with alternatives: (x$1: java.net.URI)org.apache.hadoop.fs.Path <and> (x$1: String)org.apache.hadoop.fs.Path cannot be applied to (Array[String])
Answer 0 (score: 3)
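You probably want to flatten after splitting. split returns an Array[String], so map produces a Seq[Array[String]] and each x passed to new Path is an array rather than a String; flatMap gives you a Seq[String] instead: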
val filePath = files.flatMap(x=> x.split("/\\*/\\*"))
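To make the type difference concrete, here is a minimal sketch that runs without Spark or Hadoop. The shortened bucket paths and the exists helper are hypothetical stand-ins (they are not from the question); in the real code the check would be the FileSystem.exists call shown above.

```scala
object FlatMapAfterSplit {
  // Shortened, hypothetical paths standing in for the real S3 globs.
  val files: Seq[String] = Vector(
    "s3://bucket/data/2016/12/01/*/*",
    "s3://bucket/data/2016/12/02/*/*"
  )

  // split returns Array[String], so map yields a Seq[Array[String]].
  val nested: Seq[Array[String]] = files.map(x => x.split("/\\*/\\*"))

  // flatMap concatenates those arrays into a Seq[String],
  // which is what the Path(String) constructor accepts.
  val flat: Seq[String] = files.flatMap(x => x.split("/\\*/\\*"))

  // Hypothetical stand-in for FileSystem.exists, for illustration only.
  def exists(path: String): Boolean = path.endsWith("01")

  // Count the paths for which the check fails.
  val missingCount: Int = flat.count(p => !exists(p))

  def main(args: Array[String]): Unit = {
    println(flat)
    println(missingCount)
  }
}
```

Using count with a negated predicate replaces the filter(...).equals(false)).count chain, but that is only a stylistic choice; the flatMap is what resolves the compile error.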