I used the code from the linked answer "Flatten a DataFrame in Scala with different DataTypes inside ..." to flatten a nested DataFrame, and I am getting the following error:
Exception in thread "main" org.apache.spark.sql.AnalysisException: Reference 'alternateIdentificationQualifierCode' is ambiguous, could be: alternateIdentificationQualifierCode#2, alternateIdentificationQualifierCode#11;
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolve(LogicalPlan.scala:287)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveChildren(LogicalPlan.scala:171)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$apply$10$$anonfun$applyOrElse$4$$anonfun$26.apply(Analyzer.scala:470)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$apply$10$$anonfun$applyOrElse$4$$anonfun$26.apply(Analyzer.scala:470)
    at org.apache.spark.sql.catalyst.analysis.package$.withPosition(package.scala:48)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$apply$10$$anonfun$applyOrElse$4.applyOrElse(Analyzer.scala:470)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$apply$10$$anonfun$applyOrElse$4.applyOrElse(Analyzer.scala:466)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:334)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:281)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
Is there any way to programmatically rename the columns of a Spark DataFrame in Scala? Thanks in advance.
代码:
import java.io.FileInputStream
import java.util.Properties

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object flatten {
  def main(args: Array[String]) {
    if (args.length < 1) {
      System.err.println("Usage: XMLParser.jar <config.properties>")
      println("Please provide the Configuration File for the XML Parser Job")
      System.exit(1)
    }
    val sc = new SparkContext(new SparkConf().setAppName("Spark XML Process"))
    val sqlContext = new HiveContext(sc)
    val prop = new Properties()
    prop.load(new FileInputStream(args(0)))
    val dfSchema = sqlContext.read
      .format("com.databricks.spark.xml")
      .option("rowTag", prop.getProperty("xmltag"))
      .load(prop.getProperty("input"))
    val flattened_DataFrame = flattenDf(dfSchema)
    // flattened_DataFrame.printSchema()
  }
}
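The error means two different nested structs both contain a leaf field named `alternateIdentificationQualifierCode`, so after flattening the plain leaf name is ambiguous. One way to avoid this is to alias each leaf with its full dotted path while flattening. The sketch below is an assumed implementation of a `flattenDf` helper (the linked answer's actual code is not shown here), not the exact code the question used:

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.StructType

// Recursively collect every leaf column, aliasing each one with its
// full dotted path (dots replaced by underscores) so that leaves with
// the same name under different parent structs stay distinct.
def flattenDf(df: DataFrame): DataFrame = {
  def leaves(schema: StructType, prefix: String): Seq[Column] =
    schema.fields.flatMap { f =>
      val path = if (prefix.isEmpty) f.name else s"$prefix.${f.name}"
      f.dataType match {
        case st: StructType => leaves(st, path)
        case _              => Seq(col(path).alias(path.replace('.', '_')))
      }
    }
  df.select(leaves(df.schema, ""): _*)
}
```

Because every output column carries its parent path in its name (e.g. `parentA_alternateIdentificationQualifierCode` vs. `parentB_alternateIdentificationQualifierCode`), no two flattened columns collide.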
Answer 0 (score: 1)
Use

val renamed_df = df.toDF(Seq("col1", "col2", "col3"): _*)

to rename the columns. Note that `toDF` takes varargs (`String*`), so a `Seq` of names must be expanded with `: _*`; you can also pass the names directly, as in `df.toDF("col1", "col2", "col3")`. The number of names must match the number of columns.
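Building on that, the renaming can be done fully programmatically from the existing schema, which helps when the flattened DataFrame has many or duplicated names. A small illustrative sketch (the sample data and name scheme here are assumptions, and `sqlContext.implicits._` matches the Spark 1.x `HiveContext` setup in the question):

```scala
import sqlContext.implicits._

// Illustrative input with generic names.
val df = Seq((1, "a"), (2, "b")).toDF("old1", "old2")

// Varargs form: pass the new names directly.
val renamed = df.toDF("col1", "col2")

// Programmatic form: derive names from df.columns, e.g. suffixing the
// position to guarantee uniqueness, then expand the Seq with ": _*".
val newNames = df.columns.zipWithIndex.map { case (n, i) => s"${n}_$i" }
val renamed2 = df.toDF(newNames: _*)
```

Deriving `newNames` from `df.columns` is what makes this robust against the ambiguous-reference error: you can deduplicate or prefix names in one pass instead of listing them by hand.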