DataFrame.write.parquet() throws NPE

Time: 2016-02-01 07:57:45

Tags: apache-spark spark-dataframe

Source DataFrame: loaded from a CSV file with spark-csv (pulled in via Maven).

I read the file with sqlctx.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load(FileName). After reading, DataFrame.printSchema() displays the schema correctly.

However, calling DataFrame.write.parquet(path) crashes with an NPE.

Environment: the NPE only occurs on Windows; all of the code runs fine on Linux.

P.S.: DataFrame.write.save() throws the same NPE. Is this a Windows-specific issue?

1 Answer:

Answer 0: (score: 1)

You need winutils.exe installed in the Hadoop installation that $HADOOP_HOME points to. It comes with a Hadoop build that ships winutils.exe in its bin folder. Installing it resolves the NPE.
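To check the setup before writing, here is a minimal sketch in plain Scala (no Spark dependency) that looks for winutils.exe under the configured Hadoop home. The object name WinutilsCheck and its helper methods are hypothetical names made up for illustration. The sketch assumes the Hadoop home is resolved from the hadoop.home.dir JVM property first and the HADOOP_HOME environment variable second.

```scala
import java.nio.file.{Files, Path, Paths}

// Hypothetical helper, not part of Spark or Hadoop.
object WinutilsCheck {
  // Assumed lookup order: JVM property hadoop.home.dir, then the
  // HADOOP_HOME environment variable. Empty values count as unset.
  def hadoopHome(props: Map[String, String], env: Map[String, String]): Option[String] =
    props.get("hadoop.home.dir").orElse(env.get("HADOOP_HOME")).filter(_.nonEmpty)

  // winutils.exe is expected in the bin folder of the Hadoop home.
  def winutilsPath(home: String): Path = Paths.get(home, "bin", "winutils.exe")

  // True only if the file actually exists at that location.
  def hasWinutils(home: String): Boolean = Files.isRegularFile(winutilsPath(home))

  def main(args: Array[String]): Unit = {
    hadoopHome(sys.props.toMap, sys.env) match {
      case Some(h) if hasWinutils(h) =>
        println(s"winutils.exe found at ${winutilsPath(h)}")
      case Some(h) =>
        println(s"Hadoop home is $h but ${winutilsPath(h)} is missing")
      case None =>
        println("Neither hadoop.home.dir nor HADOOP_HOME is set")
    }
  }
}
```

Running this on the Windows machine before calling DataFrame.write.parquet(path) shows whether the missing piece is the environment variable or the binary itself.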