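How can I use RDD.saveAsTextFile to save a delimited text file? I also need to write the DataFrame column names as a header line. How can I do that?
For a large RDD, is there a simpler way than the code below?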
// collect() pulls every row onto the driver, so this only works when the result fits in driver memory.
List<Row> data = resultFrame.toJavaRDD().collect();
// try-with-resources closes the writer even if a write fails; FileWriter creates the file if it is missing.
try (BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter(fileName))) {
    for (Row dataRow : data) {
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < dataRow.size(); i++) {
            row.append(dataRow.get(i));
            // Put the delimiter between values, not after the last one.
            if (i != dataRow.size() - 1) {
                row.append("~");
            }
        }
        bufferedWriter.write(row.toString());
        bufferedWriter.write("\n");
    }
} catch (IOException e) {
    // Pass the exception along so the cause is not lost.
    LOGGER.error("Error in writing to the ruf file", e);
}
Answer 0: (score: 0)
Just as you read with SQLContext.read (Java API), you need to write with DataFrame.write (Java API).
The other approaches are deprecated (e.g. SQLContext.parquetFile, SQLContext.jsonFile).
Answer 1: (score: 0)
Thanks for the reply. The following works:
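import org.apache.spark.api.java.function.Function;
import org.apache.spark.sql.Row;

// Turns each Row into a single "~"-delimited line.
public class TildaDelimiter implements Function<Row, String> {
    @Override
    public String call(Row r) {
        return r.mkString("~");
    }
}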
In my save step, I did the following to write a ~-delimited file:
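// coalesce(1, true) shuffles all rows into one partition, so a single part file is written under folderName.
resultFrame.toJavaRDD().map(new TildaDelimiter()).coalesce(1, true)
        .saveAsTextFile(folderName);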
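
The snippet above writes the data rows but no header line, which the question also asked for. Below is a minimal, Spark-free sketch of the formatting logic only: build a header line from the column names, then join each row's values with the same delimiter. The class DelimitedWriter and its methods toLine and write are hypothetical names made up for this illustration, not part of Spark. In a real job the column names would presumably come from resultFrame.columns(), and the rows from the DataFrame rather than a hand-built list.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper (not a Spark class): formats a header line and data
// lines with a caller-supplied delimiter.
class DelimitedWriter {

    // Joins the values with the delimiter. A null value is written as the
    // text "null", which is also what StringBuilder.append(Object) does.
    public static String toLine(String delimiter, Object... values) {
        StringJoiner joiner = new StringJoiner(delimiter);
        for (Object value : values) {
            joiner.add(String.valueOf(value));
        }
        return joiner.toString();
    }

    // Writes the header line first, then one line per row.
    public static void write(Writer out, String delimiter, String[] columns,
                             List<Object[]> rows) throws IOException {
        out.write(toLine(delimiter, (Object[]) columns));
        out.write("\n");
        for (Object[] row : rows) {
            out.write(toLine(delimiter, row));
            out.write("\n");
        }
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data standing in for the DataFrame's columns and rows.
        String[] columns = {"id", "name", "score"};
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {1, "alice", 3.5});
        rows.add(new Object[] {2, "bob", null});

        StringWriter out = new StringWriter();
        write(out, "~", columns, rows);
        System.out.print(out);
    }
}
```

Running main prints three lines: id~name~score, then 1~alice~3.5, then 2~bob~null. Like the collect-based loop in the question, this keeps all rows in memory on one machine, so it is only a sketch of the header and delimiter handling, not a replacement for a distributed write of a large result.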