Currently, I am getting the output of a Spark job in a .txt file. I am trying to convert it to .csv.
.txt output (Dataset<String>)
.csv output
John MIT Bachelor ComputerScience Mike UB Master ComputerScience
I tried collecting it into a list, but I am not sure how to convert it to .csv and add a header.
Answer 0 (score: 0)
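Here is a simple way to convert the txt output data into a data structure (which can then easily be written to a csv file).
The basic idea is to use a data structure together with the number of headers/columns to parse the individual records out of the single-line txt output.
Have a look at the code comments. Every "TODO 4 U" marks work left for you, mainly because I cannot really guess what you need to do at those places in the code (e.g. how to obtain the headers).
This is just a main method that works as it is. You may want to understand what it does and change it so that the code meets your requirements. Input and output are just Strings that you have to create, receive or process yourself.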
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// the class name is arbitrary; it only wraps the main method so the code compiles
public class TxtToCsv {

    public static void main(String[] args) {
        // TODO 4 U: get the values for the header somehow
        String headerLine = "NAME, UNIV, DEGREE, COURSE";
        // TODO 4 U: read the txt output
        String txtOutput = "John MIT Bachelor ComputerScience Mike UB Master ComputerScience";
        /*
         * then split the header line
         * (or do anything similar, I don't know where your header comes from)
         */
        String[] headers = headerLine.split(", ");
        // store the number of headers, which is the number of columns
        int amountOfColumns = headers.length;
        // split txt output data by space
        String[] data = txtOutput.split(" ");
        /*
         * declare a data structure that stores lists of Strings,
         * each one representing a line of the csv file
         */
        Map<Integer, List<String>> linesForCsv = new TreeMap<>();
        // create a list of Strings containing the headers and put it into the data structure
        List<String> columnHeaders = Arrays.asList(headers);
        linesForCsv.put(0, columnHeaders);
        // declare a line counter for the csv file
        int lineCounter = 0;
        // go through the txt output data in order to get the lines for the csv file
        for (int i = 0; i < data.length; i++) {
            // every time the number of columns is reached, start a new line in the csv file
            if (i % amountOfColumns == 0) {
                lineCounter++; // increment first, because the header row already occupies 0
                linesForCsv.put(lineCounter, new ArrayList<>()); // create a new line-list
            }
            // add the data to the current line-list
            linesForCsv.get(lineCounter).add(data[i]);
        }
        // print the lines stored in the map
        // TODO 4 U: write this to a csv file instead of just printing it to the console
        linesForCsv.forEach((lineNumber, line) ->
                System.out.println("Line " + lineNumber + ": " + String.join(",", line)));
    }
}
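The last "TODO 4 U" (writing to a file instead of the console) could be handled roughly as in the sketch below. It is only one possible way, not necessarily what fits your job. The class name CsvWriterSketch, the helper toCsvLines and the file name output.csv are placeholders I made up for illustration. It assumes the values contain no commas or quotes. If they can, the fields would need proper CSV quoting, which this sketch does not do.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// hypothetical helper class, not part of the original answer
public class CsvWriterSketch {

    // builds the csv lines: the header first, then one line per group of "columns" values
    public static List<String> toCsvLines(String headerLine, String txtOutput) {
        String[] headers = headerLine.split(",\\s*");
        int columns = headers.length;
        String trimmed = txtOutput.trim();
        // an empty input yields no data lines at all
        String[] data = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");

        List<String> lines = new ArrayList<>();
        lines.add(String.join(",", headers));
        for (int start = 0; start < data.length; start += columns) {
            // the last group may be shorter if the input is incomplete
            int end = Math.min(start + columns, data.length);
            lines.add(String.join(",", Arrays.copyOfRange(data, start, end)));
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = toCsvLines(
                "NAME, UNIV, DEGREE, COURSE",
                "John MIT Bachelor ComputerScience Mike UB Master ComputerScience");
        // "output.csv" is a placeholder path, change it to wherever the file should go
        Path target = Paths.get("output.csv");
        // writes each list element as one line of the file
        Files.write(target, lines, StandardCharsets.UTF_8);
    }
}
```

Unlike the console version, the lines here carry no "Line n:" prefix, so the file contains only the header row followed by the data rows.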