I have a CSV file full of data downloaded from Fitbit. The data in the file follows this basic format:
<Type of Data>
<Columns-comma-separated>
<Data-related-to-columns>
Here is a small example of the file layout:
Activities
Date,Calories Burned,Steps,Distance,Floors,Minutes Sedentary,Minutes Lightly Active,Minutes Fairly Active,Minutes Very Active,Activity Calories
"2016-07-17","3,442","9,456","4.41","12","612","226","18","44","1,581"
"2016-07-18","2,199","7,136","3.33","10","370","93","12","46","1,092"
...other logs
Sleep
Date,Minutes Asleep,Minutes Awake,Number of Awakenings,Time in Bed
"2016-07-17","418","28","17","452"
"2016-07-18","389","26","10","419"
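Right now I am using CSVParser from the Apache Commons CSV library to read this data. My goal is to turn it into Java objects that I can then convert to JSON (I need to upload the JSON to a different website). CSVParser has an iterator that I can use to walk through the CSVRecords in the file, so in effect I have a "list" of all the data.
Because the file contains different types of data (sleep logs, activity logs, and so on), I need to take a subsection/sublist of the file and pass it to a class for analysis.
I need to walk through the list and look for the keywords that mark a new section of the file (for example Activities, Foods, Sleep). Once I have identified what the next section is, I need to select all of the following rows up to the next category.
Now for the actual question: I don't know how to get the equivalent of List.subList() using an iterator. This is what I have been trying: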
while (iterator.hasNext())
{
    CSVRecord current = iterator.next();
    if (current.get(0).equals("Activities"))
    {
        iterator.next(); //Columns
        while (iterator.hasNext() && iterator.next().get(0).isData()) //isData isn't real, but I can't figure out what I need to do.
        {
            //How do I sublist it here?
        }
    }
}
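So I need to determine whether the next CSVRecord starts with a quote/contains data, keep looping until I reach the next category, and finally pass that subsection of the file (using the iterator) to another function that does something with the right log.
I have considered using a while loop to convert everything to a List first and then taking sublists, but that seems wasteful. Please correct me if I am wrong.
Also, I cannot assume that every section is followed by the same number of rows. The sections may look similar, but there are also the food logs, which follow a completely different pattern. Below are two different days. Foods follows the normal pattern, but the Food Log sections do not.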
Foods
Date,Calories In
"2016-07-17","0"
"2016-07-18","1,101"
Food Log 20160717
Daily Totals
"","Calories","0"
"","Fat","0 g"
"","Fiber","0 g"
"","Carbs","0 g"
"","Sodium","0 mg"
"","Protein","0 g"
"","Water","0 fl oz"
Food Log 20160718
Meal,Food,Calories
"Lunch"
"","Raspberry Yogurt","190"
"","Almond Sweet & Salty Granola Bar","140"
"","Goldfish Baked Snack Crackers, Cheddar","140"
"","Bagels, Whole Wheat","190"
"","Braided Twists Honey Wheat Pretzels","343"
"","Apples, raw, gala, with skin - 1 medium","98"
"Daily Totals"
"","Calories","1,101"
"","Fat","21 g"
"","Fiber","13 g"
"","Carbs","202 g"
"","Sodium","1,538 mg"
"","Protein","28 g"
"","Water","24 fl oz"
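Answer 0 (score: 1)
The simplest way to do what you want is to remember the data of the current category as you go; when you hit a new category, process the data collected for the previous one and reset for the next. This should work: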
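import java.util.ArrayList;
import java.util.List;
import org.apache.commons.collections4.CollectionUtils; // Commons Collections 4.x; 3.x uses org.apache.commons.collections
import org.apache.commons.csv.CSVRecord;

String categoryName = null;
List<List<String>> categoryData = new ArrayList<>();
while (iterator.hasNext()) {
    CSVRecord current = iterator.next();
    // Note: Food Log sub-headings such as "Lunch" also have a single field,
    // so this check treats them as categories too.
    if (current.size() == 1) { // a single-field record starts the next category
        processCategory(categoryName, categoryData); // flush the previous category
        categoryName = current.get(0);
        categoryData = new ArrayList<>(); // fresh list, so the one already handed over is not cleared
        if (iterator.hasNext()) {
            iterator.next(); // skip the column header row
        }
    } else { // category data
        List<String> rowData = new ArrayList<>(current.size());
        CollectionUtils.addAll(rowData, current.iterator()); // uses Apache Commons Collections, but you can use whatever
        categoryData.add(rowData);
    }
}
processCategory(categoryName, categoryData); // last category of the file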
And then:
void processCategory(String categoryName, List<List<String>> categoryData) {
    if (categoryName != null) { // null only on the first call, before any category has been read
        // do stuff
    }
}
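The above assumes that List<List<String>> is the data structure you want to process, but you can adjust it as needed. I would even suggest simply passing a List<Iterable<String>> to the process method (CSVRecord implements Iterable<String>) and handling the row data there.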
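As a rough sketch of that variant, the loop could buffer the records themselves and hand each finished section to a callback. The class name SectionSplitter, the method forEachSection and the BiConsumer handler are made up for illustration; the code only relies on each record being an Iterable<String>, so it keeps the same assumption as before that the record after a section title is a column header:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiConsumer;

final class SectionSplitter {

    /**
     * Walks the records once and calls handler with each section name and its rows.
     * Assumption: a single-field record is a section title, and the record right
     * after it is a column header that can be skipped.
     */
    static void forEachSection(Iterator<? extends Iterable<String>> records,
                               BiConsumer<String, List<Iterable<String>>> handler) {
        String name = null;
        List<Iterable<String>> rows = new ArrayList<>();
        while (records.hasNext()) {
            Iterable<String> record = records.next();
            if (fieldCount(record) == 1) {
                if (name != null) {
                    handler.accept(name, rows); // previous section is complete
                }
                name = record.iterator().next();
                rows = new ArrayList<>();
                if (records.hasNext()) {
                    records.next(); // skip the column header row
                }
            } else {
                rows.add(record); // keep the record as-is; the handler reads its fields
            }
        }
        if (name != null) {
            handler.accept(name, rows); // last section of the file
        }
    }

    // Counts fields by iterating, so any Iterable<String> row type works.
    private static int fieldCount(Iterable<String> record) {
        int count = 0;
        for (String ignored : record) {
            count++;
        }
        return count;
    }
}
```

With Commons CSV, the iterator you already have from the parser should fit the first parameter, and the handler would play the role of processCategory.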
This can definitely be cleaned up further, but it should get you started.
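One possible cleanup, given the Food Log sample in the question: records such as "Lunch" and "Daily Totals" also have a single field, so a size-based check would mistake them for new sections. A minimal sketch of a stricter check follows. The class SectionTitles and the list of known titles are assumptions taken only from the sample data shown in the question; a real export may contain other section names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SectionTitles {

    // Titles that appear in the sample export; extend this for other log types (assumption).
    private static final Set<String> KNOWN_TITLES =
            new HashSet<>(Arrays.asList("Activities", "Sleep", "Foods"));

    // Per-day food logs are titled "Food Log yyyymmdd" in the sample.
    private static final String FOOD_LOG_PREFIX = "Food Log ";

    /** Returns true only for single-field records whose value is a known section title. */
    static boolean isSectionTitle(List<String> fields) {
        if (fields.size() != 1) {
            return false;
        }
        String value = fields.get(0);
        return KNOWN_TITLES.contains(value) || value.startsWith(FOOD_LOG_PREFIX);
    }

    /** Food logs have their own layout, so callers may want to route them to a separate handler. */
    static boolean isFoodLog(String title) {
        return title.startsWith(FOOD_LOG_PREFIX);
    }
}
```

You would call isSectionTitle in place of the size check, and use isFoodLog to decide whether the line after the title is really a header, since the 20160717 log in the sample goes straight to "Daily Totals".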