How do I take a "slice" of a List that only has an iterator?

Time: 2016-07-18 20:20:56

Tags: java list csv fitbit apache-commons-csv

I have a CSV file full of data downloaded from Fitbit. The data in the CSV file follows a basic format:

<Type of Data>
<Columns-comma-separated>
<Data-related-to-columns>

Here is a small example of the file layout:

Activities
Date,Calories Burned,Steps,Distance,Floors,Minutes Sedentary,Minutes Lightly Active,Minutes Fairly Active,Minutes Very Active,Activity Calories
"2016-07-17","3,442","9,456","4.41","12","612","226","18","44","1,581"
"2016-07-18","2,199","7,136","3.33","10","370","93","12","46","1,092"
...other logs
Sleep
Date,Minutes Asleep,Minutes Awake,Number of Awakenings,Time in Bed
"2016-07-17","418","28","17","452"
"2016-07-18","389","26","10","419"

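Right now I'm using CSVParser from the Apache Commons CSV library to go through this data. My goal is to convert it into Java objects that can then be turned into JSON (I need to upload the JSON to a different website). CSVParser has an iterator that I can use to iterate over the CSVRecords in the file. So, essentially, I have a "list" of all the data.
Because the file contains different types of data (sleep logs, activity logs, etc.), I need to get a subsection/sublist of the file and pass it to a class for analysis.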

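I need to iterate through the list and look for the keywords that identify a new section of the file (e.g. Activities, Foods, Sleep, etc.). Once I've identified what the next section of the file is, I need to select all the following rows up to the next category.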

Now, for the question in the title: I don't know how to get the equivalent of List.subList() using an iterator. This is what I've been trying:

while (iterator.hasNext())
{
    CSVRecord current = iterator.next();
    if (current.get(0).equals("Activities"))
    {
        iterator.next(); //Columns
        while (iterator.hasNext() && iterator.next().get(0).isData()) //isData isn't real, but I can't figure out what I need to do.
        {
            //How do I sublist it here?
        }
    }
}

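So, I need to determine whether the next CSVRecord starts with a quote/has data, keep looping until I find the next category, and finally pass that subsection of the file (using the iterator) to another function that does something with the right log.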
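One difficulty with the loop above is that calling iterator.next() inside the while condition consumes the record being tested, so the first row of the next category is lost. A minimal sketch of a one-element lookahead wrapper follows. PeekingIterator and SectionReader are hypothetical helpers written for this illustration (they are not part of Commons CSV), each record is modelled as a List<String> standing in for CSVRecord, and "a category title is a record with exactly one field" is an assumption taken from the sample data.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper: wraps any Iterator and adds a one-element lookahead. */
final class PeekingIterator<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    private T buffered;
    private boolean hasBuffered;

    PeekingIterator(Iterator<T> delegate) {
        this.delegate = delegate;
    }

    /** Returns the next element without consuming it. */
    T peek() {
        if (!hasBuffered) {
            buffered = delegate.next(); // throws NoSuchElementException when exhausted
            hasBuffered = true;
        }
        return buffered;
    }

    @Override
    public boolean hasNext() {
        return hasBuffered || delegate.hasNext();
    }

    @Override
    public T next() {
        if (hasBuffered) {
            T result = buffered;
            buffered = null;
            hasBuffered = false;
            return result;
        }
        return delegate.next();
    }
}

final class SectionReader {
    /** Assumption: a category title is a record with exactly one field. */
    static boolean isSectionStart(List<String> record) {
        return record.size() == 1;
    }

    /** Consumes rows up to, but not including, the next category title. */
    static List<List<String>> readSectionRows(PeekingIterator<List<String>> it) {
        List<List<String>> rows = new ArrayList<>();
        while (it.hasNext() && !isSectionStart(it.peek())) {
            rows.add(it.next());
        }
        return rows;
    }
}
```

After the title and header have been read with next(), readSectionRows returns that section's rows and leaves the iterator positioned on the next title, so the outer loop can pick it up.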

Edit

I considered using a while loop to convert it to a List first and then taking a sublist, but that seems wasteful. Correct me if I'm wrong.
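For reference, here is roughly what that list-first approach could look like. The cost is one pass to copy the records into memory; List.subList itself returns a view backed by the original list and copies nothing. IteratorSlices is a hypothetical helper for this sketch, records are modelled as List<String> in place of CSVRecord, and the one-field-title rule is again an assumption. CSVParser also has a getRecords() method that reads all records into a List, which would replace toList here.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class IteratorSlices {
    /** Copies every remaining element of the iterator into a new list. */
    static <T> List<T> toList(Iterator<T> iterator) {
        List<T> result = new ArrayList<>();
        iterator.forEachRemaining(result::add);
        return result;
    }

    /** Index of the first one-field record at or after 'from', or records.size(). */
    static int nextSectionStart(List<List<String>> records, int from) {
        for (int i = from; i < records.size(); i++) {
            if (records.get(i).size() == 1) {
                return i;
            }
        }
        return records.size();
    }

    /** Data rows of the section whose title sits at titleIndex (title and header skipped). */
    static List<List<String>> sectionRows(List<List<String>> records, int titleIndex) {
        int end = nextSectionStart(records, titleIndex + 1);
        int start = Math.min(titleIndex + 2, end); // clamp for sections with no rows
        return records.subList(start, end);        // a view, not a copy
    }
}
```

Whether this is "wasteful" depends mostly on file size: for a personal Fitbit export, holding every record in memory is unlikely to matter, but that is a judgment call rather than something measured here.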

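Also, I can't assume that every section is followed by the same number of rows. They may be similar, but there are also the food logs, which follow a completely different pattern. Here are two different days. Foods follows the normal pattern, but the Food Logs don't.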

Foods
Date,Calories In
"2016-07-17","0"
"2016-07-18","1,101"

Food Log 20160717
Daily Totals
"","Calories","0"
"","Fat","0 g"
"","Fiber","0 g"
"","Carbs","0 g"
"","Sodium","0 mg"
"","Protein","0 g"
"","Water","0 fl oz"

Food Log 20160718
Meal,Food,Calories
"Lunch"
"","Raspberry Yogurt","190"
"","Almond Sweet & Salty Granola Bar","140"
"","Goldfish Baked Snack Crackers, Cheddar","140"
"","Bagels, Whole Wheat","190"
"","Braided Twists Honey Wheat Pretzels","343"
"","Apples, raw, gala, with skin - 1 medium","98"
"Daily Totals"
"","Calories","1,101"
"","Fat","21 g"
"","Fiber","13 g"
"","Carbs","202 g"
"","Sodium","1,538 mg"
"","Protein","28 g"
"","Water","24 fl oz"

1 answer:

Answer 0 (score: 1)

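The easiest way to do what you want is to simply remember the data of the previous category; when you hit a new category, process the previous category's data and reset for the next one. This should work: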

String categoryName = null;
List<List<String>> categoryData = new ArrayList<>();
while (iterator.hasNext()) {
    CSVRecord current = iterator.next();
    if (current.size() == 1) { //start of next category
        processCategory(categoryName, categoryData);
        categoryName = current.get(0);
        categoryData.clear();
        if (iterator.hasNext()) iterator.next(); //skip header, guarding against a title on the last line
    } else { //category data
        List<String> rowData = new ArrayList<>(current.size());
        CollectionUtils.addAll(rowData, current.iterator()); //uses Apache Commons Collections, but you can use whatever
        categoryData.add(rowData);
    }
}
processCategory(categoryName, categoryData); //last category of file

Then:

void processCategory(String categoryName, List<List<String>> categoryData) {
    if (categoryName != null) { //null on the first call, before any category has been read; skip
        //do stuff
    }
}

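The above assumes that List<List<String>> is the data structure you want to work with, but you can adapt it as needed. I'd even suggest simply passing a List<Iterable<String>> to the process method (CSVRecord implements Iterable<String>) and handling the row data there.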
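As a rough illustration of that last suggestion, a processing method can accept rows as Iterable<String> and walk the fields itself. RowProcessing and sumColumn are hypothetical names invented for this sketch; the bounded wildcard is there so that a list of any Iterable<String> implementation (a List<CSVRecord>, or the List<List<String>> used above) can be passed without conversion. Stripping the thousands separator assumes values formatted like "3,442" in the sample data.

```java
import java.util.List;

final class RowProcessing {
    /**
     * Sums one column across all rows. Fields such as "3,442" are assumed to be
     * integers with a comma as thousands separator.
     */
    static long sumColumn(List<? extends Iterable<String>> rows, int column) {
        long total = 0;
        for (Iterable<String> row : rows) {
            int index = 0;
            for (String field : row) {
                if (index++ == column) {
                    total += Long.parseLong(field.replace(",", ""));
                    break; // the remaining fields of this row are not needed
                }
            }
        }
        return total;
    }
}
```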

This can definitely be cleaned up further, but it should get you started.
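For anyone who wants to try the idea without the Commons CSV dependency, here is a self-contained sketch of the same grouping logic. It is a variation rather than the code above: records are plain List<String> values standing in for CSVRecord, CategoryGrouper is a hypothetical class name, and each category gets its own fresh list (instead of clearing a shared one) so the results can be kept in a map. It shares the assumption that every one-field record is a category title followed by a header row, which the Food Log sections in the question (with one-field rows such as "Lunch") do not satisfy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CategoryGrouper {
    /**
     * Groups rows under the most recent one-field record (the category title).
     * The record right after each title is treated as a column header and skipped.
     */
    static Map<String, List<List<String>>> group(Iterator<List<String>> iterator) {
        Map<String, List<List<String>>> sections = new LinkedHashMap<>();
        List<List<String>> currentRows = null;
        while (iterator.hasNext()) {
            List<String> current = iterator.next();
            if (current.size() == 1) {            // start of next category
                currentRows = new ArrayList<>();
                sections.put(current.get(0), currentRows);
                if (iterator.hasNext()) {
                    iterator.next();              // skip header
                }
            } else if (currentRows != null) {     // category data
                currentRows.add(new ArrayList<>(current));
            }
        }
        return sections;
    }

    public static void main(String[] args) {
        List<List<String>> records = Arrays.asList(
                Arrays.asList("Activities"),
                Arrays.asList("Date", "Steps"),
                Arrays.asList("2016-07-17", "9,456"),
                Arrays.asList("2016-07-18", "7,136"),
                Arrays.asList("Sleep"),
                Arrays.asList("Date", "Minutes Asleep"),
                Arrays.asList("2016-07-17", "418"),
                Arrays.asList("2016-07-18", "389"));
        // Prints "Activities: 2 rows" and then "Sleep: 2 rows".
        group(records.iterator()).forEach(
                (name, rows) -> System.out.println(name + ": " + rows.size() + " rows"));
    }
}
```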