How do I read data from a URL and count frequencies into bins given in another URL?

Asked: 2014-12-08 23:04:08

Tags: java url arraylist hashmap binning

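I'm working on an assignment in which I have 2 URLs. The first one has 3 columns: the first column is the lower bin boundary, the second is the upper bin boundary, and the third is not relevant to this question and just contains another number. It looks like this.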

100 101 3.45
101 102 4.23
103 104 2.40
...
199 200 6.89

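The second URL contains 1 of 2 possible ID codes in the first column and a number in the second column, which we'll call the height. It looks like this: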

xx 108.45
xx 122.00
yy 124.78
xx 156.93

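In the earlier parts, I have already read the data and stored it in array lists, and processed some of the data from the first URL.

Now I need to find the frequency of heights in each given bin, separately for the two ID codes. I've looked around and tried a few things, but I'm not really getting what I need. I'm still not sure whether the best way to do this is with a hash map, an array list, or some other type of collection. Any help?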

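1 Answer:

Answer 0 (score: 0)

I needed some practice with collections. Thanks to @Eran's help on my other question, I was able to solve this.

I added some extra values to the test input, because the values you provided don't actually fall into any of the bins!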

package soBins;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

public class Bins {

    public static void main(String[] args) {

        String rangeIn = "100 101 3.45\n101 102 4.23\n103 104 2.40\n199 200 6.89";

        String dataIn = "xx 108.45\nxx 122.00\nyy 124.78\nxx 156.93\nzz 101.5\nxx 103.5\nzz 101.25";

        Scanner rangeScanner = new Scanner(rangeIn);
        Scanner dataScanner = new Scanner(dataIn);  

        ArrayList<Bin> bins = new ArrayList<Bin>();
        while (rangeScanner.hasNextLine()) {
            String line = rangeScanner.nextLine();
            String[] tokens = line.split(" ");
            int min = Integer.parseInt(tokens[0]);
            int max = Integer.parseInt(tokens[1]);
//          System.out.println("Creating new bin, min "+ min + ", max "+ max);
            bins.add(new Bin(min,max));
        }
        rangeScanner.close();

        Map<String,ArrayList<Bin>> namedBins = new HashMap<String,ArrayList<Bin>>();
        while (dataScanner.hasNextLine()) {
            String line = dataScanner.nextLine();
            String[] tokens = line.split(" ");
            String name = tokens[0]; // name is first token on line
            float data = Float.parseFloat(tokens[1]); // data is second token on line

            if (!namedBins.containsKey(name)) {
                ArrayList<Bin> newBins = new ArrayList<Bin>();

                for (Bin bin : bins) {
                    newBins.add (new Bin(bin)); // using a copy constructor            
                }
                namedBins.put(name,newBins);
            }
            for (Bin b : namedBins.get(name)) {
                if (b.isInRange(data)) {
                    System.out.println("adding "+ data + " to bin in "+ name);
                    b.addData(data);
                }
            }
        }
        dataScanner.close();

        System.out.println("All bins and data contents:");
        for (String dataName : namedBins.keySet()) { // print all values and bin ranges
            for (Bin range : namedBins.get(dataName)) {
                System.out.println(dataName + ", min " + range.getMin() + ", max " + range.getMax()
                        + ", data is " + range.getData());              
            }
        }
    }
}

My Bin class:

package soBins;

import java.util.ArrayList;

public class Bin {

    int min = 0;
    int max = 0;
    ArrayList<Float> values = new ArrayList<Float>();

    Bin(int min,int max) {
        this.min = min;
        this.max = max;
    }

    public Bin(Bin bin) {
        this.min = bin.min;
        this.max = bin.max;
        for (float f : bin.values) {
            this.values.add(f);
        }
//      this.values.addAll(bin.values);  // also worked
//      this.values = (ArrayList<Float>) bin.values.clone(); // also worked but gave unchecked cast warning
    }

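    public boolean isInRange(float data) {
        // Half-open interval [min, max): a value on a shared boundary such as 101.0
        // lands in exactly one bin instead of being rejected by both neighbours.
        return (min <= data) && (data < max);
    }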

    public void addData(float data) {
        values.add(data);
    }

    public int getMin() {
        return min;
    }
    public void setMin(int min) {
        this.min = min;
    }
    public int getMax() {
        return max;
    }
    public void setMax(int max) {
        this.max = max;
    }

    public ArrayList<Float> getData() {
        return values;
    }   

}

Output:

adding 101.5 to bin in zz
adding 103.5 to bin in xx
adding 101.25 to bin in zz
All bins and data contents:
zz, min 100, max 101, data is []
zz, min 101, max 102, data is [101.5, 101.25]
zz, min 103, max 104, data is []
zz, min 199, max 200, data is []
yy, min 100, max 101, data is []
yy, min 101, max 102, data is []
yy, min 103, max 104, data is []
yy, min 199, max 200, data is []
xx, min 100, max 101, data is []
xx, min 101, max 102, data is []
xx, min 103, max 104, data is [103.5]
xx, min 199, max 200, data is []
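
The program above stores every value in its bin, while the question asks only for frequencies and for reading from URLs. A minimal sketch of that variant follows. It rests on a few assumptions that are not in the original: the class name `BinFrequencies` and its helper names are made up for illustration, each bin is treated as the half-open interval [lower, upper), and `openUrl` takes whatever address the caller supplies (the real URLs are not given in the question).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

public class BinFrequencies {

    /** Opens the given address for line-by-line reading (the address is the caller's). */
    public static Scanner openUrl(String address) {
        try {
            return new Scanner(URI.create(address).toURL().openStream(), "UTF-8");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads "lower upper extra" lines and returns one {lower, upper} pair per bin. */
    public static List<double[]> readBins(Scanner in) {
        List<double[]> bins = new ArrayList<>();
        while (in.hasNextLine()) {
            String[] tokens = in.nextLine().trim().split("\\s+");
            if (tokens.length < 2) {
                continue; // skip blank or malformed lines
            }
            bins.add(new double[] {Double.parseDouble(tokens[0]), Double.parseDouble(tokens[1])});
        }
        return bins;
    }

    /** Counts, for each ID, how many heights fall into each bin (assumed [lower, upper)). */
    public static Map<String, int[]> countPerId(List<double[]> bins, Scanner in) {
        Map<String, int[]> counts = new TreeMap<>(); // TreeMap keeps IDs in sorted order
        while (in.hasNextLine()) {
            String[] tokens = in.nextLine().trim().split("\\s+");
            if (tokens.length < 2) {
                continue;
            }
            // One counter per bin, created the first time an ID is seen.
            int[] perBin = counts.computeIfAbsent(tokens[0], id -> new int[bins.size()]);
            double height = Double.parseDouble(tokens[1]);
            for (int i = 0; i < bins.size(); i++) {
                double[] bin = bins.get(i);
                if (bin[0] <= height && height < bin[1]) {
                    perBin[i]++;
                    break; // bins do not overlap, so stop at the first match
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Same inline test input as above; with real data, pass openUrl(...) instead.
        Scanner rangeIn = new Scanner("100 101 3.45\n101 102 4.23\n103 104 2.40\n199 200 6.89");
        Scanner dataIn = new Scanner(
                "xx 108.45\nxx 122.00\nyy 124.78\nxx 156.93\nzz 101.5\nxx 103.5\nzz 101.25");

        List<double[]> bins = readBins(rangeIn);
        Map<String, int[]> counts = countPerId(bins, dataIn);
        for (Map.Entry<String, int[]> entry : counts.entrySet()) {
            System.out.println(entry.getKey() + " " + Arrays.toString(entry.getValue()));
        }
    }
}
```

With the test input above, this prints `xx [0, 0, 1, 0]`, `yy [0, 0, 0, 0]` and `zz [0, 2, 0, 0]`, where position i in each array is the count for the i-th bin read from the range data. A map from ID to a plain `int[]` is enough when only the counts matter; keeping a list of values per bin, as in the `Bin` class, is the better fit if the heights themselves are needed later.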