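I want to store information in an ArrayList. I read the data from a CSV file, but it contains duplicate records that I want to eliminate. What is the most efficient way to do this? I have considered two approaches: (1) add all the data to a Set and then convert it to an ArrayList; (2) add the items to an ArrayList, checking each time whether it already contains the same data. Here is my code: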
public static void sanitization(String file_path) throws FileNotFoundException, IOException {
File file = new File(file_path);
Set<Flight> flights_set = new HashSet<>(); //All valid flights are added to a set so that the same flight is not added twice.
//try-with-resources closes the reader even if parsing a line fails
try (BufferedReader reader = new BufferedReader(new FileReader(file))) { //read the csv file
String st;
while ((st = reader.readLine()) != null) {
String[] split = st.split(",", -2); //a negative limit keeps trailing empty fields
flights_set.add(new Flight(split[4], split[5], Integer.valueOf(split[11]), split[7], split[8], Integer.valueOf(split[0]), Integer.valueOf(split[1]), Integer.valueOf(split[2])));
}
}
//Second possible way (flights_arraylist would have to be declared as an empty list before this loop)
/*while ((st = reader.readLine()) != null) {
split = st.split(",", -2);
Flight f=new Flight(split[4], split[5], Integer.valueOf(split[11]), split[7], split[8], Integer.valueOf(split[0]), Integer.valueOf(split[1]), Integer.valueOf(split[2]));
if(!flights_arraylist.contains(f))
flights_arraylist.add(f);
}*/
ArrayList<Flight> flights_arraylist = new ArrayList<>(flights_set);
}
class Flight implements Comparable<Flight> {
//All necessary information
public String airline;
public String flight_number;
public Integer departure_delay;
public String origin_airport_name;
public String destination_airport_name;
public Integer year;
public Integer month;
public Integer day;
//Constructor
public Flight(String airline, String flight_number, Integer departure_delay, String origin_airport_name, String destination_airport_name, Integer year, Integer month, Integer day) {
this.airline = airline;
this.flight_number = flight_number;
this.departure_delay = departure_delay;
this.origin_airport_name = origin_airport_name;
this.destination_airport_name = destination_airport_name;
this.year = year;
this.month = month;
this.day = day;
}
public Flight() {
}
//Flight is bigger if its departure delay is bigger
public int compareTo(Flight o) {
if (this.departure_delay > o.departure_delay) return 1;
else if (this.departure_delay < o.departure_delay) return -1;
else return 0;
}
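@Override
public boolean equals(Object obj) {
if (this == obj) return true;
if (!(obj instanceof Flight)) return false; //also covers null and avoids a ClassCastException
Flight f = (Flight) obj;
return this.airline.equals(f.airline) && this.flight_number.equals(f.flight_number) && this.departure_delay.equals(f.departure_delay) && this.origin_airport_name.equals(f.origin_airport_name) && this.destination_airport_name.equals(f.destination_airport_name) && this.year.equals(f.year) && this.month.equals(f.month) && this.day.equals(f.day);
}
@Override
public int hashCode() {
//A constant hash code is legal but puts every Flight in the same HashSet bucket, so hash the same fields that equals compares.
return java.util.Objects.hash(airline, flight_number, departure_delay, origin_airport_name, destination_airport_name, year, month, day);
}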
@Override
public String toString() {
return this.airline + " " + this.flight_number + " " + this.departure_delay;
}
}
This is also my first question, so please let me know if I have made any mistakes.
Answer 0 (score: 1)
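You can use streams; below is an example that processes a list.
First add all the elements to a list, then stream it, collect the distinct elements, and assign the result back to the same list variable.
Example: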
List<String> strList = new ArrayList<String>();
strList.add("Alpha");
strList.add("Beta");
strList.add("Charlie");
strList.add("Delta");
strList.add("Delta");
strList.add("Delta");
strList = strList.stream().distinct().collect(Collectors.toList());
System.out.println("Without duplicate");
strList.forEach(System.out::println);
Output:
Without duplicate
Alpha
Beta
Charlie
Delta
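The same approach carries over to custom objects, because distinct() compares elements with equals() and hashCode(). The following is a minimal sketch under that assumption; SimpleFlight is a hypothetical, cut-down stand-in for the question's Flight class and is not part of the original code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class DistinctFlights {
    // Hypothetical, simplified stand-in for the question's Flight class.
    static final class SimpleFlight {
        final String airline;
        final String flightNumber;

        SimpleFlight(String airline, String flightNumber) {
            this.airline = airline;
            this.flightNumber = flightNumber;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof SimpleFlight)) return false;
            SimpleFlight other = (SimpleFlight) o;
            return airline.equals(other.airline) && flightNumber.equals(other.flightNumber);
        }

        @Override
        public int hashCode() {
            // Hash the same fields that equals() compares.
            return Objects.hash(airline, flightNumber);
        }

        @Override
        public String toString() {
            return airline + " " + flightNumber;
        }
    }

    // Keeps the first occurrence of each flight, in encounter order.
    static List<SimpleFlight> withoutDuplicates(List<SimpleFlight> flights) {
        return flights.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<SimpleFlight> flights = Arrays.asList(
                new SimpleFlight("AA", "100"),
                new SimpleFlight("AA", "100"),
                new SimpleFlight("DL", "7"));
        System.out.println(withoutDuplicates(flights)); // prints [AA 100, DL 7]
    }
}
```

If equals() and hashCode() were not overridden, the two "AA 100" objects would count as different elements and both would survive distinct().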
Answer 1 (score: 1)
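From the javadoc of java.util.Set#add: @return true if this set did not already contain the specified element. Also relevant to this answer, BufferedReader provides a lines method that returns a stream of the lines in the file. Knowing this, you can write something like: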
List<Flight> result = new ArrayList<>(); // list of your choice
Set<Flight> flightSet = new HashSet<>(); // set of your choice
BufferedReader reader; // init bufferedReader
reader.lines()
.forEach(line -> {
Flight flight;//transform into object;
if (flightSet.add(flight)) {
result.add(flight);
}
});
Or use streams throughout, mapping each line and collecting the distinct results:
BufferedReader reader; // init bufferedReader
List<Flight> distinctFlights = reader.lines()
.map(line->new Flight(/*... args*/))
.distinct() // relies on Flight.equals and Flight.hashCode
.collect(Collectors.toList());
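To make the Set#add idea concrete, here is a small self-contained sketch. It makes two assumptions that are not in the original answer: the input comes from a StringReader with made-up sample rows instead of a real file, and each parsed row is kept as a List<String> (which already has value-based equals and hashCode) in place of a Flight object.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SetAddDedup {
    // Reads CSV lines and keeps only the first occurrence of each parsed row.
    static List<List<String>> readDistinct(BufferedReader reader) {
        List<List<String>> result = new ArrayList<>();
        Set<List<String>> seen = new HashSet<>();
        reader.lines().forEach(line -> {
            // Stand-in for "transform into object": the row's columns as a list.
            List<String> row = Arrays.asList(line.split(",", -1));
            // add() returns true only when the row was not in the set yet.
            if (seen.add(row)) {
                result.add(row);
            }
        });
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data; the third line repeats the first.
        String csv = "AA,100,5\nDL,7,0\nAA,100,5\n";
        BufferedReader reader = new BufferedReader(new StringReader(csv));
        System.out.println(readDistinct(reader)); // prints [[AA, 100, 5], [DL, 7, 0]]
    }
}
```

Because the list is filled in reading order and the set is used only for the membership test, the result keeps the order of the file, which a plain HashSet converted to a list would not guarantee.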
Answer 2 (score: 0)
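To avoid duplicates in the end, you need to search the data you have already stored.
HashSet.contains() runs in O(1) time on average.
Internally, however, ArrayList.contains() uses the indexOf(object) method to check whether the object is in the list. indexOf(object) iterates over the entire backing array and compares each element using equals(object).
Returning to the complexity analysis, ArrayList.contains() takes O(n) time.
The most efficient approach is to use a Set to store the items without duplicates and then convert it to a List.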
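The two strategies can be sketched side by side as follows. This is only an illustration: the class and method names are hypothetical, and LinkedHashSet is used instead of HashSet purely so that the resulting list keeps the original order. Note also that the O(1) average for hash-based sets assumes a hashCode() that spreads elements across buckets; a constant hashCode() puts every element in one bucket and loses that benefit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

public class SetToListDedup {
    // Set-based: each insertion is O(1) on average, so the whole pass is O(n).
    static <T> List<T> dedupeWithSet(List<T> items) {
        return new ArrayList<>(new LinkedHashSet<>(items));
    }

    // List-based: contains() scans the list, so the whole pass is O(n^2) in the worst case.
    static <T> List<T> dedupeWithContains(List<T> items) {
        List<T> out = new ArrayList<>();
        for (T item : items) {
            if (!out.contains(item)) {
                out.add(item);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("Delta", "Alpha", "Delta", "Beta", "Alpha");
        System.out.println(dedupeWithSet(input));      // prints [Delta, Alpha, Beta]
        System.out.println(dedupeWithContains(input)); // prints [Delta, Alpha, Beta]
    }
}
```

Both methods return the same list; they differ only in how the cost grows with the number of rows.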