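I have a list of 1,000,000 users of various ages. I want to search it in Java and, given an age range, output the number of people in that group. For example: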
Age Group    Age Range
1            6 years old or younger
2            7 to 18 years old
3            19 to 26 years old
4            27 to 49 years old
5            50 to 64 years old
6            65 to 79 years old
7            80 years old or older
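If I enter a particular age group, I want the output to show the number of people who belong to that age group. That is:
If I enter 1
the output should be:
**** users found (total number of users that fall within the
age range 6 years old or younger)
Any kind of data structure is fine.
This is what I have done so far: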
/**
A template used to read data lines into java.util.ArrayList data structure.
Input file: pjData.csv
Input file must be saved under the same directory/folder as the program.
Each line contains 5 fields, separated by commas. For example,
959695171, 64, AZ, M, 1
355480298, 101, TN, F, 1
**/
import java.io.*;
import java.util.*;
public class pj3Template2
{
public static void main(String args[])
{
String line;
String id, s, g;
Integer a, sa;
StringTokenizer st;
HealthDS2 records = new HealthDS2();
try {
FileReader f = new FileReader("pjData.csv");
BufferedReader in = new BufferedReader(f);
while ((line = in.readLine()) != null)
{
st = new StringTokenizer(line, ",");
id = st.nextToken(",").trim();
a = Integer.valueOf(st.nextToken(",").trim());
s = st.nextToken(",").trim().toUpperCase();
g = st.nextToken(",").trim().toUpperCase();
sa = Integer.valueOf(st.nextToken().trim());
records.add(new HealthRec2(id, a, s, g, sa));
} // loop until the end of file
in.close();
f.close();
}
catch (Exception e) { e.printStackTrace(); };
System.out.println(records.getSize() + " records processed.");
// Search by age
System.out.print("Enter 1-character age abbreviation to search: ");
String ui;
Scanner input = new Scanner(System.in);
ui = input.next().trim();
System.out.println("Searching all records in: " + ui);
ArrayList <HealthRec2> al = records.searchByAge(Integer.valueOf(ui.trim()));
System.out.println(al.size() + " records found.");
}
}
/**
Data class Sample records:
5501986, 31, WV, F, 1
1539057187, 5, UT, M, 2
**/
class HealthRec2
{
String ID;
Integer age;
String state;
String gender;
int status;
public HealthRec2() { }
public HealthRec2(String i, Integer a, String s, String g, int sa)
{ ID = i; age = a; state = s; gender = g; status = sa; }
// Reader methods
public String getID() { return ID; }
public Integer getAge() { return age; }
public String getState() { return state; }
public String getGender() { return gender; }
public int getStatus() { return status; }
// Writer methods
public void setAge(Integer a) { age = a; }
public void setState(String s) { state = s; }
public void setGender(String g) { gender = g; }
public void setStatus(int sa) { status = sa; }
public String toString()
{ return ID + " " + age + " " + state + " " + gender + " " + status; }
} // HealthRec
// Data structure used to implement the requirement
// This implementation uses java.util.ArrayList
class HealthDS2
{
ArrayList <HealthRec2> rec;
public HealthDS2()
{ rec = new ArrayList <HealthRec2>(); }
public HealthDS2(HealthRec2 r)
{
rec = new ArrayList <HealthRec2>();
rec.add(r);
}
public int getSize() { return rec.size(); }
public void add(HealthRec2 r) { rec.add(r); }
// Search by age
// No data validation is needed -- assuming the 1-character age is valid
// Returns an ArrayList of records
public ArrayList <HealthRec2> searchByAge(Integer a)
{
ArrayList <HealthRec2> temp = new ArrayList <HealthRec2>();
for (int k=0; k < rec.size(); ++k)
{
if (rec.get(k).getAge().equals(a))
temp.add(rec.get(k));
}
return temp;
} // searchByAge
} // HealthDS
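My goal is to search by state, status, gender, and age group. I have already done this for the others, but I am having some trouble with the age group because it is a range rather than a specific age to look up in the data file. I tried creating seven ArrayLists, one per group, but I still run into problems when switching between groups.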
Answer 0 (score: 0)
This code does that:
For a very large data set you will want a better data structure, such as the one @kyticka mentions.
public static void main (String[] args) throws java.lang.Exception
{
int[] groupMin = new int[]{0, 10, 20};
int[] groupMax = new int[]{10, 20, 9999};
int[] ages = new int[]{ 1, 2, 3, 10, 12, 76, 56, 89 };
int targetGroup = 1;
int count = 0;
for( int age : ages ){
if( age >= groupMin[targetGroup] && age < groupMax[targetGroup] ){
count++;
}
}
System.out.println("Group " + targetGroup + " range is " +
groupMin[targetGroup] + " - " + groupMax[targetGroup]);
System.out.println("Count: " + count);
}
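You can play with it here: http://ideone.com/DAWGYX
Answer 1 (score: 0)
You can initialize your 1,000,000 users in whatever way you like; the following code simply generates a random age for each user: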
import java.util.ArrayList;
import java.util.Random;
import java.util.Scanner;
public class UserListDemo {
int age;
class Users{
int age=0;
public Users(int a)
{
age=a;
}
public void setAge(int age)
{
this.age=age;
}
public int getAge()
{
return this.age;
}
}
public static void main(String a[])
{
UserListDemo uld=new UserListDemo();
ArrayList<Users> data=new ArrayList<Users>();
uld.initializeUsers(data);
System.out.println("Enter age group choice");
System.out.println("Enter 1 for age group 6 or younger");
System.out.println("Enter 2 for age group 7-18");
System.out.println("Enter 3 for age group 19-26");
System.out.println("Enter 4 for age group 27-49");
System.out.println("Enter 5 for age group 50-64");
System.out.println("Enter 6 for age group 65-79");
System.out.println("Enter 7 for age group 80-Older");
Scanner sc=new Scanner(System.in);
String choice=sc.nextLine();
int ch=Integer.valueOf(choice);
long result=0;
// Each case ends with break; without it, execution falls through into
// the following cases and their counts are added to the result as well.
switch(ch)
{
case 1:
for(Users us:data)
{
if(us.age<=6)
result++;
}
break;
case 2:
for(Users us:data)
{
if( us.age>=7 && us.age<=18 )
result++;
}
break;
case 3:
for(Users us:data)
{
if( us.age>=19 && us.age<=26 )
result++;
}
break;
case 4:
for(Users us:data)
{
if( us.age>=27 && us.age<=49 )
result++;
}
break;
case 5:
for(Users us:data)
{
if( us.age>=50 && us.age<=64 )
result++;
}
break;
case 6:
for(Users us:data)
{
if( us.age>=65 && us.age<=79 )
result++;
}
break;
case 7:
for(Users us:data)
{
if( us.age>=80)
result++;
}
break;
}
System.out.println("For the entered age group :"+ch+" ::"+result+" user has been found");
}
public void initializeUsers(ArrayList<Users> data)
{
Users us;
Random rand=new Random();
for(long l=0;l<1000000L;l++)
{
us=new Users(rand.nextInt(100));
data.add(us);
}
}
}
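Answer 2 (score: 0)
With 1M records, the efficient answer is to use several Maps as indexes, or even an actual database. However, since the exercise explicitly mentions ArrayList, you are probably still learning the basics, so I will stick to those.
First, you need to be able to retrieve the group of a given person. You can do this in two ways.
Option A is to add the group as a field at initialization time: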
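For reference, the index idea could be sketched roughly as follows. This is only an illustration under assumptions: the class AgeGroupIndex and its methods are hypothetical names that do not appear in the question, and it uses a single TreeMap for the range lookup plus an int[] of per-group counts rather than several Maps. Counts are built once while loading, so each query afterwards is a single array read.

```java
import java.util.TreeMap;

// Hypothetical helper (not part of the original code): maps an age to its
// group number and keeps a running count per group.
class AgeGroupIndex {
    // Lowest age of each range -> group number, using the ranges from the question.
    private static final TreeMap<Integer, Integer> GROUP_BY_MIN_AGE = new TreeMap<>();
    static {
        GROUP_BY_MIN_AGE.put(Integer.MIN_VALUE, 1); // 6 or younger
        GROUP_BY_MIN_AGE.put(7, 2);
        GROUP_BY_MIN_AGE.put(19, 3);
        GROUP_BY_MIN_AGE.put(27, 4);
        GROUP_BY_MIN_AGE.put(50, 5);
        GROUP_BY_MIN_AGE.put(65, 6);
        GROUP_BY_MIN_AGE.put(80, 7);                // 80 or older
    }

    // counts[g] holds the number of ages seen in group g; index 0 is unused.
    private final int[] counts = new int[8];

    // floorEntry returns the entry with the greatest key <= age.
    static int groupOf(int age) {
        return GROUP_BY_MIN_AGE.floorEntry(age).getValue();
    }

    // Call once per record while reading the file.
    void add(int age) {
        counts[groupOf(age)]++;
    }

    // Constant-time lookup once all records have been added.
    int count(int group) {
        return counts[group];
    }

    public static void main(String[] args) {
        AgeGroupIndex index = new AgeGroupIndex();
        for (int age : new int[]{3, 6, 7, 18, 30, 64, 80, 101}) {
            index.add(age);
        }
        for (int g = 1; g <= 7; g++) {
            System.out.println("Group " + g + ": " + index.count(g) + " users found");
        }
    }
}
```

If the matching records are needed and not just the count, the int[] could be swapped for a Map from group number to a list of records, at the cost of more memory.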
// within HealthRec2
int group; // stores group number as an attribute
private static final int[] ageGroups = // upper age limit of groups 1 to 6
new int[]{6, 18, 26, 49, 64, 79};
private void updateGroup() { // <-- called from constructor and from setAge()
int currentGroup = 1; // start in the first group
for (int limit : ageGroups) {
if (age <= limit) break; // stop once the age fits under a limit
currentGroup ++; // otherwise advance to the next group
}
group = currentGroup; // ages above the last limit end up in group 7
}
public int getGroup() { return group; }
Option B is to compute it on the fly for each record instead of storing it as an attribute:
// within HealthRec2
private static final int[] ageGroups = // upper age limit of groups 1 to 6
new int[]{6, 18, 26, 49, 64, 79};
public int getGroup() {
int currentGroup = 1; // start in the first group
for (int limit : ageGroups) {
if (age <= limit) break; // stop once the age fits under a limit
currentGroup ++; // otherwise advance to the next group
}
return currentGroup; // ages above the last limit end up in group 7
}
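Whichever option you use, you can now find the people in a given age group with logic very similar to what you already use to find records from a given state or of a given gender.
Option A has a higher up-front cost: even if you never need the age group, you still have to compute it and store it in an attribute, just in case. Option B is more expensive if you need to call getGroup several times for the same record, because Option A's getGroup is much faster.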
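A search by group can then take the same shape as the existing searchByAge. The sketch below is a self-contained illustration, not the original classes: AgeGroupSearch, groupOf and searchByGroup are hypothetical names, and a plain list of ages stands in for the HealthRec2 records.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for HealthDS2; it stores bare ages instead of records.
class AgeGroupSearch {
    // Upper age limit of groups 1 to 6; anything above the last limit is group 7.
    private static final int[] AGE_LIMITS = {6, 18, 26, 49, 64, 79};

    // Returns the group number (1 to 7) for an age.
    static int groupOf(int age) {
        int group = 1;
        for (int limit : AGE_LIMITS) {
            if (age <= limit) break; // the age fits in the current group
            group++;                 // otherwise try the next group
        }
        return group;
    }

    // Same pattern as searchByAge: scan the whole list and keep the matches.
    static List<Integer> searchByGroup(List<Integer> ages, int group) {
        List<Integer> matches = new ArrayList<>();
        for (int age : ages) {
            if (groupOf(age) == group) {
                matches.add(age);
            }
        }
        return matches;
    }
}
```

With the real classes, the comparison would presumably be rec.get(k).getGroup() == group in place of the getAge().equals(a) test, and the size of the returned list gives the number of users found.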
Answer 3 (score: -1)
Idea one: sort and use binary search http://en.wikipedia.org/wiki/Binary_search
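One way this idea might look, as a sketch under assumptions: SortedAgeCounter and its methods are hypothetical names, and only the ages are stored. The ages are sorted once, then each range count is the distance between two binary-search positions. The search is written by hand because Arrays.binarySearch does not guarantee which element it finds when the array contains duplicates.

```java
import java.util.Arrays;

// Hypothetical helper: sorts the ages once, then answers range counts
// with two binary searches per query.
class SortedAgeCounter {
    private final int[] sorted;

    SortedAgeCounter(int[] ages) {
        sorted = ages.clone(); // keep the caller's array untouched
        Arrays.sort(sorted);
    }

    // Index of the first element >= key, or sorted.length if there is none.
    private int lowerBound(int key) {
        int lo = 0;
        int hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Number of ages in the inclusive range [minAge, maxAge].
    int countInRange(int minAge, int maxAge) {
        if (minAge > maxAge) {
            return 0;
        }
        int from = lowerBound(minAge);
        // Guard against overflow when maxAge is Integer.MAX_VALUE ("80 or older").
        int to = (maxAge == Integer.MAX_VALUE) ? sorted.length : lowerBound(maxAge + 1);
        return to - from;
    }
}
```

For example, group 2 would be countInRange(7, 18) and group 7 would be countInRange(80, Integer.MAX_VALUE). Sorting costs O(n log n) once; each query after that is O(log n) instead of a full scan.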