Executing a SELECT SQL query against a randomly chosen table

Time: 2013-02-24 02:10:33

Tags: java database multithreading random executorservice

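I am working on a project in which I have two tables in different databases with different schemas. That means I have two different sets of connection parameters for connecting to those two tables with JDBC.

Let's assume the following is the config.properties file: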

TABLES: table1 table2

#For Table1
table1.url: jdbc:mysql://localhost:3306/garden
table1.user: gardener
table1.password: shavel
table1.driver: jdbc-driver
table1.percentage: 80



#For Table2
table2.url: jdbc:mysql://otherhost:3306/forest
table2.user: forester
table2.password: axe
table2.driver: jdbc-driver
table2.percentage: 20

The method below reads the above config.properties file and builds a ReadTableConnectionInfo object for each table.

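// requires java.util.Properties, java.util.List, java.util.Arrays, java.util.HashMap
private static final Properties prop = new Properties();
private static List<String> tableNames;
private static HashMap<String, ReadTableConnectionInfo> tableList = new HashMap<String, ReadTableConnectionInfo>();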

private static void readPropertyFile() throws IOException {

    prop.load(Read.class.getClassLoader().getResourceAsStream("config.properties"));

    tableNames = Arrays.asList(prop.getProperty("TABLES").split(" "));

    for (String arg : tableNames) {

        ReadTableConnectionInfo ci = new ReadTableConnectionInfo();

        String url = prop.getProperty(arg + ".url");
        String user = prop.getProperty(arg + ".user");
        String password = prop.getProperty(arg + ".password");
        String driver = prop.getProperty(arg + ".driver");
        double percentage = Double.parseDouble(prop.getProperty(arg + ".percentage"));

        ci.setUrl(url);
        ci.setUser(user);
        ci.setPassword(password);
        ci.setDriver(driver);
        ci.setPercentage(percentage);

        tableList.put(arg, ci);
    }

}

Below is the ReadTableConnectionInfo class, which holds all the connection information for a particular table.

public class ReadTableConnectionInfo {

    public String url;
    public String user;
    public String password;
    public String driver;
    public double percentage; // double, to match getPercentage()/setPercentage()

    public String getUrl() {
        return url;
    }

    public void setUrl(String url) {
        this.url = url;
    }

    public String getUser() {
        return user;
    }

    public void setUser(String user) {
        this.user = user;
    }

    public String getPassword() {
        return password;
    }

    public void setPassword(String password) {
        this.password = password;
    }

    public String getDriver() {
        return driver;
    }

    public void setDriver(String driver) {
        this.driver = driver;
    }

    public double getPercentage() {
        return percentage;
    }

    public void setPercentage(double percentage) {
        this.percentage = percentage;
    }
}

Now I create an ExecutorService with a specified number of threads and pass this tableList object to the constructor of the ReadTask class:

        // create thread pool with given size
        ExecutorService service = Executors.newFixedThreadPool(10);

        for (int i = 0; i < 10; i++) {
            service.submit(new ReadTask(tableList));
        }

Below is my ReadTask, which implements the Runnable interface. Each thread is supposed to open a connection for each table.

class ReadTask implements Runnable {

    private final HashMap<String, ReadTableConnectionInfo> tableLists;
    private Connection[] dbConnection;
    private Statement[] statement;

    public ReadTask(HashMap<String, ReadTableConnectionInfo> tableList) {
        this.tableLists = tableList;
    }

    @Override
    public void run() {

        int j = 0;
        dbConnection = new Connection[tableLists.size()];
        statement = new Statement[tableLists.size()];

        // loop around the map values and make the connection list
        // (SQLException handling omitted for brevity)
        for (ReadTableConnectionInfo ci : tableLists.values()) {

            dbConnection[j] = getDBConnection(ci.getUrl(), ci.getUser(), ci.getPassword(), ci.getDriver());
            statement[j] = dbConnection[j].createStatement();

            j++;
        }

        // keep querying for 60 minutes
        final long endTime = System.currentTimeMillis() + 60L * 60 * 1000;
        while (System.currentTimeMillis() <= endTime) {

            /* Generate a random number and check whether it falls between
             * 1 and 80. If yes, choose table1, then use the table1 connection
             * and statement made above to do a SELECT * on that table.
             * If the random number falls between 81 and 100, choose table2,
             * then use the table2 connection and statement to do a SELECT *
             * on that table.
             */

            // what_table_statement is the index I do not know how to compute
            ResultSet rs = statement[what_table_statement].executeQuery(selectTableSQL);

        }
    }
}

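Currently I have two tables, which means each thread will make two connections for each table and then, depending on the random number generated, use the connection for that particular table to execute SELECT * on it.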

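Algorithm:

  1. Generate a random number between 1 and 100.
  2. If that random number is less than or equal to table1.getPercentage(), choose table1 and use the table1 statement object to make a SELECT SQL call to that database.
  3. Otherwise choose table2 and use the table2 statement object to make a SELECT SQL call to that database.

My question:

I am having a hard time figuring out how to apply the above algorithm: how to compare the random number with each table's percentage, decide which table to use, and then work out which table connection and statement I need for the SELECT SQL call.

That means I need to check each table's getPercentage() method and compare it against the random number.

Right now I have only two tables. In the future I may have three, with a percentage distribution such as 80 10 10.

Update: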

    class ReadTask implements Runnable {

        private final LinkedHashMap<String, ReadTableConnectionInfo> tableLists;
        // a field rather than a local, because the helper methods below use it too
        private final Random random = new SecureRandom();
        private Connection[] dbConnection = null;
        private ConcurrentHashMap<ReadTableConnectionInfo, Connection> tableStatement = new ConcurrentHashMap<ReadTableConnectionInfo, Connection>();

        public ReadTask(LinkedHashMap<String, ReadTableConnectionInfo> tableList) {
            this.tableLists = tableList;
        }

        @Override
        public void run() {

            int j = 0;
            dbConnection = new Connection[tableLists.size()];

            // loop around the map values and make the connection list
            for (ReadTableConnectionInfo ci : tableLists.values()) {

                dbConnection[j] = getDBConnection(ci.getUrl(), ci.getUser(), ci.getPassword(), ci.getDriver());
                tableStatement.putIfAbsent(ci, dbConnection[j]);

                j++;
            }

            // keep querying for 60 minutes
            final long endTime = System.currentTimeMillis() + 60L * 60 * 1000;

            while (System.currentTimeMillis() < endTime) {

                double randomNumber = random.nextDouble() * 100.0;
                ReadTableConnectionInfo table = selectRandomConnection(randomNumber);

                for (Map.Entry<ReadTableConnectionInfo, Connection> entry : tableStatement.entrySet()) {

                    if (entry.getKey().getTableName().equals(table.getTableName())) {

                        final String id = generateRandomId(random);
                        final String selectSql = generateRandomSQL(table);

                        // SQLException handling and resource cleanup omitted for brevity
                        PreparedStatement preparedStatement = entry.getValue().prepareStatement(selectSql);
                        preparedStatement.setString(1, id);

                        ResultSet rs = preparedStatement.executeQuery();
                    }
                }
            }
        }

        // builds "SELECT ID,<random subset of columns> from <table> where id = ?"
        private String generateRandomSQL(ReadTableConnectionInfo table) {

            int rNumber = random.nextInt(table.getColumns().size());

            List<String> shuffledColumns = new ArrayList<String>(table.getColumns());
            Collections.shuffle(shuffledColumns);

            String columnsList = "";

            for (int i = 0; i < rNumber; i++) {
                columnsList += ("," + shuffledColumns.get(i));
            }

            final String sql = "SELECT ID" + columnsList + "  from "
                    + table.getTableName() + " where id = ?";

            return sql;
        }

        // returns the first table whose cumulative percentage exceeds randomNumber
        private ReadTableConnectionInfo selectRandomConnection(double randomNumber) {

            double limit = 0;
            for (ReadTableConnectionInfo ci : tableLists.values()) {
                limit += ci.getPercentage();
                if (randomNumber < limit) {
                    return ci;
                }
            }
            // only reached if randomNumber is not below the sum of the percentages
            throw new IllegalStateException();
        }
    }


3 Answers:

Answer 0 (score: 1)

You can treat it as a loop over the available connections, something like this:

public void run() {
  ...
  Random random = new SecureRandom();

  // keep querying for 60 minutes
  final long endTime = System.currentTimeMillis() + 60L * 60 * 1000;
  while (System.currentTimeMillis() < endTime) {
    double randomNumber = random.nextDouble() * 100.0;
    ReadTableConnectionInfo tableInfo = selectRandomConnection(randomNumber);

    // do query...
  }
}


private ReadTableConnectionInfo selectRandomConnection(double randomNumber) {
  double limit = 0;
  // accumulate the percentages and return the first table whose
  // cumulative limit exceeds randomNumber
  for (ReadTableConnectionInfo ci : tableLists.values()) {
    limit += ci.getPercentage();
    if (randomNumber < limit) {
      return ci;
    }
  }
  throw new IllegalStateException();
}

As long as the maximum value of randomNumber is less than sum(percentage), that does the job.
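To try the cumulative-limit loop in isolation, without any database classes, here is a minimal self-contained sketch. The class name WeightedPicker, the pick method, and the plain map of table names to percentages are illustrative assumptions standing in for ReadTableConnectionInfo; they are not part of the original code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

public class WeightedPicker {

    // Walks the entries in insertion order, accumulating percentages, and
    // returns the first key whose cumulative limit exceeds randomNumber.
    static String pick(Map<String, Double> percentages, double randomNumber) {
        double limit = 0;
        for (Map.Entry<String, Double> e : percentages.entrySet()) {
            limit += e.getValue();
            if (randomNumber < limit) {
                return e.getKey();
            }
        }
        // Reached only when randomNumber is not below the sum of percentages.
        throw new IllegalStateException("randomNumber out of range: " + randomNumber);
    }

    public static void main(String[] args) {
        // Hypothetical 80/10/10 split, as mentioned in the question.
        Map<String, Double> percentages = new LinkedHashMap<>();
        percentages.put("table1", 80.0);
        percentages.put("table2", 10.0);
        percentages.put("table3", 10.0);

        Random random = new Random();
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i < 100_000; i++) {
            String table = pick(percentages, random.nextDouble() * 100.0);
            counts.merge(table, 1, Integer::sum);
        }
        // The counts should land close to an 80/10/10 split.
        System.out.println(counts);
    }
}
```

Because nextDouble() returns a value in [0, 1), the scaled number is always below 100, so the exception should only fire if the configured percentages add up to less than 100.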

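Another thing that comes to mind: if you end up with so many possible queries that the loop lookup becomes a problem, you can build a lookup table. Create an array whose total size holds enough entries for the relative weights of the queries to be expressed as integers.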

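For the three-query example, 80:10:10, use a 10-element array of ReadTableConnectionInfo in which eight references point to table1, one to table2, and one to table3. Then simply scale the random number to 0 <= rand < 10 (e.g. (int)(Math.random() * 10)) and use it to index the array.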
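A rough sketch of that array-based lookup might look like the following. The class name SlotLookup and the buildSlots helper are made up for illustration, and table names are used in place of ReadTableConnectionInfo references to keep the example self-contained.

```java
import java.util.Random;

public class SlotLookup {

    // Builds an array with one slot per unit of weight, so that a uniform
    // random index selects each name in proportion to its weight.
    static String[] buildSlots(String[] names, int[] weights) {
        int total = 0;
        for (int w : weights) {
            total += w;
        }
        String[] slots = new String[total];
        int pos = 0;
        for (int i = 0; i < names.length; i++) {
            for (int k = 0; k < weights[i]; k++) {
                slots[pos++] = names[i];
            }
        }
        return slots;
    }

    public static void main(String[] args) {
        // 80:10:10 reduced to the integer weights 8:1:1 gives a 10-slot array.
        String[] slots = buildSlots(
                new String[] {"table1", "table2", "table3"},
                new int[] {8, 1, 1});

        Random random = new Random();
        for (int i = 0; i < 5; i++) {
            // nextInt(slots.length) is uniform over 0 <= index < 10.
            System.out.println(slots[random.nextInt(slots.length)]);
        }
    }
}
```

The trade-off is memory for speed: each selection is a single array access, but the array grows with the precision of the weights (a 33.3% share would need many more slots than 10).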

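Answer 1 (score: 0)

No matter how many tables you have, their percentages will always add up to 100. The simplest way to conceptualize the selection is to think of each table as representing a range of percentages.

For example, with three tables having the percentages you mentioned (80%, 10%, 10%), you can picture them as: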

  Random number
  From      To        Table
  0.0000    0.8000    Table_1
  0.8000    0.9000    Table_2
  0.9000    1.0000    Table_3

So generate a random number between 0.0000 and 1.0000, then walk down the ordered list to see which range it falls into, and therefore which table to use.
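One possible way to code the range idea is to store each range's lower bound in a sorted map and ask for the greatest bound that does not exceed the random number. This is only a sketch: the class name RangeLookup and its add and lookup methods are assumptions made for the example, while TreeMap.floorEntry is the standard java.util call.

```java
import java.util.Random;
import java.util.TreeMap;

public class RangeLookup {

    // Maps the lower bound of each range to the table that owns it.
    private final TreeMap<Double, String> lowerBounds = new TreeMap<>();
    private double total = 0.0;

    // Registers a table that occupies the range [total, total + fraction).
    void add(String table, double fraction) {
        lowerBounds.put(total, table);
        total += fraction;
    }

    // Returns the table whose range contains r; r must lie in [0, total).
    String lookup(double r) {
        if (r < 0.0 || r >= total) {
            throw new IllegalArgumentException("r out of range: " + r);
        }
        // floorEntry finds the greatest lower bound that is <= r.
        return lowerBounds.floorEntry(r).getValue();
    }

    double total() {
        return total;
    }

    public static void main(String[] args) {
        RangeLookup ranges = new RangeLookup();
        ranges.add("Table_1", 0.8);
        ranges.add("Table_2", 0.1);
        ranges.add("Table_3", 0.1);

        Random random = new Random();
        for (int i = 0; i < 5; i++) {
            // Scale by total() so the draw stays inside the covered range
            // even if floating-point sums are not exactly 1.0.
            double r = random.nextDouble() * ranges.total();
            System.out.println(r + " -> " + ranges.lookup(r));
        }
    }
}
```

The sorted map makes each lookup logarithmic in the number of tables, which only matters once the list of tables becomes long; for two or three tables a plain loop is just as good.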

(By the way: I'm not sure why you have two connections per table.)

Answer 2 (score: 0)

You can build a lookup table that holds the table names and their weights:

class LookupTable {
    private int[]    weights;
    private String[] tables;
    private int      size = 0;

    public LookupTable(int n) {
        this.weights = new int[n];
        this.tables = new String[n];
    }

    public void addTable(String tableName, int r) {
        this.weights[size] = r;
        this.tables[size] = tableName;
        size++;
    }

    public String lookupTable(int n) {
        for (int i = 0; i < this.size; i++) {
            if (this.weights[i] >= n) {
                return this.tables[i];
            }
        }
        return null;
    }
}

Code to initialize the table:

    LookupTable tr = new LookupTable(3);
    // each weight is a cumulative upper bound, so add them in ascending order!
    tr.addTable("table1", 20);
    tr.addTable("table2", 80);
    tr.addTable("table3", 100);

Test code:

    Random r = new Random(System.currentTimeMillis());
    for (int i = 0; i < 10; i++) {
        // r.nextInt(100) + 1 returns a number in the range [1, 100].
        int n = r.nextInt(100) + 1;
        String tableName = tr.lookupTable(n);
        System.out.println(n + ":" + tableName);
    }