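Java: counting inserts between timestamps in a large database

Time: 2017-01-27 10:20:48

Tags: java mysql multithreading postgresql timestamp

I have a Java program that connects to an MQTT broker. I need to insert one row for each incoming message from the broker.

Messages table schema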

   Column   |              Type              |                     Modifiers
------------+--------------------------------+------------------------------------
 content    | character(255)                 |
 user_id    | character(255)                 |
 sent_at    | timestamp(6) without time zone | default ('now'::text)::timestamp(6) with time zone
 message_id | character(255)                 |
 status     | character(1)                   | default 'w'::bpchar

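I need to keep track of the messages within a time interval.

My main Java application establishes the database connection and contains an MQTT listener that inserts a row for each new incoming message.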

@Override
public void messageArrived(String s, MqttMessage mqttMessage) throws Exception {
    //System.out.println("New Msg");
    //System.out.println(s);
    insertMessage(mqttMessage);
}

The message insert method

/***
 *
 * @param mqttMessage
 */
private static void insertMessage(MqttMessage mqttMessage) {
    arrived++ ;
    try {
        String mysql = "insert into messages (content, message_id, user_id, sent_at, status) values ('" + mqttMessage.getPayload() + "',  " + arrived + ", " + arrived + ", " + " CURRENT_TIMESTAMP (6) " + ", " + "'w'" + ") RETURNING sent_at";
        //System.out.println (mysql);
        ResultSet resultSet = statement.executeQuery(mysql);
        if (resultSet.next()) {
            // Log the last timestamp
             System.out.println(resultSet.getTimestamp(1));
        }
    } catch (SQLException e) {
        //System.out.println("Failed !");
        e.printStackTrace();
    }
    //System.out.println(arrived);
}

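In the same program, I implemented a Java class that has a db connection and keeps a latestTimestamp.

I use Executors.newScheduledThreadPool to check the number of messages inserted every 10 seconds and to update the latest timestamp. The method that gets the last inserted timestamp is: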

/**
 * Used to update the latest value from the db
 */
private void updateTimestamp() throws SQLException {
    //timestamp = new Timestamp(System.currentTimeMillis());
    resultSet = statement.executeQuery( "select sent_at from messages  order by sent_at  desc  limit 1 ;");
    if (resultSet.next()) {
//   Supposed to be the latest inserted row and the latest timestamp in the db
        latestTimestamp = resultSet.getTimestamp(1);
        System.out.print("new timestamp ==> ");
        System.out.println(timestamp);
    } else {
        timestamp = Timestamp.valueOf(Constants.MIN_TIMESTAMP_VALUE);
    }
}

Then, when I need the count of messages inserted after the latest updated timestamp, I use a query that compares timestamps.

/**
 * This function get all messages that have been sent from latest timestamp
 *
 * @return
 * @throws SQLException
 */
private ResultSet getMQTTMessagesDelayed() throws SQLException {
    oldTimeStamp = latestTimestamp ;
    // Update the new timestamp to reduce losing time in execution
    updateTimestamp();
    mysql = "Select count(*) as cn from messages where  sent_at > '" + oldTimeStamp  + "' ;";
    System.out.println(mysql);
    return statement.executeQuery(mysql);

}

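The problem is that for a large number of messages (starting from about 5000), I expect the select counts to add up to the number of messages sent. For example, if I send a batch of 5000 messages and the scheduled thread runs and gets 2500 as the count this time, I need to get the remaining 2500 in the next period (the next 10 seconds). That is not what happens: I get incorrect results (a difference of about 45/20!).

Notes

  • Tested with MySQL and Postgres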

  • 8 GB RAM

  • Windows 10

  • Java 8

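1 answer:

Answer 0 (score: 2)

When two threads run in parallel, one inserting and the other selecting from the same table, you can hardly expect predictable results, and performance is likely to degrade as the messages table grows. My understanding is that you only want to keep the number of messages inserted between two given dates, and that these dates lie within a fairly short interval (10 seconds). So I think you would be better off tracking incoming messages in an in-memory list, from which a worker thread discards the oldest elements at scheduled intervals.

Also, you do not need to retrieve a ResultSet from the INSERT. It would be much faster to generate the sent_at date field on the client side and pass it to the INSERT SQL statement, using either a PreparedStatement parameter, the STR_TO_DATE MySQL function, or the standard JDBC escape syntax for dates, {ts 'YYYY-MM-DD HH:mm:SS'}.

Your insertMessage would become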

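private static void insertMessage(MqttMessage mqttMessage) {
    arrived++;
    try {
        // Take the timestamp on the client so nothing has to be read back from the INSERT
        Date now = new Date();
        // The JDBC {ts '...'} escape expects colons in the time part
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        // getPayload() returns a byte[], so decode it before building the SQL text
        String content = new String(mqttMessage.getPayload());
        String mysql = "insert into messages (content, message_id, user_id, sent_at, status) values ('" + content + "', " + arrived + ", " + arrived + ", {ts '" + fmt.format(now) + "'}, 'w')";
        statement.executeUpdate(mysql);
        // Remember the same timestamp in memory for later counting
        messageList.add(now);
    } catch (SQLException e) {
        e.printStackTrace();
    }
}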
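
The PreparedStatement option mentioned above could look roughly like the following. This is only a sketch under assumptions, not the answer's code: the class name MessageInserter, the method bindAndExecute and the way the statement is passed in are made up for illustration, while the column names come from the messages table in the question. Binding the payload as a parameter also sidesteps quoting problems in the SQL text.

```java
import java.nio.charset.StandardCharsets;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;

/** Hypothetical helper illustrating a parameterized insert into the messages table. */
public class MessageInserter {

    /** SQL to prepare once and reuse for every incoming message. */
    static final String INSERT_SQL =
        "insert into messages (content, message_id, user_id, sent_at, status) values (?, ?, ?, ?, ?)";

    /**
     * Binds one message to an already prepared statement and executes it.
     * Returns the client-side timestamp that was stored in sent_at.
     */
    static Timestamp bindAndExecute(PreparedStatement ps, byte[] payload, long id, long nowMillis)
            throws SQLException {
        Timestamp sentAt = new Timestamp(nowMillis);
        // Decode the raw MQTT payload; UTF-8 is an assumption about the message encoding
        ps.setString(1, new String(payload, StandardCharsets.UTF_8));
        // message_id and user_id are character columns; the question fills both from a counter
        ps.setString(2, Long.toString(id));
        ps.setString(3, Long.toString(id));
        ps.setTimestamp(4, sentAt);
        ps.setString(5, "w");
        ps.executeUpdate();
        return sentAt;
    }
}
```

The statement would be created once with connection.prepareStatement(MessageInserter.INSERT_SQL) and reused; the returned Timestamp is the value to add to the in-memory list.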

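and (assuming you have only one writer thread) a sample implementation of the list that keeps track of incoming messages could look like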

import java.util.Date;
import java.util.List;
import java.util.LinkedList;
import java.util.Collections;

public class MessageList implements AutoCloseable {

    private List<Date> messages;
    private CleanUp cleaner;

    private final long MAX_KEEP_TRACK = 20L;
    private final long RUN_EVERY_SECS = 10L;

    public MessageList() {
        messages = Collections.synchronizedList(new LinkedList<Date>());
        cleaner = new CleanUp(messages, MAX_KEEP_TRACK, RUN_EVERY_SECS);
        cleaner.start();
    }

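    @Override
    public void close() throws Exception {
        // Thread.stop() is deprecated and unsafe; set the flag and wake the cleaner instead
        cleaner.stop = true;
        cleaner.interrupt();
    }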

    public void add(Date messageDate) {
        messages.add(messageDate);
    }

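    public int countBetween(Date start, Date end) {
        int count = 0;
        // Iterating a synchronizedList must be guarded manually against concurrent changes
        synchronized (messages) {
            for (Date d : messages) {
                if (d.compareTo(end) > 0) {
                    // Dates are appended in arrival order, so nothing later can match
                    break;
                } else if (d.compareTo(start) >= 0) {
                    count++;
                }
            }
        }
        return count;
    }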

    private class CleanUp extends Thread {

        private List<Date> msgs;
        private long maxKeepMilis;
        private long runEveryMilis;
        private volatile boolean stop;

        public CleanUp(List<Date> messages, long maxKeepSecs, long runEverySecs) {
            msgs = messages;
            maxKeepMilis = maxKeepSecs * 1000L;
            runEveryMilis = runEverySecs * 1000L;
            stop = false;
        }

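        @Override
        public void run() {
            while (!stop) {
                long now = new Date().getTime();
                // Drop entries older than maxKeepMilis. The list is in arrival order,
                // so stop at the first entry that is still recent enough.
                synchronized (msgs) {
                    while (!msgs.isEmpty() && now - msgs.get(0).getTime() > maxKeepMilis)
                        msgs.remove(0);
                }
                try {
                    Thread.sleep(runEveryMilis);
                } catch (InterruptedException e) {
                    // Interrupted while sleeping: treat it as a request to shut down
                    return;
                }
            }
        }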
    }
}

Then you just call messageList.countBetween() to get the number of messages received between two dates.
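
One detail to watch when calling this every 10 seconds: countBetween() includes both end points, so a message whose timestamp falls exactly on a boundary would be counted in two consecutive windows. A possible way around it, sketched here with a hypothetical WindowCounter class (its name and methods are illustrative assumptions, not part of MessageList), is to count half-open windows [start, end) and carry each end forward as the next start.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper: counts recorded timestamps in consecutive half-open windows. */
public class WindowCounter {

    // Timestamps in epoch milliseconds, in arrival order (single writer assumed)
    private final Deque<Long> arrivals = new ArrayDeque<>();
    // Exclusive end of the previous window; inclusive start of the next one
    private long windowStart;

    public WindowCounter(long startMillis) {
        this.windowStart = startMillis;
    }

    /** Called by the writer thread for every inserted message. */
    public synchronized void record(long timestampMillis) {
        arrivals.addLast(timestampMillis);
    }

    /** Counts arrivals in [windowStart, nowMillis) and moves the window start to nowMillis. */
    public synchronized int countSinceLastCall(long nowMillis) {
        int count = 0;
        // Entries are in arrival order, so everything before nowMillis sits at the head
        while (!arrivals.isEmpty() && arrivals.peekFirst() < nowMillis) {
            long t = arrivals.pollFirst();
            if (t >= windowStart) {
                count++;
            }
        }
        windowStart = nowMillis;
        return count;
    }
}
```

A task scheduled every 10 seconds would call countSinceLastCall(System.currentTimeMillis()). As long as timestamps are recorded in non-decreasing order, the counts of consecutive calls add up to the number of recorded messages, and counted entries are discarded as they are read, so no separate cleanup thread is needed in this variant.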