Inserting millions of records into two MySQL tables

Time: 2013-04-30 09:33:08

Tags: java mysql

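I have 5 tables, each with millions of rows.

Every table has the same format: email, IP address, location. A single email address may appear in any one of the five tables, or in all five. There are two other tables, User_ip and User_location.

I want to store the unique IP addresses for each email address in the User_ip table, and the distinct locations for each email address in the user_location table.

At the moment I follow the procedure below, but it takes a lot of time. Is there any other solution or approach?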

Statement stmt = connection.createStatement();

        stmt.executeUpdate("insert into temp(email,ip,location) select email,ip,location from Apr_web group by email,ip");
        stmt.executeUpdate("insert into temp(email,ip,location) select email,ip,location from Apr_gov group by email,ip");
        stmt.executeUpdate("insert into temp(email,ip,location) select email,ip,location from Apr_mail group by email,ip");
        stmt.executeUpdate("insert into temp(email,ip,location) select email,ip,location from Apr_pop group by email,ip");
        stmt.executeUpdate("insert into temp(email,ip,location) select email,ip,location from Apr_imap group by email,ip");

stmt1 = connection.createStatement();
stmt1.executeQuery("select distinct email from temp");
        ResultSet rs = stmt1.getResultSet();
        while(rs.next()){
                email = rs.getString(1); // current email; this assignment was missing from the posted snippet
                Statement stmt2 = connection.createStatement();
                stmt2.executeQuery("select distinct substring_index(ip,'.',2) from temp where email='"+email+"'");
                ResultSet rs2 = stmt2.getResultSet();
                while(rs2.next()){
                    ip=rs2.getString(1);

                    Statement stmt3 = connection.createStatement();
                    Statement stmt4 = connection.createStatement();
                    stmt3.executeQuery("select * from user_ip where uid='"+email+"' and ip='"+ip+"'");
                    ResultSet rs3 = stmt3.getResultSet();
                    if(rs3.next()){
                        System.out.println("THE ROW ALREADY EXISTS IN IP TABLE");
                    }
                    else{
                        stmt4.executeUpdate("insert into user_ip(uid,ip) values('"+email+"','"+ip+"')");
                        System.out.println("ROW INSERTED IN USER_IP");
                    }

                }

 Statement stmt5 = connection.createStatement();                    
                stmt5.executeQuery("select distinct location from temp where   email='"+email+"' and location !='no information found'");
                ResultSet rs4 = stmt5.getResultSet();
                while(rs4.next()){
                    location = rs4.getString(1);
                    //Statement stmt6 = connection.createStatement();
                    //Statement stmt7 = connection.createStatement();

                    pst1 = connection.prepareStatement("select * from user_location where uid=? and location=?");
                    pst1.setString(1, email);
                    pst1.setString(2, location);



                    ResultSet rs5 = pst1.executeQuery();
                    if(rs5.next()){
                        System.out.println("THE ROW ALREADY EXISTS IN USER_LOCATION");
                    }
                    else{
                        pst2 = connection.prepareStatement("insert into user_location(uid,location) values(?,?)");
                        pst2.setString(1,email);
                        pst2.setString(2,location);
                        pst2.executeUpdate();

                        System.out.println("ROW INSERTED IN USER_LOCATION");
                    }
                }
}

3 answers:

Answer 0: (score: 0)

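Have you turned off autocommit? If not, turn it off when you create the connection by sending this command:

SET autocommit=0;

Then at the end, once all the items have been added, use the command:

COMMIT;

Or, if you want to discard all the changes (for example, because an error occurred):

ROLLBACK;

These commands are sent to the MySQL database as queries to commit or roll back the data. This should let you put a large number of items into the tables considerably faster, because the server no longer commits after every single insert.

See the documentation for more details.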
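
In JDBC, which the question uses, the same effect is normally obtained through the Connection object rather than by sending SET autocommit=0 as text. The following is a minimal sketch under that assumption; TransactionSketch, SqlWork and runInTransaction are hypothetical names introduced for illustration and are not part of the original post or of any library.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical helper (not from the original post): runs JDBC work as a single transaction. */
final class TransactionSketch {

    /** Work to perform while autocommit is off. */
    interface SqlWork {
        void run(Connection conn) throws SQLException;
    }

    static void runInTransaction(Connection conn, SqlWork work) throws SQLException {
        boolean previous = conn.getAutoCommit();
        conn.setAutoCommit(false);            // JDBC counterpart of SET autocommit=0
        try {
            work.run(conn);                   // e.g. all the inserts
            conn.commit();                    // counterpart of COMMIT
        } catch (SQLException | RuntimeException e) {
            conn.rollback();                  // counterpart of ROLLBACK
            throw e;
        } finally {
            conn.setAutoCommit(previous);     // restore the caller's setting
        }
    }
}
```

It could be called as TransactionSketch.runInTransaction(connection, c -> { /* executeUpdate calls */ }); so that every insert inside the lambda is committed together, or rolled back together if one of them fails.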

Answer 1: (score: 0)

Can't you do this in two SQL statements? Something like the following:

INSERT IGNORE INTO user_ip (uid,ip)
SELECT DISTINCT email,SUBSTRING_INDEX(ip,'.',2) FROM Apr_web
UNION
SELECT DISTINCT email,SUBSTRING_INDEX(ip,'.',2) FROM Apr_gov
UNION
SELECT DISTINCT email,SUBSTRING_INDEX(ip,'.',2) FROM Apr_mail
UNION
SELECT DISTINCT email,SUBSTRING_INDEX(ip,'.',2) FROM Apr_pop
UNION
SELECT DISTINCT email,SUBSTRING_INDEX(ip,'.',2) FROM Apr_imap;


INSERT IGNORE INTO user_location(uid,location)
SELECT DISTINCT email,location FROM Apr_web
UNION
SELECT DISTINCT email,location FROM Apr_gov
UNION
SELECT DISTINCT email,location FROM Apr_mail 
UNION
SELECT DISTINCT email,location FROM Apr_pop
UNION
SELECT DISTINCT email,location FROM Apr_imap;
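
If the statements are issued from the Java program in the question, they could be assembled roughly as follows. This is only a sketch: UnionInsertSketch and its methods are hypothetical names, and it assumes that user_ip has a UNIQUE or PRIMARY KEY over (uid, ip) and user_location one over (uid, location), since INSERT IGNORE only skips rows that would violate such a key. DISTINCT is left out because UNION already removes duplicate rows, and the location statement adds the question's 'no information found' filter, which the SQL above does not include.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: builds one INSERT IGNORE ... UNION statement over the source tables. */
final class UnionInsertSketch {

    // Table names come from the question; they are fixed constants, never user input.
    static final List<String> SOURCE_TABLES =
            Arrays.asList("Apr_web", "Apr_gov", "Apr_mail", "Apr_pop", "Apr_imap");

    static String buildInsert(String target, String columns, String selectList,
                              String whereClause, List<String> sourceTables) {
        List<String> selects = new ArrayList<>();
        for (String table : sourceTables) {
            String select = "SELECT " + selectList + " FROM " + table;
            if (!whereClause.isEmpty()) {
                select += " WHERE " + whereClause;
            }
            selects.add(select);
        }
        // UNION (without ALL) de-duplicates across the five tables.
        return "INSERT IGNORE INTO " + target + " (" + columns + ") "
                + String.join(" UNION ", selects);
    }

    static String userIpSql() {
        return buildInsert("user_ip", "uid, ip",
                "email, SUBSTRING_INDEX(ip, '.', 2)", "", SOURCE_TABLES);
    }

    static String userLocationSql() {
        return buildInsert("user_location", "uid, location", "email, location",
                "location != 'no information found'", SOURCE_TABLES);
    }
}
```

Each string can then be passed to stmt.executeUpdate(...), which replaces the nested per-email loops with two statements executed entirely inside MySQL.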

Answer 2: (score: 0)

Would a solution like

stmt.executeUpdate(
  "INSERT INTO user_ip(uid,ip) "+
  "SELECT DISTINCT temp.email,temp.ip FROM temp "+
  "   LEFT JOIN user_ip ON "+
  "      (temp.email = user_ip.uid AND temp.ip = user_ip.ip) "+
  "    WHERE user_ip.uid IS NULL");

work for you?

Explanation: this finds the unique (email, ip) pairs that are in the temp table but not yet in user_ip, and then adds them to user_ip.
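
To make the semantics of that LEFT JOIN ... IS NULL pattern concrete, here is a small in-memory analogue. It is an illustration only, with hypothetical names (AntiJoinSketch, missingPairs, pair); the real work would still be done by MySQL, not in Java.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration of the anti-join: rows of temp that have no match in user_ip. */
final class AntiJoinSketch {

    /** Builds an (email, ip) pair; SimpleEntry compares by key and value. */
    static Map.Entry<String, String> pair(String email, String ip) {
        return new SimpleEntry<>(email, ip);
    }

    /** Returns the distinct pairs present in temp but absent from userIp, in first-seen order. */
    static Set<Map.Entry<String, String>> missingPairs(List<Map.Entry<String, String>> temp,
                                                       Set<Map.Entry<String, String>> userIp) {
        // Copying into a set plays the role of SELECT DISTINCT.
        Set<Map.Entry<String, String>> result = new LinkedHashSet<>(temp);
        // Dropping already-known pairs plays the role of LEFT JOIN ... WHERE user_ip.uid IS NULL.
        result.removeAll(userIp);
        return result;
    }
}
```

Only the pairs returned by missingPairs correspond to rows the INSERT would add, which is why the statement can be re-run without creating duplicates.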