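I have 5 tables, each with millions of rows.
All of them have the same format: email, IP address, location. A given email address can appear in any one of the five tables, or in all 5 of them. There are two further tables, User_ip and User_location.
In User_ip I want to store the unique IP addresses for each email address, and in user_location the distinct locations for each email address.
At the moment I follow the procedure below, but it takes a lot of time. Is there any other solution or approach?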
Statement stmt = connection.createStatement();
stmt.executeUpdate("insert into temp(email,ip,location) select email,ip,location from Apr_web group by email,ip");
stmt.executeUpdate("insert into temp(email,ip,location) select email,ip,location from Apr_gov group by email,ip");
stmt.executeUpdate("insert into temp(email,ip,location) select email,ip,location from Apr_mail group by email,ip");
stmt.executeUpdate("insert into temp(email,ip,location) select email,ip,location from Apr_pop group by email,ip");
stmt.executeUpdate("insert into temp(email,ip,location) select email,ip,location from Apr_imap group by email,ip");
stmt1 = connection.createStatement();
stmt1.executeQuery("select distinct email from temp");
ResultSet rs = stmt1.getResultSet();
while(rs.next()){
    // current email address; every query below is run once per address
    String email = rs.getString(1);
    Statement stmt2 = connection.createStatement();
    stmt2.executeQuery("select distinct substring_index(ip,'.',2) from temp where email='"+email+"'");
    ResultSet rs2 = stmt2.getResultSet();
    while(rs2.next()){
        ip=rs2.getString(1);
        Statement stmt3 = connection.createStatement();
        Statement stmt4 = connection.createStatement();
        stmt3.executeQuery("select * from user_ip where uid='"+email+"' and ip='"+ip+"'");
        ResultSet rs3 = stmt3.getResultSet();
        if(rs3.next()){
            System.out.println("THE ROW ALREADY EXISTS IN IP TABLE");
        }
        else{
            stmt4.executeUpdate("insert into user_ip(uid,ip) values('"+email+"','"+ip+"')");
            System.out.println("ROW INSERTED IN USER_IP");
        }
    }
    Statement stmt5 = connection.createStatement();
    stmt5.executeQuery("select distinct location from temp where email='"+email+"' and location !='no information found'");
    ResultSet rs4 = stmt5.getResultSet();
    while(rs4.next()){
        location = rs4.getString(1);
        //Statement stmt6 = connection.createStatement();
        //Statement stmt7 = connection.createStatement();
        pst1 = connection.prepareStatement("select * from user_location where uid=? and location=?");
        pst1.setString(1, email);
        pst1.setString(2, location);
        ResultSet rs5 = pst1.executeQuery();
        if(rs5.next()){
            System.out.println("THE ROW ALREADY EXISTS IN USER_LOCATION");
        }
        else{
            pst2 = connection.prepareStatement("insert into user_location(uid,location) values(?,?)");
            pst2.setString(1,email);
            pst2.setString(2,location);
            pst2.executeUpdate();
            System.out.println("ROW INSERTED IN USER_LOCATION");
        }
    }
}
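Answer 0 (score: 0)
Have you turned off autocommit? If not, turn it off right after creating the connection by issuing:
SET autocommit=0;
Then at the end, once all the rows have been added, issue:
COMMIT;
Or, if you want to discard all the changes (for example after an error):
ROLLBACK;
These commands are sent to the MySQL database as queries to commit or roll back the data. With autocommit on, every INSERT is committed on its own; grouping them into one transaction should let you load a large number of rows faster.
See the documentation for more details.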
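From Java, the same idea is normally expressed through the JDBC `Connection` methods rather than by sending `SET autocommit=0` as a query. The following is only a minimal sketch, assuming an already-open `Connection`; the class and method names (`TxBatch`, `runInOneTransaction`) are made up for illustration and do not come from the question.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

/** Hypothetical helper: runs several statements as one transaction. */
final class TxBatch {
    private TxBatch() {}

    /**
     * Executes every SQL string in order and commits once at the end.
     * If any statement fails, the work done so far is rolled back.
     */
    static void runInOneTransaction(Connection connection, List<String> sqls)
            throws SQLException {
        boolean previous = connection.getAutoCommit();
        connection.setAutoCommit(false);          // JDBC counterpart of SET autocommit=0
        try (Statement stmt = connection.createStatement()) {
            for (String sql : sqls) {
                stmt.executeUpdate(sql);
            }
            connection.commit();                  // COMMIT
        } catch (SQLException e) {
            connection.rollback();                // ROLLBACK
            throw e;
        } finally {
            connection.setAutoCommit(previous);   // restore the caller's setting
        }
    }
}
```

In the question's code this would mean passing the five `insert into temp ...` strings in one call instead of letting each `executeUpdate` commit separately.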
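Answer 1 (score: 0)
Couldn't you do this in 2 SQL statements? Something like the following. Note that INSERT IGNORE only skips pairs that already exist if user_ip and user_location have unique keys on (uid, ip) and (uid, location) respectively: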
INSERT IGNORE INTO user_ip (uid,ip)
SELECT DISTINCT email,SUBSTRING_INDEX(ip,'.',2) FROM Apr_web
UNION
SELECT DISTINCT email,SUBSTRING_INDEX(ip,'.',2) FROM Apr_gov
UNION
SELECT DISTINCT email,SUBSTRING_INDEX(ip,'.',2) FROM Apr_mail
UNION
SELECT DISTINCT email,SUBSTRING_INDEX(ip,'.',2) FROM Apr_pop
UNION
SELECT DISTINCT email,SUBSTRING_INDEX(ip,'.',2) FROM Apr_imap;
INSERT IGNORE INTO user_location(uid,location)
SELECT DISTINCT email,location FROM Apr_web
UNION
SELECT DISTINCT email,location FROM Apr_gov
UNION
SELECT DISTINCT email,location FROM Apr_mail
UNION
SELECT DISTINCT email,location FROM Apr_pop
UNION
SELECT DISTINCT email,location FROM Apr_imap;
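Since the two statements differ only in the target table and the select list, they could also be assembled from the list of source tables on the Java side. This is a rough sketch, not part of the answer: `UnionInsert` and `build` are hypothetical names, and the table names are assumed to be trusted constants (they are concatenated into the SQL, so they must never come from user input).

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper that assembles an INSERT IGNORE ... SELECT ... UNION statement. */
final class UnionInsert {
    private UnionInsert() {}

    /**
     * @param target     target table and columns, e.g. "user_ip(uid,ip)"
     * @param selectList expressions to select, e.g. "email, SUBSTRING_INDEX(ip,'.',2)"
     * @param sources    the source tables to read from
     */
    static String build(String target, String selectList, List<String> sources) {
        // UNION (without ALL) already removes duplicate rows across the branches.
        String selects = sources.stream()
                .map(table -> "SELECT " + selectList + " FROM " + table)
                .collect(Collectors.joining(" UNION "));
        return "INSERT IGNORE INTO " + target + " " + selects;
    }

    public static void main(String[] args) {
        List<String> sources = List.of("Apr_web", "Apr_gov", "Apr_mail", "Apr_pop", "Apr_imap");
        System.out.println(build("user_ip(uid,ip)", "email, SUBSTRING_INDEX(ip,'.',2)", sources));
        System.out.println(build("user_location(uid,location)", "email, location", sources));
    }
}
```

Each returned string can be passed to `stmt.executeUpdate(...)` once, which replaces the per-email loop in the question with two round trips to the server.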
Answer 2 (score: 0)
Would a solution like this work for you?
stmt.executeUpdate(
    "INSERT INTO user_ip(uid,ip) "+
    "SELECT DISTINCT temp.email,temp.ip FROM temp "+
    " LEFT JOIN user_ip ON "+
    " (temp.email = user_ip.uid AND temp.ip = user_ip.ip) "+
    " WHERE user_ip.uid IS NULL");
Explanation: it finds the unique (email, ip) pairs that are in the temp table but not in user_ip, and then adds them to user_ip.
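The same "in temp but not yet in the target" pattern would presumably apply to user_location as well. Below is a small sketch that generates the statement for either table; `AntiJoinInsert` and `build` are hypothetical names, and it assumes both target tables use a `uid` column plus one value column named like the column in `temp`, as in the question.

```java
/** Hypothetical helper: builds an INSERT that adds only pairs the target lacks. */
final class AntiJoinInsert {
    private AntiJoinInsert() {}

    /**
     * @param target target table, e.g. "user_ip" (trusted constant, not user input)
     * @param column value column shared by temp and the target, e.g. "ip"
     */
    static String build(String target, String column) {
        return "INSERT INTO " + target + "(uid," + column + ") "
             + "SELECT DISTINCT t.email, t." + column + " FROM temp t "
             + "LEFT JOIN " + target + " x "
             + "ON t.email = x.uid AND t." + column + " = x." + column + " "
             // rows of temp with no partner in the target have x.uid = NULL
             + "WHERE x.uid IS NULL";
    }

    public static void main(String[] args) {
        System.out.println(build("user_ip", "ip"));
        // The question also skips placeholder locations, so that filter is appended here.
        System.out.println(build("user_location", "location")
                + " AND t.location <> 'no information found'");
    }
}
```

Unlike INSERT IGNORE, this form does not depend on a unique key on the target table, although an index on (uid, ip) or (uid, location) should still help the join.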