Question

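I run a PHP/MySQL application that helps manage information for school districts. I have a "students" table that is updated nightly by a cron job, based on information sent to my server over SFTP.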

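The job parses the data in the CSV file that was sent and inserts it into a table named "students_temp". I then compare students_temp with students: I delete the rows where all the data matches, update students that exist but whose information has changed, and insert new students.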

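The job takes about 380 seconds to complete, and 97% of that time is spent on the DELETE statement. Below are the two tables and the delete statement. I imagine the problem lies in the many WHERE conditions, but I don't know how to fix it.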

Students

student_id  int(11)
student_ts  timestamp
student_local_id    varchar(100)
student_nj  bigint(11)
student_first   varchar(250)
student_last    varchar(250)
student_grade   int(11)
student_school  int(11)
student_district    int(11)
student_gender  varchar(1)
student_eth_american_indian int(1)
student_eth_asian   int(1)
student_eth_black   int(1)
student_eth_hispanic    int(1)
student_eth_pacific int(1)
student_eth_white   int(1)
student_status  int(1)
student_contact_name    text
student_address text
student_city    text
student_state   text
student_zip varchar(30)

Students_Temp

student_temp_id int(11)
student_temp_ts timestamp
student_temp_local_id   varchar(100)
student_temp_nj bigint(11)
student_temp_first  varchar(250)
student_temp_last   varchar(250)
student_temp_grade  int(11)
student_temp_grade_code varchar(255)
student_temp_school int(11)
student_temp_school_code    varchar(255)
student_temp_district   int(11)
student_temp_gender varchar(1)
student_temp_eth_american_indian    int(1)
student_temp_eth_asian  int(1)
student_temp_eth_black  int(1)
student_temp_eth_hispanic   int(1)
student_temp_eth_pacific    int(1)
student_temp_eth_white  int(1)
student_temp_status int(1)
student_temp_contact_name   text
student_temp_address    text
student_temp_city   text
student_temp_state  text
student_temp_zip    varchar(30)

SQL delete statement

DELETE st FROM students_temp st
    INNER JOIN students s ON student_temp_local_id=student_local_id
    WHERE student_temp_nj=student_nj 
    AND student_temp_first=student_first 
    AND student_temp_last=student_last
    AND student_temp_grade=student_grade
    AND student_temp_school=student_school
    AND student_temp_district=student_district
    AND student_temp_gender=student_gender
    AND student_temp_eth_american_indian=student_eth_american_indian
    AND student_temp_eth_asian=student_eth_asian
    AND student_temp_eth_black=student_eth_black
    AND student_temp_eth_hispanic=student_eth_hispanic
    AND student_temp_eth_pacific=student_eth_pacific
    AND student_temp_eth_white=student_eth_white
    AND student_temp_status=student_status
    AND student_temp_contact_name=student_contact_name
    AND student_temp_address=student_address
    AND student_temp_city=student_city
    AND student_temp_state=student_state
    AND student_temp_zip=student_zip;

Answer 1

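First, the fields used to join the tables (student_local_id and student_temp_local_id) should be indexed. If these fields are identifiers, make the indexes unique. If they are not indexed, then for every row in the first table MySQL has to scan every row in the other table. Simply adding the indexes should speed up your process.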
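As a minimal sketch of that first step, the statements below add an index on each join column. The index names (idx_student_local_id, idx_student_temp_local_id) are made up for illustration, and plain non-unique indexes are used because the question does not say whether local IDs are unique across the whole table.

```sql
-- Index the join column on the permanent table.
-- Index names are hypothetical; use whatever fits your naming scheme.
ALTER TABLE students
    ADD INDEX idx_student_local_id (student_local_id);

-- Index the matching column on the nightly staging table.
ALTER TABLE students_temp
    ADD INDEX idx_student_temp_local_id (student_temp_local_id);
```

If a local ID is only unique within a district, a composite unique index on (student_district, student_local_id) might be the closer fit. That is a guess about the data, not something stated in the question.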

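Second, there is no need to delete the unmodified rows from the temp table in order to update the other one. You only need to update the rows that were modified, and then insert the rows that do not yet exist in students.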

You can do something like this to update:

update students a join students_temp b on b.student_temp_local_id=a.student_local_id
set a.field1 = b.field1, a.field2 = b.field2, ...
where a.field1 <> b.field1 or a.field2 <> b.field2 or ...
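One thing to watch with the <> comparisons: in MySQL, `x <> y` evaluates to NULL rather than true when either side is NULL, so a row whose value changed to or from NULL would be skipped. Whether any of these columns are nullable is not stated in the question, so treat the following as a sketch. It uses the NULL-safe equality operator <=> with three of the real columns; the remaining columns would follow the same pattern.

```sql
-- Update only students whose data differs from the nightly import.
-- NOT (x <=> y) is true whenever the values differ, including NULL vs. non-NULL.
UPDATE students s
JOIN students_temp st
    ON st.student_temp_local_id = s.student_local_id
SET s.student_first = st.student_temp_first,
    s.student_last  = st.student_temp_last,
    s.student_grade = st.student_temp_grade
WHERE NOT (s.student_first <=> st.student_temp_first)
   OR NOT (s.student_last  <=> st.student_temp_last)
   OR NOT (s.student_grade <=> st.student_temp_grade);
```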

And to insert the new ones:

insert into students (field1, field2, ...)
    select field1, field2, ...
    from students_temp a
    where not exists (select 1 from students where student_local_id = a.student_temp_local_id);

Answer 2

It seems that adding an index on the student_local_id field improved the script's transaction time, as reported by New Relic, from roughly 385,000 ms to 5,700 ms!!
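To check whether the index is actually being used, one option is to run EXPLAIN on the same join the DELETE performs. This is only a sketch; the exact plan depends on the MySQL version and the data.

```sql
-- Show the execution plan for the join behind the DELETE.
-- If both tables show type = ALL with key = NULL, each is being fully scanned;
-- once the local-ID column is indexed, at least one table should list that index under `key`.
EXPLAIN
SELECT st.student_temp_id
FROM students_temp st
INNER JOIN students s
    ON st.student_temp_local_id = s.student_local_id;
```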

DELETE with multiple WHERE conditions is very slow
