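I have a table named passive that contains a list of timestamped events for each user. I want to populate the attribute duration, which is the time between the event in the current row and the next event performed by the same user.
I tried the following query: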
UPDATE passive as passive1
SET passive1.duration = (
SELECT min(UNIX_TIMESTAMP(passive2.event_time) - UNIX_TIMESTAMP(passive1.event_time) )
FROM passive as passive2
WHERE passive1.user_id = passive2.user_id
AND UNIX_TIMESTAMP(passive2.event_time) - UNIX_TIMESTAMP(passive1.event_time) > 0
);
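This returns the error message Error 1093 - You can't specify target table for update in FROM.
To work around this limitation, I tried to follow the structure given in https://stackoverflow.com/a/45498/395857, which uses a nested subquery in the FROM clause to create an implicit temporary table, so that it no longer counts as the same table we are updating: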
UPDATE passive
SET passive.duration = (
SELECT *
FROM (SELECT min(UNIX_TIMESTAMP(passive2.event_time) - UNIX_TIMESTAMP(passive.event_time))
FROM passive, passive as passive2
WHERE passive.user_id = passive2.user_id
AND UNIX_TIMESTAMP(passive2.event_time) - UNIX_TIMESTAMP(passive.event_time) > 0
)
AS X
);
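However, the passive table inside the nested subquery does not refer to the same passive as the one in the main query, so every row ends up with the same passive.duration value. How can I refer to the main query's passive from inside the nested subquery? (Or is there perhaps an alternative way to structure such a query?)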
Answer 0 (score: 2)
Try it like this:
UPDATE passive as passive1
SET passive1.duration = (
SELECT min(UNIX_TIMESTAMP(passive2.event_time) - UNIX_TIMESTAMP(passive1.event_time) )
FROM (SELECT * FROM passive) AS passive2
WHERE passive1.user_id = passive2.user_id
AND UNIX_TIMESTAMP(passive2.event_time) - UNIX_TIMESTAMP(passive1.event_time) > 0
)
;
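To sanity-check the logic of this correlated UPDATE without a MySQL server, here is a minimal sketch that runs an equivalent statement against an in-memory SQLite database from Python. Several things here are assumptions for illustration only: SQLite is swapped in for MySQL (it has no Error 1093 restriction, so this checks the result of the query, not the workaround itself), event_time is stored as integer epoch seconds instead of going through UNIX_TIMESTAMP, and the helper name fill_durations is hypothetical.

```python
import sqlite3


def fill_durations(rows):
    """Populate duration for (user_id, event_time) rows and return the table.

    event_time is assumed to be an integer number of seconds.
    Returns (user_id, event_time, duration) tuples in insertion order.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE passive ("
        "user_id INTEGER, event_time INTEGER, duration INTEGER)"
    )
    conn.executemany(
        "INSERT INTO passive (user_id, event_time) VALUES (?, ?)", rows
    )
    # Same idea as the MySQL query: for each row, take the smallest
    # positive gap to a later event of the same user.
    conn.execute(
        """
        UPDATE passive
        SET duration = (
            SELECT MIN(p2.event_time - passive.event_time)
            FROM passive AS p2
            WHERE p2.user_id = passive.user_id
              AND p2.event_time > passive.event_time
        )
        """
    )
    result = conn.execute(
        "SELECT user_id, event_time, duration FROM passive ORDER BY rowid"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [(1, 100), (1, 160), (2, 120), (1, 400), (2, 130)]
    for row in fill_durations(sample):
        print(row)
```

The last event of each user has no later event, so its duration stays NULL (None in Python), which is also what MIN over an empty set yields in the MySQL version.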
Answer 1 (score: 0)
We can solve the problem with a Python script:
#!/usr/bin/python
# -*- coding: utf-8 -*-
'''
An index on (user_id, observed_event_timestamp) is needed to keep the
per-row lookup fast.
'''
# Download it at http://sourceforge.net/projects/mysql-python/?source=dlp
# Tutorials: http://mysql-python.sourceforge.net/MySQLdb.html
# http://zetcode.com/db/mysqlpython/
import datetime

import MySQLdb


def main():
    start = datetime.datetime.now()
    # Two connections: one streams the rows, the other runs lookups and updates.
    db = MySQLdb.connect(user="root", passwd="password", db="db_name")
    db2 = MySQLdb.connect(user="root", passwd="password", db="db_name")
    cursor = db.cursor()
    cursor2 = db2.cursor()
    cursor.execute(
        "SELECT observed_event_id, user_id, observed_event_timestamp "
        "FROM observed_events ORDER BY observed_event_timestamp ASC")
    total = cursor.rowcount
    count = 0
    for primary_key, user_id, timestamp in cursor:
        count += 1
        # Find the next event of the same user; placeholders let the driver
        # quote the values instead of formatting them into the SQL string.
        cursor2.execute(
            "SELECT observed_event_timestamp FROM observed_events "
            "WHERE observed_event_timestamp > %s AND user_id = %s "
            "ORDER BY observed_event_timestamp ASC LIMIT 1",
            (timestamp, user_id))
        row2 = cursor2.fetchone()
        duration = 0
        if row2 is not None:
            duration = (row2[0] - timestamp).total_seconds()
            # Gaps longer than one hour are stored as 0.
            if duration > 60 * 60:
                duration = 0
        cursor2.execute(
            "UPDATE observed_events SET observed_event_duration = %s "
            "WHERE observed_event_id = %s",
            (duration, primary_key))
        if count % 1000 == 0:
            db2.commit()
            elapsed = (datetime.datetime.now() - start).total_seconds()
            print("Percent done: %s%% in %s seconds."
                  % (float(count) / total * 100, elapsed))
    # Commit the rows updated since the last multiple of 1000.
    db2.commit()
    db.close()
    db2.close()
    diff = (datetime.datetime.now() - start).total_seconds()
    print('finished in %s seconds' % diff)


if __name__ == "__main__":
    main()
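The script above issues one SELECT per event. If the events fit in memory, the same durations could probably be computed in a single pass after grouping by user. The following is only a sketch under stated assumptions: timestamps are plain numbers of seconds rather than datetime objects, the function name next_event_durations and the constant MAX_GAP_SECONDS are hypothetical, and writing the results back to observed_events is left out.

```python
from collections import defaultdict

MAX_GAP_SECONDS = 60 * 60  # same one-hour cap as the script above


def next_event_durations(events, max_gap=MAX_GAP_SECONDS):
    """Map each event id to the gap until the user's next later event.

    events: iterable of (event_id, user_id, timestamp_seconds).
    The duration is 0 when there is no later event or the gap exceeds
    max_gap, mirroring the behaviour of the per-row script.
    """
    by_user = defaultdict(list)
    for event_id, user_id, ts in events:
        by_user[user_id].append((ts, event_id))

    durations = {}
    for user_events in by_user.values():
        user_events.sort()
        next_distinct = None  # smallest timestamp strictly after the current one
        prev_ts = None
        # Walk backwards so the next strictly later timestamp is always known,
        # even when several events share the same timestamp.
        for ts, event_id in reversed(user_events):
            if prev_ts is not None and ts < prev_ts:
                next_distinct = prev_ts
            gap = next_distinct - ts if next_distinct is not None else 0
            durations[event_id] = gap if 0 < gap <= max_gap else 0
            prev_ts = ts
    return durations


if __name__ == "__main__":
    sample = [(1, 1, 100), (2, 1, 160), (3, 2, 120), (4, 1, 400)]
    print(next_event_durations(sample))
```

The trade-off is memory for round trips: this needs every (event_id, user_id, timestamp) triple in memory at once, but it avoids one lookup query per row.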