My problem is populating the last two columns of my derived_daily table (which is derived from the source table, prices_daily).
(Source table) prices_daily:
sequence   INT(11)       NO   PRI  AUTO_INCREMENT
symbol     VARCHAR(6)    NO   MUL
date       DATE          NO   MUL
high       DECIMAL(8,2)  YES
low        DECIMAL(8,2)  YES
close_adj  DECIMAL(8,2)  YES
(Target table) derived_daily:
symbol           VARCHAR(6)    YES
date             DATE          NO
mov_avg10        DECIMAL(8,2)  YES
std_dev10        DECIMAL(8,2)  YES
range_daily      DECIMAL(8,2)  YES
range_std_dev30  DECIMAL(8,2)  YES
I can populate the first 4 columns with the following code:
INSERT INTO derived_daily(symbol, date, mov_avg10, std_dev10)
(
SELECT t1.symbol, t1.date, AVG(t2.close_adj) AS mov_avg10, STDDEV(t2.close_adj)
AS std_dev10
FROM prices_daily t1 LEFT OUTER JOIN prices_daily t2
ON t2.symbol = t1.symbol AND (t1.sequence - t2.sequence)BETWEEN 0 AND 9
WHERE t1.symbol = 'C' GROUP BY t1.date
) ;
But when I try to populate 'range_daily' in the derived table with the following:
INSERT INTO derived_daily(symbol, date, range_daily)
(
SELECT t2.symbol, t2.date, (high - low) AS range_daily
FROM prices_daily t1
LEFT OUTER JOIN derived_daily t2
ON t2.symbol = t1.symbol AND t2.date = t1.date
WHERE t2.symbol = 'C'
ORDER BY t2.date
) ;
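It puts the values in the correct column, but in new rows at the bottom of the table rather than in the existing rows that are missing data (range_daily and, eventually, range_std_dev30). I've tried a few tweaks, and mostly I get "ERROR 1364: Field 'date' doesn't have a default value". I want to populate the last 2 columns so they line up with what I've already put in the table (the same rows), not in new rows at the bottom.
I've spent a long time looking through related questions and answers but still can't solve my problem (noob... apologies). Any help or suggestions are greatly appreciated! P.S. I was trying to get the question formatted properly, but ended up coming to the main place (this site is the best) because I was getting frustrated, haha.
Thanks, Tom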
Answer 0 (score: 0)
I think you need to UPDATE the existing rows rather than INSERT new rows.
For example:
UPDATE ( SELECT t1.symbol
              , t1.date
              , AVG(t2.close_adj)    AS mov_avg10
              , STDDEV(t2.close_adj) AS std_dev10
           FROM prices_daily t1
           LEFT
           JOIN prices_daily t2
             ON t2.symbol = t1.symbol
            AND (t1.sequence - t2.sequence) BETWEEN 0 AND 9
          WHERE t1.symbol = 'C'
          GROUP
             BY t1.symbol
              , t1.date
       ) s
  JOIN derived_daily t
    ON t.symbol = s.symbol
   AND t.date = s.date
   SET t.mov_avg10 = s.mov_avg10
     , t.std_dev10 = s.std_dev10
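To test, you can replace the UPDATE keyword with SELECT t.symbol, t.date, t.mov_avg10, t.std_dev10, s.mov_avg10, s.std_dev10 FROM, and remove the SET clause at the end. That lists the rows of t that would be updated, showing the existing values in each row alongside the new values that would be assigned.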
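Written out in full, that dry run might look like the sketch below. The old_ and new_ column aliases are additions made here for readability (they are not part of the original answer); without them the result set would contain two columns named mov_avg10 and two named std_dev10.

```sql
-- Dry run of the UPDATE above: same derived table and join, but read-only.
SELECT t.symbol
     , t.date
     , t.mov_avg10 AS old_mov_avg10   -- value currently stored in derived_daily
     , t.std_dev10 AS old_std_dev10
     , s.mov_avg10 AS new_mov_avg10   -- value the UPDATE would assign
     , s.std_dev10 AS new_std_dev10
  FROM ( SELECT t1.symbol
              , t1.date
              , AVG(t2.close_adj)    AS mov_avg10
              , STDDEV(t2.close_adj) AS std_dev10
           FROM prices_daily t1
           LEFT
           JOIN prices_daily t2
             ON t2.symbol = t1.symbol
            AND (t1.sequence - t2.sequence) BETWEEN 0 AND 9
          WHERE t1.symbol = 'C'
          GROUP BY t1.symbol, t1.date
       ) s
  JOIN derived_daily t
    ON t.symbol = s.symbol
   AND t.date = s.date
 ORDER BY t.date;
```

If the old and new columns already match on every row, the UPDATE would not change anything.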
Followup
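To insert all of the columns at once, you can combine the three queries. It's a monster of a query, but it makes sense once we break it down into its components.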
INSERT INTO derived_daily
     ( symbol
     , `date`
     , mov_avg10
     , std_dev10
     , range_daily
     , range_std_dev30
     )
SELECT q.symbol
     , q.date
     , q.mov_avg10
     , q.std_dev10
     , r.range_daily
     , s.range_std_dev30
  FROM ( -- q: moving average and standard deviation of close_adj over 10 rows
         SELECT t1.symbol
              , t1.date
              , AVG(t2.close_adj)    AS mov_avg10
              , STDDEV(t2.close_adj) AS std_dev10
           FROM prices_daily t1
           LEFT
           JOIN prices_daily t2
             ON t2.symbol = t1.symbol
            AND (t1.sequence - t2.sequence) BETWEEN 0 AND 9
          WHERE t1.symbol = 'C'
          GROUP BY t1.symbol, t1.date
       ) q
  LEFT
  JOIN ( -- r: daily range (high minus low)
         SELECT t3.symbol
              , t3.date
              , (t3.high - t3.low) AS range_daily
           FROM prices_daily t3
          WHERE t3.symbol = 'C'
          -- GROUP BY t3.symbol, t3.date
       ) r
    ON r.symbol = q.symbol
   AND r.date = q.date
  LEFT
  JOIN ( -- s: standard deviation of the daily range over a 30-day window
         SELECT t4.symbol
              , t4.date
              , STDDEV(d.range_daily) AS range_std_dev30
           FROM prices_daily t4
           LEFT
           JOIN ( SELECT t5.symbol
                       , t5.date
                       , (t5.high - t5.low) AS range_daily
                    FROM prices_daily t5
                   WHERE t5.symbol = 'C'
                   -- GROUP BY t5.symbol, t5.date
                ) d
             ON d.symbol = t4.symbol
            -- window: from 30 days before t4.date through t4.date
            AND d.date <= t4.date
            AND d.date >= t4.date + INTERVAL -30 DAY
          WHERE t4.symbol = 'C'
          GROUP BY t4.symbol, t4.date
       ) s
    ON s.symbol = q.symbol
   AND s.date = q.date
 ORDER BY q.date, q.symbol
The query is basically of the form:
INSERT INTO target
SELECT q.key
     , q.val
     , r.val
     , s.val
  FROM q
  LEFT
  JOIN r ON r.key = q.key
  LEFT
  JOIN s ON s.key = q.key
This assumes that (symbol, date) is unique in the (source) prices_daily table, and also unique in the (target) derived_daily table.
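One way to check that uniqueness assumption before running anything is to look for duplicated (symbol, date) pairs in each table. This is only a sketch; the row_count alias is made up here. If the INSERT from the question has already been run, derived_daily may well contain such duplicates.

```sql
-- Any row returned here is a (symbol, date) pair that appears more than once.
SELECT symbol, `date`, COUNT(*) AS row_count
  FROM prices_daily
 GROUP BY symbol, `date`
HAVING COUNT(*) > 1;

SELECT symbol, `date`, COUNT(*) AS row_count
  FROM derived_daily
 GROUP BY symbol, `date`
HAVING COUNT(*) > 1;
```

Both queries returning no rows is consistent with the assumption. Adding a UNIQUE key on (symbol, date) would be one way to enforce it going forward, but that is a schema change the question does not mention.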
For testing, we can separately run each of the queries that supply the values for q, r, and s.
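As an illustration, the s part could be run on its own with a few extra diagnostic columns. In this sketch the columns rows_in_window, window_start, and window_end are additions for inspection only, and the date window is written to cover the 30 days up to and including each date, which is an assumption about what the window is meant to be.

```sql
-- Standalone check of the 30-day range standard deviation for symbol 'C'.
SELECT t4.symbol
     , t4.date
     , COUNT(d.date)          AS rows_in_window  -- 0 would mean the window matched nothing
     , MIN(d.date)            AS window_start
     , MAX(d.date)            AS window_end
     , STDDEV(d.high - d.low) AS range_std_dev30
  FROM prices_daily t4
  LEFT
  JOIN prices_daily d
    ON d.symbol = t4.symbol
   AND d.date BETWEEN t4.date - INTERVAL 30 DAY AND t4.date
 WHERE t4.symbol = 'C'
 GROUP BY t4.symbol, t4.date
 ORDER BY t4.date;
```

A rows_in_window of 0, or a NULL range_std_dev30 on every row, would suggest the date comparison is pointing the wrong way.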
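I took a guess at the formula for range_std_dev30: the STDDEV of the daily range values from the current date back over the preceding 30 days.
(Using the AUTO_INCREMENT sequence to pick out the previous 10 rows bothers me a little. We don't normally rely on auto_increment id values being in order with no gaps. That is behavior we observe, but it isn't guaranteed, particularly if the rows are inserted in a different order.)
(For the 30 days, I used a date range comparison instead.)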
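If the server is MySQL 8.0 or later (an assumption; nothing in the question states the version), a window function ordered by date is one possible way to get "the current row and the 9 rows before it" without depending on the sequence values. This is a sketch of an alternative technique, not what the queries above do.

```sql
-- 10-row moving average and standard deviation, ordered by date.
-- Requires window function support (MySQL 8.0+).
SELECT p.symbol
     , p.date
     , AVG(p.close_adj)    OVER w AS mov_avg10
     , STDDEV(p.close_adj) OVER w AS std_dev10
  FROM prices_daily p
 WHERE p.symbol = 'C'
WINDOW w AS ( PARTITION BY p.symbol
              ORDER BY p.date
              ROWS BETWEEN 9 PRECEDING AND CURRENT ROW )
 ORDER BY p.date;
```

As with the self-join, the first nine dates for a symbol are computed over fewer than ten rows.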
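Coming back to the original question of filling range_daily in rows that already exist, the same UPDATE with JOIN pattern could be applied directly. A minimal sketch, assuming (symbol, date) identifies a single row in each table:

```sql
-- Fill range_daily in the existing derived_daily rows; no new rows are added.
UPDATE derived_daily t
  JOIN prices_daily p
    ON p.symbol = t.symbol
   AND p.date   = t.date
   SET t.range_daily = p.high - p.low
 WHERE t.symbol = 'C';
```

Because this is an UPDATE, rows with no match in derived_daily are simply skipped, and the "Field 'date' doesn't have a default value" error from the INSERT attempts should not come up.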