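I need to display ad blocks on a website according to 30 different criteria (for example: the user's country, the referral URL, the day of the week, ...). The table is very small, no more than 400 records, but it has many NULL values. In more than half of the cases/records no criteria apply and the ad is shown to all visitors; in the rest, conditions apply.
I have two options.
Option 1 - one table with many NULL values: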
Table block
block_id | html_to_show | status | views | country_code | if_referral_url | day_of_week | ...
SQL
SELECT * FROM block WHERE (country_code=$country OR country_code IS NULL) AND (if_referral_url = $url OR if_referral_url IS NULL) AND ...
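For illustration, here is a minimal sketch of option 1 with NULL acting as "no restriction". It uses Python's built-in sqlite3 with an in-memory database instead of the PHP setup, only two of the criteria columns, and made-up sample rows; the helper name matching_blocks is hypothetical. The values are bound as parameters rather than interpolated into the SQL string.

```python
import sqlite3

# In-memory database with a cut-down version of the option 1 table.
# A NULL in a criteria column means "this block has no restriction here".
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE block (
           block_id INTEGER PRIMARY KEY,
           html_to_show TEXT,
           country_code TEXT,
           if_referral_url TEXT
       )"""
)
conn.executemany(
    "INSERT INTO block VALUES (?, ?, ?, ?)",
    [
        (1, "<div>everyone</div>", None, None),          # no criteria at all
        (2, "<div>US only</div>", "US", None),           # country restricted
        (3, "<div>DE from example.com</div>", "DE", "example.com"),
    ],
)


def matching_blocks(country, referral_url):
    """Return ids of blocks whose criteria are either unset or satisfied."""
    rows = conn.execute(
        """SELECT block_id FROM block
           WHERE (country_code = ? OR country_code IS NULL)
             AND (if_referral_url = ? OR if_referral_url IS NULL)
           ORDER BY block_id""",
        (country, referral_url),
    )
    return [row[0] for row in rows]


print(matching_blocks("US", "example.org"))
print(matching_blocks("DE", "example.com"))
```

Each criterion needs its own parenthesized (column = value OR column IS NULL) group, so the query grows by one group per criterion but never needs a join.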
Option 2 - 2 tables with 30+ joins:
Table block
block_id | html_to_show | status | views
Table conditions
block_id | cond_name | cond_value
1 | country_code | US
1 | if_referral_url | %google%
1 | from_date | 1489143997
...
SQL
SELECT * FROM block
LEFT JOIN conditions from_date ON from_date.block_id=block.block_id AND from_date.cond_name='from_date'
LEFT JOIN conditions if_referral ON if_referral.block_id=block.block_id AND if_referral.cond_name='if_referral_url'
LEFT JOIN conditions country ON country.block_id=block.block_id AND country.cond_name='country_code'
...20 more joins here
WHERE (from_date.cond_value IS NULL OR from_date.cond_value>$fromDate)
AND (if_referral.cond_value IS NULL OR if_referral.cond_value=$ref)
AND (country.cond_value IS NULL OR country.cond_value=$country)
....
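As a sanity check on the shape of the option 2 query, here is a small sketch using Python's built-in sqlite3 (not the PHP setup from the question). It assumes each join is restricted to one cond_name in its ON clause and that every NULL-or-match test is wrapped in parentheses, because AND binds tighter than OR. Only two conditions are shown, and the sample rows and the helper name matching_blocks are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE block (block_id INTEGER PRIMARY KEY, html_to_show TEXT);
    CREATE TABLE conditions (block_id INTEGER, cond_name TEXT, cond_value TEXT);
    INSERT INTO block VALUES (1, 'everyone'), (2, 'US only'), (3, 'DE on day 1');
    INSERT INTO conditions VALUES
        (2, 'country_code', 'US'),
        (3, 'country_code', 'DE'),
        (3, 'day_of_week', '1');
    """
)

# One LEFT JOIN per condition name. A block without that condition gets a
# NULL cond_value from the join, which the WHERE clause treats as "no limit".
QUERY = """
SELECT block.block_id FROM block
LEFT JOIN conditions country
       ON country.block_id = block.block_id
      AND country.cond_name = 'country_code'
LEFT JOIN conditions dow
       ON dow.block_id = block.block_id
      AND dow.cond_name = 'day_of_week'
WHERE (country.cond_value IS NULL OR country.cond_value = ?)
  AND (dow.cond_value IS NULL OR dow.cond_value = ?)
ORDER BY block.block_id
"""


def matching_blocks(country, day_of_week):
    """Return ids of blocks whose stored conditions all match the visitor."""
    params = (country, str(day_of_week))  # cond_value is stored as text
    return [row[0] for row in conn.execute(QUERY, params)]


print(matching_blocks("US", 3))
print(matching_blocks("DE", 1))
```

This only works cleanly if each block has at most one row per cond_name; with several rows for the same name, the joins would multiply the result rows.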
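The usual answer would probably be to use joins when there are too many NULL values. But what about when there are only 400 records and a large number of required joins (30+)? With such a small table, option 1 carries the cost of the NULL values, but on the other hand 30+ joins may have a negative performance impact. Am I right?
Answer 0 (score: 0)
I ran a test, and it left no doubt that I should not use joins in this case. I created a simple PHP script that fetches results from the db and ran it with Apache ab (http://httpd.apache.org/docs/2.0/programs/ab.html), 1000 requests. With joins it was almost 1000% slower.
For the test without joins I used a single table, test (all fields are varchar 50 except banner_id)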
banner_id | field1 | field2 | field3 | field4 | field5 | field6 | field7 | field8 | field9 | field10
For the joins I used 2 tables. Table test2, with an index on banner_id
banner_id | field | value
and table test3, with an index on banner_id
banner_id | eee
I also added a foreign key (banner_id).
This is the script I used to insert the data into the db:
for($n=0;$n<1000;$n++){
$length = 10;
$rand = substr(str_shuffle(str_repeat($x='0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', ceil($length/strlen($x)) )),1,$length);
$sql = "INSERT INTO test
(banner_id,field2,field3,field4,field5,field6,field7,field8,field9,field10)
VALUES ($n,'2".$n.$rand."','3".$n.$rand."','4".$n.$rand."',5".$n.",
6".$n.",".time().",8".$n.",9".$n.",10".$n.")";
$statement = $this->pdo->prepare($sql);
$statement->execute();
$sql = "INSERT INTO test2 (banner_id,field,value)
VALUES ($n,'field2','2".$n.$rand."'),
($n,'field3','3".$n.$rand."'),
($n,'field4','4".$n.$rand."'),
($n,'field5',5".$n."),
($n,'field6',6".$n."),
($n,'field7',".time()."),
($n,'field8',8".$n."),
($n,'field9',9".$n."),
($n,'field10',10".$n.")";
$statement = $this->pdo->prepare($sql);
$statement->execute();
$sql = "INSERT INTO test3 (banner_id,eee)
VALUES ($n,'".$n.$rand."')";
$statement = $this->pdo->prepare($sql);
$statement->execute();
}
This is the SQL without joins:
$sql = "SELECT * FROM test WHERE
(field2 IS NULL OR field2>5)
AND (field3 IS NULL OR field3>5)
AND (field4 IS NULL OR field4>5)
AND (field5 IS NULL OR (field5>10 AND field5<".rand(200,3300)."))
AND (field6 IS NULL OR (field6>10 AND field6<".rand(200,3300)."))
AND (field7 IS NULL OR (field7>10 AND field7<".rand(1489408024,1499408024)."))
AND (field8 IS NULL OR (field8>10 AND field8<".rand(200,3300)."))
AND (field9 IS NULL OR (field9>10 AND field9<".rand(200,3300)."))
AND (field10 IS NULL OR (field10>10 AND field10<1100))
";
And this is the SQL with joins:
$sql = "SELECT * FROM test3
LEFT JOIN test2 cond2 ON cond2.banner_id = test3.banner_id AND cond2.field='field2'
LEFT JOIN test2 cond3 ON cond3.banner_id = test3.banner_id AND cond3.field='field3'
LEFT JOIN test2 cond4 ON cond4.banner_id = test3.banner_id AND cond4.field='field4'
LEFT JOIN test2 cond5 ON cond5.banner_id = test3.banner_id AND cond5.field='field5'
LEFT JOIN test2 cond6 ON cond6.banner_id = test3.banner_id AND cond6.field='field6'
LEFT JOIN test2 cond7 ON cond7.banner_id = test3.banner_id AND cond7.field='field7'
LEFT JOIN test2 cond8 ON cond8.banner_id = test3.banner_id AND cond8.field='field8'
LEFT JOIN test2 cond9 ON cond9.banner_id = test3.banner_id AND cond9.field='field9'
LEFT JOIN test2 cond10 ON cond10.banner_id = test3.banner_id AND cond10.field='field10'
WHERE (cond2.value IS NULL OR cond2.value>5)
AND (cond3.value IS NULL OR cond3.value>5)
AND (cond4.value IS NULL OR cond4.value>5)
AND (cond5.value IS NULL OR (cond5.value>10 AND cond5.value<".rand(200,3300)."))
AND (cond6.value IS NULL OR (cond6.value>10 AND cond6.value<".rand(200,3300)."))
AND (cond7.value IS NULL OR (cond7.value>10 AND cond7.value<".rand(1489408024,1499408024)."))
AND (cond8.value IS NULL OR (cond8.value>10 AND cond8.value<".rand(200,3300)."))
AND (cond9.value IS NULL OR (cond9.value>10 AND cond9.value<".rand(200,3300)."))
AND (cond10.value IS NULL OR (cond10.value>10 AND cond10.value<1100))
GROUP BY test3.banner_id";
This test shows me that, for this case, I clearly should not use the normalized structure and should use just 1 table.
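For anyone who wants to repeat this kind of comparison without a web server, below is a rough sketch that builds both layouts in an in-memory SQLite database and times the two query shapes. It is not the PHP/ab test described above: the table names (wide, banner, cond), the three integer fields, the 400 rows and the thresholds are all assumptions chosen to keep it short, and timings from SQLite will not necessarily resemble those of another database.

```python
import random
import sqlite3
import time

random.seed(0)
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE wide (banner_id INTEGER PRIMARY KEY,
                       field2 INTEGER, field3 INTEGER, field4 INTEGER);
    CREATE TABLE banner (banner_id INTEGER PRIMARY KEY);
    CREATE TABLE cond (banner_id INTEGER, field TEXT, value INTEGER);
    CREATE INDEX cond_banner ON cond (banner_id);
    """
)

FIELDS = ("field2", "field3", "field4")
for banner_id in range(400):
    # Each condition is set about half the time; None means "no restriction".
    values = [random.randint(0, 100) if random.random() < 0.5 else None
              for _ in FIELDS]
    conn.execute("INSERT INTO wide VALUES (?, ?, ?, ?)", (banner_id, *values))
    conn.execute("INSERT INTO banner VALUES (?)", (banner_id,))
    # The normalized layout stores only the conditions that are actually set.
    conn.executemany(
        "INSERT INTO cond VALUES (?, ?, ?)",
        [(banner_id, f, v) for f, v in zip(FIELDS, values) if v is not None],
    )

# Single-table query: one (IS NULL OR test) group per field.
WIDE_SQL = (
    "SELECT banner_id FROM wide WHERE "
    + " AND ".join(f"({f} IS NULL OR {f} > ?)" for f in FIELDS)
    + " ORDER BY banner_id"
)

# Join query: one LEFT JOIN per field, filtered by field name in the ON clause.
JOIN_SQL = (
    "SELECT banner.banner_id FROM banner "
    + " ".join(
        f"LEFT JOIN cond c{i} ON c{i}.banner_id = banner.banner_id "
        f"AND c{i}.field = '{f}'"
        for i, f in enumerate(FIELDS)
    )
    + " WHERE "
    + " AND ".join(f"(c{i}.value IS NULL OR c{i}.value > ?)"
                   for i in range(len(FIELDS)))
    + " ORDER BY banner.banner_id"
)


def run(sql, thresholds, repeats=200):
    """Execute the query `repeats` times; return last result and elapsed time."""
    start = time.perf_counter()
    for _ in range(repeats):
        rows = [row[0] for row in conn.execute(sql, thresholds)]
    return rows, time.perf_counter() - start


thresholds = (50, 50, 50)
wide_rows, wide_time = run(WIDE_SQL, thresholds)
join_rows, join_time = run(JOIN_SQL, thresholds)
print(f"single table: {len(wide_rows)} rows in {wide_time:.4f}s")
print(f"joins:        {len(join_rows)} rows in {join_time:.4f}s")
```

Both queries should return the same banner ids, which is worth checking before comparing timings; otherwise the two layouts are not being asked the same question.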