Mysql 30+加入非常小的表?

时间:2017-03-10 11:12:45

标签: mysql left-join

我需要根据30个不同的标准将广告块显示到网站中(例如:用户的国家/地区,推荐网址,星期几,......)。表大小非常小,不超过400条记录,但许多空值。在超过一半的案例/记录中,不会应用任何标准,广告将显示给所有访问者,但在一半情况下将适用条件。

我有两个选择。

选项1 - 包含许多空值的表:

Table block

block_id | html_to_show | status | views | country_code | if_referral_url | day_of_week | ...

SQL

SELECT * FROM block WHERE (country_code=$country OR country_code IS NULL) AND (if_referral_url = $url OR if_referral_url IS NULL) AND ...

选项2 - 包含30个以上联接的2个表:

Table block

block_id | html_to_show | status | views

Table conditions

block_id | cond_name | cond_value
1        | country_code | US
1        | if_referral_url | %google%
1        | from_date | 1489143997
...

SQL

SELECT * FROM block
LEFT JOIN conditions from_date ON from_date.block_id=block.block_id
LEFT JOIN conditions if_referral ON if_referral.block_id=block.block_id
LEFT JOIN conditions country ON country.block_id=block.block_id
...20 more joins here
WHERE from_date.from_date IS NULL OR from_date.from_date>$fromDate
AND if_referral.if_referral IS NULL OR if_referral.if_referral=$ref
AND country.country_code IS NULL OR country.country_code=$country
....

通常答案可能是在有太多空值时使用连接。但是,如果只有400条记录和大量的必需连接(30+),该怎么办?在这样一个小表中,如果有空值,它可能会失败,但另一方面,30+连接可能会导致一些负面的性能影响。我是对的吗?

1 个答案:

答案 0 :(得分:0)

我做了一个测试,毫无疑问地表明在这种情况下我不应该使用连接。我创建了一个简单的PHP脚本,从db获取结果。我使用了apache ab(http://httpd.apache.org/docs/2.0/programs/ab.html),1000个请求。加入它几乎慢了1000%。

对于没有连接的测试,我使用单表测试(除了字段banner_id外,所有字段都是varchar 50)

banner_id | field1 | field2 | field3 | field4 | field5 | field6 | field7 | field8 | field9 | field10

对于连接,我使用了2个表。表test2,其中包含banner_id的索引

banner_id | field | value

和表test3,索引为banner_id

banner_id | eee

我还添加了外键(banner_id)。

这是我将数据插入db:

的脚本
for($n=0;$n<1000;$n++){
  $length = 10;
  $rand = substr(str_shuffle(str_repeat($x='0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', ceil($length/strlen($x)) )),1,$length);
  $sql = "INSERT INTO test
          (banner_id,field2,field3,field4,field5,field6,field7,field8,field9,field10)
          VALUES ($n,'2".$n.$rand."','3".$n.$rand."','4".$n.$rand."',5".$n.",
          6".$n.",".time().",8".$n.",9".$n.",10".$n.")";
  $statement = $this->pdo->prepare($sql);
  $statement->execute();
  $sql = "INSERT INTO test2 (banner_id,field,value)
          VALUES ($n,'field2','2".$n.$rand."'),
                  ($n,'field3','3".$n.$rand."'),
                  ($n,'field4','4".$n.$rand."'),
                  ($n,'field5',5".$n."),
                  ($n,'field6',6".$n."),
                  ($n,'field7',".time()."),
                  ($n,'field8',8".$n."),
                  ($n,'field9',9".$n."),
                  ($n,'field10',10".$n.")";
  $statement = $this->pdo->prepare($sql);
  $statement->execute();
  $sql = "INSERT INTO test3 (banner_id,eee)
          VALUES ($n,'".$n.$rand."')";
  $statement = $this->pdo->prepare($sql);
  $statement->execute();                    
}

这是没有连接的SQL:

$sql = "SELECT * FROM test WHERE
        (field2 IS NULL OR field2>5)
        AND (field3 IS NULL OR field3>5)
        AND (field4 IS NULL OR field4>5)
        AND (field5 IS NULL OR (field5>10 AND field5<".rand(200,3300)."))
        AND (field6 IS NULL OR (field6>10 AND field6<".rand(200,3300)."))
        AND (field7 IS NULL OR (field7>10 AND field7<".rand(1489408024,1499408024)."))
        AND (field8 IS NULL OR (field8>10 AND field8<".rand(200,3300)."))
        AND (field9 IS NULL OR (field9>10 AND field9<".rand(200,3300)."))
        AND (field10 IS NULL OR (field10>10 AND field10<1100))
        ";

这与加入:

$sql = "SELECT * FROM test3
LEFT JOIN test2 cond2 ON cond2.banner_id = test3.banner_id AND cond2.field='field2'
LEFT JOIN test2 cond3 ON cond3.banner_id = test3.banner_id AND cond3.field='field3'
LEFT JOIN test2 cond4 ON cond4.banner_id = test3.banner_id AND cond4.field='field4'
LEFT JOIN test2 cond5 ON cond5.banner_id = test3.banner_id AND cond5.field='field5'
LEFT JOIN test2 cond6 ON cond6.banner_id = test3.banner_id AND cond6.field='field6'
LEFT JOIN test2 cond7 ON cond7.banner_id = test3.banner_id AND cond7.field='field7'
LEFT JOIN test2 cond8 ON cond8.banner_id = test3.banner_id AND cond8.field='field8'
LEFT JOIN test2 cond9 ON cond9.banner_id = test3.banner_id AND cond9.field='field9'
LEFT JOIN test2 cond10 ON cond10.banner_id = test3.banner_id AND cond10.field='field10'
WHERE (cond2.value IS NULL OR cond2.value>5)
  AND (cond3.value IS NULL OR cond3.value>5)
  AND (cond4.value IS NULL OR cond4.value>5)
  AND (cond5.value IS NULL OR (cond5.value>10 AND cond5.value<".rand(200,3300).")) 
  AND (cond6.value IS NULL OR (cond6.value>10 AND cond6.value<".rand(200,3300).")) 
  AND (cond7.value IS NULL OR (cond7.value>10 AND cond7.value<".rand(1489408024,1499408024).")) 
  AND (cond8.value IS NULL OR (cond8.value>10 AND cond8.value<".rand(200,3300)."))  
  AND (cond9.value IS NULL OR (cond9.value>10 AND cond9.value<".rand(200,3300).")) 
  AND (cond10.value IS NULL OR (cond10.value>10 AND cond10.value<1100))
  GROUP BY test3.banner_id"; 

这个测试显示我显然不应该使用规范化结构而只能使用1个表。