I have 3 MySQL tables:
[block_value]
[metadata]
[metadata_value]
In these tables there are pairs of the form metadata_name = value.
Lists of pairs are grouped into blocks (id_block_value).
(A) If I want height = 1080:
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "height" and value = "1080");
+---------+
| file_id |
+---------+
| 21 |
| 22 |
(...)
| 6962 |
(...)
| 8146 |
| 8147 |
+---------+
794 rows in set (0.06 sec)
(B) If I want file extension = mpeg:
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "file extension" and value = "mpeg");
+---------+
| file_id |
+---------+
| 6889 |
| 6898 |
| 6962 |
+---------+
3 rows in set (0.06 sec)
But if I want to combine A and B with boolean operators, I do not know what the best approach is.
For A or B, I tried A union B, which seems to do the trick.
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "height" and value = "1080")
UNION
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "file extension" and value = "mpeg");
+---------+
| file_id |
+---------+
| 21 |
| 22 |
| 34 |
(...)
| 6889 |
| 6898 |
+---------+
796 rows in set (0.13 sec)
For A and B, since there is no INTERSECT in MySQL, I tried A and file_id in (B), but look at the performance (> 4 min)...
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "height" and value = "1080")
and file_id in(
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "file extension" and value = "mpeg"));
+---------+
| file_id |
+---------+
| 6962 |
+---------+
1 row in set (4 min 36.22 sec)
I also tried B and file_id in (A), which is much better, but I will never know in advance which one to put first.
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "file extension" and value = "mpeg")
and file_id in(
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "height" and value = "1080"));
+---------+
| file_id |
+---------+
| 6962 |
+---------+
1 row in set (0.75 sec)
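So... what should I do now? Is there a better way to do boolean operations? Any tips? Did I miss something?
Edit: what the data looks like:
This database contains one row in the FILE table for each audio/video file inserted:
The METADATA table has one row for each potential piece of information:
Then, a row in the BLOCK table defines a container:
A file can contain several metadata blocks; the BLOCK_VALUE table contains the BLOCK instances:
In this example, file 10 has 5 blocks: 3 video (101) + 1 audio (102) + 1 general (104)
The values are stored in METADATA_VALUE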
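To make the layout concrete, here is a minimal sketch of the three tables used in the queries above, loaded into an in-memory SQLite database from Python as a stand-in for MySQL. The column names are taken from the queries; the column types and the sample rows (files 10, 20 and 30) are assumptions made up for illustration, and the FILE and BLOCK tables are left out because their columns are not shown.

```python
import sqlite3

# Hypothetical sample data; only the column names come from the queries above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metadata (id_metadata INTEGER PRIMARY KEY, metadata_name TEXT);
CREATE TABLE block_value (id_block_value INTEGER PRIMARY KEY, file_id INTEGER);
CREATE TABLE metadata_value (meta_id INTEGER, blockvalue_id INTEGER, value TEXT);
INSERT INTO metadata VALUES (1, 'height'), (2, 'file extension');
-- file 10: two 1080 video blocks + mpeg; file 20: two 1080 video blocks + avi; file 30: mpeg only
INSERT INTO block_value VALUES (1, 10), (2, 10), (3, 10), (4, 20), (5, 20), (6, 20), (7, 30);
INSERT INTO metadata_value VALUES
  (1, 1, '1080'), (1, 2, '1080'), (2, 3, 'mpeg'),
  (1, 4, '1080'), (1, 5, '1080'), (2, 6, 'avi'),
  (2, 7, 'mpeg');
""")

# Same shape as queries (A) and (B), parameterised on the name/value pair.
PAIR_SQL = """
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE metadata_name = ? AND value = ?
ORDER BY file_id
"""

files_a = [row[0] for row in conn.execute(PAIR_SQL, ("height", "1080"))]
files_b = [row[0] for row in conn.execute(PAIR_SQL, ("file extension", "mpeg"))]
print(files_a)  # [10, 20]
print(files_b)  # [10, 30]
```

With this toy data, A or B should give files 10, 20 and 30, A and B should give only file 10, and A and not B should give only file 20.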
Answer 0 (score: 1)
For the "OR", why not try it without the UNION... am I missing something?
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "height" and value = "1080")
OR (metadata_name = "file extension" and value = "mpeg")
For the "AND", inner join on the metadata table twice, to make sure you only get the file_ids that satisfy both conditions...
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
AND (M.metadata_name = "height" and MV.value = "1080")
INNER JOIN metadata M2 ON MV.meta_id = M2.id_metadata
AND (M2.metadata_name = "file extension" and MV.value = "mpeg")
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
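One caveat with the query above: both joins filter the same metadata_value row, and a single row holds only one value, so it cannot be "1080" and "mpeg" at once. A possible variation, sketched here under assumptions, joins a second copy of metadata_value that is reached through block_value on the same file_id. The sketch runs on in-memory SQLite from Python as a stand-in for MySQL; the sample rows and the aliases MV1/MV2, BV1/BV2 are hypothetical.

```python
import sqlite3

# Hypothetical sample data; only the column names come from the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metadata (id_metadata INTEGER PRIMARY KEY, metadata_name TEXT);
CREATE TABLE block_value (id_block_value INTEGER PRIMARY KEY, file_id INTEGER);
CREATE TABLE metadata_value (meta_id INTEGER, blockvalue_id INTEGER, value TEXT);
INSERT INTO metadata VALUES (1, 'height'), (2, 'file extension');
INSERT INTO block_value VALUES (1, 10), (2, 10), (3, 10), (4, 20), (5, 20), (6, 20), (7, 30);
INSERT INTO metadata_value VALUES
  (1, 1, '1080'), (1, 2, '1080'), (2, 3, 'mpeg'),
  (1, 4, '1080'), (1, 5, '1080'), (2, 6, 'avi'),
  (2, 7, 'mpeg');
""")

# Both conditions applied to the SAME metadata_value row: never satisfiable.
same_row_sql = """
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
  AND (M.metadata_name = 'height' AND MV.value = '1080')
INNER JOIN metadata M2 ON MV.meta_id = M2.id_metadata
  AND (M2.metadata_name = 'file extension' AND MV.value = 'mpeg')
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
"""

# Each condition gets its own metadata_value row; the rows are tied by file_id.
two_rows_sql = """
SELECT DISTINCT BV1.file_id
FROM metadata_value MV1
INNER JOIN metadata M1 ON MV1.meta_id = M1.id_metadata
INNER JOIN block_value BV1 ON MV1.blockvalue_id = BV1.id_block_value
INNER JOIN block_value BV2 ON BV2.file_id = BV1.file_id
INNER JOIN metadata_value MV2 ON MV2.blockvalue_id = BV2.id_block_value
INNER JOIN metadata M2 ON MV2.meta_id = M2.id_metadata
WHERE M1.metadata_name = 'height' AND MV1.value = '1080'
  AND M2.metadata_name = 'file extension' AND MV2.value = 'mpeg'
ORDER BY BV1.file_id
"""

same_row_ids = [row[0] for row in conn.execute(same_row_sql)]
two_rows_ids = [row[0] for row in conn.execute(two_rows_sql)]
print(same_row_ids)  # []
print(two_rows_ids)  # [10]
```

How this performs on the real tables would depend on the available indexes, which are not shown in the question.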
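For "A" and not "B", use a left join instead of an inner join for the "B" condition. Add a WHERE clause specifying that you do not want the results of "B":
SELECT DISTINCT file_id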
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
AND (M.metadata_name = "height" and MV.value = "1080")
LEFT JOIN metadata M2 ON MV.meta_id = M2.id_metadata
AND (M2.metadata_name = "file extension" and MV.value = "mpeg")
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE M2.id_metadata is NULL
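Assuming id_metadata is unique, the left join above tests the "B" condition against the same metadata_value row that already matched "A", so M2 would always come back NULL and every "A" file would pass. One alternative that checks the whole file is a correlated NOT EXISTS, sketched below on in-memory SQLite from Python as a stand-in for MySQL. The sample rows and the MV2/M2/BV2 aliases are hypothetical.

```python
import sqlite3

# Hypothetical sample data; only the column names come from the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metadata (id_metadata INTEGER PRIMARY KEY, metadata_name TEXT);
CREATE TABLE block_value (id_block_value INTEGER PRIMARY KEY, file_id INTEGER);
CREATE TABLE metadata_value (meta_id INTEGER, blockvalue_id INTEGER, value TEXT);
INSERT INTO metadata VALUES (1, 'height'), (2, 'file extension');
INSERT INTO block_value VALUES (1, 10), (2, 10), (3, 10), (4, 20), (5, 20), (6, 20), (7, 30);
INSERT INTO metadata_value VALUES
  (1, 1, '1080'), (1, 2, '1080'), (2, 3, 'mpeg'),
  (1, 4, '1080'), (1, 5, '1080'), (2, 6, 'avi'),
  (2, 7, 'mpeg');
""")

# A AND NOT B: keep files matching A for which no row of the same file matches B.
a_not_b_sql = """
SELECT DISTINCT BV.file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE M.metadata_name = 'height' AND MV.value = '1080'
  AND NOT EXISTS (
    SELECT 1
    FROM metadata_value MV2
    INNER JOIN metadata M2 ON MV2.meta_id = M2.id_metadata
    INNER JOIN block_value BV2 ON MV2.blockvalue_id = BV2.id_block_value
    WHERE BV2.file_id = BV.file_id
      AND M2.metadata_name = 'file extension' AND MV2.value = 'mpeg'
  )
ORDER BY BV.file_id
"""

a_not_b_ids = [row[0] for row in conn.execute(a_not_b_sql)]
print(a_not_b_ids)  # [20]
```

File 10 is excluded because one of its blocks carries the mpeg extension, even though that block is not the one holding the height.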
Answer 1 (score: 1)
OR version: (shamelessly copied and pasted from ChrisCamp's answer)
SELECT distinct file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "height" and value = "1080")
OR (metadata_name = "file extension" and value = "mpeg")
AND version:
SELECT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "height" and value = "1080")
OR (metadata_name = "file extension" and value = "mpeg")
group by file_id having count(1)>1
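2 short notes about the AND version:
This is actually a way to define the intersection in terms of the previous ORing..
When ANDing you have 3 possibilities:
So I just removed the distinct clause, added a group by, and selected the records that occur twice.
Or just go on with the exists clause :)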
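One thing to watch with counting rows: the question mentions that a file can have several blocks of the same kind (three video blocks in its example), so a file could reach a count of 2 from the height condition alone. A possible variation, which assumes the two conditions use different metadata names, counts the distinct names matched per file instead. The sketch below compares both HAVING clauses on in-memory SQLite from Python, as a stand-in for MySQL, with hypothetical sample rows.

```python
import sqlite3

# Hypothetical sample data: file 20 has two 1080 blocks but no mpeg extension.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metadata (id_metadata INTEGER PRIMARY KEY, metadata_name TEXT);
CREATE TABLE block_value (id_block_value INTEGER PRIMARY KEY, file_id INTEGER);
CREATE TABLE metadata_value (meta_id INTEGER, blockvalue_id INTEGER, value TEXT);
INSERT INTO metadata VALUES (1, 'height'), (2, 'file extension');
INSERT INTO block_value VALUES (1, 10), (2, 10), (3, 10), (4, 20), (5, 20), (6, 20), (7, 30);
INSERT INTO metadata_value VALUES
  (1, 1, '1080'), (1, 2, '1080'), (2, 3, 'mpeg'),
  (1, 4, '1080'), (1, 5, '1080'), (2, 6, 'avi'),
  (2, 7, 'mpeg');
""")

# The OR query grouped per file; only the HAVING clause changes between runs.
GROUPED_SQL = """
SELECT BV.file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (M.metadata_name = 'height' AND MV.value = '1080')
   OR (M.metadata_name = 'file extension' AND MV.value = 'mpeg')
GROUP BY BV.file_id
HAVING {having}
ORDER BY BV.file_id
"""

def run(having):
    """Run the grouped OR query with the given HAVING condition."""
    return [row[0] for row in conn.execute(GROUPED_SQL.format(having=having))]

by_row_count = run("COUNT(1) > 1")
by_distinct_names = run("COUNT(DISTINCT M.metadata_name) = 2")
print(by_row_count)       # [10, 20]
print(by_distinct_names)  # [10]
```

Counting distinct names only works while each condition targets a different metadata_name; two conditions on the same name would need another discriminator.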
Edit following the comments:
OK, trying to keep it simple... the id_block_values that satisfy either of the two conditions:
SELECT BLOCKVALUE_ID
FROM METADATA_VALUE MV
INNER JOIN
METADATA M
ON MV.META_ID=M.ID_METADATA
WHERE (METADATA_NAME='height' AND VALUE='1080')
OR (METADATA_NAME='file extension' AND VALUE='mpeg')
If there are more than 2 records here, there is a problem (duplicated metadata).
Now the ANDing:
SELECT FILE_ID
FROM BLOCK_VALUE BV
INNER JOIN
( SELECT BLOCKVALUE_ID
FROM METADATA_VALUE MV
INNER JOIN
METADATA M
ON MV.META_ID=M.ID_METADATA
WHERE (METADATA_NAME='height' AND VALUE='1080')
OR (METADATA_NAME='file extension' AND VALUE='mpeg')
) X
ON BV.ID_BLOCK_VALUE=X.BLOCKVALUE_ID
GROUP BY FILE_ID HAVING COUNT(1)>1
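Still, I cannot understand why the previous query does not work.. I fear that if you also remove the DISTINCT clause from the OR query, you will see some records more than twice, which makes no sense. By the way, could you please tell me what the primary keys of these tables are?
Answer 2 (score: 1)
I am opening a new post just to keep the "correct" solution tidy..
OK, sorry, it seems I was making wrong assumptions. I never thought that two blocks could be defined in exactly the same way.
So, since I am a copycat and I like to derive the AND from the OR solution (:P), I came up with these two solutions..
ORing: I prefer Chris's solution...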
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "height" and value = "1080")
OR (metadata_name = "file extension" and value = "mpeg")
ANDing: I will use your ORing version (the UNION version)
SELECT FILE_ID FROM (
SELECT DISTINCT 1, file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "height" and value = "1080")
UNION ALL
SELECT DISTINCT 2, file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "file extension" and value = "mpeg")
) IHATEAND
GROUP BY FILE_ID
HAVING COUNT(1)>1
Which gives:
+---------+
| FILE_ID |
+---------+
| 6962 |
+---------+
1 row in set (0.24 sec)
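Judging by the timings you pasted and mine, it should be a little faster than the ORing (my machine is 3 times slower, time to upgrade -.-), and still much faster than the previous queries ;)
Anyway, how does the ANDing work? Simply put, it runs the two separate queries, tags each record according to the branch it comes from, and then counts the rows per file id: a file id that shows up in both branches is counted twice.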
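The tagging step can be made visible with a small sketch. It runs on in-memory SQLite from Python as a stand-in for MySQL, with hypothetical sample rows; the column alias branch and the derived-table alias t are illustrative names that do not appear in the query above.

```python
import sqlite3

# Hypothetical sample data; only the column names come from the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metadata (id_metadata INTEGER PRIMARY KEY, metadata_name TEXT);
CREATE TABLE block_value (id_block_value INTEGER PRIMARY KEY, file_id INTEGER);
CREATE TABLE metadata_value (meta_id INTEGER, blockvalue_id INTEGER, value TEXT);
INSERT INTO metadata VALUES (1, 'height'), (2, 'file extension');
INSERT INTO block_value VALUES (1, 10), (2, 10), (3, 10), (4, 20), (5, 20), (6, 20), (7, 30);
INSERT INTO metadata_value VALUES
  (1, 1, '1080'), (1, 2, '1080'), (2, 3, 'mpeg'),
  (1, 4, '1080'), (1, 5, '1080'), (2, 6, 'avi'),
  (2, 7, 'mpeg');
""")

# Each branch yields at most one row per file (DISTINCT), tagged 1 or 2.
tagged_sql = """
SELECT DISTINCT 1 AS branch, file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE metadata_name = 'height' AND value = '1080'
UNION ALL
SELECT DISTINCT 2 AS branch, file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE metadata_name = 'file extension' AND value = 'mpeg'
"""

# Intermediate result: (branch, file_id) pairs, sorted in Python for display.
tagged_rows = sorted(conn.execute(tagged_sql).fetchall())
print(tagged_rows)  # [(1, 10), (1, 20), (2, 10), (2, 30)]

# A file that appears in both branches has two rows, so COUNT(*) > 1 keeps it.
and_sql = f"""
SELECT file_id FROM ({tagged_sql}) t
GROUP BY file_id
HAVING COUNT(*) > 1
ORDER BY file_id
"""
and_ids = [row[0] for row in conn.execute(and_sql)]
print(and_ids)  # [10]
```

File 20 has two blocks with height 1080, but DISTINCT collapses them to a single row in branch 1, so it is not mistaken for a match on both conditions.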
Update: another approach, without "naming" the branches:
SELECT FILE_ID FROM (
SELECT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "height" and value = "1080")
GROUP BY FILE_ID
UNION ALL
SELECT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "file extension" and value = "mpeg")
GROUP BY FILE_ID
) IHATEAND
GROUP BY FILE_ID
HAVING COUNT(1)>1
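The result here is the same (and so is the performance). I am taking advantage of the fact that, while UNION removes duplicate rows, UNION ALL does not... which is perfect, since I do not want them removed (and UNION ALL is usually faster than UNION too :)), so I can forget about the naming.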
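The difference between the two operators can be checked in isolation. In this minimal sketch, the tables branch_a and branch_b are hypothetical stand-ins for the two grouped sub-queries, and in-memory SQLite driven from Python stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-ins for the two branches, one row per matching file.
CREATE TABLE branch_a (file_id INTEGER);  -- files with height = 1080
CREATE TABLE branch_b (file_id INTEGER);  -- files with file extension = mpeg
INSERT INTO branch_a VALUES (10), (20);
INSERT INTO branch_b VALUES (10), (30);
""")

# UNION removes the duplicate 10, so the overlap is no longer visible.
union_ids = sorted(row[0] for row in conn.execute(
    "SELECT file_id FROM branch_a UNION SELECT file_id FROM branch_b"))

# UNION ALL keeps both copies of 10.
union_all_ids = sorted(row[0] for row in conn.execute(
    "SELECT file_id FROM branch_a UNION ALL SELECT file_id FROM branch_b"))

# Counting the surviving duplicates gives the intersection.
both_ids = [row[0] for row in conn.execute("""
SELECT file_id FROM (
  SELECT file_id FROM branch_a
  UNION ALL
  SELECT file_id FROM branch_b
) t
GROUP BY file_id
HAVING COUNT(*) > 1
ORDER BY file_id
""")]

print(union_ids)      # [10, 20, 30]
print(union_all_ids)  # [10, 10, 20, 30]
print(both_ids)       # [10]
```

This relies on each branch returning a file at most once, which is what the inner GROUP BY FILE_ID in the query above provides.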