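I'm trying to find the most efficient way to generate a WHERE query. I asked a similar question earlier, but I want to get it right with this one.
Given a set of number ranges, e.g. 1-1000 and 1500-1600, it is simple to create a MySQL WHERE condition that selects the records between those values.
I.e., you would just do:
WHERE (lft BETWEEN 1 AND 1000) OR (lft BETWEEN 1500 AND 1600)
However, what if you want to incorporate NOT BETWEEN as well?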
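For example, if you define several rules such as ...
How can I merge these rules so that I can efficiently generate a WHERE condition?
I would like the WHERE to dissect the ALLOW BETWEEN 1 - 1000 rule so as to create a gap in it, so that it becomes 1-24 and 51-1000. Because the DENY rule is defined after the first rule, it "overrides" the earlier rules.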
As another example, say you have a second set of rules. I would then want to generate a WHERE condition like this:
WHERE (lft BETWEEN 5 AND 9) OR (lft BETWEEN 45 AND 60)
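If you allow 10-50, then a DENY must either be completely consumed by that range, e.g. 34-38, or completely consume the previous rule, e.g. 9-51.
This is because the ranges actually represent lft and rgt values in a nested set model, so they cannot overlap the way I presented them. I didn't think to mention this when asking, but after seeing the working example code below, I can see that this note is actually quite important.
(Edited the example MySQL to use OR instead of AND, per the comments below.)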
Answer 0 (score: 8)
WHERE (foo BETWEEN 1 AND 1000
OR foo BETWEEN 1500 AND 1600
OR foo BETWEEN 1250 AND 1300
) AND (
foo NOT BETWEEN 25 AND 50
)
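You could eke out a little efficiency by building a dissector, but I doubt it would be worth it. All of the WHERE clause items would be resolved from an index, so you would not be preventing any expensive operation from happening (meaning you would not be avoiding a full table scan by doing it).
So rather than spending time building a system to do it for you, just implement the simple solution (OR together the Allows and AND together the Denys) and move on to more important things. If it becomes a problem later, revisit it then. But I really don't think this will become too big of a problem...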
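The simple approach described above could be sketched roughly as follows. This is only an illustration: `buildSimpleWhere` is a hypothetical helper name, the rules are assumed to arrive as lists of `array(low, high)` pairs with integer bounds, and `$field` is assumed to be a trusted column name.

```php
<?php
// Hypothetical helper (not from the original answer): ORs the allow ranges
// together and ANDs the deny ranges on top of them.
function buildSimpleWhere($field, array $allows, array $denys)
{
    if (!$allows) {
        return '1 = 0'; // nothing is allowed, so match no rows
    }
    $allowSql = array();
    foreach ($allows as $range) {
        $allowSql[] = sprintf('%s BETWEEN %d AND %d', $field, $range[0], $range[1]);
    }
    $where = '(' . implode(' OR ', $allowSql) . ')';

    $denySql = array();
    foreach ($denys as $range) {
        $denySql[] = sprintf('%s NOT BETWEEN %d AND %d', $field, $range[0], $range[1]);
    }
    if ($denySql) {
        $where .= ' AND (' . implode(' AND ', $denySql) . ')';
    }
    return $where;
}

echo buildSimpleWhere(
    'foo',
    array(array(1, 1000), array(1500, 1600), array(1250, 1300)),
    array(array(25, 50))
), "\n";
```

One limitation of this sketch is that it ignores rule order: every deny is applied to every allow, so a later allow cannot re-open a range that an earlier deny closed.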
Edit: OK, here is a really simple algorithm. It uses a string as the data store, so it is reasonably efficient for smaller numbers (below 1 million):
class Dissector {
    // One character per number: position N (1-based) is '1' if N is allowed, '0' otherwise.
    protected $range = '';

    public function allow($low, $high) {
        $this->replaceWith($low, $high, '1');
    }

    public function deny($low, $high) {
        $this->replaceWith($low, $high, '0');
    }

    // Finds every maximal run of '1' characters and returns it as a from/to pair.
    public function findRanges() {
        $matches = array();
        preg_match_all(
            '/(?<!1)1+(?!1)/',
            $this->range,
            $matches,
            PREG_OFFSET_CAPTURE
        );
        return $this->decodeRanges($matches[0]);
    }

    public function generateSql($field) {
        $ranges = $this->findRanges();
        $where = array();
        foreach ($ranges as $range) {
            $where[] = sprintf(
                '%s BETWEEN %d AND %d',
                $field,
                $range['from'],
                $range['to']
            );
        }
        return implode(' OR ', $where);
    }

    // Each match is array(matched string, 0-based offset); convert to 1-based bounds.
    protected function decodeRanges(array $matches) {
        $range = array();
        foreach ($matches as $match) {
            $range[] = array(
                'from' => $match[1] + 1,
                'to' => ($match[1] + strlen($match[0]))
            );
        }
        return $range;
    }

    // Pads the string with '0' (denied) so that it is at least $size characters long.
    protected function normalizeLengthTo($size) {
        if (strlen($this->range) < $size) {
            $this->range = str_pad($this->range, $size, '0');
        }
    }

    // Overwrites positions $low..$high (inclusive) with $character.
    protected function replaceWith($low, $high, $character) {
        $this->normalizeLengthTo($high);
        $length = $high - $low + 1;
        $stub = str_repeat($character, $length);
        $this->range = substr_replace($this->range, $stub, $low - 1, $length);
    }
}
Usage:
$d = new Dissector();
$d->allow(1, 10);
$d->deny(5, 15);
$d->allow(10, 20);
var_dump($d->findRanges());
var_dump($d->generateSql('foo'));
Produces:
array(2) {
[0]=>
array(2) {
["from"]=>
int(1)
["to"]=>
int(4)
}
[1]=>
array(2) {
["from"]=>
int(10)
["to"]=>
int(20)
}
}
string(44) "foo BETWEEN 1 AND 4 OR foo BETWEEN 10 AND 20"
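Answer 1 (score: 1)
I spent a little time trying to solve this (it's a neat problem) and came up with the following. It is not optimal, and I don't guarantee it is perfect, but it may get you started: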
<?php
/*$cond = array(
array('a', 5, 15),
array('d', 9, 50),
array('a', 45, 60)
);*/
$cond = array(
array('a', 1, 1000),
array('a', 1500, 1600),
array('a', 1250, 1300),
array('d', 25, 50)
);
$allow = array();
function merge_and_sort(&$allow)
{
usort($allow, function($arr1, $arr2)
{
if ($arr1[0] > $arr2[0])
{
return 1;
}
else
{
return -1;
}
});
$prev = false;
for ($i = 0; $i < count($allow); $i++)
{
$c = $allow[$i];
if ($i > 0 && $allow[$i][0] < $allow[$i - 1][1])
{
if ($allow[$i][1] <= $allow[$i - 1][1])
{
unset($allow[$i]);
}
else
{
$allow[$i - 1][1] = $allow[$i][1];
unset($allow[$i]);
}
}
}
usort($allow, function($arr1, $arr2)
{
if ($arr1[0] > $arr2[0])
{
return 1;
}
else
{
return -1;
}
});
}
function remove_cond(&$allow, $start, $end)
{
for ($i = 0; $i < count($allow); $i++)
{
if ($start > $allow[$i][0])
{
if ($end <= $allow[$i][1])
{
$temp = $allow[$i][1];
$allow[$i][1] = $start;
$allow []= array($end, $temp);
}
else
{
$found = false;
for ($j = $i + 1; $j < count($allow); $j++)
{
if ($end >= $allow[$j][0] && $end < $allow[$j][1])
{
$found = true;
$allow[$j][0] = $end;
}
else
{
unset($allow[$j]);
}
}
if (!$found)
{
$allow[$i][1] = $start;
}
}
}
}
}
foreach ($cond as $c)
{
if ($c[0] == "a")
{
$allow []= array($c[1], $c[2]);
merge_and_sort($allow);
}
else
{
remove_cond($allow, $c[1], $c[2]);
merge_and_sort($allow);
}
}
var_dump($allow);
The final var_dump outputs:
array(4) {
[0]=>
array(2) {
[0]=>
int(1)
[1]=>
int(25)
}
[1]=>
array(2) {
[0]=>
int(50)
[1]=>
int(1000)
}
[2]=>
array(2) {
[0]=>
int(1250)
[1]=>
int(1300)
}
[3]=>
array(2) {
[0]=>
int(1500)
[1]=>
int(1600)
}
}
Edit: uses the first example instead of the second.
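Because BETWEEN is inclusive, the question asks for 1-24 and 51-1000 after denying 25-50, which means the cut points need to step one past the denied bounds. A minimal sketch of that subtraction step, assuming integer bounds and a list of non-overlapping `array(from, to)` ranges (`subtractRange` is a hypothetical helper, not part of the code above):

```php
<?php
// Hypothetical helper: removes the inclusive range [$low, $high] from a list of
// inclusive, non-overlapping array(from, to) ranges with integer bounds.
function subtractRange(array $allowed, $low, $high)
{
    $result = array();
    foreach ($allowed as $range) {
        list($from, $to) = $range;
        if ($to < $low || $from > $high) {
            $result[] = $range; // no overlap with the denied range, keep as-is
            continue;
        }
        if ($from < $low) {
            $result[] = array($from, $low - 1); // part left of the denied range
        }
        if ($to > $high) {
            $result[] = array($high + 1, $to); // part right of the denied range
        }
        // a range fully inside [$low, $high] contributes nothing
    }
    return $result;
}

$allowed = array(array(1, 1000), array(1250, 1300), array(1500, 1600));
print_r(subtractRange($allowed, 25, 50));
```

With the rules from the first example, this yields 1-24, 51-1000, 1250-1300 and 1500-1600.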
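Answer 2 (score: 0)
I would process the instructions one at a time, building up a list of the numbers that should be included, and then finally convert that list into a set of ranges for the WHERE clause. Here is a rough sketch:
// Assumed input shape: each condition is array('include' => bool, 'start' => int, 'end' => int)
function collectNumbers(array $conditions) {
    $numbers = array();
    foreach ($conditions as $condition) {
        if ($condition['include']) {
            // mark every number in the range as included
            for ($i = $condition['start']; $i <= $condition['end']; $i++) {
                $numbers[$i] = true;
            }
        } else {
            // remove every number in the range
            for ($i = $condition['start']; $i <= $condition['end']; $i++) {
                unset($numbers[$i]);
            }
        }
    }
    ksort($numbers);
    return $numbers;
}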
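The last step mentioned above, turning the sorted number list into ranges for the WHERE clause, is not shown in the answer. One possible way to do it, as a sketch only: `numbersToRanges` and `rangesToWhere` are hypothetical names, and the input is assumed to be an array keyed by the included integers in ascending order.

```php
<?php
// Hypothetical helper: collapses consecutive integer keys into array(from, to) pairs.
function numbersToRanges(array $numbers)
{
    $ranges = array();
    $start = null;
    $prev = null;
    foreach (array_keys($numbers) as $n) {
        if ($start === null) {
            $start = $n;                      // first number opens the first run
        } elseif ($n !== $prev + 1) {
            $ranges[] = array($start, $prev); // gap found: close the current run
            $start = $n;
        }
        $prev = $n;
    }
    if ($start !== null) {
        $ranges[] = array($start, $prev);     // close the final run
    }
    return $ranges;
}

// Hypothetical helper: $field is assumed to be a trusted column name.
function rangesToWhere($field, array $ranges)
{
    if (!$ranges) {
        return '1 = 0'; // nothing included, so match no rows
    }
    $parts = array();
    foreach ($ranges as $range) {
        $parts[] = sprintf('%s BETWEEN %d AND %d', $field, $range[0], $range[1]);
    }
    return implode(' OR ', $parts);
}

// Numbers 1-4 and 10-20, keyed the same way as $numbers above.
$numbers = array_fill_keys(array_merge(range(1, 4), range(10, 20)), true);
echo rangesToWhere('foo', numbersToRanges($numbers)), "\n";
```

Keeping one array entry per number is simple but uses memory proportional to the size of the ranges, so this probably only suits fairly small lft/rgt values.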
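Answer 3 (score: 0)
I asked on IRC and received two responses. I'm posting both of them so that other people may benefit (and so that I don't lose them, since I will be studying them in detail shortly).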
<pre><?php
$cond = array(
array('a', 5, 15),
array('a', 5, 15),
array('d', 9, 50),
array('a', 45, 60),
array('a', 2, 70),
array('d', 1, 150),
);
function buildAcl($set) {
$allow = array();
foreach($set as $acl) {
$range = range($acl[1], $acl[2]);
switch($acl[0]) {
case 'a':
$allow = array_unique(array_merge(array_values($allow), $range));
break;
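case 'd':
// array_search() returns false for a value that is not present, and
// unset($allow[false]) would then drop index 0, so use array_diff() instead
$allow = array_diff($allow, $range);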
}
}
return $allow;
}
var_dump(buildAcl($cond));
var_dump(buildAcl(array(array('a', 5, 15), array('d', 10, 50), array('a', 45, 60))));
<?php
$conds = array(
array('a', 5, 15),
array('a', 5, 15),
array('d', 9, 50),
array('a', 45, 60),
array('a', 2, 70),
array('d', 1, 150),
);
$segments = array();
foreach($conds as $cond)
{
print($cond[0] . ': ' . $cond[1] . ' - ' . $cond[2] . "\n");
if ($cond[0] == 'a')
{
$new_segments = array();
$inserted = false;
$prev_segment = false;
foreach($segments as $segment)
{
if ($segment['begin'] > $cond[2])
{
$new_segments[] = array('begin' => $cond[1], 'end' => $cond[2]);
$new_segments[] = $segment;
$inserted = true;
print("begun\n");
continue;
}
if ($segment['end'] < $cond[1])
{
print("end\n");
$new_segments[] = $segment;
continue;
}
if ($cond[1] < $segment['begin'])
{
$segment['begin'] = $cond[1];
}
if ($cond[2] > $segment['end'])
{
$segment['end'] = $cond[2];
}
$inserted = true;
if (
$prev_segment &&
($prev_segment['begin'] <= $segment['begin']) &&
($prev_segment['end'] >= $segment['end'])
)
{
print("ignore identical\n");
continue;
}
print("default\n");
$prev_segment = $segment;
$new_segments[] = $segment;
}
if (!$inserted)
{
print("inserted at end\n");
$new_segments[] = array('begin' => $cond[1], 'end' => $cond[2]);
}
$segments = $new_segments;
print("---\n");
}
if ($cond[0] == 'd')
{
$new_segments = array();
foreach($segments as $segment)
{
# not contained in segment
if ($segment['begin'] > $cond[2])
{
print("delete segment is in front\n");
$new_segments[] = $segment;
continue;
}
if ($segment['end'] < $cond[1])
{
print("delete segment is behind\n");
$new_segments[] = $segment;
continue;
}
# delete whole segment
if (
($segment['begin'] >= $cond[1]) &&
($segment['end'] <= $cond[2])
)
{
print("delete whole segment\n");
continue;
}
# delete starts at boundary
if ($cond[1] <= $segment['begin'])
{
print("delete at boundary start\n");
$segment['begin'] = $cond[2];
$new_segments[] = $segment;
continue;
}
# delete ends at boundary
if ($cond[2] >= $segment['end'])
{
print("delete at boundary end\n");
$segment['end'] = $cond[1];
$new_segments[] = $segment;
continue;
}
# split into two segments
print("split into two\n");
$segment_pre = array('begin' => $segment['begin'], 'end' => $cond[1]);
$segment_post = array('begin' => $cond[2], 'end' => $segment['end']);
$new_segments[] = $segment_pre;
$new_segments[] = $segment_post;
}
print("--\n");
$segments = $new_segments;
}
print("----\n");
var_dump($segments);
print("----\n");
}
var_dump($segments);