Question

SO,

The problem

From SQL I get an array of strings (a flat array) - let it be

$rgData = ['foo', 'bar', 'baz', 'bee', 'feo'];

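Now I want to get the possible combinations of pairs and triples from this array (and, in the general case, combinations of 4 elements, etc.). To be more specific: I mean combinations in the mathematical sense (without repetition), i.e. those whose number equals

C(n, k) = n! / (k! (n - k)!)

- which, for the array above, is 10 for both pairs and triples.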
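
As a quick check of that count, here is a minimal sketch. The helper name `binomial` is made up for this illustration and is not part of the question's code. It evaluates the formula multiplicatively, so no large factorial is ever formed.

```php
<?php
// Number of k-element combinations of n items: n! / (k! * (n - k)!).
// Hypothetical helper, for illustration only.
function binomial(int $n, int $k): int
{
    if ($k < 0 || $k > $n) {
        return 0;
    }
    $k = min($k, $n - $k);            // C(n, k) == C(n, n - k)
    $result = 1;
    for ($i = 1; $i <= $k; $i++) {
        // After this step $result == C(n - k + i, i), so the division is exact.
        $result = intdiv($result * ($n - $k + $i), $i);
    }
    return $result;
}

$rgData = ['foo', 'bar', 'baz', 'bee', 'feo'];
echo binomial(count($rgData), 2), "\n"; // 10 pairs
echo binomial(count($rgData), 3), "\n"; // 10 triples
```

Pairs and triples give the same count here because choosing 2 of 5 items is the same as choosing the 3 to leave out.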

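My approach

I started by mapping each of the C(n, k) possibilities to a selection of array items. My current solution marks an element with "1" if it is selected and with "0" otherwise. For the example above, that gives: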

foo bar baz bee feo
 0   0   1   1   1   -> [baz, bee, feo]
 0   1   0   1   1   -> [bar, bee, feo]
 0   1   1   0   1   -> [bar, baz, feo]
 0   1   1   1   0   -> [bar, baz, bee]
 1   0   0   1   1   -> [foo, bee, feo]
 1   0   1   0   1   -> [foo, baz, feo]
 1   0   1   1   0   -> [foo, baz, bee]
 1   1   0   0   1   -> [foo, bar, feo]
 1   1   0   1   0   -> [foo, bar, bee]
 1   1   1   0   0   -> [foo, bar, baz]

All I need to do is somehow produce the desired bit sets. Here is my code in PHP:

function nextAssoc($sAssoc)
{
   if(false !== ($iPos = strrpos($sAssoc, '01')))
   {
      $sAssoc[$iPos]   = '1';
      $sAssoc[$iPos+1] = '0';
      return substr($sAssoc, 0, $iPos+2).
             str_repeat('0', substr_count(substr($sAssoc, $iPos+2), '0')).
             str_repeat('1', substr_count(substr($sAssoc, $iPos+2), '1'));
   }
   return false;
}

function getAssoc(array $rgData, $iCount=2)
{
   if(count($rgData)<$iCount)
   {
      return null;
   }
   $sAssoc   = str_repeat('0', count($rgData)-$iCount).str_repeat('1', $iCount);
   $rgResult = [];
   do
   {
      $rgResult[]=array_intersect_key($rgData, array_filter(str_split($sAssoc)));
   }
   while($sAssoc=nextAssoc($sAssoc));
   return $rgResult;
}

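- I chose to store my bits as a plain string. My algorithm for generating the next association is:

1. Try to find "01". If it is not found, the string has the form 11..100..0 (so it is the maximum and no further association exists). If it is found, go to the second step.
2. Go to the rightmost position of "01" in the string. Switch it to "10", then move all zeros that lie to the right of the found "01" position to the left. For example, 01110: the rightmost position of "01" is 0, so first we switch this "01" to "10". The string is now 10110. Now go to the right part (the part after the "10", so it starts at symbol 0 + 2 = 2) and move all zeros to the left, i.e. 110 becomes 011. As a result we have 10 + 011 = 10011 as the next association for 01110.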
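
The two steps can be traced with a deliberately literal sketch. The name `nextBits` is hypothetical, and sorting the tail is just one way to "move the zeros to the left"; it is not the question's own implementation.

```php
<?php
// One step of the "next association" rule, written to mirror the two steps.
// Hypothetical helper, for illustration only.
function nextBits(string $bits)
{
    $pos = strrpos($bits, '01');             // step 1: rightmost "01"
    if ($pos === false) {
        return false;                        // form 1..10..0: nothing follows
    }
    $head = substr($bits, 0, $pos) . '10';   // step 2: "01" becomes "10"
    $tail = str_split(substr($bits, $pos + 2));
    sort($tail, SORT_STRING);                // '0' sorts before '1': zeros move left
    return $head . implode('', $tail);
}

echo nextBits('01110'), "\n";                // 10011

// Full walk for 5 items with 3 selected, starting from the smallest string.
$bits = '00111';
$seen = [];
do {
    $seen[] = $bits;
} while (($bits = nextBits($bits)) !== false);
echo implode(' ', $seen), "\n";
// 00111 01011 01101 01110 10011 10101 10110 11001 11010 11100
```

The walk visits the ten bit strings in the same order as the rows of the table above.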

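I found a similar question here, but there the OP wants combinations with repetition, while I want them without repetition.

The question

My question is about two points:

1. For my solution, is there perhaps a more efficient way to produce the next bit set?
2. Is there perhaps a simpler solution altogether? This looks like a standard problem.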

Answer 1

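I am sorry for not providing a PHP solution, since I have not programmed in PHP for a long time now, but let me show you a quick Scala solution. Maybe it will inspire you: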

val array = Vector("foo", "bar", "baz", "bee", "feo")
for (i <- 0 until array.size; 
     j <- i + 1 until array.size; 
     k <- j + 1 until array.size)      
    yield (array(i), array(j), array(k))

Result:

Vector((foo,bar,baz), (foo,bar,bee), (foo,bar,feo), (foo,baz,bee), (foo,baz,feo), (foo,bee,feo), (bar,baz,bee), (bar,baz,feo), (bar,bee,feo), (baz,bee,feo))
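
For readers who stay in PHP, the same idea can be written as three nested loops. This is a direct transcription sketch, not code from the answer; each inner index starts one past the outer one, so no element repeats and every triple appears exactly once.

```php
<?php
// PHP transcription of the Scala for-comprehension above (illustrative sketch).
$array = ['foo', 'bar', 'baz', 'bee', 'feo'];
$n = count($array);
$triples = [];
for ($i = 0; $i < $n; $i++) {
    for ($j = $i + 1; $j < $n; $j++) {
        for ($k = $j + 1; $k < $n; $k++) {
            $triples[] = [$array[$i], $array[$j], $array[$k]];
        }
    }
}
echo count($triples), "\n";            // 10
echo implode(',', $triples[0]), "\n";  // foo,bar,baz
```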

Generic code for generating k-combinations:

def combinations(array: Vector[String], k: Int, start: Int = 0): Iterable[List[String]] = { 
  if (k == 1 || start == array.length) 
    for (i <- start until array.length) yield List(array(i))
  else 
    for (i <- start until array.length; c <- combinations(array, k - 1, i + 1)) yield array(i) :: c 
}

Result:

scala> combinations(Vector("a", "b", "c", "d", "e"), 1)
res8: Iterable[List[String]] = Vector(List(a), List(b), List(c), List(d), List(e))

scala> combinations(Vector("a", "b", "c", "d", "e"), 2)
res9: Iterable[List[String]] = Vector(List(a, b), List(a, c), List(a, d), List(a, e), List(b, c), List(b, d), List(b, e), List(c, d), List(c, e), List(d, e))

scala> combinations(Vector("a", "b", "c", "d", "e"), 3)
res10: Iterable[List[String]] = Vector(List(a, b, c), List(a, b, d), List(a, b, e), List(a, c, d), List(a, c, e), List(a, d, e), List(b, c, d), List(b, c, e), List(b, d, e), List(c, d, e))

scala> combinations(Vector("a", "b", "c", "d", "e"), 4)
res11: Iterable[List[String]] = Vector(List(a, b, c, d), List(a, b, c, e), List(a, b, d, e), List(a, c, d, e), List(b, c, d, e))

scala> combinations(Vector("a", "b", "c", "d", "e"), 5)
res12: Iterable[List[String]] = Vector(List(a, b, c, d, e))

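Of course, real Scala code should be much more generic with respect to the accepted element type and collection type, but I only wanted to show the basic idea, not the most beautiful Scala code.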
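
The recursive idea carries over to PHP almost line by line. The sketch below is an assumption-laden port: the name `kCombinations` is made up, the input is assumed to be a zero-indexed list, and `$k` is assumed to be at least 1.

```php
<?php
// Hypothetical PHP counterpart of the Scala combinations() above.
// Assumes $items is a zero-indexed list and $k >= 1.
function kCombinations(array $items, int $k, int $start = 0): array
{
    $n = count($items);
    $result = [];
    if ($k === 1) {
        // Base case: every remaining element is a combination of size 1.
        for ($i = $start; $i < $n; $i++) {
            $result[] = [$items[$i]];
        }
        return $result;
    }
    for ($i = $start; $i < $n; $i++) {
        // Pick $items[$i] first, then k-1 elements from the positions after it.
        foreach (kCombinations($items, $k - 1, $i + 1) as $rest) {
            array_unshift($rest, $items[$i]);   // like Scala's `array(i) :: c`
            $result[] = $rest;
        }
    }
    return $result;
}

$pairs = kCombinations(['a', 'b', 'c', 'd', 'e'], 2);
echo count($pairs), "\n";             // 10
echo implode(',', $pairs[0]), "\n";   // a,b
```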

Answer 2

Here is a recursive solution:

function subcombi($arr, $arr_size, $count)
{
   $combi_arr = array();
   if ($count > 1) {
      for ($i = $count - 1; $i < $arr_size; $i++) {
         $highest_index_elem_arr = array($i => $arr[$i]);
         foreach (subcombi($arr, $i, $count - 1) as $subcombi_arr) {
            $combi_arr[] = $subcombi_arr + $highest_index_elem_arr;
         }
      }
   } else {
      for ($i = $count - 1; $i < $arr_size; $i++) {
         $combi_arr[] = array($i => $arr[$i]);
      }
   }
   return $combi_arr;
}

function combinations($arr, $count)
{
   if ( !(0 <= $count && $count <= count($arr))) {
      return false;
   }
   return $count ? subcombi($arr, count($arr), $count) : array();
}    

$input_arr = array('foo', 'bar', 'baz', 'bee', 'feo');
$combi_arr = combinations($input_arr, 3);
var_export($combi_arr); echo ";\n";

OUTPUT:

array (
  0 => 
  array (
    0 => 'foo',
    1 => 'bar',
    2 => 'baz',
  ),
  1 => 
  array (
    0 => 'foo',
    1 => 'bar',
    3 => 'bee',
  ),
  2 => 
  array (
    0 => 'foo',
    2 => 'baz',
    3 => 'bee',
  ),
  3 => 
  array (
    1 => 'bar',
    2 => 'baz',
    3 => 'bee',
  ),
  4 => 
  array (
    0 => 'foo',
    1 => 'bar',
    4 => 'feo',
  ),
  5 => 
  array (
    0 => 'foo',
    2 => 'baz',
    4 => 'feo',
  ),
  6 => 
  array (
    1 => 'bar',
    2 => 'baz',
    4 => 'feo',
  ),
  7 => 
  array (
    0 => 'foo',
    3 => 'bee',
    4 => 'feo',
  ),
  8 => 
  array (
    1 => 'bar',
    3 => 'bee',
    4 => 'feo',
  ),
  9 => 
  array (
    2 => 'baz',
    3 => 'bee',
    4 => 'feo',
  ),
);

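The recursion is based on the fact that, to get all combinations of k ($count) elements out of n ($arr_size), you must, for every possible choice of the highest zero-based index i, find all "subcombinations" of k-1 elements out of the remaining i elements with index lower than i.

The array is passed to the recursive calls as it is, without slicing it, in order to take advantage of PHP's "lazy copy" mechanism. This way no real copying takes place, because the array is not modified.

Preserving the array indices is nice for debugging, but it is not necessary. Surprisingly, simply removing the $i => parts and replacing the array + with array_merge causes a considerable slowdown. To get a slightly better speed than the original version, you have to do this: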
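
The snippet this sentence points to is missing from this copy. As a stand-in, here is a minimal sketch of an index-free variant that avoids array_merge by appending with `[]`. The name `subcombiList` is hypothetical, this is not the answer's original code, and whether it is actually faster would have to be measured.

```php
<?php
// Sketch only: index-free variant of subcombi(), not the answer's own snippet.
// Assumes $arr is a zero-indexed list and $count >= 1.
function subcombiList(array $arr, int $arr_size, int $count): array
{
    $combi_arr = [];
    if ($count > 1) {
        for ($i = $count - 1; $i < $arr_size; $i++) {
            $highest = $arr[$i];
            foreach (subcombiList($arr, $i, $count - 1) as $subcombi_arr) {
                $subcombi_arr[] = $highest;    // plain append, no array_merge call
                $combi_arr[] = $subcombi_arr;
            }
        }
    } else {
        // $count is 1 here: each of the first $arr_size elements on its own.
        for ($i = 0; $i < $arr_size; $i++) {
            $combi_arr[] = [$arr[$i]];
        }
    }
    return $combi_arr;
}

$result = subcombiList(['foo', 'bar', 'baz', 'bee', 'feo'], 5, 3);
echo count($result), "\n";             // 10
echo implode(',', $result[1]), "\n";   // foo,bar,bee
```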

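Regarding the first part of your question, you should avoid calculating the same quantity more than once, and you should minimize function calls. For example, like this: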
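
The code that illustrated this advice is not preserved in this copy. A minimal sketch of the idea, under the assumption that it targets the question's nextAssoc(): take the tail substring once, count its zeros once, and derive the number of ones from the tail length. The name `nextAssocOnce` is hypothetical.

```php
<?php
// Sketch only: behaves like the question's nextAssoc(), but takes the tail
// substring once and makes one counting call instead of two.
function nextAssocOnce(string $sAssoc)
{
    $iPos = strrpos($sAssoc, '01');
    if ($iPos === false) {
        return false;                           // already the last association
    }
    $sTail  = substr($sAssoc, $iPos + 2);       // taken once, reused below
    $iZeros = substr_count($sTail, '0');
    $iOnes  = strlen($sTail) - $iZeros;         // the rest of the tail is ones
    return substr($sAssoc, 0, $iPos) . '10'
         . str_repeat('0', $iZeros) . str_repeat('1', $iOnes);
}

echo nextAssocOnce('01110'), "\n";              // 10011
```

Compared with the question's version, this drops one substr() call and one substr_count() call per step. Whether that matters in practice depends on the input size.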

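It is hard to make deeper changes to the code without turning it upside down. It is not too bad as it stands, though: in my tests it ran at about half the speed of my recursive solution (i.e., it took about twice as long).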

Answer 3

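I simply tried to solve this problem with minimal time complexity and without recursion, using Go.

I have seen a few solutions, but they use recursive functions. Avoiding recursion prevents stack-size-exceeded errors.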

Working example here: https://play.golang.org/p/D6I5aq8685-
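
The linked Go program is not reproduced here. For readers who want the loop-only idea in PHP, the sketch below keeps an array of k increasing indices and advances it in place. It is a common iterative technique and an assumption about what "without recursion" looks like, not a port of the linked code; the name `combinationsIterative` is hypothetical.

```php
<?php
// Sketch of a loop-only generator; not a port of the linked Go program.
// Assumes $items is a zero-indexed list.
function combinationsIterative(array $items, int $k): array
{
    $n = count($items);
    if ($k < 1 || $k > $n) {
        return [];
    }
    $idx = range(0, $k - 1);               // first combination: 0, 1, ..., k-1
    $result = [];
    while (true) {
        $combo = [];
        foreach ($idx as $i) {
            $combo[] = $items[$i];
        }
        $result[] = $combo;

        // Find the rightmost index that has not reached its maximum value.
        $p = $k - 1;
        while ($p >= 0 && $idx[$p] === $n - $k + $p) {
            $p--;
        }
        if ($p < 0) {
            return $result;                // every index is at its maximum
        }
        $idx[$p]++;
        for ($q = $p + 1; $q < $k; $q++) {
            $idx[$q] = $idx[$q - 1] + 1;   // reset the indices to its right
        }
    }
}

$triples = combinationsIterative(['foo', 'bar', 'baz', 'bee', 'feo'], 3);
echo count($triples), "\n";                // 10
```

Because the state is only the index array, the call depth stays constant no matter how large k is.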
