Question

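I have a list of numbers that totals 540000. I want to sort this list into 3 lists that each total 180000. Assuming the list of numbers is a flat file with one number per line, what is the most efficient programming method to do this?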

Answer 1

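Sounds like a variation of the Knapsack problem. It would be useful to know the size and count of these numbers: is there a huge variation in size or are they all similar in scale, and are there very many (>1000) or just a few (<100)?

A quick and dirty method would be to sort them in size order, biggest to smallest, then loop over them, putting the first in the first list, the second in the second list, the third in the third list, then going back and putting the fourth in the first list, and so on. That could work quite well for lots of small numbers, but there are other approaches for different types of data set.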
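The quick and dirty method could be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the class name `RoundRobinSplit` and the methods `split` and `sum` are made up for this example, and the numbers are assumed to be already read from the file into a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class RoundRobinSplit {

    // Deals the numbers, largest first, into `parts` lists in rotation.
    static List<List<Integer>> split(List<Integer> numbers, int parts) {
        List<Integer> sorted = new ArrayList<>(numbers);
        sorted.sort(Collections.reverseOrder());

        List<List<Integer>> lists = new ArrayList<>();
        for (int i = 0; i < parts; i++) {
            lists.add(new ArrayList<>());
        }
        for (int i = 0; i < sorted.size(); i++) {
            lists.get(i % parts).add(sorted.get(i));
        }
        return lists;
    }

    // Adds up one list so the result can be compared with the target.
    static int sum(List<Integer> list) {
        int total = 0;
        for (int n : list) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(9, 8, 7, 6, 5, 4, 3, 2, 1);
        for (List<Integer> list : split(numbers, 3)) {
            System.out.println(list + " -> " + sum(list));
        }
    }
}
```

On the numbers 1 through 9 this gives totals of 18, 15 and 12, even though an exact split into three lists of 15 exists (for example 9+6, 8+7 and 5+4+3+2+1). So it is a heuristic that gets close on many small numbers, not an exact method.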

Answer 2

for i as integer = 1 to 180000
put data in array 1
next i

for i as integer = 180001 to 360000
put data in array 2
next i

for i as integer = 360001 to 540000
put data in array 3
next i

Answer 3

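This smells of NP-hardness to me, in which case there is no "efficient" way to do it. You can probably come up with any number of heuristics that solve it reasonably well, though.

That said, you'll still have problems with lists like [179998, 180001, 180001] :)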
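Inputs like that one can at least be rejected cheaply before any search starts. The following is a minimal sketch of two necessary conditions, assuming no negative numbers; the class `SplitPrecheck` and the method `mightSplit` are hypothetical names used only here.

```java
import java.util.Arrays;
import java.util.List;

class SplitPrecheck {

    // Necessary but NOT sufficient conditions for an exact split into
    // `parts` lists of equal total. Assumes no negative numbers.
    static boolean mightSplit(List<Integer> numbers, int parts) {
        long total = 0;
        int largest = 0;
        for (int n : numbers) {
            total += n;
            largest = Math.max(largest, n);
        }
        // The grand total must divide evenly into the parts.
        if (total % parts != 0) {
            return false;
        }
        // No single number may exceed the per-list target.
        return largest <= total / parts;
    }

    public static void main(String[] args) {
        System.out.println(mightSplit(Arrays.asList(179998, 180001, 180001), 3));
    }
}
```

A `true` result proves nothing: [3, 3, 3, 3] passes both checks with a target of 4, yet no sublist adds up to 4.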

Answer 4

I've written some Java code that does most of the work for you.

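The smaller method takes a list of numbers and a total to achieve, and returns a set of lists of numbers that add up to that total. You can run it with 180000 and your list of numbers.

For each list of numbers it returns, you need to create a new list that lacks the numbers already used, and run the method with 180000 again on that.

If that second call returns one or more lists, you know the problem is solvable, because the remaining numbers will then also add up to 180000.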

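Anyway, here's the code. Yes, it's just recursive brute force. Most likely there is no proven method that consistently does better by any other approach. If it runs for a very long time, don't blame me; you may want to try it with smaller examples first.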

import java.util.*; 

public class Listen {

   private static Set<List<Integer>> makeFrom(int total, List<Integer> numbers) {
      Set<List<Integer>> results = new HashSet<List<Integer>>();
      List<Integer> soFar = new ArrayList<Integer>();
      makeFrom(results, total, soFar, numbers, 0);
      return results;
   }

   private static void makeFrom(Set<List<Integer>> results, int total, List<Integer> soFar, List<Integer> numbers, int startingAt) {
      // Try each remaining number, by index, as the next member of the sublist.
      for (int p = startingAt; p < numbers.size(); p++) {
         int number = numbers.get(p);
         int newTotal = total - number;
         // Overshot the target; this assumes the list holds no negative numbers.
         if (newTotal < 0) continue;
         List<Integer> newSoFar = new ArrayList<Integer>(soFar);
         newSoFar.add(number);
         if (newTotal == 0) {
            // Sort so that equal combinations collapse to one entry in the set.
            Collections.sort(newSoFar);
            results.add(newSoFar);
         } else {
            // Continue with the numbers after index p only. Removing the number
            // from a copy of the list and restarting at startingAt + 1 shifts the
            // indices and silently skips valid combinations.
            makeFrom(results, newTotal, newSoFar, numbers, p + 1);
         }
      }
   }

   public static void main(String[] args) {
      List<Integer> numbers = new ArrayList<Integer>();
      for (int j=1; j<11; j++) numbers.add(j);
      for (List<Integer> result : makeFrom(25, numbers)) {
         System.out.println(Arrays.deepToString(result.toArray(new Integer[result.size()])));
      }
   }
}
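The two-pass procedure described before the listing (find a list, remove its numbers, search the remainder) might be driven along these lines. This is a minimal sketch rather than the answer's own code: the class `ThreeWaySplit` and the helpers `findAll`, `without` and `split` are hypothetical names, it assumes no negative numbers, and it is still exponential brute force.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ThreeWaySplit {

    // Collects every sublist of `numbers` (taken in index order) that sums to `total`.
    static void findAll(int total, List<Integer> numbers, int startingAt,
                        List<Integer> soFar, List<List<Integer>> results) {
        for (int p = startingAt; p < numbers.size(); p++) {
            int number = numbers.get(p);
            if (number > total) continue;      // would overshoot the target
            soFar.add(number);
            if (number == total) {
                results.add(new ArrayList<>(soFar));
            } else {
                findAll(total - number, numbers, p + 1, soFar, results);
            }
            soFar.remove(soFar.size() - 1);    // backtrack: drop the last pick
        }
    }

    // Copy of `numbers` with one occurrence of each used number taken out.
    static List<Integer> without(List<Integer> numbers, List<Integer> used) {
        List<Integer> copy = new ArrayList<>(numbers);
        for (Integer n : used) {
            copy.remove(n);                    // remove(Object), not remove(int)
        }
        return copy;
    }

    // Returns three lists with equal totals, or null if no exact split exists.
    static List<List<Integer>> split(List<Integer> numbers) {
        int sum = 0;
        for (int n : numbers) sum += n;
        if (sum % 3 != 0) return null;
        int target = sum / 3;

        List<List<Integer>> firsts = new ArrayList<>();
        findAll(target, numbers, 0, new ArrayList<>(), firsts);
        for (List<Integer> first : firsts) {
            List<Integer> remaining = without(numbers, first);
            List<List<Integer>> seconds = new ArrayList<>();
            findAll(target, remaining, 0, new ArrayList<>(), seconds);
            if (!seconds.isEmpty()) {
                List<Integer> second = seconds.get(0);
                // Whatever is left must also add up to the target.
                return Arrays.asList(first, second, without(remaining, second));
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(split(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9)));
    }
}
```

Trying every candidate for the first list matters: a first list that sums correctly can still leave a remainder that cannot be split, so stopping at the first hit would report false failures.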

Answer 5

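As ian-witz already pointed out, this is probably a problem of the NP-complete sort: that means there is no really good solution for the general case short of trying all possibilities. Algorithms that do this tend to become very slow as the amount of data they handle increases.

That said, here's my algorithm for forming sub-lists with a specified sum from a given list of integers: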

Set up a place to hold your results. The results will all be lists of numbers, each some sub-set of your original list. We don't know how many such lists will result; possibly none.

Put your list of numbers into an array so you can refer to them and access them by index. In the following, I'm assuming array indices starting at 1. Say you have 10 numbers, so you put them into a 10 element array, indexed by positions 1 through 10.

For performance reasons, it may help to sort your array in descending order. It's not necessary though.

Run a first index, call it i, through this array, i.e. through index values 1 through 10. 
For each index value:
  select the number at index position i, call it n1.
  set up a new list of numbers, where we will be assembling a sub-list. call it sublist.
  add n1 to the (so far empty) sublist.
  If i is already at 10, there's nothing more we can do. Otherwise,
  Run a second index, call it j, through the array, starting at i+1 and going up to 10.
  For each value of j:
    select the number at index position j, call it n2.
    add n2 to the sublist containing n1
    calculate the sum of our sublist so far: Does it add up to 180000?
    If the exact total is reached, add the current sublist to our result list.
    If the total is exceeded, there's nothing we can add to make it better (assuming no negative numbers), so skip to the next value of j.
    If the total is less than 180000, you need to pick a third number.
    Run a third index, call it k, through the array, starting at j+1 and going up to 10. Skip this if j is already at 10 and there's no place to go.
    For each value of k:
      select the number at k, call it n3
      add n3 to the sublist
      check the sublist total against the expected total
      if the exact total is reached, store the sublist as a result; 
      if it's less than the expected, start a 4th loop, and so on.

      When you're done with checking a value for a loop, e.g. n4, you need to take your latest n4, n3 or whatever back out of the sublist because you'll be trying a different number next.

    Whenever you find a combination of numbers with the correct sum, store it in your results set.

When you've run all your loop counters into the wall (i.e. i is 10 and there's nowhere left to go), your "results" set will contain all sub-lists of the original list that added up to the desired total. It's possible there will be none, in that case there's no (exact) solution to your problem.

If you have 3 or more sub-lists in your results set, you can check whether any pair of them uses non-overlapping sets of numbers from the original list. If you find such a pair, then the numbers contained in neither of the 2 lists form a 3rd sub-list that also adds up to the desired total (given that the whole list totals three times that amount), and you have your solution.

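My sample code doesn't do a series of loops; instead, it does one loop from 1 to (say) 10 and looks for 180000. Then, supposing the first number chosen is 2000, the function calls itself recursively, with a hint to start at 2 (= i + 1) and to try to assemble a total of 178000. That call of the function then calls itself again with a starting position of (whatever + 1) and a total of (178000 - whatever), and it keeps calling itself with subsets of the original problem until there's no more room for the index to go up. If it finds a "good" sublist on the way, it stores it in the result set.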

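How to implement this efficiently depends on the language you're using. FORTRAN 77 doesn't have recursion, Lua doesn't implement lists or sets efficiently, and Lisp may have trouble indexing into a list efficiently. In Java, I'd probably use a bitset rather than a sublist. I don't know anything about P4GL, so: for implementation, you're on your own!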
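To illustrate the bitset idea, here is one possible sketch using `java.util.BitSet`, where each sublist is stored as the set of array indices it uses. The class `BitSetSplit` and its methods `search`, `split` and `sum` are hypothetical names; the sketch assumes no negative numbers and that the caller passes a target equal to one third of the grand total.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class BitSetSplit {

    // Collects every set of indices whose numbers add up to `total`.
    static void search(int[] numbers, int total, int startingAt,
                       BitSet chosen, List<BitSet> results) {
        for (int p = startingAt; p < numbers.length; p++) {
            if (numbers[p] > total) continue;   // would overshoot the target
            chosen.set(p);
            if (numbers[p] == total) {
                results.add((BitSet) chosen.clone());
            } else {
                search(numbers, total - numbers[p], p + 1, chosen, results);
            }
            chosen.clear(p);                    // backtrack
        }
    }

    // Looks for two non-overlapping index sets; the indices in neither one
    // form the third list. Returns null when no such pair exists.
    static BitSet[] split(int[] numbers, int target) {
        List<BitSet> results = new ArrayList<>();
        search(numbers, target, 0, new BitSet(numbers.length), results);
        for (int a = 0; a < results.size(); a++) {
            for (int b = a + 1; b < results.size(); b++) {
                if (!results.get(a).intersects(results.get(b))) {
                    BitSet rest = new BitSet(numbers.length);
                    rest.set(0, numbers.length);
                    rest.andNot(results.get(a));
                    rest.andNot(results.get(b));
                    return new BitSet[] { results.get(a), results.get(b), rest };
                }
            }
        }
        return null;
    }

    // Adds up the numbers at the indices contained in `set`.
    static int sum(int[] numbers, BitSet set) {
        int total = 0;
        for (int i = set.nextSetBit(0); i >= 0; i = set.nextSetBit(i + 1)) {
            total += numbers[i];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] numbers = { 9, 8, 7, 6, 5, 4, 3, 2, 1 };
        BitSet[] parts = split(numbers, 15);
        if (parts == null) {
            System.out.println("no exact split");
            return;
        }
        for (BitSet part : parts) {
            System.out.println(part + " -> " + sum(numbers, part));
        }
    }
}
```

Storing indices instead of values makes the overlap test a single `intersects` call and keeps duplicate values in the input distinct, at the cost of printing index sets rather than the numbers themselves.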

Split a list of values into three equal subtotals
