Question

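Problem statement:

I was recently asked this interview question. I was able to come up with the code below, which only runs in O(k log n):

Given k <= n sorted arrays, each of size n, there exists a data structure that requires O(kn) preprocessing time and memory and answers iterated search queries in O(k + log n) time.

I have k sorted lists, each of size n. At the moment I have hard-coded 5 sorted lists, each of size 3, but in general these numbers could be very large.

I want to search for a single element in each of the k lists.

Obviously, I can binary search each array separately, which results in O(k log n), where k is the number of sorted arrays.

Can we do it in O(k + log n), where k is the number of sorted arrays? I think there might be a better way, since we are doing the same search k times: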

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SearchItem {

    private final List<List<Integer>> dataInput;

    public SearchItem(final List<List<Integer>> inputs) {
        // defensive copy of every input list
        dataInput = new ArrayList<List<Integer>>();
        for (List<Integer> input : inputs) {
            dataInput.add(new ArrayList<Integer>(input));
        }
    }

    // For each list, returns the smallest element >= x, or null if there is none.
    public List<Integer> getItem(final Integer x) {
        List<Integer> outputs = new ArrayList<Integer>();
        for (List<Integer> data : dataInput) {
            int i = Collections.binarySearch(data, x); // binary searching the item
            if (i < 0)
                i = -(i + 1); // not found: convert to the insertion point
            outputs.add(i == data.size() ? null : data.get(i));
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<List<Integer>> lists = new ArrayList<List<Integer>>();

        List<Integer> list1 = new ArrayList<Integer>(Arrays.asList(3, 4, 6));
        List<Integer> list2 = new ArrayList<Integer>(Arrays.asList(1, 2, 3));
        List<Integer> list3 = new ArrayList<Integer>(Arrays.asList(2, 3, 6));
        List<Integer> list4 = new ArrayList<Integer>(Arrays.asList(1, 2, 3));
        List<Integer> list5 = new ArrayList<Integer>(Arrays.asList(4, 8, 13));

        lists.add(list1);
        lists.add(list2);
        lists.add(list3);
        lists.add(list4);
        lists.add(list5);

        SearchItem search = new SearchItem(lists);
        // dataInput is an instance field, so it must be read through the instance
        System.out.println(search.dataInput);

        List<Integer> dataOutput = search.getItem(5);

        System.out.println(dataOutput);
    }
}

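Whatever output the code above produces, the new approach should produce the same output, and it should run in O(k + log n).

Is this possible to achieve? Can anyone provide an example of how it would work on my example?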

Answer 1

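This technique is called fractional cascading, which sounds really cool. What you do is the following:

Take list 1. Take every second element of it and merge it into list 2. The "new" list 2 now contains all of its own elements plus half of the elements of list 1. You remember which elements came from list 1 and keep pointers back to list 1. Then you pass over the newly created list 2 from front to back, adding to each element a pointer to the last element from list 1 you have seen and a pointer to the last element from list 2 you have seen. Do the same thing from back to front.
Take the "new" list 2, which has half of list 1's elements embedded in it, and merge it with list 3 in the same way, and so on.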

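The resulting interleaving looks like this:

[Figure: fractional cascading]

(Source: "You could have invented fractional cascading" by Edward Z. Yang)

Each list element has a few pointers to quickly find its predecessor/successor of a given kind and to find its position in list i - 1.

It turns out that the total number of list elements only grows by a constant factor, but the cool thing is that you can now answer queries quickly:

Do a binary search in the "new" list k to find your search element. Complexity: O(log n). You have now found the element in the original list k, because you can find the surrounding elements that were originally in list k in O(1).
You can also find the position of the element in list k - 1 in O(1), because you have pointers to the successor/predecessor in list k - 1. So you can report the result for each of the other lists in O(1).

Total running time: O(log n + k)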
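As a rough illustration of the steps above, here is a minimal sketch that returns the same result as the question's `getItem` (for each list, the smallest element >= x, or null). The class name `FractionalCascading`, the arrays `merged`, `own` and `down`, and the helper `mergeWithSample` are all made up for this sketch, and it stores array indices where the description talks about pointers. Sampling the elements at even indices is just one way to take "every second element".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FractionalCascading {
    private final int[][] lists;   // the original sorted lists
    private final int[][] merged;  // merged[i] = lists[i] plus every second element of merged[i-1]
    private final int[][] own;     // own[i][p]  = first index in lists[i] with value >= merged[i][p]
    private final int[][] down;    // down[i][p] = first index in merged[i-1] with value >= merged[i][p]

    FractionalCascading(List<List<Integer>> input) {
        int k = input.size();
        lists = new int[k][];
        merged = new int[k][];
        own = new int[k][];
        down = new int[k][];
        for (int i = 0; i < k; i++) {
            lists[i] = input.get(i).stream().mapToInt(Integer::intValue).toArray();
            int[] prev = (i == 0) ? new int[0] : merged[i - 1];
            merged[i] = mergeWithSample(lists[i], prev);
            int m = merged[i].length;
            own[i] = new int[m + 1];
            down[i] = new int[m + 1];
            int a = 0, b = 0;
            // one linear sweep fills both index tables, because merged[i] is sorted
            for (int p = 0; p < m; p++) {
                while (a < lists[i].length && lists[i][a] < merged[i][p]) a++;
                while (b < prev.length && prev[b] < merged[i][p]) b++;
                own[i][p] = a;
                down[i][p] = b;
            }
            own[i][m] = lists[i].length;  // sentinel entries for "past the end"
            down[i][m] = prev.length;
        }
    }

    // Merges list with the elements of prev at even indices (0, 2, 4, ...).
    private static int[] mergeWithSample(int[] list, int[] prev) {
        int[] out = new int[list.length + (prev.length + 1) / 2];
        int a = 0, b = 0, o = 0;
        while (a < list.length || b < prev.length) {
            if (b >= prev.length || (a < list.length && list[a] <= prev[b])) {
                out[o++] = list[a++];
            } else {
                out[o++] = prev[b];
                b += 2;
            }
        }
        return out;
    }

    // For each list, returns the smallest element >= x, or null if there is none.
    List<Integer> getItem(int x) {
        int k = lists.length;
        if (k == 0) return new ArrayList<>();
        Integer[] result = new Integer[k];
        // the only binary search: first position in the last merged list with value >= x
        int lo = 0, hi = merged[k - 1].length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (merged[k - 1][mid] < x) lo = mid + 1; else hi = mid;
        }
        int p = lo;
        for (int i = k - 1; i >= 0; i--) {
            int idx = own[i][p];
            result[i] = idx < lists[i].length ? lists[i][idx] : null;
            if (i > 0) {
                int q = down[i][p];
                // at most one step back, since every second element was sampled
                while (q > 0 && merged[i - 1][q - 1] >= x) q--;
                p = q;
            }
        }
        return Arrays.asList(result);
    }

    public static void main(String[] args) {
        List<List<Integer>> lists = Arrays.asList(
            Arrays.asList(3, 4, 6), Arrays.asList(1, 2, 3), Arrays.asList(2, 3, 6),
            Arrays.asList(1, 2, 3), Arrays.asList(4, 8, 13));
        System.out.println(new FractionalCascading(lists).getItem(5)); // [6, null, 6, null, 8]
    }
}
```

Each merged list holds at most about 2n elements, so preprocessing stays in O(kn). A query does one binary search and then a constant amount of work per list.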

For more information, be sure to read the blog post, which has lots of visual illustrations and further explanation.

Answer 2

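Since your arrays are sorted, the elements are comparable. Use a B-tree structure, and make sure the arrays have no overlapping segments, i.e. each array is sorted and any item inside it is either

item < first of all other arrays; or item > last of all other arrays.

Then O(k + log n) can be achieved by comparing the search item against each array's bounds until first < search item < last, and then doing a log(n) search inside that array.

But basically this could be O(log k + log n).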
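A minimal sketch of that idea, assuming the arrays are non-empty and their value ranges really do not overlap (this is not true of the sample lists in the question). It uses a plain array of arrays ordered by first element instead of a B-tree. The class name `DisjointRangeSearch` and the method `find` are made up for this sketch.

```java
import java.util.Arrays;

class DisjointRangeSearch {
    // arrays ordered by first element; value ranges are assumed not to overlap
    private final int[][] arrays;

    DisjointRangeSearch(int[][] input) {
        arrays = input.clone();
        Arrays.sort(arrays, (a, b) -> Integer.compare(a[0], b[0]));
    }

    // Returns {arrayIndex, position} within the range-ordered arrays, or null if x is absent.
    int[] find(int x) {
        int lo = 0, hi = arrays.length - 1;
        while (lo <= hi) {                       // O(log k): pick the array whose range covers x
            int mid = (lo + hi) >>> 1;
            int[] a = arrays[mid];
            if (x < a[0]) {
                hi = mid - 1;
            } else if (x > a[a.length - 1]) {
                lo = mid + 1;
            } else {
                int pos = Arrays.binarySearch(a, x);  // O(log n) inside that array
                return pos >= 0 ? new int[] {mid, pos} : null;
            }
        }
        return null;
    }
}
```

Note that this only answers where x itself is stored. It does not give a per-list result the way the question's `getItem` does.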

Answer 3

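Someone else may have already answered this (I haven't refreshed the page yet). But here is a method for merging the lists that should work in O(hn). I haven't actually tested the syntax in an editor, but I think the idea should work...

After calling this method, you should be able to simply binary search the merged list.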

// requires java.util.ArrayList and java.util.List
public static List<Integer> mergeSortedLists(List<List<Integer>> sortedLists) {
  List<Integer> mergedList = new ArrayList<Integer>();
  // current read position in each list; Java initializes int arrays to 0
  int[] listIndexes = new int[sortedLists.size()];
  int completedLists = 0;
  // empty lists are complete from the start
  for (List<Integer> list : sortedLists) {
    if (list.isEmpty()) completedLists++;
  }
  while (completedLists < sortedLists.size()) {
    int lowestValue = 0;
    int lowestIndex = -1; // index of the list holding the lowest value; -1 means none found yet
    for (int i = 0; i < sortedLists.size(); i++) {
      int currentIndex = listIndexes[i];
      List<Integer> currentList = sortedLists.get(i);
      if (currentIndex >= currentList.size()) continue; // already finished merging this list, skip
      int currentValue = currentList.get(currentIndex);
      if (lowestIndex == -1 || currentValue < lowestValue) {
        lowestValue = currentValue;
        lowestIndex = i;
      }
    }
    // put the lowest found value into mergedList and advance that list's index
    mergedList.add(lowestValue);
    listIndexes[lowestIndex]++;
    // if the advanced index reached the end of its list, count that list as completed;
    // once all lists are completed the while loop ends and the merge is done
    if (listIndexes[lowestIndex] == sortedLists.get(lowestIndex).size()) {
      completedLists++;
    }
  }
  return mergedList;
}
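For comparison, here is a self-contained sketch of the same merge-then-binary-search idea. It swaps in a different technique, a heap-based k-way merge with `java.util.PriorityQueue`, which takes O(kn log k). The class name `HeapMerge` is made up for this sketch. Keep in mind that the merged list forgets which list each value came from, so it can answer "is x present anywhere?" but not the per-list results the question asks for.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

class HeapMerge {
    static List<Integer> mergeSortedLists(List<List<Integer>> sortedLists) {
        // heap entries are {value, listIndex, positionInList}, ordered by value
        PriorityQueue<int[]> heap = new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int i = 0; i < sortedLists.size(); i++) {
            if (!sortedLists.get(i).isEmpty()) {
                heap.add(new int[] {sortedLists.get(i).get(0), i, 0});
            }
        }
        List<Integer> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] top = heap.poll();              // smallest remaining value across all lists
            merged.add(top[0]);
            List<Integer> source = sortedLists.get(top[1]);
            int next = top[2] + 1;
            if (next < source.size()) {           // push the successor from the same list
                heap.add(new int[] {source.get(next), top[1], next});
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<Integer>> lists = Arrays.asList(
            Arrays.asList(3, 4, 6), Arrays.asList(1, 2, 3), Arrays.asList(4, 8, 13));
        List<Integer> merged = mergeSortedLists(lists);
        System.out.println(merged);                                     // [1, 2, 3, 3, 4, 4, 6, 8, 13]
        System.out.println(Collections.binarySearch(merged, 8) >= 0);   // true
    }
}
```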

Efficiently finding an element in multiple sorted lists?
