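I am reading the B-tree chapter of Cormen's Introduction to Algorithms (3rd edition) and find the deletion procedure very confusing. I can follow the insertion algorithm, because the book gives pseudocode for it along with some worked examples.
For deletion, however, it only says "We sketch how deletion works instead of presenting the pseudocode", followed by a very confusing series of steps. The first step says:
- If the key k is in node x and x is a leaf, delete the key k from x.
But if I delete a key from a leaf and the number of keys drops below the required minimum, doesn't that violate the B-tree property?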
Answer 0 (score: 2)
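According to Knuth's definition, a B-tree of order m is a tree that satisfies the following properties:
Let's look at the following B-tree (of order 5)
Let's go through the possible deletions.
Delete 21
No problem.
Delete 16
Rebalancing is required. The root now contains 1 element, which is 1 fewer than m / 2 (rounded down).
To fix it, we borrow an element (for example 18 or 21) and move it from that leaf up into the root itself. If the tree were larger, we would repeat this process recursively down the tree.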
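As a rough illustration of the borrowing step described above, here is a minimal sketch that rotates one key from a right sibling through its parent. The flat-list representation of the nodes and the name `borrow_from_right` are assumptions made for this example only; they do not come from Knuth or from the book.

```python
def borrow_from_right(parent_keys, children, i):
    """Rotate one key from children[i + 1] through the parent into children[i].

    parent_keys: sorted keys of the parent node
    children:    list of leaf key lists, children[j] sits left of parent_keys[j]
    """
    left, right = children[i], children[i + 1]
    left.append(parent_keys[i])    # the separator key moves down into the short node
    parent_keys[i] = right.pop(0)  # the sibling's smallest key moves up to replace it


parent_keys = [20]
children = [[10], [30, 40, 50]]    # the left leaf is one key short
borrow_from_right(parent_keys, children, 0)
print(parent_keys, children)       # [30] [[10, 20], [40, 50]]
```

Going through the parent keeps the ordering intact: every key in the left child stays below the separator, and every key in the right child stays above it.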
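General remarks
Keep in mind that many implementations mark nodes as deleted instead of actually removing them. Marking a node as deleted is relatively easy compared with performing the deletion and possibly rebalancing the tree. In addition, deletions are usually less frequent than insertions.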
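To make the "mark as deleted" idea concrete, here is a minimal sketch. It uses a plain sorted list as a stand-in for the keys stored in the tree, and the class name `TombstoneIndex` and its methods are hypothetical names chosen for this example, not part of any particular B-tree library.

```python
import bisect


class TombstoneIndex:
    """Sorted key store that marks keys as deleted instead of removing them."""

    def __init__(self):
        self.keys = []     # stand-in for the keys held in the tree's nodes
        self.dead = set()  # tombstones: keys that are logically deleted

    def insert(self, k):
        if k in self.dead:
            self.dead.discard(k)  # the key is still physically stored: revive it
            return
        i = bisect.bisect_left(self.keys, k)
        if i == len(self.keys) or self.keys[i] != k:
            self.keys.insert(i, k)

    def __contains__(self, k):
        i = bisect.bisect_left(self.keys, k)
        return i < len(self.keys) and self.keys[i] == k and k not in self.dead

    def delete(self, k):
        if k not in self:
            raise KeyError(k)
        self.dead.add(k)  # no restructuring happens here

    def compact(self):
        """Physically drop tombstoned keys, e.g. during an occasional rebuild."""
        self.keys = [k for k in self.keys if k not in self.dead]
        self.dead.clear()


idx = TombstoneIndex()
for k in (16, 18, 21):
    idx.insert(k)
idx.delete(16)
print(16 in idx, idx.keys)  # False [16, 18, 21]
```

The trade-off is that deleted keys keep occupying space until something like `compact` runs, so this fits workloads where deletions are rare.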
Answer 1 (score: 1)
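I finally managed to work this out. I will use the cases mentioned in the book, but reword them slightly so that they read more clearly:
Case 1: the node containing k is an internal node, not a leaf. This corresponds to case 2 of the original description.
Case 2: the BTree has only a root left and you want to delete a key from the root. Note that the original description omits this case. Although the root does become a leaf at that stage (and then falls through to case 3), once the last 3 nodes have been merged into a single node you first need to mark that node as a leaf (from the point of view of deletion, this is the only way you can end up at a single leaf node).
Case 3: the node is a leaf, and removing a key does not leave it with too few keys. This corresponds to case 1 of the original description.
Case 4: the node is a leaf, but removing k from it would leave the node itself, or its parent, with too few keys. This corresponds to case 3 of the original description.
I cannot think of any cases other than these.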
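The order in which these cases are tested can be sketched as a small decision function. The name `classify_deletion` and its parameters are hypothetical and exist only for this illustration; `t` is the minimum degree, so a non-root node needs at least t - 1 keys.

```python
def classify_deletion(is_leaf, is_root, n_keys, t):
    """Return which of the four cases above applies to the node holding k."""
    if not is_leaf:
        return 1  # k sits in an internal node
    if is_root:
        return 2  # the tree is a single root that is also a leaf
    if n_keys >= t:
        return 3  # the leaf still has at least t - 1 keys after removal
    return 4      # the leaf would underflow: borrow from or merge with a sibling


print(classify_deletion(is_leaf=True, is_root=False, n_keys=2, t=3))  # 4
```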
To make the deletion code clearer and easier to work out, here are the subroutines I wrote.
    def search(self, k, node=None, pth=None, level=0):
        """Search for key k. Return the node holding it (a "page" in
        computer-systems terms), the path of ancestor nodes visited on
        the way down, the index of k within that node, and the level of
        that node in this BTree."""
        node = node or self.root
        if pth is None:
            pth = []  # fresh list per call; a mutable default would be shared
        idx = bisect.bisect_left(node.keys, k)
        if idx < len(node.keys) and k == node.keys[idx]:
            return (node, pth, idx, level)
        if node.isLeaf:
            raise KeyError("Key not found in this BTree!")
        pth.append(node)
        return self.search(k, node.pointers[idx], pth, level + 1)
    def getSiblings(self, node, parent, i=None):
        # i is the index of node within parent.pointers.
        if i is None:
            i = bisect.bisect_left(parent.keys, node.keys[0])
        leftSibling, rightSibling = None, None
        if i > 0:
            leftSibling = parent.pointers[i - 1]
        if i + 1 < len(parent.pointers):
            rightSibling = parent.pointers[i + 1]
        return (leftSibling, rightSibling)
    def __mergeWithSibling(self, node, leftSibling, rightSibling, pth, i):
        # Prefer the left sibling; fall back to the right one.
        if leftSibling is not None:
            self.__mergeNodes(leftSibling, node, pth[-1], i - 1, i)
        else:
            self.__mergeNodes(node, rightSibling, pth[-1], i)
    def __mergeNodes(self, node, nodeToMerge, parent, i, ptr=None):
        # Fold the separator key parent.keys[i] and nodeToMerge into node.
        if ptr is None:
            ptr = i + 1
        node.keys.append(parent.keys[i])  # `+=` would fail on a non-iterable key
        node.keys += nodeToMerge.keys
        node.pointers += nodeToMerge.pointers
        del parent.keys[i]
        del parent.pointers[ptr]
    def __checkAndMerge(self, node, pth=None):
        """Check if a given node should be merged with its sibling."""
        if pth is None:
            pth = self.search(node.keys[0])[1]
        if not pth:
            return  # node is the root, which has no siblings to merge with
        if len(node.keys) < self.t - 1:
            i = bisect.bisect_left(pth[-1].keys, node.keys[0])
            leftSibling, rightSibling = self.getSiblings(node, pth[-1], i)
            self.__mergeWithSibling(node, leftSibling, rightSibling, pth, i)
            self.__checkAndMerge(pth[-1], pth[:-1])
Here is the code (it is easy to turn into the pseudocode you may want):
    def deleteKey(self, k):
        node, pth, idx, level = self.search(k)
        if not node.isLeaf:
            # Case 1: the node holding k is an internal node, not a leaf.
            i = bisect.bisect_left(node.keys, k)
            y = node.pointers[i]  # y is the child that precedes k.
            if len(y.keys) >= self.t:
                # Case 1a: y has at least t keys.
                # Caveat: y.keys[-1] is the in-order predecessor of k only
                # when y is a leaf; in a taller tree the predecessor is the
                # last key of the rightmost leaf below y.
                print("Running case 2a")
                pred = y.keys[-1]
                self.deleteKey(pred)
                node.keys[i] = pred
            else:
                # Case 1b: y has fewer than t keys.
                z = node.pointers[i + 1]  # z is the child that follows k.
                if len(z.keys) >= self.t:
                    # The same caveat applies to the successor z.keys[0].
                    succ = z.keys[0]
                    self.deleteKey(succ)
                    node.keys[i] = succ
                else:  # Case 1c: both y and z have only t - 1 keys.
                    self.__mergeNodes(y, z, node, i)
                    # pth is the path to node, so pth[-1] is node's parent.
                    self.__checkAndMerge(node, pth)
                    self.deleteKey(k)
                    if self.root.keys == []:
                        self.root = self.root.pointers[0]
            return
        if node == self.root and node.pointers == []:
            # Case 2 (not in the original specification): the BTree has
            # only a root left and we want to delete a key from it.
            node.isLeaf = True
            node.keys.remove(k)
            return
        if len(node.keys) >= self.t:
            # Case 3: the node is a leaf, and removing a key does not
            # leave it with too few keys.
            node.keys.remove(k)
            return
        parent = pth[-1]
        i = bisect.bisect_left(parent.keys, node.keys[0])
        leftSibling, rightSibling = self.getSiblings(node, parent, i)
        nKeysLeft = len(leftSibling.keys) if leftSibling is not None else 0
        nKeysRight = len(rightSibling.keys) if rightSibling is not None else 0
        if nKeysLeft <= self.t - 1 and nKeysRight <= self.t - 1:
            # Case 4a: each sibling has either 0 or t - 1 keys. Merge with a
            # sibling and the separator key from the level above. This may
            # leave the parent with fewer than t - 1 keys, so merge parents
            # as well if necessary.
            self.__mergeWithSibling(node, leftSibling, rightSibling, pth, i)
            self.__checkAndMerge(parent, pth[:-1])
            self.deleteKey(k)
            if self.root.keys == []:
                self.root = self.root.pointers[0]
            return
        # Case 4b: one of the siblings has more than t - 1 keys, so a key
        # can be "borrowed" from it through the parent.
        print("Running case 3a.")
        if nKeysLeft > self.t - 1:
            # Borrow from the left sibling: the separator parent.keys[i - 1]
            # moves down and the sibling's largest key replaces it.
            node.keys.remove(k)
            node.keys.insert(0, parent.keys[i - 1])
            parent.keys[i - 1] = leftSibling.keys.pop()
        else:
            # Borrow from the right sibling: the separator parent.keys[i]
            # moves down and the sibling's smallest key replaces it.
            node.keys.remove(k)
            node.keys.append(parent.keys[i])
            parent.keys[i] = rightSibling.keys.pop(0)
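As for the book itself, the descriptions of cases 1-3 are correct, although I should say they are written in a very confusing way. One reason may be that in a BTree you cannot really talk about the "parent" of a node, because parent pointers are not stored at all in order to save memory; I only borrow the word "parent" to describe whatever node links to the child being operated on. I would also say that Figure 18.8(d) is completely wrong and confusing, because in that case deleting 'D' as a key of a leaf node affects neither the validity of that node nor that of the BTree as introduced in Section 18.1. The rest of the book's description of BTrees makes sense. My other subroutines for this BTree are attached below for reference: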
    # -*- coding: utf-8 -*-
    """
    Created on Sat Feb 13 17:13:00 2021
    @author: Sam_Yan
    """
    import bisect
    class BTNode:
        def __init__(self, keys=None, pointers=None, isLeaf=True):
            self.keys = keys or []
            self.pointers = pointers or []
            self.isLeaf = isLeaf
        def __str__(self):
            return ("keys: " + str(self.keys) + "\n")
    class BTree:
        def __init__(self, t=2):
            """t is the minimum degree: every non-root node holds between
            t - 1 and 2t - 1 keys (both ends included). t = 2 gives a
            2-3-4 tree."""
            assert (t >= 2 and t == int(t)), "t value of a B-Tree should be >= 2!"
            newNode = BTNode()
            self.t = t
            self.treeStr = ""
            self.root = newNode
        def insertNonFull(self, node, k):
            if node.isLeaf:
                bisect.insort(node.keys, k)
                return
            i = bisect.bisect(node.keys, k)
            if len(node.pointers[i].keys) == 2 * self.t - 1:
                self.splitChild(node, i)
                if k > node.keys[i]: i += 1  # Determine which subtree to go to.
            self.insertNonFull(node.pointers[i], k)
        def insert(self, k):
            r = self.root
            if len(self.root.keys) == 2 * self.t - 1:
                # The root is full: grow the tree by one level, then split.
                s = BTNode(isLeaf=False)
                self.root = s
                s.pointers.append(r)
                self.splitChild(s, 0)
                self.insertNonFull(s, k)
            else:
                self.insertNonFull(r, k)
        def splitChild(self, node, i):
            y = node.pointers[i]
            z = BTNode(isLeaf=y.isLeaf)
            z.keys = y.keys[self.t:]
            if not y.isLeaf:  # copy pointers if y is not a leaf:
                z.pointers = y.pointers[self.t:]
            node.pointers.insert(i + 1, z)
            node.keys.insert(i, y.keys[self.t-1])  # the median key moves up
            del y.keys[self.t-1:]
            del y.pointers[self.t:]
        def __printHelper(self, r=None, level=0):
            r = r or self.root
            if r != self.root:
                self.treeStr += (" " * level + "L-" + str(level) + "-" + str(r))
            for node in r.pointers:
                self.__printHelper(node, level + 1)
        def __delHelper(self, node=None):
            # Explicit cleanup; Python's garbage collector would reclaim
            # unreachable nodes anyway.
            if node.isLeaf:
                del node.pointers
                del node.keys
                del node
                return
            for c in node.pointers:
                self.__delHelper(c)
            del node.keys
            del node.pointers
        def __del__(self):  # Destruct this BTree.
            self.__delHelper(self.root)
        def __str__(self):
            # Method to obtain string info about this BTree.
            self.treeStr = "Root: " + str(self.root)
            self.__printHelper()
            return self.treeStr
    if __name__ == '__main__':
        # Testing samples:
        t1 = BTree(t=2)
        for i in range(28):
            t1.insert(i)
        print(t1)
        print(t1.search(27)[0])
        t2 = BTree(t=3)
        alphas = "AGDJKNCMEORPSXYZTUV"
        alphas = [ch for ch in alphas]
        for ch in alphas:
            t2.insert(ch)
        #print(t2)
        t2.insert('B')
        #print(t2)
        t2.insert('Q')
        #print(t2)
        t2.insert('L')
        #print(t2)
        t2.insert('F')
        #print(t2)
        t2.deleteKey('F')
        print(t2)
        t2.deleteKey('M')
        print(t2)
        t2.deleteKey('G')
        print(t2)
        t2.deleteKey('B')
        print(t2)
        t2.deleteKey('Z')
        print(t2)
        t2.deleteKey('D')
        print(t2)
        """
        t3 = BTree(t=2)
        #for ch in ['F', 'S', 'Q', 'K', 'C', 'L', 'H', 'T', 'V', 'W', 'M',
        #           'R', 'N', 'P', 'A', 'B', 'X', 'Y', 'D', 'Z', 'E']:
        for ch in ['F', 'S', 'Q', 'K', 'C', 'L', 'H', 'T', 'V', 'W', 'M']:
            t2.insert(ch)
        print(t2)
        """
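I hope this helps. I am genuinely looking for a simpler version of this implementation, and for anyone to point out potential bugs or problems (from my own side testing of this code, deleting, inserting, and searching keys in the BTree behave sensibly).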
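One way to hunt for such bugs is to validate the tree's invariants after every insert and delete. Below is a minimal sketch of such a checker. The function name `check_btree` and the small `Node` class are assumptions made for this example; `Node` only mirrors the `keys`, `pointers`, and `isLeaf` fields of `BTNode` so that the snippet runs on its own.

```python
class Node:
    """Minimal stand-in with the same fields as BTNode."""

    def __init__(self, keys, pointers=None):
        self.keys = keys
        self.pointers = pointers or []
        self.isLeaf = not self.pointers


def check_btree(root, t):
    """Return the tree height if the B-tree invariants hold, else raise AssertionError."""

    def visit(node, lo, hi, is_root):
        n = len(node.keys)
        assert n <= 2 * t - 1, "node has too many keys"
        assert is_root or n >= t - 1, "non-root node has too few keys"
        assert all(a < b for a, b in zip(node.keys, node.keys[1:])), "keys not sorted"
        for k in node.keys:
            assert lo is None or lo < k, "key below the range allowed by the parent"
            assert hi is None or k < hi, "key above the range allowed by the parent"
        if node.isLeaf:
            return 0
        assert len(node.pointers) == n + 1, "internal node has wrong child count"
        bounds = [lo] + node.keys + [hi]  # child j must lie between bounds[j] and bounds[j + 1]
        depths = {visit(child, bounds[j], bounds[j + 1], False)
                  for j, child in enumerate(node.pointers)}
        assert len(depths) == 1, "leaves are not all at the same depth"
        return depths.pop() + 1

    return visit(root, None, None, True)


good = Node([20], [Node([5, 10]), Node([30, 40])])
print(check_btree(good, t=3))  # 1
```

Because the checker only reads `keys`, `pointers`, and `isLeaf`, it should also accept a tree built from `BTNode` objects, for example `check_btree(t2.root, t2.t)` after each `deleteKey` call.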