I'm trying to write a binary search. I want the values from the file split into a list like this:
751 755 762 763 774 777 785 797 798 809 814 817 822 824 827 841 847 866 881 891 903 904 908 913 918 919 925 933 940 948 949 968 972 981 988 992 995 1010 1012 1016 1018 1024 1026 1040 1051 1070 1072 1075 1082 1087 1088 1090 1098 1099 1114 1126 1135 1141 1144 1152 1153 1156 1164 1174 1177 1179 1180 1186 1192 1202 1204 1207 1218 1224 1235 1249 1251 1253 1272 1289 1290 1301 1302 1315 1322 ......(15K more)
so that I can search it with the following code:
def binarySearch(newdata,number):
    i = 0
    lower = 0
    upper = len(newdata)
    while lower < upper:
        x = lower + (upper - lower) // 2
        val = newdata[x]
        if number == val:
            return x
        elif number > val:
            if lower == x:
                break
            lower = x
        elif number < val:
            upper = x
    return None
In case the data isn't already sorted, I sort it with this:
#SORT
def sorrt(data):
    result = []
    if len(data) < 2:
        return data
    mid = int(len(data)/2)
    y = sorrt(data[:mid])
    z = sorrt(data[mid:])
    while (len(y)>0) or (len(z)>0):
        if (len(y)>0) and (len(z)>0):
            if y[0] > z[0]:
                result.append(z[0])
                z.pop(0)
            else:
                result.append(y[0])
                y.pop(0)
        elif len(z)>0:
            # copy the rest of z in one step; popping while iterating skips items
            result.extend(z)
            del z[:]
        else:
            # same for the rest of y
            result.extend(y)
            del y[:]
    return result
So my whole code is:
#BINARY SEARCH
import time

fileName = 'sorted15000.txt'
#SORT
def sorrt(data):
    result = []
    if len(data) < 2:
        return data
    mid = int(len(data)/2)
    y = sorrt(data[:mid])
    z = sorrt(data[mid:])
    while (len(y)>0) or (len(z)>0):
        if (len(y)>0) and (len(z)>0):
            if y[0] > z[0]:
                result.append(z[0])
                z.pop(0)
            else:
                result.append(y[0])
                y.pop(0)
        elif len(z)>0:
            # copy the rest of z in one step; popping while iterating skips items
            result.extend(z)
            del z[:]
        else:
            # same for the rest of y
            result.extend(y)
            del y[:]
    return result
def binarySearch(newdata,number):
    i = 0
    lower = 0
    upper = len(newdata)
    #ACTIVATE
    while lower < upper:
        x = lower + (upper - lower) // 2
        val = newdata[x]
        if number == val:
            return x
        elif number > val:
            if lower == x:
                break
            lower = x
        elif number < val:
            upper = x
    return None
start =0
with open(fileName) as file:
    data = file.read().split()
start_time = time.clock()
number = raw_input("What number?: ")
newdata = sorrt(data)
pos = binarySearch(newdata,number)
print pos
print "\nTime: "
print time.clock() - start_time, "seconds"
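I want to find the position of the number I'm searching for, which is held in the variable number. But the position I get back is way off; for example, 755 returns 7565 or something like that. What is causing this? I'm sure I'm using .split() correctly here.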
Answer 0 (score: 0)
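Your algorithms are correct, but they are operating on strings while you want integers. file.read().split() and raw_input() both give you strings, and strings are compared character by character, so you end up with '123' < '5'. Your sort therefore puts the values in lexicographic order, and the position you get back refers to that order rather than the numeric one. So convert your input: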
...
data = [int(x) for x in data]
number = int(number)
newdata = sorrt(data)
...
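To see the difference on a small scale, here is a minimal sketch. The sample values and the helper name find_position are made up for illustration, and it uses the standard library's sorted() and bisect module rather than your sorrt and binarySearch, so treat it as a cross-check, not a replacement. It is written so that it also runs on Python 3.

```python
import bisect

def find_position(tokens, target):
    """Return the index of target among the numerically sorted values, or None."""
    # Convert first: sorting the raw strings would give lexicographic order.
    values = sorted(int(t) for t in tokens)
    target = int(target)  # user input arrives as a string as well
    i = bisect.bisect_left(values, target)
    if i < len(values) and values[i] == target:
        return i
    return None

# Hypothetical sample data, standing in for the contents of the file.
tokens = "751 755 762 1010 1012 10000".split()

# Strings compare character by character, so '10000' sorts before '751'.
print(sorted(tokens))

# After conversion the order is numeric, and 755 is the second value.
print(find_position(tokens, "755"))
```

With the strings left as they are, '755' ends up near the end of the list, after every value that starts with '1'. That is the kind of far-off position you are seeing.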