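I have a question about an entry test for a job. I did not pass the test. I am disguising the question out of deference to the company.
Imagine you have N people in a park of size A x B. If a person has no other person within 50 feet, he enjoys his privacy. Otherwise, his personal space is violated. Given a set of (x, y) positions, how many people have their space violated?
For example, given this list in Python:
people = [(0, 0), (1, 1), (1000, 1000)]
We would find 2 people whose space is violated: 1, 2.
We don't need to find all the sets of people, just the total number of unique people.
You cannot use a brute-force method to solve the problem. In other words, you cannot simply nest one loop over the array inside another.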
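Even though brute force is ruled out as the answer, a brute-force reference is still handy for checking faster solutions on small inputs. The sketch below is only that kind of baseline; the function name is made up for illustration, and it assumes "within 50 feet" means a distance strictly less than the radius.

```python
from itertools import combinations


def count_violations_brute_force(people, radius):
    """Count people who have at least one other person closer than radius.

    O(N^2) reference only: compares every pair once.
    Assumption: a distance exactly equal to radius is NOT a violation.
    """
    violated = set()
    r2 = radius * radius
    for (i, (x1, y1)), (j, (x2, y2)) in combinations(enumerate(people), 2):
        # Compare squared distances to avoid the square root.
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 < r2:
            violated.add(i)
            violated.add(j)
    return len(violated)


people = [(0, 0), (1, 1), (1000, 1000)]
print(count_violations_brute_force(people, 50))  # 2
```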
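I have been working on this problem on and off for a few weeks, and although I have a solution that is faster than n^2, I have not come up with one that scales.
Is Fortune's algorithm the only correct way to solve this problem?
Here is what I have in Python (not using Fortune's algorithm):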
import math
import random

random.seed(1)  # Setting random number generator seed for repeatability

TEST = True
NUM_PEOPLE = 10000
PARK_SIZE = 128000  # Meters.
CONFLICT_RADIUS = 500  # Meters.


def _get_distance(x1, y1, x2, y2):
    """
    require: x1, y1, x2, y2: all integers
    return: a distance as a float
    """
    distance = math.sqrt(math.pow((x1 - x2), 2) + math.pow((y1 - y2), 2))
    return distance


def check_real_distance(people1, people2, conflict_radius):
    """
    determine if two people are too close
    """
    # Cheap rejection on the y axis; abs() makes it apply in both directions.
    if abs(people2[1] - people1[1]) > conflict_radius:
        return False
    d = _get_distance(people1[0], people1[1], people2[0], people2[1])
    if d >= conflict_radius:
        return False
    return True


def check_for_conflicts(peoples, conflict_radius):
    # sort people by x, remembering each person's original index
    def sort_func1(the_tuple):
        return the_tuple[0]

    _peoples = []
    index = 0
    for people in peoples:
        _peoples.append((people[0], people[1], index))
        index += 1
    peoples = _peoples
    peoples = sorted(peoples, key=sort_func1)
    conflicts_dict = {}
    i = 0
    # use a type of sweep strategy
    while i < len(peoples) - 1:
        x_len = peoples[i + 1][0] - peoples[i][0]
        conflict = False
        conflicts_list = [peoples[i]]
        j = i + 1
        # only look at people whose x is within conflict_radius of person i
        while x_len <= conflict_radius and j < len(peoples):
            x_len = peoples[j][0] - peoples[i][0]
            conflict = check_real_distance(peoples[i], peoples[j], conflict_radius)
            if conflict:
                people1 = peoples[i][2]
                people2 = peoples[j][2]
                conflicts_dict[people1] = True
                conflicts_dict[people2] = True
            j += 1
        i += 1
    return len(conflicts_dict.keys())


def gen_coord():
    return int(random.random() * PARK_SIZE)


if __name__ == '__main__':
    people_positions = [[gen_coord(), gen_coord()] for i in range(NUM_PEOPLE)]
    conflicts = check_for_conflicts(people_positions, CONFLICT_RADIUS)
    print("people in conflict: {}".format(conflicts))
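Answer 0 (score: 3)
As you can see from the comments, there are many ways to approach this problem. In an interview you would probably want to list as many as you can and say what the strengths and weaknesses of each are.
For the problem as stated, where you have a fixed radius, the simplest approach is probably rounding and hashing. k-d trees and the like are powerful data structures, but they are also quite complex, and if you don't need to query them repeatedly or add and remove objects they may be overkill. Hashing can achieve linear time, whereas spatial trees are n log n, although this may depend on the distribution of the points.
To understand hashing and rounding, think of it as partitioning the space into a grid of squares whose side length equals the radius you want to check against. Each square gets its own "zip code", which you can use as a hash key to store values in that square. You can compute the zip code of a point by dividing its x and y coordinates by the radius and rounding down, like this: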
def get_zip_code(x, y, radius):
    return str(int(math.floor(x/radius))) + "_" + str(int(math.floor(y/radius)))
I'm using strings because it's simple, but you can use anything as long as you generate a unique zip code for each square.
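As one alternative to string keys, a tuple of integers is hashable and works directly as a dictionary key. This is a minimal sketch assuming Python 3 (where `math.floor` returns an `int`); `get_cell` and `neighbouring_cells` are illustrative names, not part of the code above.

```python
import math


def get_cell(x, y, radius):
    """Return the grid cell containing (x, y) as an (int, int) tuple."""
    return (math.floor(x / radius), math.floor(y / radius))


def neighbouring_cells(cell):
    """Return the cell itself plus its 8 neighbours."""
    cx, cy = cell
    return [(cx + dx, cy + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]


print(get_cell(499, 500, 500))               # (0, 1)
print(len(neighbouring_cells((0, 1))))       # 9
```

Because the cell side equals the radius, any two points closer than the radius must lie in the same cell or in adjacent cells, which is why only these 9 cells need to be inspected.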
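Create a dictionary where the keys are zip codes and the values are lists of all the people in that zip code. To check for conflicts, add the people one at a time, and before adding each one, test for conflicts with everyone in the same zip code and in that zip code's 8 neighbours. I've reused your method for keeping track of conflicts: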
def check_for_conflicts(peoples, conflict_radius):
    index = 0
    d = {}
    conflicts_dict = {}
    for person in peoples:
        # check for conflicts with people in this person's zip code
        # and neighbouring zip codes:
        for offset_x in range(-1, 2):
            for offset_y in range(-1, 2):
                offset_zip_code = get_zip_code(
                    person[0] + (offset_x * conflict_radius),
                    person[1] + (offset_y * conflict_radius),
                    conflict_radius)
                if offset_zip_code in d:
                    # get a list of people in this zip:
                    other_people = d[offset_zip_code]
                    # check for conflicts with each of them:
                    for other_person in other_people:
                        conflict = check_real_distance(person, other_person, conflict_radius)
                        if conflict:
                            people1 = index
                            people2 = other_person[2]
                            conflicts_dict[people1] = True
                            conflicts_dict[people2] = True
        # add the new person to their zip code
        zip_code = get_zip_code(person[0], person[1], conflict_radius)
        if zip_code not in d:
            d[zip_code] = []
        d[zip_code].append([person[0], person[1], index])
        index += 1
    return len(conflicts_dict.keys())
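The time complexity depends on a couple of things. If you increase the number of people but not the size of the space they are distributed in, it will be O(N^2), because the number of conflicts grows quadratically and you have to count them all. However, if you increase the space along with the number of people so that the density stays the same, it will be closer to O(N).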
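One way to see this effect is to count how many distance checks a grid pass would make as N grows. The sketch below is a rough experiment rather than a benchmark: `count_pair_checks` and `random_points` are hypothetical helpers, the points are uniformly random, and the park sizes are arbitrary choices for illustration.

```python
import random
from collections import defaultdict


def count_pair_checks(points, radius):
    """Count the distance checks made when points are added one at a time
    and each new point is compared with everyone already in its 3x3 cells."""
    grid = defaultdict(int)  # cell -> number of points added so far
    checks = 0
    for x, y in points:
        cx, cy = int(x // radius), int(y // radius)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                checks += grid.get((cx + dx, cy + dy), 0)
        grid[(cx, cy)] += 1
    return checks


def random_points(n, size, rng):
    """n uniformly random points in a size x size square."""
    return [(rng.uniform(0, size), rng.uniform(0, size)) for _ in range(n)]


rng = random.Random(1)
radius = 500
for n in (1000, 2000, 4000):
    # Same park for every n: density grows with n.
    fixed_area = count_pair_checks(random_points(n, 20000, rng), radius)
    # Park area grows in proportion to n: density stays constant.
    side = 20000 * (n / 1000) ** 0.5
    fixed_density = count_pair_checks(random_points(n, side, rng), radius)
    print(n, fixed_area, fixed_density)
```

With a fixed park, doubling N should roughly quadruple the number of checks, while with constant density it should roughly double.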
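If you are only counting unique people, you can keep a count of how many people in each zip code have at least one conflict. If that count equals the number of people in the zip code, you can exit the loop that checks for conflicts in that zip code as soon as the new person has their first conflict, since no more unique people will be found there. You could also loop twice, adding all the people in the first loop and testing in the second, breaking out of the loop as soon as you find the first conflict for each person.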
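The two-loop variant could look roughly like the sketch below. It is self-contained and uses tuple cell keys instead of string zip codes; `count_people_in_conflict` is an illustrative name, and it assumes a conflict means a distance strictly less than the radius.

```python
import math
from collections import defaultdict


def count_people_in_conflict(people, radius):
    """Two-pass grid hashing: bucket everyone, then stop at each
    person's first conflict."""
    r2 = radius * radius
    grid = defaultdict(list)  # cell -> indices of the people in that cell

    # Pass 1: put every person's index into the bucket for their cell.
    for idx, (x, y) in enumerate(people):
        grid[(math.floor(x / radius), math.floor(y / radius))].append(idx)

    def has_conflict(idx):
        x, y = people[idx]
        cx, cy = math.floor(x / radius), math.floor(y / radius)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for other in grid.get((cx + dx, cy + dy), ()):
                    if other == idx:
                        continue
                    ox, oy = people[other]
                    if (x - ox) ** 2 + (y - oy) ** 2 < r2:
                        return True  # early exit: one conflict is enough
        return False

    # Pass 2: test each person once against the 3x3 block of cells.
    return sum(1 for idx in range(len(people)) if has_conflict(idx))


print(count_people_in_conflict([(0, 0), (1, 1), (1000, 1000)], 50))  # 2
```

In a crowded park most people find a conflict after very few checks, so the early exit typically avoids the quadratic blow-up, although it is not a worst-case guarantee.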
Answer 1 (score: 0)
Here is my solution to this interesting problem:
import random


class Person():
    def __init__(self, x, y, conflict_radius=500):
        self.position = [x, y]
        self.valid = True  # stays True while no conflict has been found
        self.radius = conflict_radius**2  # squared, compared with squared distances

    def validate_people(self, people):
        P0 = self.position
        # people is sorted by x, so walk backwards from the nearest x
        for p in reversed(people):
            P1 = p.position
            dx = P1[0] - P0[0]
            dy = P1[1] - P0[1]
            dx2 = (dx * dx)
            if dx2 > self.radius:
                break
            dy2 = (dy * dy)
            d = dx2 + dy2
            if d <= self.radius:
                self.valid = False
                p.valid = False

    def __str__(self):
        p = self.position
        return "{0}:{1} - {2}".format(p[0], p[1], self.valid)


class Park():
    def __init__(self, num_people=10000, park_size=128000):
        random.seed(1)
        self.num_people = num_people
        self.park_size = park_size

    def gen_coord(self):
        return int(random.random() * self.park_size)

    def generate(self):
        return [[self.gen_coord(), self.gen_coord()] for i in range(self.num_people)]


def naive_solution(data):
    sorted_data = sorted(data, key=lambda x: x[0])
    len_sorted_data = len(sorted_data)
    result = []
    for index, pos in enumerate(sorted_data):
        # progress output
        print("{0}/{1} - {2}".format(index, len_sorted_data, len(result)))
        p = Person(pos[0], pos[1])
        p.validate_people(result)
        result.append(p)
    return result


if __name__ == '__main__':
    people_positions = Park().generate()
    people = naive_solution(people_positions)
    # valid means no conflict was found for that person
    with_conflicts = len([p for p in people if not p.valid])
    without_conflicts = len([p for p in people if p.valid])
    print("people with conflicts: {}".format(with_conflicts))
    print("people without conflicts: {}".format(without_conflicts))
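I'm sure the code can still be optimized further.
Answer 2 (score: 0)
See this topcoder link, in particular the "Closest pair" section. You can modify the closest-pair algorithm so that the distance h is always 50.
So, what you basically do is,
You can use set in C++ as a binary tree. But I couldn't find out whether Python's set supports range queries or upper_bound and lower_bound. If anyone knows, please point it out in the comments.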
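Python's built-in `set` is an unordered hash set, so it has no equivalent of `lower_bound` or `upper_bound`. One possible substitute is a plain list kept sorted with the standard `bisect` module. The sketch below adapts the closest-pair sweep with a fixed distance under that assumption; `count_conflicts_sweep` is an illustrative name, and a conflict is taken to mean a distance strictly less than the radius.

```python
from bisect import bisect_left, bisect_right, insort


def count_conflicts_sweep(people, radius):
    """Sweep in x order, keeping the points within radius behind the
    sweep line in a list sorted by y (a stand-in for a C++ std::set)."""
    order = sorted(range(len(people)), key=lambda i: people[i][0])
    active = []   # (y, index) pairs, sorted by y
    left = 0      # position in order of the oldest point still active
    in_conflict = set()
    r2 = radius * radius
    for i in order:
        x, y = people[i]
        # Drop points that are more than radius behind the sweep line.
        while people[order[left]][0] < x - radius:
            old = order[left]
            active.pop(bisect_left(active, (people[old][1], old)))
            left += 1
        # Range query: active points with y in [y - radius, y + radius].
        lo = bisect_left(active, (y - radius, -1))
        hi = bisect_right(active, (y + radius, len(people)))
        for oy, j in active[lo:hi]:
            ox = people[j][0]
            if (x - ox) ** 2 + (y - oy) ** 2 < r2:
                in_conflict.add(i)
                in_conflict.add(j)
        insort(active, (y, i))
    return len(in_conflict)


print(count_conflicts_sweep([(0, 0), (1, 1), (1000, 1000)], 50))  # 2
```

Unlike a balanced tree, inserting into or removing from a Python list shifts elements, so each update is O(n) in the size of the active list; the binary searches themselves are O(log n).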
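Answer 3 (score: 0)
I found a solution to the problem. Sort the list of coordinates by X value. Then look at each X value, one at a time. Sweep right, checking the current position against each following position, until the end of the sweep area (500 meters) is reached or a conflict is found.
If no conflict is found, sweep left in the same manner. This avoids unnecessary checks. For example, if there are 1,000,000 people in the park, then all of them will be in conflict. The algorithm checks each person only once: as soon as a conflict is found, the search stops.
The sweep itself seems to run in O(N) time in my tests (the initial sort is O(N log N)).
Here is the code: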
import math
import random

random.seed(1)  # Setting random number generator seed for repeatability

NUM_PEOPLE = 10000
PARK_SIZE = 128000  # Meters.
CONFLICT_RADIUS = 500  # Meters.


def check_real_distance(conflict_radius, people1, people2):
    # cheap y-axis test first, then compare squared distances
    return people2[1] - people1[1] <= conflict_radius \
        and math.pow(people1[0] - people2[0], 2) + math.pow(people1[1] - people2[1], 2) <= math.pow(conflict_radius, 2)


def check_for_conflicts(peoples, conflict_radius):
    peoples.sort(key=lambda x: x[0])
    conflicts_dict = {}
    i = 0
    num_checks = 0
    # use a type of sweep strategy
    while i < len(peoples):
        conflict = False
        j = i + 1
        # sweep right; skipped entirely if person i is already in conflict
        while j < len(peoples) and peoples[j][0] - peoples[i][0] <= conflict_radius \
                and not conflict and not conflicts_dict.get(i):
            num_checks += 1
            conflict = check_real_distance(conflict_radius, peoples[i], peoples[j])
            if conflict:
                conflicts_dict[i] = True
                conflicts_dict[j] = True
            j += 1
        j = i - 1
        # sweep left, only if the right sweep found nothing
        while j >= 0 and peoples[i][0] - peoples[j][0] <= conflict_radius \
                and not conflict and not conflicts_dict.get(i):
            num_checks += 1
            conflict = check_real_distance(conflict_radius, peoples[j], peoples[i])
            if conflict:
                conflicts_dict[i] = True
                conflicts_dict[j] = True
            j -= 1
        i += 1
    print("num checks is {0}".format(num_checks))
    print("num checks per person is {0}".format(num_checks / NUM_PEOPLE))
    return len(conflicts_dict.keys())


def gen_coord():
    return int(random.random() * PARK_SIZE)


if __name__ == '__main__':
    people_positions = [[gen_coord(), gen_coord()] for i in range(NUM_PEOPLE)]
    conflicts = check_for_conflicts(people_positions, CONFLICT_RADIUS)
    print("people in conflict: {}".format(conflicts))
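Answer 4 (score: 0)
I came up with an answer that seems to take O(N) time. The strategy is to sort the array by X value. For each X value, sweep right until a conflict is found or the distance exceeds the conflict distance (500 m). If no conflict is found, sweep left in the same manner. With this technique you limit the amount of searching.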