Question

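While googling for information about Python list comprehensions, I was offered a Google foobar challenge, which I have been slowly working through for fun over the past few days. The latest challenge effectively calls for generating a list of IDs, ignoring an increasing number of IDs on each new line until only one ID is left. You are then supposed to XOR (^) the IDs to produce a checksum. I wrote a working program that outputs the correct answer, but it is not efficient enough to pass all the test cases in the allotted time (it passes 6/10). A length of 50,000 should produce a result in under 20 seconds, but it takes 320.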

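Can someone steer me in the right direction? Please don't do it for me, though; I'm having fun pushing myself with this challenge. Perhaps there is a data structure or algorithm I could implement to speed up the calculation?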

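Logic behind the code:

First, the starting ID and length are taken in
A list of IDs is generated, ignoring an increasing number of IDs on each new line, starting with ignoring 0 on the first line
All the numbers in the list of IDs are XORed using a for loop
The answer is returned as an int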

import timeit
def answer(start,length):
    x = start
    lengthmodified = length
    answerlist = []
    for i in range (0,lengthmodified): #Outer for loop runs a number of times equal to the variable "length".
        prestringresult = 0
        templist = []
        for y in range (x,x + length): #Fills list with ids for new line
            templist.append(y)
        for d in range (0,lengthmodified): #Ignores an id from each line, increasing by one with each line, and starting with 0 for the first
            answerlist.append(templist[d])
        lengthmodified -= 1
        x += length    
        for n in answerlist: #XORs all of the numbers in the list via a loop and saves to prestringresult
            prestringresult ^= n
        stringresult = str(prestringresult) 
        answerlist = [] #Empties list
        answerlist.append(int(stringresult)) #Adds the result of XORing all of the numbers in the list to the answer list
    #print(answerlist[0]) #Print statement allows value that's being returned to be checked, just uncomment it
    return (answerlist[0]) #Returns Answer



#start = timeit.default_timer()
answer(17,4)
#stop = timeit.default_timer()
#print (stop - start)

Answer 1

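You're probably going to need a different approach, not just minor improvements like John's. I just wrote a solution that does answer(0, 50000) in about 2 seconds on my PC. I still go row by row, but instead of XORing all the numbers in the row's range, I do it bit by bit. How many numbers in the row have the 1-bit set? [*] An odd number of them? Then I flip the 1-bit of my answer. Then I do the same for the 2-bit, the 4-bit, the 8-bit and so on, up to the 2^30-bit. So each row takes just 31 small calculations (instead of actually XORing tens of thousands of numbers).

[*] This can be calculated quickly, in constant time, from just the start and stop of the range.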

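Edit: Since you asked for another hint, here is how to calculate how often the 1-bit is set in some range(a, b). Calculate how often it is set in range(0, a) and subtract that from how often it is set in range(0, b). It is easier when the range starts at zero. How often is the 1-bit set in some range(0, c)? Easy: c//2 times. So how often is the 1-bit set in some range(a, b)? Simply b//2 - a//2 times. The higher bits are similar, just a bit more complicated.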
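One way the "higher bits" case might look is sketched below. This is not the answerer's actual code: the helper names count_bit_set and xor_range_bitwise are hypothetical, and the sketch assumes non-negative IDs below 2^31, matching the 31 bits mentioned above.

```python
def count_bit_set(c, k):
    """How many n in range(0, c) have bit k set (assumes c >= 0)."""
    # Bit k repeats in blocks of 2**(k+1): the first half of each block has
    # the bit clear, the second half has it set.
    block = 1 << (k + 1)
    full_blocks, remainder = divmod(c, block)
    return full_blocks * (1 << k) + max(0, remainder - (1 << k))


def xor_range_bitwise(a, b, bits=31):
    """XOR of all n in range(a, b), assuming 0 <= a <= b < 2**bits."""
    result = 0
    for k in range(bits):
        # Bit k of the XOR is 1 exactly when an odd number of values set it.
        if (count_bit_set(b, k) - count_bit_set(a, k)) % 2:
            result |= 1 << k
    return result
```

For k = 0 the count reduces to c//2, which agrees with the hint above.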

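Edit 2: Oh wait, I just remembered... there is an easy way to calculate the XOR of all the numbers in some range(a, b). Again, split the work into range(0, a) and range(0, b). The XOR of all the numbers in some range(0, c) is easy because there is a nice pattern (you will see it if you do this for all c from 0 to, say, 30). Using this, I now solve answer(0, 50000) in about 0.04 seconds.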
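To look for that pattern yourself, a small brute-force helper is enough. The sketch below is an illustration only, and the name prefix_xors is hypothetical; it lists the XOR of range(0, c) for every c up to a limit without revealing the closed form.

```python
from itertools import accumulate
from operator import xor


def prefix_xors(limit):
    """Return a list p where p[c] is the XOR of all numbers in range(0, c)."""
    # The XOR of the empty range(0, 0) is 0; accumulate supplies the rest.
    return [0] + list(accumulate(range(limit), xor))


p = prefix_xors(30)
for c, value in enumerate(p):
    print(c, value)

# Splitting the work as described: XOR of range(a, b) is p[a] ^ p[b].
print(p[3] ^ p[7])  # XOR of 3, 4, 5, 6
```

The split works because every number below a appears in both prefixes and cancels itself out.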

Answer 2

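There is really no need for templist and answerlist. Let's take a few passes over your code to see how to eliminate them.

First, let's initialize templist in a single line. This: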

templist = []
for y in range (x,x + length):
    templist.append(y)

becomes this:

templist = list(range(x, x + length))

Then let's do the same for answerlist. This:

for d in range (0,lengthmodified):
    answerlist.append(templist[d])

becomes this:

answerlist.extend(templist[:lengthmodified])

Now let's look at how they are used later. If we ignore lengthmodified -= 1 and x += length for the moment, we have:

templist = list(range(x, x + length))
answerlist.extend(templist[:lengthmodified])

for n in answerlist:
    prestringresult ^= n

answerlist = []

Instead of extending answerlist, iterating over it, and then clearing it, it would be faster to just iterate over templist.

templist = list(range(x, x + length))

for n in templist[:lengthmodified]:
    prestringresult ^= n

Now there is no need for templist either, so let's skip building it as well.

for n in range(x, x + lengthmodified):
    prestringresult ^= n

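templist and answerlist are gone.

The only missing piece here is getting answerlist.append(int(stringresult)) working again. I'll leave that for you to figure out.

Overall, the lesson here is to avoid explicit for loops whenever possible. Writing lots of for loops that iterate over containers is a C way of thinking. In Python there are often ways to chew through a collection all at once. Doing so lets you take advantage of the language's fast built-in operations.

As a bonus, idiomatic Python is also easier to read.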
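As one possible illustration of chewing through a collection all at once, the per-row loop could be folded with functools.reduce and operator.xor from the standard library. This is a sketch rather than part of the walkthrough above, and the helper name xor_of_row is hypothetical.

```python
from functools import reduce
from operator import xor


def xor_of_row(x, count):
    """XOR of the `count` consecutive IDs starting at `x`."""
    # The initial value 0 makes an empty row XOR to 0.
    return reduce(xor, range(x, x + count), 0)


# First row of answer(17, 4): the IDs 17, 18, 19, 20.
print(xor_of_row(17, 4))
```

This still visits every ID, so it tidies the code without changing how the running time grows.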

Answer 3

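I was able to get a small improvement by not using lists, but it still fails on large numbers. The nested loops kill the speed. I think you need to follow Pochmann's logic, since brute force is rarely the way to go for this type of problem.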

Answer 4

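Most people will exceed the time limit on this problem. I did! The problem boils down to this: "Find the XOR of all the numbers in a given range in constant time." Yes, constant time!

So for 3-6, the answer 3 ^ 4 ^ 5 ^ 6 = 4 should be found in O(1) time.

Solution: XOR is associative and commutative, so A ^ B ^ C can be written as B ^ A ^ C. We also know what XOR does to a pair of bits: equal bits give 0 and different bits give 1, so any value XORed with itself is 0.

From these two properties, the XOR of all the numbers from 3 to 6 can be written as: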

3^4^5^6 = (0^1^2)^(0^1^2) ^ (3^4^5^6)
        = (0^1^2^3^4^5^6) ^ (0^1^2) (this comes from the associative nature of XOR)
        = XOR of all the numbers from 0 to 6 ^ XOR of all the numbers from 0 to 2 ...eq(1)

So if we can find the XOR of all the numbers from 0 up to some integer in constant time, we have our answer.

Fortunately, there is a pattern.

Look at this example:

(0-1): 0 ^ 1 = 1 (1)
(0-2): 0 ^ 1 ^ 2 = 3 (2+1)
(0-3): 0 ^ 1 ^ 2 ^ 3 = 0 (0)
(0-4): 0 ^ 1 ^ 2 ^ 3 ^ 4 = 4 (4)

(0-5): 0 ^ 1 ^ 2 ^ 3 ^ 4 ^ 5 = 1 (1)
(0-6): 0 ^ 1 ^ 2 ^ 3 ^ 4 ^ 5 ^ 6 = 7 (6+1)
(0-7): 0 ^ 1 ^ 2 ^ 3 ^ 4 ^ 5 ^ 6 ^  7 = 0 (0)
(0-8): 0 ^ 1 ^ 2 ^ 3 ^ 4 ^ 5 ^ 6 ^ 7 ^ 8 = 8 (8)


So the pattern for the XOR of all the integers from 0 to n is:
if n%4 == 1 then answer = 1
if n%4 == 2 then answer = n+1
if n%4 == 3 then answer = 0
if n%4 == 0 then answer = n

Therefore, XOR(0-6) becomes 7 (since 6%4 ==2) and XOR(0-2) becomes 3 (since 2%4 ==2)

Therefore, the eq(1) now becomes:
3^4^5^6 = 7 ^ 3 = 4

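Now the problem is easy. Most of us got stuck on the time-limit-exceeded error because we tried to XOR the numbers one by one inside every loop, which becomes enormous as the input and the number of iterations grow. Here is my working solution in Python, which passed all of Google's test cases: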

#Main Program
def answer(start, length):
    checkSum = 0
    for l in range(length, 0, -1):
        checkSum = checkSum ^ (getXor(start + l-1) ^ getXor(start-1))
        start = start + length
    return checkSum

def getXor(x):
    result = [x, 1, x+1, 0]
    return result[x % 4]
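A quick way to gain confidence in the constant-time formula is to compare it with a direct brute-force computation on small inputs. The sketch below restates the solution above under the hypothetical names get_xor and answer_fast and adds a hypothetical answer_brute for comparison; it is a sanity check under those assumptions, not part of the submitted solution.

```python
from functools import reduce
from operator import xor


def get_xor(x):
    # XOR of 0..x inclusive via the n % 4 pattern.
    # For x = -1 Python gives -1 % 4 == 3, so the result is 0 (empty range).
    return [x, 1, x + 1, 0][x % 4]


def answer_fast(start, length):
    checksum = 0
    for row_len in range(length, 0, -1):
        # XOR of start..start+row_len-1 is prefix(end) ^ prefix(start - 1).
        checksum ^= get_xor(start + row_len - 1) ^ get_xor(start - 1)
        start += length
    return checksum


def answer_brute(start, length):
    checksum = 0
    for row in range(length):
        row_start = start + row * length
        # Row number `row` keeps its first (length - row) IDs.
        checksum = reduce(xor, range(row_start, row_start + length - row), checksum)
    return checksum


for s in range(0, 20):
    for n in range(1, 12):
        assert answer_fast(s, n) == answer_brute(s, n)
print(answer_fast(17, 4))
```

Each row costs two list lookups, so the whole checksum takes time proportional to length rather than length squared.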

An efficient way to calculate an XOR (^) checksum based on a list of IDs