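I wrote code to solve the problem below, but it fails on the last two test cases. The logic I used seems sound, and even after a coworker reviewed it, neither of us can figure out why it works for the first 8 test cases but not for the last two (which were randomly generated).
Given a string, return the position of the input string in an alphabetically sorted list of all possible permutations of the characters in that string. For example, the permutations of ABAB are [AABB, ABAB, ABBA, BAAB, BABA, BBAA], and the position of ABAB in that list is 2.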
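For small inputs, the definition above can be checked directly by brute force. This is only a reference sketch for verifying expected values; the name `brute_force_position` is made up for illustration and is not part of the original code.

```python
from itertools import permutations

def brute_force_position(word):
    """Return the 1-based position of `word` among its sorted distinct permutations."""
    # set() drops the duplicates produced by repeated letters,
    # sorted() puts the remaining strings in alphabetical order.
    ordered = sorted(set("".join(p) for p in permutations(word)))
    return ordered.index(word) + 1

print(brute_force_position("ABAB"))  # 2
```

This enumerates every ordering of the letters, so it is only usable for short words.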
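For larger inputs it is impossible (or at least inefficient) to generate the list of permutations, so the point is to find the position without generating the alphabetical list. This can be done using the frequencies of the characters. For the example above, the first character of ABAB is A, so before = 0, after = .5, and between = 6. That means decreasing maxx by .5 * 6 = 3, for a minn of 1 and a maxx of 3, which leaves only [AABB, ABAB, ABBA], the perms with A as the first character! The characters left are then BAB, with minn = 1, maxx = 3, and between = 3. For B, before = .33 and after = 0, so minn increases by 3 * .33 = 1, for a minn of 2 and a maxx of 3, which corresponds to [ABAB, ABBA], the perms with AB as the first two characters! Keep doing this for every character and it will find the input in the list.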
## Imports
import operator
from collections import Counter
from math import factorial
from functools import reduce

## Main function, returns list position
def listPosition(word):
    #turns string into list of numbers, A being 1, B being 2, and so on
    val = [ord(char) - 96 for char in word.lower()]
    #the result has to be between 1 and the number of permutations
    minn = 1
    maxx = npermutations(word)
    #so we just increase the min and decrease the max based on the sum of freq
    #of the characters less than and greater than each character
    for indx in range(len(word)):
        between = (maxx+1-minn)
        before,after = sumfreq(val[indx:],val[indx])
        minn = minn + int(round((between * before),0))
        maxx = maxx - int(between * after)
    return maxx #or minn, doesn't matter. they're equal at this point

## returns the number of permutations for the string (this works)
def npermutations(word):
    num = factorial(len(word))
    mults = Counter(word).values()
    den = reduce(operator.mul, (factorial(v) for v in mults), 1)
    return int(num / den)

## returns frequency as a percent for the character in the list of chars
def frequency(val,value):
    f = [val.count(i)/len(val) for i in val]
    indx = val.index(value)
    return f[indx]

#returns sum of frequencies for all chars < (before) and > (after) the said char
def sumfreq(val,value):
    before = [frequency(val,i) for i in [i for i in set(val) if i < value]]
    after = [frequency(val,i) for i in [i for i in set(val) if i > value]]
    return sum(before),sum(after)

tests = ['A','ABAB','AAAB','BAAA','QUESTION','BOOKKEEPER','ABCABC','IMMUNOELECTROPHORETICALLY','ERATXOVFEXRCVW','GIZVEMHQWRLTBGESTZAHMHFBL']
print(listPosition(tests[0]),"should equal 1")
print(listPosition(tests[1]),"should equal 2")
print(listPosition(tests[2]),"should equal 1")
print(listPosition(tests[3]),"should equal 4")
print(listPosition(tests[4]),"should equal 24572")
print(listPosition(tests[5]),"should equal 10743")
print(listPosition(tests[6]),"should equal 13")
print(listPosition(tests[7]),"should equal 718393983731145698173")
print(listPosition(tests[8]),"should equal 1083087583") #off by one digit?
print(listPosition(tests[9]),"should equal 5587060423395426613071") #off by a lot?
Answer 0 (score: 3)
You can use logic that needs only integer arithmetic. First, create the lexicographically first permutation:
BOOKKEEPER -> BEEEKKOOPR
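Then, for each letter, you can calculate how many unique permutations it takes to move it into its place. Since the first letter, B, is already in place, we can ignore it and look at the remaining letters: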
B EEEKKOOPR (first)
B OOKKEEPER (target)
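To know how many permutations it takes to bring the O to the front, we count how many unique permutations have E in front, and then how many have K in front: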
E+EEKKOOPR -> 8! / (2! * 2! * 2!) = 40320 / 8 = 5040
K+EEEKOOPR -> 8! / (3! * 2!) = 40320 / 12 = 3360
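where 8 is the number of letters left to permute, and 2 and 3 are the multiplicities of the repeated letters. So after 8400 permutations we are at: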
BO EEEKKOPR
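The two counts above can be reproduced with a small helper. This is a minimal sketch of the n! / (m1! * m2! * ...) formula; the name `unique_permutations` is hypothetical and not from the answer.

```python
from collections import Counter
from math import factorial

def unique_permutations(letters):
    """Count the distinct orderings of `letters`: n! / (m1! * m2! * ...)."""
    count = factorial(len(letters))
    for multiplicity in Counter(letters).values():
        # Integer division is exact here: every partial quotient is a whole number.
        count //= factorial(multiplicity)
    return count

print(unique_permutations("EEKKOOPR"))  # 5040, permutations with E in front
print(unique_permutations("EEEKOOPR"))  # 3360, permutations with K in front
```

Using `//` keeps everything in integers, so no precision is lost for long words.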
Now we again calculate how many permutations it takes to bring the second O to the front:
E+EEKKOPR -> 7! / (2! * 2!) = 5040 / 4 = 1260
K+EEEKOPR -> 7! / (3!) = 5040 / 6 = 840
So after 10500 permutations we are at:
BOO EEEKKPR
Then we calculate how many permutations it takes to bring the K to the front:
E+EEKKPR -> 6! / (2! * 2!) = 720 / 4 = 180
So after 10680 permutations we are at:
BOOK EEEKPR
Then we calculate how many permutations it takes to bring the second K to the front:
E+EEKPR -> 5! / 2! = 120 / 2 = 60
So after 10740 permutations we are at:
BOOKK EEEPR
The next two letters are already in place, so we can skip ahead to:
BOOKKEE EPR
Then we calculate how many permutations it takes to bring the P to the front:
E+PR -> 2! = 2
So after 10742 permutations we are at:
BOOKKEEP ER
The last two letters are also already in order, so the answer is 10743 (adding 1 because a 1-based index is asked for).
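The whole walk-through can be traced in code. The sketch below is one possible rendering of the steps described in this answer, not the answerer's own code; `position_with_trace` and `count_arrangements` are hypothetical names, and the trace records the running total of skipped permutations after each letter is fixed.

```python
from collections import Counter
from math import factorial

def count_arrangements(counts):
    """Distinct orderings of a multiset given as a letter -> count mapping."""
    total = factorial(sum(counts.values()))
    for m in counts.values():
        total //= factorial(m)  # factorial(0) == 1, so zero counts are harmless
    return total

def position_with_trace(word):
    remaining = Counter(word)
    skipped = 0   # permutations that sort before the current prefix
    trace = []    # running total after each letter of `word`
    for letter in word:
        # Every smaller letter that could stand here heads a block of permutations.
        for smaller in sorted(c for c in remaining if c < letter):
            rest = remaining.copy()
            rest[smaller] -= 1
            skipped += count_arrangements(rest)
        remaining[letter] -= 1
        if remaining[letter] == 0:
            del remaining[letter]
        trace.append(skipped)
    return skipped + 1, trace  # +1 for the 1-based index

pos, trace = position_with_trace("BOOKKEEPER")
print(pos)    # 10743
print(trace)  # [0, 8400, 10500, 10680, 10740, 10740, 10740, 10742, 10742, 10742]
```

The running totals match the intermediate values 8400, 10500, 10680, 10740 and 10742 from the explanation.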
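Answer 1 (score: 2)
@rici pointed out that this is a floating-point error (see Is floating point math broken?). Fortunately, Python has fractions. Judicious use of fractions.Fraction fixes the problem without changing the body of the code, e.g.: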
from fractions import Fraction
...
## returns the number of permutations for the string (this works)
def npermutations(word):
    num = factorial(len(word))
    mults = Counter(word).values()
    den = reduce(operator.mul, (factorial(v) for v in mults), 1)
    return int(Fraction(num, den))

## returns frequency as a percent for the character in the list of chars
def frequency(val,value):
    f = [Fraction(val.count(i),len(val)) for i in val]
    indx = val.index(value)
    return f[indx]
...
In []:
print(listPosition(tests[0]),"should equal 1")
print(listPosition(tests[1]),"should equal 2")
print(listPosition(tests[2]),"should equal 1")
print(listPosition(tests[3]),"should equal 4")
print(listPosition(tests[4]),"should equal 24572")
print(listPosition(tests[5]),"should equal 10743")
print(listPosition(tests[6]),"should equal 13")
print(listPosition(tests[7]),"should equal 718393983731145698173")
print(listPosition(tests[8]),"should equal 1083087583")
print(listPosition(tests[9]),"should equal 5587060423395426613071")
Out[]:
1 should equal 1
2 should equal 2
1 should equal 1
4 should equal 4
24572 should equal 24572
10743 should equal 10743
13 should equal 13
718393983731145698173 should equal 718393983731145698173
1083087583 should equal 1083087583
5587060423395426613071 should equal 5587060423395426613071
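To see why floats break down only on the large cases, here is a small illustration (my own sketch, with variable names chosen for the example). A Python float carries a 53-bit significand, so an odd integer as large as the last expected answer cannot be stored exactly, while Fraction keeps an exact numerator and denominator.

```python
from fractions import Fraction

target = 5587060423395426613071  # expected position for the last test case

# Round-tripping through float loses the low-order digits of such a large odd integer.
print(int(float(target)) == target)               # False

# Fraction arithmetic is exact, so scaling by 1/3 and back changes nothing.
third = Fraction(1, 3)
print(third * 3 == 1)                             # True
print(Fraction(target, 1) * third * 3 == target)  # True
```

The smaller test cases stay below the range where floats start skipping integers, which is consistent with only the two largest tests failing.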
Based on @m69's excellent explanation, here is a simpler implementation:
from math import factorial
from collections import Counter
from functools import reduce
from operator import mul

def position(word):
    charset = Counter(word)
    pos = 1  # Per OP 1 index
    for letter in word:
        chars = sorted(charset)
        for char in chars[:chars.index(letter)]:
            ns = Counter(charset) - Counter([char])
            pos += factorial(sum(ns.values())) // reduce(mul, map(factorial, ns.values()))
        charset -= Counter([letter])
    return pos
This gives the same results as above:
In []:
tests = ['A', 'ABAB', 'AAAB', 'BAAA', 'QUESTION', 'BOOKKEEPER', 'ABCABC',
'IMMUNOELECTROPHORETICALLY', 'ERATXOVFEXRCVW', 'GIZVEMHQWRLTBGESTZAHMHFBL']
print(position(tests[0]),"should equal 1")
print(position(tests[1]),"should equal 2")
print(position(tests[2]),"should equal 1")
print(position(tests[3]),"should equal 4")
print(position(tests[4]),"should equal 24572")
print(position(tests[5]),"should equal 10743")
print(position(tests[6]),"should equal 13")
print(position(tests[7]),"should equal 718393983731145698173")
print(position(tests[8]),"should equal 1083087583")
print(position(tests[9]),"should equal 5587060423395426613071")
Out[]:
1 should equal 1
2 should equal 2
1 should equal 1
4 should equal 4
24572 should equal 24572
10743 should equal 10743
13 should equal 13
718393983731145698173 should equal 718393983731145698173
1083087583 should equal 1083087583
5587060423395426613071 should equal 5587060423395426613071