我需要生成所有数字的序列,匹配特定数字范围内的模式。例如:
[0-9]*9[0-9]*
)当然我可以使用蛮力方法,我遍历所有数量的范围并根据模式测试每个数字。这是一个用Python完成的示例实现:
def brute_force(start, end, limit, pattern):
if start < pattern:
start = pattern
if pattern > end:
return []
pattern_str = str(pattern)
generator = (n for n in range(start, end) if pattern_str in str(n))
return list(next(generator) for _ in range(limit))
不幸的是,范围非常大(10 ^ 7个数字),我必须经常在我的程序中更改模式。因此,我需要一种更有效的方法。
重要的是要注意我需要有一个排序列表,我经常只需要X个匹配的数字(参见上面例子中的limit参数)。
我猜这种问题有某种标准算法,但我不知道要搜索什么。有什么想法吗?
答案 0 :(得分:3)
这是一个通用解决方案,可生成最大匹配*(pat)*
的固定大小的数字,其中pat
是任意数字序列。
它按递增顺序生成数字,没有重复。因为它是一个生成器,所以可以很容易地使用它生成前几个结果而不生成整个列表。 (见下面的最后一个例子)。
def numbers(pat, k, matched=1):
if (matched >> len(pat)) & 1:
for x in xrange(10 ** k):
yield x
return
if not ((matched << k) >> len(pat)):
return
for d in xrange(10):
nm = 1
for i, pd in enumerate(pat):
if pd == str(d) and (matched >> i) & 1:
nm |= 1 << (i+1)
for n in numbers(pat, k - 1, nm):
yield 10 ** (k - 1) * d + n
它的工作原理是记录到目前为止已经匹配了多少pat
。为了避免在pat
部分匹配并且下一个数字不匹配时回溯,它保留了所有可能的部分匹配的位图,这与(非回溯)正则表达式匹配器的方式非常相似。 / p>
这里有一些测试代码,它会根据缓慢但明显正确的实现进行检查。
def numbers_slow(pat, k):
for i in xrange(10 ** k):
if pat in str(i):
yield i
test_cases = [
('9', 3),
('9', 4),
('123', 4),
('22', 4),
]
for pat, k in test_cases:
got = list(numbers(pat, k))
want = list(numbers_slow(pat, k))
if got != want:
print 'numbers(%s, %d) = %s, want %s' % (pat, k, got, want)
以下是问题中给出的例子:
print list(numbers('9', 3))
[9, 19, 29, 39, 49, 59, 69, 79, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 109,
119, 129, 139, 149, 159, 169, 179, 189, 190, 191, 192, 193, 194, 195, 196, 197,
198, 199, 209, 219, 229, 239, 249, 259, 269, 279, 289, 290, 291, 292, 293, 294,
295, 296, 297, 298, 299, 309, 319, 329, 339, 349, 359, 369, 379, 389, 390, 391,
392, 393, 394, 395, 396, 397, 398, 399, 409, 419, 429, 439, 449, 459, 469, 479,
489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 509, 519, 529, 539, 549,
559, 569, 579, 589, 590, 591, 592, 593, 594, 595, 596, 597, 598, 599, 609, 619,
629, 639, 649, 659, 669, 679, 689, 690, 691, 692, 693, 694, 695, 696, 697, 698,
699, 709, 719, 729, 739, 749, 759, 769, 779, 789, 790, 791, 792, 793, 794, 795,
796, 797, 798, 799, 809, 819, 829, 839, 849, 859, 869, 879, 889, 890, 891, 892,
893, 894, 895, 896, 897, 898, 899, 900, 901, 902, 903, 904, 905, 906, 907, 908,
909, 910, 911, 912, 913, 914, 915, 916, 917, 918, 919, 920, 921, 922, 923, 924,
925, 926, 927, 928, 929, 930, 931, 932, 933, 934, 935, 936, 937, 938, 939, 940,
941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952, 953, 954, 955, 956,
957, 958, 959, 960, 961, 962, 963, 964, 965, 966, 967, 968, 969, 970, 971, 972,
973, 974, 975, 976, 977, 978, 979, 980, 981, 982, 983, 984, 985, 986, 987, 988,
989, 990, 991, 992, 993, 994, 995, 996, 997, 998, 999]
这是一个不太容易的案例:
print list(numbers('11', 3))
[11, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 211, 311, 411, 511, 611,
711, 811, 911]
并且显示该方法的情况即使对于大量数字也是有效的:
print list(numbers('121212121', 10))
[121212121, 1121212121, 1212121210, 1212121211, 1212121212, 1212121213,
1212121214, 1212121215, 1212121216, 1212121217, 1212121218, 1212121219,
2121212121, 3121212121, 4121212121, 5121212121, 6121212121, 7121212121,
8121212121, 9121212121]
为了证明限制生成结果的能力,这里包含长达20个数字的前21个数字包含&#39; 100000&#39;:
print itertools.islice(numbers('100000', 20), 21)
[100000, 1000000, 1000001, 1000002, 1000003, 1000004, 1000005, 1000006, 1000007,
1000008, 1000009, 1100000, 2100000, 3100000, 4100000, 5100000, 6100000, 7100000,
8100000, 9100000, 10000000]
答案 1 :(得分:1)
由于您只需要第一个x
个数字,让我们从数字位数的角度来看一下。
一位数很容易 对于两位数,请获取所需的其他数字的所有选项(8种可能性)。然后,从最小到最大的工作,获得最小值的组合。然后使用更大的值组合再次迭代。 三位数或更多位的情况与两位数的情况相同。
示例 - 三位数字:
counter = 0;
options = generate_all_combinations_for_k_digits_in_order(2) //for the 3 digit case.
//options = [10, 11, 12, ..., 20, 21, ..., 91, 92, ..., 99]
for (int i = 0; i < number_of_digits - 1; i++)
{
for (int j = 0; j < options.length; j++)
{
print_9_in_position_i_of_number_j(i,j);
counter++;
if (counter == x)
break; // End the loop somehow...
}
}
答案 2 :(得分:1)
以下是模式:
n < 10^m and n has m digits
我们可以为f(n = 10^m - 1)
=> 9*10^(m-1) + 1..10^(m-2)-1 // that's like the 9000 + (001..999)
=> 8*10^(m-1) + f(n-1) // now for each leftmost digit down to 1,
=> ... // we append all the results from f(n-1)
=> 1*10^(m-1) + f(n-1)
:
def f m
if m == 1
return [9]
end
result = []
((10**m).to_i - 1).downto(9 * 10**(m-1).to_i).each do |i|
result << i
end
temp = 10**(m-1)
8.downto(0).each do |i|
f(m-1).each do |j|
result << (i * temp + j)
end
end
result
end
Ruby代码:
ruby 2.3.1p112 (2016-04-26 revision 54768) [x86_64-linux]
=> :f
> f 3
=> [999, 998, 997, 996, 995, 994, 993, 992, 991, 990, 989, 988, 987, 986, 985, 984, 983
, 982, 981, 980, 979, 978, 977, 976, 975, 974, 973, 972, 971, 970, 969, 968, 967, 966
, 965, 964, 963, 962, 961, 960, 959, 958, 957, 956, 955, 954, 953, 952, 951, 950, 949
, 948, 947, 946, 945, 944, 943, 942, 941, 940, 939, 938, 937, 936, 935, 934, 933, 932
, 931, 930, 929, 928, 927, 926, 925, 924, 923, 922, 921, 920, 919, 918, 917, 916, 915
, 914, 913, 912, 911, 910, 909, 908, 907, 906, 905, 904, 903, 902, 901, 900, 899, 898
, 897, 896, 895, 894, 893, 892, 891, 890, 889, 879, 869, 859, 849, 839, 829, 819, 809
, 799, 798, 797, 796, 795, 794, 793, 792, 791, 790, 789, 779, 769, 759, 749, 739, 729
, 719, 709, 699, 698, 697, 696, 695, 694, 693, 692, 691, 690, 689, 679, 669, 659, 649
, 639, 629, 619, 609, 599, 598, 597, 596, 595, 594, 593, 592, 591, 590, 589, 579, 569
, 559, 549, 539, 529, 519, 509, 499, 498, 497, 496, 495, 494, 493, 492, 491, 490, 489
, 479, 469, 459, 449, 439, 429, 419, 409, 399, 398, 397, 396, 395, 394, 393, 392, 391
, 390, 389, 379, 369, 359, 349, 339, 329, 319, 309, 299, 298, 297, 296, 295, 294, 293
, 292, 291, 290, 289, 279, 269, 259, 249, 239, 229, 219, 209, 199, 198, 197, 196, 195
, 194, 193, 192, 191, 190, 189, 179, 169, 159, 149, 139, 129, 119, 109, 99, 98, 97
, 96, 95, 94, 93, 92, 91, 90, 89, 79, 69, 59, 49, 39, 29, 19, 9]
> f(4).length
=> 3439
结果:
{{1}}
答案 3 :(得分:1)
我尝试使用固定长度字符串的“下一个词典”算法。
JavaScript代码:
function f(str,pat){
if (str[0] == 0
|| isNaN(Number(str))
|| isNaN(Number(pat))
|| str.length == pat.length
|| str == String(Math.pow(10,str.length - pat.length) - 1 + pat) ){
return str;
}
var AP = "",
i = 0;
while (!AP.match(pat) && i < str.length){
AP += str[i++];
}
// if there are digits to the right of the pattern
if (AP.length < str.length){
// increment the string if the pattern won't break
if (String(Number(str) + 1).match(pat)){
return Number(str) + 1;
// otherwise, find first number that may be incremented
} else {
var i = str.length - pat.length - 1
+ (pat.match(/^9+/) ? pat.match(/9+/)[0].length : 0)
while (str[i] == 9 && i-- >= 0){
}
// increment at i, move pattern to the right, set zeros in between
return (Number(str.substr(0,i + 1)) + 1)
* Math.pow(10,str.length - pat.length)
+ Number(pat)
}
// if the pattern is all the way on the right
} else {
// find rightmost placement for the pattern along an adjacent
// match where the string could be incremented
i = 1;
for (;i<pat.length; i++){
var j = str.length - pat.length - i,
patShifted = Number(pat) * Math.pow(10,i);
if (str.substr(j,i) == pat.substr(0,i)
&& patShifted > Number(str.substr(j))){
return Number(str.substr(0,j) + String(patShifted));
}
}
// if no left-shifted placement was found
return (Number(str.substr(0,str.length - pat.length)) + 1)
* Math.pow(10,pat.length) + Number(pat);
}
}