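Suppose you have organized thousands of files as follows: first you sort them by file name (case-sensitively, so that uppercase names come before lowercase ones), then you group them into folders, each named after the first and last file it contains. For example, the folders might look like this: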
Abel -> Cain
Camel -> Sloth
Stork -> basket
basking -> sleuth
tiger -> zebra
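Now, given a case-insensitive search string s, determine which folders could contain a file that matches s. You cannot, and need not, look inside the folders: the file does not actually have to exist.
Some examples: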
("Abel", "Cain") matches s = "blue", since it contains "Blue"
("Stork", "basket") matches s = "arctic", since it contains "arctic"
("FA", "Fb") matches s = "foo", since it contains "FOo"
("Fa", "Fb") does NOT match s = "foo"
Formally: given a closed range [a, b] and a lowercase string s, determine whether there is any string c in [a, b] such that lower(c) = s.
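My first hunch was to do a case-insensitive comparison against the bounds of the range. But it is easy to see from the last example that this is incorrect.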
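To make the failure of that first hunch concrete, the sketch below compares a case-insensitive bounds check with a case-sensitive check of one specific file name, using the folder ("FA", "Fb") from the examples. The class and method names (`NaiveBoundsCheck`, `naiveMatch`, `inRange`) are illustrative and not from the original post; the sketch assumes plain ASCII names and the ordering of Java's `String.compareTo`, in which uppercase letters sort before lowercase ones.

```java
final class NaiveBoundsCheck {
    // Naive idea: compare the search string with both bounds while ignoring case.
    static boolean naiveMatch(String first, String last, String s) {
        return first.compareToIgnoreCase(s) <= 0 && last.compareToIgnoreCase(s) >= 0;
    }

    // Case-sensitive membership test for one concrete file name.
    static boolean inRange(String first, String last, String name) {
        return first.compareTo(name) <= 0 && last.compareTo(name) >= 0;
    }

    public static void main(String[] args) {
        // The naive check rejects the folder ("FA", "Fb") for "foo" ...
        System.out.println(naiveMatch("FA", "Fb", "foo")); // prints false
        // ... although the concrete name "FOo" lies inside that folder.
        System.out.println(inRange("FA", "Fb", "FOo"));    // prints true
    }
}
```

The folder ("Stork", "basket") with s = "arctic" trips the naive check in the same way: ignoring case, "arctic" sorts before "Stork", yet the literal name "arctic" sorts after it.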
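The brute-force solution is to generate all potential file names. For example, the input string "abc" generates the candidates "ABC", "ABc", "AbC", "Abc", "aBC", "aBc", "abC", "abc". Then you just test each candidate against the bounds. An example of this brute-force solution is given below. It is O(2^n).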
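As a rough illustration of the candidate enumeration described above, here is a minimal Java sketch that uses a bit mask to pick the case of each character. The names `BruteForceMatch`, `caseVariants`, and `matches` are hypothetical, and the sketch assumes ASCII letters and search strings short enough for 2^n candidates to be practical.

```java
import java.util.ArrayList;
import java.util.List;

final class BruteForceMatch {
    // Enumerates every upper/lower-case spelling of s.
    // Bit (n - 1 - i) of mask decides the case of character i (1 = lowercase).
    static List<String> caseVariants(String s) {
        List<String> out = new ArrayList<>();
        int n = s.length();
        for (long mask = 0; mask < (1L << n); mask++) {
            StringBuilder sb = new StringBuilder(n);
            for (int i = 0; i < n; i++) {
                char ch = s.charAt(i);
                boolean lower = ((mask >> (n - 1 - i)) & 1L) == 1L;
                sb.append(lower ? Character.toLowerCase(ch) : Character.toUpperCase(ch));
            }
            out.add(sb.toString());
        }
        return out;
    }

    // True if any spelling of s lies in the closed, case-sensitive range [first, last].
    static boolean matches(String first, String last, String s) {
        for (String candidate : caseVariants(s)) {
            if (first.compareTo(candidate) <= 0 && last.compareTo(candidate) >= 0) {
                return true;
            }
        }
        return false;
    }
}
```

For "abc", `caseVariants` yields the eight candidates in the order listed above, from "ABC" to "abc".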
My question is: is there an algorithm that is both fast and correct?
Brute-force solution in Clojure:
(defn range-contains
  "True if string lies in the closed, case-sensitive range [first, last]."
  [first last string]
  (and (<= (compare first string) 0)
       (>= (compare last string) 0)))

(defn generate-cases
  "Generates all lowercase/uppercase combinations of a word"
  [string]
  (if (empty? string)
    [nil]
    (for [head [(java.lang.Character/toUpperCase (first string))
                (java.lang.Character/toLowerCase (first string))]
          tail (generate-cases (rest string))]
      (cons head tail))))

(defn range-contains-insensitive
  "True if any case combination of string lies in [first, last]."
  [first last string]
  (let [f (fn [acc candidate] (or acc (range-contains first last (apply str candidate))))]
    (reduce f false (generate-cases string))))

;; Midje test (fact and => come from midje.sweet)
(fact "Range overlapping case insensitive"
  (range-contains-insensitive "A" "Z" "g") => true
  (range-contains-insensitive "FA" "Fa" "foo") => true
  (range-contains-insensitive "b" "z" "a") => false
  (range-contains-insensitive "B" "z" "a") => true)
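Answer 0 (score: 1)
I think that instead of creating every case combination, this can be solved by checking the uppercase and then the lowercase form of each character separately, which changes 2^N to 2N.
Sound good? The idea is implemented in the C# code below (which could probably be written more concisely):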
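public class Bracket
{
    public readonly string Low;
    public readonly string High;

    public Bracket(string l, string u)
    {
        Low = l;
        High = u;
    }

    public bool IsMatch(string s)
    {
        string su = s.ToUpper();
        string sl = s.ToLower();
        // state[lo, hi] is true when some casing of the prefix read so far is still viable.
        // lo == 1: that prefix equals Low's prefix, so Low still constrains the next character.
        // hi == 1: the same for High. The flags are tracked as pairs, never merged.
        bool[,] state = new bool[2, 2];
        state[1, 1] = true;
        for (int i = 0; i < s.Length; i++)
        {
            char[] c = new char[] { su[i], sl[i] };
            bool[,] next = new bool[2, 2];
            bool possible = false;
            for (int lo = 0; lo < 2; lo++)
            {
                for (int hi = 0; hi < 2; hi++)
                {
                    if (!state[lo, hi])
                        continue;
                    for (int j = 0; j < 2; j++)
                    {
                        // Tied to Low: the character must not sort below Low[i].
                        if (lo == 1 && i < Low.Length && c[j] < Low[i])
                            continue;
                        // Tied to High: the character must not sort above High[i],
                        // and a candidate that extends High sorts after it.
                        if (hi == 1 && (i >= High.Length || c[j] > High[i]))
                            continue;
                        int nlo = (lo == 1 && i < Low.Length && c[j] == Low[i]) ? 1 : 0;
                        int nhi = (hi == 1 && c[j] == High[i]) ? 1 : 0;
                        next[nlo, nhi] = true;
                        possible = true;
                    }
                }
            }
            if (!possible)
                return false;
            state = next;
        }
        // A candidate still tied to Low is >= Low only if Low has no characters left.
        // A candidate still tied to High is a prefix of High and therefore <= High.
        bool lowOk = Low.Length <= s.Length;
        return state[0, 0] || state[0, 1] || (lowOk && (state[1, 0] || state[1, 1]));
    }
}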
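One detail worth making explicit in a per-character walk like this is that the two "still tied to a bound" flags have to be tracked as pairs, not merged across the uppercase and lowercase choices. Otherwise a range such as ("Az", "aA") would wrongly accept "ab": "A" stays tied to the lower bound, "a" stays tied to the upper bound, and neither can be continued. A minimal Java sketch under that assumption follows; the class and method names are illustrative, not taken from the answer, and ASCII names are assumed.

```java
final class CaseRangeMatch {
    // Returns true if some upper/lower-case spelling of s lies in the closed range [low, high].
    static boolean matches(String low, String high, String s) {
        // state[lo][hi]: some spelling of the prefix is still viable, where lo == 1 means
        // it equals low's prefix so far and hi == 1 means it equals high's prefix so far.
        boolean[][] state = new boolean[2][2];
        state[1][1] = true;
        for (int i = 0; i < s.length(); i++) {
            char[] c = { Character.toUpperCase(s.charAt(i)), Character.toLowerCase(s.charAt(i)) };
            boolean[][] next = new boolean[2][2];
            boolean possible = false;
            for (int lo = 0; lo < 2; lo++) {
                for (int hi = 0; hi < 2; hi++) {
                    if (!state[lo][hi]) continue;
                    for (char ch : c) {
                        // Tied to low: must not sort below low's character at i.
                        if (lo == 1 && i < low.length() && ch < low.charAt(i)) continue;
                        // Tied to high: must not sort above high's character, nor extend high.
                        if (hi == 1 && (i >= high.length() || ch > high.charAt(i))) continue;
                        int nlo = (lo == 1 && i < low.length() && ch == low.charAt(i)) ? 1 : 0;
                        int nhi = (hi == 1 && ch == high.charAt(i)) ? 1 : 0;
                        next[nlo][nhi] = true;
                        possible = true;
                    }
                }
            }
            if (!possible) return false;
            state = next;
        }
        // Still tied to low at the end: acceptable only if low has no characters left.
        boolean lowOk = low.length() <= s.length();
        return state[0][0] || state[0][1] || (lowOk && (state[1][0] || state[1][1]));
    }
}
```

Only four states exist per position and each is extended by two characters, so the walk stays linear in the length of s.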
Answer 1 (score: 0)
In the spirit of full disclosure, I suppose I should also add the algorithm I came up with (Java, using Guava):
import com.google.common.base.Strings;
import com.google.common.collect.Range;

public static boolean inRange(String search, String first, String last) {
    int len = search.length();
    if (len == 0) {
        // The empty candidate is always <= last, but it is >= first only if first is empty too
        return first.isEmpty();
    }
    char low = Strings.padEnd(first, len, (char) 0).charAt(0);
    char high = Strings.padEnd(last, len, (char) 0).charAt(0);
    char capital = Character.toUpperCase(search.charAt(0));
    char small = Character.toLowerCase(search.charAt(0));
    if (low == high) {
        if (capital == low || small == low) {
            // First letters equal - remove first letter and restart
            return inRange(search.substring(1), first.substring(1), last.substring(1));
        }
        return false;
    }
    if (containsAny(Range.open(low, high), capital, small)) {
        return true; // Definitely inside
    }
    if (!containsAny(Range.closed(low, high), capital, small)) {
        return false; // Definitely outside
    }
    // Edge case - we are on a bound and the bounds are different.
    // Both bounds can be hit at once (e.g. low = 'A', high = 'a'), so try each side.
    String rest = search.substring(1);
    boolean onLow = capital == low || small == low;
    boolean onHigh = capital == high || small == high;
    // On the low bound, the largest (all-lowercase) rest must reach the rest of first;
    // on the high bound, the smallest (all-uppercase) rest must not exceed the rest of last.
    return (onLow && Range.atLeast(first.substring(1)).contains(rest.toLowerCase()))
        || (onHigh && Range.atMost(last.substring(1)).contains(rest.toUpperCase()));
}

private static <T extends Comparable<T>> boolean containsAny(Range<T> range, T value1, T value2) {
    return range.contains(value1) || range.contains(value2);
}