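I read a nice article about checking whether one string is a substring of another.
The exercise reads:
Write a program that takes 2 string arguments from the command line. The program must verify whether the second string is a substring of the first string (you cannot use substr, substring, or any other standard function, including regular expression libraries).
Each occurrence of * in the second string means that it can match zero or more characters of the first string.
Consider the example: input string 1: abcd, input string 2: a*c. The program should evaluate that string 2 is a substring of string 1.
An asterisk (*) may be treated as a regular character if it is preceded by a backslash (\). A backslash (\) is treated as a regular character in all cases except when it precedes an asterisk (*).
I wrote a simple application that first checks that the second string is not longer than the first (there is one problem, though: when tested with ("ab", "a*b"), which is a valid match, the method fails):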
public static boolean checkCharactersCount(String firstString, String secondString) {
// Known limitation: rejects ("ab", "a*b"), because the '*' is counted as a character.
return (firstString.length() > 0 && secondString.length() > 0) &&
(firstString.length() >= secondString.length());
}
...and then the next check is the substring test:
public static boolean checkSubstring(String firstString, String secondString) {
int correctCharCounter = 0;
int lastCorrectCharAtIndex = -1;
for (int i = 0; i < secondString.length(); i++) {
for (int j = 0; j < firstString.length(); j++) {
if (j > lastCorrectCharAtIndex) {
if ((secondString.charAt(i) == firstString.charAt(j)) || secondString.charAt(i) == '*') {
correctCharCounter++;
lastCorrectCharAtIndex = j;
}
if (correctCharCounter >= secondString.length())
return true;
}
}
}
return false;
}
But there are two problems:
What do you think of this solution? :)
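The length check in the question fails for ("ab", "a*b") because it counts the wildcard as a character. A minimal sketch of one possible alternative follows: count only the characters the second string must consume (an unescaped * needs none, and \* needs exactly one). The class and method names here are hypothetical and do not come from the question.

```java
public final class LengthPrecheck {

    // Smallest number of characters of the first string that the pattern can match.
    public static int minimumMatchLength(String pattern) {
        int count = 0;
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '\\' && i + 1 < pattern.length() && pattern.charAt(i + 1) == '*') {
                count++; // "\*" stands for one literal asterisk
                i++;     // skip the escaped asterisk
            } else if (c != '*') {
                count++; // regular character (including a lone backslash)
            }
            // an unescaped '*' may match zero characters, so it adds nothing
        }
        return count;
    }

    // Necessary (not sufficient) condition for the second string to match inside the first.
    public static boolean couldContain(String first, String second) {
        return minimumMatchLength(second) <= first.length();
    }
}
```

With this check, couldContain("ab", "a*b") passes, while a pattern that needs more literal characters than the first string holds is still rejected early.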
Answer 0 (score: 3)
Try the following approach (comments added for explanation):
// only for non empty Strings
public boolean isSubString(String string1,String string2)
{
// step 1: split by *, but not by \*
List<String>list1 = new ArrayList<String>();
char[]cs = string2.toCharArray();
int lastIndex = 0 ;
char lastChar = 0 ;
int i = 0 ;
for(; i < cs.length ; ++i)
{
if(cs[i]=='*' && lastChar!='\\')
{
list1.add(new String(cs,lastIndex,i-lastIndex).replace("\\*", "*"));
//earlier buggy line:
//list1.add(new String(cs,lastIndex,i-lastIndex));
lastIndex = i + 1 ;
}
lastChar = cs[i];
}
if(lastIndex < i )
{
list1.add(new String(cs,lastIndex,i-lastIndex).replace("\\*", "*"));
}
// step 2: check indices of each string in the list
// Note: all indices should be in proper order.
lastIndex = 0;
for(String str : list1)
{
int newIndex = string1.indexOf(str,lastIndex);
if(newIndex < 0)
{
return false;
}
lastIndex = newIndex+str.length();
}
return true;
}
If you are not allowed to use String.indexOf(), write a function public int indexOf(String string1, String string2, int index2)
and replace this statement
int newIndex = string1.indexOf(str,lastIndex);
with the following statement:
int newIndex = indexOf(string1, str,lastIndex);
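The answer leaves the replacement indexOf unwritten. A minimal sketch of what it could look like, using only length() and charAt(), is shown here. The class name ManualIndexOf is hypothetical, and the method is made static for convenience; the parameter names follow the signature suggested above.

```java
public final class ManualIndexOf {

    // Returns the first index >= index2 at which string2 occurs in string1, or -1 if none.
    public static int indexOf(String string1, String string2, int index2) {
        int start = Math.max(index2, 0);
        // Last position at which string2 could still fit inside string1.
        int lastStart = string1.length() - string2.length();
        for (int i = start; i <= lastStart; i++) {
            int j = 0;
            // Compare character by character until a mismatch or the end of string2.
            while (j < string2.length() && string1.charAt(i + j) == string2.charAt(j)) {
                j++;
            }
            if (j == string2.length()) {
                return i; // every character of string2 matched
            }
        }
        return -1;
    }
}
```

An empty string2 is found at the start position, which is what the splitting step relies on when the pattern begins with *. This is a plain quadratic scan, which should be adequate for command-line sized inputs.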
=========================================================
Appendix A: the code I tested:
package jdk.conf;
import java.util.ArrayList;
import java.util.List;
public class Test01 {
public static void main(String[] args)
{
Test01 test01 = new Test01();
System.out.println(test01.isSubString("abcd", "a*c"));
System.out.println(test01.isSubString("abcd", "bcd"));
System.out.println(test01.isSubString("abcd", "*b"));
System.out.println(test01.isSubString("abcd", "ac"));
System.out.println(test01.isSubString("abcd", "bd"));
System.out.println(test01.isSubString("abcd", "b*d"));
System.out.println(test01.isSubString("abcd", "b\\*d"));
System.out.println(test01.isSubString("abcd", "\\*d"));
System.out.println(test01.isSubString("abcd", "b\\*"));
System.out.println(test01.isSubString("a*cd", "\\*b"));
System.out.println(test01.isSubString("", "b\\*"));
System.out.println(test01.isSubString("abcd", ""));
System.out.println(test01.isSubString("a*bd", "\\*b"));
}
// only for non empty Strings
public boolean isSubString(String string1,String string2)
{
// step 1: split by *, but not by \*
List<String>list1 = new ArrayList<String>();
char[]cs = string2.toCharArray();
int lastIndex = 0 ;
char lastChar = 0 ;
int i = 0 ;
for(; i < cs.length ; ++i)
{
if(cs[i]=='*' && lastChar!='\\')
{
list1.add(new String(cs,lastIndex,i-lastIndex).replace("\\*", "*"));
lastIndex = i + 1 ;
}
lastChar = cs[i];
}
if(lastIndex < i )
{
list1.add(new String(cs,lastIndex,i-lastIndex).replace("\\*", "*"));
}
// step 2: check indices of each string in the list
// Note: all indices should be in proper order.
lastIndex = 0;
for(String str : list1)
{
int newIndex = string1.indexOf(str,lastIndex);
if(newIndex < 0)
{
return false;
}
lastIndex = newIndex+str.length();
}
return true;
}
}
Output:
true
true
true
false
false
true
false
false
false
false
false
true
true
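Answer 1 (score: 1)
I would do this in two stages.
Let's call the potential substring p, and the string we are testing for containing it s.
Reduce the "contains" part to a series of questions of the form "does p match starting at the Nth position of s?" Obviously you step through s from the first position onward to see whether p matches at any position.
While matching, we may run into a *. In that case we want to know whether the part of p after the * is a substring of the part of s that remains after the part of p before the * has been matched. This suggests a recursive routine that takes the part to be matched and the string to match against, and returns true/false. When you encounter a *, form the two new strings and call yourself.
If you encounter a \ instead, you just continue regular matching with the next character, without making the recursive call. Given that you need to do this, I suppose it might be easiest to build a pPrime from the original p, so that you can remove the backslashes as you encounter them, just as you remove the asterisks during wildcard matching.
I haven't actually written any code; you only asked for an approach.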
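The recursive idea described above can be sketched as follows. This is one possible reading of the approach, not the answerer's own code: it passes indices instead of building new strings or a pPrime, and the class and method names are hypothetical.

```java
public final class WildcardSubstring {

    // Stage 1: ask "does p match starting at position N of s?" for every N.
    public static boolean isSubstring(String s, String p) {
        for (int start = 0; start <= s.length(); start++) {
            if (matchesAt(s, start, p, 0)) {
                return true;
            }
        }
        return false;
    }

    // Stage 2: does p, from index pi onward, match s starting exactly at index si?
    private static boolean matchesAt(String s, int si, String p, int pi) {
        if (pi == p.length()) {
            return true; // the whole pattern has been consumed
        }
        char c = p.charAt(pi);
        if (c == '*') {
            // Wildcard: let it absorb zero or more characters, then recurse on the rest of p.
            for (int skip = si; skip <= s.length(); skip++) {
                if (matchesAt(s, skip, p, pi + 1)) {
                    return true;
                }
            }
            return false;
        }
        int next = pi + 1;
        if (c == '\\' && next < p.length() && p.charAt(next) == '*') {
            c = '*'; // "\*" is a literal asterisk
            next++;  // skip the escaped asterisk in the pattern
        }
        // Regular character: must equal the current character of s.
        return si < s.length() && s.charAt(si) == c && matchesAt(s, si + 1, p, next);
    }
}
```

For example, isSubstring("ab", "a*b") and isSubstring("a*bd", "\\*b") both return true, while isSubstring("abcd", "b\\*d") returns false. The backtracking over * can become slow for patterns with many wildcards, which seems acceptable for an exercise of this size.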
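Answer 2 (score: 1)
I found this to be a nice challenge. This kind of exercise really forces us to think at a low level about the language and about algorithms in general. No lambdas, no streams, no regex, no find, no substring, nothing. Just plain old charAt and a few loops.
Essentially, I wrote a lookup method that finds the first character of the string being searched for, and then a second lookup that applies your rules from that point onward. If that fails, it goes back to the first index found, adds one, and repeats for as many iterations as necessary until the end of the string. If no match is found, it should return false. If a single match is found, that is enough to consider it a substring.
The most important corner cases are handled at the start of the computation, so that once a failure is certain the method goes no further. So a lone * matches any characters, and we can escape it with \. I tried to include most corner cases, and this really was a challenge. I'm not sure my code covers every case, but it should cover quite a few. I really wanted to help, so this is my approach, and here is my code: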
package com.jesperancinha.string;
public class StringExercise {
private static final char ASTERISK = '*';
private static final char BACKSLASH = '\\';
public boolean checkIsSubString(String mainString, String checkString) {
int nextIndex = getNextIndex(0, checkString.charAt(0), mainString);
if (nextIndex == -1) {
return false;
}
boolean result = checkFromIndex(nextIndex, mainString, checkString);
while (nextIndex < mainString.length() - 1 && nextIndex > -1) {
if (!result) {
nextIndex = getNextIndex(nextIndex + 1, checkString.charAt(0), mainString);
if (nextIndex > -1) {
result = checkFromIndex(nextIndex, mainString, checkString);
}
} else {
return result;
}
}
return result;
}
private int getNextIndex(int start, char charAt, String mainString) {
if (charAt == ASTERISK || charAt == BACKSLASH) {
return start;
}
for (int i = start; i < mainString.length(); i++) {
if (mainString.charAt(i) == charAt) {
return i;
}
}
return -1;
}
private boolean checkFromIndex(int nextIndex, String mainString, String checkString) {
for (int i = 0, j = 0; i < checkString.length(); i++, j++) {
if (i < (checkString.length() - 2) && checkString.charAt(i) == BACKSLASH
&& checkString.charAt(i + 1) == ASTERISK) {
i++;
if (mainString.charAt(j + nextIndex) == BACKSLASH) {
j++;
}
if (checkString.charAt(i) != mainString.charAt(j + nextIndex)) {
return false;
}
}
if (i > 0 && checkString.charAt(i - 1) != BACKSLASH
&& checkString.charAt(i) == ASTERISK) {
if (i < checkString.length() - 1 && (j + nextIndex) < (mainString.length() - 1)
&& checkString.charAt(i + 1) !=
mainString.charAt(j + nextIndex + 1)) {
i--;
} else {
if (j + nextIndex == mainString.length() - 1
&& checkString.charAt(checkString.length() - 1) != ASTERISK
&& checkString.charAt(checkString.length() - 2) != BACKSLASH) {
return false;
}
}
} else {
if ((j + nextIndex) < (mainString.length() - 2) &&
mainString.charAt(j + nextIndex)
!= checkString.charAt(i)) {
return false;
}
}
}
return true;
}
}
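I wrote a set of unit tests, but posting the whole class here would be too long; the only thing I want to show you is the test cases they implement. Here is a condensed version of my unit tests for this case: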
package com.jesperancinha.string;
import static org.assertj.core.api.Assertions.assertThat;
import org.junit.jupiter.api.Test;
class StringExerciseMegaTest {
@Test
void checkIsSubString() {
StringExercise stringExercise = new StringExercise();
boolean test = stringExercise.checkIsSubString("abcd", "a*c");
assertThat(test).isTrue();
test = stringExercise.checkIsSubString("abcd", "a\\*c");
assertThat(test).isFalse();
test = stringExercise.checkIsSubString("a*c", "a\\*c");
assertThat(test).isTrue();
test = stringExercise.checkIsSubString("aasdsadasa*c", "a\\*c");
assertThat(test).isTrue();
test = stringExercise.checkIsSubString("aasdsadasa*csdfdsfdsfdsf", "a\\*c");
assertThat(test).isTrue();
test = stringExercise.checkIsSubString("aasdsadasa**csdfdsfdsfdsf", "a\\*c");
assertThat(test).isFalse();
test = stringExercise.checkIsSubString("aasdsadasa**csdfdsfdsfdsf", "a*c");
assertThat(test).isTrue();
test = stringExercise.checkIsSubString("aasdsadasa*csdfdsfdsfdsf", "a*c");
assertThat(test).isTrue();
test = stringExercise.checkIsSubString("aasdweriouiauoisdf9977675tyhfgh", "a*c");
assertThat(test).isFalse();
test = stringExercise.checkIsSubString("aasdweriouiauoisdf9977675tyhfgh", "dwer");
assertThat(test).isTrue();
test = stringExercise.checkIsSubString("aasdweriouiauoisdf9977675tyhfgh", "75tyhfgh");
assertThat(test).isTrue();
test = stringExercise.checkIsSubString("aasdweriou\\iauoisdf9977675tyhfgh", "riou\\iauois");
assertThat(test).isTrue();
test = stringExercise.checkIsSubString("aasdweriou\\*iauoisdf9977675tyhfgh", "riou\\\\*iauois");
assertThat(test).isTrue();
test = stringExercise.checkIsSubString("aasdweriou\\*iauoisdf9\\*977675tyhfgh", "\\\\*977675tyhfgh");
assertThat(test).isTrue();
test = stringExercise.checkIsSubString("aasdweriou\\*iauoisdf9\\*977675tyhfgh", "\\*977675tyhfgh");
assertThat(test).isTrue();
test = stringExercise.checkIsSubString("\\*aasdweriou\\*iauoisdf9\\*977675tyhfgh", "\\*aasdwer");
assertThat(test).isTrue();
test = stringExercise.checkIsSubString("*aasdweriou\\*iauoisdf9\\*977675tyhfgh", "*aasdwer");
assertThat(test).isTrue();
test = stringExercise.checkIsSubString("abcd", "bc");
assertThat(test).isTrue();
test = stringExercise.checkIsSubString("abcd", "zbc");
assertThat(test).isFalse();
test = stringExercise.checkIsSubString("abcd", "*bc*");
assertThat(test).isTrue();
test = stringExercise.checkIsSubString("*bcd", "\\*bc*");
assertThat(test).isTrue();
test = stringExercise.checkIsSubString("abcd", "a*c");
assertThat(test).isTrue();
test = stringExercise.checkIsSubString("abcd", "az*bc");
assertThat(test).isFalse();
}
}
Answer 3 (score: 0)
My solution is shown below. I commented everything, so I hope you can follow it.
public static void main(String [] args) throws Exception {
System.err.println(contains("bruderMusssLos".toCharArray(),"Mu*L*".toCharArray()));
}
public static boolean contains(char [] a, char [] b) {
int counterB = 0; // correct characters
char lastChar = '-'; //last Character encountered in B
for(int i = 0; i < a.length; i++) {
//if last character * it can be 0 to infinite characters
if(lastChar == '*') {
//if next characters in a is next in b reset last char
// this will be true as long the next a is not the next b
if(a[i] == b[counterB]) {
lastChar = b[counterB];
counterB++;
}else {
counterB++;
}
}else {
//if next char is * and lastchar is not \ count infinite to next hit
//otherwise * is normal character
if(b[counterB] == '*' && lastChar != '\\') {
lastChar = '*';
counterB++;
}else {
//if next a is next b count
if(a[i] == b[counterB]) {
lastChar = b[counterB];
counterB++;
}else {
//otherwise set counter to 0
counterB = 0;
}
}
}
//if counterB == length a contains b
if(counterB == b.length)
return true;
}
return false;
}
For example, the current test returns true: