I think this is a logic problem. I'm coding in C#, but general pseudocode solutions are welcome.
I have a text file that contains, for example, this text:
blah "hello john"
blah 'the code is "flower" '
blah "good night"
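I want to loop over the double-quoted strings and do something with each of them, but I want to ignore double quotes that are enclosed in single quotes. This is how I get the positions of the opening and closing double quotes (string data contains the contents of the text file):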
// Start searching from the beginning
int quotestart = 0, quoteend = 0;
while (data.IndexOf('"', quotestart) != -1)
{
    // Get opening double quote
    quotestart = data.IndexOf('"', quotestart);
    // Get ending double quote
    quoteend = data.IndexOf('"', quotestart + 1);
    // Stop if the opening quote has no matching closing quote
    if (quoteend == -1) break;
    string sub = data.Substring(quotestart + 1, quoteend - quotestart - 1);
    Console.WriteLine(sub);
    // Set the start position for the next round
    quotestart = quoteend + 1;
}
With my code, the output is:
hello john
flower
good night
Because "flower" is inside single quotes, I would like my output to be:
hello john
good night
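Edit
I am currently working on an approach where I first overwrite all the data between single quotes with a filler character, for example 'A'. That way, when I loop over the double quotes, anything between single quotes is ignored. I'm not sure whether this is the right approach.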
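The padding idea from the edit can be sketched roughly as below. This is only an illustration under assumptions: the class name QuoteMasking, the method name MaskSingleQuoted and the filler parameter are hypothetical and are not part of the original code. The string length is preserved so that indices found in the masked copy still apply to the original data.

```csharp
using System.Text;

public static class QuoteMasking
{
    // Hypothetical helper: overwrites every character strictly between a
    // pair of single quotes with `filler`, keeping the length unchanged.
    public static string MaskSingleQuoted(string data, char filler = 'A')
    {
        var sb = new StringBuilder(data);
        int start = data.IndexOf('\'');
        while (start != -1)
        {
            int end = data.IndexOf('\'', start + 1);
            if (end == -1) break; // unmatched single quote: leave the rest alone
            for (int i = start + 1; i < end; i++)
                sb[i] = filler;
            // Continue with the next single-quoted span, if any
            start = data.IndexOf('\'', end + 1);
        }
        return sb.ToString();
    }
}
```

The double-quote loop could then search the masked copy while calling Substring on the original data. One caveat of this sketch: an apostrophe inside a double-quoted string (for example "don't") would be treated as the start of a single-quoted span, so the approach only holds when that cannot occur in the input.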
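Answer 0 (score: 7)
"I tried googling finite state machine, but without formal computer engineering training I must admit I'm a bit lost. Do you have any additional pointers?"
An FSM is one of the simplest forms of computer. The idea is that you have a finite amount of "state" information and a steady stream of inputs. Each input causes the state to change in a predictable way, based solely on the current state and the current input, and causes a predictable output to happen.
So, suppose your inputs are single characters and each output is either a single character or the "null" character. Here is an FSM that does what you want: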
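The states are OUTSIDE, INSIDEDOUBLE and INSIDESINGLE. The inputs are ", ' and x. (WOLOG let x stand for any other character.) We have three states and three inputs, so there are nine possible combinations.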
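If we are OUTSIDE and get x, stay OUTSIDE and emit null.
If we are OUTSIDE and get ", go to INSIDEDOUBLE and emit null.
If we are OUTSIDE and get ', go to INSIDESINGLE and emit null.
If we are INSIDEDOUBLE and get x, stay INSIDEDOUBLE and emit x.
If we are INSIDEDOUBLE and get ", go to OUTSIDE and emit null.
If we are INSIDEDOUBLE and get ', stay INSIDEDOUBLE and emit '.
If we are INSIDESINGLE and get x, stay INSIDESINGLE and emit null.
If we are INSIDESINGLE and get ", stay INSIDESINGLE and emit null.
If we are INSIDESINGLE and get ', go to OUTSIDE and emit null.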
The only thing left is to say that the start state is OUTSIDE.
So let's suppose the inputs are 1 " 2 " 3 ' 4 " 5 " ' 6. The states and outputs are:
OUTSIDE gets 1, emits null, stays OUTSIDE.
OUTSIDE gets ", emits null, goes to INSIDEDOUBLE.
INSIDEDOUBLE gets 2, emits 2, stays INSIDEDOUBLE.
INSIDEDOUBLE gets ", emits null, goes to OUTSIDE.
OUTSIDE gets 3, emits null, stays OUTSIDE.
OUTSIDE gets ', emits null, goes to INSIDESINGLE.
... fill in the rest yourself.
Is that enough of a sketch for you to write the code?
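For readers who want to check the rest of the trace mechanically, the nine rules above can be sketched as a single step function plus a small trace generator. This is a minimal sketch, not code from the thread: the names FsmTrace, Step and Trace are hypothetical, the sample input is assumed to be written without the separating spaces, and every step is reported as "goes to" even when the state does not change.

```csharp
using System.Collections.Generic;

public static class FsmTrace
{
    public enum State { OUTSIDE, INSIDEDOUBLE, INSIDESINGLE }

    // One step of the machine: returns the next state and sets `output`
    // to the emitted character, or to null when nothing is emitted.
    public static State Step(State state, char input, out char? output)
    {
        output = null;
        switch (state)
        {
            case State.OUTSIDE:
                if (input == '"') return State.INSIDEDOUBLE;
                if (input == '\'') return State.INSIDESINGLE;
                return State.OUTSIDE;
            case State.INSIDEDOUBLE:
                if (input == '"') return State.OUTSIDE;
                output = input; // any other character, including ', is emitted
                return State.INSIDEDOUBLE;
            default: // INSIDESINGLE
                if (input == '\'') return State.OUTSIDE;
                return State.INSIDESINGLE;
        }
    }

    // Produces one line of trace per input character, starting in OUTSIDE.
    public static IEnumerable<string> Trace(string inputs)
    {
        State state = State.OUTSIDE;
        foreach (char input in inputs)
        {
            State next = Step(state, input, out char? output);
            string emitted = output.HasValue ? output.Value.ToString() : "null";
            yield return $"{state} gets {input}, emits {emitted}, goes to {next}";
            state = next;
        }
    }
}
```

Writing each line of FsmTrace.Trace("1\"2\"3'4\"5\"'6") to the console reproduces the six steps listed above and then continues with the remaining six; the only character emitted over the whole input is 2.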
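Answer 1 (score: 5)
Nice solution; using a switch statement is the traditional way to do this for small FSMs, but it gets unwieldy when the number of states and inputs becomes large and complicated. Here is an alternative that scales more easily: a table-driven solution. That is, put the facts about the transitions and actions into arrays, and then the FSM is nothing more than a series of array lookups: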
// States
const int Outside = 0;
const int InDouble = 1;
const int InSingle = 2;

// Inputs
const int Other = 0;
const int DoubleQuote = 1;
const int SingleQuote = 2;

static readonly int[,] stateTransitions =
{   /*                 Other     DoubleQ   SingleQ  */
    /* Outside  */ { Outside,  InDouble, InSingle },
    /* InDouble */ { InDouble, Outside,  InDouble },
    /* InSingle */ { InSingle, InSingle, Outside  }
};

// Do we emit the character or ignore it?
static readonly bool[,] actions =
{   /*                 Other   DoubleQ  SingleQ */
    /* Outside  */ { false,  false,   false },
    /* InDouble */ { true,   false,   true  },
    /* InSingle */ { false,  false,   false }
};

static int Classify(char c)
{
    switch (c)
    {
        case '\'': return SingleQuote;
        case '\"': return DoubleQuote;
        default: return Other;
    }
}

static IEnumerable<char> FSM(IEnumerable<char> inputs)
{
    int state = Outside;
    foreach (char input in inputs)
    {
        int kind = Classify(input);
        if (actions[state, kind])
            yield return input;
        state = stateTransitions[state, kind];
    }
}
And now we can get the result with
string.Join("", FSM(@"1""2'3""4""5'6""7'8""9""A'B"))
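Answer 2 (score: 2)
Many thanks to Eric Lippert for the logic behind this solution. I'm providing my C# solution below in case anyone needs it. I have left in some unnecessary reassignments for the sake of clarity.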
string state = "outside";
for (int i = 0; i < data.Length; i++)
{
    char c = data[i];
    switch (state)
    {
        case "outside":
            switch (c)
            {
                case '\'':
                    state = "insidesingle";
                    break;
                case '"':
                    state = "insidedouble";
                    break;
                default:
                    state = "outside";
                    break;
            }
            break;
        case "insidedouble":
            switch (c)
            {
                case '\'':
                    state = "insidedouble";
                    Console.Write(c);
                    break;
                case '"':
                    state = "outside";
                    break;
                default:
                    state = "insidedouble";
                    Console.Write(c);
                    break;
            }
            break;
        case "insidesingle":
            switch (c)
            {
                case '\'':
                    state = "outside";
                    break;
                case '"':
                    state = "insidesingle";
                    break;
                default:
                    state = "insidesingle";
                    break;
            }
            break;
    }
}
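The loop above writes the emitted characters one after another, so consecutive strings run together in the output. If each double-quoted string is wanted as a separate item, as in the question's desired output, one possible variation is to buffer the characters and flush the buffer when the closing double quote is seen. This is a sketch under assumptions: the names QuoteExtractor and ExtractDoubleQuoted are hypothetical, an enum replaces the string state names, and text after an unterminated double quote is simply discarded.

```csharp
using System.Collections.Generic;
using System.Text;

public static class QuoteExtractor
{
    private enum State { Outside, InsideDouble, InsideSingle }

    // Returns the contents of every double-quoted string that is not
    // itself inside single quotes, one list entry per string.
    public static List<string> ExtractDoubleQuoted(string data)
    {
        var results = new List<string>();
        var current = new StringBuilder();
        State state = State.Outside;
        foreach (char c in data)
        {
            switch (state)
            {
                case State.Outside:
                    if (c == '"') state = State.InsideDouble;
                    else if (c == '\'') state = State.InsideSingle;
                    break;
                case State.InsideDouble:
                    if (c == '"')
                    {
                        // Closing quote: flush the buffered string
                        results.Add(current.ToString());
                        current.Clear();
                        state = State.Outside;
                    }
                    else
                    {
                        current.Append(c);
                    }
                    break;
                case State.InsideSingle:
                    if (c == '\'') state = State.Outside;
                    break;
            }
        }
        return results;
    }
}
```

Calling ExtractDoubleQuoted on the three sample lines from the question yields two entries, hello john and good night, which can then be written with Console.WriteLine one per line.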
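Answer 3 (score: 2)
Just for fun, I decided to solve this using a very lightweight FSM library called stateless.
Here is how the code would look if you were to use this library.
Just like Eric's solution, the code below can easily be changed to accommodate new requirements.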
void Main()
{
    Console.WriteLine(string.Join("", GetCharacters(@"1""2'3""4""5'6""7'8""9""A'B")));
}

public enum CharacterType
{
    Other,
    SingleQuote,
    DoubleQuote
}

public enum State
{
    OutsideQuote,
    InsideSingleQuote,
    InsideDoubleQuote
}

public IEnumerable<char> GetCharacters(string input)
{
    //Initial state of the machine is OutsideQuote.
    var sm = new StateMachine<State, CharacterType>(State.OutsideQuote);

    //Below, we configure state transitions.
    //For a given state, we configure how CharacterType
    //transitions state machine to a new state.
    sm.Configure(State.OutsideQuote)
        .Ignore(CharacterType.Other)
        //If you are outside quote and you receive a double quote,
        //state transitions to InsideDoubleQuote.
        .Permit(CharacterType.DoubleQuote, State.InsideDoubleQuote)
        //If you are outside quote and you receive a single quote,
        //state transitions to InsideSingleQuote.
        //Same logic applies for other state transitions below.
        .Permit(CharacterType.SingleQuote, State.InsideSingleQuote);

    sm.Configure(State.InsideDoubleQuote)
        .Ignore(CharacterType.Other)
        .Ignore(CharacterType.SingleQuote)
        .Permit(CharacterType.DoubleQuote, State.OutsideQuote);

    sm.Configure(State.InsideSingleQuote)
        .Ignore(CharacterType.Other)
        .Ignore(CharacterType.DoubleQuote)
        .Permit(CharacterType.SingleQuote, State.OutsideQuote);

    foreach (var character in input)
    {
        var characterType = GetCharacterType(character);
        sm.Fire(characterType);
        //Emit only while inside double quotes, skipping the opening quote itself.
        if (sm.IsInState(State.InsideDoubleQuote) && characterType != CharacterType.DoubleQuote)
            yield return character;
    }
}

public CharacterType GetCharacterType(char input)
{
    switch (input)
    {
        case '\'': return CharacterType.SingleQuote;
        case '\"': return CharacterType.DoubleQuote;
        default: return CharacterType.Other;
    }
}