Question

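I'm looping through a lot of strings like this one in C#:

"Look, good against remotes is one thing, good against the living, that’s something else."

In each of these strings I have a single selected word, determined by an index from a previous function, like the second "good" in the example above.

"Look, good (<- not this one) against remotes is one thing, good (<- this one) against the living, that’s something else."

I want to find the words surrounding the selected word. In the example above, those are "thing" and "against":

"Look, good against remotes is one thing, good against the living, that’s something else."

I've tried taking the string apart with .split() and with different regular-expression approaches, but I can't find a good way to achieve this. I have access to the word ("good" in the example above) and to the index where it is located in the string (41 above).

It would be a huge bonus if the solution took punctuation and commas into account, so that in the example above my theoretical function would only return "against", because there is a comma between "thing" and "good".

Is there a simple way to achieve this? Any help is appreciated.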

Answer 1

Including the "huge bonus":

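// Requires: using System.Text.RegularExpressions;
string text = "Look, good against remotes is one thing, good against the living, that’s something else.";
string word = "good";
int index = 41;

// Last word before the selection, allowing only whitespace in between.
string before = Regex.Match(text.Substring(0, index), @"(\w*)\s*$").Groups[1].Value;
// First word after the selection, allowing only whitespace in between.
string after = Regex.Match(text.Substring(index + word.Length), @"^\s*(\w*)").Groups[1].Value;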

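In this case, before will be an empty string because of the comma, and after will be "against".

Explanation: to get before, the first step is to grab the part of the string up to the target word; text.Substring(0, index) does this. Then we use the regular expression (\w*)\s*$ to match and capture a word (\w*) followed by any amount of whitespace (\s*) at the end of the string ($). The content of the first capture group is the word we want. If no word can be matched, the regex still matches, but it matches an empty string or only whitespace, and the first capture group contains an empty string.

The logic for getting after is almost the same, except that text.Substring(index + word.Length) is used to get the rest of the string after the target word. The regular expression ^\s*(\w*) is similar, except that it is anchored to the beginning of the string with ^, and the \s* comes before the \w* because the whitespace to strip is now in front of the word.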
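The two lookups described above can be wrapped in a single reusable method. This is a minimal sketch, not part of the original answer: the class name NeighborWords, the method name Find and the tuple return shape are hypothetical, and it assumes index points at the first character of the selected word.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NeighborWords
{
    // Hypothetical helper: returns the words directly before and after the
    // selection. A side is "" when punctuation (or nothing at all) sits
    // between the selection and the neighbouring word.
    public static (string Before, string After) Find(string text, int index, int length)
    {
        // Last run of word characters before the selection, followed only by whitespace.
        string before = Regex.Match(text.Substring(0, index), @"(\w*)\s*$").Groups[1].Value;

        // First run of word characters after the selection, preceded only by whitespace.
        string after = Regex.Match(text.Substring(index + length), @"^\s*(\w*)").Groups[1].Value;

        return (before, after);
    }
}
```

Calling NeighborWords.Find(text, 41, 4) on the example sentence should give an empty Before (the comma blocks "thing") and "against" for After.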

Answer 2

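// Requires: using System; using System.Linq; (for Last())
string phrase = "Look, good against remotes is one thing, good against the living, that’s something else.";
int selectedPosition = 41;
char[] ignoredSpecialChars = new char[2] { ',', '.' };
char[] space = new char[] { ' ' };

string afterWord = phrase.Substring(selectedPosition)
                         .Split(' ')[1]
                         .Trim(ignoredSpecialChars);
// RemoveEmptyEntries drops the empty piece left by the space just before the selected word;
// without it, Last() would return "" instead of "thing,".
string beforeWord = phrase.Substring(0, selectedPosition)
                          .Split(space, StringSplitOptions.RemoveEmptyEntries)
                          .Last()
                          .Trim(ignoredSpecialChars);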

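You can change the ignoredSpecialChars array to strip whichever special characters you don't want.

Update

This version returns null if there is a special character between your word and the word next to it.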

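// Requires: using System; using System.Linq; (for First() and Last())
string phrase = "Look, good against remotes is one thing, good against the living, that’s something else.";
int selectedPosition = 41;
char[] ignoredSpecialChars = new char[2] { ',', '.' };
char[] space = new char[] { ' ' };

// tail[0] is the selected word itself (with any punctuation attached), tail[1] the next word.
string[] tail = phrase.Substring(selectedPosition)
                      .Split(' ');
// Punctuation between the selected word and the next one is attached to tail[0].
string afterWord = Char.IsLetterOrDigit(tail[0].Last()) ?
                   tail[1].TrimEnd(ignoredSpecialChars) :
                   null;

// RemoveEmptyEntries drops the empty piece left by the space before the selected word,
// which would otherwise make Last() throw on an empty string below.
string beforeWord = phrase.Substring(0, selectedPosition)
                          .Split(space, StringSplitOptions.RemoveEmptyEntries)
                          .Last();
beforeWord = Char.IsLetterOrDigit(beforeWord.Last()) ?
             beforeWord.TrimStart(ignoredSpecialChars) :
             null;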

Answer 3

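I haven't tested it, but it should work. You can take the Substring before and after the word and then search it for the last or first " ". Then you know where the neighbouring words start and end.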

string text = "Look, good against remotes is one thing, good against the living, that’s something else.";
string word = "good";
int index = 41;

string before = text.Substring(0, index).Trim();            // Trim removes the " " right in front of the word
string after = text.Substring(index + word.Length).Trim();  // Trim removes the " " right after the word

int indexBefore = before.LastIndexOf(" ");   // -1 if there is only one word before
int indexAfter = after.IndexOf(" ");         // -1 if there is only one word after

string wordBefore = before.Substring(indexBefore + 1);                        // "thing,"
string wordAfter = indexAfter < 0 ? after : after.Substring(0, indexAfter);   // "against"

Edit

If you want to ignore punctuation and commas, just remove them from the string.
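One thing to watch when removing punctuation: deleting characters in front of the selected word shifts every later position, so an index such as 41 no longer points at the word afterwards. A minimal sketch that avoids this, under the assumption that index still refers to the untouched string, cleans the text before and after the selection separately. The class and method names are hypothetical, and char.IsPunctuation also drops apostrophes, so "that’s" would become "thats".

```csharp
using System;
using System.Linq;

public static class PunctuationBlindNeighbors
{
    // Drops every punctuation character, then splits the rest on whitespace.
    private static string[] Words(string s)
    {
        string cleaned = new string(s.Where(c => !char.IsPunctuation(c)).ToArray());
        return cleaned.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
    }

    // Hypothetical helper: neighbours of the selection with punctuation ignored.
    // A side is "" when there is no word on that side.
    public static (string Before, string After) Find(string text, int index, int length)
    {
        string[] left = Words(text.Substring(0, index));
        string[] right = Words(text.Substring(index + length));
        return (left.LastOrDefault() ?? "", right.FirstOrDefault() ?? "");
    }
}
```

For the example sentence, PunctuationBlindNeighbors.Find(text, 41, 4) should return "thing" and "against", because the comma after "thing" is simply dropped rather than treated as a barrier.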

Answer 4

You can split the string on the regular expression [^’a-zA-Z0-9]+ to get its words:

string[] words = Regex.Split(text, @"[^’a-zA-Z0-9]+");

How you implement the navigation is up to you. Store the index of the selected word and use it to get the next or previous word:

int index = Array.IndexOf(words, "living");
string next = null, previous = null;

if (index >= 0 && index < words.Length - 1)
    next = words[index + 1]; // that’s

if (index > 0)
    previous = words[index - 1]; // the
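The question supplies a character index (41) rather than a word index, and Array.IndexOf only finds the first occurrence of a repeated word such as "good". One possible way to bridge that gap, sketched here as an assumption rather than as part of the original answer, is to use Regex.Matches instead of Regex.Split, because each Match keeps its character offset. The class and method names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordIndexNavigation
{
    // Hypothetical helper: finds the word that starts at charIndex and returns
    // its neighbours; a side is null when no such neighbour exists.
    public static (string Previous, string Next) Find(string text, int charIndex)
    {
        // Same character class as the split pattern above, but matching the words themselves.
        Match[] words = Regex.Matches(text, @"[’a-zA-Z0-9]+").Cast<Match>().ToArray();

        // Locate the word whose first character sits at charIndex.
        int i = Array.FindIndex(words, m => m.Index == charIndex);
        if (i < 0)
            return (null, null); // no word starts at charIndex

        string previous = i > 0 ? words[i - 1].Value : null;
        string next = i < words.Length - 1 ? words[i + 1].Value : null;
        return (previous, next);
    }
}
```

With the example sentence, WordIndexNavigation.Find(text, 41) should return "thing" and "against". Like the split-based version, it treats punctuation purely as a separator.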

Answer 5

Here is a LINQPad program written in VB:

Sub Main
    dim input as string = "Look, good against remotes is one thing, good against the living, that’s something else."

    dim words as new list(of string)(input.split(" "c))

    dim index = getIndex(words)

    dim retVal = GetSurrounding(words, index, "good", 2)

    retVal.dump()
End Sub

function getIndex(words as list(of string)) as dictionary(of string, list(of integer))

    for i as integer = 0 to words.count- 1
            words(i) = getWord(words(i))
    next

    'words.dump()

    dim index as new dictionary(of string, List(of integer))(StringComparer.InvariantCultureIgnoreCase)
    for j as integer = 0 to words.count- 1
            dim word = words(j)
            if index.containsKey(word) then
                    index(word).add(j)
            else  
                    index.add(word, new list(of integer)({j}))
            end if
    next

    'index.dump()
    return index
end function

function getWord(candidate) as string
    dim pattern as string = "^[\w'’]+"
    dim match = Regex.Match(candidate, pattern)
    if match.success then
            return match.toString()
    else
            return candidate
    end if
end function 

function GetSurrounding(words, index, word, position) as tuple(of string, string)        

    if not index.containsKey(word) then
            return nothing
    end if

    dim indexEntry = index(word)
    if position > indexEntry.count
            'not enough appearances of word
            return nothing
    else
            dim left = ""
            dim right = ""
            dim positionInWordList = indexEntry(position -1)
            if PositionInWordList >0
                    left = words(PositionInWordList-1)
            end if
            if PositionInWordList < words.count -1
                    right = words(PositionInWordList +1)
            end if

            return new tuple(of string, string)(left, right)
    end if
end function

Answer 6

Without regular expressions, this can be done recursively using Array.IndexOf.

public class BeforeAndAfterWordFinder
{
    public string Input { get; private set; }
    private string[] words;

    public BeforeAndAfterWordFinder(string input)
    {
        Input = input;
        words = Input.Split(new string[] { ", ", " " }, StringSplitOptions.None);
    }

    public void Run(int occurance, string word)
    {
        int index = 0;
        OccuranceAfterWord(occurance, word, ref index);
        Print(index);            
    }

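    private void OccuranceAfterWord(int occurance, string word, ref int lastIndex, int thisOccurance = 0)
    {
        // Search from the start on the first call, otherwise just past the previous hit.
        // Testing thisOccurance (not lastIndex) keeps this correct when the first hit is at index 0.
        lastIndex = thisOccurance > 0 ? Array.IndexOf(words, word, lastIndex + 1) : Array.IndexOf(words, word);

        if (lastIndex != -1)
        {
            thisOccurance++;
            if (thisOccurance < occurance)
            {
                OccuranceAfterWord(occurance, word, ref lastIndex, thisOccurance);
            }
        }
    }

    private void Print(int index)
    {
        if (index < 0)
        {
            Console.WriteLine("Occurrence not found.");
            return;
        }

        // Guard both ends so the first or last word does not cause an index out of range.
        string before = index > 0 ? words[index - 1] : "";
        string after = index < words.Length - 1 ? words[index + 1] : "";
        Console.WriteLine("{0} : {1}", before, after);
    }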
}

Usage:

  string input = "Look, good against remotes is one thing, good against the living, that’s something else.";
  var F = new BeforeAndAfterWordFinder(input);
  F.Run(2, "good");

Answer 7

Create a string with the punctuation and commas removed (using Remove). In that string, search for the substring "thing good against", and so on if needed.

Select previous and next word in a string
