How to find the length of each string, in words, in C++

Time: 2015-10-18 20:28:44

Tags: c++-cli

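I want to split a text into sentences and get the number of words in each one. But I keep getting the error "'Split': is not a member of 'System::Array'" on the third line, the one that calls Split on piece.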

String^ originalString = textBox1->Text;//original text string
cli::array<String^>^ piece= originalString->Split('.');//text is being split into sentences    
cli::array<String^>^ sentence = piece->Split(' ');// text is being split into words, also I get error here
for (int i = 0; i < sentence->Length; ++i) {
datagridview1->Rows[i]->Cells[2]->Value = i;}

2 Answers:

Answer 0 (score: 2)

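The error occurs because piece is an array of strings, and Split is a member of String, not of System::Array, so it has to be called on each element. You can first get the sentences, which are groups of words separated by ".". Then get the words of each sentence, separated by whitespace characters.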

using namespace System;
using namespace System::Collections::Generic;
using namespace System::Diagnostics;

String^ originalString = "This is a chord. This is another. This is a third. Now form a band.";

// This array contains the sentences, which are separated by '.'
array<String^>^ sentences = originalString->Split(
    gcnew array<String^> { "." },
    StringSplitOptions::RemoveEmptyEntries);

Debug::Assert(sentences->Length == 4);

// This list contains individual words for all sentences.
List<String^>^ words = gcnew List<String^>();
for each(String^ sentence in sentences) {
    words->AddRange(sentence->Split(
        gcnew array<String^> { " " },
        StringSplitOptions::RemoveEmptyEntries));
}

Debug::Assert(words->Count == 15);

for each(String^ word in words) {
    Console::WriteLine(word);
}

But if the only thing you are interested in is the individual words, you can use LINQ to get them in a single expression:

using namespace System;
using namespace System::Collections::Generic;
using namespace System::Diagnostics;
using namespace System::Linq;

System::String^ StripDot(System::String^ input) {
    return input->Replace(".", "");
}

void Test()
{
    String^ originalString = "This is a chord. This is another. This is a third. Now form a band.";

    IEnumerable<String^>^ words = Enumerable::Select<String^,String^>(
        originalString->Split(
            gcnew array<String^> { " " },
            StringSplitOptions::RemoveEmptyEntries),
        gcnew Func<String^,String^>(StripDot));

    Debug::Assert(Enumerable::Count(words) == 15);

    for each(String^ word in words) {
        Console::WriteLine(word);
    }
}

Answer 1 (score: 1)

I think the simplest thing you can do is use Regex:

String^ text = "This is a chord. This is another. This is a third. Now form a band.";

int wordCount = Regex::Matches(text, "\\w+")->Count; // = 15

where

\w stands for "word character". It always matches the ASCII characters [A-Za-z0-9_]. Notice the inclusion of the underscore and digits.

Shorthand Character Classes
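
To make the effect of \w+ concrete, here is a minimal sketch in portable standard C++ rather than C++/CLI, so it can be compiled without .NET. The helper name findWords is hypothetical, and the sketch assumes the default ECMAScript grammar of std::regex, where \w is [A-Za-z0-9_]. Note that .NET's Regex additionally treats Unicode letters and digits as word characters unless RegexOptions::ECMAScript is specified.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every maximal run of word characters (\w+)
// found in text, in order of appearance.
std::vector<std::string> findWords(const std::string& text) {
    static const std::regex word(R"(\w+)");
    std::vector<std::string> words;
    for (std::sregex_iterator it(text.begin(), text.end(), word), end;
         it != end; ++it) {
        words.push_back(it->str());
    }
    return words;
}
```

For the input "snake_case x2 is-one" this yields four words: the underscore and the digit stay inside their words, while the hyphen separates "is" from "one".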

Update, in reply to this comment:

"But I need the number of words in each sentence"

In that case, this should work for you:

using namespace System;
using namespace System::Collections::Generic;
using namespace System::Diagnostics;
using namespace System::Linq;
using namespace System::Text::RegularExpressions;

static int CountWords(String^ text) 
{
    return Regex::Matches(text, "\\w+")->Count;
}

int main(array<System::String ^> ^args)
{
    String^ text = "This is a chord. This is another. This is a third. Now form a band.";

    // split sentences
    IEnumerable<String^>^ sentences = Regex::Split(text, "[.!?](?!$)"); 
    List<int>^ wordCounts = Enumerable::ToList(
        // count words for each sentence
        Enumerable::Select<String^, int>(sentences, gcnew Func<String^, int>(&CountWords)));
}

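where:

  • [.!?] matches any one of these three sentence terminators, which is where the text is split into sentences
  • (?!$) is a negative lookahead (?!) which ensures that the terminator .!? of the last sentence, the one at the end of the text $, is not matched, because splitting there would produce an empty string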
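
The same split-then-count idea can be sketched in portable standard C++ (again not C++/CLI, so no .NET types are involved). The function names countWords and wordsPerSentence are hypothetical, and the sketch assumes that std::regex's default ECMAScript grammar handles the pattern [.!?](?!$) the same way as described above.

```cpp
#include <iterator>
#include <regex>
#include <string>
#include <vector>

// Counts the runs of word characters (\w+) in text.
static int countWords(const std::string& text) {
    static const std::regex word(R"(\w+)");
    return static_cast<int>(std::distance(
        std::sregex_iterator(text.begin(), text.end(), word),
        std::sregex_iterator()));
}

// Splits text at '.', '!' or '?' (but not at a terminator that ends the
// whole text) and returns the word count of each resulting sentence.
static std::vector<int> wordsPerSentence(const std::string& text) {
    static const std::regex terminator(R"([.!?](?!$))");
    std::vector<int> counts;
    // Submatch index -1 iterates over the pieces between the matches.
    std::sregex_token_iterator it(text.begin(), text.end(), terminator, -1);
    std::sregex_token_iterator end;
    for (; it != end; ++it) {
        counts.push_back(countWords(it->str()));
    }
    return counts;
}
```

Calling wordsPerSentence on the sample text from this answer gives the counts 4, 3, 4, 4, which add up to the 15 words found earlier. The last piece still carries its trailing ".", but that does not affect the count because "." is not a word character.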