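I am wondering how to split each string and get the word count. But I keep getting the error 'Split': is not a member of 'System::Array' on the third line, the one that calls Split on piece.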
String^ originalString = textBox1->Text;//original text string
cli::array<String^>^ piece= originalString->Split('.');//text is being split into sentences
cli::array<String^>^ sentence = piece->Split(' ');// text is being split into words, also I get error here
for (int i = 0; i < sentence->Length; ++i) {
datagridview1->Rows[i]->Cells[2]->Value = i;}
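Answer 0 (score: 2)
The error occurs because piece is an array of strings, and Split is a member of String, not of System::Array, so it has to be called on each element of the array. You can first get the sentences, which are groups of words separated by '.', and then get the words of each sentence, separated by whitespace.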
using namespace System;
using namespace System::Collections::Generic;
using namespace System::Diagnostics;
String^ originalString = "This is a chord. This is another. This is a third. Now form a band.";
// This array contains the sentences, which are separated by '.'
array<String^>^ sentences = originalString->Split(
    gcnew array<String^> { "." },
    StringSplitOptions::RemoveEmptyEntries);
Debug::Assert(sentences->Length == 4);
// This list contains individual words for all sentences.
List<String^>^ words = gcnew List<String^>();
for each (String^ sentence in sentences) {
    words->AddRange(sentence->Split(
        gcnew array<String^> { " " },
        StringSplitOptions::RemoveEmptyEntries));
}
Debug::Assert(words->Count == 15);
for each (String^ word in words) {
    Console::WriteLine(word);
}
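The snippet above flattens all words into one list, while the question asks for a count per sentence. A minimal sketch of that variation follows. It is written in portable standard C++ (std::string rather than System::String) so it compiles without /clr, and the helper names splitNonEmpty and wordsPerSentence are hypothetical, not part of the answer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split `text` on any character in `delims` and drop
// empty pieces, comparable to StringSplitOptions::RemoveEmptyEntries.
std::vector<std::string> splitNonEmpty(const std::string& text,
                                       const std::string& delims) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (start < text.size()) {
        std::size_t end = text.find_first_of(delims, start);
        if (end == std::string::npos) end = text.size();
        if (end > start) parts.push_back(text.substr(start, end - start));
        start = end + 1;  // skip the delimiter itself
    }
    return parts;
}

// Hypothetical helper: one word count per sentence, in sentence order.
// Sentences that contain only spaces are skipped (an assumption of this sketch).
std::vector<std::size_t> wordsPerSentence(const std::string& text) {
    std::vector<std::size_t> counts;
    for (const std::string& sentence : splitNonEmpty(text, ".")) {
        std::size_t n = splitNonEmpty(sentence, " ").size();
        if (n > 0) counts.push_back(n);
    }
    return counts;
}
```

In C++/CLI the same shape would be a loop over sentences that stores sentence->Split(...)->Length for each element instead of calling AddRange.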
But if the individual words are all you are interested in, you can use LINQ to get them in a single expression:
using namespace System;
using namespace System::Collections::Generic;
using namespace System::Diagnostics;
using namespace System::Linq;
// Removes the sentence-ending dot that stays attached to the last word.
System::String^ StripDot(System::String^ input) {
    return input->Replace(".", "");
}
void Test()
{
    String^ originalString = "This is a chord. This is another. This is a third. Now form a band.";
    IEnumerable<String^>^ words = Enumerable::Select<String^,String^>(
        originalString->Split(
            gcnew array<String^> { " " },
            StringSplitOptions::RemoveEmptyEntries),
        gcnew Func<String^,String^>(StripDot));
    Debug::Assert(Enumerable::Count(words) == 15);
    for each (String^ word in words) {
        Console::WriteLine(word);
    }
}
Answer 1 (score: 1)
I think the simplest thing you can do is use Regex:
String^ text = "This is a chord. This is another. This is a third. Now form a band.";
int wordCount = Regex::Matches(text, "\\w+")->Count; // = 15
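where \w stands for "word character". It always matches the ASCII characters [A-Za-z0-9_]; note that the underscore and the digits are included. In .NET it also matches Unicode letters and digits unless RegexOptions::ECMAScript is specified.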
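To make the consequences of that definition concrete, here is a small sketch in portable standard C++ using std::regex rather than the .NET Regex class; the function name countWords is hypothetical. With the default "C" locale, \w in std::regex covers letters, digits, and the underscore, so the behavior for the ASCII inputs below should mirror the .NET call.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical helper: count maximal runs of word characters (\w+) in `text`.
std::size_t countWords(const std::string& text) {
    static const std::regex word(R"(\w+)");
    return static_cast<std::size_t>(std::distance(
        std::sregex_iterator(text.begin(), text.end(), word),
        std::sregex_iterator()));
}
```

Because the underscore and digits are word characters, "snake_case 42" counts as two words, whereas an apostrophe is not, so "don't" is split into "don" and "t" and counts as two.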
Update, in response to:
"But I need the number of words in each sentence"
In that case, this should work for you:
using namespace System;
using namespace System::Collections::Generic;
using namespace System::Diagnostics;
using namespace System::Linq;
using namespace System::Text::RegularExpressions;
static int CountWords(String^ text)
{
    return Regex::Matches(text, "\\w+")->Count;
}
int main(array<System::String ^> ^args)
{
    String^ text = "This is a chord. This is another. This is a third. Now form a band.";
    // split sentences
    IEnumerable<String^>^ sentences = Regex::Split(text, "[.!?](?!$)");
    List<int>^ wordCounts = Enumerable::ToList(
        // count words for each sentence
        Enumerable::Select<String^, int>(sentences, gcnew Func<String^, int>(&CountWords)));
}
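where:
[.!?] matches any one of these three sentence terminators and thereby splits the text into sentences.
(?!$) is a negative lookahead (?!...). It ensures that the terminator of the last sentence, which sits at the end of the text ($), is not matched; matching it would produce an empty string as the final element.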
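The effect of the lookahead can be checked with a small sketch in portable standard C++. It uses std::regex, whose default ECMAScript grammar also supports (?!...), in place of the .NET Regex class. The helper names splitSentences and wordCountsPerSentence are hypothetical.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: split on a sentence terminator that is not the
// last character of the text.
std::vector<std::string> splitSentences(const std::string& text) {
    static const std::regex terminator(R"([.!?](?!$))");
    // Submatch index -1 selects the text between the matches.
    std::sregex_token_iterator first(text.begin(), text.end(), terminator, -1);
    std::sregex_token_iterator last;
    return std::vector<std::string>(first, last);
}

// Hypothetical helper: count \w+ runs in every sentence, in order.
std::vector<std::size_t> wordCountsPerSentence(const std::string& text) {
    static const std::regex word(R"(\w+)");
    std::vector<std::size_t> counts;
    for (const std::string& s : splitSentences(text)) {
        counts.push_back(static_cast<std::size_t>(std::distance(
            std::sregex_iterator(s.begin(), s.end(), word),
            std::sregex_iterator())));
    }
    return counts;
}
```

Since the final terminator is never matched, the last piece keeps its trailing dot (" Now form a band."), which does not affect the count because "." is not a word character.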