I have two separate strings:
string s1 = "Hello welcome to the world of C sharp";
String s2 = "Hello world welcome to the world of C";
Now I want to get the words that are unique across the two strings, for example {sharp}.
In the same program, I also want to find the words the two strings have in common, for example {Hello, welcome, to, the, world, of, C}.
I'm stuck. Can anyone help?
Answer 0 (score: 4)
In C#, you can use:
string[] words1 = s1.Split(" ", StringSplitOptions.RemoveEmptyEntries);
string[] words2 = s2.Split(" ", StringSplitOptions.RemoveEmptyEntries);
// Retrieve words that only exist in one list
var unique = words1.Except(words2).Concat(words2.Except(words1));
// Retrieve all "similar words" - words that exist in both lists
var matches = words1.Intersect(words2);
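For reference, here is a minimal, self-contained sketch of the same approach applied to the strings from the question. The class name WordSets and its helper methods are made up for illustration, and the char-array Split overload is used only because it also compiles on older .NET Framework versions. The comparison is case-sensitive, so "Hello" and "hello" would count as different words.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordSets
{
    // Splits on spaces and drops the empty entries that repeated spaces would produce.
    public static string[] Words(string text) =>
        text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

    // Words that occur in exactly one of the two strings.
    public static List<string> InOnlyOne(string a, string b)
    {
        string[] wa = Words(a), wb = Words(b);
        return wa.Except(wb).Concat(wb.Except(wa)).ToList();
    }

    // Words that occur in both strings.
    public static List<string> InBoth(string a, string b) =>
        Words(a).Intersect(Words(b)).ToList();

    public static void Main()
    {
        string s1 = "Hello welcome to the world of C sharp";
        string s2 = "Hello world welcome to the world of C";

        Console.WriteLine(string.Join(", ", InOnlyOne(s1, s2))); // sharp
        Console.WriteLine(string.Join(", ", InBoth(s1, s2)));    // Hello, welcome, to, the, world, of, C
    }
}
```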
Answer 1 (score: 1)
I suggest using Split() and Except():
string s1 = "Hello welcome to the world of C sharp";
string s2 = "Hello world welcome to the world of C";
var s1Words = s1.Split(' ', StringSplitOptions.RemoveEmptyEntries);
var s2Words = s2.Split(' ', StringSplitOptions.RemoveEmptyEntries);
var s1Only = s1Words.Except(s2Words);
var s2Only = s2Words.Except(s1Words);
Console.WriteLine("The unique words in S1 are: " + string.Join(",", s1Only));
Console.WriteLine("The unique words in S2 are: " + string.Join(",", s2Only));
If you need them in a single list, you can use Concat():
var allUniqueWords = s1Only.Concat(s2Only);
You can also use Intersect() to find the words the two strings share:
var sameWords = s1Words.Intersect(s2Words);
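The set operations in LINQ are well suited to this kind of task. There is also Union(), which gives you a distinct list of all the words from both strings, for example: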
var allWords = s1Words.Union(s2Words);
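If the comparison should ignore case, one option is to run the same set operations on HashSet<string> with a case-insensitive comparer. The following is only a sketch under that assumption, not part of the answer above; it uses the mutating HashSet methods instead of the LINQ operators, and the class and method names are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class WordSetsIgnoreCase
{
    // Builds a case-insensitive set of the words in the text.
    private static HashSet<string> ToSet(string text) =>
        new HashSet<string>(
            text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries),
            StringComparer.OrdinalIgnoreCase);

    // Words present in exactly one string (symmetric difference).
    public static HashSet<string> InOnlyOne(string a, string b)
    {
        HashSet<string> result = ToSet(a);
        result.SymmetricExceptWith(ToSet(b));
        return result;
    }

    // Words present in both strings.
    public static HashSet<string> InBoth(string a, string b)
    {
        HashSet<string> result = ToSet(a);
        result.IntersectWith(ToSet(b));
        return result;
    }

    public static void Main()
    {
        string s1 = "Hello welcome to the world of C sharp";
        string s2 = "hello World welcome to the world of c";

        // HashSet does not guarantee an enumeration order, so only the contents matter here.
        Console.WriteLine(string.Join(", ", InOnlyOne(s1, s2)));
        Console.WriteLine(string.Join(", ", InBoth(s1, s2)));
    }
}
```

With the default (case-sensitive) comparer, "hello" and "Hello" would instead be reported as two different words.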
Answer 2 (score: 0)
Honestly, I'm not sure what you're after, but here are a couple of possible answers:
Get the words that exist in only one string or the other:
using System.Collections.Generic;
using System.Linq;
...
string s1 ="Hello welcome to the world of C sharp";
string s2 = "Hello world welcome to the world of C";
List<string> s1List = (s1 + " " + s2)
.Split(' ')
.Where(s=> (!s2.Split(' ').Contains(s) || !s1.Split(' ').Contains(s)))
.Distinct()
.ToList();
Get all the distinct words:
using System.Collections.Generic;
using System.Linq;
...
string s1 ="Hello welcome to the world of C sharp";
string s2 = "Hello world welcome to the world of C";
List<string> s1List = (s1 + " " + s2).Split(' ').Distinct().ToList();
Answer 3 (score: 0)
Use some of the nice set operations the framework provides:
string s1 ="Hello welcome to the world of C sharp";
string s2 = "Hello world welcome to the world of C";
string[] words1 = s1.Split(' ');
string[] words2 = s2.Split(' ');
var s1UniqueWords = words1.Except(words2);
var s2UniqueWords = words2.Except(words1);
var sharedWords = words1.Intersect(words2);
More information on the various set operations: http://msdn.microsoft.com/en-us/library/bb546153.aspx
Answer 4 (score: 0)
public List<string> UniqueWords(string[] setsOfWords)
{
    List<string> words = new List<string>();
    foreach (var setOfWords in setsOfWords)
    {
        words.AddRange(setOfWords.Split(new char[] { ' ' }));
    }
    return words.Distinct().ToList();
}
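One possible way to call this method with the strings from the question is sketched below. The wrapper class name UniqueWordsDemo is hypothetical, and the method is made static here only so that Main can call it directly. Note that it returns every distinct word from all inputs, not only the words that appear in a single string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UniqueWordsDemo
{
    // Same logic as the method above, made static for this example.
    public static List<string> UniqueWords(string[] setsOfWords)
    {
        List<string> words = new List<string>();
        foreach (var setOfWords in setsOfWords)
        {
            words.AddRange(setOfWords.Split(new char[] { ' ' }));
        }
        return words.Distinct().ToList();
    }

    public static void Main()
    {
        string[] inputs =
        {
            "Hello welcome to the world of C sharp",
            "Hello world welcome to the world of C"
        };

        List<string> words = UniqueWords(inputs);
        Console.WriteLine(words.Count);              // 8
        Console.WriteLine(string.Join(", ", words));
    }
}
```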
Answer 5 (score: 0)
In C++, assuming you have some kind of StringTokenizer class to split the strings:
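#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
// stringTokenizer is the assumed tokenizer class; it is not part of the standard library.
using namespace std;

string s1 = "Hello welcome to the world of C sharp";
string s2 = "Hello world welcome to the world of C";

int main( int argc, char* argv[] )
{
    stringTokenizer lStrToken1(s1);
    stringTokenizer lStrToken2(s2);
    vector<string> lVS1 = lStrToken1.getTokens();
    vector<string> lVS2 = lStrToken2.getTokens();
    // set_difference requires both input ranges to be sorted
    sort( lVS1.begin(), lVS1.end() );
    sort( lVS2.begin(), lVS2.end() );
    // Collect the words of s1 that do not appear in s2
    vector<string> lDiff;
    set_difference( lVS1.begin(), lVS1.end(), lVS2.begin(), lVS2.end(),
                    inserter( lDiff, lDiff.end() ) );
    vector<string>::iterator lIter = lDiff.begin();
    for ( ; lIter != lDiff.end(); ++lIter ) {
        cout << *lIter << endl;
    }
    cout << endl;
    return 0;
}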