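In C++, how do I split a string into evenly sized strings?
For example, I have the string "012345678" and want to split it into 5 smaller strings, which should give me something like "01", "23", "45", "67", "8".
I am having trouble determining the lengths of the smaller strings. In the example above, the original string has length 9 and I want to split it into 5 smaller strings, so every smaller string except the last would have length 9/5 = 1, but then the last one would have length 9 - 1 * 4 = 5, which is unacceptable.
So, the formal definition of the problem: the original string is split into exactly n substrings, and no two substrings may differ in length by more than 1.
My focus is not C++ syntax or libraries. It is how to design the algorithm so that the returned strings are nearly equal in size.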
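Answer 0 (score: 5)
To divide N items into M parts, with lengths within one unit of each other, you can use the formula (N*i+N)/M - (N*i)/M (integer division) as the length of the i'th part, as illustrated below.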
#include <string>
#include <iostream>
using namespace std;
int main() {
    string text = "abcdefghijklmnopqrstuvwxyz";
    int N = text.length();
    // Try every part count M from 3 to 13.
    for (int M=3; M<14; ++M) {
        cout <<" length:"<< N <<" parts:"<< M << "\n";
        int at, pre=0, i;
        for (pre = i = 0; i < M; ++i) {
            // End boundary of part i; integer division rounds down.
            at = (N+N*i)/M;
            cout << "part " << i << "\t" << pre << "\t" << at;
            cout << "\t" << text.substr(pre, at-pre) << "\n";
            pre = at;
        }
    }
    return 0;
}
For example, when M is 4 or 5, the code above produces:
length:26 parts:4
part 0 0 6 abcdef
part 1 6 13 ghijklm
part 2 13 19 nopqrs
part 3 19 26 tuvwxyz
length:26 parts:5
part 0 0 5 abcde
part 1 5 10 fghij
part 2 10 15 klmno
part 3 15 20 pqrst
part 4 20 26 uvwxyz
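As a rough sketch, the same boundary formula could be packaged as a function that returns the pieces instead of printing them. The name split_by_boundaries and its signature are assumptions for illustration, not part of the answer, and the product n * (i + 1) is assumed not to overflow.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the answer): splits `text` into exactly
// `parts` pieces, cutting at the boundaries (n * (i + 1)) / parts.
std::vector<std::string> split_by_boundaries(const std::string& text, std::size_t parts) {
    assert(parts > 0);  // assumption: the caller never asks for zero parts
    const std::size_t n = text.size();
    std::vector<std::string> out;
    out.reserve(parts);
    std::size_t begin = 0;
    for (std::size_t i = 0; i < parts; ++i) {
        // End of part i. Integer division rounds down, so consecutive
        // boundaries are floor(n/parts) or ceil(n/parts) apart.
        const std::size_t end = (n * (i + 1)) / parts;
        out.push_back(text.substr(begin, end - begin));
        begin = end;
    }
    return out;
}
```

With the string from the question, split_by_boundaries("012345678", 5) yields "0", "12", "34", "56", "78": the short piece comes first here rather than last, which still satisfies the rule that lengths differ by at most 1.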
Answer 1 (score: 4)
My solution:
#include <string>
#include <vector>
// Splits s into `count` tokens whose sizes differ by at most one.
// The first `extra` tokens get one additional character. Requires count > 0.
std::vector<std::string> split(std::string const & s, size_t count)
{
    size_t minsize = s.size()/count;
    int extra = s.size() - minsize * count;
    std::vector<std::string> tokens;
    for(size_t i = 0, offset=0 ; i < count ; ++i, --extra)
    {
        size_t size = minsize + (extra>0?1:0);
        if ( (offset + size) < s.size())
            tokens.push_back(s.substr(offset,size));
        else
            tokens.push_back(s.substr(offset, s.size() - offset));
        offset += size;
    }
    return tokens;
}
Test code:
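#include <algorithm>
#include <iostream>
#include <iterator>
int main()
{
    std::string s;
    // Read one word per iteration and split it into 5 tokens.
    while (std::cin >> s)
    {
        std::vector<std::string> tokens = split(s, 5);
        //output
        std::copy(tokens.begin(), tokens.end(),
                  std::ostream_iterator<std::string>(std::cout, ", "));
        std::cout << std::endl;
    }
}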
Input:
012345
0123456
01234567
012345678
0123456789
01234567890
Output:
01, 2, 3, 4, 5,
01, 23, 4, 5, 6,
01, 23, 45, 6, 7,
01, 23, 45, 67, 8,
01, 23, 45, 67, 89,
012, 34, 56, 78, 90,
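This solution tends to make the tokens even; that is, the tokens may not all be the same size, but their sizes differ by at most one character.
Answer 2 (score: 1)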
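It is enough to know the lengths of the substrings.
Suppose m is the size() of the string:
int k = (m%n == 0)? n : n-m%n;
Then k of the substrings should have length m/n, and the remaining n-k should have length m/n+1.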
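A minimal sketch of how those lengths might be turned into actual substrings. The function name split_short_first is hypothetical, and placing the k shorter pieces first is an assumption: the rule above does not say where they go, and any order keeps the lengths within 1 of each other.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: k pieces of length m/n followed by n-k pieces
// of length m/n+1, where m is the size of the string.
std::vector<std::string> split_short_first(const std::string& s, std::size_t n) {
    assert(n > 0);  // assumption: at least one piece is requested
    const std::size_t m = s.size();
    const std::size_t k = (m % n == 0) ? n : n - m % n;  // number of shorter pieces
    std::vector<std::string> out;
    out.reserve(n);
    std::size_t offset = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t len = (i < k) ? m / n : m / n + 1;
        out.push_back(s.substr(offset, len));
        offset += len;
    }
    return out;
}
```

For m = 9 and n = 5 this gives k = 1, so "012345678" becomes "0", "12", "34", "56", "78".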
Answer 3 (score: 0)
Try substr.
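For reference, a small sketch of what std::string::substr does at a cut point. The helper name cut_at is made up for this illustration; substr(pos, count) returns up to count characters starting at pos, and substr(pos) returns everything from pos to the end.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: cuts `s` at position `pos` and returns both halves.
std::pair<std::string, std::string> cut_at(const std::string& s, std::size_t pos) {
    assert(pos <= s.size());    // substr throws std::out_of_range if pos > size()
    return { s.substr(0, pos),  // the first `pos` characters
             s.substr(pos) };   // everything from `pos` to the end
}
```

The remaining work is deciding where the cut points go, which is what the other answers address.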
Answer 4 (score: 0)
You can get iterators to the points where you want to split, then use them to construct new strings. For example:
std::string s1 = "string to split";
std::string::iterator halfway = s1.begin() + s1.size() / 2;
std::string s2(s1.begin(), halfway);
std::string s3(halfway, s1.end());
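The two-way split above could be generalized to n parts along these lines. This is only a sketch: the function name split_with_iterators is hypothetical, and giving the extra characters to the first parts is an assumption chosen to match the example in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical generalization of the two-way iterator split to `n` parts.
std::vector<std::string> split_with_iterators(const std::string& s, std::size_t n) {
    assert(n > 0);  // assumption: at least one part is requested
    const std::size_t base = s.size() / n;   // minimum length of each part
    const std::size_t extra = s.size() % n;  // this many parts get one more character
    std::vector<std::string> out;
    out.reserve(n);
    std::string::const_iterator first = s.begin();
    for (std::size_t i = 0; i < n; ++i) {
        const auto len = static_cast<std::string::difference_type>(
            base + (i < extra ? 1 : 0));
        std::string::const_iterator last = first + len;
        out.emplace_back(first, last);  // build the part from the iterator range
        first = last;
    }
    return out;
}
```

Called with "012345678" and 5, this produces "01", "23", "45", "67", "8".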
Answer 5 (score: 0)
Let's say the string has length L and must be split into n substrings.
# Find the next multiple of `n` greater than or equal to `L`
L = 9
n = 5
LL = n * (L // n)
if LL < L:
    LL += n
# Split a string of length LL into n equal sizes. The string is at
# most (n-1) longer than L.
lengths = [LL // n for x in range(n)]
# Remove one from the first (or any) (LL-L) elements.
for i in range(LL - L):
    lengths[i] = lengths[i] - 1
# Get indices from lengths.
s = 0
idx = []
for i in lengths:
    idx.append(s)
    s = s + i
idx.append(L)
print(idx)
Edit: OK, OK, I forgot it should be C++.
Edit: Here it is...
#include <vector>
#include <iostream>
unsigned int L = 13;
unsigned int n = 5;
int
main ()
{
  unsigned int i;
  unsigned int LL;
  std::vector<int> lengths, idx;
  /* Find the next multiple of `n` greater than or equal to `L` */
  LL = n * (L / n);
  if (LL < L)
    LL += n;
  /* Split a string of length LL into n equal sizes. The string is at
     most (n-1) longer than L. */
  for (i = 0; i < n; ++i)
    lengths.push_back (LL/n);
  /* Remove one from the first (or any) (LL-L) elements. */
  for (i = 0; i < LL - L; ++i)
    --lengths [i];
  /* Get indices from lengths. */
  int s = 0;
  for (auto &ii: lengths)
    {
      idx.push_back (s);
      s += ii;
    }
  idx.push_back (L);
  /* Print the n+1 boundary indices. */
  for (auto &b : idx)
    std::cout << b << " ";
  std::cout << std::endl;
  return 0;
}
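The program above only prints the boundary indices. A minimal sketch of turning such a list into substrings follows; the helper name slice_at is hypothetical, and it takes std::size_t indices rather than the int values used above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: `idx` holds non-decreasing boundaries, starting
// at 0 and ending at text.size(); each adjacent pair delimits one piece.
std::vector<std::string> slice_at(const std::string& text,
                                  const std::vector<std::size_t>& idx) {
    assert(!idx.empty() && idx.back() <= text.size());
    std::vector<std::string> out;
    for (std::size_t i = 0; i + 1 < idx.size(); ++i) {
        assert(idx[i] <= idx[i + 1]);  // boundaries must not go backwards
        out.push_back(text.substr(idx[i], idx[i + 1] - idx[i]));
    }
    return out;
}
```

With the indices computed for L = 13 and n = 5, which are 0 2 4 7 10 13, the string "abcdefghijklm" is sliced into "ab", "cd", "efg", "hij", "klm".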