Split a string based on character count and words using C#

Time: 2016-02-06 13:01:39

Tags: c# regex string split substring

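I need to split a string on word boundaries so that each line has at most 25 characters. For example:

string ORIGINAL_TEXT = "Please write a program that breaks this text into small chucks. Each chunk should have a maximum length of 25";

The output should be: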

"Please write a program",

"that breaks this text",

"into small chucks. Each",

"chunk should have a",

"maximum length of 25"

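I tried using Substring, but it breaks words, like this:

"Please write a program th" - wrong

"Please write a program" - right

"Please write a program" is only 22 characters, so it has room for 3 more, but using them would break the word "that".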

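// splitSamp presumably holds the input text (its declaration is not shown in the post).
string[] splitSampArr = splitSamp.Split(',', '.', ';');
string[] myText = new string[splitSampArr.Length + 1];

int i = 0;
foreach (string splitSampArrVal in splitSampArr)
{
    if (splitSampArrVal.Length > 25)
    {
        // Substring cuts at exactly 25 characters, even in the middle of a word.
        myText[i] = splitSampArrVal.Substring(0, 25);
        i++;
    }
    myText[i] = splitSampArrVal;

    i++;
}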

2 answers:

Answer 0 (score: 4)

You can achieve this with the following regex:

@"(\b.{1,25})(?:\s+|$)"

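See the regex demo.

This regex matches and captures into Group 1 any characters other than a newline (.), starting at a word boundary (\b, so a match only begins at the start of a whole word), 1 to 25 times (because of the limiting quantifier {1,25}). It then matches 1 or more whitespace characters (\s+) or the end of the string ($). Because the quantifier is greedy, the regex first takes as many characters as it can and then backtracks until the capture is followed by whitespace or the end of the string, so a chunk never ends in the middle of a word.

See the code demo: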

using System;
using System.Linq;
using System.Collections.Generic;
using System.Text.RegularExpressions;
public class Test
{
    public static void Main()
    {
        var str = "Please write a program that breaks this text into small chucks. Each chunk should have a maximum length of 25 ";
        var chunks = Regex.Matches(str, @"(\b.{1,25})(?:\s+|$)")
                 .Cast<Match>().Select(p => p.Groups[1].Value)
                 .ToList();
        Console.WriteLine(string.Join("\n", chunks));
    }
}
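The demo above hard-codes the limit of 25 in the pattern. As a minimal sketch, the same pattern can be built for any limit; the class name RegexChunker, the method Split, and the parameter maxLength are hypothetical names introduced here for illustration and are not part of the answer. The sketch also trims trailing whitespace, since the last capture can include the trailing space of the input.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexChunker
{
    // Hypothetical helper: applies the answer's pattern with a configurable limit.
    public static List<string> Split(string text, int maxLength)
    {
        if (maxLength < 1)
            throw new ArgumentOutOfRangeException(nameof(maxLength));

        // In an interpolated string, {{ and }} produce literal braces,
        // so for maxLength = 25 this yields (\b.{1,25})(?:\s+|$)
        var pattern = $@"(\b.{{1,{maxLength}}})(?:\s+|$)";

        return Regex.Matches(text, pattern)
                    .Cast<Match>()
                    // Group 1 holds the chunk without the whitespace that follows it.
                    .Select(m => m.Groups[1].Value.TrimEnd())
                    .ToList();
    }
}
```

Calling RegexChunker.Split(str, 25) on the sample sentence should give the five chunks listed in the question. One caveat of the pattern itself: a single word longer than maxLength cannot satisfy it, so such a word is left out of the result rather than split.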

Answer 1 (score: 1)

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApplication3
{
    class Program
    {
        static void Main(string[] args)
        {
            var sentence = "Please write a program that breaks this text into small chucks. Each chunk should have a maximum length of 25 ";
            StringBuilder sb = new StringBuilder();
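            // Characters written to the current line so far, including each word's trailing space.
            int count = 0;
            // RemoveEmptyEntries skips the empty entry produced by the trailing space in the input.
            var words = sentence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (var word in words)
            {
                // Start a new line when the next word would push the line past 25 characters.
                // The count > 0 check avoids a leading empty line when the first word is longer than 25.
                if (count > 0 && count + word.Length > 25)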
                {
                    sb.Append(Environment.NewLine);
                    count = 0;
                }
                sb.Append(word + " ");
                count += word.Length + 1;
            }
            Console.WriteLine(sb.ToString());
            Console.ReadKey();
        }
    }
}
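The program above prints the wrapped text and leaves a trailing space on each line. If the chunks are needed as separate strings, as in the question's expected output, the same word-by-word idea can return a list instead. This is only a sketch: WordWrapper, Wrap, and maxLength are hypothetical names, and it assumes that a word longer than maxLength should be kept whole on its own line rather than cut.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class WordWrapper
{
    // Hypothetical helper: greedy word wrap that returns the chunks as a list.
    public static List<string> Wrap(string text, int maxLength)
    {
        var chunks = new List<string>();
        var current = new StringBuilder();

        // RemoveEmptyEntries drops the empty "words" caused by repeated or trailing spaces.
        foreach (var word in text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // The +1 accounts for the space that would join the word to the current chunk.
            if (current.Length > 0 && current.Length + 1 + word.Length > maxLength)
            {
                chunks.Add(current.ToString());
                current.Clear();
            }

            if (current.Length > 0)
                current.Append(' ');
            current.Append(word);
        }

        // Flush the last chunk, if any.
        if (current.Length > 0)
            chunks.Add(current.ToString());

        return chunks;
    }
}
```

For the sample sentence and a limit of 25, WordWrapper.Wrap should return the same five chunks as the regex in the first answer, without trailing spaces.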