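Building a smart string trim function in C#

Date: 2010-08-17 16:55:48

Tags: c# string

I am trying to build a string extension method that trims a string to a certain length without breaking a word. I wanted to check whether anything is built into the framework, or whether there is a more clever method than mine. Here is what I have so far (not thoroughly tested):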

public static string SmartTrim(this string s, int length)
{
    StringBuilder result = new StringBuilder();

    if (length >= 0)
    {
        if (s.IndexOf(' ') > 0)
        {
            string[] words = s.Split(' ');
            int index = 0;

            while (index < words.Length - 1 && result.Length + words[index + 1].Length <= length)
            {
                result.Append(words[index]);
                result.Append(" ");
                index++;
            }

            if (result.Length > 0)
            {
                result.Remove(result.Length - 1, 1);
            }
        }
        else
        {
            result.Append(s.Substring(0, length));
        }
    }
    else
    {
        throw new ArgumentOutOfRangeException("length", "Value cannot be negative.");
    }

    return result.ToString();
}

7 answers:

Answer 0 (score: 14)

I would use string.LastIndexOf, at least if we only care about spaces. Then there is no need to create any intermediate strings...

Not yet tested:

public static string SmartTrim(this string text, int length)
{
    if (text == null)
    {
        throw new ArgumentNullException("text");
    }
    if (length < 0)
    {
        throw new ArgumentOutOfRangeException("length");
    }
    if (text.Length <= length)
    {
        return text;
    }
    int lastSpaceBeforeMax = text.LastIndexOf(' ', length);
    if (lastSpaceBeforeMax == -1)
    {
        // Perhaps define a strategy here? Could return empty string,
        // or the original
        throw new ArgumentException("Unable to trim word");
    }
    return text.Substring(0, lastSpaceBeforeMax);        
}

Test code:

public class Test
{
    static void Main()
    {
        Console.WriteLine("'{0}'", "foo bar baz".SmartTrim(20));
        Console.WriteLine("'{0}'", "foo bar baz".SmartTrim(3));
        Console.WriteLine("'{0}'", "foo bar baz".SmartTrim(4));
        Console.WriteLine("'{0}'", "foo bar baz".SmartTrim(5));
        Console.WriteLine("'{0}'", "foo bar baz".SmartTrim(7));
    }
}

Results:

'foo bar baz'
'foo'
'foo'
'foo'
'foo bar'

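Answer 1 (score: 2)

How about a Regex-based solution? You will probably want to test it some more and add some bounds checking, but this is what came to mind: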

using System;
using System.Text.RegularExpressions;

namespace Stackoverflow.Test
{
    static class Test
    {
        private static readonly Regex regWords = new Regex("\\w+", RegexOptions.Compiled);

        static void Main()
        {
            Console.WriteLine("The quick brown fox jumped over the lazy dog".SmartTrim(8));
            Console.WriteLine("The quick brown fox jumped over the lazy dog".SmartTrim(20));
            Console.WriteLine("Hello, I am attempting to build a string extension method to trim a string to a certain length but with not breaking a word. I wanted to check to see if there was anything built into the framework or a more clever method than mine".SmartTrim(100));
        }

        public static string SmartTrim(this string s, int length)
        {
            var matches = regWords.Matches(s);
            foreach (Match match in matches)
            {
                if (match.Index + match.Length > length)
                {
                    int ln = match.Index + match.Length > s.Length ? s.Length : match.Index + match.Length;
                    return s.Substring(0, ln);
                }
            }
            return s;
        }
    }
}

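Answer 2 (score: 2)

Give this a try. It is null-safe, won't break if length is longer than the string, and involves less string manipulation.

Edit: As suggested, I have removed the intermediate string. I'll leave the answer up, since it may be useful in cases where exceptions are not wanted.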

public static string SmartTrim(this string s, int length)
{
    if(s == null || length < 0 || s.Length <= length)
        return s;

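    // Edit à la Jon Skeet: removes the unnecessary intermediate string. Thanks!
    // string temp = s.Length > length + 1 ? s.Remove(length+1) : s;
    // Search backward from index `length` (valid because s.Length > length),
    // so the result never exceeds `length` characters.
    int lastSpace = s.LastIndexOf(' ', length);
    return lastSpace < 0 ? string.Empty : s.Remove(lastSpace);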
}

Answer 3 (score: 1)

string strTemp = "How are you doing today";
int nLength = 12;
strTemp = strTemp.Substring(0, strTemp.Substring(0, nLength).LastIndexOf(' '));

I think that should do it. When I ran it, it ended up with "How are you".

So your function would be:

public static string SmartTrim(this string s, int length)
{
    return s.Substring(0, s.Substring(0, length).LastIndexOf(' '));
}

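I would definitely add some exception handling, such as making sure that length is not greater than the string's length and not less than 0.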
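
The checks mentioned above might look something like the following. This is only a sketch: the names StringTrimSketch and SmartTrimGuarded are hypothetical, and returning the input unchanged when it already fits, or an empty string when there is no space to cut at, are assumptions made for illustration rather than part of the answer.

```csharp
using System;

public static class StringTrimSketch
{
    // Hypothetical name; a guarded variant of the one-liner above.
    public static string SmartTrimGuarded(this string s, int length)
    {
        if (s == null)
            throw new ArgumentNullException("s");
        if (length < 0)
            throw new ArgumentOutOfRangeException("length");

        // Assumption: a string that already fits is returned unchanged.
        if (s.Length <= length)
            return s;

        // Same logic as the one-liner: find the last space within
        // the first `length` characters.
        int lastSpace = s.Substring(0, length).LastIndexOf(' ');

        // Assumption: with no space to cut at, return an empty string
        // rather than break a word.
        return lastSpace < 0 ? string.Empty : s.Substring(0, lastSpace);
    }
}
```

Like the one-liner, this only looks for spaces inside the first length characters, so "foo bar".SmartTrimGuarded(3) yields an empty string rather than "foo".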

Answer 4 (score: 1)

The obligatory LINQ one-liner, if you only care about whitespace as the word boundary:

return new String(s.TakeWhile((ch,idx) => (idx < length) || (idx >= length && !Char.IsWhiteSpace(ch))).ToArray());
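
For context, the one-liner needs System.Linq and an enclosing extension method. A minimal sketch follows; the names LinqTrimSketch and SmartTrimLinq are hypothetical, and the predicate is written in a slightly shorter but logically equivalent form. Note that this approach keeps going until the current word ends, so the result can be longer than length.

```csharp
using System;
using System.Linq;

public static class LinqTrimSketch
{
    // Hypothetical wrapper around the one-liner above.
    public static string SmartTrimLinq(this string s, int length)
    {
        // Take the first `length` characters unconditionally, then keep
        // taking characters until the first whitespace at or after `length`.
        return new string(s.TakeWhile((ch, idx) =>
            idx < length || !char.IsWhiteSpace(ch)).ToArray());
    }
}
```

For example, "foo bar baz".SmartTrimLinq(5) yields "foo bar": the cut falls inside "bar", so the whole word is kept.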

Answer 5 (score: 1)

Use it like this:

var substring = source.GetSubstring(50, new string[] { " ", "." });

This method gets a substring based on one or more separators:

public static string GetSubstring(this string source, int length, params string[] options)
{
    if (string.IsNullOrWhiteSpace(source))
    {
        return string.Empty;
    }

    if (source.Length <= length)
    {
        return source;
    }

    var indices =
        options.Select(
            separator => source.IndexOf(separator, length, StringComparison.CurrentCultureIgnoreCase))
            .Where(index => index >= 0)
            .ToList();

    if (indices.Count > 0)
    {
        return source.Substring(0, indices.Min());
    }

    return source;
}

Answer 6 (score: 0)

Even though others have already answered this adequately, I'll throw in some LINQ goodness:

public string TrimString(string s, int maxLength)
{
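    // LastOrDefault picks the last whitespace at or before maxLength;
    // SingleOrDefault would throw if more than one position matched.
    var pos = s.Select((c, idx) => new { Char = c, Pos = idx })
        .Where(item => char.IsWhiteSpace(item.Char) && item.Pos <= maxLength)
        .Select(item => item.Pos)
        .LastOrDefault();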

    return pos > 0 ? s.Substring(0, pos) : s;
}

I left out parameter checks and the like, just to emphasize the important code...