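How to trim commas and periods before or after a word

Time: 2016-05-01 17:18:05

Tags: sql asp.net trim

In my ASP.NET web project I am trying to insert every word of a text with 3 paragraphs into a database. If a sentence looks like this

"This text is a test text."

then the last word is stored in the database together with the period ("text.").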

 strNew = strNew.Trim(new Char[] { ' ', ',', '.', '?' });

I have tried the line above, but it did not help. Here is my full code.

    {
        int id = 0;
        ListBox1.Items.Clear();
        string strNew = Request.Form["TextBox1"];

        int n = strNew.Split(' ').Length;
        ListBox3.Items.Add(String.Format("Number of Words: {0}", n));

        int m = Regex.Matches(strNew, "[^\r\n]+((\r|\n|\r\n)[^\r\n]+)*").Count; // Counts Number of Paragraphes

        ListBox3.Items.Add(String.Format("Number of Paragraphes: {0}", m));
        strNew = strNew.ToLower(); // all lower case
        strNew = strNew.Trim(new Char[] { ' ', ',', '.', '?' });
        var results = strNew.Split(' ').Where(x => x.Length > 1)
                                       .GroupBy(x => x)
                                       .Select(x => new { Count = x.Count(), Word = x.Key }); // splitting sentences in to words

        using (con)
        {
            con.Open();

            foreach (var item in results)
            {
                // here trying to insert word its id and some other informations but for now they can stay null(yes,null allowed for them)
                id++;
                string w = item.Word.ToString();
                SqlCommand cmd = con.CreateCommand();
                cmd.Parameters.AddWithValue("@id", id);
                cmd.Parameters.AddWithValue("@word", item.Word);
                cmd.CommandText = "INSERT INTO word(id, word, sid, frequency, weight, f) VALUES (@id, @word, 0, 0, 0, 0) ";

                cmd.ExecuteNonQuery();
            }
            con.Close();
        }
    }

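3 Answers:

Answer 0: (score: 0)

The problem is not confined to a single line; it comes from splitting and trimming text that spans several paragraphs.

Trim only removes characters from the beginning and end of the whole string, not from each paragraph or word, because a string has no notion of a paragraph.

You should split the text first and then trim each piece.

Example: "This text is a test text.\r\nThis is another test."

The current code only trims the final ".", because "text.\r\nThis" is treated as a single word.

Code: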

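char[] trimCharacters = { ' ', ',', '.', '?' };
var results = strNew.Split(new string[] { " ", "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries)
                    .Select(x => x.Trim(trimCharacters))   // trim before grouping so "text." and "text" count as one word
                    .Where(x => x.Length > 0)              // drop entries that were punctuation only
                    .GroupBy(x => x)
                    .Select(x => new { Count = x.Count(), Word = x.Key });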
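
To make the difference concrete, here is a minimal, self-contained sketch that contrasts trimming the whole string with trimming each word. The class name `TrimDemo` and its method names are made up for illustration and are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class TrimDemo
{
    private static readonly char[] TrimCharacters = { ' ', ',', '.', '?' };
    private static readonly string[] Separators = { " ", "\r\n", "\n" };

    // The question's approach: trim the two ends of the whole string, then split on spaces.
    public static List<string> TrimWholeString(string text)
    {
        return text.Trim(TrimCharacters).Split(' ').ToList();
    }

    // Split on spaces and line breaks first, then trim every single word.
    public static List<string> TrimEachWord(string text)
    {
        return text.Split(Separators, StringSplitOptions.RemoveEmptyEntries)
                   .Select(w => w.Trim(TrimCharacters))
                   .Where(w => w.Length > 0)
                   .ToList();
    }
}
```

For the input "this text is a test text.\r\nthis is another test.", `TrimWholeString` still returns one entry "text.\r\nthis", whereas `TrimEachWord` returns "text" and "this" as separate, clean words.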

Answer 1: (score: 0)

In fact, there is no need to trim your strings at all.

    var matches = System.Text.RegularExpressions.Regex.Matches(strNew, @"(\b\w+\b)")
        .Cast<System.Text.RegularExpressions.Match>();
    var result = matches.GroupBy(m => m.Value)
        .Select(gr => new { Word = gr.Key, Count = gr.Count() });

    foreach (var r in result)
    {
        //do whatever you want with r.Word and r.Count
    }
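
As a rough sketch of how the regex approach could be wrapped up for reuse, the helper below returns the counts as a dictionary. The class name `RegexWordCount` is hypothetical, and the lower-casing step is an assumption carried over from the question's `ToLower()` call rather than something this answer requires.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class RegexWordCount
{
    // Returns each distinct word (lower-cased) with the number of times it occurs.
    public static Dictionary<string, int> Count(string text)
    {
        return Regex.Matches(text.ToLowerInvariant(), @"\b\w+\b")
                    .Cast<Match>()                 // MatchCollection is non-generic on older frameworks
                    .GroupBy(m => m.Value)
                    .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

Note that `\w` also matches digits and underscores, and that an apostrophe ends a match, so "don't" would be counted as "don" and "t". Whether that matters depends on the text being processed.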

Answer 2: (score: 0)

Why not call TrimEnd('.') or TrimStart on each word after splitting?
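
A minimal sketch of that suggestion, assuming the same punctuation set the question uses (comma, period, question mark). The class name `TrimEnds` and the method `CleanWords` are invented for this example.

```csharp
using System;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class TrimEnds
{
    private static readonly char[] Punctuation = { ',', '.', '?' };

    // Splits on spaces and line-break characters, then strips punctuation from both ends of each word.
    public static string[] CleanWords(string text)
    {
        return text.Split(new[] { ' ', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
                   .Select(w => w.TrimStart(Punctuation).TrimEnd(Punctuation))
                   .Where(w => w.Length > 0)   // skip tokens that were punctuation only
                   .ToArray();
    }
}
```

Calling `TrimStart` followed by `TrimEnd` with the same characters has the same effect as a single `Trim` call; the two are spelled out here only to mirror the wording of the answer.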