C#: split a string using another string as the delimiter and keep the delimiter as part of each split piece

Date: 2017-11-14 14:14:51

Tags: c# regex split

I need to split an input string using a C# regex, and I need to know how to keep the delimiters in the output, as shown below.

Input:

string content="heading1: contents with respect to heading1 heading2: heading2 contents heading3: heading 3 related contents sample strings";

string[] delimters = new string[] {"heading1:","heading2:","heading3:"};

Expected output:

outputArray[0] = heading1: contents with respect to heading1
outputArray[1] = heading2: heading2 contents
outputArray[2] = heading3: heading 3 related contents sample strings

What I tried:

var result = content.Split(delimters,StringSplitOptions.RemoveEmptyEntries);

The output I get:

result [0]: " contents with respect to heading1 "
result [1]: " heading2 contents "
result [2]: " heading 3 related contents sample strings"

I could not find an API in string.Split or Regex that splits the string into the expected result.

3 Answers:

Answer 0: (score: 1)

Instead of splitting, I suggest matching the delimiters and then ordering the matches by position:

// Requires: using System; using System.Collections.Generic;
//           using System.Linq; using System.Text.RegularExpressions;
private static IEnumerable<string> Solution(string source, string[] delimiters) {
  int from = 0;
  int length = 0;

  // Points at which we can split; each delimiter is treated as a regex pattern
  var points = delimiters
      .SelectMany(delimiter => Regex
        .Matches(source, delimiter)
        .OfType<Match>()
        .Select(match => new {
          index = match.Index,
          length = match.Length, // length of the matched text, not of the pattern
          delimiter = delimiter,
        }))
      .OrderBy(item => item.index)
      .ThenBy(item => Array.IndexOf(delimiters, item.delimiter)); // tie break

  foreach (var point in points) {
    // Skip matches that overlap the delimiter we have just accepted
    if (point.index >= from + length) {
      // Condition: we don't want the very first empty part
      if (from != 0 || point.index - from != 0)
        yield return source.Substring(from, point.index - from);

      from = point.index;
      length = point.length;
    }
  }

  yield return source.Substring(from);
}

Test:

string content = 
  "heading1: contents with respect to heading1 heading2: heading2 contents heading3: heading 3 related contents sample strings";

string[] delimiters = new string[] { 
  "heading1:", "heading2:", "heading3:" };

Console.WriteLine(string.Join(Environment.NewLine, Solution(content, delimiters)));

Result:

heading1: contents with respect to heading1 
heading2: heading2 contents 
heading3: heading 3 related contents sample strings

If we split on digits instead (second test)

Console.WriteLine(string.Join(Environment.NewLine, Solution(content, new string[] {"[0-9]+"})));

we get

heading
1: contents with respect to heading
1 heading
2: heading
2 contents heading
3: heading 
3 related contents sample strings

Answer 1: (score: 1)

You can use a solution based on a positive lookahead:

var result = Regex.Split(content, $@"(?={string.Join("|", delimiters.Select(m => Regex.Escape(m)))})")
                  .Where(x => !string.IsNullOrEmpty(x));

Full example:

var content="heading1: contents with respect to heading1 heading2: heading2 contents heading3: heading 3 related contents sample strings";
var delimiters = new string[] {"heading1:","heading2:","heading3:"};
Console.WriteLine(
    string.Join("\n", 
        Regex.Split(content, $@"(?={string.Join("|", delimiters.Select(m => Regex.Escape(m)))})")
             .Where(x => !string.IsNullOrEmpty(x))
    )
);

Output:

heading1: contents with respect to heading1 
heading2: heading2 contents 
heading3: heading 3 related contents sample strings

The (?={string.Join("|", delimiters.Select(m => Regex.Escape(m)))}) part builds the regex dynamically; for these delimiters it looks like

(?=heading1:|heading2:|heading3:)

See the C# demo. The pattern matches any position in the string that is followed by heading1:, heading2: or heading3: without consuming those substrings, so they end up in the output.

Note that Regex.Escape is used to make sure that any special regex metacharacters in the delimiters are escaped and treated as literal characters by the regex engine.
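To illustrate why the escaping matters, here is a minimal sketch of the same lookahead split applied to delimiters that contain regex metacharacters. The class name LookaheadSplitDemo, the helper SplitBefore, and the sample input are assumptions made for this illustration; they are not part of the answer above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LookaheadSplitDemo
{
    // Hypothetical helper: splits before each literal delimiter,
    // so every piece starts with the delimiter that introduced it.
    public static string[] SplitBefore(string content, string[] delimiters)
    {
        // Escape each delimiter so '[', '(' and similar characters are matched literally.
        var alternation = string.Join("|", delimiters.Select(Regex.Escape));

        // The lookahead is zero-width, so the delimiters are not consumed.
        // A match at position 0 yields an empty leading entry, which is filtered out.
        return Regex.Split(content, "(?=" + alternation + ")")
                    .Where(piece => piece.Length > 0)
                    .ToArray();
    }

    public static void Main()
    {
        // Sample delimiters (assumed): unescaped, "[note]" would be read as a character class.
        var delimiters = new[] { "[note]", "(c)" };
        foreach (var piece in SplitBefore("[note] first (c) second", delimiters))
            Console.WriteLine("<" + piece + ">");
    }
}
```

Without Regex.Escape, the pattern built from these sample delimiters would mean something quite different: "[note]" would match a single character and "(c)" would match a bare "c", so the split points would land in the wrong places.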

Answer 2: (score: 0)

string content = "heading1: contents with respect to heading1 heading2: heading2 contents heading3: heading 3 related contents sample strings";
string[] delimters = new string[] { "heading1:", "heading2:", "heading3:" };

var dels = string.Join("|", delimters);
var pattern = "(" + dels + ").*?(?=" + dels + "|\\Z)";

var outputArray = Regex.Matches(content, pattern);

foreach (Match match in outputArray)
    Console.WriteLine(match);

The pattern looks like this:

(heading1:|heading2:|heading3:).*?(?=heading1:|heading2:|heading3:|\Z)

This is similar to WiktorStribiżew's answer. Of course, we should use Regex.Escape, as he shows.
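One way the matching approach might look with the escaping applied is sketched below. The class name EscapedMatchDemo, the helper SplitKeepingDelimiters, and the sample input are hypothetical. The sketch also makes three assumptions that go beyond the answer's code: it uses a non-capturing group, it uses \z rather than \Z, and it sets RegexOptions.Singleline so that a section may span line breaks.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class EscapedMatchDemo
{
    // Hypothetical helper: returns each delimiter together with the text that follows it.
    public static string[] SplitKeepingDelimiters(string content, string[] delimiters)
    {
        // Escape each delimiter so characters such as '.' or '+' are matched literally.
        var dels = string.Join("|", delimiters.Select(Regex.Escape));

        // Match a delimiter, then lazily consume text up to the next delimiter
        // or the end of the input.
        var pattern = "(?:" + dels + ").*?(?=" + dels + "|\\z)";

        return Regex.Matches(content, pattern, RegexOptions.Singleline)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        // Sample input (assumed): both delimiters contain regex metacharacters.
        var delimiters = new[] { "a.b:", "a+b:" };
        foreach (var piece in SplitKeepingDelimiters("a.b: first part a+b: second part", delimiters))
            Console.WriteLine("<" + piece + ">");
    }
}
```

Unlike the lookahead split, this variant drops any text that precedes the first delimiter, because every match has to start with a delimiter.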