Question

I have a file in the following format:

011110659
            A101    000001    $.45          
031100762
            1030    000001    $.45          
071000288
            1040    000003   $1.35          
            1040    000001    $.45          
103100195
            1030    000001  $33.45          
            J5BU    000001    $.45

I want to use a regular expression to make it look like this:

011110659   A101    000001    $.45
031100762   1030    000001    $.45
071000288   1040    000003   $1.35
071000288   1040    000001    $.45
103100195   1030    000001  $33.45
103100195   J5BU    000001    $.45

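What I want to do is copy the text from each line that contains only a single string and prepend it to the following lines that contain multiple strings.

I can do this with a script, but is there a way to do it with a regular expression alone?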

Answer 1

No.

The closest you can get is to replace ((?<=\n|^)\d+)\s+\r?\n with $1. Demo here: http://regex101.com/r/qD5rJ6
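To see why that replacement is only "close", here is a minimal Java sketch of it. The class and method names are made up for illustration, and the sample input assumes the identifier line carries trailing whitespace (or a CRLF line ending), because \s+ needs at least one whitespace character before the line break. It joins the identifier with the first data line only; the second data line under the same identifier is left without one.

```java
import java.util.regex.Pattern;

final class JoinIdLineSketch {
    // The pattern from the answer above, written as a Java string literal.
    private static final Pattern ID_THEN_NEWLINE =
        Pattern.compile("((?<=\\n|^)\\d+)\\s+\\r?\\n");

    // Replaces "identifier + trailing whitespace + line break" with just the identifier,
    // which pulls the next line up onto the identifier's line.
    static String joinIdWithNextLine(String input) {
        return ID_THEN_NEWLINE.matcher(input).replaceAll("$1");
    }

    public static void main(String[] args) {
        // Assumption: the identifier line ends with trailing spaces.
        String input =
            "071000288  \n" +
            "            1040    000003   $1.35\n" +
            "            1040    000001    $.45\n";
        System.out.print(joinIdWithNextLine(input));
    }
}
```

A single substitution cannot remember the identifier and repeat it on every following row, which is why the answer to the original question is still "no".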

Answer 2

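Perhaps not as concise as you were hoping for, but it works. The regular expression matters, but it is not the key part.

Top of the class and setup: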

   import  java.util.Arrays;
   import  java.util.ArrayList;
   import  java.util.regex.Pattern;
/**
   <P>{@code java FormatDataLinesWithRegexXmpl}</P>
 **/
public class FormatDataLinesWithRegexXmpl  {
   public static final void main(String[] igno_red)  {
      String sLS = System.getProperty("line.separator", "\n");
      StringBuilder sdInput = new StringBuilder().
         append("011110659                                                            ").append(sLS).
         append("                                                                     ").append(sLS).
         append("                          A101            000001                 $.45").append(sLS).
         append("                                                                     ").append(sLS).
         append("031100762                                                            ").append(sLS).
         append("                                                                     ").append(sLS).
         append("                          1030            000001                 $.45").append(sLS).
         append("                                                                     ").append(sLS).
         append("071000288                                                            ").append(sLS).
         append("                                                                     ").append(sLS).
         append("                          1040            000003                $1.35").append(sLS).
         append("                                                                     ").append(sLS).
         append("                          1040            000001                 $.45").append(sLS).
         append("                                                                     ").append(sLS).
         append("103100195                                                            ").append(sLS).
         append("                                                                     ").append(sLS).
         append("                          1030            000001               $33.45").append(sLS).
         append("                                                                     ").append(sLS).
         append("                          J5BU            000001                 $.45").append(sLS);

Main logic:

      //config
         int iCOL1LEN = 9;
         String sRE234 = "" +
            "(\\w{4})[\\t ]+" +               //Column 2
            "(\\d{6})[\\t ]+" +               //Column 3
            "\\$(\\d*\\.*\\d+)$";             //Column 4
      Pattern p234 = Pattern.compile(sRE234);

      ArrayList<String> alLines = new ArrayList<String>(Arrays.asList(sdInput.toString().split(sLS)));
      ArrayList<DataLine> aldl = new ArrayList<DataLine>();

      System.out.println("read...START");
      DataLine dlCurr = null;
      int iLn = -1;

      while(alLines.size() > 0)  {
         iLn++;              //1st iteration: was -1, now 0

         String sLine = alLines.remove(0).trim();
         if(sLine.length() == 0)  {
            continue;

         }  else if(sLine.length() == iCOL1LEN)  {
            if(dlCurr != null)  {
               throw  new IllegalStateException("[line " + iLn + "]: Found two column-1s in a row.");
            }
            dlCurr = new DataLine(sLine);
            System.out.print("1");

         }  else if(p234.matcher(sLine).matches())  {
            if(dlCurr == null)  {
               //Current 234-columns have no corresponding column-1.
               //Use previous.
               dlCurr = new DataLine(aldl.get(aldl.size() - 1).sCol1);
               System.out.print("1");
            }
            dlCurr.set234(sLine);
            aldl.add(dlCurr);
            System.out.println("234");
            dlCurr = null;
         }
      }
      System.out.println("read...END");

      System.out.println("Output:");

      for(DataLine dl : aldl)  {
         System.out.println(dl.sCol1 + "        " + dl.sCol2 + "        " + dl.sCol3 + "        " + dl.sCol4);
      }
   }
}

A simple data-holder class:

class DataLine  {
   public String sCol1 = null;
   public String sCol2 = null;
   public String sCol3 = null;
   public String sCol4 = null;
   public DataLine(String s_col1)  {
      sCol1 = s_col1;
   }
   public void set234(String s_234)  {
      String[] as = s_234.split("[\t ]+");
      sCol2 = as[0];
      sCol3 = as[1];
      sCol4 = as[2];
   }
}

Output:

[C:\java_code\]java FormatDataLinesWithRegexXmpl
read...START
1234
1234
1234
1234
1234
1234
read...END
Output:
011110659        A101        000001        $.45
031100762        1030        000001        $.45
071000288        1040        000003        $1.35
071000288        1040        000001        $.45
103100195        1030        000001        $33.45
103100195        J5BU        000001        $.45
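For comparison, the same carry-the-identifier-forward idea can be condensed into a single loop. This is only a sketch under assumptions taken from the sample data: identifiers are exactly nine digits, column 2 is four word characters, column 3 is six digits, and the amount is right-aligned in eight characters to mimic the layout requested in the question. The names FillDownSketch and fillDown are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FillDownSketch {
    // Assumption: a column-1 line holds nothing but a 9-digit identifier.
    private static final Pattern ID_LINE = Pattern.compile("\\d{9}");
    // Columns 2-4: 4-character code, 6-digit count, dollar amount such as $.45 or $33.45.
    private static final Pattern DATA_LINE =
        Pattern.compile("(\\w{4})[\\t ]+(\\d{6})[\\t ]+(\\$\\d*\\.?\\d+)");

    // Remembers the most recent identifier and prefixes it to every data line that follows.
    static List<String> fillDown(String input) {
        List<String> rows = new ArrayList<>();
        String currentId = null;
        for (String raw : input.split("\\r?\\n")) {
            String line = raw.trim();
            if (ID_LINE.matcher(line).matches()) {
                currentId = line;
                continue;
            }
            Matcher m = DATA_LINE.matcher(line);
            // Blank lines and data lines seen before any identifier are skipped.
            if (currentId != null && m.matches()) {
                rows.add(String.format("%s   %s    %s%8s",
                    currentId, m.group(1), m.group(2), m.group(3)));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String input =
            "071000288\n" +
            "            1040    000003   $1.35\n" +
            "            1040    000001    $.45\n";
        for (String row : fillDown(input)) {
            System.out.println(row);
        }
    }
}
```

The regular expressions only classify each line; the state that makes the transformation possible is the currentId variable, which is exactly what a single find-and-replace cannot keep.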
