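I'm working on a project that needs to edit RTF files, and I'm having a lot of trouble with regular expressions. I'm trying to understand them better and figure out what I'm doing wrong.
The input file always has the following format: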
abc123_456
Q How much room was there between the bike rack and the snow pile?
A There was about three or four feet.
Q Was the whole place covered with snow?
A Most of that place was covered with snow.
I have to edit it so that it ends up in the following format:
abc123_456
How much room was there between the bike rack and the snow pile?
There was about three or four feet.
Was the whole place covered with snow?
Most of that place was covered with snow.
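I'd also appreciate help cleaning up any redundant or inelegant parts of my code, but for now I'd be happy just to produce working output. My current code is here: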
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import javax.swing.JEditorPane;
import javax.swing.text.BadLocationException;
import javax.swing.text.EditorKit;

public class StringEditing {
    String[] linesInDoc;

    private static String readRTF(File file){
        String documentText = "";
        try{
            JEditorPane p = new JEditorPane();
            p.setContentType("text/rtf");
            EditorKit rtfKit = p.getEditorKitForContentType("text/rtf");
            rtfKit.read(new FileReader(file), p.getDocument(), 0);
            rtfKit = null;
            EditorKit txtKit = p.getEditorKitForContentType("text/plain");
            Writer writer = new StringWriter();
            txtKit.write(writer, p.getDocument(), 0, p.getDocument().getLength());
            documentText = writer.toString();
        }
        catch( FileNotFoundException e )
        {
            System.out.println( "File not found" );
        }
        catch( IOException e )
        {
            System.out.println( "I/O error" );
        }
        catch( BadLocationException e )
        {
        }
        return documentText;
    }

    public static void editDocument(File file){
        String plaintext = readRTF(file);
        System.out.println(plaintext);
        plaintext = fixString(plaintext);
        System.out.println(plaintext);
    }

    private static String fixString(String input){
        String removedPrefix = input.replaceAll("(A|Q) *(.+)\r", "$2\r");
        return removedPrefix;
    }
}
The current output is:
fqt225_106
How much room was there between the bike rack and the snow pile?
There was about three or four feet.
Was the whole place covered with snow?
Most of that place was covered with snow.
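Simply put, the problem is to remove the first letter and the spaces that follow it from every line except the first. My own attempts often removed all \s characters, which is no good because I need to keep the line breaks.
Thanks!
[1] Current code, based on suggestions from jbd and David Knipe: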
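To illustrate the point about \s: it matches line terminators as well as spaces and tabs, which is why replacing it wipes out the line breaks. Below is a minimal sketch that restricts the match to spaces and tabs and anchors it to the start of each line. It assumes the prefix is always a single Q or A followed by at least one space or tab; the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

class MultilinePrefixDemo {
    // (?m) makes ^ match at the start of every line, not only at the start of the input.
    // [ \t]+ matches spaces and tabs only, so line terminators are never consumed.
    private static final Pattern PREFIX = Pattern.compile("(?m)^[QA][ \\t]+");

    static String stripPrefixes(String text) {
        return PREFIX.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String text = "abc123_456\r\n"
                + "Q Was the whole place covered with snow?\r\n"
                + "A Most of that place was covered with snow.\r\n";

        // Line breaks survive because the pattern never touches \r or \n.
        System.out.print(stripPrefixes(text));

        // For contrast: \s also matches \r and \n, so everything ends up on one line.
        System.out.println(text.replaceAll("\\s", ""));
    }
}
```

Because the pattern requires whitespace after the letter, a line that merely begins with a capital A or Q (such as "And so on") is left alone.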
public static void editDocument(File file){
    String plaintext = readRTF(file);
    StringBuilder sb = new StringBuilder();
    System.out.println(plaintext);
    String[] lines = plaintext.split(System.getProperty("line.separator"));
    Pattern pattern = Pattern.compile("(?m)^[QA] *");
    for(String s: lines){
        Matcher match = pattern.matcher(s);
        sb.append(match.replaceFirst(""));
        sb.append("\n");
    }
    System.out.println(sb.toString());
}
Small update: I fixed my problem by adding .trim() to each line, replacing my other sb.append statement with the following:
sb.append(match.replaceFirst("").trim());
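Thanks for your help!
Answer 0 (score: 0)
If you already know to skip the first line, you don't need a regex for that line. For the remaining lines, use: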
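import java.util.regex.Matcher;
import java.util.regex.Pattern;
// ^ anchors the match to the start of the line; [QA] matches a single Q or A.
// (Inside a character class, | is a literal pipe, so [Q|A] would also match "|".)
Pattern pattern = Pattern.compile("^[QA] *");
String s = "Q How much room was there between the bike rack and the snow pile?";
Matcher match = pattern.matcher(s);
System.out.println(match.replaceFirst(""));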
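Building on the snippet above, here is a sketch of how the skip-the-first-line idea might look when applied to the whole document. It is not the asker's final code: the class and method names are hypothetical, it assumes Java 8 or later (for the \R line-break matcher), and it assumes the output should use "\n" as the line separator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class TranscriptCleaner {
    // A leading Q or A followed by at least one space or tab.
    private static final Pattern PREFIX = Pattern.compile("^[QA][ \\t]+");

    static String clean(String text) {
        // \R (Java 8+) matches any line break: \n, \r\n, or a lone \r.
        String[] lines = text.split("\\R");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < lines.length; i++) {
            // trim() drops stray leading/trailing whitespace left over from the RTF conversion.
            String line = lines[i].trim();
            // The first line is the identifier: it is only trimmed, never stripped of a prefix.
            out.add(i == 0 ? line : PREFIX.matcher(line).replaceFirst(""));
        }
        return String.join("\n", out);
    }

    public static void main(String[] args) {
        String text = "abc123_456\r\n"
                + "Q How much room was there between the bike rack and the snow pile?\r\n"
                + "A There was about three or four feet.\r\n";
        System.out.println(clean(text));
    }
}
```

Splitting on \R rather than on System.getProperty("line.separator") avoids depending on the platform the code runs on, since the text may carry different line endings than the host system uses.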