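I am reading a file line by line and using line.indexOf('"', 1) and substring() to split each line into smaller strings, but this approach does not check whether the " is preceded by a \, so it does not react to the escape character. How can I fix this?
(I can't simply use line.split('"'), because the " appears at the start and end of each substring, and I can't split on any other character either, because my assignment does not allow it.)
The whole reading part is: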
while ((line = bufferedReader.readLine()) != null) {
    System.out.println(line);
    while (line.length() > 0) {
        if (line.charAt(0) == ',' || line.charAt(0) == ' ') {
            // skip separators and leading spaces
            line = line.substring(1);
        }
        else {
            if (line.indexOf(',') != -1) {
                if (line.charAt(0) == '"') {
                    // quoted field: cut at the next ", even if that " is escaped
                    pabaiga = line.indexOf("\"", 1);
                    zodis = line.substring(0, pabaiga + 1);
                    line = line.substring(pabaiga + 1);
                    duomenys.add(zodis);
                }
                else {
                    // unquoted field: cut at the next comma
                    pabaiga = line.indexOf(',');
                    zodis = line.substring(0, pabaiga);
                    line = line.substring(pabaiga);
                    duomenys.add(zodis);
                }
            }
            else {
                // no comma left: the rest of the line is the last field
                zodis = line;
                line = line.substring(line.length());
                duomenys.add(zodis);
            }
        }
    }
    for (String elem : duomenys) {
        System.out.println(elem);
    }
    duomenys.clear();
}
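I am not allowed to split on the delimiter alone, because there may be a , in the middle of a string, and using \ in the text file is not an option. So it was suggested that I identify a string element as "text", but if the element contains another " in the middle, my current code does not work.
If the line from the text file is
"start \"title\" end", 10, 20, "text"
the string array should contain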
"start "title" end"
10
20
"text"
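Answer 0 (score: 0)
You can first collect the tokens in a List, since their number is not known in advance. To fill the list, iterate over every character of the line: if the character is not a comma, or it is a comma inside quotes, append it to tokenBuilder; if it is a comma outside quotes, add the current value of tokenBuilder to the tokens list. Here is sample code.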
String line = "\"start \\\"title\\\" end\", 10, 20, \"text\"";
List<String> tokens = new ArrayList<>();
StringBuilder tokenBuilder = new StringBuilder();
boolean insideQuote = false;
char ch, prev = ' ';
for (int i = 0; i < line.length(); i++) {
    ch = line.charAt(i);
    if (ch == '"' && prev != '\\') { // normal " (without \ before)
        insideQuote = !insideQuote;  // starts or ends quotation
    }
    // a comma outside quotes ends the current token
    if (ch == ',' && !insideQuote) {
        if (tokenBuilder.length() > 0) {
            tokens.add(tokenBuilder.toString().trim());
            tokenBuilder.setLength(0);
        }
    }
    // add every character to builder except \ that are inside
    // quotes and have " after it
    else if (!(ch == '\\' && i + 1 < line.length()
            && line.charAt(i + 1) == '"' && insideQuote)) {
        tokenBuilder.append(ch);
    }
    prev = ch; // in next loop previous character should be our current one
}
// the last token has no comma after it, so add it once the loop is done
if (tokenBuilder.length() > 0) {
    tokens.add(tokenBuilder.toString().trim());
}
String[] array = tokens.toArray(new String[tokens.size()]);
for (String s : array)
    System.out.println(">" + s);
Output:
>"start "title" end"
>10
>20
>"text"
Answer 1 (score: 0)
You can use this function (online example at http://ideone.com/TTtlZV):
import java.util.*;
import java.io.*;
/* Name of the class has to be "Main" only if the class is public. */
class Ideone
{
    public static void main (String[] args) throws java.lang.Exception
    {
        boolean inQuoted = false;
        List<String> parts = new ArrayList<String>();
        String s = "\"start \\\"title\\\" end\", 10, 20, \"text\"";
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            char cPrev = (i == 0 ? (char) 0 : s.charAt(i - 1));
            // a " that is not preceded by \ opens or closes a quoted part
            if (c == '"' && cPrev != '\\') {
                inQuoted = !inQuoted;
            }
            if (c == ',' && !inQuoted) {
                // comma outside quotes: finish the current part
                if (current.length() > 0) {
                    parts.add(current.toString());
                    current = new StringBuilder();
                }
            }
            else {
                // drop the \ in front of an escaped " before appending the "
                int length = current.length();
                if (length > 1 && c == '"' && current.charAt(length - 1) == '\\') {
                    current.deleteCharAt(length - 1);
                }
                current.append(c);
            }
        }
        if (current.length() > 0) {
            parts.add(current.toString());
        }
        System.out.println(parts);
    }
}
It does not handle double escapes, for example
\\"
If I run this program, the output is:
["start "title" end",  10,  20,  "text"]
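The double-escape limitation mentioned above comes from looking only at the previous character. One possible way around it is to track an explicit "escaped" state while scanning. The following is a minimal sketch, not the answer's own code: it assumes that inside quotes a backslash escapes whatever character follows it (so \" is a literal quote and \\ is a literal backslash), and the names EscapeAwareSplitter, split and addIfNotBlank are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class EscapeAwareSplitter {

    // Splits a line on commas that are outside double quotes.
    // Assumption: inside quotes, a backslash escapes the next character.
    static List<String> split(String line) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        boolean escaped = false; // true right after an unconsumed backslash

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (escaped) {
                current.append(c);   // take the escaped character literally
                escaped = false;
            } else if (c == '\\' && inQuotes) {
                escaped = true;      // drop the backslash, remember the escape
            } else if (c == '"') {
                inQuotes = !inQuotes;
                current.append(c);   // keep the surrounding quotes, as in the question
            } else if (c == ',' && !inQuotes) {
                addIfNotBlank(parts, current);
            } else {
                current.append(c);
            }
        }
        addIfNotBlank(parts, current); // last token has no trailing comma
        return parts;
    }

    // Trims the collected token, stores it if non-empty, and resets the builder.
    private static void addIfNotBlank(List<String> parts, StringBuilder current) {
        String token = current.toString().trim();
        if (!token.isEmpty()) {
            parts.add(token);
        }
        current.setLength(0);
    }

    public static void main(String[] args) {
        // File content: "start \"title\" end", 10, 20, "text"
        System.out.println(split("\"start \\\"title\\\" end\", 10, 20, \"text\""));
        // prints ["start "title" end", 10, 20, "text"]

        // File content: "ends with backslash\\", 5
        System.out.println(split("\"ends with backslash\\\\\", 5"));
        // prints ["ends with backslash\", 5]
    }
}
```

Because the escape is consumed as a state rather than inferred from the previous character, the quote after \\ in the second example correctly closes the field.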
Answer 2 (score: -1)
If you want the last index, just use lastIndexOf:
.lastIndexOf("\"", 1)
Simply replace
pabaiga = line.indexOf("\"", 1);
with
pabaiga = line.lastIndexOf("\"", 1);
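Before applying this replacement, it may help to check what these calls return on the sample line. In Java, the two-argument String.lastIndexOf searches backward starting at the given index, and the one-argument form returns the last occurrence in the whole string. The sketch below prints both, next to a hypothetical helper (indexOfUnescapedQuote, not part of any answer here) that looks for the first quote not preceded by an odd number of backslashes.

```java
public class QuoteSearch {

    // Hypothetical helper: index of the first " at or after 'from' that is
    // preceded by an even number of backslashes (i.e. not escaped), or -1.
    static int indexOfUnescapedQuote(String line, int from) {
        for (int i = from; i < line.length(); i++) {
            if (line.charAt(i) != '"') {
                continue;
            }
            int backslashes = 0;
            for (int j = i - 1; j >= 0 && line.charAt(j) == '\\'; j--) {
                backslashes++;
            }
            if (backslashes % 2 == 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // File content: "start \"title\" end", 10, 20, "text"
        String line = "\"start \\\"title\\\" end\", 10, 20, \"text\"";

        System.out.println(line.lastIndexOf("\"", 1));      // prints 0 (backward search from index 1)
        System.out.println(line.lastIndexOf("\""));         // prints 36 (last quote of the whole line)
        System.out.println(indexOfUnescapedQuote(line, 1)); // prints 20 (closing quote of the first field)
    }
}
```

On this input only the helper lands on the closing quote of the first field, so it could stand in for the indexOf call in the question's loop under the stated escaping assumption.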