Apex: parsing a CSV in which every record contains double quotes

Time: 2017-07-28 02:22:10

Tags: string csv salesforce apex double-quotes

public static List<List<String>> parseCSV(String contents,Boolean skipHeaders) {
List<List<String>> allFields = new List<List<String>>();

// replace instances where a double quote begins a field containing a comma
// in this case you get a double quote followed by a doubled double quote
// do this for beginning and end of a field
contents = contents.replaceAll(',"""',',"DBLQT').replaceall('""",','DBLQT",');
// now replace all remaining double quotes - we do this so that we can reconstruct
// fields with commas inside assuming they begin and end with a double quote
contents = contents.replaceAll('""','DBLQT');
// we are not attempting to handle fields with a newline inside of them
// so, split on newline to get the spreadsheet rows
List<String> lines = new List<String>();
try {
    lines = contents.split('\n');
} catch (System.ListException e) {
    System.debug('Limits exceeded?' + e.getMessage());
}
Integer num = 0;
for(String line : lines) {
    // check for blank CSV lines (only commas)
    if (line.replaceAll(',','').trim().length() == 0) break;

    List<String> fields = line.split(',');  
    List<String> cleanFields = new List<String>();
    String compositeField;
    Boolean makeCompositeField = false;
    for(String field : fields) {
        if (field.startsWith('"') && field.endsWith('"')) {
            cleanFields.add(field.replaceAll('DBLQT','"'));
        } else if (field.startsWith('"')) {
            makeCompositeField = true;
            compositeField = field;
        } else if (field.endsWith('"')) {
            compositeField += ',' + field;
            cleanFields.add(compositeField.replaceAll('DBLQT','"'));
            makeCompositeField = false;
        } else if (makeCompositeField) {
            compositeField +=  ',' + field;
        } else {
            cleanFields.add(field.replaceAll('DBLQT','"'));
        }
    }

    allFields.add(cleanFields);

}


if(skipHeaders)allFields.remove(0);

return allFields;       
}

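I use this code to parse CSV files, but I found that it cannot parse a CSV in which every field is enclosed in double quotes.

For example, I have a record like this: "a","b","c","d,e,f","g"

After parsing, I want to get: a b c d,e,f g

1 Answer:

Answer 0: (score: 0)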

From what I can see, the first thing you do is split each line you get from the CSV file on commas, using this line:

List<String> fields = line.split(',');

When you do that to your own example ("a","b","c","d,e,f","g"), the list of strings you get is:

fields = ("a" | "b" | "c" | "d | e | f" | "g"), where the bars separate the list elements

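The problem is that once you split on commas first, it becomes much harder to tell the commas that belong to a field (because they appear inside quotes) from the commas that separate your fields.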
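To make the problem concrete, here is a minimal Java sketch of the comma-first split applied to the example record. The class and method names (`CommaSplitDemo`, `naiveSplit`) are hypothetical and are not part of the original code; Java is used because Apex's `String.split` behaves similarly for this input.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the name is an assumption for illustration only.
class CommaSplitDemo {
    // Splits on every comma without looking at quotes,
    // which is the approach that causes the problem described above.
    static List<String> naiveSplit(String line) {
        return Arrays.asList(line.split(","));
    }

    public static void main(String[] args) {
        String line = "\"a\",\"b\",\"c\",\"d,e,f\",\"g\"";
        List<String> fields = naiveSplit(line);
        // Five fields were intended, but the quoted field "d,e,f"
        // is cut into three pieces, so this prints 7.
        System.out.println(fields.size());
        for (String field : fields) {
            System.out.println(field);
        }
    }
}
```

The fourth, fifth and sixth pieces are `"d`, `e` and `f"`, which then have to be stitched back together and stripped of their quotes.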

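I suggest splitting the line on the quote character instead, which gives you something like this:

fields = (a | , | b | , | c | , | d,e,f | , | g)

and then filtering out every list element that contains only commas and/or whitespace, which finally gives you:

fields = (a | b | c | d,e,f | g)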
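The two steps above (split on quotes, then drop the separator-only pieces) can be sketched in Java as follows. The names `QuoteSplitDemo` and `splitByQuotes` are hypothetical, and the sketch assumes every field in the line is wrapped in double quotes and contains no escaped quotes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; assumes every field is enclosed in double quotes.
class QuoteSplitDemo {
    static List<String> splitByQuotes(String line) {
        List<String> fields = new ArrayList<>();
        for (String piece : line.split("\"")) {
            // Drop pieces made up only of commas and/or whitespace.
            // This also drops the empty piece before the first quote.
            if (piece.replace(",", "").trim().isEmpty()) {
                continue;
            }
            fields.add(piece);
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "\"a\",\"b\",\"c\",\"d,e,f\",\"g\"";
        // Prints [a, b, c, d,e,f, g]: five fields, with "d,e,f" kept whole.
        System.out.println(splitByQuotes(line));
    }
}
```

One consequence of filtering by content is that a quoted field that is empty, or that itself holds only commas or spaces, is dropped as well, so this sketch only fits data where such fields cannot occur.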

(Edit)

Are you using Java? Either way, here is Java code that does what you are trying to do:

import java.util.*;

public class HelloWorld
{
    public static ArrayList<ArrayList<String>> parseCSV(String contents,Boolean skipHeaders) {
    ArrayList<ArrayList<String>> allFields = new ArrayList<ArrayList<String>>();

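    // split the file contents into lines; copy them into an ArrayList because
    // Arrays.asList returns a fixed-size list that does not support remove()
    List<String> lines = new ArrayList<String>(Arrays.asList(contents.split("\n")));

    // skip the header row, if requested
    if (skipHeaders && !lines.isEmpty()) lines.remove(0);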

    // for each line
    for(String line : lines) {
        List<String> fields = Arrays.asList(line.split("\""));  
        ArrayList<String> cleanFields = new ArrayList<String>();
        Boolean isComma = false; 
        for(String field : fields) {
          // ignore elements that don't have useful data
          // (every other element after splitting by quotes)
          isComma = !isComma;
          if (isComma) continue;

          cleanFields.add(field);
        }

        allFields.add(cleanFields);
    }

    return allFields;       
  }

  public static void main(String[] args)
  {
    // example of input file:
    // Line 1: "a","b","c","d,e,f","g"
    // Line 2: "a1","b1","c1","d1,e1,f1","g1"
    ArrayList<ArrayList<String>> strings = HelloWorld.parseCSV("\"a\",\"b\",\"c\",\"d,e,f\",\"g\"\n\"a1\",\"b1\",\"c1\",\"d1,e1,f1\",\"g1\"",false);
    System.out.println("Result:");
    for (ArrayList<String> list : strings) {
      System.out.println("  New List:");
      for (String str : list) {
        System.out.println("    - " + str);
      }
    }
  }
}
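The quote-splitting code above relies on every field being quoted and on no field containing a doubled double quote, which the code in the question tries to handle with its DBLQT placeholder. As one possible alternative (not part of the original answer), a line can be scanned character by character while tracking whether the scanner is inside quotes. The names `QuotedLineParser` and `parseLine` are hypothetical, and, like the question's code, this sketch does not handle newlines inside a field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a quote-aware parser for a single CSV line.
class QuotedLineParser {
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote means a literal quote
                        i++;                 // skip the second quote of the pair
                    } else {
                        inQuotes = false;    // closing quote of the field
                    }
                } else {
                    current.append(c);       // commas inside quotes are kept
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote of a field
            } else if (c == ',') {
                fields.add(current.toString()); // separator ends the field
                current.setLength(0);
            } else {
                current.append(c);           // character of an unquoted field
            }
        }
        fields.add(current.toString());      // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        // Prints [a, b, c, d,e,f, g]
        System.out.println(parseLine("\"a\",\"b\",\"c\",\"d,e,f\",\"g\""));
    }
}
```

Because fields are delimited by position rather than by content, empty quoted fields and unquoted fields are kept. The same loop should translate to Apex with minor changes, though that has not been verified here.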