Trying to parse a CSV file delimited by commas (,) with quoted fields

Time: 2018-03-05 19:55:30

Tags: java android csv delimiter

I haven't found much help online. I have a CSV file that I need to parse. The delimiter is a comma, but I want a comma to be ignored when it is part of a field, so I wrap such fields in quotes. My approach works fine when none of the fields contain a comma. However, when I add a comma to one of the fields, expecting it to still be treated as a single field, I get an ArrayIndexOutOfBoundsException. Here is my code, which I run in an AsyncTask. You will notice that I inserted the bare calls r.get(1); r.get(2); at the top of the loop. They are only there for testing; r.get(1) is the line that throws the error.

    class ParseCsvTask extends AsyncTask<File, Void, Void> {

        @Override
        protected void onPreExecute() {
            mProgressBar.setVisibility(View.VISIBLE);
        }

        @Override
        protected Void doInBackground(File... files) {
            BufferedReader reader = null;
            CSVParser parser = null;
            File file = files[0];
            CSVFormat formatter = CSVFormat.RFC4180.withFirstRecordAsHeader();

            try {
                reader = new BufferedReader(new FileReader(file));
                parser = CSVParser.parse(reader, formatter);
                List<CSVRecord> list = parser.getRecords();

                for (CSVRecord r : list) {
                    r.get(1);
                    r.get(2);
                    Competitor competitor = new Competitor(r.get(1), r.get(2));

                    if (!r.get(0).equals("")) {
                        competitor.setMemberNum(r.get(0));
                    }
                    if (!r.get(4).equals("")) {
                        competitor.setEmail(r.get(4));
                    }
                    if (!r.get(5).equals("")) {
                        competitor.setPhone(r.get(5));
                    }

                    switch (r.get(7)) {
                        case "":
                            competitor.setAge(Competitor.Age.ADULT);
                            break;
                        case "Junior":
                            competitor.setAge(Competitor.Age.JUNIOR);
                            break;
                        case "Senior":
                            competitor.setAge(Competitor.Age.SENIOR);
                            break;
                        case "Super Senior":
                            competitor.setAge(Competitor.Age.SUPER_SENIOR);
                            break;
                        default:
                            break;
                    }

                    if (r.get(8).equals("")) {
                        competitor.setLady(false);
                    } else {
                        competitor.setLady(true);
                    }
                    mImportedComps.add(competitor);
                }

                FileHelper.writeMasterCompetitorsFile(mContext, mImportedComps);
                Intent intent = new Intent(mContext, MasterCompetitorListActivity.class);
                startActivity(intent);

            } catch (Exception e) {
                e.printStackTrace();
                Log.d("record", "what is going on");
            } finally {
                try {
                    assert parser != null;
                    parser.close();
                    reader.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            return null;
        }

        @Override
        protected void onPostExecute(Void aVoid) {
            mProgressBar.setVisibility(View.INVISIBLE);
        }
    }

Keep in mind: when I don't use a comma inside a field, it works fine. "first name" works, but if the field says "first, name" I get the error. Also, I am using org.apache.commons.csv.
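For readers unsure what the quoting is supposed to achieve, here is a minimal plain-Java sketch of RFC 4180-style splitting for a single line. It is not how Commons CSV is implemented; QuotedLineSplitter and split are names made up for illustration, and the sketch assumes a field never spans more than one line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Commons CSV.
class QuotedLineSplitter {

    // Splits one CSV line on commas, treating commas inside double quotes as data.
    // A doubled quote ("") inside a quoted field stands for one literal quote.
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // escaped quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // includes commas inside quotes
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field on the line
        return fields;
    }

    public static void main(String[] args) {
        // prints [A9J41, Bob, Al,len, Bob Allen] (four fields, the third is "Al,len")
        System.out.println(split("A9J41,Bob,\"Al,len\",Bob Allen"));
    }
}
```

The point of the sketch is only that a quoted comma should never change the number of fields, so a record with too few values has to come from something other than the quoting.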

It was suggested that this question might be related to this post: Apache commons CSV: quoted input doesn't work. The error in that post is "invalid char between encapsulated token and delimiter", whereas my error is an array index out of bounds, which clearly indicates that we are dealing with different scenarios. I am not being told about any invalid char between delimiters. Something different is happening in my case.

Here is the stack trace I get when I catch this error:

    03-05 15:34:44.397 778-778/com.checkinsystems.ez_score D/ViewRootImpl@4ca832c[MasterCompetitorListActivity]: ViewPostImeInputStage processPointer 0
    03-05 15:34:44.479 778-778/com.checkinsystems.ez_score D/ViewRootImpl@4ca832c[MasterCompetitorListActivity]: ViewPostImeInputStage processPointer 1
    03-05 15:34:44.550 778-825/com.checkinsystems.ez_score W/System.err: java.lang.ArrayIndexOutOfBoundsException: length=1; index=1
    03-05 15:34:44.550 778-825/com.checkinsystems.ez_score W/System.err:     at org.apache.commons.csv.CSVRecord.get(CSVRecord.java:79)
    03-05 15:34:44.550 778-825/com.checkinsystems.ez_score W/System.err:     at com.checkinsystems.ez_score.ImportMasterCompsFileFragment$ParseCsvTask.doInBackground(ImportMasterCompsFileFragment.java:186)
    03-05 15:34:44.550 778-825/com.checkinsystems.ez_score W/System.err:     at com.checkinsystems.ez_score.ImportMasterCompsFileFragment$ParseCsvTask.doInBackground(ImportMasterCompsFileFragment.java:158)
    03-05 15:34:44.550 778-825/com.checkinsystems.ez_score W/System.err:     at android.os.AsyncTask$2.call(AsyncTask.java:304)
    03-05 15:34:44.550 778-825/com.checkinsystems.ez_score W/System.err:     at java.util.concurrent.FutureTask.run(FutureTask.java:237)
    03-05 15:34:44.550 778-825/com.checkinsystems.ez_score W/System.err:     at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:243)
    03-05 15:34:44.550 778-825/com.checkinsystems.ez_score W/System.err:     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1133)
    03-05 15:34:44.550 778-825/com.checkinsystems.ez_score W/System.err:     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:607)
    03-05 15:34:44.550 778-825/com.checkinsystems.ez_score W/System.err:     at java.lang.Thread.run(Thread.java:762)
    03-05 15:34:44.550 778-825/com.checkinsystems.ez_score D/record: what is going on

So I found out why the ArrayIndexOutOfBounds error was being thrown. I ran this code after getting the list:

    for (CSVRecord r : list) {
        Log.d("record", r.toString());
    }

I noticed that, for some reason, I get a blank record followed by the correct record. This pattern repeats, so I end up with twice as many records as I need, and every other one is blank, which is why I get the index problem. But I still can't understand why I get these blank records. Here is the onClick of the button that calls the code:

    @Override
    public void onClick(View view) {
        File file = new File(Environment.getExternalStoragePublicDirectory(Environment.DIRECTORY_DOWNLOADS).getAbsolutePath()
                + "/" + mFileName.getText().toString());

        new ParseCsvTask().execute(file);
    }

Here is some logcat output (I have changed the data to hide people's information):

    03-05 16:25:40.223 13019-13633/com.checkinsystems.ez_score D/record: CSVRecord [comment=null, mapping={member=0, first name=1, last name=2, name=3, email=4, phone=5, squad=6, age=7, gender=8, division=9, power factor=10, class=11, special =12}, recordNumber=1, values=[]]
    03-05 16:25:40.223 13019-13633/com.checkinsystems.ez_score D/record: CSVRecord [comment=null, mapping={member=0, first name=1, last name=2, name=3, email=4, phone=5, squad=6, age=7, gender=8, division=9, power factor=10, class=11, special =12}, recordNumber=2, values=[A9J41, Bob, Al,len, Bob Allen, bob@comcast.net, 5555555555, 7, , , Production, Minor, D, ]]
    03-05 16:25:40.223 13019-13633/com.checkinsystems.ez_score D/record: CSVRecord [comment=null, mapping={member=0, first name=1, last name=2, name=3, email=4, phone=5, squad=6, age=7, gender=8, division=9, power factor=10, class=11, special =12}, recordNumber=3, values=[]]
    03-05 16:25:40.223 13019-13633/com.checkinsystems.ez_score D/record: CSVRecord [comment=null, mapping={member=0, first name=1, last name=2, name=3, email=4, phone=5, squad=6, age=7, gender=8, division=9, power factor=10, class=11, special =12}, recordNumber=4, values=[TY912111, Fred , Jones , Fred Jones , fred@gmail.com, 5555555555, 5, , , Revolver, Minor, C, ]]

Keep in mind that this only happens when I add a comma in the middle of the last name in the first record. If I take that comma out, it works.

1 answer:

Answer 0 (score: 0)

I solved it! I was using a formatter based on the RFC4180 standard. By default, this format is configured as follows:

withDelimiter(',')
withQuote('"')
withRecordSeparator("\r\n")
withIgnoreEmptyLines(false)

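The last setting, withIgnoreEmptyLines, needs to be set to true. Otherwise empty lines are not skipped, and the parser gave me a blank record before each of my real records. I'm not sure why keeping empty lines as records is the default for this format, but I fixed it with this line: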

CSVFormat formatter = CSVFormat.RFC4180.withFirstRecordAsHeader()
                    .withIgnoreEmptyLines(true);

That is why I was getting the ArrayIndexOutOfBounds error.
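To make the failure mode concrete, here is a small plain-Java model of it. This is not Commons CSV code: BlankLineDemo and toRecords are hypothetical names, fields are split naively on commas with no quote handling, and the input is assumed to contain an empty line between rows. The model keeps an empty line as a record with a single empty value, which is an assumption chosen to match the length=1; index=1 in the stack trace.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration only; not part of Commons CSV.
class BlankLineDemo {

    // Turns raw text into records. When ignoreEmptyLines is false, an empty
    // line is kept as a record holding one empty value.
    static List<String[]> toRecords(String text, boolean ignoreEmptyLines) {
        List<String[]> records = new ArrayList<>();
        for (String line : text.split("\r?\n")) {
            if (line.isEmpty() && ignoreEmptyLines) {
                continue; // skip the empty line entirely
            }
            records.add(line.split(",", -1)); // "" becomes a one-element array
        }
        return records;
    }

    public static void main(String[] args) {
        String text = "A9J41,Bob,Allen\r\n\r\nTY912111,Fred,Jones\r\n";

        System.out.println(toRecords(text, false).size()); // prints 3
        System.out.println(toRecords(text, true).size());  // prints 2

        // The kept empty line has length 1, so index 1 is out of bounds.
        String[] blank = toRecords(text, false).get(1);
        try {
            System.out.println(blank[1]);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("blank record has " + blank.length + " value");
        }
    }
}
```

Independently of the format setting, the import loop could also check the record length before reading fixed indexes. CSVRecord offers size() and isConsistent() for that.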

I hope this helps someone else. Thanks to everyone for helping me work through this.