I'm using Duke for record linkage, and in a basic test I get this exception from CSVReader: java.lang.ArrayIndexOutOfBoundsException: 1000.
Here is my Java code:
Configuration config = ConfigLoader.load("resources/dukeConfiguration.xml");
Processor proc = new Processor(config);
proc.addMatchListener(new PrintMatchListener(true, true, true, false,
config.getProperties(),
true));
proc.link();
proc.close();
And this is the configuration file:
<duke>
<schema>
<threshold>0.7</threshold>
<property type="id">
<name>ID</name>
</property>
<property>
<name>TITLE</name>
<comparator>no.priv.garshol.duke.comparators.Levenshtein</comparator>
<low>0.09</low>
<high>0.93</high>
</property>
<property>
<name>ARTIST</name>
<comparator>no.priv.garshol.duke.comparators.Levenshtein</comparator>
<low>0.04</low>
<high>0.73</high>
</property>
</schema>
<group>
<jdbc>
<param name="driver-class" value="com.mysql.jdbc.Driver" />
<param name="connection-string" value="jdbc:mysql://localhost:3306/digitalmusic" />
<param name="user-name" value="root" />
<param name="password" value="root" />
<param name="query" value="select * from inventory" />
<column name="idsong" property="ID" />
<column name="title" property="TITLE" />
<column name="artist" property="ARTIST" />
</jdbc>
</group>
<group>
<csv>
<param name="input-file" value="/home/mongo.csv" />
<param name="header-line" value="false" />
<column name="1" property="ID" />
<column name="2" property="TITLE" />
<column name="3" property="ARTIST" />
</csv>
</group>
</duke>
Does anyone know what the problem is?
Stack trace:
Records: 0
Records: 40000
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1000
at no.priv.garshol.duke.utils.CSVReader.next(CSVReader.java:70)
at no.priv.garshol.duke.datasources.CSVDataSource$CSVRecordIterator.findNextRecord(CSVDataSource.java:170)
at no.priv.garshol.duke.datasources.CSVDataSource$CSVRecordIterator.next(CSVDataSource.java:198)
at no.priv.garshol.duke.datasources.CSVDataSource$CSVRecordIterator.next(CSVDataSource.java:111)
at no.priv.garshol.duke.Processor.linkRecords(Processor.java:362)
at no.priv.garshol.duke.Processor.link(Processor.java:319)
at no.priv.garshol.duke.Processor.link(Processor.java:298)
at no.priv.garshol.duke.Processor.link(Processor.java:285)
at duke.DukeCollecting.main(DukeCollecting.java:20)
Answer 0 (score: 1)
OK, here is your problem.
According to the latest source posted on GitHub, this is what happens when you instantiate a new CSVReader:
public CSVReader(Reader in, int buflen, String file) throws IOException {
this.buf = new char[buflen];
this.pos = 0;
this.len = in.read(buf, 0, buf.length);
this.tmp = new String[1000];
this.in = in;
this.separator = ','; // default
this.file = file;
}
According to your stack trace, the error occurs in this block:
if (escaped_quote)
tmp[colno++] = unescape(new String(buf, prev + 1, pos - prev - 1));
else
tmp[colno++] = new String(buf, prev + 1, pos - prev - 1);
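The problem is that colno in CSVReader has grown to 1000, one past the last valid index (999) of the tmp array allocated in the constructor, which produces the java.lang.ArrayIndexOutOfBoundsException. In other words, the reader believes it is looking at a single row with more than 1000 columns.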
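To make the failure mode concrete, here is a minimal sketch of a splitter that writes fields into a preallocated array. It is a simplified stand-in and not Duke's actual parser: the class name FixedBufferSplit and its split method are hypothetical, and the capacity is set to 3 rather than 1000 to keep the demo short.

```java
final class FixedBufferSplit {
    /**
     * Splits a line on commas into a preallocated array and returns the
     * number of fields written. No quote handling; illustration only.
     */
    static int split(String line, String[] tmp) {
        int colno = 0;
        int prev = -1;
        for (int pos = 0; pos <= line.length(); pos++) {
            if (pos == line.length() || line.charAt(pos) == ',') {
                // Throws ArrayIndexOutOfBoundsException once colno reaches tmp.length.
                tmp[colno++] = line.substring(prev + 1, pos);
                prev = pos;
            }
        }
        return colno;
    }

    public static void main(String[] args) {
        String[] tmp = new String[3]; // hypothetical capacity; Duke's source above uses 1000
        System.out.println("fields: " + split("1,Title,Artist", tmp));
        try {
            split("1,Title,Artist,extra", tmp); // one field too many for the buffer
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("overflow: " + e.getMessage());
        }
    }
}
```

A row with exactly as many fields as the buffer holds is fine; one extra field triggers the same exception type seen in the stack trace.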
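These are your options, IMHO:
Option 1: get the source code (fork the project), increase the size of the tmp buffer until the program runs properly, and recompile; or
Option 2: check the GitHub project page to see whether there is already an open issue about this (or just open one), and determine whether your data contains a formatting error that could be causing the array overflow.
I recommend option 2, unless you're in a hurry.
Good luck!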
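For the data check in option 2, a rough sketch like the following could help find suspicious rows in the CSV file before handing it to Duke. It rests on assumptions: the separator is a comma (the CSVReader default shown above), each record should have 3 fields (ID, TITLE, ARTIST per the configuration), and records do not span multiple lines. The class CsvRowCheck and its methods are hypothetical helpers, not part of Duke.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class CsvRowCheck {
    /** Counts fields in one line, treating commas inside double quotes as data. */
    static int countFields(String line) {
        int fields = 1;
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes; // a doubled quote ("") toggles twice, leaving the state unchanged
            } else if (c == ',' && !inQuotes) {
                fields++;
            }
        }
        return fields;
    }

    /** Returns true if the line contains an odd number of double quotes. */
    static boolean hasUnbalancedQuotes(String line) {
        int quotes = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == '"') {
                quotes++;
            }
        }
        return quotes % 2 != 0;
    }

    /** Returns the 1-based numbers of lines with an unexpected field count or unbalanced quotes. */
    static List<Integer> suspiciousLines(List<String> lines, int expectedFields) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            if (countFields(line) != expectedFields || hasUnbalancedQuotes(line)) {
                result.add(i + 1);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // ISO-8859-1 maps every byte to a character, so decoding never fails on odd input.
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        System.out.println("Lines read: " + lines.size());
        System.out.println("Suspicious lines: " + suspiciousLines(lines, 3));
    }
}
```

Run it with the path to the CSV file as the only argument. If the number of lines read is far lower than the number of records you expect, or many lines are flagged, the file's line endings or quoting are worth a closer look. A quoted field that legitimately contains a line break would also be flagged by this check.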