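I am currently exploring Apache Drill running in cluster mode. My data is in MongoDB, and the source collection contains 5 million documents. I cannot execute even a simple query: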
select body from mongo.twitter.tweets limit 10;
It throws an exception:
Query Failed: An Error Occurred
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: IndexOutOfBoundsException: index: 0, length: 264 (expected: range(0, 256))
Fragment 1:2
[Error Id: 8903127a-e9e9-407e-8afc-2092b4c03cf0 on test01.css.org:31010]
(java.lang.IndexOutOfBoundsException) index: 0, length: 264 (expected: range(0, 256))
    io.netty.buffer.AbstractByteBuf.checkIndex():1134
    io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes():272
    io.netty.buffer.WrappedByteBuf.setBytes():390
    io.netty.buffer.UnsafeDirectLittleEndian.setBytes():30
    io.netty.buffer.DrillBuf.setBytes():753
    io.netty.buffer.AbstractByteBuf.setBytes():510
    org.apache.drill.exec.store.bson.BsonRecordReader.writeString():265
    org.apache.drill.exec.store.bson.BsonRecordReader.writeToListOrMap():167
    org.apache.drill.exec.store.bson.BsonRecordReader.write():75
    org.apache.drill.exec.store.mongo.MongoRecordReader.next():186
    org.apache.drill.exec.physical.impl.ScanBatch.next():178
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.record.AbstractRecordBatch.next():109
    org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
    org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext():115
    org.apache.drill.exec.record.AbstractRecordBatch.next():162
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.record.AbstractRecordBatch.next():109
    org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
    org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():94
    org.apache.drill.exec.record.AbstractRecordBatch.next():162
    org.apache.drill.exec.physical.impl.BaseRootExec.next():104
    org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92
    org.apache.drill.exec.physical.impl.BaseRootExec.next():94
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():232
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():226
    java.security.AccessController.doPrivileged():-2
    javax.security.auth.Subject.doAs():422
    org.apache.hadoop.security.UserGroupInformation.doAs():1657
    org.apache.drill.exec.work.fragment.FragmentExecutor.run():226
    org.apache.drill.common.SelfCleaningRunnable.run():38
    java.util.concurrent.ThreadPoolExecutor.runWorker():1142
    java.util.concurrent.ThreadPoolExecutor$Worker.run():617
    java.lang.Thread.run():745
A query that does work and returns results:
select body from mongo.twitter.tweets where tweet_id = 'tag:search.twitter.com,2005:xxxxxxxxxx';
Sample document from the source collection:
{
    "_id" : ObjectId("58402ad5757d7fede822e641"),
    "rule_list" : [
        "x",
        "(contains:x (contains:y OR contains:y1)) OR (contains:v contains:b) OR (contains:v (contains:r OR contains:t))"
    ],
    "actor_friends_count" : 79,
    "klout_score" : 19,
    "actor_favorites_count" : 0,
    "actor_preferred_username" : "xxxxxxx",
    "sentiment" : "neg",
    "tweet_id" : "tag:search.twitter.com,2005:xxxxxxxxx",
    "object_actor_followers_count" : 1286,
    "actor_posted_time" : "2016-07-16T14:08:25.000Z",
    "actor_id" : "id:twitter.com:xxxxxxxx",
    "actor_display_name" : "xxxxx",
    "retweet_count" : 6,
    "hashtag_list" : [
        "myhashtag"
    ],
    "body" : "my tweet body",
    "actor_followers_count" : 25,
    "actor_status_count" : 243,
    "verb" : "share",
    "posted_time" : "2016-08-01T07:49:00.000Z",
    "object_actor_status_count" : 206,
    "lang" : "ar",
    "object_actor_preferred_username" : "xxxxxx",
    "original_tweet_id" : "tag:search.twitter.com,2005:xxxxxx",
    "gender" : "male",
    "object_actor_id" : "id:twitter.com:xxxxxxx",
    "favorites_count" : 0,
    "object_posted_time" : "2016-06-20T04:12:02.000Z",
    "object_actor_friends_count" : 2516,
    "generator_display_name" : "Twitter for iPhone",
    "object_actor_display_name" : "sdfsf",
    "actor_listed_count" : 0
}
Any help is appreciated!
Answer 0 (score: 0)
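Set store.mongo.bson.record.reader = false; The stack trace shows the exception being thrown inside BsonRecordReader.writeString(), and turning this option off should make Drill read the MongoDB documents without the BSON record reader.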
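As a rough sketch of how the option might be applied from a Drill SQL client (sqlline or the web UI), assuming your Drill version exposes store.mongo.bson.record.reader as a session/system option: the statements below change it for the current session and then retry the failing query from the question. The ALTER SYSTEM variant is shown only as a commented-out alternative for applying the setting cluster-wide.

```sql
-- Disable the BSON record reader for the current session only.
-- The option name contains dots, so it must be quoted with backticks.
ALTER SESSION SET `store.mongo.bson.record.reader` = false;

-- Alternative (assumption: you want the change on every drillbit in the cluster):
-- ALTER SYSTEM SET `store.mongo.bson.record.reader` = false;

-- Retry the query that failed in the question.
SELECT body FROM mongo.twitter.tweets LIMIT 10;
```

A session-level change is the safer first test, since it affects only your connection and disappears when the session ends.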