I have a very large Cassandra collection (one million documents), and I want to query all one million records in the users table. When I run the following query, it returns only about 10K records.
Please let me know what an efficient way is to query every document from a Cassandra collection.
I am using the https://www.npmjs.com/package/cassandra-driver npm package as the Cassandra driver.
Answer 0 (score: 2)
The reason you cannot retrieve all the data at once is that the driver reads results in pages: a single execute() call fetches at most one page of rows (the driver's default fetchSize is 5,000), so you only ever see the first portion of the table.
Take a look at the documentation for the stream or eachRow methods. Both let you page through the entire table and process the rows as they arrive.
client.stream(query, parameters, options)
  .on('readable', function () {
    // 'readable' is emitted as soon as a row is received and parsed
    let row;
    while ((row = this.read())) {
      // process row
    }
  })
  .on('end', function () {
    // emitted when all rows have been retrieved and read
  });
Or:
client.eachRow(query, parameters, { prepare: true, autoPage: true }, function (n, row) {
  // invoked for each row across all pages
}, callback);