Question

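I want to check whether an array exists in the db. So far I am doing that check with an if statement inside the cursor loop, but I guess that slows the query down. My code so far: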

EDIT:

import scipy.io
from scipy import sparse

lines = [line.rstrip() for line in open(input_file)]

print len(lines)
row_no = len(lines)
col_no = len(lines)
matrix = sparse.lil_matrix((len(lines), len(lines)))

no_row  = 0
counter = 0
for item in lines:
    # find the documents whose id appears in the lines list and that contain a follower list
    for cursor in collection.find({"_id.uid": int(item)}):
        if cursor['list_followers'] is None:
                continue
        else:               
            id = cursor['_id']['uid']
            counter+=1
            print counter
            print id
            name = cursor['screenname']
            # text.write('%s \n' %name)
            followers = cursor['list_followers']    
            print len(followers)
            for follower in followers:
                try:
                    if (follower in lines) and (len(followers)>0):
                        matrix[no_row, lines.index(follower)] = 1
                        print no_row, " ", lines.index(follower), " ", matrix[no_row, lines.index(follower)]
                except ValueError:
                    continue
            no_row+=1
            print no_row

scipy.io.mmwrite(output_file, matrix, field='integer')

In the end I found that the delay was caused by the creation of the sparse.lil_matrix.

Answer 1

The closest thing I can think of is to implement a sparse index and query a little differently. I'll construct a sample to demonstrate:

{ "a" : 1 }
{ "a" : 1, "b" : [ ] }
{ "a" : 1 }
{ "a" : 1, "b" : [ ] }
{ "b" : [ 1, 2, 3 ] }

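Basically, what you seem to be asking for is to get the last document as the match without scanning everything. This is where a different query and a sparse index help. First, the query: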

db.collection.find({ "b.0": { "$exists": 1 } })

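This returns only 1 item, because that is the only document with an existing array that has some content at the first index position. Now the index: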

db.collection.ensureIndex({ "b": 1 },{ "sparse": true })
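To make the effect of the sparse option concrete, here is a rough plain-Python emulation on the sample documents above. It is only an illustration of the idea, not how MongoDB implements indexes: the helper names `sparse_index_entries` and `first_element_exists` are made up for this sketch, and the `"b.0"` check is emulated for array values only.

```python
# Sample documents from the example above.
docs = [
    {"a": 1},
    {"a": 1, "b": []},
    {"a": 1},
    {"a": 1, "b": []},
    {"b": [1, 2, 3]},
]


def sparse_index_entries(documents, field):
    """Emulate a sparse index: keep only positions of documents that contain the field."""
    return [i for i, doc in enumerate(documents) if field in doc]


def first_element_exists(doc, field):
    """Emulate {"<field>.0": {"$exists": 1}} for an array-valued field."""
    value = doc.get(field)
    return isinstance(value, list) and len(value) > 0


# Only the documents that have "b" at all are candidates ...
candidates = sparse_index_entries(docs, "b")
# ... and of those, only the one with a non-empty array matches.
matches = [docs[i] for i in candidates if first_element_exists(docs[i], "b")]

print(len(candidates))  # 3
print(matches)          # [{'b': [1, 2, 3]}]
```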

Because of the nature of the query, the index has to be forced with .hint():

db.collection.find({ "b.0": { "$exists": 1 } }).hint({ "b": 1 }).explain()

This returns 1 document and only considers the 3 documents that actually have an array.
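Applied to the code in the question, the same idea would move the emptiness check out of the Python loop and into the query. The following is a minimal sketch under assumptions: `collection` is taken to be a pymongo collection (anything with a `find` method works here), the field names `_id.uid` and `list_followers` are copied from the question, and the function name `find_users_with_followers` is made up for illustration. Replacing the per-item `find` calls with a single `$in` query is also an assumption about what is wanted.

```python
def find_users_with_followers(collection, uids):
    """Query documents for the given uids whose list_followers array is non-empty."""
    query = {
        # Match all requested ids in one query instead of one find() per id.
        "_id.uid": {"$in": [int(uid) for uid in uids]},
        # "list_followers.0" exists only when the array has a first element,
        # so missing fields, null values and empty arrays are filtered out
        # by the server rather than by an if statement in the loop.
        "list_followers.0": {"$exists": True},
    }
    return collection.find(query)
```

With this in place, the loop body in the question would no longer need the `is None` test or the `len(followers) > 0` test, since every returned document has at least one follower.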

Check if an element exists in MongoDB
