How do I update Cassandra atomically when using DSE Search?

Time: 2016-02-25 02:02:07

Tags: solr cassandra datastax datastax-enterprise

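I have some code that periodically updates a table. On each run it should delete the existing records from the table and then insert the new ones.

The problem is that DSE Search leaves a gap while it indexes the table.

Here is the code: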

session_statis.execute('DELETE FROM statistics WHERE source = %s', [source])

timeone = datetime.now(tz) - timedelta(hours=1)

channels_rdd = channels.map(lambda x:(x.id,{'author':x.name,'category':x.category}))

article_rdd=rdd.map(lambda x:(x[1][0]['channel'],{'source':x[1][0]['source'],'id':x[1][0]['id'],'title':x[1][0]['title'],'thumbnail':x[1][0]['thumbnail'],'url':x[1][0]['url'],'created_at':x[1][0]['created_at'],'genre':x[1][0]['genre'],'reads':0,'likes':x[1][1]['attitudes'],'comments':x[1][1]['comments'],'shares':x[1][1]['reposts']})) \
                .join(channels_rdd).map(lambda x:{'source':x[1][0]['source'],'id':x[1][0]['id'],'title':x[1][0]['title'],'thumbnail':x[1][0]['thumbnail'],'url':x[1][0]['url'],'created_at':parse(x[1][0]['created_at']),'genre':x[1][0]['genre'],'reads':0,'likes':x[1][0]['likes'],'comments':x[1][0]['comments'],'shares':x[1][0]['shares'],'speed':x[1][0]['shares'],'category':x[1][1]['category'],'author':x[1][1]['author']})

result1=article_rdd.filter(lambda x:x['created_at']>=timeone).filter(lambda x:x['speed']>0).map(lambda x:{'timespan':'1','source':x['source'],'id':x['id'],'title':x['title'],'thumbnail':x['thumbnail'],'url':x['url'],'created_at':x['created_at'],'genre':x['genre'],'reads':0,'likes':x['likes'],'comments':x['comments'],'shares':x['shares'],'speed':x['shares'],'category':x['category'],'author':x['author']})

# Insert each collected row (a dict) into the statistics table.
for rdd in result1.collect():
    session_statis.execute('INSERT INTO statistics(source, timespan, id, title, thumbnail, url, created_at, category, genre, author, reads, likes, comments, shares, speed) values(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)', (rdd['source'],rdd['timespan'],rdd['id'],rdd['title'],rdd['thumbnail'],rdd['url'],rdd['created_at'],rdd['category'],rdd['genre'],rdd['author'],rdd['reads'],rdd['likes'],rdd['comments'],rdd['shares'],rdd['speed']))

Thanks for your reply.

1 Answer:

Answer 0: (score: 1)

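Depending on your usage pattern, you may have to consider the various consistency levels. For example, setting it to 1 (presumably consistency level ONE) should give good results.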
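As a minimal sketch of what setting the consistency level could look like with the DataStax Python driver, assuming `session` is an already connected `cassandra.cluster.Session`: the helper names `row_params` and `refresh_source` are hypothetical and not part of the question's code, and `ONE` is only one possible reading of the suggestion above. Setting a consistency level does not by itself make the delete-then-insert sequence atomic.

```python
# Column order used by the INSERT statement in the question.
COLUMNS = ("source", "timespan", "id", "title", "thumbnail", "url",
           "created_at", "category", "genre", "author",
           "reads", "likes", "comments", "shares", "speed")

DELETE_CQL = "DELETE FROM statistics WHERE source = %s"
INSERT_CQL = "INSERT INTO statistics ({}) VALUES ({})".format(
    ", ".join(COLUMNS), ", ".join(["%s"] * len(COLUMNS)))


def row_params(row):
    """Return the values of a row dict in the column order of INSERT_CQL."""
    return tuple(row[c] for c in COLUMNS)


def refresh_source(session, source, rows, level_name="ONE"):
    """Delete one source's rows, then re-insert them at an explicit consistency level.

    Assumption: `session` is a connected cassandra.cluster.Session.
    """
    # Imported here so the helpers above stay usable without the driver installed.
    from cassandra import ConsistencyLevel
    from cassandra.query import SimpleStatement

    level = getattr(ConsistencyLevel, level_name)  # e.g. ConsistencyLevel.ONE
    delete = SimpleStatement(DELETE_CQL, consistency_level=level)
    insert = SimpleStatement(INSERT_CQL, consistency_level=level)

    session.execute(delete, [source])
    for row in rows:
        session.execute(insert, row_params(row))
```

With the names from the question, a call might look like `refresh_source(session_statis, source, result1.collect())`. Passing `"QUORUM"` or another level name instead of `"ONE"` trades write latency for stronger guarantees; which level fits depends on the replication setup.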