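Riak MapReduce with JavaScript and the Python client

Time: 2018-11-03 21:18:57

Tags: javascript python mapreduce riak

Setup

Consider a simple Riak key-value database that stores city names and a few tags associated with each city. I am using the Python 3 client to create the bucket and add data: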

import riak

client = riak.RiakClient(pb_port=8087, protocol='pbc')
bucket = client.bucket('cities')

# Adding data to the bucket
bucket.new('tallinn', {'name': 'Tallinn', 'tags': ['architecture', 'food', 'port', 'forest']}).store()
bucket.new('riga', {'name': 'Riga', 'tags': ['food', 'architecture', 'forest']}).store()
bucket.new('vilnius', {'name': 'Vilnius', 'tags': ['beer', 'food', 'shopping']}).store()
bucket.new('kiev', {'name': 'Kiev'}).store()

I can then check what is in the bucket like this:

keys = client.get_keys(bucket)  # Get all keys from bucket
print('Keys:', keys)
for key in keys:
    article = bucket.get(key).data  # Get data by key from bucket
    print(article)
print(type(article))  # Check what is the type of object I get

Output:

Keys: ['tallinn', 'riga', 'kiev', 'vilnius']
{'name': 'Tallinn', 'tags': ['architecture', 'food', 'port', 'forest']}
{'name': 'Riga', 'tags': ['food', 'architecture', 'forest']}
{'name': 'Kiev'}
{'name': 'Vilnius', 'tags': ['beer', 'food', 'shopping']}
<class 'dict'>

As you can see, I get back exactly what I put in. And because each object is still a dictionary (<class 'dict'>), I can easily access any part of the data.

Problem

From this data, I want to use MapReduce to get the popularity of every tag that appears in the data, as a sorted list of tuples or lists:

[(3, 'food'), (2, 'forest'), (2, 'architecture'), (1, 'shopping'), (1, 'port'), (1, 'beer')]
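
For reference, the target list can be reproduced without Riak at all. The following is only a minimal local sketch, assuming the four records shown above are already available as plain dictionaries; `tag_popularity` is a hypothetical helper name, not part of the Riak client.

```python
from collections import Counter

# Same records as stored in the 'cities' bucket above (assumed to be loaded locally).
cities = [
    {'name': 'Tallinn', 'tags': ['architecture', 'food', 'port', 'forest']},
    {'name': 'Riga', 'tags': ['food', 'architecture', 'forest']},
    {'name': 'Vilnius', 'tags': ['beer', 'food', 'shopping']},
    {'name': 'Kiev'},  # this record has no 'tags' key
]


def tag_popularity(records):
    """Count tags across records and return (count, tag) tuples, largest first."""
    counts = Counter()
    for record in records:
        # Records without tags contribute nothing to the counts.
        counts.update(record.get('tags', []))
    # Sorting the tuples in reverse orders by count, then by tag name, descending.
    return sorted(((n, tag) for tag, n in counts.items()), reverse=True)


print(tag_popularity(cities))
```

This gives a known-good answer to compare any MapReduce result against, including the tie order between tags with equal counts.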

MapReduce

With the following code, I can get the list of tags from each key-value pair:

query = client.add('cities')

# JavaScript functions for the Map phase and the Reduce phase
js_func_map = "function(v) {var val = JSON.parse(v.values[0].data);"\
              "return[val.tags];}"
js_func_reduce = "function(values) {return values;}"

query.map(js_func_map)  # Add the JavaScript function to the Map phase
query.reduce(js_func_reduce)  # Add the JavaScript function to the Reduce phase

# Get the results from the query
for result in query.run():
    print(result)

However, this is still far from what I want:

['bear', 'architecture', 'forest']
None
['architecture', 'food', 'port', 'forest']
['bear', 'food', 'shopping']
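
The gap between these per-record lists and the desired counts is that the map phase emits raw tag lists (and None for a record without tags), while the reduce phase returns its input unchanged. One possible shape for the two phases is sketched below in plain Python, purely as a simulation: the map step emits one dictionary of tag counts per record, and the reduce step merges such dictionaries. The names `map_tags` and `reduce_counts` are hypothetical, and nothing here talks to a Riak node. The reduce step is written so that it can consume its own output, on the assumption that Riak may apply a reduce function more than once over partial results. Porting the same logic to the JavaScript strings passed to query.map() and query.reduce() is left as an untested assumption.

```python
def map_tags(record):
    """Map phase: emit one {tag: count} dict per record (empty if no tags)."""
    counts = {}
    for tag in record.get('tags') or []:
        counts[tag] = counts.get(tag, 0) + 1
    return [counts]


def reduce_counts(values):
    """Reduce phase: merge partial count dicts into a single dict.

    The output has the same shape as the input items, so the function
    can safely be applied again to its own results.
    """
    merged = {}
    for partial in values:
        for tag, n in partial.items():
            merged[tag] = merged.get(tag, 0) + n
    return [merged]


# Local stand-ins for the values stored in the 'cities' bucket.
records = [
    {'name': 'Tallinn', 'tags': ['architecture', 'food', 'port', 'forest']},
    {'name': 'Riga', 'tags': ['food', 'architecture', 'forest']},
    {'name': 'Vilnius', 'tags': ['beer', 'food', 'shopping']},
    {'name': 'Kiev'},
]

# Run the map step over every record and flatten the emitted lists.
mapped = [item for record in records for item in map_tags(record)]

# Simulate a re-reduce: reduce two batches separately, then reduce the partials.
partials = reduce_counts(mapped[:2]) + reduce_counts(mapped[2:])
totals = reduce_counts(partials)[0]

# The final ordering is done on the client side, after the reduce step.
ranking = sorted(((n, tag) for tag, n in totals.items()), reverse=True)
print(ranking)
```

Sorting is kept outside the reduce step in this sketch so that the reduce output stays a mergeable dictionary, which keeps repeated reduction harmless.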
