Splitting an ETS table in Erlang

Asked: 2012-08-09 17:51:30

Tags: erlang ets

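I am working on a course project about the MapReduce algorithm, so I built an ETS table in Erlang from a large data file and I want to process it concurrently. The resulting table is very large. Is there a way to split one big table into several smaller sub-tables so that I can search them concurrently using MapReduce? Thanks.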

2 Answers:

Answer 0 (score: 1)

You can search an ETS table concurrently without splitting it; see the read_concurrency option of ets:new/2:

http://www.erlang.org/doc/man/ets.html#new_2_read_concurrency
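As a rough illustration of the point above, here is a minimal sketch in which several processes read the same table in parallel. The module name, table name, data, and key ranges are all made up for the example; only ets:new/2, ets:insert/2, ets:lookup/2 and the read_concurrency option come from the ETS documentation.

```erlang
-module(ets_concurrent_scan).
-export([run/0]).

%% Hypothetical demo: one shared table, four reader processes.
run() ->
    %% 'protected' (the default) already lets other processes read;
    %% read_concurrency tunes the table for many simultaneous readers.
    Tab = ets:new(big_table, [set, protected, {read_concurrency, true}]),
    true = ets:insert(Tab, [{N, N * N} || N <- lists:seq(1, 1000)]),
    Parent = self(),
    %% Assumed partitioning: each worker handles its own key range ("map").
    Ranges = [{1, 250}, {251, 500}, {501, 750}, {751, 1000}],
    Pids = [spawn_link(fun() ->
                Sum = lists:sum([V || K <- lists:seq(Lo, Hi),
                                      {_, V} <- ets:lookup(Tab, K)]),
                Parent ! {self(), Sum}
            end) || {Lo, Hi} <- Ranges],
    %% Collect the partial sums from every worker ("reduce").
    Total = lists:sum([receive {Pid, S} -> S end || Pid <- Pids]),
    true = ets:delete(Tab),
    Total.
```

The table itself is never split; only the work is. Whether this is fast enough for a given data set is something you would have to measure.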

If the table is large, I suggest using a good match pattern to help cut down the amount of searching: http://www.erlang.org/doc/man/ets.html#select-2
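A small sketch of what such a match pattern could look like with ets:select/2. The table layout ({Name, Age, Sex} tuples) and the age filter are assumptions chosen for illustration, not something taken from the question.

```erlang
-module(ets_select_demo).
-export([run/0]).

%% Hypothetical demo: let ETS do the filtering instead of scanning in Erlang.
run() ->
    Tab = ets:new(people, [set]),
    true = ets:insert(Tab, [{"Joe", 31, "Male"},
                            {"Ann", 25, "Female"},
                            {"Mike", 19, "Male"}]),
    %% Match specification: bind name to '$1' and age to '$2',
    %% keep rows where age > 20, and return only the name.
    MatchSpec = [{{'$1', '$2', '_'}, [{'>', '$2', 20}], ['$1']}],
    %% A 'set' table has no defined order, so sort for a stable result.
    Names = lists:sort(ets:select(Tab, MatchSpec)),
    true = ets:delete(Tab),
    Names.
```

Because the guard runs inside ETS, only the matching names are copied into the calling process, which is usually the main saving on a big table.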

Answer 1 (score: 1)

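I once worked on an intranet application where I had to keep content in RAM most of the time. I built a stable caching library that abstracts the ETS mechanics for me. In this library I created worker gen_servers whose job is to create, own, and expose methods on ETS tables. I named them cache1 and cache2. The two keep handing table ownership over to each other redundantly, in case one of them runs into a problem. Get the application: http://www.4shared.com/zip/z_VgKLpa/cache-10.html   Just unzip it, recompile it using the Emakefile, and put it in your Erlang lib directory. To see how it works, here is a shell session.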

F:\programming work\cache-1.0>erl -pa ebin
Eshell V5.9  (abort with ^G)
1> application:start(cache).
ok
2> rd(student,{name,age,sex}).
student
3> cache_server:new(student,set,2).
ok
4> cache_server:write(#student{name = "Muzaaya Joshua",
                        sex = "Male",age = (2012 - 1987) }).
ok
5> cache_server:write(student,[#student{name = "Joe",sex = "Male"},
                #student{name = "Mike",sex = "Male"}]).
ok
6> cache_server:read({student,"Muzaaya Joshua"}).
[#student{name = "Muzaaya Joshua",age = 25,sex = "Male"}]
7> cache_server:read({student,"Joe"}).
[#student{name = "Joe",age = undefined,sex = "Male"}]
8> cache_server:get_tables().
[{cache1,[student]},{cache2,[]}]
9> rd(class,{class,no_of_students}).
class
10> cache_server:get_tables().
[{cache1,[student]},{cache2,[]}]
11> cache_server:new(class,set,2).
ok
12> cache_server:get_tables().
[{cache1,[student]},{cache2,[class]}]
13> cache_server:write(class,[
        #class{class = "Primary " ++ integer_to_list(N),
        no_of_students = random:uniform(50)} || N <- lists:seq(1,7)])
.
ok
14> cache_server:read({class,"Primary 6"}).
[#class{class = "Primary 6",no_of_students = 30}]
15> cache_server:delete({class,"Primary 2"}).
ok
16> cache_server:get_cache_state().
[{server_state,cache1,1,[student]},
 {server_state,cache2,1,[class]}]
17> rd(food,{name,type,value}).
food
18> cache_server:new(food,set,2).
ok
19> cache_server:write(food,[#food{name = "Orange",
                        type = "fruit",value = "Vitamin C"}]).
ok
20> cache_server:get_cache_state().
[{server_state,cache1,2,[food,student]},
 {server_state,cache2,1,[class]}]
21>
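Now, to understand the importance of ets:give_away/3, let's see what happens when cache1 or cache2 crashes. Remember, the current server state (showing the current owner of each table) is: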
21> cache_server:get_cache_state().
[{server_state,cache1,2,[food,student]},
 {server_state,cache2,1,[class]}]
22>
Let me crash cache1 and see what happens.
22> gen_server:cast(cache1,stop).
ok
        Cache Server: cache2 has taken over table: food from server: cache1
23>
        Cache Server: cache2 has taken over table: student from server: cache1
23> cache_server:get_cache_state().
[{server_state,cache1,0,[]},
 {server_state,cache2,3,[student,food,class]}]
24>
And the other one:
24> gen_server:cast(cache2,stop).
ok
        Cache Server: cache1 has taken over table: student from server: cache2
25>
        Cache Server: cache1 has taken over table: food from server: cache2
25>
        Cache Server: cache1 has taken over table: class from server: cache2
25> cache_server:get_cache_state().
[{server_state,cache1,3,[class,food,student]},
 {server_state,cache2,0,[]}]
26>
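The takeover messages above rely on table ownership moving between processes. The library's own code is not shown here, so the following is only a minimal sketch of ets:give_away/3 using plain processes instead of gen_servers; the module name, the `handover` gift data, and the `took_over` message are hypothetical.

```erlang
-module(ets_give_away_demo).
-export([run/0]).

%% Hypothetical demo: hand a table from the calling process to another one.
run() ->
    Parent = self(),
    %% The receiver waits for the 'ETS-TRANSFER' message that
    %% ets:give_away/3 delivers to the new owner.
    NewOwner = spawn(fun() ->
        receive
            {'ETS-TRANSFER', Tab, FromPid, GiftData} ->
                IsOwner = ets:info(Tab, owner) =:= self(),
                Parent ! {took_over, Tab, FromPid, GiftData, IsOwner},
                %% Stay alive so the table is not destroyed with its owner.
                receive stop -> ok end
        end
    end),
    Tab = ets:new(student, [set, public]),
    true = ets:insert(Tab, {"Joe", "Male"}),
    true = ets:give_away(Tab, NewOwner, handover),
    receive
        {took_over, Tab, From, handover, IsOwner} ->
            %% The data survived the transfer; the table is public,
            %% so the old owner can still read it.
            Rows = ets:lookup(Tab, "Joe"),
            NewOwner ! stop,
            {From =:= self(), IsOwner, Rows}
    after 1000 ->
        timeout
    end.
```

Note that give_away/3 must be called by the current owner while it is still alive. For an owner that dies unexpectedly, ETS also offers the `{heir, Pid, HeirData}` table option; which of the two the library uses is not stated in this answer.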
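That's it! You can use the concepts in the source code to build your own thing. The ETS tables created by this library are public and named, so you can access them directly with ETS functions.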
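To illustrate what direct access to a public, named table looks like, here is a small sketch. It does not use the library: the table name `student_demo` is invented, and keying the table on the record's `name` field is an assumption modelled on the shell session above.

```erlang
-module(ets_named_demo).
-export([run/0]).

-record(student, {name, age, sex}).

%% Hypothetical demo: any process can use a public named table by its name.
run() ->
    %% keypos points at the 'name' field, since element 1 of a record
    %% tuple is the record tag.
    student_demo = ets:new(student_demo,
                           [set, public, named_table,
                            {keypos, #student.name}]),
    Parent = self(),
    %% A different process writes to the table using only its name.
    spawn(fun() ->
        true = ets:insert(student_demo,
                          #student{name = "Joe", sex = "Male"}),
        Parent ! done
    end),
    receive done -> ok end,
    Rows = ets:lookup(student_demo, "Joe"),
    true = ets:delete(student_demo),
    Rows.
```

Bypassing the owning server like this is convenient for reads, but writes made this way skip whatever bookkeeping the server does, so treat it with care.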