I have designed an mnesia database with 5 different tables. The idea is to simulate queries from many nodes (computers), not just one. I can execute queries from the terminal, but I need help with how to make requests for information from several machines. I am testing scalability and want to investigate the performance of mnesia compared with other databases. Any idea will be highly appreciated.
Answer 0 (score: 8)
The best way to test mnesia is with an intensive threaded job both on the local Erlang node running mnesia and on remote nodes. Usually, you want remote nodes to use RPC calls in which read and write operations are executed against the mnesia tables. Of course, high concurrency comes with a trade-off: transaction speed will go down, and many transactions may be retried, because locks may be many at a given time; but mnesia will ensure that all processes receive {atomic,ok} for each transaction they execute.
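As a minimal sketch of such a remote load (the node name 'db@127.0.0.1' and the tuple layout here are assumptions for illustration; the key_value record is defined further down in this answer), a remote shell can push transactional writes at the mnesia node via rpc:call/4:

```erlang
%% Run from a remote Erlang node started with the same cookie
%% as the node hosting mnesia (assumed named 'db@127.0.0.1').
%% The tuple matches #key_value{key,value,instanceId,pid}.
remote_write(Key, Value) ->
    rpc:call('db@127.0.0.1', mnesia, transaction,
             [fun() ->
                  mnesia:write({key_value, Key, Value, undefined, self()})
              end]).
```

Each call returns the transaction result from the remote node, e.g. {atomic,ok}, so the caller can count successes versus retries.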
Concept
I suggest we have a non-blocking overload, in which writes and reads are directed at each mnesia table by as many processes as possible. We measure the time difference between the call to the write function and the moment our massive mnesia subscriber picks up the Write event. These events are sent by mnesia after each successful transaction, so we do not need to interrupt the working/overloading processes; instead, we let a "strong" mnesia subscriber wait for asynchronous events reporting successful deletes and writes.
The technique here is that we take a timestamp at the point just before calling the write function and note down the record key and the write CALL timestamp. Then our mnesia subscriber notes down the record key and the write/read EVENT timestamp. The difference between these two timestamps (let us call it the CALL-to-EVENT Time) gives us a rough idea of how loaded, or how efficient, we are. As locks increase with concurrency, we should register an increasing CALL-to-EVENT Time parameter. Processes doing writes (unlimited) will run concurrently, while those doing reads will also keep doing so without interruption. We will choose the number of processes for each operation, but first let us lay down the foundation for the whole test case.
All of the concepts above apply to local operations (processes running on the same node as Mnesia).
-> Simulating many nodes
Well, I have personally not simulated nodes in Erlang; I have always worked with real Erlang nodes, either on the same box or on several different machines in a networked environment. However, I advise that you look closely at this module: http://www.erlang.org/doc/man/slave.html, concentrate even more on this one: http://www.erlang.org/doc/man/ct_slave.html, and look at the following links, which discuss creating, simulating and controlling many nodes under another parent node ( http://www.erlang.org/doc/man/pool.html, Erlang: starting slave node, https://support.process-one.net/doc/display/ERL/Starting+a+set+of+Erlang+cluster+nodes ). I will not dive into the jungle of Erlang nodes here, because that too is a complex topic; instead, I will concentrate on tests on the same node running mnesia. I have laid out the mnesia test concept above; here, let us start implementing it.
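As a small sketch of the slave module mentioned above (assuming the parent shell was started as a distributed node, e.g. with -name parent@127.0.0.1, and that the host name resolves), extra nodes can be spun up from one shell like this:

```erlang
%% Spawn two slave nodes from the current (parent) node.
%% Each slave inherits the parent's cookie, connects back
%% automatically, and dies when the parent dies.
start_slaves() ->
    {ok, N1} = slave:start('127.0.0.1', slave1),
    {ok, N2} = slave:start('127.0.0.1', slave2),
    %% the new nodes should now appear in nodes/0
    [N1, N2].
```

From there, load-generating functions can be spawned on N1 and N2 just as on any real remote node.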
Now, first of all, you need to make a test plan for each table (separately). This should include both writes and reads. Then you need to decide whether you want to do dirty operations or transactional operations on the tables. You also need to test the speed of traversing an mnesia table in relation to its size. Let us take an example of a simple mnesia table:
-record(key_value,{key,value,instanceId,pid}).
We would want a general function for writing into our table, like this one below:
write(Record)->
    %% Use mnesia:activity/4 to test several activity
    %% contexts (and if your table is fragmented)
    %% like the commented code below
    %%
    %% mnesia:activity(
    %%    transaction, %% sync_transaction | async_dirty | ets | sync_dirty
    %%    fun(Y) -> mnesia:write(Y) end,
    %%    [Record],
    %%    mnesia_frag
    %% )
    mnesia:transaction(fun() -> ok = mnesia:write(Record) end).
And for our reads, we will have:
read(Key)->
    %% Use mnesia:activity/4 to test several activity
    %% contexts (and if your table is fragmented)
    %% like the commented code below
    %%
    %% mnesia:activity(
    %%    transaction, %% sync_transaction | async_dirty | ets | sync_dirty
    %%    fun(Y) -> mnesia:read({key_value,Y}) end,
    %%    [Key],
    %%    mnesia_frag
    %% )
    mnesia:transaction(fun() -> mnesia:read({key_value,Key}) end).

Now, we want to write very many records into our small table. We need a key generator. This key generator will be our own pseudo-random string generator. However, we need the generator to tell us the instant it generates a key, so that we record it; we want to see how long it takes to write a generated key. Let us put it down like this:
timestamp()-> erlang:now().

str(XX)-> integer_to_list(XX).

%% NOTE: the crypto application must be started for rand_uniform/2
generate_instance_id()->
    random:seed(now()),
    guid() ++
    str(crypto:rand_uniform(1, 65536 * 65536)) ++
    str(erlang:phash2({self(),make_ref(),time()})).

guid()->
    random:seed(now()),
    MD5 = erlang:md5(term_to_binary({self(),time(),node(),now(),make_ref()})),
    MD5List = binary_to_list(MD5),
    F = fun(N) -> io_lib:format("~2.16.0B", [N]) end,
    L = lists:flatten([F(N) || N <- MD5List]),
    %% tell our massive mnesia subscriber about this generation
    InstanceId = generate_instance_id(),
    mnesia_subscriber ! {self(),{key,write,L,timestamp(),InstanceId}},
    {L,InstanceId}.

To make very many concurrent writes, we need a function which will be executed by the many processes we will spawn. In this function, it is desirable NOT to put any blocking functions such as sleep/1, usually implemented as sleep(T)-> receive after T -> true end. Such a function would make a process's execution hang for the specified number of milliseconds. mnesia_tm does the lock control, retries, blocking, e.t.c. on behalf of the processes in order to avoid deadlocks. Let us say we want each process to write an unlimited amount of records. Here is our function:
-define(NO_OF_PROCESSES,20).

start_write_jobs()->
    [spawn(?MODULE,generate_and_write,[]) || _ <- lists:seq(1,?NO_OF_PROCESSES)],
    ok.

generate_and_write()->
    %% remember that in the function ?MODULE:guid/0,
    %% we inform our mnesia_subscriber about the generated key
    %% together with the timestamp of the generation, just before
    %% a write is made.
    %% The subscriber will note this down in a table and then
    %% wait for the mnesia Event about the write operation. It will
    %% then take the event timestamp and calculate the time difference.
    %% From there we can make a judgement on performance.
    %% In this case, we make the processes do unlimited writes
    %% into our mnesia tables. Our subscriber will trap the events as soon as
    %% a successful write is made in mnesia.
    %% For all keys we just write a Zero as the value.
    {Key,Instance} = guid(),
    write(#key_value{key = Key,value = 0,instanceId = Instance,pid = self()}),
    generate_and_write().
Likewise, let us see how the read jobs will be done. We will have a key provider; this key provider keeps spinning around the mnesia table, picking out only keys, and keeps rotating up and down the table forever. Here is its code:
first()-> mnesia:dirty_first(key_value).

next(FromKey)-> mnesia:dirty_next(key_value,FromKey).

%% sleep/1 as described above
sleep(T)-> receive after T -> true end.

start_key_picker()-> register(key_picker,spawn(fun() -> key_picker() end)).

key_picker()->
    try ?MODULE:first() of
        '$end_of_table' ->
            io:format("\n\tTable is empty, my dear !~n",[]),
            %% let's throw something in there to start with
            {K,_} = guid(),
            ?MODULE:write(#key_value{key = K,value = 0}),
            key_picker();
        Key -> wait_key_reqs(Key)
    catch
        EXIT:REASON ->
            error_logger:error_report(["Key Picker dies",{EXIT,REASON}]),
            exit({EXIT,REASON})
    end.

wait_key_reqs('$end_of_table')->
    receive
        {From,<<"get_key">>} ->
            Key = ?MODULE:first(),
            From ! {self(),Key},
            wait_key_reqs(?MODULE:next(Key));
        {_,<<"stop">>} -> exit(normal)
    end;
wait_key_reqs(Key)->
    receive
        {From,<<"get_key">>} ->
            From ! {self(),Key},
            NextKey = ?MODULE:next(Key),
            wait_key_reqs(NextKey);
        {_,<<"stop">>} -> exit(normal)
    end.

key_picker_rpc(Command)->
    try erlang:send(key_picker,{self(),Command}) of
        _ ->
            receive
                {_,Reply} -> Reply
            after timer:seconds(60) ->
                %% key_picker hung, or is too busy
                erlang:throw({key_picker,hanged})
            end
    catch
        _:_ ->
            %% key_picker dead
            start_key_picker(),
            sleep(timer:seconds(5)),
            key_picker_rpc(Command)
    end.

%% Now, this is where the reader processes will be
%% accessing keys. It will appear to them as though
%% it is random, because it is one process doing the
%% traversal. It will all be a game of chance,
%% depending on the scheduler's choice of
%% who gets the next read chance. Okay, let's get going below :)

get_key()->
    Key = key_picker_rpc(<<"get_key">>),
    %% let's report to our "massive" mnesia subscriber
    %% about a read which is about to happen,
    %% together with a timestamp.
    Instance = generate_instance_id(),
    mnesia_subscriber ! {self(),{key,read,Key,timestamp(),Instance}},
    {Key,Instance}.

Wow! Now we need to create the function with which we will start all the readers.
-define(NO_OF_READERS,10).

start_read_jobs()->
    [spawn(?MODULE,constant_reader,[]) || _ <- lists:seq(1,?NO_OF_READERS)],
    ok.

constant_reader()->
    {Key,InstanceId} = ?MODULE:get_key(),
    case ?MODULE:read(Key) of
        {atomic,[Record]} ->
            %% Tell mnesia_subscriber that a read has been done,
            %% so that it creates the event timestamp
            mnesia:report_event({read_success,Record,self(),InstanceId});
        _ -> ok
    end,
    constant_reader().
Now, the biggest part: the mnesia_subscriber !!! This is a simple process that subscribes to simple events. Get the mnesia events documentation from the Mnesia Users' Guide. Here is the mnesia subscriber:
-record(read_instance,{
    instance_id,
    before_read_time,
    after_read_time,
    read_time           %% after_read_time - before_read_time
}).

-record(write_instance,{
    instance_id,
    before_write_time,
    after_write_time,
    write_time          %% after_write_time - before_write_time
}).

-record(benchmark,{
    id,                 %% {pid(),Key}
    read_instances = [],
    write_instances = []
}).

subscriber()->
    mnesia:subscribe({table,key_value,simple}),
    %% let's also subscribe for system
    %% events, because events passing through
    %% mnesia:report_event/1 will go via
    %% system events.
    mnesia:subscribe(system),
    wait_events().

-include_lib("stdlib/include/qlc.hrl").

wait_events()->
    receive
        {From,{key,write,Key,TimeStamp,InstanceId}} ->
            %% A process is just about to call
            %% mnesia:write/1, so we note this down
            Fun = fun() ->
                    case qlc:e(qlc:q([X || X <- mnesia:table(benchmark),
                                           X#benchmark.id == {From,Key}])) of
                        [] ->
                            ok = mnesia:write(#benchmark{
                                    id = {From,Key},
                                    write_instances = [
                                        #write_instance{
                                            instance_id = InstanceId,
                                            before_write_time = TimeStamp
                                        }]
                                    }),
                            ok;
                        [Here] ->
                            WIs = Here#benchmark.write_instances,
                            NewInstance = #write_instance{
                                    instance_id = InstanceId,
                                    before_write_time = TimeStamp
                                },
                            ok = mnesia:write(Here#benchmark{write_instances = [NewInstance|WIs]}),
                            ok
                    end
                  end,
            mnesia:transaction(Fun),
            wait_events();
        {mnesia_table_event,{write,#key_value{key = Key,instanceId = I,pid = From},_ActivityId}} ->
            %% A process has successfully made a write. So we look it up,
            %% get the timestamp difference, and finish benchmarking that write
            WriteTimeStamp = timestamp(),
            F = fun()->
                    [Here] = mnesia:read({benchmark,{From,Key}}),
                    WIs = Here#benchmark.write_instances,
                    {value,WriteInstance} = lists:keysearch(I,2,WIs),
                    BeforeTmStmp = WriteInstance#write_instance.before_write_time,
                    NewWI = WriteInstance#write_instance{
                                after_write_time = WriteTimeStamp,
                                write_time = time_diff(WriteTimeStamp,BeforeTmStmp)
                            },
                    ok = mnesia:write(Here#benchmark{write_instances = [NewWI|lists:keydelete(I,2,WIs)]}),
                    ok
                end,
            mnesia:transaction(F),
            wait_events();
        {From,{key,read,Key,TimeStamp,InstanceId}} ->
            %% A process is just about to do a read
            %% using mnesia:read/1, so we note this down
            Fun = fun()->
                    case qlc:e(qlc:q([X || X <- mnesia:table(benchmark),
                                           X#benchmark.id == {From,Key}])) of
                        [] ->
                            ok = mnesia:write(#benchmark{
                                    id = {From,Key},
                                    read_instances = [
                                        #read_instance{
                                            instance_id = InstanceId,
                                            before_read_time = TimeStamp
                                        }]
                                    }),
                            ok;
                        [Here] ->
                            RIs = Here#benchmark.read_instances,
                            NewInstance = #read_instance{
                                    instance_id = InstanceId,
                                    before_read_time = TimeStamp
                                },
                            ok = mnesia:write(Here#benchmark{read_instances = [NewInstance|RIs]}),
                            ok
                    end
                  end,
            mnesia:transaction(Fun),
            wait_events();
        {mnesia_system_event,{mnesia_user,{read_success,#key_value{key = Key},From,I}}} ->
            %% A process has successfully made a read. So we look it up,
            %% get the timestamp difference, and finish benchmarking that read
            ReadTimeStamp = timestamp(),
            F = fun()->
                    [Here] = mnesia:read({benchmark,{From,Key}}),
                    RIs = Here#benchmark.read_instances,
                    {value,ReadInstance} = lists:keysearch(I,2,RIs),
                    BeforeTmStmp = ReadInstance#read_instance.before_read_time,
                    NewRI = ReadInstance#read_instance{
                                after_read_time = ReadTimeStamp,
                                read_time = time_diff(ReadTimeStamp,BeforeTmStmp)
                            },
                    ok = mnesia:write(Here#benchmark{read_instances = [NewRI|lists:keydelete(I,2,RIs)]}),
                    ok
                end,
            mnesia:transaction(F),
            wait_events();
        _ -> wait_events()
    end.
time_diff(After,Before)->
    %% both timestamps come from erlang:now();
    %% timer:now_diff/2 handles the carry across the
    %% {MegaSecs,Secs,MicroSecs} fields (which naive
    %% element-wise subtraction would get wrong) and
    %% returns the difference in microseconds.
    timer:now_diff(After,Before).
Okay! That was huge :) So we are done with the subscriber. We need to put the code all together and run the necessary tests.
install()->
    mnesia:stop(),
    mnesia:delete_schema([node()]),
    mnesia:create_schema([node()]),
    mnesia:start(),
    {atomic,ok} = mnesia:create_table(key_value,[
        {attributes,record_info(fields,key_value)},
        {disc_copies,[node()]}
    ]),
    {atomic,ok} = mnesia:create_table(benchmark,[
        {attributes,record_info(fields,benchmark)},
        {disc_copies,[node()]}
    ]),
    mnesia:stop(),
    ok.
start()->
    mnesia:start(),
    ok = mnesia:wait_for_tables([key_value,benchmark],timer:seconds(120)),
    %% boot up our subscriber
    register(mnesia_subscriber,spawn(?MODULE,subscriber,[])),
    start_write_jobs(),
    start_key_picker(),
    start_read_jobs(),
    ok.
Now, with proper analysis of the benchmark table records, you will get records of the average read times, average write times, e.t.c., and you can plot a graph of these times against an increasing number of processes. As we increase the number of processes, you will discover that the read and write times increase. Get the code, read it and play with it. You may not use all of it, but I am sure you will pick up new concepts from it, just as others here are sending in their solutions. Using mnesia events is the best way to test mnesia reads and writes without blocking the processes doing the actual writing or reading. In the example above, the read and write processes are not under any control; in fact, they will run forever until you terminate the VM. You can traverse the benchmark table with a good formula to extract the read and write times of each read or write instance, and then compute averages, variances, e.t.c.
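As one possible sketch of that analysis (assuming time_diff/2 yields an integer number of microseconds and the benchmark table is as defined above; the function name is mine), a fold over the table can produce the mean write time:

```erlang
%% Compute the average write_time (in microseconds) across all
%% completed write instances in the benchmark table. Instances
%% whose write_time is still 'undefined' (event not yet seen by
%% the subscriber) are skipped.
average_write_time()->
    F = fun()->
            mnesia:foldl(
                fun(#benchmark{write_instances = WIs},{Sum,Count})->
                    lists:foldl(
                        fun(#write_instance{write_time = T},{S,C}) when is_integer(T) ->
                                {S + T, C + 1};
                           (_,Acc) -> Acc
                        end, {Sum,Count}, WIs)
                end, {0,0}, benchmark)
        end,
    case mnesia:transaction(F) of
        {atomic,{_,0}}       -> no_complete_writes;
        {atomic,{Sum,Count}} -> Sum / Count
    end.
```

The same shape of fold, applied to read_instances, gives the average read time; calling it periodically while increasing ?NO_OF_PROCESSES produces the data points for the graph mentioned above.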
So the concepts behind mnesia can only be compared with Ericsson's NDB database: http://igorrs.blogspot.com/2009/11/consistent-hashing-for-mnesia-fragments.html, but not with existing RDBMSs, document-oriented databases, e.t.c. Those are my thoughts :) Let us wait and see what others have to say...
Answer 1 (score: 0)
Start additional nodes with a command like the following:
erl -name test1@127.0.0.1 -cookie devel \
-mnesia extra_db_nodes "['devel@127.0.0.1']"\
-s mnesia start
where 'devel@127.0.0.1' is the node on which mnesia is already set up. In this case all tables will be accessed from the remote node, but you can make local copies with mnesia:add_table_copy/3.
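For instance (the table name here refers to the key_value table from the first answer; the storage type is an illustrative choice), a local RAM copy could be added from the new node like so:

```erlang
%% Run on 'test1@127.0.0.1' after it has joined the cluster.
%% This replicates the key_value table onto the local node,
%% held in RAM; use disc_copies for a disk-backed replica.
{atomic, ok} = mnesia:add_table_copy(key_value, node(), ram_copies).
```

Reads on this node will then be served locally instead of going over the network to 'devel@127.0.0.1'.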
Then you can use spawn/2 or spawn/4 to start generating load on all the nodes, for example:
lists:foreach(fun(N) ->
                  spawn(N, fun () ->
                               %% generate some load
                               ok
                           end)
              end,
              [ 'test1@127.0.0.1', 'test2@127.0.0.1' ]).