I was using a map in some code to store ordered data. I found out that for huge maps, destruction could take a while. In the code I had, replacing map with vector<pair> reduced processing time by a factor of 10000...
In the end, I was so surprised that I decided to compare the performance of map with that of a sorted vector of pair.
And I'm surprised because I could not find a situation where map was faster than a sorted vector of pair (filled randomly and later sorted)... There must be some situations where map is faster; otherwise, what's the point of providing this class?
Here is what I tested:
Test one, compare map filling and destroying vs. vector filling, sorting (because I want a sorted container) and destroying:
#include <iostream>
#include <time.h>
#include <cstdlib>
#include <map>
#include <vector>
#include <algorithm>
int main(void)
{
    clock_t tStart = clock();
    {
        std::map<float,int> myMap;
        for ( int i = 0; i != 10000000; ++i )
        {
            myMap[ ((float)std::rand()) / RAND_MAX ] = i;
        }
    } // the map is destroyed here, so its destruction is included in the timing
    std::cout << "Time taken by map: " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;
    tStart = clock();
    {
        std::vector< std::pair<float,int> > myVect;
        for ( int i = 0; i != 10000000; ++i )
        {
            myVect.push_back( std::make_pair( ((float)std::rand()) / RAND_MAX, i ) );
        }
        // sort the vector, as we want a sorted container:
        std::sort( myVect.begin(), myVect.end() );
    } // the vector is destroyed here, so its destruction is included in the timing
    std::cout << "Time taken by vect: " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;
    return 0;
}
Compiled with g++ main.cpp -O3 -o main and got:
Time taken by map: 21.7142
Time taken by vect: 7.94725
The map is almost 3 times slower...
Then, I said, "OK, vector is faster to fill and sort, but search will be faster with the map"....so I tested:
#include <iostream>
#include <time.h>
#include <cstdlib>
#include <map>
#include <vector>
#include <algorithm>
int main(void)
{
    clock_t tStart = clock();
    {
        std::map<float,int> myMap;
        float middle = 0;
        float last;
        for ( int i = 0; i != 10000000; ++i )
        {
            last = ((float)std::rand()) / RAND_MAX;
            myMap[ last ] = i;
            if ( i == 5000000 )
                middle = last; // element we will later search
        }
        std::cout << "Map created after " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;
        float sum = 0;
        for ( int i = 0; i != 10; ++i )
            sum += myMap[ last ]; // search it
        std::cout << "Sum is " << sum << std::endl;
    }
    std::cout << "Time taken by map: " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;
    tStart = clock();
    {
        std::vector< std::pair<float,int> > myVect;
        std::pair<float,int> middle;
        std::pair<float,int> last;
        for ( int i = 0; i != 10000000; ++i )
        {
            last = std::make_pair( ((float)std::rand()) / RAND_MAX, i );
            myVect.push_back( last );
            if ( i == 5000000 )
                middle = last; // element we will later search
        }
        std::sort( myVect.begin(), myVect.end() );
        std::cout << "Vector created after " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;
        float sum = 0;
        for ( int i = 0; i != 10; ++i )
            sum += (std::find( myVect.begin(), myVect.end(), last ))->second; // search it
        std::cout << "Sum is " << sum << std::endl;
    }
    std::cout << "Time taken by vect: " << ((double)(clock() - tStart)/CLOCKS_PER_SEC) << std::endl;
    return 0;
}
Compiled with g++ main.cpp -O3 -o main and got:
Map created after 19.5357
Sum is 1e+08
Time taken by map: 21.41
Vector created after 7.96388
Sum is 1e+08
Time taken by vect: 8.31741
Even search is apparently faster with the vector (10 searches with the map took almost 2 sec, and it took less than half a second with the vector)....
So: is map simply a class to avoid, or are there really situations where map offers good performance?
Answer 0 (score: 3)
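In general, a map will be better if you are doing a lot of insertions and deletions. If you build the data structure once and then only do lookups, a sorted vector will almost certainly be faster, if only because of processor cache effects. Since insertions and deletions at arbitrary positions in a vector are O(n) instead of O(log n), there will come a point where those become the limiting factor.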
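To make the trade-off concrete, here is a minimal sketch of a sorted vector used as a lookup table. The names Entry, keyLess, findSorted and insertSorted are hypothetical helpers introduced for illustration; they do not come from the question. The sketch assumes the vector is kept sorted by key at all times.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Same element type as in the question: (key, value).
using Entry = std::pair<float, int>;

// Compare an entry against a bare key, so lookups do not need the stored value.
inline bool keyLess(const Entry& e, float key) { return e.first < key; }

// Binary search in a vector sorted by key: O(log n) comparisons over
// contiguous memory. Returns a pointer to the entry, or nullptr if absent.
inline const Entry* findSorted(const std::vector<Entry>& v, float key)
{
    auto it = std::lower_bound(v.begin(), v.end(), key, keyLess);
    return (it != v.end() && it->first == key) ? &*it : nullptr;
}

// Insert while keeping the vector sorted: locating the position is O(log n),
// but shifting every later element makes the whole operation O(n).
inline void insertSorted(std::vector<Entry>& v, float key, int value)
{
    auto it = std::lower_bound(v.begin(), v.end(), key, keyLess);
    v.insert(it, Entry(key, value));
}
```

If the data is built once, pushing everything back and calling std::sort a single time (as the question does) is cheaper than calling insertSorted repeatedly; insertSorted is only there to show where the O(n) cost comes from.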
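Answer 1 (score: 1)
std::find has linear time complexity, while a map search has log N complexity.
When you find that one algorithm is 100000 times faster than another, you should become suspicious! Your benchmark is invalid.
You need to compare realistic variants. You probably meant to compare the map with a binary search. Run each of those variants for at least 1 second of CPU time so that you can realistically compare the results.
When a benchmark returns a time of "0.00001 seconds", you are well within the noise of clock inaccuracy. That number means nothing.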
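One way to follow this advice is sketched below: a small timing helper that keeps calling an operation until a minimum amount of CPU time has elapsed, plus two O(log N) lookups to compare. All names here (secondsPerCall, lookupMap, lookupSortedVector) are hypothetical, and returning -1 for a missing key is an assumption that only works because the question stores non-negative values.

```cpp
#include <algorithm>
#include <ctime>
#include <map>
#include <utility>
#include <vector>

using Entry = std::pair<float, int>;

// Calls op() in batches of 1000 until at least minSeconds of CPU time has
// elapsed, then returns the average CPU time per call in seconds.
template <typename Op>
double secondsPerCall(Op op, double minSeconds = 1.0)
{
    const std::clock_t start = std::clock();
    long calls = 0;
    double elapsed = 0.0;
    do {
        for (int i = 0; i != 1000; ++i) op();
        calls += 1000;
        elapsed = static_cast<double>(std::clock() - start) / CLOCKS_PER_SEC;
    } while (elapsed < minSeconds);
    return elapsed / calls;
}

// O(log N) lookup in a std::map; returns -1 (assumed sentinel) if absent.
inline int lookupMap(const std::map<float, int>& m, float key)
{
    auto it = m.find(key);
    return it != m.end() ? it->second : -1;
}

// O(log N) lookup in a vector sorted by key, using binary search instead of
// the linear std::find; returns -1 (assumed sentinel) if absent.
inline int lookupSortedVector(const std::vector<Entry>& v, float key)
{
    auto it = std::lower_bound(v.begin(), v.end(), key,
        [](const Entry& e, float k) { return e.first < k; });
    return (it != v.end() && it->first == key) ? it->second : -1;
}
```

A call such as secondsPerCall([&] { sink += lookupMap(myMap, key); }) with sink declared volatile is one way to keep the compiler from optimizing the lookups away. Building and destroying the containers stays outside the timed operation, so only the search itself is measured.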