To save the data to disk without building a columnar database, do the following:
// Math lives in java.lang, so no import is needed.
public int countDigits(int num) {
    int count = 0;
    String numF = String.valueOf(num);
    // Walk the decimal string; Math.log10 gives no usable bound for 0 or negative numbers.
    for (int j = 0; j <= 9; j++) { // loop over each digit value 0-9
        for (int i = 0; i < numF.length(); i++) { // check each character of the number
            if (numF.charAt(i) == (char) ('0' + j)) {
                count++;
            }
        }
    }
    return count; // total number of digit characters (a leading '-' is not counted)
}
I am just wondering which one is the most efficient in terms of speed. Thanks.
Answer 0: (score: 1)
I would consider Feather or HDF5. MySQL or PostgreSQL might also be an option, depending on how you are going to query the data...
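To make the relational option a bit more concrete, here is a minimal sketch of the "index the columns you filter on" idea. It uses SQLite only as a stand-in for MySQL or PostgreSQL, because it ships with Python; the table name `test`, the index name `idx_test_a`, the file name and the generated rows are all assumptions made for illustration, not part of the original data.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file in a temporary directory (SQLite stands in
# for MySQL/PostgreSQL here; it is not one of the options named above).
db_path = os.path.join(tempfile.mkdtemp(), 'test.db')
conn = sqlite3.connect(db_path)

# Same column layout as the demo frame: three integers and a long string.
conn.execute('CREATE TABLE test (a INTEGER, b INTEGER, c INTEGER, txt TEXT)')
rows = [(i, i * 2, i * 3, 'X' * 300) for i in range(10_000)]
conn.executemany('INSERT INTO test VALUES (?, ?, ?, ?)', rows)

# Index the column used in WHERE clauses so range lookups avoid a full scan.
conn.execute('CREATE INDEX idx_test_a ON test (a)')
conn.commit()

# Fetch a narrow range on the indexed column.
matches = conn.execute(
    'SELECT a, b, c FROM test WHERE a BETWEEN ? AND ? ORDER BY a',
    (100, 104),
).fetchall()
conn.close()
```

Whether a database server is faster than a file format depends on the access pattern: it tends to pay off for selective queries like the one above, less so for reading the whole table back at once.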
Here is a demo for HDF5:
In [33]: df = pd.DataFrame(np.random.randint(0, 10**6, (10**4, 3)), columns=list('abc'))
In [34]: df['txt'] = 'X' * 300
In [35]: df
Out[35]:
a b c txt
0 689347 129498 770470 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX...
1 954132 97912 783288 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX...
2 40548 938326 861212 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX...
3 869895 39293 242473 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX...
4 938918 487643 362942 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX...
...
In [37]: df.to_hdf('c:/temp/test_str.h5', 'test', format='t', data_columns=['a', 'c'])
In [38]: store = pd.HDFStore('c:/temp/test_str.h5')
In [39]: store.get_storer('test').table
Out[39]:
/test/table (Table(10000,)) ''
  description := {
  "index": Int64Col(shape=(), dflt=0, pos=0),
  "values_block_0": Int32Col(shape=(1,), dflt=0, pos=1),
  "values_block_1": StringCol(itemsize=300, shape=(1,), dflt=b'', pos=2), # <---- NOTE
  "a": Int32Col(shape=(), dflt=0, pos=3),
  "c": Int32Col(shape=(), dflt=0, pos=4)}
  byteorder := 'little'
  chunkshape := (204,)
  autoindex := True
  colindexes := {
    "index": Index(6, medium, shuffle, zlib(1)).is_csi=False,
    "a": Index(6, medium, shuffle, zlib(1)).is_csi=False,
    "c": Index(6, medium, shuffle, zlib(1)).is_csi=False}
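Because `a` and `c` were stored as data columns and are indexed, they can be used to filter rows while reading, so only the matching part of the file is loaded. The following is a rough sketch of that, assuming pandas with PyTables installed; the temporary path and the threshold values are made up for illustration, and the frame is regenerated with random numbers, so the row counts will differ from run to run.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the demo (random values).
df = pd.DataFrame(np.random.randint(0, 10**6, (10**4, 3)), columns=list('abc'))
df['txt'] = 'X' * 300

# Hypothetical file location in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'test_str.h5')
df.to_hdf(path, key='test', format='table', data_columns=['a', 'c'])

# The `where` clause may only reference the index and the data columns
# ('a' and 'c' here); 'b' and 'txt' are stored in blocks and cannot be filtered on.
subset = pd.read_hdf(path, key='test', where='(a > 500000) & (c < 100000)')

# The same filter applied in memory, for comparison.
expected = df[(df['a'] > 500000) & (df['c'] < 100000)]
```

If you also need to filter on `b` or `txt`, they would have to be listed in `data_columns` as well, at the cost of a larger file and slower writes.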