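Given a file of resolution-compressed binary data, I would like to convert sub-byte groups of bits into their integer representations in Python. By this I mean I need to interpret n bits from the file as an integer.
Currently I am reading the file into bitarray objects and converting subsets of those objects to integers. The process works, but it is fairly slow and cumbersome. Is there a better way to do this, perhaps with the struct module?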
import bitarray

bits = bitarray.bitarray()
with open('/dir/to/any/file.dat', 'rb') as f:  # open in binary mode
    bits.fromfile(f, 2)  # read 2 bytes into the bitarray

# bits 0:4 represent a field
field1 = int(bits[0:4].to01(), 2)  # convert to a string of 0s and 1s, then int() the string
# bits 4:7 represent a field
field2 = int(bits[4:7].to01(), 2)
# bits 7:16 represent a field
field3 = int(bits[7:16].to01(), 2)

print("""All bits: {bits}\n\tfield1: {b1}={field1}\n\tfield2: {b2}={field2}\n\tfield3: {b3}={field3}""".format(
    bits=bits, b1=bits[0:4].to01(), field1=field1,
    b2=bits[4:7].to01(), field2=field2,
    b3=bits[7:16].to01(), field3=field3))
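Answer 0 (score: 4)
If you can use a third-party module, the bitstring module looks like it has good representations and bit manipulation: http://pythonhosted.org/bitstring/index.html
For example, if you know the sizes of the fields, you can use a format string: http://pythonhosted.org/bitstring/reading.html#reading-using-format-strings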
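import bitstring

bitstream = bitstring.ConstBitStream(filename='testfile.bin')
# 'uint' reads unsigned values, matching int(..., 2) in the question;
# use 'int' instead if the fields are two's-complement signed
field1, field2, field3 = bitstream.readlist('uint:4, uint:3, uint:9')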
If you don't know the field sizes in advance, you can read in all the bits and then extract the fields by slicing: http://pythonhosted.org/bitstring/slicing.html
import bitstring

bitstream = bitstring.ConstBitStream(filename='testfile.bin')
bits = bitstring.BitArray(bitstream)
# .uint is the unsigned interpretation; .int would be two's-complement signed
field1 = bits[0:4].uint
field2 = bits[4:7].uint
field3 = bits[7:16].uint
Just a thought; you may have already come across this module.
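One thing worth keeping in mind with bitstring: it offers both an int and a uint interpretation. int treats the bits as a two's-complement signed integer, while uint treats them as unsigned, so the same bits can yield different values. Here is a minimal plain-Python sketch of that difference; the helper names to_unsigned and to_signed are made up for illustration and are not part of bitstring.

```python
def to_unsigned(bit_string):
    """Interpret a string of '0'/'1' characters as an unsigned integer."""
    return int(bit_string, 2)


def to_signed(bit_string):
    """Interpret the same string as a two's-complement signed integer."""
    value = int(bit_string, 2)
    # If the top bit is set, the value wraps around into the negative range.
    if bit_string[0] == '1':
        value -= 1 << len(bit_string)
    return value


print(to_unsigned('100'), to_signed('100'))              # 4 -4
print(to_unsigned('110000000'), to_signed('110000000'))  # 384 -128
```

The unsigned reading is the one that matches int(bits[...].to01(), 2) from the question.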
Answer 1 (score: 3)
This works for your specific case:
# bitmasks of fields 1-3, they fit in 2 bytes
FIELD1 = 0b1111000000000000  # first 4 bits
FIELD2 = 0b0000111000000000  # next 3 bits
FIELD3 = 0b0000000111111111  # last 9 bits

def bytes_to_int(num):  # convert bytes object to an int
    res = 0
    num = num[::-1]  # reverse the bytes
    for i in range(len(num)):
        res += num[i] * (256**i)
    return res

def get_fields(f):
    chunk = bytes_to_int(f.read(2))  # read 2 bytes, f1-f3, convert to int
    f1 = (chunk & FIELD1) >> 12  # get each field with its bitmask
    f2 = (chunk & FIELD2) >> 9
    f3 = chunk & FIELD3
    f4 = f.read(f3)  # field4 as a bytes object
    return f1, f2, f3, f4

with open('file.dat', 'rb') as f:
    # using your sample data
    print(get_fields(f))  # returns 0, 4, 384, field4 as a bytes obj
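The same shift-and-mask idea can be generalized to an arbitrary list of field widths. What follows is only a sketch under two assumptions: the fields are packed most-significant-bit first (as in the masks above), and all values are unsigned. The function name unpack_fields is hypothetical; the only library call used is int.from_bytes from the Python 3 standard library.

```python
def unpack_fields(data, widths):
    """Split `data` (a bytes object) into unsigned integers.

    `widths` lists the size of each field in bits, starting from the
    most significant bit of the first byte.
    """
    total_bits = len(data) * 8
    if sum(widths) > total_bits:
        raise ValueError("field widths exceed the available bits")

    value = int.from_bytes(data, 'big')  # whole buffer as one big integer
    fields = []
    remaining = total_bits  # bits not yet consumed, counted from the left
    for width in widths:
        remaining -= width
        mask = (1 << width) - 1
        fields.append((value >> remaining) & mask)
    return fields


# 0x09 0x80 is 0000 100 110000000 when split as 4 + 3 + 9 bits
print(unpack_fields(b'\x09\x80', [4, 3, 9]))  # [0, 4, 384]
```

For the fixed two-byte header, struct.unpack('>H', data)[0] gives the same 16-bit integer as int.from_bytes(data, 'big'), so the struct module can stand in for the conversion step; the shifts and masks are still needed to split the sub-byte fields.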