How can I read data serialized by Python 2 cPickle using Python 3 pickle?

Date: 2015-11-22 15:08:39

Tags: python python-2.7 python-3.x serialization pickle

I am trying to work with the CIFAR-10 dataset, which is distributed in a special version for Python.

It is a set of binary files, each representing a dictionary of 10k numpy matrices. The files were apparently created by Python 2 cPickle.

I tried loading it from Python 2, as follows:

import cPickle
with open("data/data_batch_1", "rb") as f:
    data = cPickle.load(f)

This works fine. However, if I try to load the data from Python 3 (which has pickle instead of cPickle), it fails:

import pickle
with open("data/data_batch_1", "rb") as f:
    data = pickle.load(f)

It fails with the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0x8b in position 6: ordinal not in range(128)

Can I somehow convert the original dataset into a new one that is readable from Python 3? Or can I somehow read it directly from Python 3?

I have already tried loading it with cPickle, dumping it to JSON and reading it back with pickle, but numpy matrices apparently cannot be written to a JSON file.

1 answer:

Answer 0: (score: 5)

You need to tell pickle which codec to use for those bytestrings, or tell it to load the data as bytes instead. From the pickle.load() documentation:

  

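The encoding and errors tell pickle how to decode 8-bit string instances pickled by Python 2; these default to 'ASCII' and 'strict', respectively. The encoding can be 'bytes' to read these 8-bit string instances as bytes objects.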
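As a minimal sketch of what the encoding argument of pickle.load() / pickle.loads() changes, the snippet below uses a hand-assembled protocol-2 byte string that mimics a Python 2 pickle of a dict with one 8-bit string key and one 8-bit string value. The payload and the names PY2_STYLE_PICKLE and try_default are made up for illustration; they are not taken from the CIFAR-10 files.

```python
import pickle

# Hand-assembled protocol-2 pickle (illustrative payload, not real CIFAR-10 data).
# "U" is the SHORT_BINSTRING opcode that Python 2 emits for 8-bit str objects:
#   \x80\x02      protocol 2 header
#   }             empty dict
#   U\x04data     4-byte str key "data"
#   U\x02\x8b\x01 2-byte str value containing the non-ASCII byte 0x8b
#   s             store key/value in the dict
#   .             stop
PY2_STYLE_PICKLE = b"\x80\x02}U\x04dataU\x02\x8b\x01s."


def try_default(blob):
    """Return the exception name raised with the default encoding, or None."""
    try:
        pickle.loads(blob)  # defaults: encoding="ASCII", errors="strict"
    except UnicodeDecodeError as exc:
        return type(exc).__name__
    return None


default_error = try_default(PY2_STYLE_PICKLE)

# encoding="bytes": 8-bit strings (keys included) come back as bytes objects.
as_bytes = pickle.loads(PY2_STYLE_PICKLE, encoding="bytes")

# encoding="latin1": every byte maps to one code point, so decoding cannot fail,
# but binary values end up as str rather than bytes.
as_latin1 = pickle.loads(PY2_STYLE_PICKLE, encoding="latin1")

print(default_error)
print(as_bytes)
print(as_latin1)
```

Which of the two non-default encodings suits you depends on whether you want the string-like fields as bytes or as str; for raw binary content, bytes is usually the less surprising choice.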

To load the strings as bytes objects:

import pickle
with open("data/data_batch_1", "rb") as f:
    data = pickle.load(f, encoding='bytes')
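
Note that with encoding='bytes' the dictionary keys come back as bytes as well, so lookups need the b prefix (data[b'data'] rather than data['data']). If you would rather convert a batch once into a file that Python 3 loads without extra arguments, something along these lines may work. This is only a sketch: decode_keys and convert_batch are hypothetical helpers, and it assumes the dictionary keys are plain ASCII names.

```python
import pickle


def decode_keys(obj, encoding="ascii"):
    """Recursively turn bytes dict keys into str.

    Only dict keys are decoded; non-dict values (bytes, lists, arrays, ...)
    are returned unchanged.
    """
    if isinstance(obj, dict):
        return {
            (key.decode(encoding) if isinstance(key, bytes) else key):
                decode_keys(value, encoding)
            for key, value in obj.items()
        }
    return obj


def convert_batch(src_path, dst_path):
    """Load a Python 2 pickle as bytes and re-save it with str keys."""
    with open(src_path, "rb") as f:
        data = pickle.load(f, encoding="bytes")
    with open(dst_path, "wb") as f:
        # The re-saved file is a native Python 3 pickle.
        pickle.dump(decode_keys(data), f, protocol=pickle.HIGHEST_PROTOCOL)
```

For example, convert_batch("data/data_batch_1", "data/data_batch_1_py3") would write a converted copy (the output file name is an arbitrary choice). Values that were Python 2 strings stay as bytes in the converted file, so decode them separately if you need text.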