Pickling data in a custom format

Time: 2016-12-13 13:37:39

Tags: python python-2.7 pickle

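I need to pickle data of the following form: a table of many rows, where each row consists of a list of tuples and an associated list,

e.g. [(1,2),(2,3),(3,4)] associated with [1,2,3].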

I cannot find a way to pickle the data and load it back so that I get:

import cPickle
f = open("data.pkl", 'rb')
X,Y = cPickle.load(f)

so that X holds only the first column of the data and Y holds the second column.

I thought about storing the first and second columns separately, but how could I then load the data in a single statement?

a = []
a.append( [(1,2),(2,3)] )

and similarly for the second column:

b = []
b.append([1,2])

How do I then pickle and unpickle it?

Thank you very much.

2 Answers:

Answer 0: (score: 1)

class Bunch(dict):
    """Container object for datasets
    Dictionary-like object that exposes its keys as attributes.
    >>> b = Bunch(a=1, b=2)
    >>> b['b']
    2
    >>> b.b
    2
    >>> b.a = 3
    >>> b['a']
    3
    >>> b.c = 6
    >>> b['c']
    6
    """

    def __init__(self, **kwargs):
        super(Bunch, self).__init__(kwargs)

    def __setattr__(self, key, value):
        self[key] = value

    def __dir__(self):
        return self.keys()

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)

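import cPickle as pickle

# X, Y, target_names_input, fdescr and labels_names are your own data.
dataset = Bunch(data=X, target=Y,
                target_names=target_names_input,
                DESCR=fdescr, feature_names=labels_names)

def save_object(obj, filename):
    # Write the object in binary mode using the newest pickle protocol.
    with open(filename, 'wb') as output:
        pickle.dump(obj, output, pickle.HIGHEST_PROTOCOL)

save_object(dataset, 'data.pkl')

# The Bunch class must be importable wherever the file is loaded.
with open('data.pkl', 'rb') as f:
    data = pickle.load(f)
X = data.data
Y = data.target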

I am assuming that the rows of your table X hold feature data of some form and that your column Y is the target vector.
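
If attribute access is not needed, the same round trip can probably be done with a plain dict in place of Bunch. This is only a minimal sketch, not the method of this answer: the sample values come from the question, the keys 'data' and 'target' mirror the Bunch fields used above, and the temporary file path is an assumption made for illustration. The import fallback is there so the snippet runs on both Python 2 (cPickle) and Python 3 (pickle).

```python
import os
import tempfile

try:
    import cPickle as pickle  # Python 2
except ImportError:
    import pickle  # Python 3

# Sample data taken from the question.
X = [(1, 2), (2, 3), (3, 4)]
Y = [1, 2, 3]

# Plain dict standing in for Bunch (assumption: attribute access is not needed).
dataset = {'data': X, 'target': Y}

# Hypothetical file location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'data.pkl')

with open(path, 'wb') as output:
    pickle.dump(dataset, output, pickle.HIGHEST_PROTOCOL)

with open(path, 'rb') as f:
    loaded = pickle.load(f)

# Pull the two columns back out by key.
X_loaded = loaded['data']
Y_loaded = loaded['target']
```

A dict of built-in types has the side benefit that the file can be loaded without any custom class being importable.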

Answer 1: (score: 1)

Try

import cPickle

FILENAME = 'foo.pkl'

X = [(1,2),(2,3),(3,4)]
Y = [1,2,3]

with open(FILENAME, 'wb') as f:
    cPickle.dump((X, Y), f)

with open(FILENAME, 'rb') as f:
    x, y = cPickle.load(f)

print(x)
print(y)
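
The same tuple trick should carry over to the multi-row case from the question, where the two columns are built up with append. A minimal sketch under stated assumptions: the names a and b follow the question, the second row's values are made up for illustration, and the temporary file path is hypothetical. The import fallback lets it run on Python 2 (cPickle) or Python 3 (pickle).

```python
import os
import tempfile

try:
    import cPickle as pickle  # Python 2
except ImportError:
    import pickle  # Python 3

a = []  # first column: one list of tuples per row
b = []  # second column: one list per row

# First row, using the values from the question.
a.append([(1, 2), (2, 3), (3, 4)])
b.append([1, 2, 3])

# Second row, made-up values for illustration only.
a.append([(5, 6), (6, 7)])
b.append([5, 6])

# Hypothetical file location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'columns.pkl')

# Dump both columns together as a single 2-tuple ...
with open(path, 'wb') as f:
    pickle.dump((a, b), f)

# ... so that one load statement unpacks them again.
with open(path, 'rb') as f:
    X, Y = pickle.load(f)
```

Because the file contains exactly one pickled object (the tuple), a single load call is enough, and the unpacking on the left-hand side splits it back into the two columns.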