Question

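I would like to store a DataFrame as a csv file and then read it back while preserving the data type of each entry. This works for str, int and float, but an np.array entry comes back as a string. Is there a way to keep it as an array?

Example: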

import pandas as pd
import numpy as np
pd_df = pd.DataFrame(columns=['name', 'int', 'float', 'array'])
pd_df.loc[0] = pd.Series({ 'name'  : 'one', 'int' : 1,
                           'float': 1.0, 
                           'array' : np.random.rand(1,3) })
pd_df.to_csv( 'file.csv' )

I then read 'file.csv' at a later time with the built-in "read_csv":

read_file = pd.read_csv( 'file.csv' )
print( read_file )                       #returns desired result
print( type(read_file.loc[0]['name'])  ) #returns <class 'str'>
print( type(read_file.loc[0]['int'])   ) #returns <class 'numpy.int64'>
print( type(read_file.loc[0]['float']) ) #returns <class 'numpy.float64'>
print( type(read_file.loc[0]['array']) ) #returns <class 'str'> !!!

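Of course I could convert read_file.loc[0]['array'] back to an np.array, but I would like to know whether there is a way for the array to stay an array through the DataFrame and the csv. I tried using apply to specify the data type of each column, reading with a specific dtype, and using as_matrix() as suggested here, but could not get any of these to work.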
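For reference, the manual conversion mentioned above could look roughly like the sketch below. It is only one possible workaround, not a built-in pandas feature: it assumes the array is turned into a JSON list string before writing and parsed again through the converters argument of read_csv. The helpers array_to_text and text_to_array are hypothetical names made up for this illustration, and an in-memory buffer stands in for 'file.csv'.

```python
import io
import json

import numpy as np
import pandas as pd


def array_to_text(arr):
    """Hypothetical helper: serialize a numpy array as a JSON list string."""
    return json.dumps(np.asarray(arr).tolist())


def text_to_array(text):
    """Hypothetical helper: parse a JSON list string back into a numpy array."""
    return np.array(json.loads(text))


original = np.random.rand(1, 3)

# Store the array column as text so the csv holds an unambiguous representation.
df = pd.DataFrame({'name': ['one'], 'int': [1], 'float': [1.0],
                   'array': [array_to_text(original)]})

# An in-memory buffer is used here in place of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)

# converters applies text_to_array to every cell of the 'array' column on read.
restored_df = pd.read_csv(buffer, converters={'array': text_to_array})
restored = restored_df.loc[0, 'array']

print(type(restored))
print(restored.shape)
```

This still stores text in the csv, so it does not answer the question of whether the array itself can be preserved; it only moves the conversion into the write and read steps.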

Thanks for any suggestions.

Python: preserving numpy arrays when writing and reading csv

0 answers: