Pandas import from Excel, export to HDF5

Time: 2014-01-23 21:14:32

Tags: python unicode pandas hdf5 pytables

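I am using pandas and PyTables. I first import a table from Excel that contains integer and float columns, along with other columns containing strings and even tuples. The Excel import has a limited number of options, and unfortunately, unlike the CSV import, data types cannot be specified during the import; they must be converted from their inferred types afterwards.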
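For illustration, the post-import conversion described above could look like the following minimal sketch. The small frame is a hypothetical stand-in for what an Excel import might return (built in memory so the snippet runs without a workbook), and the choice of target types is an assumption. Newer pandas releases also accept a `dtype` argument in `read_excel`, which was not available when this question was asked.

```python
import pandas as pd

# Hypothetical stand-in for a freshly imported sheet: the ID column was
# inferred as float and the flag column as integer.
frame = pd.DataFrame(
    {
        "LocationID": [935.0, 937.0],
        "SystemCheckActive": [1, 1],
        "Systemsize(kW)": [675.39, 384.30],
    }
)

# Convert selected columns from their inferred types after the import.
# Columns not named in the mapping keep their current dtype.
converted = frame.astype({"LocationID": "int64", "SystemCheckActive": "bool"})

print(converted.dtypes)
```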

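That said, all non-numeric values appear to be imported as unicode text, which is incompatible with the later export to HDF5. Is there a simple way to convert all unicode columns (and all column headers) to a string format that HDF5 accepts?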
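One possible way to do such a conversion is sketched below. It is not a confirmed fix for the HDF5 export. The helper name `stringify_object_columns` is hypothetical. It assumes that turning every non-missing value of an object column (tuples included) into its `str` form is acceptable, and that missing values should be left untouched.

```python
import pandas as pd


def stringify_object_columns(frame):
    """Return a copy whose object columns and column labels hold plain str."""
    out = frame.copy()
    # Column headers: force every label to str.
    out.columns = [str(label) for label in out.columns]
    for name in out.columns:
        col = out[name]
        if col.dtype == object:
            # Keep missing values as they are; convert everything else,
            # so a tuple such as (1, 2) becomes the text "(1, 2)".
            out[name] = col.where(col.isnull(), col.astype(str))
    return out


# Hypothetical sample data mimicking a mixed Excel import.
sample = pd.DataFrame(
    {
        u"Company": [u"Testco", None],
        u"InverterDKs": [(1, 2), (3, 4)],
        u"Systemsize(kW)": [675.39, 384.30],
    }
)
cleaned = stringify_object_columns(sample)
print(cleaned.dtypes)
```

The result could then be passed to `DataFrame.to_hdf`, for example `cleaned.to_hdf("meta.h5", key="meta")`, where the file name and key are placeholders. Whether every column is accepted still depends on the pandas and PyTables versions in use.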

More details:

>>> metaFrame.head()
                               ProjectName Company ContactName  \
LocationID                                                       
935          PCS Petaluma High School Site  Testco   Test Name   
937            PCS Casa Grande High School  Testco   Test Name   
3465               FUSD Fowler High School  Testco   Test Name   
3466             FUSD Sutter Middle School  Testco   Test Name   
3467        FUSD Fremont Elementary School  Testco   Test Name   

                      Contactemail  \
LocationID                           
935         test.address@email.com   
937         test.address@email.com   
3465        test.address@email.com   
3466        test.address@email.com   
3467        test.address@email.com   

                                                         Link  Systemsize(kW)  \
LocationID                                                                      
935         https://internal.testco.com/locations/935/syst...             NaN   
937         https://internal.testco.com/locations/937/syst...          675.39   
3465        https://internal.testco.com/locations/3465/sys...          384.30   
3466        https://internal.testco.com/locations/3466/sys...          198.90   
3467        https://internal.testco.com/locations/3467/sys...           35.10   

           SystemCheckStartdate SystemCheckActive  \
LocationID                                          
935         2013-10-01 00:00:00              True   
937         2013-10-01 00:00:00              True   
3465        2013-10-01 00:00:00              True   
3466        2013-10-01 00:00:00              True   
3467        2013-10-01 00:00:00              True   

            YTDProductionPriortostartdate  NumberofInverters/cktsmonitored  \
LocationID                                                                   
935                                   NaN                              NaN   
937                                   NaN                              NaN   
3465                                  NaN                              NaN   
3466                                  NaN                              NaN   
3467                                  NaN                              NaN   

                                                  InverterMfg InverterModel  \
LocationID                                                                    
935                                     PV Powered : PVP260KW           NaN   
937                                     PV Powered : PVP260KW           NaN   
3465        Advanced Energy Industries : Solaron 333kW (31...           NaN   
3466                                    PV Powered : PVP260KW           NaN   
3467                                 PV Powered : PVP35KW-480           NaN   

            InverterCECefficiency ModuleMfg Modulemodel  \
LocationID                                                
935                          97.0       NaN         NaN   
937                          97.0       NaN         NaN   
3465                         97.5       NaN         NaN   
3466                         97.0       NaN         NaN   
3467                         96.0       NaN         NaN   

            Moduleirradiancefactor  Moduleirradiancefactorslope  \
LocationID                                                        
935                            NaN                          NaN   
937                            NaN                          NaN   
3465                           NaN                          NaN   
3466                           NaN                          NaN   
3467                           NaN                          NaN   

            StraightLineIntercept  ModuleTemp-PwrDerate MeterDK      
LocationID                                                           
935                           NaN                 0.005    3291 ...  
937                           NaN                 0.005   11548 ...  
3465                          NaN                 0.005   19248 ...  
3466                          NaN                 0.005   15846 ...  
3467                          NaN                 0.005   15847 ...  

[5 rows x 27 columns]

>>> metaFrame.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 43 entries, 935 to 5844
Data columns (total 27 columns):
ProjectName                        43  non-null values
Company                            43  non-null values
ContactName                        43  non-null values
Contactemail                       43  non-null values
Link                               43  non-null values
Systemsize(kW)                     42  non-null values
SystemCheckStartdate               37  non-null values
SystemCheckActive                  43  non-null values
YTDProductionPriortostartdate      0  non-null values
NumberofInverters/cktsmonitored    2  non-null values
InverterMfg                        42  non-null values
InverterModel                      8  non-null values
InverterCECefficiency              33  non-null values
ModuleMfg                          0  non-null values
Modulemodel                        0  non-null values
Moduleirradiancefactor             0  non-null values
Moduleirradiancefactorslope        0  non-null values
StraightLineIntercept              0  non-null values
ModuleTemp-PwrDerate               43  non-null values
MeterDK                            43  non-null values
Genfieldname                       43  non-null values
WSDK                               34  non-null values
WSirradianceField                  43  non-null values
WSCellTempField                    43  non-null values
MiscDerate                         1  non-null values
InverterDKs                        37  non-null values
Invertergenfields                  37  non-null values
dtypes: bool(1), datetime64[ns](1), float64(9), object(16)

0 answers:

No answers