我有一个.csv我试图读入一个包含多行列标题的pandas数据框,但第一行是稀疏标记的。
例如:
Binned_average_and_predicted_H2O_spectra_sorted_by_RH-class.,,,,,,,,
,RH=0.8,,,,RH=0.9,,,
,n_=_60,,,,n_=_29,,,
nat_freq,avrg_sp(T),avrg_sp(h2o),denoised_avrg_sp(h2o),pred_sp(h2o),avrg_sp(T),avrg_sp(h2o),denoised_avrg_sp(h2o),pred_sp(h2o)
6.10E-04,8.40E-02,0.117551351,0.117550357,8.64E-02,0.128696811,0.163304381,0.163304015,0.127552704
1.22E-03,7.49E-02,0.126467592,0.126465605,7.70E-02,9.05E-02,0.200350295,0.200349563,8.97E-02
1.83E-03,7.54E-02,0.124370072,0.124367091,7.76E-02,8.54E-02,0.121274897,0.121273799,8.46E-02
2.44E-03,7.76E-02,0.136590839,0.136586865,7.99E-02,5.45E-02,0.100995665,0.100994202,5.40E-02
3.05E-03,8.73E-02,0.141422799,0.141417832,8.98E-02,7.57E-02,0.170033442,0.170031614,7.50E-02
3.66E-03,7.29E-02,0.143599074,0.143593115,7.50E-02,0.10001777,0.165468366,0.165466173,9.91E-02
当我读到csv时,
Cosp2 = pd.read_csv(DPath,index_col=0, header=[1,3])
print(Cosp2)
我最终得到了未明确标记的所有标头第一级标头上的未命名:#_ level_0标签。
RH=0.8 Unnamed: 2_level_0 Unnamed: 3_level_0 \
nat_freq avrg_sp(T) avrg_sp(h2o) denoised_avrg_sp(h2o)
0.00061 0.0840 0.117551 0.117550
0.00122 0.0749 0.126468 0.126466
0.00183 0.0754 0.124370 0.124367
0.00244 0.0776 0.136591 0.136587
0.00305 0.0873 0.141423 0.141418
0.00366 0.0729 0.143599 0.143593
Unnamed: 4_level_0 RH=0.9 Unnamed: 6_level_0 \
nat_freq pred_sp(h2o) avrg_sp(T) avrg_sp(h2o)
0.00061 0.0864 0.128697 0.163304
0.00122 0.0770 0.090500 0.200350
0.00183 0.0776 0.085400 0.121275
0.00244 0.0799 0.054500 0.100996
0.00305 0.0898 0.075700 0.170033
0.00366 0.0750 0.100018 0.165468
Unnamed: 7_level_0 Unnamed: 8_level_0
nat_freq denoised_avrg_sp(h2o) pred_sp(h2o)
0.00061 0.163304 0.127553
0.00122 0.200350 0.089700
0.00183 0.121274 0.084600
0.00244 0.100994 0.054000
0.00305 0.170032 0.075000
0.00366 0.165466 0.099100
有没有办法让pandas在未标记的列上传播0级标签?我想要一些看起来像这样的东西:
RH=0.8 \
nat_freq avrg_sp(T) avrg_sp(h2o) denoised_avrg_sp(h2o) pred_sp(h2o)
0.00061 0.0840 0.117551 0.117550 0.0864
0.00122 0.0749 0.126468 0.126466 0.0770
0.00183 0.0754 0.124370 0.124367 0.0776
0.00244 0.0776 0.136591 0.136587 0.0799
0.00305 0.0873 0.141423 0.141418 0.0898
0.00366 0.0729 0.143599 0.143593 0.0750
RH=0.9
nat_freq avrg_sp(T) avrg_sp(h2o) denoised_avrg_sp(h2o) pred_sp(h2o)
0.00061 0.128697 0.163304 0.163304 0.127553
0.00122 0.090500 0.200350 0.200350 0.089700
0.00183 0.085400 0.121275 0.121274 0.084600
0.00244 0.054500 0.100996 0.100994 0.054000
0.00305 0.075700 0.170033 0.170032 0.075000
0.00366 0.100018 0.165468 0.165466 0.099100
答案 0 :(得分:1)
您可以先get_level_values
to_series
Series
使用a = Cosp2.columns.get_level_values(0).to_series()
print (a)
RH=0.8 RH=0.8
Unnamed: 2_level_0 Unnamed: 2_level_0
Unnamed: 3_level_0 Unnamed: 3_level_0
Unnamed: 4_level_0 Unnamed: 4_level_0
RH=0.9 RH=0.9
Unnamed: 6_level_0 Unnamed: 6_level_0
Unnamed: 7_level_0 Unnamed: 7_level_0
Unnamed: 8_level_0 Unnamed: 8_level_0
dtype: object
:
NaN
如果mask
Unnamed
使用startswith
NaN
,请ffill
替换method='ffill'
b = a.mask(a.str.startswith('Unnamed')).ffill()
print (b)
RH=0.8 RH=0.8
Unnamed: 2_level_0 RH=0.8
Unnamed: 3_level_0 RH=0.8
Unnamed: 4_level_0 RH=0.8
RH=0.9 RH=0.9
Unnamed: 6_level_0 RH=0.9
Unnamed: 7_level_0 RH=0.9
Unnamed: 8_level_0 RH=0.9
dtype: object
fillna
MultiIndex
1}})
Cosp2.columns = pd.MultiIndex.from_arrays([b, Cosp2.columns.get_level_values(1)])
print (Cosp2)
RH=0.8 \
nat_freq avrg_sp(T) avrg_sp(h2o) denoised_avrg_sp(h2o) pred_sp(h2o)
0.00061 0.0840 0.117551 0.117550 0.0864
0.00122 0.0749 0.126468 0.126466 0.0770
0.00183 0.0754 0.124370 0.124367 0.0776
0.00244 0.0776 0.136591 0.136587 0.0799
0.00305 0.0873 0.141423 0.141418 0.0898
0.00366 0.0729 0.143599 0.143593 0.0750
RH=0.9
nat_freq avrg_sp(T) avrg_sp(h2o) denoised_avrg_sp(h2o) pred_sp(h2o)
0.00061 0.128697 0.163304 0.163304 0.127553
0.00122 0.090500 0.200350 0.200350 0.089700
0.00183 0.085400 0.121275 0.121274 0.084600
0.00244 0.054500 0.100996 0.100994 0.054000
0.00305 0.075700 0.170033 0.170032 0.075000
0.00366 0.100018 0.165468 0.165466 0.099100
上次按from_arrays
创建新的$('#first-carousel').carousel();
:
$('#first-carousel').remove();
$('#first-carousel').unbind();