I am working with the following (transposed) data and want to turn it into a proper DataFrame:
Series ID,Jan 2000,Feb 2000,Mar 2000,Apr 2000,May 2000,Jun 2000,Jul 2000,Aug 2000,Sep 2000,Oct 2000,Nov 2000,Dec 2000,Jan 2001,Feb 2001,Mar 2001,Apr 2001,May 2001,Jun 2001,Jul 2001,Aug 2001,Sep 2001,Oct 2001,Nov 2001,Dec 2001,Jan 2002,Feb 2002,Mar 2002,Apr 2002,May 2002,Jun 2002,Jul 2002,Aug 2002,Sep 2002,Oct 2002,Nov 2002,Dec 2002,Jan 2003,Feb 2003,Mar 2003,Apr 2003,May 2003,Jun 2003,Jul 2003,Aug 2003,Sep 2003,Oct 2003,Nov 2003,Dec 2003,Jan 2004,Feb 2004,Mar 2004,Apr 2004,May 2004,Jun 2004,Jul 2004,Aug 2004,Sep 2004,Oct 2004,Nov 2004,Dec 2004,Jan 2005,Feb 2005,Mar 2005,Apr 2005,May 2005,Jun 2005,Jul 2005,Aug 2005,Sep 2005,Oct 2005,Nov 2005,Dec 2005,Jan 2006,Feb 2006,Mar 2006,Apr 2006,May 2006,Jun 2006,Jul 2006,Aug 2006,Sep 2006,Oct 2006,Nov 2006,Dec 2006,Jan 2007,Feb 2007,Mar 2007,Apr 2007,May 2007,Jun 2007,Jul 2007,Aug 2007,Sep 2007,Oct 2007,Nov 2007,Dec 2007,Jan 2008,Feb 2008,Mar 2008,Apr 2008,May 2008,Jun 2008,Jul 2008,Aug 2008,Sep 2008,Oct 2008,Nov 2008,Dec 2008,Jan 2009,Feb 2009,Mar 2009,Apr 2009,May 2009,Jun 2009,Jul 2009,Aug 2009,Sep 2009,Oct 2009,Nov 2009,Dec 2009,Jan 2010,Feb 2010,Mar 2010,Apr 2010,May 2010,Jun 2010,Jul 2010,Aug 2010,Sep 2010,Oct 2010,Nov 2010,Dec 2010,Jan 2011,Feb 2011,Mar 2011,Apr 2011,May 2011,Jun 2011,Jul 2011,Aug 2011,Sep 2011,Oct 2011,Nov 2011,Dec 2011,Jan 2012,Feb 2012,Mar 2012,Apr 2012,May 2012,Jun 2012,Jul 2012,Aug 2012,Sep 2012,Oct 2012,Nov 2012,Dec 2012,Jan 2013,Feb 2013,Mar 2013,Apr 2013,May 2013,Jun 2013,Jul 2013,Aug 2013,Sep 2013,Oct 2013,Nov 2013,Dec 2013,Jan 2014,Feb 2014,Mar 2014,Apr 2014,May 2014,Jun 2014,Jul 2014,Aug 2014,Sep 2014,Oct 2014,Nov 2014,Dec 2014,Jan 2015,Feb 2015,Mar 2015,Apr 2015,May 2015,Jun 2015,Jul 2015,Aug 2015,Sep 2015,Oct 2015,Nov 2015,Dec 2015,Jan 2016,Feb 2016,Mar 2016,Apr 2016,May 2016,Jun 2016,Jul 2016,Aug 2016,Sep 2016,Oct 2016,Nov 2016,Dec 2016
JTU00000000HIL, , , , , , , , , , , ,4053,5862,4486,5264,5946,5841,5776,5730,5421,5208,5414,4253,3526,4903,3985,4326,5480,5334,5478,5538,5238,5049,5153,4274,3658,4983,3833,4140,5221,4999,5431,5203,4985,5058,5226,4125,3715,4771,3824,4902,5652,5356,5686,5381,5540,5218,5413,4591,3902,5109,4325,4913,5821,5729,6130,5793,5903,5653,5298,4682,3733,5049,4357,5050,5612,5931,6087,5919,5772,5502,5515,4915,3782,5066,4250,5036,5647,5758,6042,5619,5662,5404,5570,4616,3569,4705,4038,4444,5351,5058,5521,4957,4964,4500,4726,3499,3001,4005,3280,3481,4228,4187,4301,4295,4185,4007,3990,3541,2690,3735,3084,3911,4510,4815,4735,4553,4317,4131,4279,3657,2932,3772,3313,4040,4641,4617,5006,4552,4602,4467,4432,3814,2997,4110,3629,4197,4704,4979,5162,4656,4918,4388,4518,4001,3092,4238,3690,4036,4940,5134,5114,4910,5256,4825,4695,4257,3223,4432,3810,4482,5202,5397,5570,5397,5264,5283,5391,4674,3730,4794,4142,4825,5531,5756,5918,5500,5640,5273,5509,4873,3919,4847,4541,, , , , , , , , ,
JTU00000000JOL, , , , , , , , , , , ,4391,5569,4443,4465,5213,4515,4162,4778,4143,3960,3872,3132,3059,3930,3176,3458,3781,3575,3259,3676,3504,3307,3800,3157,2634,3953,3192,2981,3641,3205,3235,3517,3293,3068,3461,2924,2917,3585,3223,3312,3922,3643,3317,4177,3637,3714,4047,3005,3342,3775,3669,3767,4538,3879,3908,4580,4096,4204,4524,3989,3770,4412,4049,4409,4975,4388,4256,4401,4587,4491,4690,4113,3999,4717,4288,4583,5070,4564,4532,4727,4586,4504,4482,3943,3860,4366,3863,3920,4317,3974,3721,4040,3699,3274,3451,2769,2571,2868,2632,2429,2533,2427,2408,2373,2356,2493,2553,2164,2145,2744,2435,2610,3408,2893,2662,3137,2961,2789,3194,2710,2553,3036,2906,3081,3486,3110,3234,3647,3236,3505,3594,2935,3048,3747,3344,3809,3891,3705,3794,3890,3738,3538,3905,3316,3218,3769,3788,3866,4199,3880,3919,4121,4028,3981,4307,3627,3369,3934,3941,4165,4829,4610,4705,4904,5065,4650,5121,4454,4403,5031,4964,5133,5862,5390,5162,6039,5435,5343,5655,4897,4844,5635,5377,, , , , , , , , ,
Since a plain transpose did not do this, I tried to put it together manually:
dfVac = pd.read_csv('data/vac_hire.csv', header=None)
dfVac2 = pd.DataFrame(index=dfVac.iloc[0][1:], data=dfVac.iloc[1:, 1:].T.values, columns=dfVac.iloc[1:, 0].values)
This is what the index looks like:
In[67]: dfVac.iloc[0][1:]
Out[67]:
1 Jan 2000
2 Feb 2000
3 Mar 2000
4 Apr 2000
5 May 2000
...
The other pieces look similar. However, the final output has a mysterious 0 above the index:
In[69]: dfVac2.head()
Out[69]:
JTU00000000HIL JTU00000000JOL
0
Jan 2000
Feb 2000
Mar 2000
Apr 2000
May 2000
Apart from that, everything is fine. But what is this 0, and how do I prevent it?
Answer 0 (score: 1)
That 0 is the index.name; you can remove it with:
df.index.name = None
Or:
df = df.rename_axis(None)
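To see where the 0 comes from, here is a minimal sketch on a made-up three-month CSV. The series names A and B and all variable names are placeholders, not taken from the question. With header=None the first row gets the row label 0, so the Series sliced from it is named 0, and that name is carried over as the index name when the Series is passed as index=.

```python
import io
import pandas as pd

# Hypothetical miniature of the question's file (placeholder series A and B).
csv_text = u"""Series ID,Jan 2000,Feb 2000,Mar 2000
A,1,2,3
B,4,5,6
"""

raw = pd.read_csv(io.StringIO(csv_text), header=None)

# Row 0 of the raw frame is a Series whose name is its row label, 0.
labels = raw.iloc[0][1:]
print(labels.name)

# Passing that Series as the index carries its name over as index.name.
wide = pd.DataFrame(index=labels,
                    data=raw.iloc[1:, 1:].T.values,
                    columns=raw.iloc[1:, 0].values)
print(wide.index.name)

# rename_axis(None) returns a new frame without the index name.
cleaned = wide.rename_axis(None)
print(cleaned.index.name)
```

Passing index=labels.values instead of the Series itself should avoid the name in the first place, since a plain array carries no name.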
EDIT:
Another solution is read_csv with the parameter index_col=0, then transpose with T and call rename_axis (new in pandas 0.18.0):
import pandas as pd
import io
temp=u"""Series ID,Jan 2000,Feb 2000,Mar 2000,Apr 2000,May 2000,Jun 2000,Jul 2000,Aug 2000,Sep 2000,Oct 2000,Nov 2000,Dec 2000,Jan 2001,Feb 2001,Mar 2001,Apr 2001,May 2001,Jun 2001,Jul 2001,Aug 2001,Sep 2001,Oct 2001,Nov 2001,Dec 2001,Jan 2002,Feb 2002,Mar 2002,Apr 2002,May 2002,Jun 2002,Jul 2002,Aug 2002,Sep 2002,Oct 2002,Nov 2002,Dec 2002,Jan 2003,Feb 2003,Mar 2003,Apr 2003,May 2003,Jun 2003,Jul 2003,Aug 2003,Sep 2003,Oct 2003,Nov 2003,Dec 2003,Jan 2004,Feb 2004,Mar 2004,Apr 2004,May 2004,Jun 2004,Jul 2004,Aug 2004,Sep 2004,Oct 2004,Nov 2004,Dec 2004,Jan 2005,Feb 2005,Mar 2005,Apr 2005,May 2005,Jun 2005,Jul 2005,Aug 2005,Sep 2005,Oct 2005,Nov 2005,Dec 2005,Jan 2006,Feb 2006,Mar 2006,Apr 2006,May 2006,Jun 2006,Jul 2006,Aug 2006,Sep 2006,Oct 2006,Nov 2006,Dec 2006,Jan 2007,Feb 2007,Mar 2007,Apr 2007,May 2007,Jun 2007,Jul 2007,Aug 2007,Sep 2007,Oct 2007,Nov 2007,Dec 2007,Jan 2008,Feb 2008,Mar 2008,Apr 2008,May 2008,Jun 2008,Jul 2008,Aug 2008,Sep 2008,Oct 2008,Nov 2008,Dec 2008,Jan 2009,Feb 2009,Mar 2009,Apr 2009,May 2009,Jun 2009,Jul 2009,Aug 2009,Sep 2009,Oct 2009,Nov 2009,Dec 2009,Jan 2010,Feb 2010,Mar 2010,Apr 2010,May 2010,Jun 2010,Jul 2010,Aug 2010,Sep 2010,Oct 2010,Nov 2010,Dec 2010,Jan 2011,Feb 2011,Mar 2011,Apr 2011,May 2011,Jun 2011,Jul 2011,Aug 2011,Sep 2011,Oct 2011,Nov 2011,Dec 2011,Jan 2012,Feb 2012,Mar 2012,Apr 2012,May 2012,Jun 2012,Jul 2012,Aug 2012,Sep 2012,Oct 2012,Nov 2012,Dec 2012,Jan 2013,Feb 2013,Mar 2013,Apr 2013,May 2013,Jun 2013,Jul 2013,Aug 2013,Sep 2013,Oct 2013,Nov 2013,Dec 2013,Jan 2014,Feb 2014,Mar 2014,Apr 2014,May 2014,Jun 2014,Jul 2014,Aug 2014,Sep 2014,Oct 2014,Nov 2014,Dec 2014,Jan 2015,Feb 2015,Mar 2015,Apr 2015,May 2015,Jun 2015,Jul 2015,Aug 2015,Sep 2015,Oct 2015,Nov 2015,Dec 2015,Jan 2016,Feb 2016,Mar 2016,Apr 2016,May 2016,Jun 2016,Jul 2016,Aug 2016,Sep 2016,Oct 2016,Nov 2016,Dec 2016
JTU00000000HIL, , , , , , , , , , , ,4053,5862,4486,5264,5946,5841,5776,5730,5421,5208,5414,4253,3526,4903,3985,4326,5480,5334,5478,5538,5238,5049,5153,4274,3658,4983,3833,4140,5221,4999,5431,5203,4985,5058,5226,4125,3715,4771,3824,4902,5652,5356,5686,5381,5540,5218,5413,4591,3902,5109,4325,4913,5821,5729,6130,5793,5903,5653,5298,4682,3733,5049,4357,5050,5612,5931,6087,5919,5772,5502,5515,4915,3782,5066,4250,5036,5647,5758,6042,5619,5662,5404,5570,4616,3569,4705,4038,4444,5351,5058,5521,4957,4964,4500,4726,3499,3001,4005,3280,3481,4228,4187,4301,4295,4185,4007,3990,3541,2690,3735,3084,3911,4510,4815,4735,4553,4317,4131,4279,3657,2932,3772,3313,4040,4641,4617,5006,4552,4602,4467,4432,3814,2997,4110,3629,4197,4704,4979,5162,4656,4918,4388,4518,4001,3092,4238,3690,4036,4940,5134,5114,4910,5256,4825,4695,4257,3223,4432,3810,4482,5202,5397,5570,5397,5264,5283,5391,4674,3730,4794,4142,4825,5531,5756,5918,5500,5640,5273,5509,4873,3919,4847,4541,, , , , , , , , ,
JTU00000000JOL, , , , , , , , , , , ,4391,5569,4443,4465,5213,4515,4162,4778,4143,3960,3872,3132,3059,3930,3176,3458,3781,3575,3259,3676,3504,3307,3800,3157,2634,3953,3192,2981,3641,3205,3235,3517,3293,3068,3461,2924,2917,3585,3223,3312,3922,3643,3317,4177,3637,3714,4047,3005,3342,3775,3669,3767,4538,3879,3908,4580,4096,4204,4524,3989,3770,4412,4049,4409,4975,4388,4256,4401,4587,4491,4690,4113,3999,4717,4288,4583,5070,4564,4532,4727,4586,4504,4482,3943,3860,4366,3863,3920,4317,3974,3721,4040,3699,3274,3451,2769,2571,2868,2632,2429,2533,2427,2408,2373,2356,2493,2553,2164,2145,2744,2435,2610,3408,2893,2662,3137,2961,2789,3194,2710,2553,3036,2906,3081,3486,3110,3234,3647,3236,3505,3594,2935,3048,3747,3344,3809,3891,3705,3794,3890,3738,3538,3905,3316,3218,3769,3788,3866,4199,3880,3919,4121,4028,3981,4307,3627,3369,3934,3941,4165,4829,4610,4705,4904,5065,4650,5121,4454,4403,5031,4964,5133,5862,5390,5162,6039,5435,5343,5655,4897,4844,5635,5377,, , , , , , , , , """
# after testing, replace io.StringIO(temp) with the filename
dfVac = pd.read_csv(io.StringIO(temp), header=None)
dfVac2 = pd.DataFrame(index=dfVac.iloc[0][1:], data=dfVac.iloc[1:, 1:].T.values, columns=dfVac.iloc[1:, 0].values)
# 0 is the index name; rename_axis(None) replaces it with None
print(dfVac2.rename_axis(None).head())
JTU00000000HIL JTU00000000JOL
Jan 2000
Feb 2000
Mar 2000
Apr 2000
May 2000
df = pd.read_csv(io.StringIO(temp), index_col=0)
# after T, 'Series ID' is the columns name; rename_axis(None, axis=1) replaces it with None
print(df.T.rename_axis(None, axis=1).head())
JTU00000000HIL JTU00000000JOL
Jan 2000
Feb 2000
Mar 2000
Apr 2000
May 2000
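As a quick check of why axis=1 is needed in the index_col=0 route, here is a small sketch on the same kind of made-up data (the series A and B and the variable names are placeholders). After read_csv the label 'Series ID' is the index name; after the transpose it moves to the columns.

```python
import io
import pandas as pd

# Hypothetical miniature of the question's file (placeholder series A and B).
csv_text = u"""Series ID,Jan 2000,Feb 2000,Mar 2000
A,1,2,3
B,4,5,6
"""

# The first column becomes the index, so 'Series ID' is the index name.
df_small = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(df_small.index.name)

# Transposing swaps the axes, so the name now sits on the columns.
transposed = df_small.T
print(transposed.columns.name)

# axis=1 targets the columns; the row index had no name to begin with.
result = transposed.rename_axis(None, axis=1)
print(result.head())
```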