Change a specific column into the row names in Pandas

Time: 2018-09-26 11:25:34

Tags: python pandas

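I saw this requirement in a post on this site (here) and borrowed the idea from it, but it does not work in my case. I am reading some data from an Excel sheet and trying to convert it into a Pandas DataFrame with column and row indexes. The first row in Excel holds the year headers, which I tried to turn into the column headers with

df.columns=df.iloc[0]

So when I run df.columns, it returns:

Index([None, 2014.0, 2015.0, 2016.0, 2017.0, 2018.0], dtype='object', name=0)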

My problem now is turning the column that holds the month names into the row names. I tried

df.set_index('None',inplace=True)

but this returns KeyError: 'None'

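  1. Why can't I refer to 'None', since it is supposedly one of the column names?
  2. How do I convert these month names so that they can later be used as the x-axis of a chart? Is there a suitable datetime format?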
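The KeyError from question 1 can be reproduced without the Excel file. The sketch below uses a small hand-built frame (`raw`, a hypothetical stand-in for the sheet read without a header) and assumes the empty top-left cell arrives as a missing value. Depending on the pandas version that first label may show up as `None` or as `NaN`, but in neither case is it the string `'None'`, which is why the lookup fails.

```python
import pandas as pd

# Hypothetical stand-in for the sheet: the first row holds the years,
# and the top-left cell is empty.
raw = pd.DataFrame([
    [None, 2014.0, 2015.0],
    ["Jan", 42.9, 47.2],
    ["Feb", 36.6, 45.0],
])

df = raw.copy()
df.columns = df.iloc[0]          # same step as in the question

first_label = df.columns[0]      # a missing value, not a string
has_string_none = "None" in df.columns

# set_index('None') searches for a column labelled with the *string*
# 'None', which does not exist, so pandas raises KeyError.
try:
    df.set_index("None")
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(repr(first_label), has_string_none, lookup_failed)
```

Because the label is not a string, giving the column a real name first (or selecting it by position) is the more dependable route.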

Edit: added sample data here

Update: I solved this with

df.columns = ['Month', 2014, 2015, 2016, 2017, 2018]
df = df.drop(df.index[0])
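The fix from the update can be sketched end to end on a small stand-in frame. The data and the name `raw` are illustrative assumptions, not the asker's actual sheet. Note that `drop` returns a new frame, so its result has to be assigned back, and that `set_index('Month')` then turns the month column into the row labels.

```python
import pandas as pd

# Hypothetical stand-in for the sheet read without a header row.
raw = pd.DataFrame([
    [None, 2014, 2015],
    ["Jan", 42.9, 47.2],
    ["Feb", 36.6, 45.0],
])

df = raw.copy()
df.columns = ["Month", 2014, 2015]   # give every column an explicit label
df = df.drop(df.index[0])            # remove the row that held the years
df = df.set_index("Month")           # month names become the row labels

print(df)
```

After this, rows can be selected by month name, for example `df.loc["Feb", 2015]`.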

4 Answers:

Answer 0 (score: 1)

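This works fine for me after adding 2 parameters: index_col=[0] to convert the first column to the index, and usecols together with range to select all columns except the Unnamed one: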

df = pd.read_excel('sample.xlsx', usecols=range(1, 100))

print (df)
   Unnamed: 0  2014  2015       2016   2017   2018
0         Jan  42.9  47.2  43.000000  43.00  48.98
1         Feb  36.6  45.0  40.300000  43.00  45.92
2         Mar  37.8  42.8  44.805668  43.00  43.00
3         Apr  40.9  44.4  43.900000  41.30  44.46
4         May  40.5  47.1  44.200000  41.97  42.31
5         Jun  41.8  46.9  44.600000  45.70    NaN
6         Jul  40.5  45.0  43.500000  45.49    NaN
7         Aug  44.3  45.0  43.800000  44.59    NaN
8         Sep  43.8  47.3  47.600000  47.25    NaN
9         Oct  44.2  47.0  47.600000  50.08    NaN
10        Nov  44.2  43.7  50.078663  50.93    NaN
11        Dec  48.8  45.5  46.500000  48.37    NaN

df = pd.read_excel('sample.xlsx', index_col=[0], usecols = range(1, 100))

print (df)
     2014  2015       2016   2017   2018
Jan  42.9  47.2  43.000000  43.00  48.98
Feb  36.6  45.0  40.300000  43.00  45.92
Mar  37.8  42.8  44.805668  43.00  43.00
Apr  40.9  44.4  43.900000  41.30  44.46
May  40.5  47.1  44.200000  41.97  42.31
Jun  41.8  46.9  44.600000  45.70    NaN
Jul  40.5  45.0  43.500000  45.49    NaN
Aug  44.3  45.0  43.800000  44.59    NaN
Sep  43.8  47.3  47.600000  47.25    NaN
Oct  44.2  47.0  47.600000  50.08    NaN
Nov  44.2  43.7  50.078663  50.93    NaN
Dec  48.8  45.5  46.500000  48.37    NaN

Or select the second column as the index and drop the column Unnamed: 0:

df = pd.read_excel('sample.xlsx', index_col=[1])

print (df)
     Unnamed: 0  2014  2015       2016   2017   2018
Jan         NaN  42.9  47.2  43.000000  43.00  48.98
Feb         NaN  36.6  45.0  40.300000  43.00  45.92
Mar         NaN  37.8  42.8  44.805668  43.00  43.00
Apr         NaN  40.9  44.4  43.900000  41.30  44.46
May         NaN  40.5  47.1  44.200000  41.97  42.31
Jun         NaN  41.8  46.9  44.600000  45.70    NaN
Jul         NaN  40.5  45.0  43.500000  45.49    NaN
Aug         NaN  44.3  45.0  43.800000  44.59    NaN
Sep         NaN  43.8  47.3  47.600000  47.25    NaN
Oct         NaN  44.2  47.0  47.600000  50.08    NaN
Nov         NaN  44.2  43.7  50.078663  50.93    NaN
Dec         NaN  48.8  45.5  46.500000  48.37    NaN

df = pd.read_excel('sample.xlsx', index_col=[1]).drop('Unnamed: 0', axis=1)

print (df)
     2014  2015       2016   2017   2018
Jan  42.9  47.2  43.000000  43.00  48.98
Feb  36.6  45.0  40.300000  43.00  45.92
Mar  37.8  42.8  44.805668  43.00  43.00
Apr  40.9  44.4  43.900000  41.30  44.46
May  40.5  47.1  44.200000  41.97  42.31
Jun  41.8  46.9  44.600000  45.70    NaN
Jul  40.5  45.0  43.500000  45.49    NaN
Aug  44.3  45.0  43.800000  44.59    NaN
Sep  43.8  47.3  47.600000  47.25    NaN
Oct  44.2  47.0  47.600000  50.08    NaN
Nov  44.2  43.7  50.078663  50.93    NaN
Dec  48.8  45.5  46.500000  48.37    NaN
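For the second question (using the months on an x-axis), one option is to reshape a frame like the one printed above into a single series with real dates. This is a minimal sketch under assumptions: `wide` is a hand-built three-month excerpt of the data, the month labels are the English three-letter abbreviations, and the names `MONTHS`, `month_number`, `long` and `series` are illustrative.

```python
import pandas as pd

# Excerpt of the frame shown above: months as index, years as columns.
wide = pd.DataFrame(
    {2014: [42.9, 36.6, 37.8], 2015: [47.2, 45.0, 42.8]},
    index=["Jan", "Feb", "Mar"],
)

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
month_number = {name: i for i, name in enumerate(MONTHS, start=1)}

# Wide -> long: one row per (Month, Year) pair.
long = (
    wide.rename_axis("Month")
        .reset_index()
        .melt(id_vars="Month", var_name="Year", value_name="Value")
)

# Build first-of-month dates from the year and month parts.
# The explicit mapping avoids locale-dependent month-name parsing.
long["Date"] = pd.to_datetime(
    pd.DataFrame({
        "year": long["Year"].astype(int),
        "month": long["Month"].map(month_number),
        "day": 1,
    })
)

series = long.set_index("Date")["Value"].sort_index()
print(series)
```

The resulting `series` has a DatetimeIndex, so plotting libraries that understand dates can place the points in chronological order on the x-axis.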

Answer 1 (score: 0)

You can rename the columns like this:

df.columns = ['None',2014.0,2015.0,2016.0,2017.0,2018.0]

Now your command should work.

Answer 2 (score: 0)

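Try it this way, selecting the column by position, since None is a Python keyword and cannot be used as an attribute name:

df = df.set_index(df.iloc[:, 0])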

Answer 3 (score: -1)

When the column name is set to “ ”, you cannot set that column as the index, so rename the column first.

df.columns = ['First'] + list(df.columns[1:])

Then set it as the index:

df = df.set_index('First')