I have a pandas DataFrame like this:
Data Source World Development Indicators Unnamed: 2 Unnamed: 3 Unnamed: 4 Unnamed: 5
Country Name Country Code Indicator Name Indicator Code 1.960000e+03 1.961000e+03
Aruba ABW GDP at market prices (constant 2010 US$) NY.GDP.MKTP.KD NaN NaN
To turn the first row into the column names, I use this code:
data.columns = data.iloc[0]
As a result, the data DataFrame becomes:
Country Name Country Code Indicator Name Indicator Code 1960.0 1961.0 1962.0
Country Name Country Code Indicator Name Indicator Code 1.960000e+03 1.961000e+03
Aruba ABW GDP at market prices (constant 2010 US$) NY.GDP.MKTP.KD NaN NaN
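My main problem now is with the columns whose headers are years: I get 1960.0, but I want them to be integers, i.e. 1960. Any help with this would be greatly appreciated.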
Answer 0 (score: 1)
Answer 1 (score: 1)
If the DataFrame is created from a csv file, another possible solution is to add the skiprows or header parameter to read_csv:
import pandas as pd
import numpy as np
from io import StringIO
temp=u"""Data Source;World Development Indicators;Unnamed: 2;Unnamed: 3;Unnamed: 4;Unnamed: 5
Country Name;Country Code;Indicator Name;Indicator Code;1960;1961
Aruba;ABW;GDP at market prices (constant 2010 US$);NY.GDP.MKTP.KD;NaN;NaN"""
# after testing, replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), sep=";", skiprows=1)
print (df)
Country Name Country Code Indicator Name \
0 Aruba ABW GDP at market prices (constant 2010 US$)
Indicator Code 1960 1961
0 NY.GDP.MKTP.KD NaN NaN
df = pd.read_csv(StringIO(temp), sep=";", header=1)
print (df)
Country Name Country Code Indicator Name \
0 Aruba ABW GDP at market prices (constant 2010 US$)
Indicator Code 1960 1961
0 NY.GDP.MKTP.KD NaN NaN
If that is not possible, check the perfect MaxU solution and add df = df[1:] to remove the first row from the data.
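For the case where the file cannot be re-read, here is a minimal sketch of one way to promote the first row to the header, convert year labels such as 1960.0 to integers, and then drop that row. It is not necessarily the solution referenced above. The sample frame and the helper name clean_label are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame in the question: the real header
# sits in row 0 and the year labels were parsed as floats.
data = pd.DataFrame([
    ["Country Name", "Country Code", "Indicator Name", "Indicator Code", 1960.0, 1961.0],
    ["Aruba", "ABW", "GDP at market prices (constant 2010 US$)", "NY.GDP.MKTP.KD", np.nan, np.nan],
])

def clean_label(label):
    # Hypothetical helper: turn whole-number floats such as 1960.0 into ints
    # and leave every other label (e.g. strings) unchanged.
    if isinstance(label, float) and label.is_integer():
        return int(label)
    return label

# Use the cleaned first row as the column names.
data.columns = [clean_label(x) for x in data.iloc[0]]

# Drop the row that was promoted to the header and renumber the index.
data = data[1:].reset_index(drop=True)

print(data.columns.tolist())
```

The string labels are left untouched, so only the year columns change from 1960.0 to 1960.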