I want to reshape a DataFrame object: the first row should become the column labels, and the first column should become the row index.
import pandas as pd
wiki = "https://en.wikipedia.org/wiki/List_of_state_and_union_territory_capitals_in_India"
df = pd.read_html(wiki)[1]
df2 = df.copy()
df2.head()
This is what I am doing at the moment (and I lose the row index name):
df2.columns = df.iloc[0]
df2.drop(0, inplace=True)
df2.drop('No.', axis=1, inplace=True)
df2.head()
How can I keep the row index name in a more Pythonic way?
Answer 0 (score: 2)
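You can state what you want directly in read_html: header specifies which row to use as the column labels, and index_col specifies which column to use as the index: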
In [16]: df = pd.read_html(wiki,header=0,index_col=0)[1]
In [17]: df.head()
Out[17]:
        State or union territory  Administrative capitals  Legislative capitals  \
No.
1    Andaman and Nicobar Islands               Port Blair            Port Blair
2                 Andhra Pradesh             Hyderabad[a]             Hyderabad
3              Arunachal Pradesh                 Itanagar              Itanagar
4                          Assam                   Dispur              Guwahati
5                          Bihar                    Patna                 Patna

     Judiciary capitals  Year capital was established       The Former capital
No.
1               Kolkata                          1955     Calcutta (1945–1956)
2             Hyderabad                          1959      Kurnool (1953-1956)
3              Guwahati                          1986                      NaN
4              Guwahati                          1975  Shillong[b] (1874–1972)
5                 Patna                          1912                      NaN
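If the table has already been loaded without those options, a similar result can probably be reached after the fact by promoting the first row to column labels and then calling set_index on the 'No.' column instead of dropping it. The following is only a minimal sketch: the small `raw` frame is a made-up stand-in for what read_html returns without a header, not the real Wikipedia data.

```python
import pandas as pd

# Hypothetical stand-in for a table read without header/index_col:
# the real header sits in row 0 and the columns are just 0, 1, 2.
raw = pd.DataFrame([
    ["No.", "State or union territory", "Administrative capitals"],
    ["1", "Andaman and Nicobar Islands", "Port Blair"],
    ["2", "Andhra Pradesh", "Hyderabad"],
])

df2 = raw.copy()
df2.columns = df2.iloc[0]            # promote the first row to column labels
df2 = df2.drop(0).set_index("No.")   # drop that row, keep 'No.' as the named index
df2.columns.name = None              # clear the leftover name (0) on the columns axis

print(df2.head())
```

Using set_index rather than drop('No.', axis=1) is what keeps the index name, since the column is moved into the index instead of being discarded.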