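I have a DataFrame on which set_index() was called with 3 columns. I want to extract the data type associated with each index level. How can I do this efficiently? I don't want to call type(df.index.get_level_values()) because df is large.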
MWE:
import pandas as pd
df = pd.DataFrame({"id": [1,2,1,2], "time": [1, 1, 2, 2], "val": [1,2,3,4]})
df.set_index(keys=["id", "time"], inplace=True)
type(df.index.get_level_values(1))
#pandas.core.indexes.numeric.Int64Index
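I would also like to know the type of the actual data in the index (here I know it is an integer, but something like this would also be useful):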
type(df.index.get_level_values(1).values[0])
#numpy.int64
Answer 0 (score: 1)
You can use index.levels:
import pandas as pd
df = pd.DataFrame({"id": [1,2,1,2], "time": [1, 1, 2, 2], "val": [1,2,3,4]})
df.set_index(keys=["id", "time"], inplace=True)
index = df.index
print([lev.dtype.type for lev in index.levels])
# [<class 'numpy.int64'>, <class 'numpy.int64'>]
# Alternatively, there is the private attribute, `_inferred_type_levels`,
# but this is probably not what you are looking for.
print(index._inferred_type_levels)
# ['integer', 'integer']
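As an alternative sketch, assuming pandas 1.3 or later, the MultiIndex.dtypes property reports one dtype per level as a Series keyed by level name. It should give the same information as the list comprehension above without calling get_level_values. The frame is the one from the question; the names level_dtypes and scalar_types are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 1, 2], "time": [1, 1, 2, 2], "val": [1, 2, 3, 4]})
df = df.set_index(["id", "time"])

# One dtype per index level, keyed by level name (assumes pandas >= 1.3).
level_dtypes = df.index.dtypes
print(level_dtypes)

# Scalar type of each level, comparable to type(...values[0]) in the question.
scalar_types = {name: dtype.type for name, dtype in level_dtypes.items()}
print(scalar_types)
```

On older pandas versions this property may not be available, in which case the index.levels approach still applies.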
index.levels is a FrozenList of one-dimensional Index objects:
In [172]: list(index.levels)
Out[172]:
[Int64Index([1, 2], dtype='int64', name='id'),
 Int64Index([1, 2], dtype='int64', name='time')]
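To illustrate why reading from index.levels suits a large df, here is a rough sketch on a hypothetical million-row frame (the name big and the sizes are made up for the example). Each level stores only the distinct labels, so inspecting its dtype does not require building a per-row array the way get_level_values does.

```python
import numpy as np
import pandas as pd

# Hypothetical larger frame: 1,000,000 rows, but only 1,000 ids and 1,000 times.
n_ids, n_times = 1_000, 1_000
index = pd.MultiIndex.from_product(
    [np.arange(n_ids), np.arange(n_times)], names=["id", "time"]
)
big = pd.DataFrame({"val": np.zeros(len(index))}, index=index)

# Each level holds only the distinct labels, not one entry per row.
level_sizes = [len(lev) for lev in big.index.levels]
print(len(big), level_sizes)

# The dtype lookup inspects two small Index objects rather than a million-row array.
level_types = [lev.dtype.type for lev in big.index.levels]
print(level_types)
```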