Question

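In a Jupyter notebook, I printed df.info(), and the index summary in the output read 20620 entries, 0 to 24867.

Why does it show 20620 entries with a range of 0 to 24867? I expected the last number (24867) to be 20620 or 20619.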

Answer 1

This means that not every possible index value is in use. For example,

In [13]: df = pd.DataFrame([10,20], index=[0,100])

In [14]: df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 0 to 100
Data columns (total 1 columns):
0    2 non-null int64
dtypes: int64(1)
memory usage: 32.0 bytes

df has 2 entries, but its Int64Index runs from 0 to 100.

A DataFrame can easily end up in this state if rows have been deleted, or if df is a subset of another DataFrame.
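
As a minimal sketch of how such gaps arise (the column name `value` and the filter conditions are made up for illustration), boolean filtering drops rows but leaves the surviving rows with their original labels:

```python
import pandas as pd

# Hypothetical data: five rows labelled 0..4 by the default index.
df = pd.DataFrame({"value": [5, 15, 25, 35, 45]})

# Filtering removes rows 0 and 2; the remaining rows keep labels 1, 3, 4.
subset = df[(df["value"] > 10) & (df["value"] != 25)]

n_entries = len(subset)          # number of rows left
first_label = subset.index[0]    # first label in the index
last_label = subset.index[-1]    # last label in the index
labels = subset.index.tolist()
```

Here subset has 3 entries but labels running from 1 to 4, which is the same kind of mismatch as 20620 entries with labels 0 to 24867.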

If you reset the index, the index labels are renumbered sequentially starting from 0:

In [17]: df.reset_index(drop=True)
Out[17]: 
    0
0  10
1  20

In [18]: df.reset_index(drop=True).info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 1 columns):
0    2 non-null int64
dtypes: int64(1)
memory usage: 96.0 bytes
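
One detail worth keeping in mind: reset_index returns a new DataFrame and leaves df untouched, so the result has to be assigned to a name if you want to keep it. A short sketch reusing the two-row frame from above (the names `renumbered` and `kept` are only illustrative):

```python
import pandas as pd

df = pd.DataFrame([10, 20], index=[0, 100])

# drop=True discards the old labels and renumbers from 0.
renumbered = df.reset_index(drop=True)

# Without drop=True the old labels are kept as a column named "index".
kept = df.reset_index()

old_labels = df.index.tolist()          # df itself is unchanged
new_labels = renumbered.index.tolist()
kept_columns = kept.columns.tolist()
```

If the old labels still carry meaning (for example, row numbers in the original data), the second form preserves them as an ordinary column.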

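More precisely, as Chris points out, the line

Int64Index: 2 entries, 0 to 100

reports only the first and last values in the Int64Index. It does not report the minimum or maximum. The index can contain higher or lower integers in between: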

In [32]: pd.DataFrame([10,20,30], index=[50,0,50]).info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 3 entries, 50 to 50  # notice index value 0 is not mentioned
Data columns (total 1 columns):
0    3 non-null int64
dtypes: int64(1)
memory usage: 48.0 bytes
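
To see the actual extremes rather than the first and last labels, you can query the index directly. A small sketch using the same three-row frame (the variable names are only illustrative):

```python
import pandas as pd

df = pd.DataFrame([10, 20, 30], index=[50, 0, 50])

# First and last labels: the two values summarised by info().
first, last = df.index[0], df.index[-1]

# True smallest and largest labels in the index.
lowest, highest = df.index.min(), df.index.max()

# Quick checks for duplicated or unsorted labels.
is_unique = df.index.is_unique
is_sorted = df.index.is_monotonic_increasing
```

For this frame the first and last labels are both 50, while the smallest label is 0, and the index is neither unique nor sorted.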
