Question

dtype changes from int32 to float after creating a new DataFrame with pandas pivot_table

The original DataFrame:

df = pd.DataFrame.from_dict(my_dict, orient='columns', dtype='i4')
print(df.head(11))

Output:

         clock   eventid         ns  objectid  value
0   1505960158  62704261  327504323     32219      1
1   1505962773  62711138   22192905     32219      0
2   1505400465  61216428  123915259     32233      1
3   1504642494  59208977  369082011     32254      1
4   1504643325  59210478  576875730     32254      0
5   1504642494  59208978  369082011     32260      1
6   1504643325  59210479  576875730     32260      0
7   1504224224  58101461  445846619     13479      0
8   1504258784  58187457  204908064     13479      1
9   1504310624  58318750  443786274     13479      0
10  1504517992  58886060  746243067     13479      1

print(df.dtypes)

Output:

clock       int32
eventid     int32
ns          int32
objectid    int32
value       int32
dtype: object

I then use pivot_table:

p = df.reset_index().pivot_table(index="objectid", columns="value", values="clock", fill_value=0).iloc[:, ::-1]
print(p)

Output:

value              1             0
objectid                          
13479     1505534184  1.505467e+09
13485     1505676014  1.505677e+09
32219     1505960158  1.505963e+09
32233     1505400465  0.000000e+00
32254     1504642494  1.504643e+09
32260     1504642494  1.504643e+09
print(p.dtypes)

Output:

value
1      int64
0    float64
dtype: object

Why does column 0 become float? How can I avoid this?

Answer 1

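Your sample data may not show it, but the result of your pivot operation probably contains NaN values, which are of type float, so pandas automatically upcasts the rest of the column to float as well to keep computation efficient. Note that the NaNs are filled with zeros (fill_value=0), so you cannot see them.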

For example, there is no row with objectid = 32233 and value = 0, so the corresponding entry in your pivot result is NaN, which is then filled by fill_value=0.
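The missing cell can be made visible by leaving out fill_value. The following is a minimal sketch on a hypothetical three-row subset of the question's data (the subset is an assumption chosen for illustration, not the asker's full frame):

```python
import pandas as pd

# Hypothetical subset: objectid 32233 has a row for value 1 but none for value 0.
df = pd.DataFrame(
    {
        "objectid": [32219, 32219, 32233],
        "value": [1, 0, 1],
        "clock": [1505960158, 1505962773, 1505400465],
    },
    dtype="int32",
)

# Without fill_value, the (32233, 0) cell has no source row and is left as NaN.
raw = df.pivot_table(index="objectid", columns="value", values="clock")
print(raw)

# NaN cannot be stored in a NumPy integer column, so column 0 is float64.
print(raw[0].dtype)
print(raw.isna())
```

Passing fill_value=0 replaces the NaN afterwards, but it does not necessarily restore the original integer dtype.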

Now that it is clear why the column is upcast, you can reset the data type with astype, casting the result back to an integer type.
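A minimal sketch of the cast, again on a hypothetical three-row subset of the question's data. The target type int32 is an assumption chosen to match the original frame; the cast is only safe once fill_value=0 has removed every NaN.

```python
import pandas as pd

# Hypothetical subset: objectid 32233 has no row with value 0.
df = pd.DataFrame(
    {
        "objectid": [32219, 32219, 32233],
        "value": [1, 0, 1],
        "clock": [1505960158, 1505962773, 1505400465],
    },
    dtype="int32",
)

# fill_value=0 replaces the missing (32233, 0) cell with 0.
p = df.pivot_table(index="objectid", columns="value", values="clock", fill_value=0)

# Cast every column back to int32 (an assumed target type).
p_int = p.astype("int32")
print(p_int)
print(p_int.dtypes)
```

One caveat: pivot_table aggregates with the mean by default, so a group with several rows can yield a fractional average, and astype to an integer type then silently drops the fractional part. In this sketch every group has a single row, so no information is lost.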
