Question

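I am running into a problem when trying to add a column to a pandas.DataFrame.

I am trying to add a column holding the "translation" from each index to its name (the translation has already been imported from a CSV). But when I add it, the column shows the data in the same order as in the CSV instead of following the indexes I give it.

The strange thing is that if I print the data I am trying to put into the column on its own, it is displayed correctly.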

count = pd.value_counts(y_train, sort=True, ascending=True)
table_index = np.arange(n_classes)

result = pd.DataFrame()
result['SignIndex'] = count.index
result['Counts'] = count.values
column = list_signals['SignName'][count.index]
result['Signal'] = column

result.head()

This is what it prints:

    SignIndex   Counts  Signal
0   0           180     Speed limit (20km/h)
1   19          180     Speed limit (30km/h)
2   37          180     Speed limit (50km/h)
3   27          210     Speed limit (60km/h)
4   41          210     Speed limit (70km/h)

But if you print the variable column by itself, it prints:

column


SignIndex
0                                  Speed limit (20km/h)
19                          Dangerous curve to the left
37                                  Go straight or left
27                                          Pedestrians
41                                    End of no passing
42    End of no passing by vehicles over 3.5 metric ...

Does anyone know why this happens?

Thanks in advance!

Answer 1

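As @BrenBarn said, if you assign a Series as a column of a DataFrame, the Series' index is matched against the DataFrame's index: values are placed by index label, not by position.

So to solve my problem, this is what I ended up doing: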
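The alignment behaviour can be reproduced with a small sketch. The names `names`, `order`, `picked` and the sign labels below are made up for illustration and are not from the original post; the point is only that assigning a Series matches on index labels, while assigning a plain array keeps the positional order.

```python
import pandas as pd

# Hypothetical lookup table (class id -> sign name); values are invented.
names = pd.Series(["sign_a", "sign_b", "sign_c"], index=[0, 1, 2], name="SignName")

# Class ids in an arbitrary, unsorted order (as value_counts might return them).
order = [2, 0, 1]

# The DataFrame gets the default index 0, 1, 2.
result = pd.DataFrame({"SignIndex": order})

# Label-based selection: values come out as c, a, b with index labels 2, 0, 1.
picked = names.loc[order]

# Assigning a Series aligns on index labels, so row 0 receives the value
# labelled 0, row 1 the value labelled 1, and so on.
result["Aligned"] = picked

# Assigning a plain NumPy array skips alignment and keeps positional order.
result["Positional"] = picked.to_numpy()

print(result)
```

In this sketch the `Aligned` column ends up in lookup-table order, which is the symptom described in the question, while `Positional` follows `SignIndex`.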

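# Sort the counts by class id so count.index lines up with the default 0..n-1 index
count = pd.value_counts(y_train).sort_index()

result = pd.DataFrame()
result['SignIndex'] = count.index
result['Counts'] = count.values
# This Series is aligned on index labels when it is assigned as a column
result['Signal'] = list_signals['SignName'][count.index]

# Only now sort by count, then renumber the rows
result = result.sort_values(by=['Counts'], ascending=True)
result = result.reset_index(drop=True)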

result.head()

This is the printed result:

    SignIndex   Counts  Signal
0   0           180     Speed limit (20km/h)
1   37          180     Go straight or left
2   19          180     Dangerous curve to the left
3   32          210     End of all speed and passing limits
4   27          210     Pedestrians
5   41          210     End of no passing
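As a possible alternative to sorting twice, the names can be looked up per row with `Series.map`, which uses the values of `SignIndex` as keys into the lookup Series. This is only a sketch: the `y_train` and `list_signals` contents below are invented stand-ins for the data in the question.

```python
import pandas as pd

# Hypothetical stand-ins for the question's y_train and list_signals.
y_train = [0, 0, 0, 1, 2, 2]
list_signals = pd.DataFrame({"SignName": ["sign_a", "sign_b", "sign_c"]})

# Count occurrences per class id, least frequent first.
count = pd.Series(y_train).value_counts(ascending=True)

result = pd.DataFrame({
    "SignIndex": count.index,
    "Counts": count.to_numpy(),
})

# map looks up each SignIndex value by label in the SignName Series,
# so the row order of result does not matter.
result["Signal"] = result["SignIndex"].map(list_signals["SignName"])

print(result)
```

Because the lookup is done per value, the result stays correct however the rows are ordered.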

Thank you all for your help!

IPython Notebook - pandas.DataFrame (columns are sorted automatically when that is not wanted)
