How to use tabulate with float64 values: Python tabulate library

Date: 2016-12-20 11:01:40

Tags: python kaggle

To print data, I am using the tabulate library in Python. Here is the code I am using:

import matplotlib.pyplot as plt
import pandas as pd
from tabulate import tabulate

train = pd.read_csv('../misc/data/train.csv')
test = pd.read_csv('../misc/data/test.csv')

# Prints the head of data prettily :)
print(tabulate(train.head(), headers='keys', tablefmt='psql'))

The data is a huge dataset from Kaggle. Now I need to use tabulate on data with float64 values. The following code gives an error:

surv_age = train[train['Survived'] == 1]['Age'].value_counts()
dead_age = train[train['Survived'] == 0]['Age'].value_counts()

print(tabulate(surv_age, headers='keys', tablefmt='psql'))

df = pd.DataFrame([surv_age, dead_age])
df.index = ['Survived', 'Dead']
df.plot(kind='hist', stacked=True, figsize=(15, 8))
plt.xlabel('Age')
plt.ylabel('Number of passengers')
plt.show()

The error is:

Traceback (most recent call last):
  File "main.py", line 49, in <module>
    print(tabulate(surv_age, headers='keys', tablefmt='psql'))
  File "/usr/local/lib/python2.7/dist-packages/tabulate.py", line 1109, in tabulate
    tabular_data, headers, showindex=showindex)
  File "/usr/local/lib/python2.7/dist-packages/tabulate.py", line 741, in _normalize_tabular_data
    rows = [list(row) for row in vals]
TypeError: 'numpy.float64' object is not iterable

Line 49 is the print(tabulate(... line in the code.

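How can I iterate over the float64 values so that I can print them in a table? If this is not possible with tabulate, please suggest another way to pretty-print them. Here is a sample of what tabulate can do: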

+----+---------------+------------+----------+-----------------------------------------------------+--------+-------+---------+---------+------------------+---------+---------+------------+
|    |   PassengerId |   Survived |   Pclass | Name                                                | Sex    |   Age |   SibSp |   Parch | Ticket           |    Fare | Cabin   | Embarked   |
|----+---------------+------------+----------+-----------------------------------------------------+--------+-------+---------+---------+------------------+---------+---------+------------|
|  0 |             1 |          0 |        3 | Braund, Mr. Owen Harris                             | male   |    22 |       1 |       0 | A/5 21171        |  7.25   | nan     | S          |
|  1 |             2 |          1 |        1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female |    38 |       1 |       0 | PC 17599         | 71.2833 | C85     | C          |
|  2 |             3 |          1 |        3 | Heikkinen, Miss. Laina                              | female |    26 |       0 |       0 | STON/O2. 3101282 |  7.925  | nan     | S          |
|  3 |             4 |          1 |        1 | Futrelle, Mrs. Jacques Heath (Lily May Peel)        | female |    35 |       1 |       0 | 113803           | 53.1    | C123    | S          |
|  4 |             5 |          0 |        3 | Allen, Mr. William Henry                            | male   |    35 |       0 |       0 | 373450           |  8.05   | nan     | S          |
+----+---------------+------------+----------+-----------------------------------------------------+--------+-------+---------+---------+------------------+---------+---------+------------+

1 answer:

Answer 0 (score: 1)

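Quoting the tabulate documentation:

The following tabular data types are supported:

  • list of lists or another iterable of iterables
  • list or another iterable of dicts (keys as columns)
  • dict of iterables (keys as columns)
  • two-dimensional NumPy array
  • NumPy record arrays (names as columns)
  • pandas.DataFrame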

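Your variable surv_age is one-dimensional (a pandas Series holding a 1-D array of shape (342,)), so each of its elements is a single scalar rather than an iterable row. You will need to reshape it into a 2-D NumPy array. You can do this easily with numpy.reshape:

surv_age = np.reshape(surv_age, (-1, 1))

You can also do this with np.expand_dims:

surv_age = np.expand_dims(surv_age, axis=1)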