TypeError when creating a pandas DataFrame

Time: 2016-06-16 15:32:59

Tags: python arrays numpy pandas

I wrote the following Python code using the pandas package.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from pandas import Series

csv = pd.read_csv('train.csv')
df_csv = pd.DataFrame(csv)

PassengerId = np.array(df_csv['PassengerId'])
Age = np.array(df_csv['Age'])
Pclass = np.array(df_csv['Pclass'])
Sex = np.array(df_csv['Sex'])

i = 0
while i < 891:
    if Sex[i] == 'male':
        Sex[i] = 0
        i = i + 1
    else:
        Sex[i] = 1
        i = i + 1
Sex = np.array(Sex)
new_df = pd.DataFrame[
    'PassengerId': Series(PassengerId),
    'Age': Series(Age),
    'Pclass': Series(Pclass),
    'Sex': Series(Sex)
]

print(new_df)

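I am trying to build a DataFrame by reading a CSV file, storing several columns as NumPy arrays, and then replacing the values in one of the arrays. When I combine the arrays back into a DataFrame, I get the following error: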

D:\Projects\Titanic>python python.py
Traceback (most recent call last):
  File "python.py", line 27, in <module>
    'Sex': Sex
TypeError: 'type' object is not subscriptable

Please help me. Thanks in advance.

1 answer:

Answer 0 (score: 0)

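Square brackets subscript the DataFrame class itself instead of calling it, which is what raises the TypeError. Call the constructor with parentheses and pass it a dict. Try replacing

new_df = pd.DataFrame[
  'PassengerId': Series(PassengerId),
  'Age': Series(Age),
  'Pclass': Series(Pclass),
  'Sex': Series(Sex)
]

with
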
new_df = pd.DataFrame({
  'PassengerId': Series(PassengerId),
  'Age': Series(Age),
  'Pclass': Series(Pclass),
  'Sex': Series(Sex)
})
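
For reference, here is a minimal sketch of the corrected construction that runs without train.csv. The three-row frame is a made-up stand-in for the real file, and using Series.map in place of the while loop is my own substitution, not something from the question. One difference to be aware of: map leaves any value that is missing from the mapping as NaN, whereas the original loop turns everything that is not 'male' into 1.

```python
import pandas as pd

# Hypothetical stand-in for train.csv (values invented for illustration).
df_csv = pd.DataFrame({
    'PassengerId': [1, 2, 3],
    'Age': [22.0, 38.0, 26.0],
    'Pclass': [3, 1, 3],
    'Sex': ['male', 'female', 'female'],
})

# Encode the labels in one vectorized step instead of looping over indices.
sex_encoded = df_csv['Sex'].map({'male': 0, 'female': 1})

# Call the constructor with parentheses and pass a dict of columns.
new_df = pd.DataFrame({
    'PassengerId': df_csv['PassengerId'],
    'Age': df_csv['Age'],
    'Pclass': df_csv['Pclass'],
    'Sex': sex_encoded,
})

print(new_df)
```

Since the columns already live in df_csv, there is no need to round-trip them through NumPy arrays and Series; the dict values can be the original columns directly.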