for i in [train1,test1]:
    df_dummies = pd.get_dummies(i['Name'], prefix='Name',dummy_na=False)
    #print(df_dummies.head())
    #i.drop('Name',1,inplace=True)
    i = pd.concat([i,df_dummies],axis=1)
    print(i.head())
Output:
   PassengerId  Pclass  Name  Sex   Age  SibSp  Parch   Ticket     Fare  \
0          892       3   Mr.    1  34.5      0      0   330911   7.8292
1          893       3  Mrs.    0  47.0      1      0   363272   7.0000
2          894       2   Mr.    1  62.0      0      0   240276   9.6875
3          895       3   Mr.    1  27.0      0      0   315154   8.6625
4          896       3  Mrs.    0  22.0      1      1  3101298  12.2875

   Embarked  Name_Dr.  Name_Master.  Name_Miss.  Name_Mr.  Name_Mrs.  \
0         2         0             0           0         1          0
1         0         0             0           0         0          1
2         2         0             0           0         1          0
3         0         0             0           0         1          0
4         0         0             0           0         0          1

   Name_Rev.  Name_other
0          0           0
1          0           0
2          0           0
3          0           0
4          0           0
But when I check again outside the for loop, the dummy variables are not there:
print(test1.head())
Output:
   PassengerId  Pclass  Name  Sex   Age  SibSp  Parch   Ticket     Fare  \
0          892       3   Mr.    1  34.5      0      0   330911   7.8292
1          893       3  Mrs.    0  47.0      1      0   363272   7.0000
2          894       2   Mr.    1  62.0      0      0   240276   9.6875
3          895       3   Mr.    1  27.0      0      0   315154   8.6625
4          896       3  Mrs.    0  22.0      1      1  3101298  12.2875

   Embarked
0         2
1         0
2         2
3         0
4         0
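Clearly I am missing something here. Please help me find the error; I think it has to do with copies of, or references to, the DataFrame.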
Answer 0 (score: 0)
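I think you want to assign the results back into a list of DataFrames. Your solution does not work because concat returns a new DataFrame: the assignment inside the loop only rebinds the loop variable i to that new object and leaves train1 and test1 unchanged.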
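L = [train1,test1]
for i, df in enumerate(L):
    df_dummies = pd.get_dummies(df['Name'], prefix='Name',dummy_na=False)
    #print(df_dummies.head())
    #df.drop('Name',axis=1,inplace=True)
    # store the new DataFrame returned by concat back into the list
    L[i] = pd.concat([df,df_dummies],axis=1)
# rebind the original names if you still need them
train1, test1 = L
print (L[0])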
   PassengerId  Pclass  Name  Sex   Age  SibSp  Parch   Ticket     Fare  \
0          892       3   Mr.    1  34.5      0      0   330911   7.8292
1          893       3  Mrs.    0  47.0      1      0   363272   7.0000
2          894       2   Mr.    1  62.0      0      0   240276   9.6875
3          895       3   Mr.    1  27.0      0      0   315154   8.6625
4          896       3  Mrs.    0  22.0      1      1  3101298  12.2875

   Name_Mr.  Name_Mrs.
0         1          0
1         0          1
2         1          0
3         1          0
4         0          1
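To make the rebinding explanation above concrete, here is a minimal sketch that contrasts the two loops. The two small DataFrames are hypothetical stand-ins for train1 and test1 (not the real Titanic data), and the variable names frame, frames, and pos are chosen only for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for train1 and test1 (assumption: not the real data).
train1 = pd.DataFrame({"PassengerId": [1, 2], "Name": ["Mr.", "Mrs."]})
test1 = pd.DataFrame({"PassengerId": [892, 893], "Name": ["Mr.", "Mrs."]})

# Version from the question: `frame = ...` only rebinds the loop variable
# to the new object returned by concat; the name test1 still refers to
# the original, unmodified DataFrame.
for frame in [train1, test1]:
    dummies = pd.get_dummies(frame["Name"], prefix="Name")
    frame = pd.concat([frame, dummies], axis=1)

cols_after_rebinding = list(test1.columns)

# Version from the answer: write each result into the list slot,
# then unpack the list back into the original names.
frames = [train1, test1]
for pos, frame in enumerate(frames):
    dummies = pd.get_dummies(frame["Name"], prefix="Name")
    frames[pos] = pd.concat([frame, dummies], axis=1)
train1, test1 = frames

cols_after_assignment = list(test1.columns)

print(cols_after_rebinding)
print(cols_after_assignment)
```

The first printed list should contain only the original columns, while the second should also include the Name_ dummy columns, because only the second loop keeps a reference to the objects that concat created.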
Answer 1 (score: 0)
You can use a list comprehension instead.
df_list = [pd.concat([x, pd.get_dummies(x['Name'], prefix='Name', dummy_na=False)], axis=1)
           for x in [train1, test1]]
df_list[0]
   PassengerId  Pclass  Name  Sex   Age  SibSp  Parch   Ticket     Fare  \
0          892       3   Mr.    1  34.5      0      0   330911   7.8292
1          893       3  Mrs.    0  47.0      1      0   363272   7.0000
2          894       2   Mr.    1  62.0      0      0   240276   9.6875
3          895       3   Mr.    1  27.0      0      0   315154   8.6625
4          896       3  Mrs.    0  22.0      1      1  3101298  12.2875

   Embarked  Name_Mr.  Name_Mrs.
0         2         1          0
1         0         0          1
2         2         1          0
3         0         1          0
4         0         0          1
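The list comprehension above builds new DataFrames but leaves the names train1 and test1 pointing at the originals. One possible way to finish the job is to unpack the result straight back into those names. The sketch below assumes small hypothetical frames in place of the real data, and add_name_dummies is a helper name invented here for readability.

```python
import pandas as pd


def add_name_dummies(df):
    """Return a new DataFrame with one-hot columns for 'Name' appended."""
    dummies = pd.get_dummies(df["Name"], prefix="Name", dummy_na=False)
    return pd.concat([df, dummies], axis=1)


# Hypothetical stand-ins for train1 and test1 (assumption: not the real data).
train1 = pd.DataFrame({"PassengerId": [1, 2, 3], "Name": ["Mr.", "Miss.", "Mr."]})
test1 = pd.DataFrame({"PassengerId": [892, 893], "Name": ["Mr.", "Mrs."]})

# Unpack the comprehension's results back into the original names.
train1, test1 = [add_name_dummies(df) for df in (train1, test1)]

print(train1.columns.tolist())
print(test1.columns.tolist())
```

Each frame only gets dummy columns for the titles it actually contains, so the two results can end up with different columns, as the outputs earlier in this thread also show.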