I want to create an ARFF file from a Kaggle CSV file:
https://www.kaggle.com/c/titanic/download/train.csv
Here is part of the ARFF file I made:
@relation titanic
@attribute PassengerId numeric
@attribute Survived {0,1}
@attribute Pclass {1,2,3}
@attribute Name string
@attribute Sex {male,female}
@attribute Age numeric
@attribute SibSp numeric
@attribute Parch numeric
@attribute Ticket string
@attribute Fare numeric
@attribute Cabin string
@attribute Embarked {C,Q,S}
@data
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S
But when I load it in Weka, it returns this error:
nominal value not declared in header, read Token[C85], line 18 % the second line of my data
What is wrong with my declaration?
Answer 0 (score: 0)
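The problem is that the name "Cumings, Mrs. John Bradley (Florence Briggs Thayer)" contains a comma. Despite the double quotes, Weka parses it as two fields.
You could try removing such commas (that is, commas inside double quotes) with a regular expression.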
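
A minimal sketch of that regex approach, written in Python (the language and the helper name `strip_quoted_commas` are assumptions, not part of the original answer). It removes commas only inside double-quoted fields and leaves the separating commas alone. It assumes the quotes on each line are balanced.

```python
import re

# Matches one double-quoted field, e.g. "Braund, Mr. Owen Harris".
# Assumption: quotes on each line are balanced.
QUOTED_FIELD = re.compile(r'"[^"]*"')


def strip_quoted_commas(line: str) -> str:
    """Remove commas inside double-quoted fields; keep separator commas."""
    return QUOTED_FIELD.sub(lambda m: m.group(0).replace(",", ""), line)


row = ('2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",'
       'female,38,1,0,PC 17599,71.2833,C85,C')
print(strip_quoted_commas(row))
# 2,1,1,"Cumings Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
```

To clean the whole file, apply the function to every line after `@data`. If the error persists, two rules of the ARFF format may also be worth checking: missing values are written as `?` rather than left empty, and values containing spaces (such as `PC 17599`) must be quoted.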