How can I apply this class to a data file that contains multiple records?
class Iris(val sepal_len:Double,val sepal_width:Double,val petal_len:Double,
           val petal_width:Double,var sepal_area:Double,val species:String){
  require(sepal_area == sepal_len*sepal_width, "wrong values")
  def this(sepal_len:Double,
           sepal_width:Double,
           petal_len:Double,
           petal_width:Double,
           species:String
          ) = {
    this(sepal_len,sepal_width,petal_len,petal_width,sepal_len * sepal_width,species)
  }
  override def toString:String = "Iris("+sepal_len+","+sepal_width+","+petal_len+","+petal_width+
    ","+sepal_area+","+species + ")"
}
val ir = new Iris(1.2,3.4,4.5,5.0,4.08,"setosa")
// output => ir: Iris = Iris(1.2,3.4,4.5,5.0,4.08,setosa)
val ir1 = new Iris(1.2,3.4,4.5,5.0,"setosa")
// output => ir1: Iris = Iris(1.2,3.4,4.5,5.0,4.08,setosa)
Please give me some ideas.
Answer 0: (score: 1)
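Based on your example, I assume sepal_area is not a required field in your class, and I have answered accordingly. If it is required, the code is easy to change.
In my example answer I store a collection of Iris. You could also create a wrapper case class like this: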
case class Irises(irises: Seq[Iris])
Example CSV file:
1,5.1,3.5,1.4,5.1,Iris-setosa
2,4.9,3,1.4,9.8,Iris-setosa
3,4.7,3.2,1.3,14.1,Iris-setosa
4,4.6,3.1,1.5,Iris-setosa
5,5,3.6,1.4,25,Iris-setosa
6,5.4,3.9,1.7,32.4,Iris-setosa
Code:
object Demo extends App {

  case class Iris(sepal_len: Double, sepal_width: Double, petal_len: Double,
                  petal_width: Double, var sepal_area: Double = 0, species: String) {
    // sepal_area defaults to 0, which is treated as "not supplied" when constructing the class
    sepal_area = if (sepal_area == 0) sepal_len * sepal_width else sepal_area

    // compare with a small tolerance, because floating-point products are inexact
    // (had weird thing where 4.7 * 3 = 14.100000000000001)
    require(math.abs(sepal_area - sepal_len * sepal_width) < 1e-9, "wrong values")

    // using string interpolation looks much more readable than concatenation
    override def toString: String = s"Iris($sepal_len,$sepal_width,$petal_len,$petal_width,$sepal_area,$species)"
  }

  // file.csv found in {project-name}/file.csv
  val bufferedSource = scala.io.Source.fromFile("file.csv")

  // toList reads every line eagerly, so nothing is left unread when the source is closed
  val seq = bufferedSource.getLines().map { line =>
    // for each line in csv file, split by comma and remove all whitespace around each part
    val cols = line.split(",").map(_.trim)
    // define parts
    val sepal_len = cols.head.toDouble
    val sepal_width = cols(1).toDouble
    val petal_len = cols(2).toDouble
    val petal_width = cols(3).toDouble
    val species = if (cols.length == 6) cols(5) else cols(4)
    // if sepal_area is defined
    if (cols.length == 6) Iris(sepal_len, sepal_width, petal_len, petal_width, cols(4).toDouble, species)
    // if sepal_area is not defined
    else Iris(sepal_len, sepal_width, petal_len, petal_width, species = species)
  }.toList

  seq.foreach(println)
  // Iris(1.0,5.1,3.5,1.4,5.1,Iris-setosa)
  // Iris(2.0,4.9,3.0,1.4,9.8,Iris-setosa)
  // Iris(3.0,4.7,3.2,1.3,14.1,Iris-setosa)
  // Iris(4.0,4.6,3.1,1.5,18.4,Iris-setosa)
  // Iris(5.0,5.0,3.6,1.4,25.0,Iris-setosa)
  // Iris(6.0,5.4,3.9,1.7,32.4,Iris-setosa)

  val newSeq = seq.toSeq
  newSeq.foreach(println)

  // close the source once you've finished with it
  bufferedSource.close()
}
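As a possible extension of the code above, here is a minimal sketch that reports malformed lines instead of throwing on the first one. The IrisParsing object, the parseLine helper, the camelCase field names and the 1e-9 tolerance are all assumptions made for illustration and are not part of the answer itself. It parses in-memory strings so that it runs without a file.

```scala
import scala.util.Try

object IrisParsing {
  // Simplified stand-in for the Iris class above (immutable, hypothetical field names).
  final case class Iris(sepalLen: Double, sepalWidth: Double, petalLen: Double,
                        petalWidth: Double, sepalArea: Double, species: String)

  // Hypothetical helper: Right(iris) for a well-formed line, Left(message) otherwise.
  def parseLine(line: String): Either[String, Iris] = {
    val cols = line.split(",").map(_.trim)
    cols.length match {
      case 5 | 6 =>
        Try {
          val nums = cols.take(4).map(_.toDouble)
          val computed = nums(0) * nums(1)
          // Use the area from the file when it is present, otherwise compute it.
          val area = if (cols.length == 6) cols(4).toDouble else computed
          require(math.abs(area - computed) < 1e-9, "wrong values")
          Iris(nums(0), nums(1), nums(2), nums(3), area, cols.last)
        }.toEither.left.map(e => s"bad line '$line': ${e.getMessage}")
      case n =>
        Left(s"bad line '$line': expected 5 or 6 columns, got $n")
    }
  }

  def main(args: Array[String]): Unit = {
    val lines = List(
      "1,5.1,3.5,1.4,5.1,Iris-setosa",
      "4,4.6,3.1,1.5,Iris-setosa",
      "not,a,valid,row"
    )
    val results = lines.map(parseLine)
    // Print the records that parsed, then the problems that were found.
    results.collect { case Right(iris) => iris }.foreach(println)
    results.collect { case Left(msg) => msg }.foreach(println)
  }
}
```

Returning an Either keeps the decision about bad rows (skip, log, or abort) with the caller rather than inside the parsing code.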
Answer 1: (score: 0)
Use foreach and build a collection of objects, one per line. A List, perhaps.
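A rough sketch of what that suggestion might look like, assuming every line has exactly five columns (four measurements followed by the species). The ForeachDemo object, the load method and the simplified Iris class are hypothetical names used only for this illustration.

```scala
import scala.collection.mutable.ListBuffer

object ForeachDemo {
  // Simplified stand-in for the question's class; sepal_area is always computed here.
  final case class Iris(sepal_len: Double, sepal_width: Double, petal_len: Double,
                        petal_width: Double, sepal_area: Double, species: String)

  // Walk the lines with foreach and append one Iris per line to a buffer.
  def load(lines: Iterator[String]): List[Iris] = {
    val buffer = ListBuffer.empty[Iris]
    lines.foreach { line =>
      val c = line.split(",").map(_.trim)
      val len = c(0).toDouble
      val width = c(1).toDouble
      buffer += Iris(len, width, c(2).toDouble, c(3).toDouble, len * width, c(4))
    }
    buffer.toList
  }

  def main(args: Array[String]): Unit = {
    val irises = load(Iterator("1.2,3.4,4.5,5.0,setosa", "4,4.6,3.1,1.5,Iris-setosa"))
    irises.foreach(println)
  }
}
```

The same result can be had without a mutable buffer by calling map on the lines and then toList, which is the approach Answer 0 takes.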
Answer 2: (score: 0)
I suggest using a case class for this kind of data structure, and a companion object to overload the constructor.
case class Iris(sepal_len: Double, sepal_width: Double, petal_len: Double, petal_width: Double, sepal_area: Double, species: String) {
  require(sepal_area == sepal_len * sepal_width, "wrong values")
}
object Iris {
  def apply(sepal_len: Double, sepal_width: Double, petal_len: Double, petal_width: Double, species: String) =
    new Iris(sepal_len, sepal_width, petal_len, petal_width, sepal_len * sepal_width, species)
}
val ir = Iris(1.2, 3.4, 4.5, 5.0, 4.08, "setosa")
val ir1 = Iris(1.2, 3.4, 4.5, 5.0, "setosa")
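To connect this back to the question about files with many records, here is one way the two apply methods might be used on CSV lines. This is a sketch only: the CompanionDemo wrapper and the fromLine helper are hypothetical, and the exact == check is replaced by a 1e-9 tolerance, because an exact comparison would reject a row such as 3,4.7,...,14.1 (3 * 4.7 evaluates to 14.100000000000001).

```scala
object CompanionDemo {
  case class Iris(sepal_len: Double, sepal_width: Double, petal_len: Double,
                  petal_width: Double, sepal_area: Double, species: String) {
    // Tolerance instead of ==, since floating-point products are inexact.
    require(math.abs(sepal_area - sepal_len * sepal_width) < 1e-9, "wrong values")
  }

  object Iris {
    // Overload for records that do not carry sepal_area.
    def apply(sepal_len: Double, sepal_width: Double, petal_len: Double,
              petal_width: Double, species: String): Iris =
      new Iris(sepal_len, sepal_width, petal_len, petal_width, sepal_len * sepal_width, species)
  }

  // Hypothetical helper: pick the apply overload from the number of columns.
  def fromLine(line: String): Iris =
    line.split(",").map(_.trim) match {
      case Array(sl, sw, pl, pw, area, species) =>
        Iris(sl.toDouble, sw.toDouble, pl.toDouble, pw.toDouble, area.toDouble, species)
      case Array(sl, sw, pl, pw, species) =>
        Iris(sl.toDouble, sw.toDouble, pl.toDouble, pw.toDouble, species)
      case other =>
        throw new IllegalArgumentException(s"expected 5 or 6 columns, got ${other.length}")
    }

  def main(args: Array[String]): Unit = {
    val lines = List("3,4.7,3.2,1.3,14.1,Iris-setosa", "4,4.6,3.1,1.5,Iris-setosa")
    lines.map(fromLine).foreach(println)
  }
}
```

Matching on the shape of the split array keeps the five-column and six-column cases side by side, so each one calls the overload written for it.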