I'm still new to Scala. I'm trying to compute row-wise percentages in Scala. Consider the following df:
val df = Seq(("word1", 25, 75),("word2", 15, 15),("word3", 10, 30)).toDF("word", "author1", "author2")
df.show
+-----+-------+-------+
| word|author1|author2|
+-----+-------+-------+
|word1|     25|     75|
|word2|     15|     15|
|word3|     10|     30|
+-----+-------+-------+
I know I can use the code below and get the expected output, but I'd like to know whether there is a better way to do this:
val df_2 = df
  .withColumn("total", $"author1" + $"author2")
  .withColumn("author1 pct", $"author1" / $"total")
  .withColumn("author2 pct", $"author2" / $"total")
  .select("word", "author1 pct", "author2 pct")
df_2.show
+-----+-----------+-----------+
| word|author1 pct|author2 pct|
+-----+-----------+-----------+
|word1|       0.25|       0.75|
|word2|        0.5|        0.5|
|word3|       0.25|       0.75|
+-----+-----------+-----------+
Bonus points if the output is in percent format, with a "%" sign and no decimals. Thanks!
Answer 0 (score: 1)
Maybe you can compute and select the percentages directly instead of using .withColumn, and use concat to append a % sign at the end:
import org.apache.spark.sql.functions.{concat, lit}  // concat and lit come from here

val df2 = df.select(
  $"word",
  concat(($"author1" * 100 / ($"author1" + $"author2")).cast("int"), lit("%")).as("author1 pct"),
  concat(($"author2" * 100 / ($"author1" + $"author2")).cast("int"), lit("%")).as("author2 pct")
)
df2.show
+-----+-----------+-----------+
| word|author1 pct|author2 pct|
+-----+-----------+-----------+
|word1|        25%|        75%|
|word2|        50%|        50%|
|word3|        25%|        75%|
+-----+-----------+-----------+
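One thing to watch: cast("int") drops the fractional part rather than rounding. The sample data divides evenly, so it makes no difference here, but for other counts you may prefer to round first. A minimal sketch of that variant, assuming a local SparkSession; the one-row DataFrame and the names `sample`, `total` and `rounded` are made up for illustration:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{concat, lit, round}

// Local session for the sketch; in spark-shell, `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("pct-round").getOrCreate()
import spark.implicits._

// Hypothetical row where the percentages are not whole numbers (1/3 and 2/3).
val sample = Seq(("word1", 1, 2)).toDF("word", "author1", "author2")

// Row total, defined once and reused in both percentage expressions.
val total = $"author1" + $"author2"

val rounded = sample.select(
  $"word",
  // round(...) rounds half up to a whole number; the cast then only strips the ".0".
  concat(round($"author1" * 100 / total).cast("int"), lit("%")).as("author1 pct"),
  concat(round($"author2" * 100 / total).cast("int"), lit("%")).as("author2 pct")
)

rounded.show()
```

With this row the second column comes out as 67%, where a plain cast("int") would give 66%.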
If you want to keep a numeric data type, you can do this instead:
val df2 = df.select(
  $"word",
  ($"author1" * 100 / ($"author1" + $"author2")).cast("int").as("author1 pct"),
  ($"author2" * 100 / ($"author1" + $"author2")).cast("int").as("author2 pct")
)
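If there are more than two author columns, writing each expression out by hand gets repetitive. One possible way to generalize it (a sketch, not necessarily the best approach) is to build the total and the percentage columns from the list of column names. This assumes every column other than "word" is a numeric count; `authorCols`, `total`, `pctCols` and `dfPct` are names chosen here for illustration:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, concat, lit}

// Local session for the sketch; in spark-shell, `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("pct-general").getOrCreate()
import spark.implicits._

val df = Seq(("word1", 25, 75), ("word2", 15, 15), ("word3", 10, 30))
  .toDF("word", "author1", "author2")

// Assumption: every column except "word" holds a count.
val authorCols = df.columns.filter(_ != "word")

// Sum of all count columns, built as a single Column expression.
val total = authorCols.map(col).reduce(_ + _)

// One "<name> pct" column per count column, formatted like the answer above.
val pctCols = authorCols.map { c =>
  concat((col(c) * 100 / total).cast("int"), lit("%")).as(s"$c pct")
}

val dfPct = df.select(col("word") +: pctCols: _*)
dfPct.show()
```

For the df in the question this should produce the same table as the hand-written select, and adding an author3 column would not require touching the percentage code.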