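I need to find all the year-weeks between two given weeks.
201824 is an example of a year-week. It denotes week 24 of 2018.
Assuming a year has 52 weeks, the year-weeks of 2018 start at 201801 and end at 201852. After that, the sequence continues with 201901.
If the start week and end week fall in the same year, I can find the range of year-weeks between them: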
val range = udf((i: Int, j: Int) => (i to j).toArray)
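The code above only works when the start week and end week are in the same year, for example 201912 - 201917.
How can I make it work when the start week and end week belong to different years?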
Example: 201849 - 201903
The above weeks should give the output as:
201849,201850,201851,201852,201901,201902,201903
Answer 0 (score: 1)
There is still a lot of room for optimization, but as a general direction you can use the following.
I use org.joda.time.format here, but java.time should work as well.
import org.joda.time.Weeks
import org.joda.time.format.DateTimeFormat

def rangeOfYearWeeks(weeksRange: String): Array[String] = {
  try {
    val left = weeksRange.split("-")(0).trim
    val right = weeksRange.split("-")(1).trim
    // turn "201849" into "2018-49" so year and week can be parsed separately
    val leftPattern = s"${left.substring(0, 4)}-${left.substring(4)}"
    val rightPattern = s"${right.substring(0, 4)}-${right.substring(4)}"
    // xxxx = ISO week-based year, ww = zero-padded week of that year
    val fmt = DateTimeFormat.forPattern("xxxx-ww")
    val leftDate = fmt.parseDateTime(leftPattern)
    val rightDate = fmt.parseDateTime(rightPattern)
    // if leftDate is after rightDate, weeksBetween is negative and the result is empty
    val weeksBetween = Weeks.weeksBetween(leftDate, rightDate).getWeeks
    val dates = for (one <- 0 to weeksBetween) yield {
      leftDate.plusWeeks(one)
    }
    val result: Array[String] = dates.map(date => fmt.print(date)).map(_.replaceAll("-", "")).toArray
    result
  } catch {
    // malformed input yields an empty array
    case e: Exception => Array.empty[String]
  }
}
Example:
val dates = Seq("201849 - 201903", "201912 - 201917").toDF("col")
val weeks = udf((d: String) => rangeOfYearWeeks(d))
dates.select(weeks($"col")).show(false)
+--------------------------------------------------------+
|UDF(col)                                                |
+--------------------------------------------------------+
|[201849, 201850, 201851, 201852, 201901, 201902, 201903]|
|[201912, 201913, 201914, 201915, 201916, 201917]        |
+--------------------------------------------------------+
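The java.time alternative mentioned in this answer could be sketched roughly as follows. This is only an illustration, assuming ISO-8601 week numbering (weeks start on Monday) and Int inputs such as 201849; the object name YearWeeks and its helpers toMonday, toYearWeek and inclusiveRange are hypothetical names chosen for this sketch, not part of any library.

```scala
import java.time.LocalDate
import java.time.temporal.{ChronoField, IsoFields}

// Hypothetical helper object, for illustration only.
object YearWeeks {
  // Monday of the ISO week encoded in `yw` (e.g. 201849).
  // A mid-year date is used as the starting point because its
  // week-based year always equals its calendar year.
  def toMonday(yw: Int): LocalDate =
    LocalDate.of(yw / 100, 6, 15)
      .`with`(IsoFields.WEEK_OF_WEEK_BASED_YEAR, (yw % 100).toLong)
      .`with`(ChronoField.DAY_OF_WEEK, 1L)

  // Encode a date back into the yyyyww Int form.
  def toYearWeek(d: LocalDate): Int =
    d.get(IsoFields.WEEK_BASED_YEAR) * 100 + d.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR)

  // All year-weeks from `start` to `end`, both endpoints included.
  // Returns an empty list when `start` is after `end`.
  def inclusiveRange(start: Int, end: Int): List[Int] = {
    val last = toMonday(end)
    Iterator.iterate(toMonday(start))(_.plusWeeks(1))
      .takeWhile(!_.isAfter(last))
      .map(toYearWeek)
      .toList
  }
}
```

In Spark, such a function could be wrapped the same way as above, for instance udf((s: Int, e: Int) => YearWeeks.inclusiveRange(s, e)). Note that this sketch does not validate the week number against the actual length of the year.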
Answer 1 (score: 1)
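Here is a UDF-based solution using the java.time API. Note that it returns only the weeks strictly between the two endpoints, so the start week and end week themselves are excluded: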
def weeksBetween = udf{ (startWk: Int, endWk: Int) =>
  import java.time.LocalDate
  import java.time.format.DateTimeFormatter
  import scala.util.{Try, Success, Failure}

  def formatYW(yw: Int): String = {
    val pattern = "(\\d{4})(\\d+)".r
    s"$yw" match { case pattern(y, w) => s"$y-$w-1"}
  }

  val formatter = DateTimeFormatter.ofPattern("YYYY-w-e") // week-based year

  Try(
    Iterator.iterate(LocalDate.parse(formatYW(startWk), formatter))(_.plusWeeks(1)).
      takeWhile(_.isBefore(LocalDate.parse(formatYW(endWk), formatter))).
      map{ s =>
        val a = s.format(formatter).split("-")
        (a(0) + f"${a(1).toInt}%02d").toInt
      }.
      toList.tail
  ) match {
    case Success(ls) => ls
    case Failure(_) => List.empty[Int] // return an empty list
  }
}
Testing the UDF:
val df = Seq(
(1, 201849, 201903), (2, 201908, 201916), (3, 201950, 201955)
).toDF("id", "start_wk", "end_wk")
df.withColumn("weeks_between", weeksBetween($"start_wk", $"end_wk")).show(false)
// +---+--------+------+--------------------------------------------------------+
// |id |start_wk|end_wk|weeks_between |
// +---+--------+------+--------------------------------------------------------+
// |1 |201849 |201903|[201850, 201851, 201852, 201901, 201902] |
// |2 |201908 |201916|[201909, 201910, 201911, 201912, 201913, 201914, 201915]|
// |3 |201950 |201955|[] |
// +---+--------+------+--------------------------------------------------------+
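The third row comes back empty because 201955 is not a valid week. Since the question assumes 52 weeks per year while some years have 53, a small validity check may be useful before calling the UDF. The following is a minimal sketch, assuming ISO-8601 week numbering (which may differ from the locale-based numbering of the "YYYY-w-e" pattern above); WeekCheck, isoWeeksInYear and isValidYearWeek are hypothetical names introduced here.

```scala
import java.time.LocalDate
import java.time.temporal.IsoFields

// Hypothetical helper object, for illustration only.
object WeekCheck {
  // Number of ISO weeks (52 or 53) in the given week-based year.
  // The range of WEEK_OF_WEEK_BASED_YEAR is refined by the date it is
  // queried on, so its maximum is the week count of that date's year.
  def isoWeeksInYear(year: Int): Int =
    LocalDate.of(year, 6, 15)
      .range(IsoFields.WEEK_OF_WEEK_BASED_YEAR)
      .getMaximum
      .toInt

  // True when `yw` (e.g. 201852) names a week that exists in its year.
  def isValidYearWeek(yw: Int): Boolean = {
    val week = yw % 100
    week >= 1 && week <= isoWeeksInYear(yw / 100)
  }
}
```

Under these assumptions, WeekCheck.isValidYearWeek(201955) is false, while week 53 is accepted only for years that actually contain it, such as 2020.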