Given several sets of timestamped data, how do I merge them into one?
Suppose I have a dataset represented by the following data structure (Kotlin):
    data class Data(
        val ax: Double?,
        val ay: Double?,
        val az: Double?,
        val timestamp: Long
    )
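ax, ay, az: acceleration along each axis
timestamp: Unix timestamp
Now I am given three datasets: Ax, Ay, and Az. Each dataset has exactly two non-null fields: the timestamp and the acceleration along its own axis.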
Ax:
+-----+------+------+-----------+
| ax  | ay   | az   | timestamp |
+-----+------+------+-----------+
| 0.0 | null | null | 0         |
| 0.1 | null | null | 50        |
| 0.2 | null | null | 100       |
+-----+------+------+-----------+
Ay:
+------+-----+------+-----------+
| ax   | ay  | az   | timestamp |
+------+-----+------+-----------+
| null | 1.0 | null | 10        |
| null | 1.1 | null | 20        |
| null | 1.2 | null | 30        |
+------+-----+------+-----------+
Az:
+------+------+-----+-----------+
| ax   | ay   | az  | timestamp |
+------+------+-----+-----------+
| null | null | 2.0 | 20        |
| null | null | 2.1 | 40        |
| null | null | 2.2 | 60        |
+------+------+-----+-----------+
The algorithm should produce the following:
+------+------+------+-----------+
| ax   | ay   | az   | timestamp |
+------+------+------+-----------+
| 0.0  | null | null | 0         |
| 0.0  | 1.0  | null | 10        |
| 0.0  | 1.1  | 2.0  | 20        |
| 0.0  | 1.2  | 2.0  | 30        |
| 0.0  | 1.2  | 2.1  | 40        |
| 0.1  | 1.2  | 2.1  | 50        |
| 0.1  | 1.2  | 2.2  | 60        |
| 0.2  | 1.2  | 2.2  | 100       |
+------+------+------+-----------+
So, to merge the three datasets into one, I:
Put Ax, Ay, and Az into a single list:
    val united: MutableList<Data> = arrayListOf()
    united.addAll(Ax)
    united.addAll(Ay)
    united.addAll(Az)
Sort the resulting list by timestamp:
    united.sortBy { it.timestamp }
Carry unchanged values forward along the stream:
    var tempAx: Double? = null
    var tempAy: Double? = null
    var tempAz: Double? = null
    for (i in 1 until united.size) {
        val curr = united[i]
        val prev = united[i - 1]
        if (curr.ax == null) {
            if (prev.ax != null) {
                curr.ax = prev.ax
                tempAx = prev.ax
            } else curr.ax = tempAx
        }
        if (curr.ay == null) {
            if (prev.ay != null) {
                curr.ay = prev.ay
                tempAy = prev.ay
            } else curr.ay = tempAy
        }
        if (curr.az == null) {
            if (prev.az != null) {
                curr.az = prev.az
                tempAz = prev.az
            } else curr.az = tempAz
        }
    }
Remove duplicate rows (rows with the same timestamp):
    return united.distinctBy { it.timestamp }
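The approach above could be improved by merging two lists at a time, and I could probably write a function for that. Is there a more elegant solution? Any ideas? Thanks.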
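Answer 0 (score: 1)
I assume that your Data contains var rather than val (otherwise your code would not work). Here is a way to rewrite your function by grouping on the timestamp and using a method that either extracts the property of interest or returns the last known value of that property.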
    import kotlin.reflect.KMutableProperty1

    // your tempData containing the default (starting) values:
    val tempData = Data(0.0, 0.0, 0.0, 0L)

    fun extract(dataList: List<Data>, prop: KMutableProperty1<Data, Double?>) =
        // find the first element with a non-null value for the given property
        dataList.firstOrNull { prop(it) != null }
            // extract that property
            ?.let(prop)
            // store the extracted value in tempData so that it can be reused if a null value is retrieved later
            ?.also { prop.set(tempData, it) }
            // if the above didn't return a value, use the last one stored in tempData
            ?: prop(tempData)

    val mergedData = /* your united.addAll */ (Ax + Ay + Az)
        .groupBy { it.timestamp }
        // your sort by timestamp
        .toSortedMap()
        .map { (timestamp, dataList) ->
            Data(
                extract(dataList, Data::ax),
                extract(dataList, Data::ay),
                extract(dataList, Data::az),
                timestamp
            )
        }
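It is hard to find a much better approach, because your main requirement (defaulting to the last resolved value) effectively forces you to sort the dataset and keep one (or several) temporary variables.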
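However, compared with your version, this version has the following advantages:
- grouping by timestamp means no separate de-duplication step is needed (no distinctBy)
- the extract method itself may be complicated, but its usage is more readable
Refactoring extract might make the whole thing more readable as well.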
Since you also said that you want it to be easily portable to Java, here is a possible Java rewrite:
    Map<Long, List<Data>> unitedList = Stream.concat(Stream.concat(Ax.stream(), Ay.stream()), Az.stream())
        .collect(Collectors.groupingBy(Data::getTimestamp));
    List<Data> mergedData = unitedList.keySet().stream().sorted()
        .map(key -> {
            List<Data> dataList = unitedList.get(key);
            return new Data(extract(dataList, Data::getAx, Data::setAx),
                            extract(dataList, Data::getAy, Data::setAy),
                            extract(dataList, Data::getAz, Data::setAz),
                            key);
        }).collect(Collectors.toList());
And extract could look like this:
    Double extract(List<Data> dataList, Function<Data, Double> getter, BiConsumer<Data, Double> setter) {
        Optional<Double> relevantProperty = dataList.stream()
            .map(getter)
            .filter(Objects::nonNull)
            .findFirst();
        if (relevantProperty.isPresent()) {
            setter.accept(tempData, relevantProperty.get());
            return relevantProperty.get();
        } else {
            return getter.apply(tempData);
        }
    }
It is basically the same mechanism.
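As a variation on the same group-then-carry-forward idea, here is a minimal sketch that keeps the last known values in a local variable instead of a shared tempData object. It starts every axis at null rather than 0.0, so the first rows keep null for axes not yet seen, as in the question's expected table. The name mergeForwardFill is hypothetical, and the sketch assumes Kotlin 1.5 or later for firstNotNullOfOrNull.

```kotlin
data class Data(val ax: Double?, val ay: Double?, val az: Double?, val timestamp: Long)

// Hypothetical helper (not part of the answer above): merges three single-axis lists.
fun mergeForwardFill(ax: List<Data>, ay: List<Data>, az: List<Data>): List<Data> {
    // Nothing is known on any axis before the first row has been seen.
    var last = Data(null, null, null, 0L)
    return (ax + ay + az)
        .groupBy { it.timestamp }
        // Iterate the groups in ascending timestamp order.
        .toSortedMap()
        .map { (timestamp, rows) ->
            // Prefer a value reported at this timestamp, else carry the previous one forward.
            last = Data(
                rows.firstNotNullOfOrNull { it.ax } ?: last.ax,
                rows.firstNotNullOfOrNull { it.ay } ?: last.ay,
                rows.firstNotNullOfOrNull { it.az } ?: last.az,
                timestamp
            )
            last
        }
}
```

Because every output row is a fresh immutable Data, the input lists are left untouched and Data can keep its val properties. Called as mergeForwardFill(Ax, Ay, Az) on the three sample lists from the question, it should yield the eight rows of the expected table.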
Answer 1 (score: 0)
So, for now I am using the following solution:
    data class Data(
        var ax: Double?,
        var ay: Double?,
        var az: Double?,
        val timestamp: Long
    )

    fun mergeDatasets(Ax: List<Data>, Ay: List<Data>, Az: List<Data>): List<Data> {
        val united = mutableListOf<Data>()
        united.addAll(Ax)
        united.addAll(Ay)
        united.addAll(Az)
        united.sortBy { it.timestamp }
        var tempAx: Double? = null
        var tempAy: Double? = null
        var tempAz: Double? = null
        for (i in 1 until united.size) {
            val curr = united[i]
            val prev = united[i - 1]
            if (curr.ax == null) {
                if (prev.ax != null) {
                    curr.ax = prev.ax
                    tempAx = prev.ax
                } else curr.ax = tempAx
            }
            if (curr.ay == null) {
                if (prev.ay != null) {
                    curr.ay = prev.ay
                    tempAy = prev.ay
                } else curr.ay = tempAy
            }
            if (curr.az == null) {
                if (prev.az != null) {
                    curr.az = prev.az
                    tempAz = prev.az
                } else curr.az = tempAz
            }
            if (curr.timestamp == prev.timestamp) {
                prev.ax = curr.ax
                prev.ay = curr.ay
                prev.az = curr.az
            }
        }
        return united.distinctBy { it.timestamp }
    }
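One case that may be worth checking in the loop above is three rows sharing a single timestamp. The copy-back step only updates the immediately preceding row, so the first row of the group, which is the one distinctBy keeps, can miss the third row's value. A minimal sketch of one way around this, assuming the list is already sorted by timestamp, is to keep the last row per timestamp instead of the first. The name fillAndDeduplicate is hypothetical.

```kotlin
data class Data(var ax: Double?, var ay: Double?, var az: Double?, val timestamp: Long)

// Hypothetical helper: expects rows sorted by timestamp and fills nulls in place.
fun fillAndDeduplicate(sorted: List<Data>): List<Data> {
    for (i in 1 until sorted.size) {
        val curr = sorted[i]
        val prev = sorted[i - 1]
        // prev has already been filled, so it holds the latest known value of every axis.
        curr.ax = curr.ax ?: prev.ax
        curr.ay = curr.ay ?: prev.ay
        curr.az = curr.az ?: prev.az
    }
    // associateBy keeps the last element for a repeated key, i.e. the most complete row.
    return sorted.associateBy { it.timestamp }.values.toList()
}
```

Since each row is filled from its already-filled predecessor, the separate tempAx, tempAy, and tempAz variables are not needed in this sketch. Like the loop above, it mutates the rows it is given.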