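I have a three-column file: the first and second columns are the start and end times, and the third column is a label. If the label in column 3 is the same, I want to merge the timestamps of consecutive rows (two or more).
Sample input: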
0.000000 0.551875 x
0.551875 0.586875 x
0.586875 0.676188 t
0.676188 0.721875 t
0.721875 0.821250 t
0.821250 0.872063 p
0.872063 0.968625 q
0.968625 1.112250 q
A longer sample, in which the label x appears again at the end:
0.000000 0.551875 x
0.551875 0.586875 x
0.586875 0.676188 t
0.676188 0.721875 t
0.721875 0.821250 t
0.821250 0.872063 p
0.872063 0.968625 q
0.968625 1.112250 q
1.112250 1.212250 x
1.212250 1.500000 x
The same sample with the label p replaced by oo:
0.000000 0.551875 x
0.551875 0.586875 x
0.586875 0.676188 t
0.676188 0.721875 t
0.721875 0.821250 t
0.821250 0.872063 oo
0.872063 0.968625 q
0.968625 1.112250 q
1.112250 1.212250 x
1.212250 1.500000 x
Expected output (for the sample with label p and the trailing x rows):
0.000000 0.586875 x
0.586875 0.821250 t
0.821250 0.872063 p
0.872063 1.112250 q
1.112250 1.500000 x
Answer 0 (score: 0)
In Groovy, given:
def inputs = [
    [0.000000, 0.551875, 'x'],
    [0.551875, 0.586875, 'x'],
    [0.586875, 0.676188, 't'],
    [0.676188, 0.721875, 't'],
    [0.721875, 0.821250, 't'],
    [0.821250, 0.872063, 'p'],
    [0.872063, 0.968625, 'q'],
    [0.968625, 1.112250, 'q']
]
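You can just group them by the third element of each list, then build, for each group, a list containing the start of the first item, the end of the last item, and the label: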
// groupBy keeps the labels in first-seen order and the rows of each group in input order
def outputs = inputs.groupBy { it[2] }.collect { key, items ->
    [items[0][0], items[-1][1], key]
}
The result is:
[[0.000000, 0.586875, 'x'],
[0.586875, 0.821250, 't'],
[0.821250, 0.872063, 'p'],
[0.872063, 1.112250, 'q']]
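One thing worth checking before relying on groupBy: it groups by label regardless of position, so if a label shows up again later in the input, both runs end up in the same group. A minimal sketch of that effect, using a made-up three-row input (the rows variable and its values are for illustration only):

```groovy
// Illustrative rows: the label 'x' appears in two separate runs
def rows = [
    [0.0, 1.0, 'x'],
    [1.0, 2.0, 't'],
    [2.0, 3.0, 'x']
]

// Same grouping as above: first start, last end, label
def merged = rows.groupBy { it[2] }.collect { key, items ->
    [items[0][0], items[-1][1], key]
}

// Both 'x' rows land in one group, so the 't' interval is swallowed by it
assert merged == [[0.0, 3.0, 'x'], [1.0, 2.0, 't']]
```

So the grouping approach is only safe when every label forms a single contiguous run.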
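If a label can reappear later in the input (as x does in the extended input below) and you want to keep those runs separate, you could fold over the rows instead: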
def inputs = [[0.000000, 0.551875, 'x'],
[0.551875, 0.586875, 'x'],
[0.586875, 0.676188, 't'],
[0.676188, 0.721875, 't'],
[0.721875, 0.821250, 't'],
[0.821250, 0.872063, 'p'],
[0.872063, 0.968625, 'q'],
[0.968625, 1.112250, 'q'],
[1.112250, 1.551875, 'x'],
[1.551875, 2.000000, 'x']]
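def outputs = inputs.inject([]) { accum, line ->
    if (accum && accum[-1][2] == line[2]) {
        // same label as the previous run: extend its end time
        accum[-1][1] = line[1]
    } else {
        // new run: store a copy so the rows in inputs are not modified
        accum << new ArrayList(line)
    }
    accum
}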
giving:
[[0.000000, 0.586875, 'x'],
[0.586875, 0.821250, 't'],
[0.821250, 0.872063, 'p'],
[0.872063, 1.112250, 'q'],
[1.112250, 2.000000, 'x']]
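For readers less familiar with inject, the same fold can be sketched as an explicit loop. The name mergeRuns and the sample values are made up for this illustration, and the sketch copies each row before storing it so that the input list is left untouched:

```groovy
// Hypothetical helper: merge consecutive rows that share a label (index 2)
List mergeRuns(List rows) {
    def result = []
    for (row in rows) {
        def last = result ? result[-1] : null
        if (last != null && last[2] == row[2]) {
            // Same label as the previous run: push its end time forward
            last[1] = row[1]
        } else {
            // Different label (or first row): start a new run from a copy
            result << new ArrayList(row)
        }
    }
    result
}

def sample = [
    [0.0, 0.5, 'x'],
    [0.5, 0.6, 'x'],
    [0.6, 0.9, 't'],
    [0.9, 1.2, 'x']
]

// The two leading 'x' rows merge; the trailing 'x' stays a separate run
assert mergeRuns(sample) == [[0.0, 0.6, 'x'], [0.6, 0.9, 't'], [0.9, 1.2, 'x']]
// The original rows are unchanged because each run starts from a copy
assert sample[0] == [0.0, 0.5, 'x']
```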
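If some labels (such as 'oo') should be treated as wildcards and absorbed into the preceding run, you can generalize this. Given:
def inputs = [[0.000000, 0.551875, 'x'],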
[0.551875, 0.586875, 'x'],
[0.586875, 0.676188, 't'],
[0.676188, 0.721875, 't'],
[0.721875, 0.821250, 't'],
[0.821250, 0.872063, 'oo'],
[0.872063, 0.968625, 'q'],
[0.968625, 1.112250, 'q'],
[1.112250, 1.551875, 'x'],
[1.551875, 2.000000, 'x']]
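def coalesce(List inputs, String... wildcards) {
    inputs.inject([]) { accum, line ->
        // merge when the label repeats, or when it is a wildcard absorbed by the previous run
        if (accum &&
                (accum[-1][2] == line[2] || wildcards.contains(line[2]))) {
            accum[-1][1] = line[1]
        } else {
            // copy the row so the caller's inputs are not modified
            accum << new ArrayList(line)
        }
        accum
    }
}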
Then:
def outputs = coalesce(inputs, 'oo')
gives:
[[0.000000, 0.586875, 'x'],
[0.586875, 0.872063, 't'],
[0.872063, 1.112250, 'q'],
[1.112250, 2.000000, 'x']]
and
def outputs = coalesce(inputs, 'oo', 'q')
gives:
[[0.000000, 0.586875, 'x'],
[0.586875, 1.112250, 't'],
[1.112250, 2.000000, 'x']]
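Since the question starts from a three-column text file, here is a rough sketch of how the same idea might be applied directly to lines of text. The names mergeLines and labels.txt are hypothetical, and the sketch assumes whitespace-separated columns. It keeps the time fields as strings so that the original formatting (for example trailing zeros) is preserved:

```groovy
// Hypothetical helper: merge consecutive lines that share a label.
// wildcards: labels that are absorbed into the preceding run.
List<String> mergeLines(List<String> lines, Collection<String> wildcards = []) {
    def runs = []
    lines.each { text ->
        def trimmed = text.trim()
        if (!trimmed) return                      // skip blank lines
        def (start, end, label) = trimmed.split(/\s+/).toList()
        if (runs && (runs[-1][2] == label || label in wildcards)) {
            runs[-1][1] = end                     // extend the current run
        } else {
            runs << [start, end, label]           // start a new run
        }
    }
    runs.collect { it.join(' ') }
}

// Sample taken from the question (the variant containing the label 'oo')
def lines = [
    '0.000000 0.551875 x',
    '0.551875 0.586875 x',
    '0.586875 0.676188 t',
    '0.676188 0.721875 t',
    '0.721875 0.821250 t',
    '0.821250 0.872063 oo',
    '0.872063 0.968625 q',
    '0.968625 1.112250 q',
    '1.112250 1.212250 x',
    '1.212250 1.500000 x'
]

assert mergeLines(lines, ['oo']) == [
    '0.000000 0.586875 x',
    '0.586875 0.872063 t',
    '0.872063 1.112250 q',
    '1.112250 1.500000 x'
]
```

With a real file, the lines could come from something like new File('labels.txt').readLines(), and the merged lines could be written back with join('\n').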