I need to join 2 Excel files.
The first column is this one:
01/01/2016 00:00
01/01/2016 00:10
01/01/2016 00:20
01/01/2016 00:30
01/01/2016 00:40
01/01/2016 00:50
01/01/2016 01:00
01/01/2016 01:10
01/01/2016 01:20
01/01/2016 01:30
01/01/2016 01:40
01/01/2016 01:50
01/01/2016 02:00
01/01/2016 02:10
01/01/2016 02:20
01/01/2016 02:30
01/01/2016 02:40
01/01/2016 02:50
01/01/2016 03:00
01/01/2016 03:10
01/01/2016 03:20
01/01/2016 03:30
01/01/2016 03:40
01/01/2016 03:50
01/01/2016 04:00
01/01/2016 04:10
01/01/2016 04:20
01/01/2016 04:30
01/01/2016 04:40
01/01/2016 04:50
01/01/2016 05:00
01/01/2016 05:10
01/01/2016 05:20
01/01/2016 05:30
01/01/2016 05:40
01/01/2016 05:50
01/01/2016 06:00
01/01/2016 06:10
01/01/2016 06:20
01/01/2016 06:30
01/01/2016 06:40
01/01/2016 06:50
01/01/2016 07:00
01/01/2016 07:10
01/01/2016 07:20
01/01/2016 07:30
01/01/2016 07:40
01/01/2016 07:50
01/01/2016 08:00
01/01/2016 08:10
01/01/2016 08:20
01/01/2016 08:30
01/01/2016 08:40
01/01/2016 08:50
01/01/2016 09:00
01/01/2016 09:10
01/01/2016 09:20
01/01/2016 09:30
01/01/2016 09:40
01/01/2016 09:50
01/01/2016 10:00
01/01/2016 10:10
01/01/2016 10:20
01/01/2016 10:30
01/01/2016 10:40
01/01/2016 10:50
01/01/2016 11:00
01/01/2016 11:10
01/01/2016 11:20
01/01/2016 11:30
01/01/2016 11:40
01/01/2016 11:50
01/01/2016 12:00
01/01/2016 12:10
01/01/2016 12:20
01/01/2016 12:30
01/01/2016 12:40
01/01/2016 12:50
01/01/2016 13:00
01/01/2016 13:10
01/01/2016 13:20
01/01/2016 13:30
01/01/2016 13:40
01/01/2016 13:50
01/01/2016 14:00
01/01/2016 14:10
01/01/2016 14:20
01/01/2016 14:30
01/01/2016 14:40
01/01/2016 14:50
01/01/2016 15:00
01/01/2016 15:10
01/01/2016 15:20
01/01/2016 15:30
01/01/2016 15:40
01/01/2016 15:50
01/01/2016 16:00
01/01/2016 16:10
01/01/2016 16:20
01/01/2016 16:30
01/01/2016 16:40
01/01/2016 16:50
01/01/2016 17:00
01/01/2016 17:10
01/01/2016 17:20
01/01/2016 17:30
01/01/2016 17:40
01/01/2016 17:50
01/01/2016 18:00
01/01/2016 18:10
01/01/2016 18:20
01/01/2016 18:30
01/01/2016 18:40
01/01/2016 18:50
01/01/2016 19:00
01/01/2016 19:10
01/01/2016 19:20
01/01/2016 19:30
01/01/2016 19:40
01/01/2016 19:50
01/01/2016 20:00
01/01/2016 20:10
01/01/2016 20:20
01/01/2016 20:30
01/01/2016 20:40
01/01/2016 20:50
01/01/2016 21:00
01/01/2016 21:10
01/01/2016 21:20
01/01/2016 21:30
01/01/2016 21:40
01/01/2016 21:50
01/01/2016 22:00
01/01/2016 22:10
01/01/2016 22:20
01/01/2016 22:30
01/01/2016 22:40
01/01/2016 22:50
01/01/2016 23:00
01/01/2016 23:10
01/01/2016 23:20
01/01/2016 23:30
01/01/2016 23:40
01/01/2016 23:50
And the second:
01/01/2016 05:07
01/01/2016 07:10
01/01/2016 08:19
01/01/2016 08:27
01/01/2016 09:18
01/01/2016 10:13
01/01/2016 10:23
01/01/2016 10:30
01/01/2016 10:57
01/01/2016 12:20
01/01/2016 14:50
01/01/2016 14:54
01/01/2016 15:00
01/01/2016 15:20
01/01/2016 16:12
01/01/2016 18:26
01/01/2016 19:08
01/01/2016 20:00
01/01/2016 21:15
01/01/2016 21:20
01/01/2016 22:10
01/01/2016 22:13
01/01/2016 22:18
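I need to join them by matching each value in the first Excel file to the nearest value in the second file.
I tried VLOOKUP and Power Query, but both assume the column values are equal.
I really need your help.
Thank you very much.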
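Answer 0 (score: 1)
If your longer list is in column A and your shorter list is in column B (assuming no headers, etc.), the following formula will VLOOKUP the closest lower value. Column A must be sorted in ascending order for the approximate match to work: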
=VLOOKUP(B1,A:A,1,1)
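To make the approximate-match behaviour concrete, here is a minimal Python sketch of the same lookup using the standard bisect module. The function name closest_lower and the three sample values are illustrative assumptions, not part of the answer; the point is only that the lookup returns the largest value that is less than or equal to the key, not the nearest one.

```python
from bisect import bisect_right
from datetime import datetime

FMT = "%m/%d/%Y %H:%M"

def closest_lower(sorted_values, key):
    """Return the largest element <= key, or None if key is before every value.

    This mirrors an approximate-match lookup on an ascending column
    (where Excel would show #N/A, this sketch returns None).
    """
    i = bisect_right(sorted_values, key)
    return sorted_values[i - 1] if i else None

# Hypothetical excerpt of column A (the regular 10-minute grid), sorted ascending.
column_a = [datetime.strptime(s, FMT) for s in (
    "01/01/2016 05:00",
    "01/01/2016 05:10",
    "01/01/2016 05:20",
)]
b1 = datetime.strptime("01/01/2016 05:07", FMT)

print(closest_lower(column_a, b1))  # 2016-01-01 05:00:00
```

Note that 05:07 is matched to 05:00 even though 05:10 is closer, so this approach alone gives "closest lower", not "nearest".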
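Answer 1 (score: 0)
An R solution. All merge (and dplyr::*_join) operators are equi-joins, so your first problem is to map each value of the first vector to the closest value in the second vector. The findInterval function is similar to the VLOOKUP function in Excel: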
findInterval(v1df$v1, v2df$v2)
# [1] 0 1 1 1 1 1 2 2 2 2
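This shows two artifacts: a 0, which means the value lies before the first value in v2 (your second vector), and matches that go to the closest lower value rather than the nearest one (06:40 is assigned to 05:07, not 07:10). We can shift v2 to accommodate.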
d <- diff(v2df$v2)
units(d) <- "secs"
v2df$v2shift <- v2df$v2 - c(d[1], d)/2
v2df
# v2 v2shift
# 1 2016-01-01 05:07:00 2016-01-01 04:05:30
# 2 2016-01-01 07:10:00 2016-01-01 06:08:30
# 3 2016-01-01 08:19:00 2016-01-01 07:44:30
Now the intervals work:
findInterval(v1df$v1, v2df$v2shift)
# [1] 1 1 1 2 2 2 2 2 3 3
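The same shifting idea can be sketched in Python for readers who do not use R. This is an assumption-laden sketch, not a port of the R code: the helper nearest_indices is hypothetical, and it uses the midpoints between consecutive v2 values as breakpoints, so unlike the R version it has no lower breakpoint and never returns 0.

```python
from bisect import bisect_right
from datetime import datetime

FMT = "%m/%d/%Y %H:%M"

def parse(values):
    """Parse 'mm/dd/YYYY HH:MM' strings into datetime objects."""
    return [datetime.strptime(s, FMT) for s in values]

def nearest_indices(sorted_targets, xs):
    """For each x, return the 0-based index of the nearest element of sorted_targets.

    The breakpoints are the midpoints between consecutive targets, so a
    "closest lower breakpoint" search becomes a "nearest target" search.
    """
    midpoints = [a + (b - a) / 2 for a, b in zip(sorted_targets, sorted_targets[1:])]
    return [bisect_right(midpoints, x) for x in xs]

v1 = parse([
    "01/01/2016 05:00", "01/01/2016 05:10", "01/01/2016 05:20",
    "01/01/2016 06:40", "01/01/2016 06:50", "01/01/2016 07:00",
    "01/01/2016 07:10", "01/01/2016 07:20", "01/01/2016 08:00",
    "01/01/2016 08:10",
])
v2 = parse(["01/01/2016 05:07", "01/01/2016 07:10", "01/01/2016 08:19"])

idx = nearest_indices(v2, v1)
print([i + 1 for i in idx])  # [1, 1, 1, 2, 2, 2, 2, 2, 3, 3] (1-based, as in R)

# Pair every v1 value with its nearest v2 value.
closest = [v2[i] for i in idx]
```

A value that falls exactly on a midpoint is assigned to the later target here; whether that tie-break is appropriate depends on the data.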
Now we add this to the first data frame:
v1df$closest <- v2df$v2[findInterval(v1df$v1, v2df$v2shift)]
v1df
# v1 closest
# 1 2016-01-01 05:00:00 2016-01-01 05:07:00
# 2 2016-01-01 05:10:00 2016-01-01 05:07:00
# 3 2016-01-01 05:20:00 2016-01-01 05:07:00
# 4 2016-01-01 06:40:00 2016-01-01 07:10:00
# 5 2016-01-01 06:50:00 2016-01-01 07:10:00
# 6 2016-01-01 07:00:00 2016-01-01 07:10:00
# 7 2016-01-01 07:10:00 2016-01-01 07:10:00
# 8 2016-01-01 07:20:00 2016-01-01 07:10:00
# 9 2016-01-01 08:00:00 2016-01-01 08:19:00
# 10 2016-01-01 08:10:00 2016-01-01 08:19:00
And merge:
merge(v2df["v2"], v1df, by.x="v2", by.y="closest")
# v2 v1
# 1 2016-01-01 05:07:00 2016-01-01 05:00:00
# 2 2016-01-01 05:07:00 2016-01-01 05:10:00
# 3 2016-01-01 05:07:00 2016-01-01 05:20:00
# 4 2016-01-01 07:10:00 2016-01-01 06:40:00
# 5 2016-01-01 07:10:00 2016-01-01 06:50:00
# 6 2016-01-01 07:10:00 2016-01-01 07:00:00
# 7 2016-01-01 07:10:00 2016-01-01 07:10:00
# 8 2016-01-01 07:10:00 2016-01-01 07:20:00
# 9 2016-01-01 08:19:00 2016-01-01 08:00:00
# 10 2016-01-01 08:19:00 2016-01-01 08:10:00
Similarly:
dplyr::full_join(v2df["v2"], v1df, by=c("v2"="closest"))
Data:
v1 <- as.POSIXct(c(
'01/01/2016 05:00',
'01/01/2016 05:10',
'01/01/2016 05:20',
'01/01/2016 06:40',
'01/01/2016 06:50',
'01/01/2016 07:00',
'01/01/2016 07:10',
'01/01/2016 07:20',
'01/01/2016 08:00',
'01/01/2016 08:10'), format='%m/%d/%Y %H:%M')
v2 <- as.POSIXct(c(
'01/01/2016 05:07',
'01/01/2016 07:10',
'01/01/2016 08:19'), format='%m/%d/%Y %H:%M')
v1df <- data.frame(v1 = v1)
v2df <- data.frame(v2 = v2)
Answer 2 (score: 0)
Assuming that strings like 01/01/2016 08:00 are valid dates on your system, you can use a VBA solution with ADODB:
'needs reference to MS ActiveX Data Object Library x.x reference
'menu: Tools->References... in VBA Code Pane
Sub GetDataByNearestDates()
Dim sSQL As String, sConn As String
Dim oConn As ADODB.Connection
Dim oRst As ADODB.Recordset
On Error GoTo Err_GetDataByNearestDates
sSQL = "SELECT t1.A, t2.B, t1.A - t2.B" & vbCr
sSQL = sSQL & "FROM (" & vbCr
sSQL = sSQL & "SELECT [F1] AS A FROM [Sheet1$]" & vbCr
sSQL = sSQL & ") AS t1" & vbCr
sSQL = sSQL & ", (" & vbCr
sSQL = sSQL & "SELECT [F1] AS B FROM [Sheet2$]" & vbCr
sSQL = sSQL & ") AS t2" & vbCr
sSQL = sSQL & "WHERE ((t1.A-t2.B >-0.004) AND (t1.A-t2.B <=0.004));" & vbCr
'if you use column-headers, change HDR=NO to HDR=YES
sConn = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & ThisWorkbook.FullName & ";Extended Properties='Excel 12.0 Macro;HDR=NO';"
Set oConn = New ADODB.Connection
With oConn
.ConnectionString = sConn
.Open
Set oRst = oConn.Execute(sSQL)
End With
ThisWorkbook.Worksheets(3).Cells.Clear
ThisWorkbook.Worksheets(3).Range("A1").CopyFromRecordset oRst
Exit_GetDataByNearestDates:
On Error Resume Next
If Not oRst Is Nothing Then oRst.Close: Set oRst = Nothing
If Not oConn Is Nothing Then oConn.Close: Set oConn = Nothing
Exit Sub
Err_GetDataByNearestDates:
MsgBox Err.Description, vbExclamation, Err.Number
Resume Exit_GetDataByNearestDates
End Sub
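The code above produces a result set of 'nearest values' in Sheet3. To be able to use it, you need to:
1. put the first list in Sheet1
2. put the second list in Sheet2
3. open the VBA code pane (ALT + F11)
4. insert a new module (Insert->Module) and paste the code into it
To run the code, place the cursor inside the body of the GetDataByNearestDates procedure and press F5. The result set should look like: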
2016-01-01 05:10 2016-01-01 05:07
2016-01-01 07:10 2016-01-01 07:10
2016-01-01 08:20 2016-01-01 08:19
2016-01-01 08:30 2016-01-01 08:27
2016-01-01 09:20 2016-01-01 09:18
2016-01-01 10:10 2016-01-01 10:13
2016-01-01 10:20 2016-01-01 10:23
2016-01-01 10:30 2016-01-01 10:30
2016-01-01 11:00 2016-01-01 10:57
2016-01-01 12:20 2016-01-01 12:20
2016-01-01 14:50 2016-01-01 14:50
2016-01-01 14:50 2016-01-01 14:54
2016-01-01 15:00 2016-01-01 15:00
2016-01-01 15:20 2016-01-01 15:20
2016-01-01 16:10 2016-01-01 16:12
2016-01-01 18:30 2016-01-01 18:26
2016-01-01 19:10 2016-01-01 19:08
2016-01-01 20:00 2016-01-01 20:00
2016-01-01 21:10 2016-01-01 21:15
2016-01-01 21:20 2016-01-01 21:15
2016-01-01 21:20 2016-01-01 21:20
2016-01-01 22:10 2016-01-01 22:10
2016-01-01 22:10 2016-01-01 22:13
2016-01-01 22:20 2016-01-01 22:18
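The WHERE clause keeps every pair whose difference is within 0.004 days (about 5.76 minutes) rather than a single nearest match, which is why 14:50 and 22:10 each appear twice above. A minimal Python sketch of the same tolerance filter follows; the function name pairs_within and the three-row excerpts are illustrative assumptions, and the sketch uses timedelta comparisons instead of Excel's floating-point day numbers.

```python
from datetime import datetime, timedelta

FMT = "%m/%d/%Y %H:%M"

def pairs_within(grid, events, tol=timedelta(days=0.004)):
    """Return (grid_time, event_time) pairs with -tol < grid - event <= tol.

    This is a plain cross join followed by a filter, like the SQL above,
    so one grid time may match several events (and vice versa).
    """
    return [(a, b) for a in grid for b in events if -tol < a - b <= tol]

# Hypothetical excerpts of the two sheets.
grid = [datetime.strptime(s, FMT) for s in (
    "01/01/2016 14:40", "01/01/2016 14:50", "01/01/2016 15:00")]
events = [datetime.strptime(s, FMT) for s in (
    "01/01/2016 14:50", "01/01/2016 14:54", "01/01/2016 15:00")]

matches = pairs_within(grid, events)
for a, b in matches:
    print(a.strftime("%Y-%m-%d %H:%M"), b.strftime("%Y-%m-%d %H:%M"))
# 2016-01-01 14:50 2016-01-01 14:50
# 2016-01-01 14:50 2016-01-01 14:54
# 2016-01-01 15:00 2016-01-01 15:00
```

If exactly one match per row is required, the tolerance filter would need to be followed by a step that keeps only the pair with the smallest absolute difference.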