Joining Excel files on unequal time values

Time: 2018-05-23 11:48:00

Tags: r excel vba excel-vba excel-formula

I need to join two Excel files.

The first file has this column:

01/01/2016 00:00
01/01/2016 00:10
01/01/2016 00:20
01/01/2016 00:30
01/01/2016 00:40
01/01/2016 00:50
01/01/2016 01:00
01/01/2016 01:10
01/01/2016 01:20
01/01/2016 01:30
01/01/2016 01:40
01/01/2016 01:50
01/01/2016 02:00
01/01/2016 02:10
01/01/2016 02:20
01/01/2016 02:30
01/01/2016 02:40
01/01/2016 02:50
01/01/2016 03:00
01/01/2016 03:10
01/01/2016 03:20
01/01/2016 03:30
01/01/2016 03:40
01/01/2016 03:50
01/01/2016 04:00
01/01/2016 04:10
01/01/2016 04:20
01/01/2016 04:30
01/01/2016 04:40
01/01/2016 04:50
01/01/2016 05:00
01/01/2016 05:10
01/01/2016 05:20
01/01/2016 05:30
01/01/2016 05:40
01/01/2016 05:50
01/01/2016 06:00
01/01/2016 06:10
01/01/2016 06:20
01/01/2016 06:30
01/01/2016 06:40
01/01/2016 06:50
01/01/2016 07:00
01/01/2016 07:10
01/01/2016 07:20
01/01/2016 07:30
01/01/2016 07:40
01/01/2016 07:50
01/01/2016 08:00
01/01/2016 08:10
01/01/2016 08:20
01/01/2016 08:30
01/01/2016 08:40
01/01/2016 08:50
01/01/2016 09:00
01/01/2016 09:10
01/01/2016 09:20
01/01/2016 09:30
01/01/2016 09:40
01/01/2016 09:50
01/01/2016 10:00
01/01/2016 10:10
01/01/2016 10:20
01/01/2016 10:30
01/01/2016 10:40
01/01/2016 10:50
01/01/2016 11:00
01/01/2016 11:10
01/01/2016 11:20
01/01/2016 11:30
01/01/2016 11:40
01/01/2016 11:50
01/01/2016 12:00
01/01/2016 12:10
01/01/2016 12:20
01/01/2016 12:30
01/01/2016 12:40
01/01/2016 12:50
01/01/2016 13:00
01/01/2016 13:10
01/01/2016 13:20
01/01/2016 13:30
01/01/2016 13:40
01/01/2016 13:50
01/01/2016 14:00
01/01/2016 14:10
01/01/2016 14:20
01/01/2016 14:30
01/01/2016 14:40
01/01/2016 14:50
01/01/2016 15:00
01/01/2016 15:10
01/01/2016 15:20
01/01/2016 15:30
01/01/2016 15:40
01/01/2016 15:50
01/01/2016 16:00
01/01/2016 16:10
01/01/2016 16:20
01/01/2016 16:30
01/01/2016 16:40
01/01/2016 16:50
01/01/2016 17:00
01/01/2016 17:10
01/01/2016 17:20
01/01/2016 17:30
01/01/2016 17:40
01/01/2016 17:50
01/01/2016 18:00
01/01/2016 18:10
01/01/2016 18:20
01/01/2016 18:30
01/01/2016 18:40
01/01/2016 18:50
01/01/2016 19:00
01/01/2016 19:10
01/01/2016 19:20
01/01/2016 19:30
01/01/2016 19:40
01/01/2016 19:50
01/01/2016 20:00
01/01/2016 20:10
01/01/2016 20:20
01/01/2016 20:30
01/01/2016 20:40
01/01/2016 20:50
01/01/2016 21:00
01/01/2016 21:10
01/01/2016 21:20
01/01/2016 21:30
01/01/2016 21:40
01/01/2016 21:50
01/01/2016 22:00
01/01/2016 22:10
01/01/2016 22:20
01/01/2016 22:30
01/01/2016 22:40
01/01/2016 22:50
01/01/2016 23:00
01/01/2016 23:10
01/01/2016 23:20
01/01/2016 23:30
01/01/2016 23:40
01/01/2016 23:50

And the second file has this one:

01/01/2016 05:07
01/01/2016 07:10
01/01/2016 08:19
01/01/2016 08:27
01/01/2016 09:18
01/01/2016 10:13
01/01/2016 10:23
01/01/2016 10:30
01/01/2016 10:57
01/01/2016 12:20
01/01/2016 14:50
01/01/2016 14:54
01/01/2016 15:00
01/01/2016 15:20
01/01/2016 16:12
01/01/2016 18:26
01/01/2016 19:08
01/01/2016 20:00
01/01/2016 21:15
01/01/2016 21:20
01/01/2016 22:10
01/01/2016 22:13
01/01/2016 22:18

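I need to join them by matching each value in the first Excel file to the nearest value in the second file.

I tried VLOOKUP and Power Query, but both assume the column values are equal.

I would really appreciate your help.

Thank you very much.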

3 Answers:

Answer 0: (score: 1)

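If your longer list is in column A and your shorter list is in column B (assuming no headers, etc.), then the following formula will VLOOKUP the closest lower value. Column A must be sorted in ascending order for the approximate match to work: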

=VLOOKUP(B1,A:A,1,1)
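
To see what "closest lower value" means in practice, here is a rough R sketch of the same approximate-match lookup. The names colA, colB, idx and lower are made up for illustration, and the data is only a small slice of the question's columns.

```r
fmt <- "%m/%d/%Y %H:%M"

# Hypothetical stand-in for column A: a sorted 10-minute grid, 05:00 to 05:50.
colA <- seq(as.POSIXct("01/01/2016 05:00", format = fmt, tz = "UTC"),
            by = "10 min", length.out = 6)

# Hypothetical stand-in for column B: the values being looked up.
colB <- as.POSIXct(c("01/01/2016 05:07", "01/01/2016 05:30"),
                   format = fmt, tz = "UTC")

# For each lookup value, position of the largest grid value that is <= it.
# An index of 0 would mean "before the first grid value" (VLOOKUP shows #N/A).
idx <- findInterval(as.numeric(colB), as.numeric(colA))

lower <- colA[idx]
print(format(lower, "%H:%M"))
# "05:00" "05:30"
```

Note that 05:07 maps to 05:00 even though 05:10 is nearer, so this is a floor lookup rather than a true nearest match.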

Answer 1: (score: 0)

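An R solution. All merge (and dplyr::*_join) operators match on equality, so your first problem is to convert each value in the first vector to the closest value in the second vector. The findInterval function is similar to Excel's VLOOKUP function: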

findInterval(v1df$v1, v2df$v2)
#  [1] 0 1 1 1 1 1 2 2 2 2

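This shows two artifacts:

  1. The first value has an index of 0, which means it falls before the first value in v2 (your second vector).
  2. This does not give the closest value; it gives the highest value that does not go over.

We can shift v2 to accommodate.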

    d <- diff(v2df$v2)
    units(d) <- "secs"
    v2df$v2shift <- v2df$v2 - c(d[1], d)/2
    v2df
    #                    v2             v2shift
    # 1 2016-01-01 05:07:00 2016-01-01 04:05:30
    # 2 2016-01-01 07:10:00 2016-01-01 06:08:30
    # 3 2016-01-01 08:19:00 2016-01-01 07:44:30
    

Now the intervals work:

    findInterval(v1df$v1, v2df$v2shift)
    #  [1] 1 1 1 2 2 2 2 2 3 3
    

Now we add it to the first data frame:

    v1df$closest <- v2df$v2[findInterval(v1df$v1, v2df$v2shift)]
    v1df
    #                     v1             closest
    # 1  2016-01-01 05:00:00 2016-01-01 05:07:00
    # 2  2016-01-01 05:10:00 2016-01-01 05:07:00
    # 3  2016-01-01 05:20:00 2016-01-01 05:07:00
    # 4  2016-01-01 06:40:00 2016-01-01 07:10:00
    # 5  2016-01-01 06:50:00 2016-01-01 07:10:00
    # 6  2016-01-01 07:00:00 2016-01-01 07:10:00
    # 7  2016-01-01 07:10:00 2016-01-01 07:10:00
    # 8  2016-01-01 07:20:00 2016-01-01 07:10:00
    # 9  2016-01-01 08:00:00 2016-01-01 08:19:00
    # 10 2016-01-01 08:10:00 2016-01-01 08:19:00
    

And merge:

    merge(v2df["v2"], v1df, by.x="v2", by.y="closest")
    #                     v2                  v1
    # 1  2016-01-01 05:07:00 2016-01-01 05:00:00
    # 2  2016-01-01 05:07:00 2016-01-01 05:10:00
    # 3  2016-01-01 05:07:00 2016-01-01 05:20:00
    # 4  2016-01-01 07:10:00 2016-01-01 06:40:00
    # 5  2016-01-01 07:10:00 2016-01-01 06:50:00
    # 6  2016-01-01 07:10:00 2016-01-01 07:00:00
    # 7  2016-01-01 07:10:00 2016-01-01 07:10:00
    # 8  2016-01-01 07:10:00 2016-01-01 07:20:00
    # 9  2016-01-01 08:19:00 2016-01-01 08:00:00
    # 10 2016-01-01 08:19:00 2016-01-01 08:10:00
    

Similarly:

    dplyr::full_join(v2df["v2"], v1df, by=c("v2"="closest"))
    

Data:

    v1 <- as.POSIXct(c(
      '01/01/2016 05:00',
      '01/01/2016 05:10',
      '01/01/2016 05:20',
      '01/01/2016 06:40',
      '01/01/2016 06:50',
      '01/01/2016 07:00',
      '01/01/2016 07:10',
      '01/01/2016 07:20',
      '01/01/2016 08:00',
      '01/01/2016 08:10'), format='%m/%d/%Y %H:%M')
    
    v2 <- as.POSIXct(c(
      '01/01/2016 05:07',
      '01/01/2016 07:10',
      '01/01/2016 08:19'), format='%m/%d/%Y %H:%M')
    
    v1df <- data.frame(v1 = v1)
    v2df <- data.frame(v2 = v2)
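
One thing to watch with the full data from the question: the first file starts at 00:00, well before the first shifted break, so findInterval returns 0 for those rows, and R silently drops zero indices when subsetting. A possible variation, sketched below under the assumption that v2 is sorted and has at least two values, uses the midpoints between consecutive v2 values as breaks and adds 1, so every row gets an index between 1 and length(v2). The names t1, t2, mids, idx, nearest and brute are illustrative only.

```r
fmt <- "%m/%d/%Y %H:%M"

# Small sample, including values before the first and after the last v2 value.
v1 <- as.POSIXct(c("01/01/2016 00:00", "01/01/2016 05:00", "01/01/2016 06:40",
                   "01/01/2016 07:20", "01/01/2016 08:10", "01/01/2016 23:50"),
                 format = fmt, tz = "UTC")
v2 <- as.POSIXct(c("01/01/2016 05:07", "01/01/2016 07:10", "01/01/2016 08:19"),
                 format = fmt, tz = "UTC")

# Work in seconds since the epoch to avoid difftime unit surprises.
t1 <- as.numeric(v1)
t2 <- as.numeric(v2)

# Midpoints between consecutive v2 values are the interval breaks.
mids <- (head(t2, -1) + tail(t2, -1)) / 2

# findInterval gives 0..(length(v2) - 1); adding 1 turns that into a v2 index.
idx <- findInterval(t1, mids) + 1L
nearest <- v2[idx]
print(idx)
# 1 1 2 2 3 3

# Brute-force cross-check: index of the smallest absolute difference.
brute <- sapply(t1, function(x) which.min(abs(t2 - x)))
stopifnot(all(idx == brute))
```

A value exactly on a midpoint goes to the later v2 value here, whereas which.min would pick the earlier one; the sample has no such ties.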
    

Answer 2: (score: 0)

Assuming that a string like 01/01/2016 08:00 is a valid date on your system, you can use a VBA solution with ADODB:

'requires a reference to the Microsoft ActiveX Data Objects x.x Library
'menu: Tools->References... in the VBA code pane

Sub GetDataByNearestDates()
    Dim sSQL As String, sConn As String
    Dim oConn As ADODB.Connection
    Dim oRst As ADODB.Recordset

    On Error GoTo Err_GetDataByNearestDates

    sSQL = "SELECT t1.A, t2.B, t1.A - t2.B" & vbCr
    sSQL = sSQL & "FROM (" & vbCr
    sSQL = sSQL & "SELECT [F1] AS A FROM [Sheet1$]" & vbCr
    sSQL = sSQL & ") AS t1" & vbCr
    sSQL = sSQL & ", (" & vbCr
    sSQL = sSQL & "SELECT [F1] AS B FROM [Sheet2$]" & vbCr
    sSQL = sSQL & ") AS t2" & vbCr
    'dates are stored as days, so 0.004 is about 5.76 minutes: keep pairs within that window
    sSQL = sSQL & "WHERE ((t1.A-t2.B >-0.004) AND (t1.A-t2.B <=0.004));" & vbCr

    'if you use column-headers, change HDR=NO to HDR=YES
    sConn = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & ThisWorkbook.FullName & ";Extended Properties='Excel 12.0 Macro;HDR=NO';"

    Set oConn = New ADODB.Connection
    With oConn
        .ConnectionString = sConn
        .Open
        Set oRst = oConn.Execute(sSQL)
    End With

    ThisWorkbook.Worksheets(3).Cells.Clear
    ThisWorkbook.Worksheets(3).Range("A1").CopyFromRecordset oRst

Exit_GetDataByNearestDates:
    On Error Resume Next
    If Not oRst Is Nothing Then oRst.Close: Set oRst = Nothing
    If Not oConn Is Nothing Then oConn.Close: Set oConn = Nothing
    Exit Sub

Err_GetDataByNearestDates:
    MsgBox Err.Description, vbExclamation, Err.Number
    Resume Exit_GetDataByNearestDates

End Sub

The code above produces a result set of 'nearest values' in Sheet3.

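To use the code above, you need to:

  1. Create an empty Excel file (containing 3 worksheets).
  2. Copy the data from workbook 1 into Sheet1.
  3. Copy the data from workbook 2 into Sheet2.
  4. Go to the VBA code pane (ALT + F11).
  5. Insert a new module (menu Insert->Module).
  6. Paste the code into it.
  7. Save the file as a *.xlsm file.
  8. To run the code, place the cursor inside the body of the GetDataByNearestDates procedure and press F5.

The result set should look like: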

    2016-01-01 05:10    2016-01-01 05:07
    2016-01-01 07:10    2016-01-01 07:10
    2016-01-01 08:20    2016-01-01 08:19
    2016-01-01 08:30    2016-01-01 08:27
    2016-01-01 09:20    2016-01-01 09:18
    2016-01-01 10:10    2016-01-01 10:13
    2016-01-01 10:20    2016-01-01 10:23
    2016-01-01 10:30    2016-01-01 10:30
    2016-01-01 11:00    2016-01-01 10:57
    2016-01-01 12:20    2016-01-01 12:20
    2016-01-01 14:50    2016-01-01 14:50
    2016-01-01 14:50    2016-01-01 14:54
    2016-01-01 15:00    2016-01-01 15:00
    2016-01-01 15:20    2016-01-01 15:20
    2016-01-01 16:10    2016-01-01 16:12
    2016-01-01 18:30    2016-01-01 18:26
    2016-01-01 19:10    2016-01-01 19:08
    2016-01-01 20:00    2016-01-01 20:00
    2016-01-01 21:10    2016-01-01 21:15
    2016-01-01 21:20    2016-01-01 21:15
    2016-01-01 21:20    2016-01-01 21:20
    2016-01-01 22:10    2016-01-01 22:10
    2016-01-01 22:10    2016-01-01 22:13
    2016-01-01 22:20    2016-01-01 22:18
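
For comparison, the same tolerance-window pairing can be sketched in R. This is only an illustration of the idea behind the SQL WHERE clause, not a translation of the ADODB code; the names a, b, tol, diffs, hits and pairs are made up, and the data is a small slice around 14:50.

```r
fmt <- "%m/%d/%Y %H:%M"

# Hypothetical slices of Sheet1 (a) and Sheet2 (b).
a <- as.POSIXct(c("01/01/2016 14:40", "01/01/2016 14:50",
                  "01/01/2016 15:00", "01/01/2016 15:10"),
                format = fmt, tz = "UTC")
b <- as.POSIXct(c("01/01/2016 14:50", "01/01/2016 14:54", "01/01/2016 15:00"),
                format = fmt, tz = "UTC")

# 0.004 days expressed in seconds (345.6 s), matching the SQL threshold.
tol <- 0.004 * 86400

# Every a-minus-b difference: rows follow a, columns follow b.
diffs <- outer(as.numeric(a), as.numeric(b), "-")

# Keep the (row, col) positions whose difference lies in (-tol, tol].
hits <- which(diffs > -tol & diffs <= tol, arr.ind = TRUE)
hits <- hits[order(hits[, "row"], hits[, "col"]), , drop = FALSE]

pairs <- data.frame(A = a[hits[, "row"]], B = b[hits[, "col"]])
print(pairs)
```

With this sample, 14:50 pairs with both 14:50 and 14:54, and 15:00 pairs with 15:00, which mirrors the duplicated rows in the result set above: a window match can return several partners for one time, or none at all.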