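R merge with partial matching

Time: 2020-03-18 12:36:29

Tags: r dataframe merge

There are many answers out there, but I have not found one that covers the problem I am trying to solve.

I have 2 data frames: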

df1:

[image: df1]

df2:

[image: df2]

I want to match rows by column value, so I tried:

library(data.table)
setC <- merge(setA, setB, by.x = "name", by.y = "name", all.x = FALSE)

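I get the following output:

df3:

[image: df3]

This happens because df1 also contains the value 1, but inside an entry separated by ";" (1;7). How do I get the desired output?

Thanks!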

2 answers:

Answer 0: (score: 1)

In future, please run dput(df1) and dput(df2) and paste the console output into your question.
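As a minimal sketch of what that looks like: the data frame below is made up for illustration (its column names and values are assumptions, since the real data were only posted as images). dput() prints an R expression that anyone can paste into a session to rebuild the object.

```r
# Hypothetical stand-in for df1; not the asker's actual data.
df1 <- data.frame(name = c("1;7", "2"),
                  last = c("a", "b"),
                  stringsAsFactors = FALSE)

# dput() writes an R expression that recreates the object.
dput(df1)

# Round trip: evaluating the dput() text rebuilds the data frame.
dput_text <- paste(capture.output(dput(df1)), collapse = "\n")
df1_copy <- eval(parse(text = dput_text))
all.equal(df1, df1_copy)
```

Pasting the dput() output into the question lets answerers work with the same column types and values instead of retyping them from a screenshot.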

A base R solution, in two parts:

Data:

# First unstack the 1;7 row into two separate rows:

name_split <- strsplit(as.character(df1$name), ";")

# This step assumes the values of last uniquely identify each row of df1:

df_ro <- data.frame(name = unlist(name_split),
                    last = rep(df1$last, lengths(name_split)),
                    stringsAsFactors = FALSE)

# Left join to achieve the same result as the first solution without
# specifically naming each vector (drop = FALSE keeps a data frame even
# when only one column is left):

df1_ro <- merge(df1[, names(df1) != "name", drop = FALSE], df_ro,
                by = "last", all.x = TRUE)

# Then perform an inner join, suffixing the df2 column names with ".x"
# to prevent a name collision:

df3 <- merge(df1_ro, setNames(df2, paste0(names(df2), ".x")),
             by.x = "name", by.y = "name.x")

# Alternatively, an inner join on all intersecting columns (this returns
# no rows here, because the values in last and colour differ):

df3 <- merge(df1_ro, df2, by = intersect(names(df1_ro), names(df2)))
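To make the split-then-merge idea easier to try out, here is a self-contained sketch on invented data. The values, the colour column and the name df1_long are assumptions for illustration only. It is also a simplified variant of the steps above: it joins the split rows to df2 directly on name and skips the intermediate join on last.

```r
# Hypothetical data; the originals were posted only as images.
df1 <- data.frame(name = c("1;7", "2", "3"),
                  last = c("a", "b", "c"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(name = c(1, 2, 7),
                  colour = c("red", "green", "blue"),
                  stringsAsFactors = FALSE)

# Split "1;7" into c("1", "7"); single values stay as they are.
name_split <- strsplit(as.character(df1$name), ";", fixed = TRUE)

# One row per individual name, repeating last for each split piece.
df1_long <- data.frame(name = unlist(name_split),
                       last = rep(df1$last, lengths(name_split)),
                       stringsAsFactors = FALSE)

# Make the key types agree, then inner join on name.
df2$name <- as.character(df2$name)
df3 <- merge(df1_long, df2, by = "name")
print(df3)
```

With this data, the row "1;7" contributes to two matches (names 1 and 7), while name 3 has no partner in df2 and is dropped by the inner join. Use all.x = TRUE in the final merge() if unmatched rows from df1 should be kept.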

Answer 1: (score: -1)

In the end, I implemented the following solution:

# For each row of IndexFile.txt, copy the first row of File.txt whose
# fourth tab-separated field contains the index row's fourth field.
with open('IndexFile.txt') as f, open('File.txt') as g:
    tabla1 = f.readlines()
    tabla2 = g.readlines()

with open('NewFile.txt', 'w') as co:
    for ln in tabla1:
        key = ln.split('\t')[3]
        for ln2 in tabla2:
            if key in ln2.split('\t')[3]:
                print(ln2)
                co.write(ln2)
                break