How to write this LEFT JOIN query with EF

Date: 2017-02-23 16:02:38

Tags: c# entity-framework tsql linq-to-entities left-join

We are using Entity Framework and want to perform a simple LEFT JOIN.

In fact, this is the SQL we would like to express in LINQ (as a queryable):

SELECT 
     cbl.ID as ClaimBatchLine_ID
    ,cbl.PurchasePrice
    ,c.*
    ,ic.DueDate
    ,ic.Reference
FROM ClaimBatchLine cbl
INNER JOIN Claim c ON c.ID = cbl.Claim_ID
LEFT JOIN InvoiceClaim ic ON ic.ID = c.ID
WHERE cbl.ClaimBatch_ID = @claimBatchId
ORDER BY cbl.ID
OFFSET (@recordsPerPage*@page) ROWS
FETCH NEXT @recordsPerPage ROWS ONLY

What we came up with is:

from cbl in ClaimBatchLines where cbl.ClaimBatch_ID == 1
from c in Claims where c.ID == cbl.Claim_ID
from ic in InvoiceClaims.DefaultIfEmpty() where ic.ID == c.ID
select new {cbl, c, ic.Reference}

This produces the following SQL:

SELECT [t0].[ID], [t0].[ClaimBatch_ID], [t0].[Claim_ID], [t0].[PurchasePrice],
    [t1].[ID] AS [ID2], [t1].[ClaimType_ID], [t1].[Debtor_ID],
    [t1].[CustomerContractRevision_ID], [t1].[Date], [t1].[CreatedOn],
    [t1].[GrossAmount], [t1].[OpenAmount], [t1].[IsProcessedByOpenAmountCalculator],
    [t1].[RowVersion], [t2].[Reference] AS [Reference]
FROM [ClaimBatchLine] AS [t0]
CROSS JOIN [Claim] AS [t1]
LEFT OUTER JOIN [InvoiceClaim] AS [t2] ON 1 = 1
WHERE ([t2].[ID] = [t1].[ID]) AND ([t1].[ID] = [t0].[Claim_ID]) AND ([t0].[ClaimBatch_ID] = @p0);

It returns the same result set, which is great. However, as you can see, LEFT OUTER JOIN [InvoiceClaim] AS [t2] ON 1 = 1 is not what we want. I was hoping it would be translated to LEFT JOIN InvoiceClaim ic ON ic.ID = c.ID.

Are we doing something wrong? Or is LINQ to SQL simply suboptimal (performance-wise) and unable to understand what we want?

Edit: In LINQPad, this yields a nice query:

from cbl in ClaimBatchLines
join c in Claims on cbl.Claim_ID equals c.ID
join ic in InvoiceClaims on c.ID equals ic.ID into g
from e in g.DefaultIfEmpty()
where cbl.ClaimBatch_ID == 1
select new {cbl, c, e.Reference}

-- Region Parameters
DECLARE @p0 INT= 1;
-- EndRegion
SELECT [t0].[ID],
     [Columns left out for brevity]
     [t2].[Reference] AS [Reference]
FROM [ClaimBatchLine] AS [t0]
    INNER JOIN [Claim] AS [t1] ON [t0].[Claim_ID] = [t1].[ID]
    LEFT OUTER JOIN [InvoiceClaim] AS [t2] ON [t1].[ID] = [t2].[ID]
WHERE [t0].[ClaimBatch_ID] = @p0;

But when adding paging like this:

(from cbl in ClaimBatchLines
join c in Claims on cbl.Claim_ID equals c.ID
join ic in InvoiceClaims on c.ID equals ic.ID into g
from e in g.DefaultIfEmpty()
where cbl.ClaimBatch_ID == 1
select new {cbl, c, e.Reference})
.OrderBy(a => a.cbl.ID)
.Skip(0 * 15000)
.Take(15000)

it produces this 'monster':

-- Region Parameters
DECLARE @p0 INT= 1;
DECLARE @p1 INT= 0;
DECLARE @p2 INT= 15000;
-- EndRegion
SELECT [t4].[ID],
     [Columns left out for brevity...]
FROM
(
   SELECT ROW_NUMBER() OVER(ORDER BY [t3].[ID]) AS [ROW_NUMBER],
        [Columns left out for brevity...]
   FROM
   (
      SELECT [t0].[ID],
            [Columns left out for brevity...]
      FROM [ClaimBatchLine] AS [t0]
          INNER JOIN [Claim] AS [t1] ON [t0].[Claim_ID] = [t1].[ID]
          LEFT OUTER JOIN [InvoiceClaim] AS [t2] ON [t1].[ID] = [t2].[ID]
   ) AS [t3]
   WHERE [t3].[ClaimBatch_ID] = @p0
) AS [t4]
WHERE [t4].[ROW_NUMBER] BETWEEN @p1 + 1 AND @p1 + @p2
ORDER BY [t4].[ROW_NUMBER];

Even worse: when I run the same LINQ-to-EF query from code that uses an EF repository, rather than through LINQPad, I get an even bigger monster.

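What is going on here? Getting to grips with it in LINQPad seemed fine, but the final query in production code is just horrible! Perhaps these queries are acceptable for some simple applications. They are not acceptable for me when querying 500k or more records. I will stick to plain SQL table-valued functions instead of using LINQ. Such a pity.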

3 Answers:

Answer 0: (score: 2)

Try it this way:

from cbl in ClaimBatchLines
// In a join clause the outer key goes left of 'equals', the inner key right.
join c in Claims on cbl.Claim_ID equals c.ID
join ic in InvoiceClaims on c.ID equals ic.ID into g
from e in g.DefaultIfEmpty()
where cbl.ClaimBatch_ID == 1
select new {cbl, c, e.Reference}

For details on how to perform a left join in LINQ, see this link.
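To make the group join plus DefaultIfEmpty() pattern concrete, here is a minimal in-memory sketch using LINQ to Objects rather than EF. The sample data and property values are hypothetical stand-ins for the Claim and InvoiceClaim tables, not the asker's schema. Note one difference from a database query: in memory the unmatched side is a real null, so the sketch guards the access with ?., which is not allowed inside an IQueryable expression tree.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for the Claim and InvoiceClaim tables.
var claims = new[]
{
    new { ID = 1, GrossAmount = 100m },
    new { ID = 2, GrossAmount = 250m },
};
var invoiceClaims = new[]
{
    new { ID = 1, Reference = "INV-001" },
};

// Group join + DefaultIfEmpty(): every claim is kept, matched or not.
var rows =
    (from c in claims
     join ic in invoiceClaims on c.ID equals ic.ID into g
     from e in g.DefaultIfEmpty()
     // In memory, e is null for unmatched claims, so guard the access.
     select new { c.ID, Reference = e?.Reference }).ToList();

foreach (var row in rows)
    Console.WriteLine($"{row.ID}: {row.Reference ?? "(no invoice)"}");
```

Claim 2 has no matching invoice claim but still appears in the result, which is the behavior a LEFT JOIN is expected to give.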

Answer 1: (score: 2)

The easiest way is to move your where into the join:

from cbl in ClaimBatchLines where cbl.ClaimBatch_ID == 1
from c in Claims where c.ID == cbl.Claim_ID
from ic in InvoiceClaims.Where(x => x.ID == c.ID).DefaultIfEmpty()
select new {cbl, c, ic.Reference}

With the filter applied before DefaultIfEmpty(), the query is translated using a left join.
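The difference between filtering before and after DefaultIfEmpty() can be seen in a small in-memory sketch. It uses LINQ to Objects with hypothetical sample data, so it only illustrates the semantics, not the SQL a particular provider generates. The second query assumes invoiceClaims is non-empty; with an empty source, DefaultIfEmpty() would yield null and ic.ID would throw.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: claim 2 has no invoice claim.
var claims = new[] { new { ID = 1 }, new { ID = 2 }, new { ID = 3 } };
var invoiceClaims = new[]
{
    new { ID = 1, Reference = "INV-001" },
    new { ID = 3, Reference = "INV-003" },
};

// Filter first, then DefaultIfEmpty(): unmatched claims survive with ic == null.
var leftJoined =
    (from c in claims
     from ic in invoiceClaims.Where(x => x.ID == c.ID).DefaultIfEmpty()
     select new { c.ID, HasInvoice = ic != null }).ToList();

// DefaultIfEmpty() first, then filter: the where clause discards unmatched claims.
var filteredAfter =
    (from c in claims
     from ic in invoiceClaims.DefaultIfEmpty()
     where ic.ID == c.ID
     select new { c.ID }).ToList();

Console.WriteLine($"filter before DefaultIfEmpty: {leftJoined.Count} rows");
Console.WriteLine($"filter after DefaultIfEmpty:  {filteredAfter.Count} rows");
```

In this sketch the first form keeps all three claims, while the second keeps only the two that have an invoice claim. The same happens in SQL: a WHERE condition on a column of the outer-joined table removes the rows where that column is NULL.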

Answer 2: (score: 1)

Try writing it with LINQ's join:

var q = 
(from cbl in ClaimBatchLines
join c in Claims on cbl.Claim_ID equals c.ID
join tmpIc in InvoiceClaims on c.ID equals tmpIc.ID into g
from ic in g.DefaultIfEmpty()
where cbl.ClaimBatch_ID == 1
select new { cbl, c, ic })
  .OrderBy(x => x.cbl.ID) 
  .Skip(recordsPerPage * page)
  .Take(recordsPerPage);
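The Skip/Take arithmetic above mirrors OFFSET (@recordsPerPage*@page) ROWS FETCH NEXT @recordsPerPage ROWS ONLY from the question. A minimal in-memory sketch follows, assuming a zero-based page index; the values chosen for recordsPerPage and page, and the ids sequence standing in for cbl.ID, are hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical values; the answer leaves recordsPerPage and page undefined.
int recordsPerPage = 3;
int page = 1; // zero-based: page 0 is the first page

// Stand-in for the cbl.ID values of the joined rows.
var ids = Enumerable.Range(1, 10);

var pageItems = ids
    .OrderBy(id => id)            // paging needs a deterministic order
    .Skip(recordsPerPage * page)  // rows to skip, like OFFSET
    .Take(recordsPerPage)         // page size, like FETCH NEXT
    .ToList();

Console.WriteLine(string.Join(", ", pageItems));
```

With these values the second page contains the IDs 4, 5 and 6. Ordering before Skip matters against a database as well, since without it the rows that land on each page are not well defined.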