SQL EXISTS, JOIN, or SELECT over a collection of objects

Time: 2018-07-30 15:39:53

Tags: c# sql .net sql-server

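I am trying to optimize the Quartz .NET scheduler by eliminating the foreach iteration over each job.

This is a question about SQL rather than about .NET.

The code loops over every job and does the following for each one: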

bool existingJob = await JobExists(conn, newJob.Key, cancellationToken).ConfigureAwait(false);
try
{
    if (existingJob)
    {
        if (!replaceExisting)
        {
            throw new ObjectAlreadyExistsException(newJob);
        }
        await Delegate.UpdateJobDetail(conn, newJob, cancellationToken).ConfigureAwait(false);
    }
    else
    {
        await Delegate.InsertJobDetail(conn, newJob, cancellationToken).ConfigureAwait(false);
    }
}
catch (IOException e)
{
    throw new JobPersistenceException("Couldn't store job: " + e.Message, e);
}
catch (Exception e)
{
    throw new JobPersistenceException("Couldn't store job: " + e.Message, e);
}

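Each await issues a SQL request.

I want to do the opposite: run one bulk JobExists check over all the jobs that tells me which ones exist and which don't, then update the ones that exist and insert the ones that don't.

For example, with 200,000 jobs, instead of running the check 200,000 times and then inserting or updating each job, we would have three SQL transactions: one to check which jobs exist, one to bulk-insert, and one to bulk-update.

But I don't know how to do a bulk EXISTS in SQL; I only know how to do IF EXISTS (SELECT A FROM B) for a single query. Is something like this possible? Or should I do a bulk SELECT or some kind of JOIN? How should I approach it?

*Edit*

After some progress, the code now runs in a stored procedure, shown below: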

USE [Quartz]
GO
/****** Object:  StoredProcedure [dbo].[procProcessJobs]    Script Date: 7/31/2018 10:46:09 AM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

ALTER PROCEDURE [dbo].[procProcessJobs] 
-- Add the parameters for the stored procedure here
@Jobs JobsTableType READONLY
AS

DECLARE MYCURSOR CURSOR LOCAL STATIC READ_ONLY FORWARD_ONLY
FOR Select * FROM @Jobs

Declare @Sched NVARCHAR(120), @JobName NVARCHAR(150), @JobGroup NVARCHAR(150),
@Description NVARCHAR(250), @JobClass NVARCHAR(250), @IsDurable BIT, @IsNonConcurrent BIT,
@IsUpdateData BIT, @RequestsRecovery BIT, @JobData BIT

OPEN MYCURSOR

FETCH NEXT FROM MYCURSOR 
INTO @Sched, @JobName, @JobGroup, @Description, @JobClass, @IsDurable, @IsNonConcurrent, @IsUpdateData, @RequestsRecovery, @JobData

WHILE @@FETCH_STATUS = 0
BEGIN
    If EXISTS(SELECT * FROM QRTZ_JOB_DETAILS WHERE SCHED_NAME = @Sched AND JOB_NAME = @JobName AND JOB_GROUP = @JobGroup)
    BEGIN
        /*do your update code here*/
        UPDATE QRTZ_JOB_DETAILS 
            SET DESCRIPTION = @Description,
                JOB_CLASS_NAME = @JobClass, 
                IS_DURABLE = @IsDurable, 
                IS_NONCONCURRENT = @IsNonConcurrent, 
                IS_UPDATE_DATA = @IsUpdateData, 
                REQUESTS_RECOVERY = @RequestsRecovery, 
                JOB_DATA = @JobData  
            WHERE SCHED_NAME = @Sched AND JOB_NAME = @JobName AND JOB_GROUP = @JobGroup
    END
    ELSE BEGIN
        /*do your insert code here*/
        INSERT INTO QRTZ_JOB_DETAILS (SCHED_NAME, JOB_NAME, JOB_GROUP, DESCRIPTION, JOB_CLASS_NAME, IS_DURABLE, IS_NONCONCURRENT, IS_UPDATE_DATA, REQUESTS_RECOVERY, JOB_DATA)  
        VALUES(@Sched, @JobName, @JobGroup, @Description, @JobClass, @IsDurable, @IsNonConcurrent, @IsUpdateData, @RequestsRecovery, @JobData)
    END

END

CLOSE MYCURSOR
DEALLOCATE MYCURSOR

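But the code takes a very long time to run, and I don't know why...

2 Answers:

Answer 0: (score: 2)

When you are doing rowset operations, move them into the database engine, where they perform best.

Pass the list of newJobs to a stored procedure in a table-valued parameter, let the procedure do the work there, and return a rowset indicating the result for each newJob processed. Then handle the results in the client as needed. The performance gain should be substantial.

Given the information provided, this is what I would try.

Here is a sample stored procedure that does what you want. It is untested, so your mileage may vary: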

/* Create a table type. */  
CREATE TYPE JobsTableType AS TABLE   
( JobKey INT  
, Value1 INT
, Value2 INT);  
GO  

CREATE PROCEDURE procProcessJobs 
-- Add the parameters for the stored procedure here
@Jobs JobsTableType READONLY
AS

DECLARE MYCURSOR CURSOR LOCAL STATIC READ_ONLY FORWARD_ONLY
FOR Select Distinct JobKey FROM @Jobs

Declare @MyKey Int

OPEN MYCURSOR

FETCH NEXT FROM MYCURSOR INTO @MyKey

WHILE @@FETCH_STATUS = 0
BEGIN
    If EXISTS(Select * From [your_table_name] where      [your_table_name].JobKey=@MyKey)
    BEGIN
        /*do your update code here*/
    END
    ELSE BEGIN
        /*do your insert code here*/
    END

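    -- Advance the cursor; without this fetch @@FETCH_STATUS never changes and the loop never ends
    FETCH NEXT FROM MYCURSOR INTO @MyKey
END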

CLOSE MYCURSOR
DEALLOCATE MYCURSOR

GO
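The cursor above still checks one key at a time inside the procedure. As a rough sketch of a set-based alternative, the same work can be done in three statements: one join that reports which keys exist, one UPDATE, and one INSERT. The table variables @Existing and @Jobs, their columns, and the sample rows are assumptions for illustration only; in the procedure, @Jobs would be the table-valued parameter and @Existing would be the real target table.

```sql
-- Stand-ins for the real target table and the table-valued parameter (hypothetical names and data)
DECLARE @Existing TABLE (JobKey INT PRIMARY KEY, Value1 INT, Value2 INT);
DECLARE @Jobs     TABLE (JobKey INT PRIMARY KEY, Value1 INT, Value2 INT);

INSERT INTO @Existing (JobKey, Value1, Value2) VALUES (1, 10, 100), (2, 20, 200);
INSERT INTO @Jobs     (JobKey, Value1, Value2) VALUES (2, 21, 201), (3, 30, 300);

-- 1) "Bulk EXISTS": one row per incoming job, flagged 1 if it is already stored
SELECT j.JobKey,
       CASE WHEN e.JobKey IS NULL THEN 0 ELSE 1 END AS AlreadyExists
FROM @Jobs AS j
LEFT JOIN @Existing AS e ON e.JobKey = j.JobKey;

-- 2) Update every job that already exists, in a single statement
UPDATE e
SET e.Value1 = j.Value1,
    e.Value2 = j.Value2
FROM @Existing AS e
JOIN @Jobs AS j ON j.JobKey = e.JobKey;

-- 3) Insert every job that does not exist yet, in a single statement
INSERT INTO @Existing (JobKey, Value1, Value2)
SELECT j.JobKey, j.Value1, j.Value2
FROM @Jobs AS j
WHERE NOT EXISTS (SELECT 1 FROM @Existing AS e WHERE e.JobKey = j.JobKey);
```

Running the UPDATE before the INSERT matters here: in the other order, the freshly inserted rows would also match the join and be updated a second time, which is harmless but wasted work.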

As an added bonus :) here is some untested C# code:

using System.Collections.Generic;
using System.Data;

namespace ConsoleApp1
{
    public class Job
    {
        public int Key { get; set; }
        public int value1 { get; set; }
        public int value2 { get; set; }
    }

    class Program
    {
        // Builds a DataTable whose columns match JobsTableType, in the same order
        static DataTable getJobTabel(List<Job> jobs)
        {
            DataTable results = new DataTable();
            results.Columns.Add("JobKey", typeof(int));
            results.Columns.Add("Value1", typeof(int));
            results.Columns.Add("Value2", typeof(int));

            foreach (Job item in jobs)
            {
                results.Rows.Add(new object[] { item.Key, item.value1, item.value2 });
            }

            return results;
        }

        static void Main(string[] args)
        {
            List<Job> myJobs = new List<Job>(); //populate the myjobs list

            DataTable JobsToProcess = getJobTabel(myJobs);

            //create your connection and command objects then add JobsToProcess as a parameter
        }
    }
}

Answer 1: (score: 0)

Here is how I did it, based on @Ibrahim's answer:

Run this script in SQL Server Management Studio:

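/* Create a table type. Its columns must match the ones the MERGE below reads.
   The types are assumed from the variable declarations in the question, with
   JOB_DATA as VARBINARY(MAX) because the C# code passes a byte[]. */
CREATE TYPE JobsTableType AS TABLE
( SCHED_NAME NVARCHAR(120)
, JOB_NAME NVARCHAR(150)
, JOB_GROUP NVARCHAR(150)
, DESCRIPTION NVARCHAR(250)
, JOB_CLASS_NAME NVARCHAR(250)
, IS_DURABLE BIT
, IS_NONCONCURRENT BIT
, IS_UPDATE_DATA BIT
, REQUESTS_RECOVERY BIT
, JOB_DATA VARBINARY(MAX));
GO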

CREATE PROCEDURE procProcessJobs 
-- Add the parameters for the stored procedure here
@Jobs JobsTableType READONLY
AS

BEGIN
    MERGE [dbo].[QRTZ_JOB_DETAILS] AS T
    USING @Jobs AS S
    ON T.SCHED_NAME = S.SCHED_NAME AND T.JOB_NAME = S.JOB_NAME AND T.JOB_GROUP = S.JOB_GROUP
    WHEN MATCHED THEN UPDATE 
                    SET T.DESCRIPTION = S.DESCRIPTION,
                        T.JOB_CLASS_NAME = S.JOB_CLASS_NAME, 
                        T.IS_DURABLE = S.IS_DURABLE, 
                        T.IS_NONCONCURRENT = S.IS_NONCONCURRENT, 
                        T.IS_UPDATE_DATA = S.IS_UPDATE_DATA, 
                        T.REQUESTS_RECOVERY = S.REQUESTS_RECOVERY, 
                        T.JOB_DATA = S.JOB_DATA  
    WHEN NOT MATCHED THEN 
        INSERT (SCHED_NAME, JOB_NAME, JOB_GROUP, DESCRIPTION, JOB_CLASS_NAME, IS_DURABLE, IS_NONCONCURRENT, IS_UPDATE_DATA, REQUESTS_RECOVERY, JOB_DATA)
        VALUES(S.SCHED_NAME, S.JOB_NAME, S.JOB_GROUP, S.DESCRIPTION, S.JOB_CLASS_NAME, S.IS_DURABLE, S.IS_NONCONCURRENT, S.IS_UPDATE_DATA, S.REQUESTS_RECOVERY, S.JOB_DATA);

END

And here is how I call it from C#:

DataTable tempTable = GetJobTable(jobs);

using (var cmd = PrepareCommand(conn, string.Empty))
{
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.CommandText = "dbo.procProcessJobs";
    SqlParameter parameter = new SqlParameter
    {
        ParameterName = "@Jobs",
        SqlDbType = SqlDbType.Structured,
        Value = tempTable
    };
    cmd.CommandTimeout = 12000;

    cmd.Parameters.Add(parameter);

    return await cmd.ExecuteNonQueryAsync(cancellationToken).ConfigureAwait(false);
}


private DataTable GetJobTable(List<IJobDetail> jobs)
{
    DataTable results = new DataTable();
    results.Columns.Add(ColumnSchedulerName, typeof(string));
    results.Columns.Add(ColumnJobName, typeof(string));
    results.Columns.Add(ColumnJobGroup, typeof(string));
    results.Columns.Add(ColumnDescription, typeof(string));
    results.Columns.Add(ColumnJobClass, typeof(string));
    results.Columns.Add(ColumnIsDurable, typeof(bool));
    results.Columns.Add(ColumnIsNonConcurrent, typeof(bool));
    results.Columns.Add(ColumnIsUpdateData, typeof(bool));
    results.Columns.Add(ColumnRequestsRecovery, typeof(bool));
    results.Columns.Add(ColumnJobDataMap, typeof(byte[]));

    foreach (var job in jobs)
    {
        byte[] baos = null;
        if (job.JobDataMap.Count > 0)
        {
            baos = SerializeJobData(job.JobDataMap);
        }

        results.Rows.Add(new object[]
        {
            SchedulerNameLiteral,
            job.Key.Name,
            job.Key.Group,
            job.Description,
            GetStorableJobTypeName(job.JobType),
            GetDbBooleanValue(job.Durable),
            GetDbBooleanValue(job.ConcurrentExecutionDisallowed),
            GetDbBooleanValue(job.PersistJobDataAfterExecution),
            GetDbBooleanValue(job.RequestsRecovery),
            baos
        });
    }

    return results;
}
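
The first answer suggested returning a rowset that indicates the result for each processed job. One possible way to get that from a MERGE is its OUTPUT clause, sketched below. The table variables @Target and @Source, their columns, and the sample rows are assumptions for illustration only; they stand in for QRTZ_JOB_DETAILS and the @Jobs parameter.

```sql
-- Stand-ins for the real target table and the table-valued parameter (hypothetical names and data)
DECLARE @Target TABLE (JobKey INT PRIMARY KEY, Value1 INT);
DECLARE @Source TABLE (JobKey INT PRIMARY KEY, Value1 INT);

INSERT INTO @Target (JobKey, Value1) VALUES (1, 10), (2, 20);
INSERT INTO @Source (JobKey, Value1) VALUES (2, 21), (3, 30);

-- Upsert in one statement and report what happened to each source row
MERGE @Target AS T
USING @Source AS S
    ON T.JobKey = S.JobKey
WHEN MATCHED THEN
    UPDATE SET T.Value1 = S.Value1
WHEN NOT MATCHED BY TARGET THEN
    INSERT (JobKey, Value1) VALUES (S.JobKey, S.Value1)
-- $action is 'INSERT' or 'UPDATE' for each affected row
OUTPUT $action AS MergeAction, inserted.JobKey;
```

To read these rows from the client, the command would have to be executed with a data reader rather than ExecuteNonQueryAsync, which returns only the affected row count. Note also that MERGE raises an error when more than one source row matches the same target row, so the incoming list should not contain duplicate keys.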