I have one dataset that looks like this:
> head(featured_products)
   Dept Class     Sku Description                    Code Vehicle/Placement StartDate  EndDate    Comments(Circulation,Location,etc)
1:  430  4318  401684 ++INDV RAMEKIN WP 9CM          OSM  Facebook          2017-01-01 2017-01-29 Fancy Brunch Blog
2:  430  4318  401684 ++INDV RAMEKIN WP 9CM          OSM  Twitter           2017-01-01 2017-01-29 Fancy Brunch Blog
3:  340  3411 1672605 ++ SPHERE WILLOW 4"            OP1  Editorial         2016-02-29 2016-03-27 Spruce up for Spring
4:  230  2311 2114074 ++BOX 30 ISLAND ORCHRD TLIGHTS EM   Email             2016-02-17 2016-02-17 Island Orchard and Jeweled Lanterns
5:  895  8957 2118072 ++PAPASAN STL TAUPE            OSM  Instagram         2017-08-26 2017-10-01 by @audriestorme
6:  895  8957 2118072 ++PAPASAN STL TAUPE            EM   Email             2017-11-23 2017-11-23 Day 2 Black Friday AM
And another like this:
      SKU ActivityDate OnlineSalesQuantity OnlineDiscountPercent InStoreSalesQuantity InStoreDiscountPercent
1: 401684   2015-12-01                 150                  0.00                  406                   2.72
2: 401684   2015-12-02                   0                  0.00                  556                   3.79
3: 401684   2015-12-03                   0                  0.00                  723                   3.44
4: 401684   2015-12-04                  16                  4.91                  781                   2.46
5: 401684   2015-12-05                  17                  0.00                  982                   3.18
6: 401684   2015-12-06                   0                  0.00                  851                   3.12
I have added a column named featured to the second df, which is 1 if the product is listed in the first df on the given date and 0 otherwise.
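Now I want to add the Vehicle/Placement column to the new, combined df (when featured == 1). The problem is that different dates may have different vehicles, or several at once.
How can I scan the dates of the featured rows, compare them against df1, extract the Vehicle/Placement, and add it to the combined df?
This also has to be done efficiently, because df2 has 2.85 million rows.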
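I tried making the assignment of Vehicle/Placement conditional on combined$featured == 1, but this produces a warning:
Warning message:
In if (combined$featured == 1) { :
  the condition has length > 1 and only the first element will be used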
I think I found another solution, but it is extremely slow and takes hours to run:
# Add vehicle
if(combined$featured == 1) {
  for (n in 1:nrow(featured_products)) {
    for (m in 1:nrow(combined)) {
      combined$vehicle <- ifelse(combined$activitydate[m] %within% interval(featured_products$startdate[n],featured_products$enddate[n]), featured_products$`vehicle/placement`, NA)
    }
  }
}
Answer 0 (score: 0)
You want to break this into smaller, more efficient subtasks. R can be very fast when we use vectorized code, but not when using for loops. We can take advantage of that as follows:
library(lubridate) # provides %within% and interval()
combined = merge(combined, featured_products) # join both data frames on their shared column names
mismatch = !(combined$ActivityDate %within% interval(combined$StartDate, combined$EndDate) & combined$featured == 1) # rows outside the promotion window or not featured
combined$Placement[mismatch] = NA # blank out Placement in mismatched rows
combined[,c("StartDate", "EndDate")] = NULL # drop the helper date columns
Note that the column/object names may differ from yours, so you may need to adjust them.
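To make the join-then-filter idea concrete, here is a minimal base R sketch on toy data. The values, the shortened column name Vehicle, the table name sales, and the explicit by.x/by.y key mapping are all assumptions made for illustration; they are not taken from the real tables. It uses plain Date comparisons instead of lubridate, which should give the same result for whole-day dates.

```r
# Toy stand-ins for the two tables (illustrative values only)
featured_products <- data.frame(
  Sku       = c(401684, 401684, 2118072),
  Vehicle   = c("Facebook", "Twitter", "Instagram"),  # stands in for Vehicle/Placement
  StartDate = as.Date(c("2017-01-01", "2017-01-01", "2017-08-26")),
  EndDate   = as.Date(c("2017-01-29", "2017-01-29", "2017-10-01")),
  stringsAsFactors = FALSE
)
sales <- data.frame(
  SKU          = c(401684, 401684, 2118072),
  ActivityDate = as.Date(c("2015-12-01", "2017-01-10", "2017-09-01")),
  stringsAsFactors = FALSE
)

# Left join on the SKU so sales rows without any promotion are kept.
# The key columns are named differently in the two tables, hence by.x / by.y.
combined <- merge(sales, featured_products,
                  by.x = "SKU", by.y = "Sku", all.x = TRUE)

# Vectorised window test: TRUE only when the sale date lies inside the promotion
in_window <- !is.na(combined$StartDate) &
  combined$ActivityDate >= combined$StartDate &
  combined$ActivityDate <= combined$EndDate

combined$Vehicle[!in_window] <- NA           # no vehicle outside the window
combined$featured <- as.integer(in_window)   # 1 if featured on that date, else 0
combined$StartDate <- NULL                   # drop the helper date columns
combined$EndDate <- NULL

# A sale date that matches no window appears once per promotion row after the
# join, so collapse those identical rows
combined <- unique(combined)
print(combined)
```

A date that falls inside several promotions keeps one row per vehicle, which is one way to handle the "multiple vehicles" case. Two caveats: if a date matches one promotion of a SKU but not another, this sketch leaves an extra NA row beside the matched one, and the join temporarily multiplies rows. At 2.85 million rows a non-equi join in data.table may be worth considering instead.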