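Situation:
I need to add two flag columns: SameProduct (has this email bought the same product before?) and AnyProduct (has this email bought any product before?).
The output should contain 5 columns: Email, ProductName, DatePurchased, SameProduct, AnyProduct.
The raw data looks like this: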
abc@gmail.com cucumber 01-02-2019
abc@gmail.com orange 04-02-2019
abc@gmail.com grapefruit 15-02-2019
cde@gmail.com blackberry 06-02-2019
cde@gmail.com lime 15-02-2019
cde@gmail.com lime 20-02-2019
zzz@gmail.com apple 02-02-2019
zzz@gmail.com apple 18-02-2019
zzz@gmail.com orange 19-02-2019
zzz@gmail.com apple 28-02-2019
Goal:
I want my output to look like this:
Email ProductName DatePurchased SameProduct AnyProduct
abc@gmail.com cucumber 01-02-2019 0 0
abc@gmail.com orange 04-02-2019 0 1
abc@gmail.com grapefruit 15-02-2019 0 1
cde@gmail.com blackberry 06-02-2019 0 0
cde@gmail.com lime 15-02-2019 0 1
cde@gmail.com lime 20-02-2019 1 1
zzz@gmail.com apple 02-02-2019 0 0
zzz@gmail.com apple 18-02-2019 1 1
zzz@gmail.com orange 19-02-2019 0 1
zzz@gmail.com apple 28-02-2019 1 1
What I tried: I tried self-joining the table twice and using CASE statements, but I feel this approach is extremely inefficient.
Dummy data:
create table #table1 (email varchar(20), productname varchar(20), datepurchased date)
insert into #table1 values
('abc@gmail.com','cucumber','2019-02-01'),
('abc@gmail.com','orange','2019-02-04'),
('abc@gmail.com','grapefruit','2019-02-15'),
('cde@gmail.com','blackberry','2019-02-06'),
('cde@gmail.com','lime','2019-02-15'),
('cde@gmail.com','lime','2019-02-20'),
('zzz@gmail.com','apple','2019-02-02'),
('zzz@gmail.com','apple','2019-02-18'),
('zzz@gmail.com','orange','2019-02-19'),
('zzz@gmail.com','apple','2019-02-28')
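Note: my actual data has over 100 million rows. I am not sure which kind of query would process the data fastest.
Answer 0 (score: 3)
Another option to get the result.
I use ROW_NUMBER()-1 so that the first occurrence gets a zero. Then I use SIGN() to convert any positive value to 1.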
SELECT *,
SameProduct = SIGN(ROW_NUMBER() OVER(PARTITION BY email, productname ORDER BY datepurchased)-1),
AnyProduct = SIGN(ROW_NUMBER() OVER(PARTITION BY email ORDER BY datepurchased)-1)
FROM #table1
ORDER BY email, datepurchased;
If you prefer, you can cast to bit instead, which gives the same result as SIGN() here because none of the values is negative.
SELECT *,
SameProduct = CAST(ROW_NUMBER() OVER(PARTITION BY email, productname ORDER BY datepurchased)-1 AS bit),
AnyProduct = CAST(ROW_NUMBER() OVER(PARTITION BY email ORDER BY datepurchased)-1 AS bit)
FROM #table1
ORDER BY email, datepurchased;
Answer 1 (score: 2)
My solution uses LAG() and ROW_NUMBER().
LAG() always refers to the previous record, so it is useful for checking whether the previous and the current product are equal.
ROW_NUMBER() is only used to flag the first purchase (row number = 1).
Of course, the PARTITION BY and ORDER BY clauses are important for getting the records in the right order.
I also checked Vamsi Prabhala's solution, but IIF seems to perform faster than CASE-WHEN.
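As a rough illustration of the LAG() and ROW_NUMBER() approach described in this answer, the query might look like the sketch below. It assumes the #table1 temp table from the question and SQL Server 2012 or later (for LAG() and IIF); it is a reconstruction for illustration, not necessarily the answerer's exact query.

```sql
SELECT email, productname, datepurchased,
       -- LAG() returns NULL on the first row of each (email, productname) group,
       -- so the comparison is not true there and IIF falls through to 0.
       SameProduct = IIF(LAG(productname) OVER (PARTITION BY email, productname
                                                ORDER BY datepurchased) = productname, 1, 0),
       -- Row number 1 is the customer's first purchase of anything.
       AnyProduct  = IIF(ROW_NUMBER() OVER (PARTITION BY email
                                            ORDER BY datepurchased) = 1, 0, 1)
FROM #table1
ORDER BY email, datepurchased;
```

Partitioning LAG() by both email and productname matters: partitioning by email alone would only compare against the immediately preceding purchase, and would miss the apple bought on 2019-02-28 right after an orange.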
Answer 2 (score: 1)
One way is to use the count window function or row_number.
--count
select t.*
,case when count(*) over(partition by email,productname order by datepurchased) > 1 then 1 else 0 end as same_prev
,case when count(*) over(partition by email order by datepurchased) > 1 then 1 else 0 end as any_prev
from tbl t
--row_number
select t.*
,case when row_number() over(partition by email,productname order by datepurchased) > 1 then 1 else 0 end as same_prev
,case when row_number() over(partition by email order by datepurchased) > 1 then 1 else 0 end as any_prev
from tbl t
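The count and row_number variants can differ when a customer has several purchases on the same date. With ORDER BY and no explicit frame, count(*) uses the default RANGE frame, which includes every row that ties on datepurchased, while row_number() always numbers rows one by one. A small sketch, using a hypothetical #ties table that is not part of the question:

```sql
-- Hypothetical data: the same customer buys the same product twice on one day.
CREATE TABLE #ties (email varchar(20), productname varchar(20), datepurchased date);
INSERT INTO #ties VALUES
('abc@gmail.com', 'lime', '2019-02-01'),
('abc@gmail.com', 'lime', '2019-02-01');

SELECT t.*,
       -- Default frame (RANGE ... CURRENT ROW) counts tied rows together.
       COUNT(*) OVER (PARTITION BY email, productname
                      ORDER BY datepurchased) AS cnt_default_frame,
       -- An explicit ROWS frame counts physical rows, like row_number().
       COUNT(*) OVER (PARTITION BY email, productname
                      ORDER BY datepurchased
                      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cnt_rows_frame,
       ROW_NUMBER() OVER (PARTITION BY email, productname
                          ORDER BY datepurchased) AS rn
FROM #ties t;
```

With the default frame both rows get a count of 2, so both would be flagged as repeat purchases. With the ROWS frame or row_number(), one of the two rows gets 1 and stays unflagged. Which behaviour is right depends on how same-day purchases should be treated.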
Answer 3 (score: 1)
I would use row_number():
select t.*,
(case when 1 = row_number() over (partition by email, productname order by datepurchased)
then 0 else 1
end) as same_product,
(case when 1 = row_number() over (partition by email order by datepurchased)
then 0 else 1
end) as any_product
from #table1 t;
Note that the only difference between the two expressions is the partition by clause.
You can also do this without the case comparison:
select t.*,
coalesce(max(1) over (partition by email, productname order by datepurchased rows between unbounded preceding and 1 preceding), 0) as same_product,
coalesce(max(1) over (partition by email order by datepurchased rows between unbounded preceding and 1 preceding), 0) as any_product
from table1 t
order by email, datepurchased;
Here is a db<>fiddle.