Question

I'm using Redshift, which is based on Postgres. I have a query which goes like this:

    SELECT EXTRACT(year FROM created_at) AS CustomYear, client_ip, member_id, COUNT(*) AS Views
    FROM ads.fbs_page_view_staging
    WHERE member_id = 2
    GROUP BY CustomYear, client_ip, member_id
    HAVING COUNT(*) = 1
    ORDER BY CustomYear

Here, I'm selecting the combinations of CustomYear and client_ip for member_id 2 that have exactly one view (Views is 1). I would now like to take these combinations of CustomYear and client_ip and subset the entire table ads.fbs_page_view_staging to only the rows having such combinations.

If there was only one column I wanted to subset on, say client_ip, I could have written the following query and got the results:

    SELECT EXTRACT(year FROM created_at) AS CustomYear, COUNT(*)
    FROM ads.fbs_page_view_staging
    WHERE member_id = 2
      AND client_ip IN (SELECT client_ip FROM ((
            SELECT EXTRACT(year FROM created_at) AS CustomYear, client_ip, member_id, COUNT(*)
            FROM ads.fbs_page_view_staging
            WHERE member_id = 2
            GROUP BY CustomYear, client_ip, member_id
            HAVING COUNT(*) = 1
            ORDER BY CustomYear)))
    GROUP BY customyear
    ORDER BY customyear

Notice that in the outer query, I am subsetting based on client_ip. But how do I subset the table on a combination of columns?

Any help would be much appreciated.

Answer 1

Instead of subquerying, try joining directly to the results of your query. That way you can specify multiple criteria.

Here is (draft) SQL to select IP/member pairs that match the rows found by your sub-query (i.e. for some year in the past, there was only one view for that IP & member.)
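As a minimal sketch of the join technique described above, and not necessarily the exact SQL originally posted, the query below assumes the goal is to keep rows whose (year, client_ip) combination appears in the inner query. The table and column names come from the question; the aliases pv and single_views and the column name view_year are made up for illustration.

```sql
-- Sketch only: aliases pv, single_views and view_year are hypothetical.
SELECT
    EXTRACT(year FROM pv.created_at) AS CustomYear,
    COUNT(*) AS Views
FROM ads.fbs_page_view_staging AS pv
JOIN (
    -- (year, client_ip) combinations with exactly one view for member 2
    SELECT
        EXTRACT(year FROM created_at) AS view_year,
        client_ip
    FROM ads.fbs_page_view_staging
    WHERE member_id = 2
    GROUP BY EXTRACT(year FROM created_at), client_ip
    HAVING COUNT(*) = 1
) AS single_views
    -- Both conditions must hold, so the match is on the combination of columns
    ON  EXTRACT(year FROM pv.created_at) = single_views.view_year
    AND pv.client_ip = single_views.client_ip
WHERE pv.member_id = 2
GROUP BY EXTRACT(year FROM pv.created_at)
ORDER BY CustomYear;
```

The outer WHERE pv.member_id = 2 mirrors the single-column query in the question; dropping it would presumably apply the same combinations to every member's rows instead.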


I'm not certain I've captured the intent of your query correctly, but if not, hopefully you can adapt the technique.

Subsetting based on combinations from an inner query
