Querying data against varying baselines

Asked: 2014-07-21 12:10:39

Tags: java python sql excel ms-access

My data looks like this:

data_
company      result        ID    group
cars         50            q1    ground
boats        0             q1    water
bicycles     50            q2    ground
cars         75            q2    water 
horses       0             q2    ground
foxes        50            q5    ground
.....etc

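So I want to ask the following question:

Which ground companies have a result that differs from the cars company, and in which quarter (ID) does this happen?

For the data above, the result would essentially be: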

horses, q2 (result: 0, differs from cars 75)
bicycles, q2 (result: 50, differs from cars 75)

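I am using Excel or Access for this, but if anyone has a better suggestion I would be happy to hear it.

I feel I could manage a semi-automated approach in Excel: extract the baseline data, then ask the question with a combination of VLOOKUP and IF formulas. So something like this: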

baseline_
company    result   id 
cars       50       q1
cars       75       q2

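And then ask: which q1 ground companies have a result different from 50? Which q2 ground companies have a result different from 75?

It would even be possible to split the data up like this: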

groups_ground
company    result    id
cars       etc.      etc.
foxes      etc.      etc.
horses     etc.      etc.
bicycles   etc.      etc.

But given that my data has 500k+ rows, all of these approaches are rather tedious.

The SQL I have in mind is:

SELECT * FROM data_ D
 LEFT JOIN baseline_ B
 ON D.result=!B.result;

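2 answers:

Answer 0: (score: 1)

Your SQL is on the right track. But you need to look for matches against the cars rows and then keep the rows that have no match, so the query needs a few more conditions: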

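-- Anti-join: keep ground rows that have no cars row with the same
-- quarter (ID) and the same result. Quarters with no cars row at all
-- are returned too. "group" is a reserved word, hence the brackets
-- (Access / SQL Server / SQLite syntax).
SELECT d.*
FROM data_ d LEFT JOIN
     data_ dcars
     ON d.ID = dcars.ID and
        d.result = dcars.result and
        dcars.company = 'cars'
WHERE d.[group] = 'ground' and
      dcars.company is null;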
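
To check a query like this against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. It is a variant, not the query above: it assumes an inner self-join on ID is acceptable, which drops quarters that have no cars row (q5 here) and so reproduces the expected output in the question. The names ROWS, QUERY and differing_from_cars are made up for this illustration, and the SQL is written for SQLite, so Access may need adjustments.

```python
import sqlite3

# Sample rows from the question: (company, result, ID, group)
ROWS = [
    ("cars", 50, "q1", "ground"),
    ("boats", 0, "q1", "water"),
    ("bicycles", 50, "q2", "ground"),
    ("cars", 75, "q2", "water"),
    ("horses", 0, "q2", "ground"),
    ("foxes", 50, "q5", "ground"),
]

# Self-join on the quarter (ID) and keep ground rows whose result differs
# from the cars result. The inner join drops quarters without a cars row.
# [group] is bracketed because GROUP is a reserved word.
QUERY = """
SELECT d.company, d.ID, d.result, c.result
FROM data_ AS d
JOIN data_ AS c
  ON c.ID = d.ID AND c.company = 'cars'
WHERE d.[group] = 'ground'
  AND d.company <> 'cars'
  AND d.result <> c.result
ORDER BY d.company
"""


def differing_from_cars(rows):
    """Load rows into an in-memory SQLite table and run QUERY (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE data_ (company TEXT, result INTEGER, ID TEXT, [group] TEXT)"
        )
        conn.executemany("INSERT INTO data_ VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for company, quarter, result, cars_result in differing_from_cars(ROWS):
        print(f"{company}, {quarter} (result: {result}, differs from cars {cars_result})")
```

Whether quarters without a cars baseline should be reported or skipped depends on what you want; switching JOIN to LEFT JOIN and allowing c.result IS NULL in the WHERE clause would report them instead.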

Answer 1: (score: 1)

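# Rows are [company, result, ID, group]
data = [['cars',         50,            'q1',    'ground'],
        ['boats',        0,             'q1',    'water'],
        ['bicycles',     50,            'q2',    'ground'],
        ['cars',         75,            'q2',    'water'],
        ['horses',       0,             'q2',    'ground'],
        ['foxes',        50,            'q5',    'ground']]
# Baseline lookup: quarter ID -> cars result for that quarter
data_dict = {i[2]: i[1] for i in data if i[0] == 'cars'}
for i in data:
    # Only ground companies other than cars are compared
    if i[3] == 'ground' and i[0] != 'cars':
        # Compare the row's result (i[1]), not its ID, with the cars baseline;
        # a quarter with no cars row yields None and is reported as differing
        if i[1] != data_dict.get(i[2]):
            print("{}, {} (result: {}, differs from cars {})".format(i[0], i[2], i[1], data_dict.get(i[2])))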

Result:

bicycles, q2 (result: 50, differs from cars 75)
horses, q2 (result: 0, differs from cars 75)
foxes, q5 (result: 50, differs from cars None)
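
The output above lists foxes, q5 even though cars has no q5 row, whereas the expected result in the question lists only the q2 rows. If quarters without a cars baseline should be skipped, a minimal sketch could look like the following. The function name differs_from_baseline and its parameters are made up for this illustration, and it assumes at most one baseline row per quarter.

```python
# Sample rows from the question: (company, result, ID, group)
ROWS = [
    ("cars", 50, "q1", "ground"),
    ("boats", 0, "q1", "water"),
    ("bicycles", 50, "q2", "ground"),
    ("cars", 75, "q2", "water"),
    ("horses", 0, "q2", "ground"),
    ("foxes", 50, "q5", "ground"),
]


def differs_from_baseline(rows, baseline_company="cars", group="ground"):
    """Return (company, ID, result, baseline) for rows in `group` whose result
    differs from the baseline company's result in the same quarter."""
    # Quarter ID -> baseline result (assumes one baseline row per quarter)
    baseline = {qid: result for company, result, qid, _ in rows
                if company == baseline_company}
    differing = []
    for company, result, qid, grp in rows:
        if grp != group or company == baseline_company:
            continue
        if qid not in baseline:
            continue  # no baseline for this quarter, so nothing to compare
        if result != baseline[qid]:
            differing.append((company, qid, result, baseline[qid]))
    return differing


if __name__ == "__main__":
    for company, qid, result, base in differs_from_baseline(ROWS):
        print("{}, {} (result: {}, differs from cars {})".format(company, qid, result, base))
```

Both passes are linear in the number of rows, so this should cope with a 500k-row export, assuming the data fits in memory.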