Question

Suppose I have these records:

======= =========
Element id
======= =========
        "H"
        "O"

And another, similar table:

======== ==
Compound id
======== ==
         "Water"

Joined through a relation table:

======== == =========== ========== ==========
Relation id compound_id element_id bond
======== == =========== ========== ==========
         1  "Water"     "H"        "Covalent"
         2  "Water"     "H"        "Covalent"
         3  "Water"     "O"        "Covalent"

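Now, most of my queries will not be exact matches, but suppose I sometimes want to find the compound whose elements are exactly ["H", "H", "O"] (i.e. Water, but not Hydroxide (["H", "O"]) or Peroxide (["H", "H", "O", "O"])).

How would I go about this?

Consensus seems to have it that the best way to store arrays in SQL is through a many-to-many intermediate table.
However, even with database-specific functions such as GROUP_CONCAT, querying for an exact match without arrays seems slow and complex.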

Answer 1

Why not just use array_agg()?

select compound_id
from t3
group by compound_id
having array_agg(element_id order by element_id) = array['H', 'H', 'O']
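To illustrate why the `order by` inside the aggregate matters, here is a minimal sketch of the same comparison done client-side in Python on an in-memory SQLite database. It is not the answer's query itself: the table name `relation`, the extra Hydroxide and Peroxide rows, and the helper `compounds_with_exact_elements` are all assumptions made for this example. Sorting both sides makes the match independent of insertion order while still respecting duplicates.

```python
import sqlite3
from itertools import groupby


def compounds_with_exact_elements(conn, wanted):
    """Return compounds whose sorted element list equals sorted(wanted)."""
    # Rows arrive grouped by compound and sorted by element, which mirrors
    # array_agg(element_id order by element_id) ... group by compound_id.
    rows = conn.execute(
        "select compound_id, element_id from relation "
        "order by compound_id, element_id"
    )
    target = sorted(wanted)
    return [
        compound
        for compound, group in groupby(rows, key=lambda r: r[0])
        if [element for _, element in group] == target
    ]


# Hypothetical sample data: one row per atom, as in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table relation ("
    "id integer primary key, compound_id text, element_id text, bond text)"
)
conn.executemany(
    "insert into relation (compound_id, element_id, bond) values (?, ?, ?)",
    [
        ("Water", "H", "Covalent"),
        ("Water", "H", "Covalent"),
        ("Water", "O", "Covalent"),
        ("Hydroxide", "H", "Covalent"),
        ("Hydroxide", "O", "Covalent"),
        ("Peroxide", "H", "Covalent"),
        ("Peroxide", "H", "Covalent"),
        ("Peroxide", "O", "Covalent"),
        ("Peroxide", "O", "Covalent"),
    ],
)

# The wanted list may be given in any order; duplicates are significant.
matches = compounds_with_exact_elements(conn, ["H", "O", "H"])
print(matches)
```

In a database that supports ordered array aggregation, the filtering would stay on the server as in the query above; this sketch only shows the semantics of the comparison.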

Answer 2

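It is always better to keep the database normalized. In your particular case, I would store the count of each element per compound instead of adding a new row for every atom.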

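 compound_id element_id      bond         count
 -------------------------------------------------
   "Water"     "H"        "Covalent"        2
   "Water"     "O"        "Covalent"        1

The query for an exact match would be

 select compound_id
 from elements
 group by compound_id
 having count(
              case when
                (element_id = 'H' and count = 2) or
                (element_id = 'O' and count = 1) then 1
              end
        ) = count(*)
    -- also require both elements to be present, so that a compound
    -- consisting only of (H, 2) is not matched
    and count(*) = 2

However, this approach is not optimal, because it requires a sequential scan. If denormalization is not a problem, you can also store the number of distinct elements for each compound.

 compound_id   element_count
 ------------------------------
   "Water"          2

Then the query could be

 select e.compound_id
 from elements e
 join compounds c on e.compound_id = c.compound_id
 where c.element_count = 2
   and ((e.element_id = 'H' and e.count = 2) or
        (e.element_id = 'O' and e.count = 1))
 group by e.compound_id
 having count(*) = 2

And if you have indexes on both the elements columns used in the filter and compounds(element_count), the query will use them to retrieve results quickly even when the database is large.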
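As a rough check of the denormalized layout, the sketch below builds both tables in an in-memory SQLite database and runs the join query for an arbitrary element multiset. The sample rows other than Water, the index names, the choice of indexed columns on `elements`, and the helper `exact_match` are assumptions made for illustration only.

```python
import sqlite3


def exact_match(conn, wanted):
    """Return compounds whose per-element counts equal `wanted` exactly."""
    if not wanted:
        return []
    # One "(element_id = ? and count = ?)" term per wanted element.
    clause = " or ".join("(e.element_id = ? and e.count = ?)" for _ in wanted)
    params = [len(wanted)]
    for element, n in wanted.items():
        params += [element, n]
    params.append(len(wanted))
    sql = (
        "select e.compound_id from elements e "
        "join compounds c on e.compound_id = c.compound_id "
        f"where c.element_count = ? and ({clause}) "
        "group by e.compound_id having count(*) = ? "
        "order by e.compound_id"
    )
    return [row[0] for row in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table compounds (compound_id text primary key, element_count integer);
    create table elements (
        compound_id text, element_id text, bond text, count integer,
        primary key (compound_id, element_id)
    );
    -- Hypothetical indexes covering the columns the query filters on.
    create index idx_compounds_element_count on compounds (element_count);
    create index idx_elements_lookup on elements (element_id, count);
    """
)
conn.executemany(
    "insert into compounds values (?, ?)",
    [("Water", 2), ("Hydroxide", 2), ("Peroxide", 2), ("Hydrogen", 1)],
)
conn.executemany(
    "insert into elements values (?, ?, ?, ?)",
    [
        ("Water", "H", "Covalent", 2),
        ("Water", "O", "Covalent", 1),
        ("Hydroxide", "H", "Covalent", 1),
        ("Hydroxide", "O", "Covalent", 1),
        ("Peroxide", "H", "Covalent", 2),
        ("Peroxide", "O", "Covalent", 2),
        ("Hydrogen", "H", "Covalent", 2),
    ],
)

water = exact_match(conn, {"H": 2, "O": 1})
print(water)
```

The `element_count` filter is what rejects compounds with extra elements, and `having count(*)` rejects compounds missing one of the wanted elements; whether the indexes are actually used depends on the database and its planner.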

SQL: many-to-many alternative?
