Handling numbers with standard errors in Python

Time: 2016-07-16 15:53:30

Tags: python numpy math scipy

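I need to handle a lot of numbers with standard errors, so I wrote a class that handles them parametrically (below, taken from a larger module). As far as I know nothing existing does this, and the maths has worked for me in the past. I was happy until someone recently mentioned that numpy has something similar, although he was not sure. This chap has limited knowledge but pretends otherwise, so I am fairly sure it is not true. However, greed beats wisdom, so I am still looking for an epically hidden, super amazing class (ideally one that solves this empirically). Not to mention that Scipy and Numpy can be confusing at times.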

So I am wondering whether I have reinvented the wheel.

Here is the class, for reference. It works, so feel free to copy and paste it.

import math


class NumSEM:
    """
    A class to handle numbers with a standard error of the mean (SEM).
    I am not sure why there is nothing that does this, or whether this is the best way of doing it.
    For now errors are propagated parametrically, based on the maths I discuss in [mutanalyst](http://www.mutanalyst.com/)
    """
    option_space_between_numbers = True

    def __init__(self, num, sem, df=2):
        """
        :param num: the number (mean)
        :param sem: the standard error
        :param df: the number of samples used to determine the SE
        :return: an object with the three inputs as _num, _sem and _df attributes.
        """
        self._num = float(num)
        self._sem = float(sem)
        self._df = int(df)

    def __str__(self):
        """
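        The lowest digit to show is the first significant digit of the error, found with math.floor(math.log10(sem)).
        :return: the number and its SEM as a string, both rounded to that digit (never coarser than whole numbers).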
        """
        if self.option_space_between_numbers:
            s = " "
        else:
            s = ""
        if self._sem == 0:
            return str(self._num) + s + "±" + s + "Ind."
        else:
            sig = -math.floor(math.log10(self._sem))
            if sig < 0:
                sig = 0
            txt = "{:." + str(sig) + "f}" + s + "±" + s + "{:." + str(sig) + "f}"
            return txt.format(self._num, self._sem)

    def __add__(self, other):
        """
        Returns the addition of either two NumSEM objects or a NumSEM and a float/int
        :param other: NumSEM or int or float
        :return: a new NumSEM instance where the variance is based on the Bienaymé rule (variances summed) if both NumSEM.
        """
        if type(other) is NumSEM:
            v = self._var() + other._var()
            df = NumSEM._df(self._df, other._df)
            return NumSEM(self._num + other._num, math.sqrt(v / df), df)
        else:  # assume int or float
            return NumSEM(self._num + other, self._sem, self._df)

    def __sub__(self, other):
        """
        Returns the subtraction of either two NumSEM objects or a NumSEM and a float/int
        :param other: NumSEM or int or float
        :return: a new NumSEM instance where the variance is based on the Bienaymé rule (variances summed) if both NumSEM.
        """
        if type(other) is NumSEM:
            v = self._var() + other._var()
            df = NumSEM._df(self._df, other._df)
            return NumSEM(self._num - other._num, math.sqrt(v / df), df)
        else:  # assume int or float
            return NumSEM(self._num - other, self._sem, self._df)

    def __mul__(self, other):
        """
        Returns the multiplication of either two NumSEM objects or a NumSEM and a float/int
        :param other: NumSEM or int or float
        :return: a new NumSEM instance where, if both NumSEM, the variance is
        mean(x)^2 · mean(y)^2 · (var(x)/mean(x)^2 + var(y)/mean(y)^2), i.e. var(x) · mean(y)^2 + var(y) · mean(x)^2.
        The latter formula stems from the first order Taylor approximation of the König–Huygens theorem applied to a function:
        > Var(f(x)) ≈ [f'(x)]^2 · Var(x)
        And converting Var(xy) to Var(e^ln(xy)) and solving.
        > Var(e^ln(xy)) ≈ (e^ln(xy))^2 · Var(ln(xy)) = x^2 · y^2 · (Var(ln(x))+Var(ln(y))) ≈ x^2 · y^2 · (Var(x)/x^2+Var(y)/y^2) _etc._
        """
        if type(other) is NumSEM:
            v = self._var() * other._num ** 2 + other._var() * self._num ** 2
            df = NumSEM._df(self._df, other._df)
            return NumSEM(self._num * other._num, math.sqrt(v / df), df)
        else:  # assume int or float: a constant factor scales the SEM by its absolute value
            return NumSEM(self._num * other, self._sem * abs(other), self._df)

    def __truediv__(self, other):
        """
        Same principle as multiplication
        :param other: NumSEM or int or float
        :return: a new NumSEM instance where, if both NumSEM, the variance is the var(x)/mean(y)^2 + var(y)*mean(x)^2/mean(y)^4.
        """
        if type(other) is NumSEM:
            v = self._var() / other._num ** 2 + other._var() * self._num ** 2 / other._num ** 4
            df = NumSEM._df(self._df, other._df)
            return NumSEM(self._num / other._num, math.sqrt(v / df), df)
        else:  # assume int or float: a constant divisor scales the SEM by its absolute value
            return NumSEM(self._num / other, self._sem / abs(other), self._df)

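    def _var(self):
        # Variance recovered from the SEM: SEM^2 * n, where _df holds the sample size n.
        return self._sem ** 2 * self._df

    @staticmethod
    def _df(a, b):
        # I am unsure if min is best, hence the method. I assume that the worst case scenario is the smallest.
        # Also it is not degrees of freedom but sample size...
        # Always called as NumSEM._df(...): on instances, the attribute _df set in __init__ shadows this method.
        return min(a, b)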
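
Since the question asks for something solved empirically, the first-order product formula used in `__mul__` can be cross-checked against a simulation. The following is only a minimal sketch and not part of the original class: the function names `product_sd_first_order` and `product_sd_simulated` are hypothetical, the inputs are assumed to be independent and normally distributed, and it works on plain standard deviations rather than on the class's SEM and sample-size bookkeeping.

```python
import math
import random


def product_sd_first_order(mean_x, sd_x, mean_y, sd_y):
    """First-order (Taylor) standard deviation of x*y for independent x and y."""
    variance = sd_x ** 2 * mean_y ** 2 + sd_y ** 2 * mean_x ** 2
    return math.sqrt(variance)


def product_sd_simulated(mean_x, sd_x, mean_y, sd_y, draws=100_000, seed=0):
    """Empirical standard deviation of x*y from independent normal draws."""
    rng = random.Random(seed)  # fixed seed so the check is repeatable
    products = [rng.gauss(mean_x, sd_x) * rng.gauss(mean_y, sd_y)
                for _ in range(draws)]
    mean = math.fsum(products) / draws
    variance = math.fsum((p - mean) ** 2 for p in products) / draws
    return math.sqrt(variance)


# Illustrative values only: x = 10 with SD 0.5, y = 20 with SD 1.
analytic = product_sd_first_order(10.0, 0.5, 20.0, 1.0)
simulated = product_sd_simulated(10.0, 0.5, 20.0, 1.0)
print(f"first-order: {analytic:.3f}  simulated: {simulated:.3f}")
```

With small relative errors like these the two values should agree closely. The first-order formula drops the Var(x)·Var(y) term, so the gap is expected to widen as the relative errors grow.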

0 answers:

No answers