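Standard deviation in LINQ

Asked: 2010-02-12 17:48:21

Tags: linq standard-deviation

Does LINQ model the aggregate SQL function STDDEV() (standard deviation)?

If not, what is the simplest / best-practice way to calculate it?

Example: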

  SELECT test_id, AVG(result) avg, STDDEV(result) std 
    FROM tests
GROUP BY test_id
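
For comparison, the query above could be written against in-memory objects roughly as follows. This is only a sketch under assumptions: `TestResult`, `GroupedStatsSketch`, `SampleStdDev`, and `Summarize` are hypothetical names and not part of LINQ, and the helper uses the sample form (dividing by n - 1). SQL dialects differ on whether STDDEV means the sample or the population form, so check your database before comparing numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for the "tests" table.
public sealed class TestResult
{
    public int TestId { get; set; }
    public double Result { get; set; }
}

public static class GroupedStatsSketch
{
    // Sample standard deviation (divides by n - 1); returns 0 for fewer than two values.
    public static double SampleStdDev(this IEnumerable<double> values)
    {
        var list = values.ToList();                 // materialize once, then enumerate twice
        if (list.Count < 2) return 0.0;
        double avg = list.Average();
        double sumSq = list.Sum(v => (v - avg) * (v - avg));
        return Math.Sqrt(sumSq / (list.Count - 1));
    }

    // One output row per test id, carrying the average and the standard deviation.
    public static List<(int TestId, double Avg, double Std)> Summarize(IEnumerable<TestResult> tests) =>
        tests.GroupBy(t => t.TestId)
             .Select(g => (TestId: g.Key,
                           Avg: g.Average(t => t.Result),
                           Std: g.Select(t => t.Result).SampleStdDev()))
             .ToList();
}
```

Calling `GroupedStatsSketch.Summarize(rows)` on a list of `TestResult` objects returns one tuple per distinct `TestId`. This runs in memory (LINQ to Objects); whether a database LINQ provider can translate such a helper to SQL is a separate question.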

8 answers:

Answer 0 (score: 92)

You can write your own extension method to calculate it:

public static class Extensions
{
    public static double StdDev(this IEnumerable<double> values)
    {
       double ret = 0;
       int count = values.Count();
       if (count  > 1)
       {
          //Compute the Average
          double avg = values.Average();

          //Perform the Sum of (value-avg)^2
          double sum = values.Sum(d => (d - avg) * (d - avg));

          //Put it all together
          ret = Math.Sqrt(sum / count);
       }
       return ret;
    }
}

If you have a sample of the population rather than the whole population, you should use ret = Math.Sqrt(sum / (count - 1)); instead.
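
For reference, the two variants differ only in the divisor. Writing $\bar{x}$ for the mean of the $N$ values (the `avg` and `count` of the code above), the population and sample standard deviations are:

```latex
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}
\qquad\text{and}\qquad
s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}
```

The sample form is only defined for $N \ge 2$, which is why the code returns 0 when there are fewer than two values.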

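Transformed into an extension method from Adding Standard Deviation to LINQ by Chris Bennett.

Answer 1 (score: 57)

Dynami's answer works, but it makes multiple passes through the data to get a result. This is a single-pass method that calculates the sample standard deviation: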

public static double StdDev(this IEnumerable<double> values)
{
    // ref: http://warrenseen.com/blog/2006/03/13/how-to-calculate-standard-deviation/
    double mean = 0.0;
    double sum = 0.0;
    double stdDev = 0.0;
    int n = 0;
    foreach (double val in values)
    {
        n++;
        double delta = val - mean;
        mean += delta / n;
        sum += delta * (val - mean);
    }
    if (1 < n)
        stdDev = Math.Sqrt(sum / (n - 1));

    return stdDev;
}

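This is the sample standard deviation, since it divides by n - 1. For the population standard deviation, divide by n instead.

This uses Welford's method, which is more numerically accurate than the Average(x^2)-Average(x)^2 approach.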
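
To make the accuracy claim concrete, here is a small sketch that puts both approaches side by side in their population form (dividing by n). The class and method names are hypothetical, and the data set suggested below is just an illustration of values with a large common offset, which is where the difference-of-averages formula tends to lose precision.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WelfordComparisonSketch
{
    // Single-pass Welford update, population form (divides by n).
    public static double WelfordPopulationStdDev(IEnumerable<double> values)
    {
        double mean = 0.0;
        double m2 = 0.0;   // running sum of squared deviations from the current mean
        int n = 0;
        foreach (double x in values)
        {
            n++;
            double delta = x - mean;
            mean += delta / n;
            m2 += delta * (x - mean);
        }
        return n > 0 ? Math.Sqrt(m2 / n) : 0.0;
    }

    // Shortcut formula: sqrt(mean of squares - square of mean).
    // Enumerates the input twice, so pass an array or list.
    public static double NaivePopulationStdDev(IEnumerable<double> values)
    {
        double meanOfSquares = values.Average(x => x * x);
        double mean = values.Average();
        // The subtraction can cancel catastrophically; a negative difference yields NaN.
        return Math.Sqrt(meanOfSquares - mean * mean);
    }
}
```

With `new[] { 1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16 }` the true population variance is 22.5. The Welford version works with small deviations from the running mean and recovers that value, whereas the shortcut subtracts two numbers near 1e18, where adjacent doubles are more than 100 apart, so its result cannot be trusted.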

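Answer 2 (score: 28)

This converts David Clarke's answer into an extension method that follows the same form as the other aggregate LINQ functions, such as Average.

Usage would be: var stdev = data.StdDev(o => o.number)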

public static class Extensions
{
    public static double StdDev<T>(this IEnumerable<T> list, Func<T, double> values)
    {
        // ref: https://stackoverflow.com/questions/2253874/linq-equivalent-for-standard-deviation
        // ref: http://warrenseen.com/blog/2006/03/13/how-to-calculate-standard-deviation/ 
        var mean = 0.0;
        var sum = 0.0;
        var stdDev = 0.0;
        var n = 0;
        foreach (var value in list.Select(values))
        {
            n++;
            var delta = value - mean;
            mean += delta / n;
            sum += delta * (value - mean);
        }
        if (1 < n)
            stdDev = Math.Sqrt(sum / (n - 1));

        return stdDev;
    }
}

Answer 3 (score: 2)

var stddev = Math.Sqrt(data.Average(z=>z*z)-Math.Pow(data.Average(),2));
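
This one-liner relies on the identity that the population variance equals the mean of the squares minus the square of the mean. Expanding the square, with $\bar{x}$ denoting the mean of the $N$ values:

```latex
\sigma^2
= \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2
= \frac{1}{N}\sum_{i=1}^{N} x_i^2 \;-\; 2\bar{x}\cdot\frac{1}{N}\sum_{i=1}^{N} x_i \;+\; \bar{x}^2
= \frac{1}{N}\sum_{i=1}^{N} x_i^2 \;-\; \bar{x}^2
```

Note that this gives the population standard deviation (divisor $N$), and that in floating point the final subtraction can lose precision when the values are large relative to their spread.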

Answer 4 (score: 1)

More concisely (requires C# 6.0 or later), Dynami's answer becomes:

    public static double StdDev(this IEnumerable<double> values)
    {
        var count = values?.Count() ?? 0;
        if (count <= 1) return 0;

        var avg = values.Average();
        var sum = values.Sum(d => Math.Pow(d - avg, 2));

        return Math.Sqrt(sum / count);
    }

Answer 5 (score: 0)

    private void buttonNo_Click(object sender, EventArgs e)
    {
        pictureBoxAttendanceVerification.Dispose();
        veringerprint.FPVerificationStop();
        FormLecVerification lectVerification = new FormLecVerification(this);
        lectVerification.ShowDialog();
    }

Answer 6 (score: 0)

A simple four-liner. I used a List of doubles, but you could use IEnumerable<int> values instead.

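public static double GetStandardDeviation(List<double> values)
{
    // Fewer than two values: the sample standard deviation is undefined, so return -1 as a sentinel.
    // Checking first also avoids the exception Average() throws on an empty list.
    if (values.Count < 2) return -1;
    double avg = values.Average();
    double sum = values.Sum(v => (v - avg) * (v - avg));
    return Math.Sqrt(sum / (values.Count - 1));
}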
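
As a sketch of the IEnumerable<int> variant mentioned above (the class name is hypothetical, and the -1 sentinel simply mirrors the method above rather than being a recommendation), the sequence is copied to a list first so that a lazy source is enumerated only once:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IntStdDevSketch
{
    // Sample standard deviation of integers; returns -1 when fewer than two values are supplied.
    public static double GetStandardDeviation(IEnumerable<int> values)
    {
        var list = values.ToList();               // enumerate the source only once
        if (list.Count < 2) return -1;
        double avg = list.Average();              // Average over ints already returns a double
        double sum = list.Sum(v => (v - avg) * (v - avg));
        return Math.Sqrt(sum / (list.Count - 1));
    }
}
```

Because `avg` is a double, each difference `v - avg` is computed in floating point, so no explicit casts are needed.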

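Answer 7 (score: 0)

In the general case we want to compute StdDev in a single pass: what if values is a file or an RDBMS cursor that can change between computing the average and computing the sum? We would get an inconsistent result. The code below makes just one pass: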

// Population StdDev
public static double StdDev(this IEnumerable<double> values) {
  if (null == values)
    throw new ArgumentNullException(nameof(values));

  double N = 0;
  double Sx = 0.0;
  double Sxx = 0.0;

  foreach (double x in values) {
    N += 1;
    Sx += x;
    Sxx += x * x;
  }

  return N == 0
    ? double.NaN // or throw exception
    : Math.Sqrt((Sxx - Sx * Sx / N) / N);
}

The very same idea works for the sample StdDev:

// Sample StdDev
public static double StdDev(this IEnumerable<double> values) {
  if (null == values)
    throw new ArgumentNullException(nameof(values));

  double N = 0;
  double Sx = 0.0;
  double Sxx = 0.0;

  foreach (double x in values) {
    N += 1;
    Sx += x;
    Sxx += x * x;
  }

  return N <= 1
    ? double.NaN // or throw exception
    : Math.Sqrt((Sxx - Sx * Sx / N) / (N - 1));
}
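
Since the two methods above share a name and signature, they cannot live in the same class as written. One possible way to combine them, offered only as a sketch, is a single method with a flag; the class name and the `sample` parameter are hypothetical. It also clamps the variance at zero, because rounding in Sxx - Sx * Sx / N can produce a tiny negative number that would otherwise turn into NaN.

```csharp
using System;
using System.Collections.Generic;

public static class RunningSumsSketch
{
    // One pass over the data; sample == true divides by N - 1, otherwise by N.
    public static double StdDev(this IEnumerable<double> values, bool sample)
    {
        if (values == null)
            throw new ArgumentNullException(nameof(values));

        long n = 0;
        double sx = 0.0;    // running sum of x
        double sxx = 0.0;   // running sum of x * x

        foreach (double x in values)
        {
            n++;
            sx += x;
            sxx += x * x;
        }

        long divisor = sample ? n - 1 : n;
        if (divisor <= 0)
            return double.NaN;   // not enough data for the requested form

        double variance = (sxx - sx * sx / n) / divisor;
        return Math.Sqrt(Math.Max(variance, 0.0));   // guard against small negative rounding error
    }
}
```

Usage would be `data.StdDev(sample: true)` or `data.StdDev(sample: false)`. The clamp hides only the sign problem; the running-sums formula can still lose precision on data with a large mean and a small spread.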