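I'm writing a small application in C# that uses the MSChart control to draw scatter plots of sets of X and Y data points. Some of the sets can be fairly large (hundreds of data points).
Is there a 'standard' algorithm for plotting a line of best fit through the points? I was thinking of dividing the X data points into a predetermined number of sets, say 10 or 20, and for each set taking the mean of the corresponding Y values and the middle X value, and so on, to create the line. Is this the right approach?
I searched the existing threads, but they all seem to achieve the same goal with existing applications such as Matlab.
Thanks,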
Answer 0 (score: 10)
Use the linear least-squares algorithm:
using System;
using System.Collections.Generic;
using System.Linq;

public class XYPoint
{
    public int X;
    public double Y;
}

class Program
{
    // Fits y = a*x + b by ordinary least squares and returns the fitted point for each input X.
    // Assumes at least two distinct X values; otherwise the denominator below is zero.
    public static List<XYPoint> GenerateLinearBestFit(List<XYPoint> points, out double a, out double b)
    {
        int numPoints = points.Count;
        double meanX = points.Average(point => point.X);
        double meanY = points.Average(point => point.Y);
        // Cast to double so that X * X cannot overflow int for large X.
        double sumXSquared = points.Sum(point => (double)point.X * point.X);
        double sumXY = points.Sum(point => point.X * point.Y);

        // Slope: covariance of X and Y divided by the variance of X.
        a = (sumXY / numPoints - meanX * meanY) / (sumXSquared / numPoints - meanX * meanX);
        // Intercept: the fitted line passes through (meanX, meanY).
        b = meanY - a * meanX;

        // out parameters cannot be captured by a lambda, so copy them to locals first.
        double a1 = a;
        double b1 = b;
        return points.Select(point => new XYPoint() { X = point.X, Y = a1 * point.X + b1 }).ToList();
    }

    static void Main(string[] args)
    {
        List<XYPoint> points = new List<XYPoint>()
        {
            new XYPoint() {X = 1, Y = 12},
            new XYPoint() {X = 2, Y = 16},
            new XYPoint() {X = 3, Y = 34},
            new XYPoint() {X = 4, Y = 45},
            new XYPoint() {X = 5, Y = 47}
        };
        double a, b;
        List<XYPoint> bestFit = GenerateLinearBestFit(points, out a, out b);
        // Print the fitted equation, then each original point next to its fitted value.
        Console.WriteLine("y = {0:#.####}x {1:+#.####;-#.####}", a, b);
        for (int index = 0; index < points.Count; index++)
        {
            Console.WriteLine("X = {0}, Y = {1}, Fit = {2:#.###}", points[index].X, points[index].Y, bestFit[index].Y);
        }
    }
}
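As a sanity check, the arithmetic for the five sample points in Main can be worked by hand. This is only a sketch of the calculation; the symbols (bars for means, y-hat for fitted values) are notation introduced here and do not appear in the answer's code.

```latex
\begin{aligned}
\bar{x} &= \tfrac{1+2+3+4+5}{5} = 3, \qquad
\bar{y} = \tfrac{12+16+34+45+47}{5} = 30.8 \\
\textstyle\sum x_i^2 &= 55, \qquad
\textstyle\sum x_i y_i = 12 + 32 + 102 + 180 + 235 = 561 \\
\text{slope} &= \frac{561/5 - 3 \cdot 30.8}{55/5 - 3^2}
             = \frac{112.2 - 92.4}{11 - 9} = 9.9 \\
\text{intercept} &= \bar{y} - \text{slope}\cdot\bar{x} = 30.8 - 29.7 = 1.1 \\
\hat{y}(x) &= 9.9\,x + 1.1
\;\Rightarrow\; \hat{y} = 11,\ 20.9,\ 30.8,\ 40.7,\ 50.6
\ \text{for } x = 1,\dots,5
\end{aligned}
```

So the program should report the line y = 9.9x + 1.1, and the fitted value at X = 3 coincides with the mean of Y, as expected for a least-squares line.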
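Answer 1 (score: 1)
Yes. You want Linear Regression, specifically Simple Linear Regression.
The algorithm is basically:
Assume the line of best fit has the form y = ax + b.
Find the values of a and b that minimize the sum of the squared vertical distances between the points and that line (there should be only one minimum).
The Wikipedia page will give you everything you need.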
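For reference, the minimization described above can be written out as follows. This is a sketch of the usual textbook derivation, not something taken from the answer itself: n is the number of points, S is the sum of squared residuals, and the bars denote means, all of which are notation assumed here.

```latex
\begin{aligned}
S(a, b) &= \sum_{i=1}^{n} \bigl(y_i - a x_i - b\bigr)^2 \\
\frac{\partial S}{\partial b} &= -2 \sum_{i=1}^{n} \bigl(y_i - a x_i - b\bigr) = 0
  \;\Rightarrow\; b = \bar{y} - a\,\bar{x} \\
\frac{\partial S}{\partial a} &= -2 \sum_{i=1}^{n} x_i \bigl(y_i - a x_i - b\bigr) = 0
  \;\Rightarrow\;
  a = \frac{\tfrac{1}{n}\sum x_i y_i - \bar{x}\,\bar{y}}
           {\tfrac{1}{n}\sum x_i^2 - \bar{x}^2}
\end{aligned}
```

S is a convex quadratic in a and b, so this stationary point is the single minimum as long as the x values are not all identical (otherwise the denominator for a is zero).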