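I am trying to build a training set by drawing a line (perceptron) f and labeling the points on one side +1 and the points on the other side -1. I then create a new line g and try to bring it as close as possible to f by updating w = w + y(t)x(t), where w is the weight vector, y(t) is +1 or -1, and x(t) is the coordinates of a misclassified point. After implementing this, I do not get a good fit from g to f. Here is my code and some sample output.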
import random

random.seed()
points = [ [1, random.randint(-25, 25), random.randint(-25,25), 0] for k in range(1000)]
weights = [.1,.1,.1]
misclassified = []
############################################################# Function f
interceptf = (0,random.randint(-5,5))
slopef = (random.randint(-10, 10),random.randint(-10,10))
point1f = ((interceptf[0] + slopef[0]),(interceptf[1] + slopef[1]))
point2f = ((interceptf[0] - slopef[0]),(interceptf[1] - slopef[1]))
############################################################# Function G starting
interceptg = (-weights[0],weights[2])
slopeg = (-weights[1],weights[2])
point1g = ((interceptg[0] + slopeg[0]),(interceptg[1] + slopeg[1]))
point2g = ((interceptg[0] - slopeg[0]),(interceptg[1] - slopeg[1]))
#############################################################
def isLeft(a, b, c):
    return ((b[0] - a[0])*(c[1] - a[1]) - (b[1] - a[1])*(c[0] - a[0])) > 0

for i in points:
    if isLeft(point1f,point2f,i):
        i[3]=1
    else:
        i[3]=-1

for i in points:
    if (isLeft(point1g,point2g,i)) and (i[3] == -1):
        misclassified.append(i)
    if (not isLeft(point1g,point2g,i)) and (i[3] == 1):
        misclassified.append(i)

print(len(misclassified))

while misclassified:
    first = misclassified[0]
    misclassified.pop(0)
    a = [first[0],first[1],first[2]]
    b = first[3]
    a[:] = [x*b for x in a]
    weights = [(x + y) for x, y in zip(weights,a)]

interceptg = (-weights[0],weights[2])
slopeg = (-weights[1],weights[2])
point1g = ((interceptg[0] + slopeg[0]),(interceptg[1] + slopeg[1]))
point2g = ((interceptg[0] - slopeg[0]),(interceptg[1] - slopeg[1]))

check = 0
for i in points:
    if (isLeft(point1g,point2g,i)) and (i[3] == -1):
        check += 1
    if (not isLeft(point1g,point2g,i)) and (i[3] == 1):
        check += 1

print(weights)
print(check)
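117 <--- number of points misclassified by the initial g
[-116.9, -300.9, 190.1] <--- final weights
617 <--- number of points misclassified by g after the algorithm

956 <--- number of points misclassified by the initial g
[-33.9, -12769.9, -572.9] <--- final weights
461 <--- number of points misclassified by g after the algorithm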
Answer 0 (score: 0)
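There are at least a few problems with your algorithm:
Your "while" condition is wrong: perceptron learning is not a single pass over the points that were misclassified at the start, which is what your loop does now. The algorithm should keep iterating over all points for as long as any of them is misclassified. In particular, each update can turn some correctly classified points into misclassified ones, so you always have to go through all of them again and check that everything is still fine.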
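I am pretty sure that what you actually want is an update rule of the form (y(i) - p(i))x(i), where p(i) is the predicted label and y(i) is the true label. That said, if you only update on misclassified points, this obviously degenerates to your method: with labels in {+1, -1}, y(i) - p(i) equals 2y(i) whenever the prediction is wrong, so the two rules differ only by a constant factor.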
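Here is a rough sketch of both points, not a drop-in replacement for your code. The target line `target`, the helpers `predict`, `train_perceptron` and `count_errors`, and the `max_epochs` cap are all assumptions made up for this illustration. The loop re-scans every point on each pass, applies the (y - p) * x update, and stops only when a full pass makes no update.

```python
import random


def predict(weights, x):
    """Return +1 if the dot product w . x is positive, otherwise -1."""
    s = sum(w * xi for w, xi in zip(weights, x))
    return 1 if s > 0 else -1


def train_perceptron(points, labels, max_epochs=20000):
    """Make full passes over the data until one pass needs no update.

    max_epochs is only a safety cap (an assumption of this sketch) so the
    loop still ends if the data is not linearly separable.
    """
    weights = [0.0, 0.0, 0.0]
    for _ in range(max_epochs):
        updated = False
        for x, y in zip(points, labels):
            p = predict(weights, x)
            if p != y:
                # With labels in {+1, -1}, (y - p) is 2 * y on a mistake,
                # so this is the question's rule up to a constant factor.
                weights = [w + (y - p) * xi for w, xi in zip(weights, x)]
                updated = True
        if not updated:
            break  # every point is classified correctly
    return weights


def count_errors(weights, points, labels):
    """Count the points whose predicted label differs from the true one."""
    return sum(1 for x, y in zip(points, labels) if predict(weights, x) != y)


# Hypothetical target line f, chosen only for this example. With integer
# coordinates, 0.5 + x - y is never zero, so the labels are separable.
random.seed(0)
target = [0.5, 1.0, -1.0]
points = [[1, random.randint(-25, 25), random.randint(-25, 25)]
          for _ in range(200)]
labels = [predict(target, x) for x in points]

weights = train_perceptron(points, labels)
errors = count_errors(weights, points, labels)
print(weights)
print(errors)
```

Note that the predictions here come straight from the sign of the dot product with the weights, so there is no need to convert the weights back into two points on a line before checking each point.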