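Making linear regression more compact (Python)

Time: 2018-02-06 12:13:43

Tags: python python-3.x numpy matplotlib

I am trying to derive a linear expression for a dataset. I have plotted the data and the regression lines, but my code is not efficient. Is there a way to make it more compact?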

import numpy as np
import matplotlib.pyplot as plt    

temp1, tid0 = np.genfromtxt("forsok1.txt", dtype=float, skip_header=41, usecols = (1,2)).T
tid1 = tid0 - 200
temp2, tid2 = np.genfromtxt("forsok2.txt", dtype=float, skip_header=1, usecols = (1,2)).T
temp3, tid3 = np.genfromtxt("forsok3.txt", dtype=float, skip_header=1, usecols = (1,2)).T

tempreg1_1 = np.zeros(88)
tidreg1_1 = np.zeros(88)
for i in range(0, 88):
    tempreg1_1[i] = temp1[i]
    tidreg1_1[i] = tid1[i]
tempreg2_1 = np.zeros(65)
tidreg2_1 = np.zeros(65)
tempreg3_1 = np.zeros(65)
tidreg3_1 = np.zeros(65)
for i in range(0, 65):
    tempreg2_1[i] = temp2[i]
    tidreg2_1[i] = tid2[i]
    tempreg3_1[i] = temp3[i]
    tidreg3_1[i] = tid3[i]

tempreg1_2 = np.zeros(59)
tidreg1_2 = np.zeros(59)
for i in range(0, 59):
    tempreg1_2[i] = temp1[i+112]
    tidreg1_2[i] = tid1[i+112]
tempreg2_2 = np.zeros(76)
tidreg2_2 = np.zeros(76)
for i in range(0, 76):
    tempreg2_2[i] = temp2[i+93]
    tidreg2_2[i] = tid2[i+93]
tempreg3_2 = np.zeros(55)
tidreg3_2 = np.zeros(55)
for i in range(0,55):
    tempreg3_2[i] = temp3[i+100]
    tidreg3_2[i] = tid3[i+100]

tempreg1_3 = np.zeros(76)
tidreg1_3 = np.zeros(76)
for i in range(0, 76):
    tempreg1_3[i] = temp1[i+210]
    tidreg1_3[i] = tid1[i+210]
tempreg2_3 = np.zeros(80)
tidreg2_3 = np.zeros(80)
for i in range(0, 80):
    tempreg2_3[i] = temp2[i+207]
    tidreg2_3[i] = tid2[i+207]
tempreg3_3 = np.zeros(91)
tidreg3_3 = np.zeros(91)
for i in range(0,91):
    tempreg3_3[i] = temp3[i+181]
    tidreg3_3[i] = tid3[i+181]



R1_1, b1_1 = np.polyfit(tidreg1_1, tempreg1_1, 1)
R2_1, b2_1 = np.polyfit(tidreg2_1, tempreg2_1, 1)
R3_1, b3_1 = np.polyfit(tidreg3_1, tempreg3_1, 1)
R1_2, b1_2 = np.polyfit(tidreg1_2, tempreg1_2, 1)
R2_2, b2_2 = np.polyfit(tidreg2_2, tempreg2_2, 1)
R3_2, b3_2 = np.polyfit(tidreg3_2, tempreg3_2, 1)
R1_3, b1_3 = np.polyfit(tidreg1_3, tempreg1_3, 1)
R2_3, b2_3 = np.polyfit(tidreg2_3, tempreg2_3, 1)
R3_3, b3_3 = np.polyfit(tidreg3_3, tempreg3_3, 1)

tempreg1_1[0] = b1_1
tempreg2_1[0] = b2_1
tempreg3_1[0] = b3_1
for j in range(1, 88):
        tempreg1_1[j] = tempreg1_1[j-1] + 5*R1_1
for j in range(1, 65):
        tempreg2_1[j] = tempreg2_1[j-1] + 5*R2_1
        tempreg3_1[j] = tempreg3_1[j-1] + 5*R3_1

tempreg1_2[0] = b1_2 + 560*R1_2
tempreg2_2[0] = b2_2 + 465*R2_2
tempreg3_2[0] = b3_2 + 500*R3_2
for j in range(1, 59):
        tempreg1_2[j] = tempreg1_2[j-1] + 5*R1_2
for j in range(1, 76):
        tempreg2_2[j] = tempreg2_2[j-1] + 5*R2_2
for j in range(1, 55):
        tempreg3_2[j] = tempreg3_2[j-1] + 5*R3_2

tempreg1_3[0] = b1_3 + 1050*R1_3
tempreg2_3[0] = b2_3 + 1035*R2_3
tempreg3_3[0] = b3_3 + 905*R3_3
for j in range(1, 76):
        tempreg1_3[j] = tempreg1_3[j-1] + 5*R1_3
for j in range(1, 80):
        tempreg2_3[j] = tempreg2_3[j-1] + 5*R2_3
for j in range(1, 91):
        tempreg3_3[j] = tempreg3_3[j-1] + 5*R3_3

plt.figure()
ax1 = plt.subplot(311)
ax2 = plt.subplot(312)
ax3 = plt.subplot(313)

ax1.plot(tid1, temp1, ':', color="g")
ax1.plot(tidreg1_1, tempreg1_1, '-.',color="b")
ax1.plot(tidreg1_2, tempreg1_2, '-.',color="b")
ax1.plot(tidreg1_3, tempreg1_3, '-.',color="b")
ax2.plot(tid2, temp2, ':', color="g")
ax2.plot(tidreg2_1, tempreg2_1, '-.',color="b")
ax2.plot(tidreg2_2, tempreg2_2, '-.',color="b")
ax2.plot(tidreg2_3, tempreg2_3, '-.',color="b")
ax3.plot(tid3, temp3, ':', color="g")
ax3.plot(tidreg3_1, tempreg3_1, '-.',color="b")
ax3.plot(tidreg3_2, tempreg3_2, '-.',color="b")
ax3.plot(tidreg3_3, tempreg3_3, '-.',color="b")

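My code builds arrays from small parts of the dataset and then runs a linear regression on each of them. Each regression is then turned into another array, which is plotted in a subplot. This is done for three different time slots of the data.

I have tried to make it more compact, but have not yet managed to use functions for it. Thanks for your help, and sorry for my English.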

2 answers:

Answer 0: (score: 1)

This:

tempreg1_1 = np.zeros(88)
tidreg1_1 = np.zeros(88)
for i in range(0, 88):
    tempreg1_1[i] = temp1[i]
    tidreg1_1[i] = tid1[i]

is equivalent to this:

tempreg1_1 = temp1[:88]
tidreg1_1 = tid1[:88]

So you probably do not even need to create those arrays, since you can just use the slices directly.
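One caveat is worth keeping in mind: a basic slice is a view onto the original array, whereas the loop in the question fills a separate array. That matters here because the question's code later overwrites elements such as tempreg1_1[0]. A minimal sketch of the difference, using made-up values in place of the contents of forsok1.txt:

```python
import numpy as np

# Placeholder data standing in for the temp1 column (not the real file contents).
temp1 = np.arange(100, dtype=float)

view = temp1[:88]          # basic slice: shares memory with temp1
copy = temp1[:88].copy()   # independent array, like the loop-filled one

copy[0] = -1.0             # writing to the copy leaves temp1 alone
unchanged = temp1[0]

view[0] = -1.0             # writing to the view changes temp1 as well
changed = temp1[0]

shares_memory = np.shares_memory(view, temp1)
```

If the sliced arrays are only read (for example passed to np.polyfit or plotted), the view is fine; add .copy() only when the slice will be modified.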

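In general, you rarely need to create an empty array up front and then fill it with a loop. If you find yourself doing that in NumPy, there is almost certainly a better way.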
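The same idea applies to the loops in the question that build each regression line by repeatedly adding 5*R. As a rough sketch, assuming the time values are spaced 5 units apart and start at 0 (which is what those loops imply), the fitted line can be evaluated at all time values in one expression. The data below is synthetic and the names slope, intercept, fit_loop and fit_vec are illustrative only:

```python
import numpy as np

# Synthetic stand-in for one segment: one sample every 5 time units (an assumption).
tid = 5.0 * np.arange(88)
rng = np.random.default_rng(0)
temp = 20.0 + 0.03 * tid + rng.normal(scale=0.1, size=tid.size)

slope, intercept = np.polyfit(tid, temp, 1)

# Loop version, as in the question: start at the intercept, add 5*slope per step.
fit_loop = np.zeros(tid.size)
fit_loop[0] = intercept
for j in range(1, tid.size):
    fit_loop[j] = fit_loop[j - 1] + 5 * slope

# Vectorized version: evaluate the fitted line at every time value at once.
fit_vec = slope * tid + intercept
```

The vectorized form does not depend on the spacing being exactly 5, and it removes the need for hard-coded offsets such as 560 or 1050, because the actual time values are used directly.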

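Answer 1: (score: 0)

You do not have to write all of these operations out explicitly; you can loop over the pieces, since they are almost identical. Here is a simplified case. Sorry, you have a few too many variables, so I have used some simpler names: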

#read data

plt.figure()
ax1 = plt.subplot(311)
ax2 = plt.subplot(312)
ax3 = plt.subplot(313)

plots = [ax1, ax2, ax3]
for subplot in plots:

    # select the tidreg and tempreg data for this subplot here

    xCordinate = tidreg  # should be your tidreg
    y1 = tempreg1
    y2 = tempreg2

    regression1 = np.poly1d(np.polyfit(xCordinate , y1, 1))
    regression2 = np.poly1d(np.polyfit(xCordinate , y2, 1))
    subplot.plot(xCordinate, regression1(xCordinate), 'b-')
    subplot.plot(xCordinate, regression2(xCordinate), 'b-')

plt.show()

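Each iteration of the for loop corresponds to one subplot, so you only need to handle the data used in that subplot. The variables are updated on every iteration, so you do not have to create so many of them either. In theory, this cuts the work by two thirds and saves a lot of memory.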
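Putting the loop idea together with slicing, a fuller sketch might look like the following. This is one possible arrangement rather than the answerer's exact code: the helper names fit_segments and plot_runs are hypothetical, the segment bounds are read off the loop ranges in the question, and synthetic data stands in for the three forsok files.

```python
import numpy as np

def fit_segments(tid, temp, segments):
    """Fit a line to each (start, stop) slice; return (tid_slice, fitted_values) pairs."""
    fits = []
    for start, stop in segments:
        t = tid[start:stop]
        line = np.poly1d(np.polyfit(t, temp[start:stop], 1))
        fits.append((t, line(t)))
    return fits

def plot_runs(runs, segments_per_run):
    """Draw each run and its segment fits in its own subplot."""
    import matplotlib.pyplot as plt  # imported here so the fitting code needs only NumPy
    fig, axes = plt.subplots(len(runs), 1)
    for ax, (tid, temp), segments in zip(np.atleast_1d(axes), runs, segments_per_run):
        ax.plot(tid, temp, ':', color="g")
        for t, fitted in fit_segments(tid, temp, segments):
            ax.plot(t, fitted, '-.', color="b")
    return fig

# (start, stop) bounds taken from the loop ranges in the question.
segments_per_run = [
    [(0, 88), (112, 171), (210, 286)],   # forsok1
    [(0, 65), (93, 169), (207, 287)],    # forsok2
    [(0, 65), (100, 155), (181, 272)],   # forsok3
]

# Synthetic stand-in for the three files: a straight line sampled every 5 units.
tid = 5.0 * np.arange(300)
temp = 20.0 + 0.01 * tid
runs = [(tid, temp)] * 3

all_fits = [fit_segments(t, y, segs) for (t, y), segs in zip(runs, segments_per_run)]
```

With real data, runs would hold the (tid, temp) pairs read by np.genfromtxt, and calling plot_runs(runs, segments_per_run) followed by plt.show() would produce the three subplots.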

For indexing or slicing arrays, you can refer to this question and this NumPy manual.