0%

最小二乘回归策略

本文总结了最小二乘法针对一元线性方程、多元线性方程与多元非线性方程的回归策略,内容笔记向。

一元线性回归

a = 0,单变量x,单个未知数 b,求解一元一次方程

模型\(y = ax + b\)

目标函数:\(y = c\)

n个样本点:\((x_i, y_i), i=1,2,3,...,n\)

偏差平方和:\(e = \sum_{i=1}^{n}{(c - y_i)^2}\)

求解过程:

​ 偏差平方和函数: \[ e = \sum_{i=1}^{n}{(c - y_i)^2} \] ​ 对c求导,令导数为0: \[ \frac{de}{dc} = \sum_{i=1}^{n}2(c - y_i) = 0 \] ​ 解关于c的一元一次方程可得: \[ c = \frac{\begin{align}\sum_{i=1}^{n} y_i\end{align}}{n} = \overline y \] ​ 由此可知,拟合一条与x轴平行的直线的最佳参数,就是样本点y值的均值

a \(\not=\) 0,单变量x,两个未知数 a和b,解二元一次方程

模型\(y = ax + b\)

目标函数:\(y = ax+b\)

n个样本点:\((x_i, y_i), i=1,2,3,...,n\)

偏差平方和:\(e = \sum_{i=1}^{n}{((ax_i+b)) - y_i)^2}\)

求解过程:

偏差平方和函数: \[ e = \sum_{i=1}^{n}{((ax_i+b)) - y_i)^2} \] 对a、b的偏导数: \[ \begin{align} & \frac{\partial e}{\partial a} = \sum_{i=1}^{n}2x_i(ax_i +b-y_i) \\\\ & \frac{\partial e}{\partial b} = \sum_{i=1}^{n}2(ax_i + b-y_i) \\\\ \end{align} \]

同时令e对a和b的偏导等于0: \[ \begin{cases} \begin{align} &\sum_{i=1}^{n} x_i(ax_i+b-y_i) = 0\\\\ &\sum_{i=1}^{n} (ax_i+b-y_i) = 0 \end{align} \end{cases} \] 解关于a、b的二元一次方程组得: \[ \begin{cases} \begin{align} & a = \frac{n\sum x_iy_i - \sum x_i \sum y_i}{n\sum x_i^2 - (\sum x_i)^2} \\\\\\ &b = \frac{\sum y_i - a\sum x_i }{n} \end{align} \end{cases} \]

多元线性回归

多项式函数:n个变量x,(n+1)个未知数a,解多元线性方程组

模型\(y =a_0 + a_1x_1 + a_2x_2 + ... + a_nx_n\)

目标函数: \(y =a_0 + a_1x_1 + a_2x_2 + ... + a_nx_n\)

m组样本点:\((x_{1i},x_{2i},\dots,x_{ni}, y_i),\space i=1,2,...,m\)

偏差平方和:\(e = \sum_{i=1}^{m}{(a_0 + a_1x_{1i} + a_2x_{2i} + ... + a_nx_{n i}- y_i)^2}\)

求解过程:

偏差平方和函数: \[ \begin{align} e = \sum_{i=1}^{m}{(a_0 + a_1x_{1i} + a_2x_{2i} + ... + a_nx_{n i}- y_i)^2} \end{align} \] 令e对 \(a_0\) ~ \(a_n\) 的各个偏导为0,稍微变化形式可得到关于未知数的正规方程组(n+1个方程,n+1个未知数): \[ \begin{cases} \begin{align} \sum_{i=1}^{m}(a_0 + a_1x_{1i} + a_2&x_{2i} + ... + a_nx_{n i}) = \sum_{i=1}^{m}y_i \\\\ \sum_{i=1}^{m}(a_0 + a_1x_{1i} + a_2&x_{2i} + ... + a_nx_{n i})x_{1i} = \sum_{i=1}^{m}x_{1i}y_i \\\\ &\vdots \\\\ \sum_{i=1}^{m}(a_0 + a_1x_{1i} + a_2&x_{2i} + ... + a_nx_{n i})x_{ni} = \sum_{i=1}^{m}x_{ni}y_i \end{align} \end{cases} \] 用矩阵的形式表示为: \[ \begin{aligned} \left[ \begin{matrix} n &\sum x_{1i} &\sum x_{2i} &\dots &\sum x_{ni} \\\\ \sum x_{1i} &\sum x_{1i}^2 & \sum x_{1i}x_{2i} &\dots &\sum x_{1i}x_{ni} \\\\ \vdots &\vdots &\vdots &\ddots &\vdots \\\\ \sum x_{ni} &\sum x_{1i}x_{ni} &\sum x_{2i}x_{ni} &\sum \dots &\sum x_{ni}^2 \end{matrix} \right] \left[ \begin{matrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_n \end{matrix} \right] = \left[ \begin{matrix} 1 &1 &1 &\cdots &1 \\\\ x_{11} &x_{12} & x_{13} &\dots &x_{1m} \\\\ \vdots &\vdots &\vdots &\ddots &\vdots \\\\ x_{n1} &x_{n2} & x_{n3} &\dots &x_{nm} \end{matrix} \right] \left[ \begin{matrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\y_m \end{matrix} \right] \end{aligned} \] 令样本矩阵以及参数矩阵如下: \[ \begin{align} X = \left[ \begin{matrix} &1 &x_{11} &x_{21} &\dots &x_{n1} \\\\ &1 &x_{12} & x_{22} &\dots &x_{n2} \\\\ &\vdots &\vdots &\vdots &\ddots &\vdots \\\\ &1 &x_{1m} & x_{2m} &\dots &x_{nm} \end{matrix} \right], Y = \left[ \begin{matrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_m \end{matrix} \right], A= \left[ \begin{matrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_n \end{matrix} \right] \end{align} \] 则上面的矩阵方程可以表示为: \[ \begin{align} (X^TX)A = XY \end{align} \] 由于 \(X^TX\) 为满秩矩阵 ,即非奇异、可逆,因此将上式等号两侧同时左乘 \((X^TX)^{-1}\),可得最优参数矩阵: \[ \begin{align} A = (X^TX)^{-1}XY \end{align} \]

多元非线性回归

一元非线性回归可通过换元法转化为一元线性回归

四参数正弦函数:四个未知数,解多元非线性方程组

解法:视为整体,在初始值处一阶泰勒展开,结合牛顿-拉弗森迭代法每次构造一个线性方程组,用高斯消元法求解,再代回去迭代直至符合误差预期

模型\(y = Asin(wt+\theta)+c\)

目标函数:\(y = Asin(wt+\theta)+c\)

样本点:\((t_i, y_i), i=1,2,3...n\)

偏差平方和:\(e = \sum_{i=1}^{n}{[(Asin(wt_i + \theta)+c) - y_i]^2}\)

求解过程:

偏差平方和函数: \[ \begin{align} & e = \sum_{i=1}^{n}{[(Asin(wt_i + \theta)+c) - y_i]^2} \\\\ \end{align} \] 对各个待求参数求偏导: \[ \begin{align} &\frac{\partial e}{\partial A} = \sum_{i=1}^{n}2[Asin(wt_i +\theta) +c - y_i]sin(wt_i + \theta) \\\\ &\frac{\partial e}{\partial w} = \sum_{i=1}^{n}2[Asin(wt_i +\theta) +c - y_i]At_icos(wt_i+\theta) \\\\ &\frac{\partial e}{\partial \theta} = \sum_{i=1}^{n}2[Asin(wt_i +\theta) +c - y_i]Acos(wt_i+\theta) \\\\ &\frac{\partial e}{\partial c} = \sum_{i=1}^{n}2[Asin(wt_i +\theta) +c - y_i] \end{align} \] 令各个偏导数为0,等价于求解非线性方程组(注意和上式有差异,约掉了常系数): \[ \begin{cases} \begin{align} &f_1(X) = \sum_{i=1}^{n}[Asin(wt_i +\theta) +c - y_i]sin(wt_i + \theta) = 0\\\\ &f_2(X) = \sum_{i=1}^{n}[Asin(wt_i +\theta) +c - y_i]t_icos(wt_i+\theta) = 0\\\\ &f_3(X) = \sum_{i=1}^{n}[Asin(wt_i +\theta) +c - y_i]cos(wt_i+\theta) = 0\\\\ &f_4(X) = \sum_{i=1}^{n}[Asin(wt_i +\theta) +c - y_i] = 0 \end{align} \end{cases} \] 这里采用牛顿-拉弗森迭代法的思想,线性化迭代求解该方程组的4个未知量 \(A、w、\theta,、c\)

用矩阵形式表示参数集以及偏导数集,并假设初值为 \(X_0\) \[ X= \left[ \begin{matrix} A\\ w\\ \theta \\ c \end{matrix} \right], F(X) = \left[ \begin{matrix} f_1(X) \\ f_2(X) \\ f_3(X) \\ f_4(X) \end{matrix} \right] \] 构建 \(F(X)\)\(X_0\) 处的雅克比矩阵: \[ \begin{align} J_F(X)= \large \left[ \begin{matrix} \frac{\partial f_1(X_0)}{\partial A} & \frac{\partial f_1(X_0)}{\partial w} & \frac{\partial f_1(X_0)}{\partial \theta} & \frac{\partial f_1(X_0)}{\partial c} \\\\ \frac{\partial f_2(X_0)}{\partial A} & \frac{\partial f_2(X_0)}{\partial w} & \frac{\partial f_2(X_0)}{\partial \theta} & \frac{\partial f_2(X_0)}{\partial c} \\\\ \frac{\partial f_3(X_0)}{\partial A} & \frac{\partial f_3(X_0)}{\partial w} & \frac{\partial f_3(X_0)}{\partial \theta} & \frac{\partial f_3(X_0)}{\partial c} \\\\ \frac{\partial f_4(X_0)}{\partial A} & \frac{\partial f_4(X_0)}{\partial w} & \frac{\partial f_4(X_0)}{\partial \theta} & \frac{\partial f_4(X_0)}{\partial c} \end{matrix} \right] \end{align} \]

其中f1对各个参数的偏导: \[ \begin{align} &\frac{\partial f_1}{\partial A} = \sum_{i=1}^{n} sin^2(wt_i+\theta)\\\\ &\frac{\partial f_1}{\partial w} = \sum_{i=1}^{n} [At_isin2(wt_i+\theta)] + \sum_{i=1}^{n} [(c-y_i)t_icos(wt_i+\theta)]\\\\ &\frac{\partial f_1}{\partial \theta} = \sum_{i=1}^{n} [Asin2(wt_i+\theta)] + \sum_{i=1}^{n} [(c-y_i)cos(wt_i+\theta)]\\\\ &\frac{\partial f_1}{\partial c} = \sum_{i=1}^{n} sin(wt_i+\theta) \end{align} \]

f2对各个参数的偏导: \[ \begin{align} &\frac{\partial f_2}{\partial A} = \frac{1}{2}\sum_{i=1}^{n} t_isin2(wt_i+\theta)\\\\ &\frac{\partial f_2}{\partial w} = \sum_{i=1}^{n} [At_i^2cos2(wt_i+\theta)] - \sum_{i=1}^{n} [(c-y_i)t_i^2sin(wt_i+\theta)]\\\\ &\frac{\partial f_2}{\partial \theta} = \sum_{i=1}^{n} [At_icos2(wt_i+\theta)] - \sum_{i=1}^{n} [(c-y_i)t_isin(wt_i+\theta)]\\\\ &\frac{\partial f_2}{\partial c} = \sum_{i=1}^{n} t_icos(wt_i+\theta) \end{align} \]

f3对各个参数的偏导: \[ \begin{align} &\frac{\partial f_3}{\partial A} = \frac{1}{2}\sum_{i=1}^{n} sin2(wt_i+\theta) \\\\ &\frac{\partial f_3}{\partial w} = \sum_{i=1}^{n} [At_icos2(wt_i+\theta)] - \sum_{i=1}^{n} [(c-y_i)t_isin(wt_i+\theta)]\\\\ &\frac{\partial f_3}{\partial \theta} = \sum_{i=1}^{n} [Acos2(wt_i+\theta)] - \sum_{i=1}^{n} [(c-y_i)sin(wt_i+\theta)]\\\\ &\frac{\partial f_3}{\partial c} = \sum_{i=1}^{n} cos(wt_i+\theta) \end{align} \]

f4对各个参数的偏导: \[ \begin{align} &\frac{\partial f_4}{\partial A} = \sum_{i=1}^{n} sin(wt_i+\theta)\\\\ &\frac{\partial f_4}{\partial w} = \sum_{i=1}^{n} [At_icos(wt_i+\theta)]\\\\ &\frac{\partial f_4}{\partial \theta} = \sum_{i=1}^{n} [Acos(wt_i+\theta)]\\\\ &\frac{\partial f_4}{\partial c} = n \end{align} \]

\(F(X)\) 在初值 \(X_0\) 处的一阶泰勒展开: \[ \begin{align} F(X) \approx F(X_0) \space +\space J_F(X_0)(X-X_0)\end{align} \] 可知:令 等式右侧为0 求解出的 \(X\) 会比 \(X_0\) 更加趋近于 \(F(X)=0\) 的真实解,只要初值在这个真实解的收敛域内,经过不断的迭代求得的近似解会越来越趋近于真实解,我们令 \(\Delta X = X - X_0\) 可得: \[ \begin{align} & F(X_0) \space +\space J_F(X_0)\Delta X = 0 \end{align} \] 转化一下形式: \[ \begin{align} J_F(X_0)\Delta X = -F(X_0) \end{align} \] 这样我们便将非线性方程的求解问题转化成了线性方程的求解问题,上述式子就等价于: \[ \begin{align} \\ \large \left[ \begin{matrix} \frac{\partial f_1(X_0)}{\partial A} & \frac{\partial f_1(X_0)}{\partial w} & \frac{\partial f_1(X_0)}{\partial \theta} & \frac{\partial f_1(X_0)}{\partial c} \\ \frac{\partial f_2(X_0)}{\partial A} & \frac{\partial f_2(X_0)}{\partial w} & \frac{\partial f_2(X_0)}{\partial \theta} & \frac{\partial f_2(X_0)}{\partial c} \\ \frac{\partial f_3(X_0)}{\partial A} & \frac{\partial f_3(X_0)}{\partial w} & \frac{\partial f_3(X_0)}{\partial \theta} & \frac{\partial f_3(X_0)}{\partial c} \\ \frac{\partial f_4(X_0)}{\partial A} & \frac{\partial f_4(X_0)}{\partial w} & \frac{\partial f_4(X_0)}{\partial \theta} & \frac{\partial f_4(X_0)}{\partial c} \end{matrix} \right] \small \left[ \begin{matrix} \Delta A \\ \Delta w \\ \Delta \theta \\ \Delta c \end{matrix} \right] = \left[ \begin{matrix} - f_1(X_0) \\ - f_2(X_0) \\ - f_3(X_0) \\ - f_4(X_0) \\ \end{matrix} \right] \\\\ \end{align} \] 用高斯消元法或奇异值分解法解这个线性方程,可以求出 \(\Delta A, \Delta w, \Delta \theta, \Delta c\),进而本次迭代的结果就是 \((X_0 + \Delta X)\),然后我们将\((X_0 + \Delta X)\) 作为新的 \(X_0\) ,代入上面的线性方程,不断迭代求解,\(\Delta X\) 就会越来越小,可以获得越来越趋近于真实解的 \(X\)

Python API -- Leastsq功能测试

需要预先安装scipy库: sudo pip install scipy

这个程序是我在学最小二乘的时候写的一个测试程序,测试的目标是scipy库里leastsq这个以最小二乘法为基本原理的回归函数,目标函数可以是线性的也可以是非线性的。为了两个都测试,我选择了常量函数y=C作为线性模型,四参数正弦函数y=Asin(wt+Θ)+C作为非线性模型。其余各个函数的意义如下:

leastsq:API,输入参数依次为:误差函数、拟合初值、样本点集、迭代次数上限

func_sin:求正弦函数值。输入自变量x的值与四个参数组成的列表p,返回由p组成的正弦函数在x处的值。

error_sin:求拟合值与样本值的偏差量。输入拟合出来的参数列表p0与样本因变量y,返回二者的偏差量。

fit_sin:调用Leastsq函数进行拟合,获取拟合结果并求积分,返回拟合结果或积分值。

func_c/error_c/fit_c:模型换成y=C,与上面几个对照

read:纯粹是为了个人debug方便写的一个读取语句和捕获输入异常的功能函数,建议根据个人需求改写main函数。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
import numpy as np
from scipy.optimize import leastsq


def func_sin(x, p):
a, f, theta, c = p
return a * np.sin(2 * np.pi * f * x + theta) + c


def error_sin(p0, x, y):
return y - func_sin(x, p0)


def fit_sin(x, y, p, t, flag, max_fev=500):
x_sample = np.array(x)
y_sample = np.array(y)
x_max = max(x_sample)

p0 = np.array(p)
para = leastsq(error_sin, p0, args=(x_sample, y_sample), maxfev=max_fev)

a1, a2, a3, a4 = para[0]
a1 = round(float(a1), 3)
a2 = round(float(a2), 3)
a3 = round(float(a3), 3)
a4 = round(float(a4), 3)

# result_api = round(integrate(expr, (x_t, x_max, x_max+t)), 3) # 积分api,速度奇慢
tmp = a1 / (2*np.pi*a2)
temp_x = x_max + t
lower = - tmp - tmp*np.cos(2*np.pi*a2*x_max + a3) + a4*x_max
upper = - tmp - tmp*np.cos(2*np.pi*a2*temp_x + a3) + a4*temp_x
result = round((upper-lower), 3)
# print("\nresult_api = {}, result_straight = {}".format(result_api, result))

if flag:
return [a1, a2, a3, a4], result
else:
return result


def func_c(x, c):
return c * np.ones(x.shape)


def error_c(p, x, y):
return y - func_c(x, p)


def fit_c(x, y, p, t, flag, max_fev=500):
x_sample = np.array(x)
y_sample =np.array(y)
para = leastsq(error_c, p, args=(x_sample, y_sample), maxfev=max_fev)
a = round(float(para[0]), 3)
result = round(t*a, 3)
if flag:
return a, result
else:
print("拟合结果: y_c =", a)
return result


# (独立测试使用) 按规模读取键盘数字输入, 捕获异常
def read(sentence, quantity):
assert quantity > 0
sentence_error = "格式错误, 请按照要求重新输入: "
inf = []
try:
inf = list(map(float, input(sentence).split()))
while len(inf) != quantity:
inf = list(map(float, input(sentence_error).split()))
except ValueError:
while True:
try:
inf = list(map(float, input(sentence_error).split()))
while len(inf) != quantity:
inf = list(map(float, input(sentence_error).split()))
break
except ValueError:
continue

if quantity == 1:
return inf[0]
else:
return inf


# 作为独立程序测试
if __name__ == "__main__":

# 样本点要求: 跨度小 或者 跨度大密度大, 这里的跨度指的是样本横坐标跨越的周期个数

from sympy import *
from matplotlib import pyplot as plt
import time

sentence_1 = "请用空格分割, 按顺序输入希望的测试点 数目 & 横坐标最小值 & 横坐标最大值 & 横坐标分度值(如:15 -3.14 0 0.1): "
sentence_2 = "输入w测试正弦模型, e测试直线模型, 其它键取消测试: "
sentence_3 = "请用空格分割, 请按顺序输入正弦模型y = Asin(2πf*t+Θ)+c的四个参数A,f,Θ,c的真实值: "
sentence_4 = "请用空格分割, 输入参数A,f,Θ,c的拟合初值: "
sentence_5 = "请输入直线模型y = c中参数c的真实值: "
sentence_6 = "请输入参数c的拟合初值: "
sentence_7 = "请输入预测时长(单位:s): "
sentence_8 = "分度值不可为0, 请重新输入: "
sentence_9 = "请输入拟合次数上限: "

x_samples, y_samples, x_fit, y_fit = [], [], [], []

info = read(sentence_1, 4)
if info[3] == 0:
info = read(sentence_8, 4)
amount, x_min, x_max, x_division = int(info[0]), info[1], info[2], info[3]

key = input(sentence_2)

if key == 'w':
a, f, theta, c = read(sentence_3, 4) # 真实值
p0 = read(sentence_4, 4) # 拟合初值
x_samples = np.linspace(x_min, x_max, amount) # 样本x
y_origin = func_sin(x_samples, [a, f, theta, c]) # 样本y(无噪声)
y_samples = y_origin + 0.8 * np.random.randn(amount) # 样本y(带噪声)
t = read(sentence_7, 1) # 预测时长
max_fev = int(read(sentence_9, 1)) # 拟合次数上限

time_1 = time.time()
paras, result = fit_sin(x_samples, y_samples, p0, t, true, max_fev) # 拟合
time_2 = time.time()

x_t = symbols("x")
expr = paras[0] * sin(round(2 * np.pi * paras[1], 3) * x_t + paras[2]) + paras[3]

print("\n模型: y = Asin(2πf*t+Θ)+c")
print("样本点:\nx={}\ny(无噪声)={}\ny(含噪声)={}".format(x_samples, y_origin, y_samples))
print("函数原型参数: ", [a, f, theta, c])
print("拟合初值: ", p0)
print("拟合结果: ", paras)
print("拟合表达式: y =", expr)
print("拟合耗时: {} ms".format(round((time_2-time_1)*1000, 3)))
print("积分区间: [{}, {}]".format(x_max, x_max+t))
print("积分值: ", result)

if x_max > 0:
x_fit = np.arange(x_min, 2*(x_max+t), x_division)
x_bound = np.array([x_min, 2*(x_max+t)])
else:
x_fit = np.arange(x_min, abs(-x_max+2*t), x_division)
x_bound = np.array([x_min, abs(-x_max+2*t)])

y_fit = func_sin(x_fit, paras)
y_bound = np.array([c, c])

plt.figure(figsize=(8, 6))
plt.scatter(x_samples, y_samples, color='red', label='Sample Points', linewidth=2) # 带噪声的样本点图
plt.plot(x_samples, y_samples, color='gray', label="Sample Lines", linewidth=2) # 带噪声的样本曲线图
plt.plot(x_fit, y_fit, color='orange', label="Fitting Line", linewidth=2) # 拟合的曲线图
plt.plot(x_bound, y_bound, color='gray', linewidth=1, linestyle='--') # 对称轴
plt.xlabel("Time")
plt.ylabel("Velocity")
plt.plot(x_max, func_sin(x_max, paras), "ob", color='blue', label="current")
plt.plot(x_max+t, func_sin(x_max+t, paras), "ob", color='purple', label='Prediction')

plt.plot([x_max, x_max], [func_sin(x_max, paras), c],
color='blue', linestyle='--') # 作最新点到对称轴的垂直虚线
plt.plot([x_max+t, x_max+t], [func_sin(x_max+t, paras), c],
color='purple', linestyle='--') # 作预测点到对称轴的垂直虚线
x_series = np.linspace(x_max, x_max+t, amount) # 积分区域
y_1 = c
y_2 = func_sin(x_series, paras)
plt.fill_between(x_series, y_1, y_2, facecolor='white', hatch='/', interpolate=true) # 填充阴影

plt.legend()
plt.show()
exit(0)

elif key == 'e':
c = read(sentence_5, 1)
p0 = read(sentence_6, 1)
x_samples = np.linspace(x_min, x_max, amount) # 样本x
y_origin = func_c(x_samples, c) # 样本y(无噪声)
y_samples = y_origin + 0.8 * np.random.randn(amount) # 样本y(带噪声)
t = read(sentence_7, 1) # 预测时长
max_fev = int(read(sentence_9, 1)) # 拟合次数上限

time_1 = time.time()
para, result = fit_c(x_samples, y_samples, p0, t, true, max_fev) # 拟合
time_2 = time.time()

print("\n模型: y = c")
print("样本点:\nx={}\ny={}".format(x_samples, y_samples))
print("函数原型参数: ", c)
print("拟合初值: ", p0)
print("拟合结果: ", para)
print("拟合表达式: y =", para)
print("拟合耗时: {} ms".format((time_2-time_1)*1000))
print("积分区间: [{}, {}]".format(x_max, x_max+t))
print("积分值: ", result)

if x_max > 0:
x_fit = np.arange(x_min, 2*(x_max+t), x_division)
x_bound = np.array([x_min, 2.5*(x_max+t)])
else:
x_fit = np.arange(x_min, abs(-x_max+2*t), x_division)
x_bound = np.array([x_min, 1.25 * abs(-x_max+2*t)])

y_fit = func_c(x_fit, para)
y_bound = np.array([c, c])

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111)
ax.spines['right'].set_color('none') # 去掉右方的y轴
ax.spines['top'].set_color('none') # 去掉上方的x轴
ax.spines['left'].set_position(('data', x_min)) # 将y轴移动至x最小值处
ax.spines['bottom'].set_position(('data', min(y_samples))) # 将x轴移动至y样本最小值处
ax.set_xticks(np.arange(min(x_samples), x_max+2*t, 1)) # 设置x刻度
ax.scatter(x_samples, y_samples, color='red', label='Original Points', linewidth=1) # 有噪声的样本点图
plt.plot(x_bound, y_bound, color='green', label='Real Value', linewidth=1) # 实际
ax.plot(x_fit, y_fit, color='orange', label="Fitting Line", linewidth=2) # 有噪声样本拟合的曲线图
plt.plot(x_max, func_c(np.array([x_max]), para), "ob", color='blue', label="current") # 最新点
plt.plot(x_max+t, func_c(np.array([x_max+t]), para), "ob", color='purple', label='Prediction') # 预测点
ax.plot([x_max, x_max], [func_c(np.array([x_max]), para), min(y_samples)],
color='blue', linestyle='--') # 作最新点到x轴的垂直虚线
ax.plot([x_max+t, x_max+t], [func_c(np.array([x_max+t]), para), min(y_samples)],
color='purple', linestyle='--') # 作预测点到x轴的垂直虚线
x_series = np.linspace(x_max, x_max+t, amount) # 积分区域
y_1 = min(y_samples) * np.ones(x_series.shape)
y_2 = func_c(x_series, para)
plt.fill_between(x_series, y_1, y_2, where=y_1<y_2, facecolor='white', hatch='/', interpolate=true) # 填充阴影
plt.xlabel("Time/s")
plt.ylabel("Velocity")
plt.legend()
plt.show()
exit(0)

else:
print("已取消测试")
exit(0)

运行结果如下:

输入参数依次为:[15, -3.14, 0, 0.1], 'w', [8, 0.34, 0.52, 0], [7, 0.2, 0.2, 0], 0.7, 1000, 结果如下:

图像结果

由于个人需要求了个积分,因此输入的参数中的"0.7"表示积分区间的长度,积分区间为 [样本横坐标最大值, 样本横坐标最大值+0.7],由阴影部分表示。其余点/线的意义如下:

  • 红色点:由输入参数的正弦函数生成的带高斯噪声的样本点
  • 蓝色点:横坐标最大的样本点
  • 紫色点:积分点
  • 灰色折线:样本点依次连线
  • 黄色曲线:拟合结果

输出的数据结果如下,其中带噪声的样本点是由无噪声样本点的基础上加上符合标准正态分布的噪声产生的:

数据结果

程序可改进的地方:因为选择了y=C这样一个与自变量不相关的线性模型,所以最佳拟合值就是样本均值,不需要再调用api了!