Linear Regression with machine learning methods
Published: 2019-06-27


Ha, it's English time. Let's spend a few minutes on a simple machine learning example in a short passage.

Introduction

  • What is machine learning? You design methods that let the machine learn and improve by itself.
  • As a lead-in to machine learning methods, this passage presents three ways to find the optimal k and b of a linear regression model (y = k*x + b).
  • The data used is generated by ourselves.

Life is simple

Self-sufficient Data Generation

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import random

# produce data
age_with_fares = pd.DataFrame({
    "Fare": [263.0, 247.5208, 146.5208, 153.4625, 135.6333, 247.5208, 164.8667, 134.5, 135.6333, 153.4625,
             134.5, 263.0, 211.5, 263.0, 151.55, 153.4625, 227.525, 211.3375, 211.3375],
    "Age": [23.0, 24.0, 58.0, 58.0, 35.0, 50.0, 31.0, 40.0, 36.0, 38.0,
            41.0, 24.0, 27.0, 64.0, 25.0, 40.0, 38.0, 29.0, 43.0]})
sub_fare = age_with_fares['Fare']
sub_age = age_with_fares['Age']

# show our data
plt.scatter(sub_age, sub_fare)
plt.show()
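As a quick point of reference (not part of the original three methods), one could also compute a closed-form least-squares fit with numpy; note that np.polyfit minimizes the squared error while this article uses the absolute error, so the numbers will differ slightly. A minimal sketch, assuming sub_age and sub_fare from the block above:

# hypothetical reference fit, not from the original post: ordinary least squares via numpy
# polyfit returns the coefficients from highest degree to lowest, i.e. [k, b] for degree 1
k_ref, b_ref = np.polyfit(sub_age, sub_fare, deg=1)
print("least-squares reference: f(age) = {:.3f} * age + {:.3f}".format(k_ref, b_ref))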

[Figure: scatter plot of Age vs. Fare]

def func(age, k, b):
    return k * age + b

def loss(y, yhat):
    # here we use the mean absolute error (L1) as the loss; besides,
    # there are the mean-square-error (L2) loss and other loss functions
    return np.mean(np.abs(y - yhat))
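The comment above mentions the mean-square-error (L2) loss as an alternative; a minimal sketch of what that would look like (this variant is not used in the rest of the article):

def mse_loss(y, yhat):
    # L2 loss: penalizes large residuals more heavily than the absolute-error loss above
    return np.mean((y - yhat) ** 2)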

Random Chosen Method

min_error_rate = float('inf')
loop_times = 10000
losses = []

def step():
    # random.random() gives a number in (0, 1); *2 maps it to (0, 2); -1 maps it to (-1, 1)
    # random generation plus looping is where the "learning" comes from
    return random.random() * 2 - 1

while loop_times > 0:
    k_hat = random.random() * 20 - 10
    b_hat = random.random() * 20 - 10
    estimated_fares = func(sub_age, k_hat, b_hat)
    error_rate = loss(y=sub_fare, yhat=estimated_fares)
    if error_rate < min_error_rate:
        # the original post is truncated here; presumably it keeps the best parameters seen so far
        min_error_rate = error_rate
        best_k, best_b = k_hat, b_hat
        losses.append(min_error_rate)
    loop_times -= 1

print("f(age) = {} * age + {}, with error rate: {}".format(best_k, best_b, min_error_rate))
plt.scatter(sub_age, sub_fare)
plt.plot(sub_age, func(sub_age, best_k, best_b), c='r')
plt.show()

[Figure: the data with the best fit line found by the Random Chosen method]

Show the loss change

plt.plot(range(len(losses)), losses)
plt.show()

[Figure: loss curve of the Random Chosen method]

Explain

  • We can see that the loss decreases sometimes quickly and sometimes slowly, but it does decrease in the end.
  • One shortcoming of this method: the Random Chosen method is not very efficient, because it calls the random function an enormous number of times.
  • Even when it happens upon a better parameter, it may pick a worse one the next time.
  • An improved method is given in the next part.

Supervised Direction Method

change_directions = [
    (+1, -1),  # k increases, b decreases
    (+1, +1),
    (-1, -1),
    (-1, +1)
]

min_error_rate = float('inf')
loop_times = 10000
losses = []
best_direction = random.choice(change_directions)

# define the size of each change (the step)
def step():
    # random.random() gives a number in (0, 1); *2 maps it to (0, 2); -1 maps it to (-1, 1)
    # change_directions already carries the +1/-1 (direction) factor, so the *2 - 1 could be dropped,
    # but keeping it allows a wider range of step sizes
    return random.random() * 2 - 1

k_hat = random.random() * 20 - 10
b_hat = random.random() * 20 - 10
best_k, best_b = k_hat, b_hat

while loop_times > 0:
    k_delta_direction, b_delta_direction = best_direction or random.choice(change_directions)
    k_delta = k_delta_direction * step()
    b_delta = b_delta_direction * step()
    new_k = best_k + k_delta
    new_b = best_b + b_delta
    estimated_fares = func(sub_age, new_k, new_b)
    error_rate = loss(y=sub_fare, yhat=estimated_fares)
    if error_rate < min_error_rate:  # supervised: only accept a change that reduces the loss
        min_error_rate = error_rate
        best_k, best_b = new_k, new_b
        best_direction = (k_delta_direction, b_delta_direction)
        losses.append(min_error_rate)
    else:
        # the new direction must differ from the old one
        best_direction = random.choice(list(set(change_directions) - {(k_delta_direction, b_delta_direction)}))
    loop_times -= 1

print("f(age) = {} * age + {}, with error rate: {}".format(best_k, best_b, error_rate))

plt.scatter(sub_age, sub_fare)
plt.plot(sub_age, func(sub_age, best_k, best_b), c='r')
plt.show()

[Figure: the data with the fit line found by the Supervised Direction method]

Show the loss change

plt.plot(range(len(losses)), losses)
plt.show()

[Figure: loss curve of the Supervised Direction method]

Explain

  • The Supervised Direction method (2nd method) is better than the Random Chosen method (1st method).
  • The 2nd method introduces a supervision mechanism, which changes the parameters k and b more efficiently.
  • But the 2nd method can't refine the parameters down to a smaller scale.
  • Besides, the 2nd method can't locate the extremum, so it can't find the optimal parameters effectively.

Gradient Descent Method

min_error_rate = float('inf')
loop_times = 10000
losses = []
learning_rate = 1e-1

k_hat = random.random() * 20 - 10
b_hat = random.random() * 20 - 10

def derivate_k(y, yhat, x):
    # d/dk of mean(|y - (k*x + b)|): the sign of each residual times -x, averaged
    abs_values = [1 if (y_i - yhat_i) > 0 else -1 for y_i, yhat_i in zip(y, yhat)]
    return np.mean([a * -x_i for a, x_i in zip(abs_values, x)])

def derivate_b(y, yhat):
    # d/db of mean(|y - (k*x + b)|): the sign of each residual times -1, averaged
    abs_values = [1 if (y_i - yhat_i) > 0 else -1 for y_i, yhat_i in zip(y, yhat)]
    return np.mean([a * -1 for a in abs_values])

while loop_times > 0:
    k_delta = -1 * learning_rate * derivate_k(sub_fare, func(sub_age, k_hat, b_hat), sub_age)
    b_delta = -1 * learning_rate * derivate_b(sub_fare, func(sub_age, k_hat, b_hat))
    k_hat += k_delta
    b_hat += b_delta
    estimated_fares = func(sub_age, k_hat, b_hat)
    error_rate = loss(y=sub_fare, yhat=estimated_fares)
    losses.append(error_rate)
    loop_times -= 1

print('f(age) = {} * age + {}, with error rate: {}'.format(k_hat, b_hat, error_rate))
plt.scatter(sub_age, sub_fare)
plt.plot(sub_age, func(sub_age, k_hat, b_hat), c='r')
plt.show()
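As a side note (not in the original post), the per-element list comprehensions in derivate_k and derivate_b could be written more idiomatically with numpy's sign function; a minimal sketch, assuming the same sub_age and sub_fare as above:

def derivate_k_vec(y, yhat, x):
    # vectorized form of derivate_k: mean of -sign(residual) * x
    # (np.sign maps a zero residual to 0 rather than -1, a negligible difference here)
    return np.mean(-np.sign(np.asarray(y) - np.asarray(yhat)) * np.asarray(x))

def derivate_b_vec(y, yhat):
    # vectorized form of derivate_b: mean of -sign(residual)
    return np.mean(-np.sign(np.asarray(y) - np.asarray(yhat)))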

[Figure: the data with the fit line found by the Gradient Descent method]

Show the loss change

plt.plot(range(len(losses)), losses)
plt.show()

[Figure: loss curve of the Gradient Descent method]

Explain

  • To fit an objective function to discrete data, we use a loss function to measure how good the fit is.
  • Minimizing that loss then becomes an unconstrained extremum-finding problem.
  • Therefore, we descend along the gradient of the loss function (the derivatives used in the code above are written out below).
  • The gradient points in the direction in which the directional derivative is largest.
  • When the gradient approaches 0, we have a better fit of the objective function.
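For reference, here is a short write-up (mine, not from the original post) of the derivatives implemented by derivate_k and derivate_b above, where sign(·) denotes the sign of the residual and the derivative is taken wherever the residual is non-zero:

$$L(k, b) = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - (k x_i + b)\right|$$

$$\frac{\partial L}{\partial k} = -\frac{1}{n}\sum_{i=1}^{n}\operatorname{sign}\bigl(y_i - (k x_i + b)\bigr)\, x_i, \qquad \frac{\partial L}{\partial b} = -\frac{1}{n}\sum_{i=1}^{n}\operatorname{sign}\bigl(y_i - (k x_i + b)\bigr)$$

The update lines in the loop, k_hat += k_delta and b_hat += b_delta with k_delta = -learning_rate * derivate_k(...), are exactly k ← k − η·∂L/∂k and b ← b − η·∂L/∂b with η = learning_rate.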

Conclusion

  • Machine learning is a process in which the machine learns and improves through methods that we design.
  • A purely random search is usually not very efficient, but once we add a supervision mechanism it becomes much more efficient.
  • Gradient Descent finds the extremum, and hence the optimal parameters, efficiently.

Serious question for this article:

Why use machine learning methods instead of just writing down a y = k*x + b formula directly?

  • In some scenarios a hand-crafted formula can't meet the needs of reality, e.g. the irrational elements in economic models.
  • When we have enough valid data, we can fit regression or classification models with machine learning methods.
  • We can also evaluate the machine learning model on test data, which helps when the model is applied in real life.
  • This is just an example, okay?

Reposted from: https://www.cnblogs.com/ChristopherLE/p/10790492.html
