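Gradient descent is an iterative method that can be used to solve least-squares problems, both linear and nonlinear. When fitting the parameters of a machine learning model, which is an unconstrained optimization problem, gradient descent is one of the most commonly used methods; another common choice is the least-squares method. To find the minimum of a loss function, gradient descent iterates step by step, yielding the minimized loss and the corresponding model parameter values. Conversely, to find the maximum of a function, the iteration uses gradient ascent instead. In machine learning, two variants have been developed from the basic method: stochastic gradient descent and batch gradient descent.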
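The two variants named above can be compared on a small linear least-squares fit. The following is a minimal sketch rather than a reference implementation: the data, the step size `alpha`, the iteration counts, and all variable names are assumptions chosen for illustration. Batch gradient descent uses every sample in each step, while stochastic gradient descent uses one randomly chosen sample per step; both are checked against the direct least-squares solution.

```matlab
% Fit y = w(1) + w(2)*t by least squares with two gradient-descent variants.
t = (0:0.1:1)';              % 11 sample points (made-up data)
A = [ones(size(t)), t];      % design matrix
y = 1 + 2*t;                 % noise-free targets, so the exact solution is [1; 2]
m = numel(y);
alpha = 0.5;                 % step size (assumed; small enough for this data)

% Batch gradient descent: each step uses all m samples.
w_batch = zeros(2, 1);
for k = 1:5000
    grad = A' * (A*w_batch - y) / m;   % gradient of (1/(2m))*||A*w - y||^2
    w_batch = w_batch - alpha * grad;
end

% Stochastic gradient descent: each step uses one randomly chosen sample.
w_sgd = zeros(2, 1);
for k = 1:20000
    i = randi(m);                                  % pick one sample index
    grad_i = A(i, :)' * (A(i, :)*w_sgd - y(i));    % gradient of that sample's loss
    w_sgd = w_sgd - alpha * grad_i;
end

% Direct least-squares solution for comparison.
w_exact = A \ y;

fprintf('batch: [%g, %g]\n', w_batch(1), w_batch(2));
fprintf('sgd:   [%g, %g]\n', w_sgd(1), w_sgd(2));
fprintf('exact: [%g, %g]\n', w_exact(1), w_exact(2));
```

Because the targets here contain no noise, the stochastic variant can settle on the exact solution with a constant step size; with noisy data it would typically need a decreasing step size to stop fluctuating around the minimum.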
Basic introduction
Introduction
Solution procedure
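In its standard textbook form, the iteration can be written as below. The symbols are notation assumed here: $J$ is the function to be minimized, $\theta_k$ the parameter vector at iteration $k$, $\alpha > 0$ the step size, and $\varepsilon$ a small tolerance.

$$
\theta_{k+1} = \theta_k - \alpha \, \nabla J(\theta_k), \qquad k = 0, 1, 2, \dots
$$

The iteration is commonly stopped once the change becomes negligible, for example when

$$
\lVert \nabla J(\theta_k) \rVert < \varepsilon
\quad \text{or} \quad
\lvert J(\theta_{k+1}) - J(\theta_k) \rvert < \varepsilon .
$$

Gradient ascent, used to maximize a function, differs only in the sign of the step:

$$
\theta_{k+1} = \theta_k + \alpha \, \nabla J(\theta_k).
$$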
Application
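    %% Illustration of the steepest descent method
    % The step size is 0.1; f_change is the decrease in the function value
    % between two successive iterates. Only one exit condition is used.
    f = @(x) x.^2;          % objective function
    step = 0.1;             % step size
    x = 2;                  % initial point
    k = 0;                  % iteration counter
    f_current = f(x);       % current function value
    f_change = f_current;   % initial difference, large enough to enter the loop
    fplot(f, [-2, 2])       % plot the function
    axis([-2, 2, -0.2, 3])  % fix the axes
    hold on
    while f_change > 1e-9   % exit once two successive values differ by at most 1e-9
        x = x - step*2*x;               % 2*x is the gradient, so -2*x is the descent direction
        f_change = f_current - f(x);    % decrease in the function value
        f_current = f(x);               % update the current function value
        plot(x, f_current, 'ro', 'MarkerSize', 7)   % mark the current iterate
        drawnow; pause(0.2);
        k = k + 1;
    end
    hold off
    fprintf('After %d iterations the minimum found is %e at x = %e\n', k, f(x), x)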
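For this particular example the iterates can also be written in closed form, which makes the behaviour of the script easy to check by hand. With $f(x) = x^2$, gradient $2x$, step size $0.1$ and starting point $x_0 = 2$ (the settings used in the script), each update is a multiplication by a constant factor; the subscript notation $x_k$ is introduced here for illustration.

$$
x_{k+1} = x_k - 0.1 \cdot 2 x_k = 0.8\, x_k
\quad \Longrightarrow \quad
x_k = 2 \cdot 0.8^{\,k}, \qquad f(x_k) = 4 \cdot 0.64^{\,k}.
$$

The decrease in the function value at each step is therefore

$$
f(x_k) - f(x_{k+1}) = 0.36\, x_k^2 = 1.44 \cdot 0.64^{\,k},
$$

which shrinks geometrically. The iterates approach the minimizer $x = 0$ but never reach it exactly, so the stopping tolerance determines when the loop ends.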
Figure 1
Disadvantages
- Convergence slows down near the minimum.
- Line search can cause problems.
- The iterates may descend in a zigzag pattern.
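The zigzag behaviour and the slow convergence can be reproduced on a small quadratic. The sketch below is an illustration under assumed settings, not part of the original example: the matrix `Q`, the starting point, the tolerance, and all variable names are chosen here. It runs steepest descent with an exact line search on f(x) = 0.5*x'*Q*x and records the inner product of successive gradients. That product stays at zero up to rounding error, meaning each step is perpendicular to the previous one, which is the zigzag.

```matlab
% Steepest descent with exact line search on f(x) = 0.5 * x' * Q * x.
Q = diag([1, 10]);     % Hessian with condition number 10 (illustrative choice)
x = [10; 1];           % starting point chosen to make the zigzag pronounced
tol = 1e-6;            % stop when the gradient norm falls below this value
k = 0;                 % iteration counter
max_dot = 0;           % largest |g_k' * g_(k+1)| seen so far
g = Q * x;             % gradient at the starting point

while norm(g) > tol && k < 1000
    alpha = (g' * g) / (g' * Q * g);   % exact minimizer of f along -g
    x = x - alpha * g;                 % steepest descent step
    g_new = Q * x;                     % gradient at the new point
    max_dot = max(max_dot, abs(g' * g_new));
    g = g_new;
    k = k + 1;
end

fprintf('iterations: %d\n', k);
fprintf('largest |g_k'' * g_(k+1)|: %e\n', max_dot);
```

For comparison, a single Newton step would solve this quadratic exactly, whereas steepest descent needs many short perpendicular steps; the gap widens as the condition number of `Q` grows.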