python - Difference between LinearRegression() and Ridge(alpha=0) -


The Tikhonov (ridge) cost becomes equivalent to the least squares cost as the alpha parameter approaches zero. The scikit-learn docs on the subject indicate the same. Therefore I expected

sklearn.linear_model.Ridge(alpha=1e-100).fit(data, target)

to be equivalent to

sklearn.linear_model.LinearRegression().fit(data, target)

but that's not the case. Why?

Updated code:

import pandas as pd
from sklearn.linear_model import Ridge, LinearRegression
from sklearn.preprocessing import PolynomialFeatures
import matplotlib.pyplot as plt
%matplotlib inline

dataset = pd.read_csv('house_price_data.csv')

# Use .values.reshape: reshape() on a Series is removed in modern pandas.
x = dataset['sqft_living'].values.reshape(-1, 1)
y = dataset['price'].values.reshape(-1, 1)

polyX = PolynomialFeatures(degree=15).fit_transform(x)

model1 = LinearRegression().fit(polyX, y)
model2 = Ridge(alpha=1e-100).fit(polyX, y)

plt.plot(x, y, '.',
         x, model1.predict(polyX), 'g-',
         x, model2.predict(polyX), 'r-')

Note: the plot looks the same for alpha=1e-8 and alpha=1e-100.

[Plot: the data points with the LinearRegression fit (green) and the Ridge fit (red)]

According to the documentation, alpha must be a positive float. Your example has alpha=0, an integer. Using a small, positive alpha, the results of Ridge and LinearRegression appear to converge.

from sklearn.linear_model import Ridge, LinearRegression

data = [[0, 0], [1, 1], [2, 2]]
target = [0, 1, 2]

ridge_model = Ridge(alpha=1e-8).fit(data, target)
print("ridge coefs: " + str(ridge_model.coef_))
ols = LinearRegression().fit(data, target)
print("ols coefs: " + str(ols.coef_))

# ridge coefs: [ 0.49999999  0.50000001]
# ols coefs: [ 0.5  0.5]
#
# vs. alpha=0:
# ridge coefs: [  1.57009246e-16   1.00000000e+00]
# ols coefs: [ 0.5  0.5]
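The convergence makes sense from the closed-form solutions: ridge solves (XᵀX + αI)β = Xᵀy, which reduces to the ordinary least-squares normal equations at α = 0. A minimal NumPy sketch with hypothetical toy data (note that, unlike sklearn's Ridge, this version penalizes the intercept column too):

```python
import numpy as np

# Toy design matrix with an intercept column, and a toy target.
# (Hypothetical values, chosen only for illustration.)
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([0.0, 1.0, 2.0])

def ridge_coefs(X, y, alpha):
    # Closed-form ridge solution: (X^T X + alpha * I)^-1 X^T y.
    # Unlike sklearn's Ridge, this penalizes every column,
    # including the intercept.
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

ols_coefs = ridge_coefs(X, y, 0.0)     # alpha = 0: plain least squares
tiny_coefs = ridge_coefs(X, y, 1e-10)  # tiny positive penalty

# On well-conditioned data the two solutions agree to high precision.
print(ols_coefs, tiny_coefs)
```

On well-conditioned problems like this one, the α = 0 and tiny-α solutions match to many decimal places; the discrepancies in the question only appear once the design matrix becomes ill-conditioned.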

Update: the issue with alpha=0 as an int above only seems to matter for a few toy problems like the example above.

For the housing data, the issue is one of scaling. The 15-degree polynomial you invoke makes the problem severely ill-conditioned numerically. To produce identical results from LinearRegression and Ridge, try scaling the data first:

import pandas as pd
from sklearn.linear_model import Ridge, LinearRegression
from sklearn.preprocessing import PolynomialFeatures, scale

dataset = pd.read_csv('house_price_data.csv')

# Scale the x data to prevent numerical errors.
x = scale(dataset['sqft_living'].values.reshape(-1, 1))
y = dataset['price'].values.reshape(-1, 1)

polyX = PolynomialFeatures(degree=15).fit_transform(x)

model1 = LinearRegression().fit(polyX, y)
model2 = Ridge(alpha=0).fit(polyX, y)

print("ols coefs: " + str(model1.coef_[0]))
print("ridge coefs: " + str(model2.coef_[0]))

#ols coefs: [  0.00000000e+00   2.69625315e+04   3.20058010e+04  -8.23455994e+04
#  -7.67529485e+04   1.27831360e+05   9.61619464e+04  -8.47728622e+04
#  -5.67810971e+04   2.94638384e+04   1.60272961e+04  -5.71555266e+03
#  -2.10880344e+03   5.92090729e+02   1.03986456e+02  -2.55313741e+01]
#ridge coefs: [  0.00000000e+00   2.69625315e+04   3.20058010e+04  -8.23455994e+04
#  -7.67529485e+04   1.27831360e+05   9.61619464e+04  -8.47728622e+04
#  -5.67810971e+04   2.94638384e+04   1.60272961e+04  -5.71555266e+03
#  -2.10880344e+03   5.92090729e+02   1.03986456e+02  -2.55313741e+01]
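The scaling point can be checked directly: square-footage values in the thousands raised to the 15th power span dozens of orders of magnitude, so the unscaled design matrix is catastrophically ill-conditioned and the two solvers diverge in floating point. A rough sketch using synthetic data as a stand-in for the CSV (which isn't reproduced here):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures, scale

# Synthetic stand-in for the sqft_living column (values in the thousands).
rng = np.random.RandomState(0)
sqft = rng.uniform(500.0, 5000.0, size=100).reshape(-1, 1)

# Degree-15 polynomial features, with and without scaling the input first.
raw = PolynomialFeatures(degree=15).fit_transform(sqft)
scaled = PolynomialFeatures(degree=15).fit_transform(scale(sqft))

# The condition number of the raw design matrix is vastly larger, which
# is why LinearRegression and Ridge(alpha=0) can disagree on it.
print("condition number, raw:    %.3e" % np.linalg.cond(raw))
print("condition number, scaled: %.3e" % np.linalg.cond(scaled))
```

With a condition number that large, different solvers (Ridge's Cholesky-based path vs. LinearRegression's least-squares routine) land on visibly different coefficient vectors even though they are minimizing the same cost.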
