Exercise: Logistic Regression

Notes on my code for the logistic regression assignment.

Task 1

Hypothesis function: $h_\theta(x)=\mathrm{sigmoid}(\theta^Tx)$

Cost function: $J(\theta)=\frac{1}{m}\sum\limits_{i=1}^{m}[-y^{(i)}\log(h_\theta(x^{(i)}))-(1-y^{(i)})\log(1-h_\theta(x^{(i)}))]$

Gradient (partial derivatives of the cost function): $\frac{\partial J(\theta)}{\partial \theta_j}=\frac{1}{m}\sum\limits_{i=1}^{m}(h_\theta(x^{(i)})-y^{(i)})x_j^{(i)}$
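
The costFunction below calls a sigmoid helper; here is a minimal vectorized sketch, assuming the standard logistic definition:

function g = sigmoid(z)
%SIGMOID Compute the logistic function element-wise.
%   Works for scalars, vectors, and matrices alike.
g = 1 ./ (1 + exp(-z));
end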

The Matlab implementation of costFunction (already vectorized):

function [J, grad] = costFunction(theta, X, y)
%COSTFUNCTION Compute cost and gradient for logistic regression
%   J = COSTFUNCTION(theta, X, y) computes the cost of using theta as the
%   parameter for logistic regression and the gradient of the cost
%   w.r.t. the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
%
% Note: grad should have the same dimensions as theta
%
h = sigmoid(X * theta);                          % hypothesis for all m examples
costPerExample = -y .* log(h) - (1 - y) .* log(1 - h);
J = (1/m) * sum(costPerExample);                 % average cross-entropy cost
grad = (1/m) .* (X' * (h - y));                  % vectorized gradient
% =============================================================
end

costFunction returns two values, J and grad; grad is the gradient.
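
A quick sanity check: with theta = 0 the hypothesis is 0.5 for every example, so the cost must be $-\log(0.5)\approx0.693$ regardless of the data. A minimal sketch, assuming X already includes the intercept column:

initial_theta = zeros(size(X, 2), 1);               % all-zero starting point
[J, grad] = costFunction(initial_theta, X, y);
fprintf('Cost at initial theta (zeros): %f\n', J);  % expect about 0.6931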

Use fminunc to find the optimal theta (the previous exercise used a hand-written gradient-descent loop). fminunc is a built-in Matlab function that picks a suitable optimization algorithm on its own and does not require a learning rate.

%  Set options for fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400);

% Run fminunc to obtain the optimal theta
% This function will return theta and the cost
[theta, cost] = ...
    fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);

Setting GradObj to on tells fminunc that the supplied function returns [J, grad], i.e. both the cost and the gradient; MaxIter caps the number of iterations. See help fminunc for details.
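
For comparison, the hand-written loop from the earlier exercise boils down to repeated gradient steps; a rough sketch, where alpha and num_iters are hypothetical tuning values that fminunc makes unnecessary:

alpha = 0.01;                        % learning rate (hypothetical value)
num_iters = 400;
theta = initial_theta;
for iter = 1:num_iters
    [~, grad] = costFunction(theta, X, y);
    theta = theta - alpha * grad;    % one gradient-descent step
end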

Finally, plot the decision boundary over the training data.
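
For this linear model, the boundary is the line where $\theta^Tx=0$; a minimal sketch that solves for the second feature, assuming columns 2 and 3 of X hold the two raw exam scores as in the exercise data:

% Boundary: theta(1) + theta(2)*x1 + theta(3)*x2 = 0  =>  solve for x2
plot_x = [min(X(:,2))-2, max(X(:,2))+2];                % two x1 endpoints
plot_y = -(theta(1) + theta(2) .* plot_x) ./ theta(3);
plot(X(y==1,2), X(y==1,3), 'k+'); hold on;              % positive examples
plot(X(y==0,2), X(y==0,3), 'ko');                       % negative examples
plot(plot_x, plot_y, '-');                              % decision boundary
hold off;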

Task 2

This task is mainly about handling overfitting.

As the figure above shows, a straight line cannot serve as the decision boundary here; we need a curve. So we introduce higher-order terms to fit the data better. The assignment already maps the two features to all polynomial terms of the two features up to degree 6, i.e. a 28-dimensional feature vector (including the intercept term).

Applying the Task 1 algorithm directly would cause overfitting, as the figure shows.

The improved (regularized) cost function is:

$J(\theta)=\frac{1}{m}\sum\limits_{i=1}^{m}[-y^{(i)}\log(h_\theta(x^{(i)}))-(1-y^{(i)})\log(1-h_\theta(x^{(i)}))]+\frac{\lambda}{2m}\sum\limits_{j=1}^{n}\theta_j^2$

Note that $\theta_0$ is not regularized, so the gradient gains a $\frac{\lambda}{m}\theta_j$ term only for $j\geq1$.

The Matlab re-implementation of the cost function:

function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Compute cost and gradient for logistic regression with regularization
% J = COSTFUNCTIONREG(theta, X, y, lambda) computes the cost of using
% theta as the parameter for regularized logistic regression and the
% gradient of the cost w.r.t. the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
% You should set J to the cost.
% Compute the partial derivatives and set grad to the partial
% derivatives of the cost w.r.t. each parameter in theta

h = sigmoid(X * theta);                          % hypothesis for all m examples
costPerExample = -y .* log(h) - (1 - y) .* log(1 - h);
% Regularization penalty on the cost; theta(1), i.e. theta_0, is excluded
penaltyTerm = (lambda / (2 * m)) * sum(theta(2:end).^2);
J = (1/m) * sum(costPerExample) + penaltyTerm;
% Gradient penalty; the mask zeroes out the theta_0 entry
penaltyTerm2 = ((lambda / m) .* theta) .* [0; ones(size(theta, 1) - 1, 1)];
grad = (1/m) .* (X' * (h - y)) + penaltyTerm2;

% =============================================================
end

Adjusting lambda yields different decision boundaries.
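
A quick way to see this is to retrain with several regularization strengths (the values below are illustrative): lambda = 0 reproduces the overfit boundary, while a very large lambda underfits.

for lambda = [0 1 100]                  % illustrative values
    options = optimset('GradObj', 'on', 'MaxIter', 400);
    [theta, J] = fminunc(@(t)(costFunctionReg(t, X, y, lambda)), ...
        initial_theta, options);
    fprintf('lambda = %g: final cost %f\n', lambda, J);
end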

The feature-mapping code (provided with the assignment):

function out = mapFeature(X1, X2)
% MAPFEATURE Feature mapping function to polynomial features
%
%   MAPFEATURE(X1, X2) maps the two input features
%   to polynomial features used in the regularization exercise.
%
%   Returns a new feature array with more features, comprising
%   X1, X2, X1.^2, X2.^2, X1*X2, X1*X2.^2, etc..
%
%   Inputs X1, X2 must be the same size
%

degree = 6;
out = ones(size(X1(:,1)));   % bias (intercept) column
for i = 1:degree
    for j = 0:i
        out(:, end+1) = (X1.^(i-j)) .* (X2.^j);  % all terms of total degree i
    end
end

end
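
The loop produces one column per pair $(i,j)$ with $0\leq j\leq i\leq degree$, plus the bias column, so the number of features for degree $d$ is

$1+\sum\limits_{i=1}^{d}(i+1)=\frac{(d+1)(d+2)}{2}$

For $d=6$ this gives 28, matching the 28-dimensional vector mentioned above; $d=2$ gives 6 and $d=12$ gives 91.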

degree controls the highest-order term of the mapping. In my tests a degree-2 mapping (i.e. a conic boundary, roughly a circle) can also fit the data, but the accuracy clearly takes a hit:

Train Accuracy: 81.355932
Expected accuracy (with lambda = 1): 83.1 (approx)
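
These accuracy numbers come from thresholding the hypothesis at 0.5 and comparing against the labels (the exercise wraps this in a predict helper); a minimal sketch of that computation:

p = sigmoid(X * theta) >= 0.5;                  % predicted labels (0/1)
fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);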

I also tested degree > 6 (specifically degree = 12); the accuracy came out as follows:

Train Accuracy: 83.050847
Expected accuracy (with lambda = 1): 83.1 (approx)

My guess is that degree 6 is simply the best value the instructor found.

Finally, once everything is vectorized, the formulas behind the code become very hard to read; when a bug shows up, it is often easier to just rewrite the whole thing…

