Exercise: Logistic Regression

Notes on my code for the logistic regression assignment.

Task 1

Hypothesis function: $h_\theta(x)=\mathrm{sigmoid}(\theta^Tx)$

Cost function: $J(\theta)=\frac{1}{m}\sum\limits_{i=1}^{m}[-y^{(i)}\log(h_\theta(x^{(i)}))-(1-y^{(i)})\log(1-h_\theta(x^{(i)}))]$

Gradient (partial derivatives of the cost function): $\frac{\partial J(\theta)}{\partial \theta_j}=\frac{1}{m}\sum\limits_{i=1}^{m}(h_\theta(x^{(i)})-y^{(i)})x_j^{(i)}$
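
The costFunction below calls a sigmoid helper; here is a minimal vectorized sketch, assuming the standard logistic definition:

function g = sigmoid(z)
%SIGMOID Compute the logistic function element-wise.
%   Works for scalars, vectors, and matrices alike.
g = 1 ./ (1 + exp(-z));
end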

The Matlab implementation of costFunction (already vectorized):

function [J, grad] = costFunction(theta, X, y)
%COSTFUNCTION Compute cost and gradient for logistic regression
%   J = COSTFUNCTION(theta, X, y) computes the cost of using theta as the
%   parameter for logistic regression and the gradient of the cost
%   w.r.t. the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
%
% Note: grad should have the same dimensions as theta
%
h = sigmoid(X * theta);                          % hypothesis for all m examples
costPerExample = -y .* log(h) - (1 - y) .* log(1 - h);
J = (1/m) * sum(costPerExample);                 % average cross-entropy cost
grad = (1/m) .* (X' * (h - y));                  % vectorized gradient
% =============================================================
end

costFunction returns two values, J and grad; grad is the gradient.
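
A quick sanity check: with theta = 0 the hypothesis is 0.5 for every example, so the cost must be $-\log(0.5)\approx0.693$ regardless of the data. A minimal sketch, assuming X already includes the intercept column:

initial_theta = zeros(size(X, 2), 1);               % all-zero starting point
[J, grad] = costFunction(initial_theta, X, y);
fprintf('Cost at initial theta (zeros): %f\n', J);  % expect about 0.6931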

Use fminunc to find the optimal theta (the previous exercise used a hand-written gradient-descent loop). fminunc is a built-in Matlab function that picks a suitable optimization algorithm on its own and does not require a learning rate.

%  Set options for fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400);

% Run fminunc to obtain the optimal theta
% This function will return theta and the cost
[theta, cost] = ...
    fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);

Setting GradObj to on tells fminunc that the supplied function returns [J, grad], i.e. both the cost and the gradient; MaxIter caps the number of iterations. See help fminunc for details.
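
For comparison, the hand-written loop from the earlier exercise boils down to repeated gradient steps; a rough sketch, where alpha and num_iters are hypothetical tuning values that fminunc makes unnecessary:

alpha = 0.01;                        % learning rate (hypothetical value)
num_iters = 400;
theta = initial_theta;
for iter = 1:num_iters
    [~, grad] = costFunction(theta, X, y);
    theta = theta - alpha * grad;    % one gradient-descent step
end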

Finally, plot the decision boundary over the training data.
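
For this linear model, the boundary is the line where $\theta^Tx=0$; a minimal sketch that solves for the second feature, assuming columns 2 and 3 of X hold the two raw exam scores as in the exercise data:

% Boundary: theta(1) + theta(2)*x1 + theta(3)*x2 = 0  =>  solve for x2
plot_x = [min(X(:,2))-2, max(X(:,2))+2];                % two x1 endpoints
plot_y = -(theta(1) + theta(2) .* plot_x) ./ theta(3);
plot(X(y==1,2), X(y==1,3), 'k+'); hold on;              % positive examples
plot(X(y==0,2), X(y==0,3), 'ko');                       % negative examples
plot(plot_x, plot_y, '-');                              % decision boundary
hold off;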

Task 2

This task is mainly about handling overfitting.

As the figure above shows, a straight line cannot serve as the decision boundary here; we need a curve. So we introduce higher-order terms to fit the data better. The assignment already maps the two features to all polynomial terms of the two features up to degree 6, i.e. a 28-dimensional feature vector (including the intercept term).

Applying the Task 1 algorithm directly would cause overfitting, as the figure shows.

The improved (regularized) cost function is:

$J(\theta)=\frac{1}{m}\sum\limits_{i=1}^{m}[-y^{(i)}\log(h_\theta(x^{(i)}))-(1-y^{(i)})\log(1-h_\theta(x^{(i)}))]+\frac{\lambda}{2m}\sum\limits_{j=1}^{n}\theta_j^2$

Note that $\theta_0$ is not regularized, so the gradient gains a $\frac{\lambda}{m}\theta_j$ term only for $j\geq1$.

The Matlab re-implementation of the cost function:

function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Compute cost and gradient for logistic regression with regularization
% J = COSTFUNCTIONREG(theta, X, y, lambda) computes the cost of using
% theta as the parameter for regularized logistic regression and the
% gradient of the cost w.r.t. the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
% You should set J to the cost.
% Compute the partial derivatives and set grad to the partial
% derivatives of the cost w.r.t. each parameter in theta

h = sigmoid(X * theta);                          % hypothesis for all m examples
costPerExample = -y .* log(h) - (1 - y) .* log(1 - h);
% Regularization penalty on the cost; theta(1), i.e. theta_0, is excluded
penaltyTerm = (lambda / (2 * m)) * sum(theta(2:end).^2);
J = (1/m) * sum(costPerExample) + penaltyTerm;
% Gradient penalty; the mask zeroes out the theta_0 entry
penaltyTerm2 = ((lambda / m) .* theta) .* [0; ones(size(theta, 1) - 1, 1)];
grad = (1/m) .* (X' * (h - y)) + penaltyTerm2;

% =============================================================
end

Adjusting lambda yields different decision boundaries.
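
A quick way to see this is to retrain with several regularization strengths (the values below are illustrative): lambda = 0 reproduces the overfit boundary, while a very large lambda underfits.

for lambda = [0 1 100]                  % illustrative values
    options = optimset('GradObj', 'on', 'MaxIter', 400);
    [theta, J] = fminunc(@(t)(costFunctionReg(t, X, y, lambda)), ...
        initial_theta, options);
    fprintf('lambda = %g: final cost %f\n', lambda, J);
end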

The feature-mapping code (provided with the assignment):

function out = mapFeature(X1, X2)
% MAPFEATURE Feature mapping function to polynomial features
%
%   MAPFEATURE(X1, X2) maps the two input features
%   to polynomial features used in the regularization exercise.
%
%   Returns a new feature array with more features, comprising
%   X1, X2, X1.^2, X2.^2, X1*X2, X1*X2.^2, etc..
%
%   Inputs X1, X2 must be the same size
%

degree = 6;
out = ones(size(X1(:,1)));   % bias (intercept) column
for i = 1:degree
    for j = 0:i
        out(:, end+1) = (X1.^(i-j)) .* (X2.^j);  % all terms of total degree i
    end
end

end
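
The loop produces one column per pair $(i,j)$ with $0\leq j\leq i\leq degree$, plus the bias column, so the number of features for degree $d$ is

$1+\sum\limits_{i=1}^{d}(i+1)=\frac{(d+1)(d+2)}{2}$

For $d=6$ this gives 28, matching the 28-dimensional vector mentioned above; $d=2$ gives 6 and $d=12$ gives 91.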

degree controls the highest-order term of the mapping. In my tests a degree-2 mapping (i.e. a conic boundary, roughly a circle) can also fit the data, but the accuracy clearly takes a hit:

Train Accuracy: 81.355932
Expected accuracy (with lambda = 1): 83.1 (approx)
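
These accuracy numbers come from thresholding the hypothesis at 0.5 and comparing against the labels (the exercise wraps this in a predict helper); a minimal sketch of that computation:

p = sigmoid(X * theta) >= 0.5;                  % predicted labels (0/1)
fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);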

I also tested degree > 6 (specifically degree = 12); the accuracy came out as follows:

Train Accuracy: 83.050847
Expected accuracy (with lambda = 1): 83.1 (approx)

My guess is that degree 6 is simply the best value the instructor found.

Finally, once everything is vectorized, the formulas behind the code become very hard to read; when a bug shows up, it is often easier to just rewrite the whole thing…

