Exercise: Recognize hand-written digits

Multi-class Classification and Neural Networks

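This is the course's third programming assignment: recognizing handwritten digits with logistic regression and a neural network. It is a very interesting assignment.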

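Because the course has so far covered only the neural network model and not how to train its parameters, this assignment supplies the network's parameters in advance.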

A sample of the training set is shown in the figure:

Each image is a handwritten digit of 20 * 20 pixels, and each pixel is used as one feature for training the model.
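To make the pixel-to-feature mapping concrete, here is a minimal sketch of how one flattened row turns back into a 20 x 20 image. The row contents are made up for illustration (the numbers 1 to 400 instead of real pixel intensities); only reshape and its column-major fill order matter here.

```octave
% Stand-in for one row of X: 400 values instead of real pixel intensities.
row = 1:400;

% reshape fills column by column, so the first 20 values become column 1.
img = reshape(row, 20, 20);

disp(size(img));    % 20 20
disp(img(2, 1));    % 2  (second value of the row, still in column 1)
disp(img(1, 2));    % 21 (first value of column 2)
```

This is the same reshape call that displayData applies to every example before copying it into the display grid.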

Data visualization

function [h, display_array] = displayData(X, example_width)
%DISPLAYDATA Display 2D data in a nice grid
%   [h, display_array] = DISPLAYDATA(X, example_width) displays 2D data
%   stored in X in a nice grid. It returns the figure handle h and the
%   displayed array if requested.

% Set example_width automatically if not passed in
if ~exist('example_width', 'var') || isempty(example_width)
    example_width = round(sqrt(size(X, 2)));
end

% Gray Image
colormap(gray);

% Compute rows, cols
[m, n] = size(X);
example_height = (n / example_width);

% Compute number of items to display
display_rows = floor(sqrt(m));
display_cols = ceil(m / display_rows);

% Between images padding
pad = 1;

% Setup blank display
display_array = - ones(pad + display_rows * (example_height + pad), ...
                       pad + display_cols * (example_width + pad));

% Copy each example into a patch on the display array
curr_ex = 1;
for j = 1:display_rows
    for i = 1:display_cols
        if curr_ex > m
            break;
        end
        % Copy the patch

        % Get the max value of the patch
        max_val = max(abs(X(curr_ex, :)));
        display_array(pad + (j - 1) * (example_height + pad) + (1:example_height), ...
                      pad + (i - 1) * (example_width + pad) + (1:example_width)) = ...
            reshape(X(curr_ex, :), example_height, example_width) / max_val;
        curr_ex = curr_ex + 1;
    end
    if curr_ex > m
        break;
    end
end

% Display Image
h = imagesc(display_array, [-1 1]);

% Do not show axis
axis image off

drawnow;

end

Multi-class logistic regression

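The 400 pixels of each image serve as 400 features for training the logistic regression model. Because there are so many features this time, the logistic regression code must be vectorized. I already implemented a vectorized logistic regression cost function in the previous assignment, so I reuse it directly.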
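For reference, the regularized cost and its vectorized gradient can be written as below. The notation is an assumption on my part, following the usual course convention: m examples, n features, X with a leading column of ones, and the bias parameter (theta(1) in the code, written here as the 0 entry) left out of the penalty.

```latex
J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \Big[ -y^{(i)} \log h_\theta(x^{(i)})
            - \big(1 - y^{(i)}\big) \log\big(1 - h_\theta(x^{(i)})\big) \Big]
            + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2

\nabla_\theta J = \frac{1}{m} X^{T} (h - y)
            + \frac{\lambda}{m} \begin{bmatrix} 0 \\ \theta_1 \\ \vdots \\ \theta_n \end{bmatrix},
\qquad h = g(X\theta), \quad g(z) = \frac{1}{1 + e^{-z}}
```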

function [J, grad] = lrCostFunction(theta, X, y, lambda)
%LRCOSTFUNCTION Compute cost and gradient for logistic regression with
%regularization
%   J = LRCOSTFUNCTION(theta, X, y, lambda) computes the cost of using
%   theta as the parameter for regularized logistic regression and the
%   gradient of the cost w.r.t. the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% Hypothesis for all examples at once
h = sigmoid(X * theta);

% Unregularized cost of each example
resultSet = -y .* log(h) - (1 - y) .* log(1 - h);

% Regularization term; theta(1) is the bias and is not penalized
penaltyTerm = (lambda / (2 * m)) * sum(theta(2:end) .^ 2);
J = (1 / m) * sum(resultSet) + penaltyTerm;

% Gradient of the regularization term, zeroed for the bias
penaltyTerm2 = (lambda / m) * [0; theta(2:end)];
grad = (1 / m) * (X' * (h - y)) + penaltyTerm2;
% =============================================================
grad = grad(:);
end
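One way to gain confidence in a vectorized gradient like this is to compare it with central finite differences on a tiny problem. The sketch below is not part of the assignment: the toy data, epsilon, and the names costFn, analyticGrad and numericGrad are all made up for illustration, and the formulas are restated inline so the snippet runs on its own.

```octave
% Toy data (made up): 5 examples, a bias column plus 2 features.
X = [ones(5, 1), [0.1 0.6; 1.1 -0.3; -0.4 0.8; 0.9 0.2; -1.2 -0.7]];
y = [1; 0; 1; 0; 1];
theta = [0.2; -0.5; 0.3];
lambda = 3;
m = length(y);

sigmoid = @(z) 1 ./ (1 + exp(-z));

% Regularized cost as a function of theta; the bias theta(1) is not penalized.
costFn = @(t) (1 / m) * sum(-y .* log(sigmoid(X * t)) ...
                            - (1 - y) .* log(1 - sigmoid(X * t))) ...
              + (lambda / (2 * m)) * sum(t(2:end) .^ 2);

% Analytic gradient, using the same formula as lrCostFunction.
analyticGrad = (1 / m) * (X' * (sigmoid(X * theta) - y)) ...
               + (lambda / m) * [0; theta(2:end)];

% Central finite differences, one coordinate at a time.
epsilon = 1e-4;
numericGrad = zeros(size(theta));
for k = 1:numel(theta)
    step = zeros(size(theta));
    step(k) = epsilon;
    numericGrad(k) = (costFn(theta + step) - costFn(theta - step)) / (2 * epsilon);
end

maxDiff = max(abs(numericGrad - analyticGrad));
fprintf('max difference: %g\n', maxDiff);
```

If the two gradients agree to many decimal places, the vectorized expression is very likely correct; a large difference usually points to a mistake in the regularization term or to the bias being penalized.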

Training

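Digit recognition has ten classes, corresponding to 1, 2, 3, ..., 9, 0 (in the exercise data the digit 0 is stored as label 10). A classifier is trained for each digit in turn, and the oneVsAll function returns the matrix of parameters for all of the classifiers.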
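The key trick in one-vs-all training is turning the multi-class label vector into a binary one for each class. Here is a small sketch with made-up labels; the names isTwo and Y are hypothetical and only serve the illustration.

```octave
% Made-up labels for 5 examples; 10 stands for the digit 0.
y = [1; 2; 10; 2; 9];
num_labels = 10;

% Binary target for the "is it a 2?" classifier: 1 where y equals 2, else 0.
isTwo = (y == 2);

% All binary targets at once: column k is the target vector for class k.
Y = bsxfun(@eq, y, 1:num_labels);

disp(double(isTwo)');   % 0 1 0 1 0
disp(size(Y));          % 5 10
```

Each row of Y contains exactly one 1, because every example belongs to exactly one class.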

function [all_theta] = oneVsAll(X, y, num_labels, lambda)
%ONEVSALL trains multiple logistic regression classifiers and returns all
%the classifiers in a matrix all_theta, where the i-th row of all_theta
%corresponds to the classifier for label i
%   [all_theta] = ONEVSALL(X, y, num_labels, lambda) trains num_labels
%   logistic regression classifiers and returns each of these classifiers
%   in a matrix all_theta, where the i-th row of all_theta corresponds
%   to the classifier for label i

% Some useful variables
m = size(X, 1);
n = size(X, 2);

% You need to return the following variables correctly
all_theta = zeros(num_labels, n + 1);

% Add ones to the X data matrix
X = [ones(m, 1) X];

% ====================== YOUR CODE HERE ======================
options = optimset('GradObj', 'on', 'MaxIter', 200);
initial_theta = zeros(n + 1, 1);
% Train each classifier in turn; (y == i) is 1 where the label equals i
% and 0 everywhere else.
for i = 1:num_labels
    [theta] = fmincg(@(t)(lrCostFunction(t, X, (y == i), lambda)), ...
                     initial_theta, options);
    all_theta(i, :) = theta';
end
% =========================================================================

end

Prediction

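The parameter X of predictOneVsAll is a matrix in which each row represents one image. Each image is run through all of the classifiers, and the class with the highest probability is returned.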

With vectorized computation, all of the examples in X can be evaluated at once.
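A toy example may help show what "evaluating all examples at once" means. The parameter matrix and inputs below are made up (3 classes, 2 features) and are not the trained digit classifiers; the point is only that one matrix product scores every example against every classifier, and max along each row picks the winning class.

```octave
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Made-up parameters: one row per class, first column is the bias.
all_theta = [0  1  0;     % class 1 responds to a large first feature
             0  0  1;     % class 2 responds to a large second feature
             1 -1 -1];    % class 3 responds when both features are small

% Three made-up examples, one per row.
X = [ 3  0;
      0  3;
     -2 -2];

X1 = [ones(size(X, 1), 1), X];        % add the bias column
probs = sigmoid(X1 * all_theta');     % 3 examples x 3 classes
[~, p] = max(probs, [], 2);           % column index of each row's maximum

disp(p');   % 1 2 3
```

The third argument of max selects the dimension, so 2 means "take the maximum across the columns of each row".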

function p = predictOneVsAll(all_theta, X)
%PREDICT Predict the label for a trained one-vs-all classifier. The labels
%are in the range 1..K, where K = size(all_theta, 1).
%   p = PREDICTONEVSALL(all_theta, X) will return a vector of predictions
%   for each example in the matrix X. Note that X contains the examples in
%   rows. all_theta is a matrix where the i-th row is a trained logistic
%   regression theta vector for the i-th class. You should set p to a vector
%   of values from 1..K (e.g., p = [1; 3; 1; 2] predicts classes 1, 3, 1, 2
%   for 4 examples)

m = size(X, 1);
num_labels = size(all_theta, 1);

% You need to return the following variables correctly
p = zeros(size(X, 1), 1);

% Add ones to the X data matrix
X = [ones(m, 1) X];
% ====================== YOUR CODE HERE ======================
test = sigmoid(X * all_theta'); % score every example against every classifier
[~, p] = max(test, [], 2);      % p is the index of the highest probability in each row
% =========================================================================
end

The prediction accuracy is as follows:

Training Set Accuracy: 96.460000
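For context, an accuracy figure like this is simply the percentage of training examples whose predicted label matches the true label. A minimal sketch with made-up predictions follows; the vectors are hypothetical, and the exact print statement used by the exercise script is an assumption.

```octave
% Made-up predictions and true labels for 5 examples.
pred = [1; 2; 3; 10; 5];
y    = [1; 2; 4; 10; 5];

% pred == y is 1 for a correct prediction and 0 otherwise,
% so its mean is the fraction of correct predictions.
accuracy = mean(double(pred == y)) * 100;

fprintf('Training Set Accuracy: %f\n', accuracy);
```

Four of the five made-up predictions are correct here, so this toy run reports 80 percent.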

Neural network prediction function

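A neural network can also recognize handwritten digits. This assignment uses a 3-layer neural network whose parameters are already provided. It only asks for the predict function, so the result can be computed by applying the forward-propagation formulas directly.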
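The formulas in question are the forward-propagation steps for a 3-layer network. The notation below is my assumption of the usual course convention: g is the sigmoid function, x is one input example as a column vector, and the two weight matrices correspond to Theta1 and Theta2 in the code.

```latex
a^{(1)} = \begin{bmatrix} 1 \\ x \end{bmatrix}, \qquad
z^{(2)} = \Theta^{(1)} a^{(1)}, \qquad
a^{(2)} = \begin{bmatrix} 1 \\ g\big(z^{(2)}\big) \end{bmatrix}

z^{(3)} = \Theta^{(2)} a^{(2)}, \qquad
a^{(3)} = g\big(z^{(3)}\big), \qquad
\text{prediction} = \arg\max_{k} \, a^{(3)}_{k}
```

The leading 1 in the first and second activation vectors is the bias unit that the code appends with a row or column of ones.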

The implementation is as follows:

function p = predict(Theta1, Theta2, X)
%PREDICT Predict the label of an input given a trained neural network
%   p = PREDICT(Theta1, Theta2, X) outputs the predicted label of X given the
%   trained weights of a neural network (Theta1, Theta2)

% Useful values
m = size(X, 1);
num_labels = size(Theta2, 1);

p = zeros(size(X, 1), 1);

X = [ones(m, 1), X];          % append bias to X
a2 = sigmoid(Theta1 * X');    % hidden layer, one column per example
a2 = [ones(1, m); a2];        % append bias to a2
a3 = sigmoid(Theta2 * a2);    % output layer, one column per example
[~, result] = max(a3, [], 1); % row index of the largest output in each column
p = result';
% =========================================================================

end

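A bias unit is added to both the first and the second layer; I am not yet clear on what it is for. The next assignment should cover how to train the neural network's parameters, which I am looking forward to.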

References

http://www.ai-start.com/ml2014/html/week4.html

https://s3.amazonaws.com/spark-public/ml/exercises/on-demand/machine-learning-ex3.zip


Unless otherwise stated, all articles on this blog are licensed under CC BY-SA 4.0. Please credit the source when reposting!