How to Use MATLAB's Neural Network Toolbox for Pattern Recognition
Understanding Pattern Recognition with Neural Networks
Pattern recognition is the automated process of identifying patterns and regularities in data. In engineering and scientific applications, it often translates to classifying input data into predefined categories—for example, identifying handwritten digits or detecting defects in manufacturing images. Neural networks are particularly well suited for pattern recognition because they can learn highly nonlinear decision boundaries directly from data without requiring explicit feature engineering. MATLAB’s Neural Network Toolbox (now part of Deep Learning Toolbox) provides a comprehensive set of functions and apps that streamline the entire workflow: from data preparation and network creation to training, evaluation, and deployment. This makes it an indispensable tool for researchers and practitioners who need to build robust pattern recognition systems quickly.
Pattern recognition tasks can be broadly divided into supervised learning (where the correct categories are known during training) and unsupervised learning (clustering). The Toolbox supports both, but this article focuses on supervised pattern recognition using feedforward neural networks. Key performance metrics include accuracy, confusion matrix, sensitivity, specificity, and area under the ROC curve (AUC). Understanding these metrics is critical for evaluating how well your model generalizes to unseen data.
Getting Started with MATLAB’s Neural Network Toolbox
Installing and Verifying the Toolbox
Before diving in, ensure that you have MATLAB installed along with the Deep Learning Toolbox (formerly Neural Network Toolbox). You can verify the installation by running ver('nnet') in the Command Window. If the toolbox is missing, install it via the Add-On Explorer or ask your license administrator. The Toolbox offers command-line functions, the interactive Neural Network Start GUI (nnstart), and the more modern Deep Network Designer app.
Data Preparation
Proper data preparation is the foundation of any successful neural network model. Start by organizing your dataset into an input matrix X and a target matrix T, with one column per sample in each. For pattern recognition with patternnet, targets are represented as indicator vectors (one-of-N encoding); if your labels are stored as class indices, convert them with ind2vec. For example, for 10 classes (e.g., digits 0-9), each target vector is a 10-element binary column with a 1 in the row corresponding to the correct class.
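As a minimal sketch of the one-of-N encoding described above, the snippet below converts class indices to an indicator matrix and back. The label vector is made up for illustration; ind2vec and vec2ind are the Toolbox's conversion functions.

```matlab
% Hypothetical labels: class indices 1..3 for six samples
labels = [1 3 2 2 1 3];

% ind2vec returns a sparse indicator matrix; full() makes it dense
T = full(ind2vec(labels));    % 3-by-6, one column per sample

% vec2ind reverses the encoding
recovered = vec2ind(T);
```

Each column of T contains a single 1, so the columns can be passed directly as targets to train.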
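Normalization matters: neural networks train faster and perform better when input features have similar scales. By default, patternnet already applies mapminmax to its inputs (see net.inputs{1}.processFcns), scaling each feature to the range [-1,1] based on the data passed to train. If you normalize manually, use mapminmax, or mapstd for zero mean and unit variance; zscore from Statistics and Machine Learning Toolbox is an alternative that works column-wise by default. Always compute normalization parameters from the training set only, then apply the same transformation to validation and test sets to avoid data leakage.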
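The leakage-free pattern can be sketched as follows, assuming features are stored in rows and samples in columns. The data here is random and purely illustrative.

```matlab
rng(0);                          % reproducible illustrative data
Xtrain = rand(4, 100) * 50;      % hypothetical: 4 features, 100 samples
Xtest  = rand(4, 20)  * 50;

% Fit the scaling on the training set only (default range is [-1, 1])
[XtrainN, ps] = mapminmax(Xtrain);

% Reuse the stored settings for the test set
XtestN = mapminmax('apply', Xtest, ps);
```

Because the settings come from the training data, scaled test values can fall slightly outside [-1, 1]; that is expected and not a sign of an error.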
Data splitting: the Toolbox automatically divides data at random into training, validation, and test sets using the default ratios (70/15/15). You can control this by setting net.divideFcn and net.divideParam. For small datasets, consider using cross-validation to obtain a more reliable performance estimate, for example with cvpartition (Statistics and Machine Learning Toolbox), crossvalind (Bioinformatics Toolbox), or a manual k-fold loop.
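A short sketch of configuring the split explicitly. The hidden layer size and sample count are arbitrary choices for illustration.

```matlab
% Hypothetical network; the ratios below restate the defaults explicitly
net = patternnet(10);
net.divideFcn = 'dividerand';           % random division (the default)
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio   = 0.15;
net.divideParam.testRatio  = 0.15;

% Preview how 200 samples would be partitioned under these ratios
Q = 200;
[trainInd, valInd, testInd] = dividerand(Q, 0.70, 0.15, 0.15);
```

If you already have a fixed split, setting net.divideFcn to 'divideind' and supplying the index vectors in net.divideParam keeps the partition identical across runs.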
If your dataset includes categorical inputs, convert them to dummy variables. The dummyvar function (part of Statistics and Machine Learning Toolbox, not Deep Learning Toolbox) can help. Missing data should be imputed or discarded; do not pass it through unchanged, because neural networks do not handle NaN gracefully.
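Discarding incomplete samples needs only base MATLAB. This sketch assumes samples are stored as columns, and uses a tiny made-up matrix.

```matlab
% Hypothetical data: 2 features, 4 samples, one sample has a missing value
X      = [1 2 NaN 4;
          5 6 7   8];
labels = [1 2 1 2];

% Keep only the columns (samples) that contain no NaN
keep        = ~any(isnan(X), 1);
Xclean      = X(:, keep);
labelsClean = labels(keep);
```

Dropping samples is the simplest option; when missing values are frequent, imputation usually preserves more of the data.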
Creating a Neural Network
The primary function for pattern recognition is patternnet. It creates a two-layer feedforward neural network with a tan-sigmoid (tansig) transfer function in the hidden layer by default, which can be changed to logistic sigmoid (logsig), and a softmax transfer function in the output layer. Softmax ensures that the outputs for each sample sum to 1, making them interpretable as class probabilities.
hiddenLayerSize = 10;
net = patternnet(hiddenLayerSize);
You can customize the network further:
- Number of hidden layers: a scalar argument gives one hidden layer; passing a row vector, e.g. patternnet([20 10]), creates several. For most pattern recognition tasks, one hidden layer is sufficient if the number of hidden neurons is chosen appropriately.
- Transfer functions: change the hidden layer activation using net.layers{1}.transferFcn = 'logsig';. The default, tansig, often converges faster than logsig.
- Training algorithm: by default, patternnet uses scaled conjugate gradient (trainscg), which scales well to large datasets. Levenberg-Marquardt (trainlm) and Bayesian regularization (trainbr) are memory-intensive and assume a squared-error performance function (mse or sse), so switching to them also means changing net.performFcn. trainbr provides built-in overfitting protection.
- Performance function: the network uses cross-entropy (crossentropy) as the default performance metric, which is appropriate for classification. You can change it via net.performFcn.
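The customizations above can be combined as in this sketch. The layer sizes are arbitrary and chosen only for illustration.

```matlab
% Two hidden layers (20 and 10 neurons) plus the softmax output layer
net = patternnet([20 10]);

% Switch the first hidden layer from the default tansig to logsig
net.layers{1}.transferFcn = 'logsig';

% Restate the defaults explicitly so the configuration is self-documenting
net.trainFcn   = 'trainscg';
net.performFcn = 'crossentropy';
```

Calling view(net) afterwards opens a diagram of the resulting architecture, which is a quick way to confirm the layer count.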
Training the Network
Training is performed with the train function. The Toolbox uses an internal validation set to implement early stopping, which prevents overfitting by halting training when the validation error starts to increase.
[net, tr] = train(net, inputs, targets);
The training record tr contains the indices of the training, validation, and test samples (tr.trainInd, tr.valInd, tr.testInd), the performance at each epoch, and the best epoch. After training, you can view the training performance plot with plotperform(tr). Because weights are initialized randomly, it is good practice to run the training several times and select the network with the lowest validation error.
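A minimal sketch of the repeated-training advice, using the iris_dataset sample data that ships with the Toolbox. The number of restarts and the hidden layer size are arbitrary.

```matlab
[x, t] = iris_dataset;           % 4-by-150 inputs, 3-by-150 indicator targets

numRuns  = 5;                    % arbitrary number of random restarts
bestPerf = Inf;
for k = 1:numRuns
    net = patternnet(10);
    net.trainParam.showWindow = false;     % suppress the training GUI
    [net, tr] = train(net, x, t);
    if tr.best_vperf < bestPerf            % best validation performance
        bestPerf = tr.best_vperf;
        bestNet  = net;
        bestTr   = tr;
    end
end
```

Keeping bestTr alongside bestNet preserves the test indices (bestTr.testInd) that belong to the selected network.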
Training tips:
- If the training is too slow, try increasing the learning rate (via net.trainParam.lr for gradient descent methods) or switch to a faster algorithm like trainscg.
- Check the confusion matrix after training to ensure the model has learned all classes, not just the most frequent ones. Use plotconfusion(targets, outputs).
- Use net.trainParam.max_fail to control early stopping patience (default is 6 validation checks). Decreasing it can help avoid overfitting on noisy data.
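The parameters mentioned in these tips are set on net.trainParam before calling train. The values below are illustrative, not recommendations.

```matlab
% Default algorithm (trainscg): adjust the epoch limit and patience
net = patternnet(10);
net.trainParam.epochs     = 500;    % upper bound on training epochs
net.trainParam.max_fail   = 10;     % validation checks before stopping
net.trainParam.showWindow = false;  % no training GUI

% Gradient descent with momentum and adaptive learning rate
netGd = patternnet(10, 'traingdx');
netGd.trainParam.lr = 0.05;         % initial learning rate
```

Note that assigning a new net.trainFcn resets net.trainParam to that algorithm's defaults, so set the training function first and the parameters second.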
Testing and Evaluation
Once the network is trained, evaluate its performance on the test set that was never used during training. Simulate the network using the sim function:
testOutputs = sim(net, testInputs);
The output is a matrix of probabilities. To obtain class labels, use [~, predictedClasses] = max(testOutputs, [], 1);. Compare these to the actual target indices. Common evaluation tools:
- Confusion matrix: plotconfusion(actualTargets, testOutputs)
- ROC curves: plotroc(actualTargets, testOutputs), which draws one curve per class in multi-class problems.
- Accuracy calculation: accuracy = sum(predictedClasses == trueClasses) / length(trueClasses);
- Precision and recall: compute per class from the confusion matrix, for example with confusionmat (Statistics and Machine Learning Toolbox), which takes class labels rather than indicator matrices.
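Per-class precision and recall can also be computed with base MATLAB alone. The sketch below builds the confusion matrix with accumarray from made-up label vectors; rows are true classes and columns are predicted classes.

```matlab
% Hypothetical true and predicted class indices for eight test samples
trueClasses      = [1 1 2 2 3 3 3 1];
predictedClasses = [1 2 2 2 3 3 1 1];
numClasses       = 3;

% cm(i,j) counts samples of true class i that were predicted as class j
cm = accumarray([trueClasses(:), predictedClasses(:)], 1, ...
                [numClasses numClasses]);

accuracy  = sum(diag(cm)) / sum(cm(:));
recall    = diag(cm) ./ sum(cm, 2);      % per true class (row sums)
precision = diag(cm) ./ sum(cm, 1)';     % per predicted class (column sums)
```

For this toy data, six of the eight predictions are correct, so accuracy is 0.75. A class that is never predicted produces a 0/0 division and a NaN precision, which is worth checking for on real data.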
If performance is unsatisfactory, revisit data preparation, increase the number of hidden neurons, try a different training algorithm, or collect more training data.
Advanced Techniques for Better Pattern Recognition
Choosing the Optimal Number of Hidden Neurons
This is one of the most important architectural choices. Too few neurons lead to underfitting (high bias); too many result in overfitting (high variance). A common rule of thumb is to start with a number between the size of the input layer and the output layer, but systematic exploration is better. Use a loop to train networks with increasing hiddenLayerSize and plot validation performance. Alternatively, apply Bayesian regularization (trainbr), which automatically penalizes network complexity; it requires a squared-error performance function such as mse.
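For small datasets, use k-fold cross-validation within the selection loop to obtain more reliable performance estimates. Be cautious: the number of weights, and with it the training time, grows with the number of hidden neurons, and the growth is much steeper for Jacobian-based algorithms such as trainlm than for trainscg.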
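The selection loop might look like the following sketch, which again uses iris_dataset. The candidate sizes are arbitrary, and a single random split per size is used for brevity; averaging over several runs or folds gives a steadier estimate.

```matlab
[x, t] = iris_dataset;           % sample data shipped with the Toolbox

sizes   = [2 5 10 20];           % hypothetical candidate hidden layer sizes
valPerf = zeros(size(sizes));

for k = 1:numel(sizes)
    rng(k);                                  % reproducible split and weights
    net = patternnet(sizes(k));
    net.trainParam.showWindow = false;
    [~, tr] = train(net, x, t);
    valPerf(k) = tr.best_vperf;              % best validation cross-entropy
end

[~, idx] = min(valPerf);
bestSize = sizes(idx);
```

Plotting valPerf against sizes (for instance with plot(sizes, valPerf, '-o')) makes the under- and overfitting regions easier to see than the single minimum.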
Regularization and Overfitting Prevention
Beyond early stopping, the Toolbox offers explicit regularization. Set net.performParam.regularization to a value between 0 and 1 (for example, 0.5) to add a penalty on the network weights to the performance function. Higher values increase regularization. Bayesian regularization (trainbr) tunes the penalty automatically and is worth trying when you have limited data, keeping in mind that it requires a squared-error performance function.
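Setting the regularization ratio takes one line. The value 0.1 is an arbitrary starting point, not a recommendation.

```matlab
net = patternnet(10);

% Blend the weight penalty into the cross-entropy performance:
% 0 means no penalty (the default), values near 1 are dominated by it
net.performParam.regularization = 0.1;
```

The ratio is a hyperparameter like any other, so it is usually chosen by comparing validation performance across a few values.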
This weight penalty is the Toolbox's form of weight decay. Older code achieves it by setting net.performFcn = 'msereg' (mean squared error with regularization) and adjusting net.performParam.ratio; msereg is obsolete, and the regularization parameter of the standard performance functions replaces it. For deep networks, dropout is available as a layer (dropoutLayer) in networks trained with trainNetwork, but the older patternnet does not have built-in dropout. For most shallow problems, early stopping and regularization suffice.
Data Augmentation for Limited Datasets
When training data is scarce, artificial expansion of the dataset can improve generalization. For image‑based pattern recognition, simple transformations (rotation, translation, scaling, noise injection) can be applied. Use MATLAB’s imageDataAugmenter (requires Deep Learning Toolbox) to create an augmented datastore. For non‑image data, add Gaussian noise to input features or create synthetic samples using SMOTE (available via File Exchange).
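For non-image data, noise injection needs only base MATLAB. In this sketch the data, the noise level sigma, and the number of copies are all hypothetical; sigma in particular should be small relative to the spread of each feature.

```matlab
rng(1);                                  % reproducible noise
X = rand(4, 60);                         % hypothetical features in [0, 1]
T = full(ind2vec(repmat(1:3, 1, 20)));   % 3-by-60 indicator targets

numCopies = 3;                           % noisy copies per original sample
sigma     = 0.05;                        % noise standard deviation

% Original samples followed by numCopies jittered replicas
noise = sigma * randn(size(X, 1), numCopies * size(X, 2));
Xaug  = [X, repmat(X, 1, numCopies) + noise];
Taug  = repmat(T, 1, numCopies + 1);     % targets are copied unchanged
```

Augment only the training portion of the data; adding noisy copies of test samples to the training set would leak information and inflate the measured accuracy.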
Ensemble Methods
A single neural network may have high variance. Ensembles—combining predictions from multiple networks trained on different random initializations or data subsets—often yield more stable and accurate results. In MATLAB, you can train several patternnet instances and average their outputs. For example:
numNets = 10;
% One slice per network: classes-by-test-samples
allOutputs = zeros(size(targets,1), size(testInputs,2), numNets);
for i = 1:numNets
    net = patternnet(hiddenLayerSize);   % fresh random initialization
    net = train(net, inputs, targets);
    allOutputs(:,:,i) = sim(net, testInputs);
end
ensembleOutput = mean(allOutputs, 3);    % average the class probabilities
[~, ensemblePred] = max(ensembleOutput, [], 1);
Ensembling often yields a modest accuracy gain and lower run-to-run variability, at the cost of training and storing several networks. It is especially useful when you cannot perform extensive hyperparameter tuning.
Practical Example: Handwritten Digit Recognition
A canonical pattern recognition problem is classifying images of handwritten digits, as in the MNIST dataset. MNIST itself does not ship with MATLAB, but the Deep Learning Toolbox includes a smaller set of synthetic 28-by-28 digit images that can be loaded with [XTrain, YTrain] = digitTrain4DArrayData. Steps:
- Load and preprocess: resize images to a fixed size, convert to column vectors, normalize pixel values to [0,1].
- Create target matrix: for 10 classes, use targets = full(ind2vec(labels + 1)) where labels are 0-9. The shift is needed because ind2vec requires positive indices.
- Choose hidden layer size (e.g., 20) and train with patternnet.
- Evaluate: The Toolbox's confusion plot will show misclassifications (e.g., 3 mistaken for 5). Investigate those cases to see if they are genuinely ambiguous.
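The steps above can be sketched end to end as follows. This assumes the synthetic digit set bundled with the Deep Learning Toolbox (loaded via digitTrain4DArrayData) rather than MNIST, and a hidden layer size of 20; the accuracy you obtain will vary with the random split and initialization.

```matlab
% Load 5000 synthetic 28x28 digit images with categorical labels '0'..'9'
[XTrain, YTrain] = digitTrain4DArrayData;

% Flatten each image into a column: 784-by-N input matrix
inputs = double(reshape(XTrain, [], size(XTrain, 4)));

% Categorical codes run from 1 to 10, so digit d maps to class index d+1
classIdx = double(YTrain(:))';
targets  = full(ind2vec(classIdx));

rng(0);                                  % reproducible split and weights
net = patternnet(20);
net.trainParam.showWindow = false;
[net, tr] = train(net, inputs, targets);

% Evaluate only on the samples held out as the test set
testOutputs    = net(inputs(:, tr.testInd));
[~, predicted] = max(testOutputs, [], 1);
testAccuracy   = mean(predicted == classIdx(tr.testInd));
fprintf('Test accuracy: %.3f\n', testAccuracy);
```

To inspect the errors, plotconfusion(targets(:, tr.testInd), testOutputs) shows which digit pairs are confused most often.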
With proper tuning, a single hidden layer network can achieve >95% accuracy on the test set. Deeper networks with convolutional layers (using the Deep Learning Toolbox) can reach >99%, but patternnet is excellent for rapid prototyping and baselines.
Deploying Trained Networks
Once you are satisfied with performance, you may need to deploy the network. The Toolbox provides several options:
Exporting to Simulink
Use gensim(net) to generate a Simulink block that encapsulates the trained neural network. This is useful for real‑time simulation or integration with other dynamic systems.
Generating C/C++ Code with MATLAB Coder
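For embedded deployment, use genFunction to produce a standalone MATLAB function that reproduces the trained network without the network object. With the 'MatrixOnly' option, the generated function is compatible with MATLAB Coder, which can then generate portable C code via codegen. For example:
% Write patternNetFcn.m, a matrix-only function suitable for code generation
genFunction(net, 'patternNetFcn', 'MatrixOnly', 'yes');
% Generate a static library; the example input is one sample (one column)
codegen patternNetFcn -args {zeros(size(inputs,1),1)} -config:lib -o patternNetLib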
This generates a library that can be integrated into larger C or C++ applications. Note that you must have a MATLAB Coder license; Fixed-Point Designer is needed only if you convert the code to fixed-point arithmetic.
Creating a Standalone Application
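With MATLAB Compiler, the compiler.build functions (for example, compiler.build.standaloneApplication) can package code that calls your trained model into a standalone executable, optionally with a GUI; building shared libraries requires MATLAB Compiler SDK. Recipients need only the free MATLAB Runtime, which makes this useful for sharing the model with colleagues who do not have MATLAB.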
Conclusion and Further Resources
MATLAB’s Neural Network Toolbox provides a robust and accessible platform for developing pattern recognition systems. By following the structured workflow—data preparation, network creation, training, and evaluation—you can quickly build and validate models that achieve high accuracy on classification tasks. Advanced techniques such as regularization, ensemble methods, and careful architecture selection further enhance performance. The ability to deploy trained networks to Simulink, C/C++, or standalone applications makes this toolbox a full‑lifecycle solution for pattern recognition projects.
For deeper exploration, consult the following resources:
- MathWorks patternnet Documentation
- Neural Network Pattern Recognition Tutorial
- MNIST Dataset in MATLAB Format (File Exchange)
- “An Overview of Overfitting and its Solutions” (arXiv)
With practice, you will be able to apply these tools to a wide variety of pattern recognition challenges, from medical diagnosis to industrial quality control.