
Artificial Intelligence in Image Processing

Artificial Intelligence (AI) is revolutionizing industrial image processing. The technology uses AI models to distinguish objects, deviations and characters, for example, while allowing for natural variations. AI systems combine the benefits of human visual inspection with the robustness and speed of computer-aided systems.

What Do You Need Artificial Intelligence For?

Traditional image processing systems use rule-based algorithms that provide high reliability for repetitive tasks. Because the process steps are always identical, only minor deviations are tolerated. As soon as the variance of the images increases, for example due to natural variations of the object such as color nuances or shapes, or due to changing ambient conditions such as lighting, the application becomes increasingly complex and can therefore become uneconomical. If the variance is not controlled, it directly degrades the accuracy and performance of the algorithm. The result is incorrect decisions by the rule-based image processing system: rejected good parts (also known as pseudo rejects) or unrecognized bad parts (also known as slips). Industrial image processing always aims to minimize both.
 
AI technologies expand the possibilities of machine vision by tolerating fluctuations in the image data, which means that irregular errors, for example, can also be reliably identified. They increase profitability through more robust detection rates with high variance and reduce the entry barrier to machine vision, as in-depth algorithm knowledge is no longer required in some cases.

Artificial intelligence is the umbrella term for technologies that enable machines to solve tasks independently – often inspired by human thinking and learning. Machine learning is a subarea of AI in which algorithms are not programmed rigidly, but learn patterns and relationships from example data. Instead of manually setting each rule, the system “learns” how to translate inputs into outputs itself. Deep learning is a specialized form of machine learning based on artificial neural networks with many processing layers. This architecture makes it possible to detect very complex patterns and deliver precise results even under varying conditions.

[Diagram: Deep learning is a subset of machine learning, which in turn is a subset of artificial intelligence.]

What Are the Benefits of Using AI in Quality Control?

Despite the availability of rule-based image processing, quality control is often still carried out manually by humans, as defects with large variance are often difficult or impossible to detect automatically. This is where AI-supported image processing systems are useful.
Problem areas of manual control | Solution through AI-powered image processing
Inconsistent assessment of quality | Consistent and repeatable assessment based on large data sets
Limited attention span | 24/7 operation without fatigue
Time-consuming documentation of decisions | Automatic image storage with heatmap display and score value for verifiability and traceability
Higher personnel costs, staff shortages and high training costs | Scalable regardless of staff availability, low entry barrier thanks to less training effort

Does AI Replace Rule-Based Image Processing?

AI expands the possibilities of industrial image processing, but very rarely completely replaces tried-and-tested rule-based solutions. The combination of rule-based and AI-based image processing expands the versatility of machine vision. With AI, inspection tasks can be implemented that would be too time-consuming or inefficient with rule-based evaluation.

When Are Rule-Based and When Are AI Image Processing Systems Used?

The rule-based approach of traditional image processing systems remains a proven solution for visual inspection tasks. AI technologies are expanding the fields of application of industrial image processing. Despite different algorithms, there are numerous overlaps in the capabilities of both technologies.
Typically, a combination of rule-based and AI tools is used. For example, rule-based localization with part tracking and cropping of the object to be inspected is combined with AI-based defect classification, followed by rule-based measurement of the defect.

Typical Applications

Rule-Based Image Processing

Measurement and gauging tasks
Code reading
Precise alignment, positioning (also robot vision, robot guidance)

Combinations and Overlaps

Inspection, defect detection
Identification (code reading, OCR/character recognition)
Localization of objects and features (also robot vision, robot guidance)

AI-Based Image Processing

Detection of highly varying objects or errors
Challenging OCR (e.g. poor print quality, varying backgrounds)
Localization of objects with high variance
Classification (e.g. of materials or textures)

Which AI Technologies Are Used in Image Processing?

Classification

Classification assigns an image to one or more predefined classes. We differentiate between multi-class and multi-label.

Multi-class: exactly one class per image, e.g. “screw” – or defect classification with mutually exclusive classes: “OK” (free of defects) or “NOK” (defective). Simultaneous classification into both classes is excluded.

Multi-label: multiple classes per image are possible, e.g. “screw” and “nail” – or defect classification with independent labels such as “dent” and “scratch”. An object can have a dent, a scratch, both or neither, as these classes are not mutually exclusive.
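The two classification modes map to different output layers. A minimal pure-Python sketch (all class names and logit values here are invented for illustration): a softmax head forces mutually exclusive classes, while independent sigmoid outputs allow any combination of labels.

```python
import math

def softmax(logits):
    # Multi-class: scores compete, probabilities sum to 1,
    # so exactly one class "wins".
    exps = [math.exp(x - max(logits)) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Multi-class head ("OK" vs "NOK"): mutually exclusive.
probs = softmax([2.0, -1.0])          # [P(OK), P(NOK)]
prediction = "OK" if probs[0] > probs[1] else "NOK"

# Multi-label head ("dent", "scratch"): one independent sigmoid per label,
# each label is decided on its own with a threshold.
logits = {"dent": 1.5, "scratch": -0.5}
labels = [name for name, z in logits.items() if sigmoid(z) > 0.5]
# An image can receive both labels, one, or none.
```

The key design difference is the output activation: softmax couples the classes, sigmoids keep them independent.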

Object Detection

Object detection locates and classifies multiple objects in the image using bounding boxes. For each object found, it specifies which class it belongs to and where exactly it is located in the image. A distinction is made between “axis-aligned” and “oriented” object detection. With oriented object detection, each bounding box is aligned with its object and describes the smallest possible enclosing rectangle.
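The overlap between a predicted and an annotated axis-aligned bounding box is commonly measured as intersection over union (IoU). A minimal sketch with hypothetical box coordinates:

```python
def iou(box_a, box_b):
    # Boxes given as (x_min, y_min, x_max, y_max), axis-aligned.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)   # overlap area (0 if disjoint)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

overlap = iou((0, 0, 10, 10), (5, 5, 15, 15))   # partially overlapping boxes
```

An IoU of 1.0 means a perfect match; detection benchmarks typically count a prediction as correct above a threshold such as 0.5.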

Segmentation

Segmentation assigns a class to each individual pixel in the image, allowing exact delineation of objects (e.g. nail, screw, background) or defects (e.g. paint defects).
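As a toy illustration of the per-pixel idea, here is a sketch in pure Python: given per-pixel class scores (the values and class names are invented), the segmentation mask assigns each pixel the class with the highest score.

```python
# Per-pixel class score maps for a tiny 2x3 image (invented values).
scores = {
    "background": [[0.90, 0.80, 0.10], [0.90, 0.20, 0.10]],
    "screw":      [[0.05, 0.10, 0.80], [0.05, 0.70, 0.10]],
    "nail":       [[0.05, 0.10, 0.10], [0.05, 0.10, 0.80]],
}

classes = list(scores)
height, width = 2, 3

# Segmentation mask: for each pixel, pick the class with the highest score.
mask = [
    [max(classes, key=lambda c: scores[c][y][x]) for x in range(width)]
    for y in range(height)
]
```

A real segmentation network produces such score maps itself; the final mask is simply this per-pixel argmax.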

Note: In addition to common AI models, there are increasingly AI models trained for a specific use case, such as Deep OCR. Optical character recognition using Deep OCR uses neural networks trained on large amounts of text images to extract letters and numbers. Unlike traditional OCR, it enables precise recognition of dynamic text with variable font sizes and different backgrounds, even with specifically designed or damaged prints and labels.

What Is an AI Model?

AI models are computer-aided models inspired by the human brain. They consist of artificial neurons that process information and are connected to each other by weightings. A weighting is a numerical value that determines how much an input signal affects the neuron.
An AI model is built up in layers: The “input layer” receives the raw data (e.g. images). In “hidden layers”, features are automatically detected and the “output layer” makes a decision based on this. 
During training, the AI model compares its predictions with the ground truth and adjusts the weightings step by step. This learning process is repeated across many examples until the AI model reliably recognizes patterns. 
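This learning process can be sketched with a single artificial neuron trained by gradient descent. The data, learning rate and iteration count below are toy values chosen for illustration, not a real training setup:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy data: 2 input features per sample, ground truth 0 (OK) or 1 (NOK).
samples = [([0.0, 0.2], 0), ([0.1, 0.1], 0), ([0.9, 0.8], 1), ([1.0, 0.7], 1)]

w = [0.0, 0.0]   # weightings of the two inputs
b = 0.0          # bias
lr = 1.0         # learning rate

# Repeat over many examples: predict, compare with the ground truth,
# and nudge each weighting in the direction that reduces the error.
for _ in range(500):
    for x, target in samples:
        pred = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
        error = pred - target             # compare with ground truth
        w[0] -= lr * error * x[0]         # adjust weightings step by step
        w[1] -= lr * error * x[1]
        b -= lr * error

# After training, the neuron separates the two classes.
predictions = [round(sigmoid(w[0] * x[0] + w[1] * x[1] + b)) for x, _ in samples]
```

A deep learning framework runs the same compare-and-adjust loop, only over millions of weightings spread across many layers.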
 
AI Model or Neural Network – What’s the Difference?
Not every AI model is a neural network. The term “AI model” is a generic term for many types of algorithms, including decision trees, statistical models and neural networks. The latter are a form of AI models that are particularly suitable for complex tasks such as image recognition or speech processing. However, the terms “neural network” and “AI model” are often used synonymously. 

ONNX – the Universal Exchange Format

The uniVision 3 image processing software enables seamless integration of ONNX networks. You can also use GitHub to quantize your ONNX network for use on wenglor hardware.

The wenglor AI Loop – How AI Works in Industrial Image Processing

A representative and extensive data set is rarely available at the start of an application. Highly precise and reliable AI models are instead created through continuous data expansion and validation of the trained networks. With a data- and process-centric approach, the accuracy of an AI model can be systematically optimized and maintained consistently throughout the lifecycle of an inspection system: existing data is re-checked, and new data is recorded and annotated.
Create and Manage Data Sets
Annotate Data Set (label)
Train and Validate AI Model
Deploy and Run an AI Model
The first and most important step for the data-centric approach is to capture images that represent the application as representatively as possible. New images will be added continuously in the course of the project. 

What Is Important when Creating Suitable Data Sets?

50 to 100 real images per class can already be enough to achieve first practical results. The data must be well selected, varied and consistent. More image data does not automatically mean better models. The aim is to cover the entire natural variation of batches, colors, lighting influences, etc. with few but high-quality images, and thus create a robust and generalizable solution.
Example: If a factory only captures images of faulty PCBs from a machine under certain lighting conditions, the AI model could learn to associate errors with the particular background or lighting conditions of that machine, rather than the actual error features. This distortion could cause the model to misclassify errors from other machines or under other light conditions. By incorporating diverse images from different machines, lighting conditions and viewing angles, the AI model learns the actual error characteristics, ensuring reliable detection in all production scenarios.
A targeted illumination strategy reduces image variance, increases model accuracy and reduces the need for training images. For example, the same accuracy can be achieved with only a quarter of the image quantity by significantly improving image quality by choosing the right illumination principle, light color (wavelength), homogeneous illumination and optical filters. 
As with rule-based image processing systems, the following also applies here: Bad images lead to significantly poorer accuracy of the AI model. Ensure sharp images with sufficient depth of focus, rich contrast and consistency in the setup (camera, illumination, optics).

A higher resolution shows more details, but requires longer training times and higher resources. The data set images are often reduced for training, e.g. to 320 × 320 pixels (AI input image).
Important: The key feature must also be clearly recognizable in this reduced resolution. What is visible to the human eye can usually also be captured by the AI model.
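Whether a feature survives the resolution reduction can be checked with a simple nearest-neighbor downsampling sketch. In this toy example (an invented 8×8 image with a 2-pixel-wide defect stripe), the stripe is still visible at half resolution but disappears at quarter resolution:

```python
def downsample(image, factor):
    # Nearest-neighbor downsampling: keep every `factor`-th pixel and row.
    return [row[::factor] for row in image[::factor]]

# 8x8 toy image with a 2-pixel-wide "defect" stripe (value 1) at x = 2..3.
image = [[1 if 2 <= x < 4 else 0 for x in range(8)] for y in range(8)]

small = downsample(image, 2)   # 4x4: the stripe survives as one column
tiny = downsample(image, 4)    # 2x2: the stripe is lost entirely
```

Real pipelines use interpolating resizers rather than this crude subsampling, but the lesson is the same: a defect narrower than the sampling step can vanish from the AI input image.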

Images should be taken in real situations and, if possible, under production-like conditions. Provide natural variations, such as background changes, slightly different lighting conditions, dust, noise or slight position variations to make the data set more robust. However, heavy editing or artificially produced bad parts can lead to unrealistic learning patterns. It is also important to avoid systematic errors, such as every good part having a marking, but not bad parts. Ensure a consistent setup for the camera, illumination and optics.
For training, use only the image area in which the relevant object or defect is located. This prevents the AI model from unintentionally learning from the background and keeps important details from being proportionally underrepresented. Cropping preserves more relevant detail at low resolution and saves training time.
An equally weighted representation of all classes (e.g. good part, bad part) is recommended. An imbalance, such as 99% OK and 1% NOK, leads to distorted AI models that often miss errors in the application. A balanced data base prevents selection bias of the AI model and improves detection performance even on rare error patterns.
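One common mitigation when a physically balanced data set is not achievable is to weight classes by inverse frequency in the training loss. A minimal sketch with an invented 99:1 imbalance:

```python
from collections import Counter

labels = ["OK"] * 99 + ["NOK"] * 1   # heavily imbalanced toy data set

counts = Counter(labels)
total = len(labels)

# Inverse-frequency weights: rare classes count more in the training loss,
# so the model cannot "win" by always predicting the majority class.
weights = {cls: total / (len(counts) * n) for cls, n in counts.items()}
```

Collecting more real images of the rare class remains preferable; weighting only compensates for an imbalance, it does not add new defect patterns.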
Augmentation means artificially generating variants, for example by rotation, enlargement (zoom), distortion, noise or changes in brightness. This allows existing data sets to be expanded and the AI model to be prepared for real-world variation, which is especially important for small data sets to achieve higher accuracy quickly.
Important: Augmentation must remain realistic and application-oriented, as its use has a major impact on the balanced accuracy of the AI model. For example, if a rotated part would itself count as a defect, rotation augmentation is not suitable for that application.
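Typical augmentations can be sketched in a few lines of pure Python; the pixel values are invented, and a real pipeline would use an image library instead:

```python
def flip_horizontal(image):
    # Mirror each row; only realistic if the part can appear mirrored in production.
    return [row[::-1] for row in image]

def adjust_brightness(image, delta):
    # Shift pixel values, clamped to the valid 0-255 range.
    return [[min(255, max(0, px + delta)) for px in row] for row in image]

image = [[10, 200], [30, 40]]   # tiny toy grayscale image

augmented = [
    flip_horizontal(image),
    adjust_brightness(image, 60),    # brighter variant
    adjust_brightness(image, -20),   # darker variant
]
```

Each generated variant keeps the same label as the original image, which is exactly why the transformation must not change the part's actual class.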
In the second step, the images are annotated or labeled. The user specifies a so-called ground truth for each image, such as whether it is a good or a bad part. 

How Can the Effort of the Label Process Be Minimized?

To ensure consistent annotation, application experts, e.g. from production and quality control, should be involved in the process.
The classification according to “NOK” and “OK” can be subjective, which is why a clear distinction between classes should be ensured before training and further annotation.
Tip: Boundary samples should be specifically marked as such using tags. This information can therefore be included in the subsequent validation of the network.
When evaluating training data, it is crucial to rely solely on the images, not on the real object. Even if a fault on the original part is easier to detect, only what is visible in the image counts for the AI model. If additional knowledge from the real object is included, inconsistencies arise, as the AI model later also only works with image information.
Maintaining a defect catalog with clearly described defect types and example images helps to define exclusion criteria reliably and comprehensibly. If it is updated regularly, e.g. when new defects or products appear, it facilitates knowledge transfer and the onboarding of additional labelers.
Tags make it possible to add keywords, which makes data sets clearer and easier to sort. By assigning tags, it is possible to make visible, for example, which data were recorded on which calendar day and at which time of day or which data are considered boundary samples.
In the next step, the AI model is trained or retrained. There are various approaches here, whereby a data set is always divided into training and test data. 

What Must Be Observed when Training an AI Model?

Choosing a network architecture in a format suitable for the inference hardware (e.g. INT8) is crucial for the latency (execution speed) and balanced accuracy of the AI model. Depending on the application, it can be optimized towards latency or accuracy.
Tip: Training the AI model multiple times with the identical data set can lead to different performance values. Differences above 5% indicate an inconsistent data set.
A resolution must be specified for the input image for the training. 
The higher the resolution,
  • the longer the evaluation time,
  • the higher the RAM requirements for inference (execution),
  • the longer the training takes,
  • the more training data is needed to achieve the same balanced accuracy.
The next step is to validate the AI model. AI is often seen as a black box with input and output, but without clear information to validate the AI model. The report of the AI model provides information on balanced accuracies (recall and precision), false predictions, expected inference time and thus helps with traceability.

Tips and Tricks on the Traceability of the AI Model

The confusion matrix shows how often the predictions of an AI model match the actual classes and where errors occur.
Tip: The most common reasons for false predictions are incorrect annotations or boundary samples. In this case, the annotation must be adjusted and the model retrained. The aim is to minimize false predictions.
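A confusion matrix and the error counts derived from it can be computed directly from the labels. The ground-truth and predicted values below are invented for illustration:

```python
from collections import Counter

ground_truth = ["OK", "OK", "OK", "NOK", "NOK", "OK", "NOK", "OK"]
predicted    = ["OK", "OK", "NOK", "NOK", "NOK", "OK", "OK", "OK"]

# Confusion matrix: (actual class, predicted class) -> count.
matrix = Counter(zip(ground_truth, predicted))

# Off-diagonal entries are the false predictions (pseudo rejects and slips).
false_predictions = sum(n for (actual, pred), n in matrix.items() if actual != pred)

# Recall for NOK: of all actually defective parts, how many were found?
nok_total = sum(n for (actual, _), n in matrix.items() if actual == "NOK")
nok_recall = matrix[("NOK", "NOK")] / nok_total
```

Here the (OK, NOK) entry is a pseudo reject and the (NOK, OK) entry is a slip, mirroring the error types the article introduced for rule-based systems.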
The heatmap shows which image area was decisive for predicting the result. The detailed insight allows conclusions to be drawn about errors in the annotation or selection bias. This ensures greater transparency and traceability of the data patterns. 
The prediction of an AI model is based on the so-called score, which indicates how confident the AI model is in its decision. It is not crucial to achieve the highest possible scores in all cases, but rather to achieve a clear distinction between safe and unsafe cases: High scores should only occur with clear results, whereas in uncertain cases deliberately lower scores make sense. This prevents the AI model from being “incorrectly too certain” in its decisions even in unclear situations.
Both during training and inferencing of the AI model, the resolution of each data set image is reduced before it is used as the AI input image. Therefore, check whether the relevant feature is still recognizable. If not, more consistent cropping or choosing a higher resolution of the AI input image can help, for example.
There are various validation methods for evaluating AI models. Typically, data sets are divided into training, validation and test data. 
  • Training data is used to train the AI model and typically accounts for 70–80% of the data. 
  • Validation data is used during training to reconcile weightings and check if the AI model is overfitting. They typically account for 10–20% of the data. 
  • Test data is only used to evaluate the final quality of the AI model and accounts for 10–20% of the data. 
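A reproducible hold-out split along these lines can be sketched as follows (the 70/15/15 ratio and the seed are example choices):

```python
import random

def split_dataset(items, train=0.7, val=0.15, seed=42):
    # Shuffle with a fixed seed so the split is reproducible.
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train)
    n_val = int(len(items) * val)
    return (items[:n_train],                    # training data
            items[n_train:n_train + n_val],     # validation data
            items[n_train + n_val:])            # test data

train_set, val_set, test_set = split_dataset(range(100))
```

The essential property is that the three subsets are disjoint: the test data must never leak into training, or the evaluation becomes meaningless.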
Comparison of Two Common Validation Methods
 
Hold-out validation
  Description: The data set is split once, e.g. 80% training / 20% test.
  Benefits:
  • Reduced computing effort
  Disadvantages:
  • Result strongly dependent on the split
  • Susceptible to distortion, especially with smaller data sets

K-Fold cross validation
  Description: The data set is divided into k parts; the AI model is evaluated k times, each with different test data.
  Benefits:
  • Reliable and stable result evaluation thanks to scattering
  • Additional quality value for the data set thanks to the standard deviation
  • Better use of small data sets
  • Reduced risk of random error evaluations
  Disadvantages:
  • Higher computing effort
Note: Especially for smaller data sets, users benefit from the robust evaluation of the K-Fold validation. Choosing an inappropriate validation method leads to unreliable evaluations of the AI model.
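The K-Fold index generation and the mean and standard deviation of the fold scores can be sketched as follows; the per-fold accuracy values are invented for illustration:

```python
def k_fold_indices(n_samples, k):
    # Yield (train_indices, test_indices) for each of the k folds.
    indices = list(range(n_samples))
    fold_size = n_samples // k
    for i in range(k):
        test = indices[i * fold_size:(i + 1) * fold_size]
        train = indices[:i * fold_size] + indices[(i + 1) * fold_size:]
        yield train, test

# Each sample appears in exactly one test fold.
folds = list(k_fold_indices(10, k=5))

accuracies = [0.91, 0.88, 0.93, 0.90, 0.89]   # one hypothetical score per fold

mean = sum(accuracies) / len(accuracies)
# Standard deviation of the fold scores: the extra quality signal for the data set.
std = (sum((a - mean) ** 2 for a in accuracies) / len(accuracies)) ** 0.5
```

A large standard deviation across folds is the warning sign the article mentions: the evaluation depends heavily on which samples land in the test fold.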
The fully trained AI model is transferred from the training environment to the inference platform, often also on more than one system at the same time. This initial commissioning is referred to as deployment of the AI model. 

How Can AI Be Implemented in the Image Processing Application?

Using AI effectively and efficiently requires appropriate expertise, otherwise implementation will be very costly. With the AI Lab, wenglor offers an intuitive training platform for creating AI models that can be executed seamlessly on the powerful wenglor hardware. weHub allows training cycles to be run continuously and data sets to be continuously expanded with relevant new data.



When creating an AI model, the user first decides on a suitable network architecture. Based on this, a suitable training platform and the tools for execution are chosen. The universally usable Open Neural Network Exchange (ONNX) format offers cross-platform use of AI models as an open standard. The uniVision 3 machine vision software enables seamless integration of AI models in the ONNX format. With GitHub, you can also quantize your ONNX network for use on wenglor hardware.

  
 
After initial startup, there may be the following reasons for retraining:
  • New classes appear that need to be detected.
  • The score values decrease, e.g. due to batch changes, contamination or wear of workpiece carriers or reduced light output.
  • Requirements for balanced accuracy are changing. 
If retraining is required, the AI Loop restarts. 
First make sure that the existing data base is consistent and unambiguous before adding new data. Review false predictions and re-check each annotation if necessary. Use validation tools such as the heatmap, score values and the confusion matrix. Then focus on the class with the worst performance and provide about 100 additional images of this class, ideally of variants or products where the model performs particularly poorly and the score value is correspondingly low. Alternatively, 50 new images can be added per class.
The higher the quality of the data, the more robust the AI model. Quality always comes before quantity. The focus on high-quality, balanced and realistic data leads to more reliable AI models, reduces the risk of overfitting and increases everyday usability in production. Investing time in clever data selection saves time on training later and allows you to achieve high accuracy and traceable results faster.

Comparison of Three Basic Approaches for Training AI Models

Deep learning uses complex neural networks and is particularly suitable for applications with high image variance and high accuracy requirements. This usually requires a lot of computing power and longer training times. Edge learning is also based on deep learning, but differs in that training takes place directly on the end device. This allows for quick and easy implementation, but typically results in less powerful AI models that are more suitable for simple inspection tasks.
Edge learning is often used by AI beginners as a simple solution for image processing tasks, even where traditional, rule-based methods are better suited and have been proven for years. The use of edge learning carries risks, as the simple set-up of edge solutions often comes at the expense of robustness, traceability and detection accuracy. 