WHY AI IN IMAGE ANALYSIS?
In almost all image analysis applications, the most challenging task is to separate the "object of interest" from all other features in the image. Whether it is separating pearlite from ferrite, a coating layer from the substrate and background, or grain boundaries in a grain image, it is often very difficult.
The traditional way of performing this separation is gray-level thresholding, often in combination with a number of hard-to-understand filters (such as Erode, Dilate, Remove Small Objects, Fill Holes, Watershed, Laplace transforms etc). These scripts are not just very difficult to construct; they have also proven to be quite unreliable and cease to work if there is just the slightest change in preparation, lighting conditions or between samples. This in turn results in fiddling with the settings, time-consuming manual editing of the image or, even worse, ignoring the obvious error.
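To give a feeling for what such a script looks like, here is a minimal sketch of a threshold followed by Erode and Dilate, written with NumPy only. The tiny test image, the threshold level of 128 and the helper names are assumptions made up for this illustration; they are not taken from any particular software.

```python
import numpy as np

def threshold(image, level):
    """Boolean mask of all pixels darker than `level`."""
    return image < level

def _with_neighbours(mask):
    """Stack the mask with copies shifted up, down, left and right."""
    # Pixels outside the image are treated as background (False).
    padded = np.pad(mask, 1, constant_values=False)
    return np.stack([
        padded[1:-1, 1:-1],  # the pixel itself
        padded[:-2, 1:-1],   # neighbour above
        padded[2:, 1:-1],    # neighbour below
        padded[1:-1, :-2],   # neighbour to the left
        padded[1:-1, 2:],    # neighbour to the right
    ])

def erode(mask):
    """Keep a pixel only if it and its 4 neighbours are all set."""
    return _with_neighbours(mask).all(axis=0)

def dilate(mask):
    """Set a pixel if it or any of its 4 neighbours is set."""
    return _with_neighbours(mask).any(axis=0)

# Hypothetical 7x7 image: bright background, one dark 3x3 object, one dark speck.
image = np.full((7, 7), 200, dtype=np.uint8)
image[1:4, 1:4] = 50   # the object of interest
image[5, 5] = 50       # a single-pixel speck of "dirt"

mask = threshold(image, 128)
opened = dilate(erode(mask))  # Erode followed by Dilate

print(mask.sum(), opened.sum())
```

The speck is removed, but the corners of the real object are lost as well. That is typical of these filters: each one fixes one problem and tends to introduce another, which is part of why the scripts are so hard to tune.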
So when it comes to segmentation, this is one area where AI shows extremely promising results: it can easily outperform traditional image analysis in quality and human beings in speed.
Here is a sample comparison where the task is to separate the ferrite in duplex steel, a seemingly simple task:
The first column shows the mask (red) after a human being has manually edited it in editing software by means of thresholding, filling holes, manually painting missing pixels and erasing pixels that were wrong. This process took well over an hour (and it was really boring). If properly done, this can be thought of as the ground truth.
The second column shows the result of manual thresholding. It's obvious how badly it performs on this image! The uneven illumination (barely visible to us) ruins the detection of ferrite, which is over-detected on the left side and under-detected on the right side. On top of this, dirt on the sensor and in the optics is detected (as it is darker), as is dirt on the sample. All in all a very bad separation. Applying background subtraction, edge detection and other advanced filters could of course improve things, but in this particular case the separation is so bad that further processing is very unlikely to produce an acceptable result.
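The over- and under-detection effect can be reproduced with a small synthetic example. The sketch below is an assumption-laden toy, not the steel image discussed here: it builds vertical stripes of a dark and a bright phase, multiplies them by an illumination that is dimmer on the left than on the right, and applies one global threshold. All numbers (reflectances, illumination range, threshold) are invented for the illustration.

```python
import numpy as np

height, width = 4, 200
columns = np.arange(width)

# True layout: 10-pixel wide stripes, alternating dark and bright phase.
is_dark_phase = (columns // 10) % 2 == 0
reflectance = np.where(is_dark_phase, 0.5, 1.0)

# Uneven illumination: dim on the left, bright on the right.
illumination = np.linspace(0.6, 1.8, width)

# The camera sees reflectance times illumination (same for every row here).
image = np.tile(reflectance * illumination, (height, 1))
truth = np.tile(is_dark_phase, (height, 1))

# One global threshold for the whole image.
detected = image < 0.75

false_positives = detected & ~truth   # bright phase wrongly detected
false_negatives = ~detected & truth   # dark phase that was missed

half = width // 2
print("over-detected, left half: ", false_positives[:, :half].sum())
print("over-detected, right half:", false_positives[:, half:].sum())
print("missed, left half:        ", false_negatives[:, :half].sum())
print("missed, right half:       ", false_negatives[:, half:].sum())
```

With these made-up numbers no single threshold level gives a clean result: lowering it removes the over-detection on the left but makes the right side worse, and raising it does the opposite.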
The third column shows how well AI separates the ferrite. Observe the almost identical result to the one made by a human being, which we think is quite astonishing! Also consider that this operation took less than a second and will perform equally well on subsequent images, even when the illumination or the sample changes.
Take this one for example:
Here is another example with a substrate coated with a gold layer. Obviously the gold layer is much brighter than the background (bakelite) and the substrate. The challenge is of course that the gold layer is not homogeneous but has small black voids, and that the substrate contains bright spots (of gold) that we do not define as part of the coating. In this particular image the coating is not OK, being substantially thicker in the center. Also note the scratches introduced during sample preparation, which are a natural result when preparing softer materials.
Once again the thresholded mask fails quite badly (although not as badly as in the previous example). Compared with the manually edited image, there are holes in the coating (due to the voids in the coating), very weak and noisy detection on the right-hand side (due to uneven illumination) and a lot of detected pixels in the substrate, due to the scratches introduced at preparation (see left side) and brighter spots in the substrate (see center bottom). In this case it's likely that advanced image processing could improve the end result, but never to the point where it's identical to the manually edited mask. The weak signal on the right-hand side is especially difficult to fix.
Now compare with the AI-processed image, which is more or less identical to the manually separated image. It's really hard to spot a difference!
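If you want a number instead of a visual impression, one common way to compare two masks is "intersection over union" (IoU): the number of pixels set in both masks divided by the number of pixels set in at least one of them. The sketch below is only an illustration of that measure; the two small masks are invented, and nothing here claims to be how any specific software scores its results.

```python
import numpy as np

def intersection_over_union(mask_a, mask_b):
    """Return the IoU of two boolean masks (1.0 means identical)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # both masks are empty, so they agree completely
    return float(np.logical_and(a, b).sum() / union)

# Hypothetical masks: the "manual" one and a prediction shifted by one pixel.
manual = np.zeros((10, 10), dtype=bool)
manual[2:8, 2:8] = True
predicted = np.zeros((10, 10), dtype=bool)
predicted[2:8, 3:9] = True

iou_same = intersection_over_union(manual, manual)
iou_shifted = intersection_over_union(manual, predicted)
print(iou_same, iou_shifted)
```

A value of 1.0 means the masks are identical. Note how even a one-pixel shift of a small object lowers the score noticeably, so a high score on a real image is a strong sign of agreement.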
And here is another example using a similar sample (now with an OK layer, but with the sample slightly tilted) processed with AI. Again it looks flawless!
Hopefully these samples have made you both convinced and interested in learning more about AI, and how it can be used to make your daily image analysis challenges easier and faster.
HOW IS IT DONE?
AI is a broad term, commonly used to denote algorithms that act, or at least seem to act, like a human being. In some cases, as in imaging applications, Neural Networks are used, and even though the name suggests that we build a "brain" using neurons, these algorithms are far from how a human brain works.
But what the algorithms share with humans is that they learn from experience, very much as we do. For example, we do not tell a child that a chair has some given features, such as four thin legs, a soft cushion and a height of roughly 1 meter. We simply show a few different chairs and the child will find out the shared characteristics of a chair. Based on that experience, a child easily recognizes other chairs, even if the next chair does not look at all like the chairs from home. Things like context help (for instance that the chair is close to a table, or arranged in a way seen earlier). It's kind of magic, but the human body is kind of magic!
For a mathematician, or anyone who knows the details of Neural Networks, it's not magic at all. The mathematics is fairly simple, but the total complexity and the amount of computation are enormous, and the end result is often quite magic as well.
What a human brain and Neural Networks have in common is that they both learn by "trial and error" and from being shown examples. So by showing an AI algorithm a few examples, the algorithm learns from them and is ready to perform the magic.
In the above example with the duplex material, only one image was used to train the system. For the gold layer coating example, 4 images were used. The number of images required to train the system will vary from task to task, depending for example on how much your samples are likely to vary. The process has four steps:
1. Pick a few images (that look different from each other, otherwise it's kind of pointless) and annotate them manually. The more precise you are, the better the outcome. Here you sometimes use traditional thresholding to get a starting mask, and then work from there using a manual editing tool.
2. Produce a number of "augmented" images, that is, "theoretical" but typical images derived from the images made in step 1. For example brighter and darker images, slightly rotated images, unevenly illuminated images, blurry or unsharp images, images with scratches etc. This is a totally automatic step and does not require any manual work.
3. Train the network, a process that can take anything from 10 minutes to hours, depending on the computer and the number of images.
4. Verify or measure the quality of our new "brain" by running it through a set of images that were not part of the training process. This final step might be skipped. For it to work we need a few more manually annotated images (which is why we often tend to skip this step).
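To make the augmentation step a bit more concrete, here is a minimal sketch of how one annotated image can be turned into several training images. It is only an illustration under stated assumptions: the function name `augment`, the gain values, the illumination ramp and the noise level are all invented, the image is assumed to be grayscale with values between 0 and 1, and a 90 degree rotation stands in for the "slightly rotated" case to keep the code short. It does not describe how Pixelina does this internally.

```python
import numpy as np

def augment(image, mask, rng):
    """Return a list of (image, mask) variants of one annotated image."""
    variants = []

    # Darker and brighter versions: the annotation stays the same.
    for gain in (0.8, 1.2):
        variants.append((np.clip(image * gain, 0.0, 1.0), mask))

    # Geometric changes must be applied to image AND mask alike.
    variants.append((np.rot90(image), np.rot90(mask)))
    variants.append((np.fliplr(image), np.fliplr(mask)))

    # Uneven illumination: a left-to-right brightness ramp.
    ramp = np.linspace(0.7, 1.3, image.shape[1])
    variants.append((np.clip(image * ramp, 0.0, 1.0), mask))

    # A little sensor noise.
    noise = rng.normal(0.0, 0.02, size=image.shape)
    variants.append((np.clip(image + noise, 0.0, 1.0), mask))

    return variants

# Hypothetical stand-in for one manually annotated image.
rng = np.random.default_rng(0)
image = rng.random((6, 8))
mask = image > 0.5

pairs = augment(image, mask, rng)
print(len(pairs), "augmented images from 1 annotated image")
```

The important detail is that brightness, illumination and noise only change the image, while rotations and mirroring change the image and its annotation together. Otherwise the mask would no longer line up with the object.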
The final outcome is what is often called a "model". As our community grows, we can share our models and improve them together over time (by adding more and more sample images and re-training).
To get this to work in practice in Pixelina, you probably do not need to know much more.
The whole training process is easily done in the software, and