This holds a few summary notes and snapshots.
Object Detection and Machine Learning Notebooks:
- Object detection with the Faster R-CNN two-stage object detection algorithm.
The backbone is a ResNet-50 CNN with a Feature Pyramid Network, pretrained on the COCO dataset.
Fine-tuning, a transfer-learning technique, is used here: the last layer (the
classification layer) is replaced to adapt the model to the small, specific dataset
of android statue images. The fine-tuned model is then used on unseen video frames
from a YouTube video.
The Jupyter notebook. Summary snapshots and statistics.
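The head replacement described above can be sketched with torchvision. This is a minimal sketch, not the notebook's code: the function name `build_android_detector`, the `pretrained` flag, and the two-class label set (background plus android statue) are assumptions made for illustration, and the `weights=` argument assumes torchvision 0.13 or newer.

```python
def build_android_detector(num_classes=2, pretrained=True):
    """Faster R-CNN (ResNet-50 + FPN) with a fresh box-classification head.

    num_classes includes the background class, so 2 means
    background + android statue (an assumed label set for this sketch).
    """
    # Imported here so the function can be defined without PyTorch installed.
    from torchvision.models.detection import fasterrcnn_resnet50_fpn
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

    if pretrained:
        # COCO-pretrained weights; downloaded on first use.
        model = fasterrcnn_resnet50_fpn(weights="DEFAULT")
    else:
        # Randomly initialized; useful for quick structural checks.
        model = fasterrcnn_resnet50_fpn(weights=None, weights_backbone=None)

    # Swap the COCO classification/regression head for one sized to our classes.
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model
```

The rest of the network keeps its pretrained weights, so only the new head starts from random initialization when training on the small dataset.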
Highlights of computer vision algorithms (without machine learning):
- Finding an object in a segmented image by shape alone.
PartialShapeMatcher can find a matching articulated, occluded object in an
image and MultiPartialShapeMatcher can find the same object among a
database of candidate objects.
The quality of the closed-curve contours matters. Here are snapshots from
using MultiPartialShapeMatcher on a database of contours constructed using
Segment Anything Model-2 (SAM-2) and using the gingerbread man as a query object.
Also shown are the results when the contours are formed using a quick custom
segmentation method.
- A correspondence list can be made quickly with
keypoints (e.g. Harris corners, inflection points, peaks in à trous wavelets) and
the rotationally invariant ORB descriptors. RANSAC is then used to remove remaining outliers
through epipolar fits.
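The outlier test behind an epipolar fit can be sketched as follows. This is only the inlier check that a RANSAC loop would apply to each candidate fundamental matrix `F`; estimating `F` from sampled matches is not shown. The function names and the 2-pixel tolerance are assumptions for illustration, and the distance is one-sided (point in image 2 to the epipolar line of its partner in image 1).

```python
import math


def epipolar_distance(F, p1, p2):
    """Pixel distance from p2 (image 2) to the epipolar line F * p1."""
    x1, y1 = p1
    x2, y2 = p2
    # Line coefficients (a, b, c) of a*x + b*y + c = 0, from F * [x1, y1, 1]^T.
    a, b, c = (row[0] * x1 + row[1] * y1 + row[2] for row in F)
    norm = math.hypot(a, b)
    if norm == 0.0:
        return float("inf")  # degenerate line: treat the match as an outlier
    return abs(a * x2 + b * y2 + c) / norm


def filter_matches(F, matches, tol=2.0):
    """Keep the (p1, p2) pairs whose epipolar distance is within tol pixels."""
    return [(p1, p2) for p1, p2 in matches if epipolar_distance(F, p1, p2) <= tol]


# Fundamental matrix of a rectified stereo pair: epipolar lines are image rows.
F_rectified = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
matches = [((10, 20), (15, 20)), ((30, 40), (42, 40.5)), ((50, 60), (55, 75))]
inliers = filter_matches(F_rectified, matches)
```

With this `F`, the distance reduces to the row difference between the two points, so the third pair (15 rows apart) is rejected.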
- An improved Canny edge filter can be made for color images by combining the greyscale
Canny edges with those computed from the "C" (chroma) image of the LCH color space.
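To illustrate what the "C" image is, here is a minimal sketch that converts 8-bit sRGB pixels to LCH and extracts the chroma channel. It assumes the CIELAB-based LCH with a D65 white point; the project may use a different LCH variant (for example one derived from CIE LUV), and the function names are hypothetical.

```python
import math

# D65 reference white (CIE 1931 2-degree observer); an assumption of this sketch.
XN, YN, ZN = 0.95047, 1.0, 1.08883


def _srgb_to_linear(c):
    """Undo the sRGB transfer curve for a channel value in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _lab_f(t):
    """Piecewise cube-root function used by CIELAB."""
    delta = 6.0 / 29.0
    if t > delta ** 3:
        return t ** (1.0 / 3.0)
    return t / (3.0 * delta ** 2) + 4.0 / 29.0


def rgb_to_lch(r, g, b):
    """Convert 8-bit sRGB to (L, C, h) with h in degrees in [0, 360)."""
    rl, gl, bl = (_srgb_to_linear(v / 255.0) for v in (r, g, b))
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    fx, fy, fz = _lab_f(x / XN), _lab_f(y / YN), _lab_f(z / ZN)
    lightness = 116.0 * fy - 16.0
    a_star = 500.0 * (fx - fy)
    b_star = 200.0 * (fy - fz)
    chroma = math.hypot(a_star, b_star)
    hue = math.degrees(math.atan2(b_star, a_star)) % 360.0
    return lightness, chroma, hue


def chroma_image(rgb_image):
    """Map rows of (r, g, b) tuples to a 2D list of chroma values."""
    return [[rgb_to_lch(*px)[1] for px in row] for row in rgb_image]
```

Chroma is near zero for greys and large for saturated colors, so edges between regions of similar brightness but different saturation show up in the "C" image even when the greyscale image misses them. Running the same edge detector on both images and taking the pixelwise union is one simple way to combine them.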
- MSER (Maximally Stable Extremal Regions) is a very fast algorithm that can be used
where level sets are needed. Using MSER on greyscale and on the "H" (hue) image of
the LCH color space can find potential objects.
I combined the Canny color edges with the boundaries of MSER regions to create
edges that are complete contours (== segmentation).
The current version has some color filter rules that are potentially fragile, and
the use of MSER is resolution dependent, so a wrapper class is provided.
The results look promising.
- Finding an object in another image in which it has changed location, lighting,
and pose is not easy, but can be done with MSER and HOGs (Histograms of Oriented Gradients).
The current version has wrapper classes that help prepare the images.
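As a rough illustration of why HOGs tolerate lighting changes, the sketch below builds an unsigned-gradient orientation histogram for a single greyscale cell and compares two cells after normalization. It is a simplified stand-in, not the project's implementation: the function names are hypothetical, and a full HOG descriptor adds bin interpolation and block normalization.

```python
import math


def cell_histogram(cell, n_bins=9):
    """Orientation histogram (0-180 degrees) of one greyscale cell.

    cell is a 2D list of intensities; each interior pixel votes with its
    gradient magnitude into the bin of its unsigned gradient direction.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # central differences
            gy = cell[y + 1][x] - cell[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += mag
    return hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two histograms after sum-normalization."""
    s1, s2 = sum(h1), sum(h2)
    if s1 == 0.0 or s2 == 0.0:
        return 0.0
    return sum(min(a / s1, b / s2) for a, b in zip(h1, h2))


# The same horizontal ramp at two contrast levels, and a vertical ramp.
ramp = [[10 * x for x in range(8)] for _ in range(8)]
bright_ramp = [[20 * x for x in range(8)] for _ in range(8)]
vertical = [[10 * y for _ in range(8)] for y in range(8)]
```

Because the histograms are normalized before comparison, `ramp` and `bright_ramp` match exactly even though their contrast differs, while `ramp` and `vertical` share no orientation bins.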
Some of the notes from implementing computer vision without machine learning: