Curvature-scale-space-corners-and-transformations by nking

This holds a few summary notes and snapshots.

Object Detection and Machine Learning Notebooks:

Object detection with Faster R-CNN, a two-stage object detection algorithm. The backbone is a ResNet-50 CNN with a Feature Pyramid Network pretrained on the COCO dataset. Fine-tuning, a transfer-learning technique, is used to replace the last layer (the classification layer) so that the model adapts to a small, specific dataset of android statue images. The fine-tuned model is then run on unseen frames from a YouTube video.

The Jupyter notebook.

Summary snapshots and statistics.

Highlights of computer vision algorithms (without machine learning):

Finding an object in a segmented image by shape alone. PartialShapeMatcher can find a matching articulated, occluded object in an image, and MultiPartialShapeMatcher can find the same object among a database of candidate objects. The quality of the closed-curve contours matters. Here are snapshots from using MultiPartialShapeMatcher, with the gingerbread man as the query object, on a database of contours constructed using Segment Anything Model-2 (SAM-2). Also shown are the results when the contours are formed using a quick custom segmentation method.
A correspondence list can be made quickly with keypoints (e.g. Harris corners, inflection points, peaks in aTrous wavelets) and the rotationally invariant ORB descriptors. RANSAC is then used to remove remaining outliers through epipolar fits.
An improved Canny edge filter for color images can be made by combining the greyscale Canny edges with Canny edges computed on the "C" (chroma) image of the LCH color space.
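The color Canny combination above can be sketched in a few lines. This is an illustration only: it assumes the two edge maps are binary, have the same size, and are combined by a pixelwise union, which may not be the exact rule the project uses. The helper names `chroma` and `combine_edges` are hypothetical. The chroma image is useful because two regions with similar brightness but different color saturation produce little or no greyscale edge.

```python
import math


def chroma(a, b):
    """Chroma C of LCH from the two chromatic coordinates (a*, b* or u*, v*)."""
    return math.hypot(a, b)


def combine_edges(grey_edges, chroma_edges):
    """Pixelwise union of two binary edge maps (lists of rows of 0/1).

    Assumption: a pixel is an edge if either detector marked it.
    """
    if len(grey_edges) != len(chroma_edges):
        raise ValueError("edge maps must have the same number of rows")
    combined = []
    for row_g, row_c in zip(grey_edges, chroma_edges):
        if len(row_g) != len(row_c):
            raise ValueError("edge maps must have the same number of columns")
        combined.append([1 if (g or c) else 0 for g, c in zip(row_g, row_c)])
    return combined
```

In practice each input map would come from running a Canny detector once on the greyscale image and once on the image of per-pixel `chroma` values.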

MSER (Maximally Stable Extremal Regions) is a very fast algorithm that can be used where level sets are needed. Using MSER on the greyscale image and on the "H" (hue) image of the LCH color space can find potential objects. I combined the Canny color edges with the boundaries of MSER regions to create edges that are complete contours (that is, a segmentation). The current version has some color filter rules that are potentially fragile, and the use of MSER is resolution dependent, so a wrapper class is provided. The results look promising.

Finding an object in another image in which it has changed location, lighting, and pose is not easy, but it can be done with MSER and HOGs (histograms of oriented gradients). The current version has wrapper classes that help prepare the images.
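To make the HOG part concrete, here is a minimal sketch of a gradient-orientation histogram for a single cell, plus a simple similarity score. It is a simplified illustration, not the project's implementation: it uses hard binning into 9 unsigned bins and central differences, while full HOG pipelines usually add bin interpolation and block normalization. The function names and the choice of histogram intersection as the comparison are assumptions.

```python
import math


def cell_histogram(patch, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (0-180 deg).

    patch is a list of rows of intensities; border pixels are skipped
    because central differences need both neighbours.
    """
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, len(patch) - 1):
        for x in range(1, len(patch[0]) - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Fold opposite directions together (unsigned orientation).
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle / bin_width), n_bins - 1)
            hist[index] += magnitude
    return hist


def l2_normalize(hist, eps=1e-9):
    """Scale a histogram to unit length to reduce sensitivity to lighting."""
    norm = math.sqrt(sum(v * v for v in hist)) + eps
    return [v / norm for v in hist]


def intersection(h1, h2):
    """Histogram intersection: larger means more similar."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

Normalizing before comparing is what gives some tolerance to lighting changes. Tolerance to location and pose has to come from elsewhere, for example from the MSER regions that select where the histograms are computed.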

Some of the notes made while implementing computer vision without machine learning:

segmentation and edges

misc scale space contour notes

projection

Curvature-scale-space-corners-and-transformations

Machine Learning and Computer Vision Programs