Thoughts on CS6476: Computer Vision

I recently completed my first course for the Georgia Tech OMSCS, Computer Vision, and wanted to share some thoughts I had on it.

Why choose this course?

I recently built APIs for image classification and reverse image search using deep learning libraries. In the process, I gained an understanding of how images work as a data structure, and how to apply machine learning to them to build useful data products.

Nonetheless, I wanted a more in-depth understanding of the fundamentals of working with images. In addition, there are plenty of other useful applications for image and video, and the course seemed to provide a broad overview of them.

What specific CV applications were covered?

The class covered several CV algorithms, and how to apply them to solve simple problems, including:

  • Detecting lines and circles, including counting the total value of currency (Hough)
  • Measuring depth from multiple images (Window-based stereo matching)
  • Detecting features to match images/stitch a panorama (Harris, SIFT, RANSAC)
  • Detecting movements of objects across multiple images (Optical flow)
  • Tracking movements of subjects in videos (Particle filters)
  • Classifying motion in videos (Motion history images)

Here’s an example of detecting circles to count the total value of coins in an image. The detection algorithm is built solely on NumPy; OpenCV was used only to draw the detected circles.
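To make the voting idea behind Hough circle detection concrete, here is a minimal NumPy-only sketch. It is not the course's implementation: it assumes the radius is already known, works on a synthetic edge map rather than a real coin photo, and the function names (`hough_circle_accumulator`, `synthetic_circle_edges`) are made up for illustration.

```python
import numpy as np


def hough_circle_accumulator(edges, radius, n_angles=360):
    """Vote for circle centers of one fixed radius, given a binary edge map."""
    h, w = edges.shape
    acc = np.zeros((h, w), dtype=np.int64)
    ys, xs = np.nonzero(edges)
    thetas = np.linspace(0, 2 * np.pi, n_angles, endpoint=False)

    # Each edge pixel could lie on a circle whose center is `radius` away
    # in any direction, so it votes for every such candidate center.
    cx = np.rint(xs[:, None] - radius * np.cos(thetas)[None, :]).astype(int)
    cy = np.rint(ys[:, None] - radius * np.sin(thetas)[None, :]).astype(int)

    # Drop votes that fall outside the image.
    valid = (cx >= 0) & (cx < w) & (cy >= 0) & (cy < h)

    # np.add.at accumulates correctly even when indices repeat.
    np.add.at(acc, (cy[valid], cx[valid]), 1)
    return acc


def synthetic_circle_edges(shape, center, radius, n_points=720):
    """Hypothetical test helper: draw a one-pixel circle outline."""
    edges = np.zeros(shape, dtype=bool)
    t = np.linspace(0, 2 * np.pi, n_points, endpoint=False)
    xs = np.rint(center[0] + radius * np.cos(t)).astype(int)
    ys = np.rint(center[1] + radius * np.sin(t)).astype(int)
    edges[ys, xs] = True
    return edges


# Circle centered at x=60, y=40 with radius 20 in a 100x120 image.
edges = synthetic_circle_edges((100, 120), center=(60, 40), radius=20)
acc = hough_circle_accumulator(edges, radius=20)

# The accumulator peak is the most-voted center (row, col) = (y, x).
cy, cx = np.unravel_index(np.argmax(acc), acc.shape)
print("estimated center (x, y):", (int(cx), int(cy)))
```

In a coin-counting setting, one plausible extension would be to repeat the voting over a range of radii and map each detected radius to a coin denomination; the edge map itself would come from an edge detector run on the photo.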

