Ph.D. Student  

Machine Learning

I took this class, taught by Charles Isbell, in spring 2007 at the Georgia Institute of Technology.

I completed four assignments and a final project. A brief description of each follows, along with the source code and the analysis.


Final project

I did this project together with Adebola Osuntogu and Mingxuan Sun. We won the best-paper award, which was decided by a vote among the other students.

Given images of buildings and non-buildings, our goals are to:

  1. discriminate buildings from non-buildings
  2. recognize a specific building

We compare three different algorithms: Consistent Line Clusters (CLC), Randomized Decision Trees and the Vocabulary Tree. To compare our results with those of other authors, we test our algorithms for the first problem on a subset of the Caltech256 dataset. The second problem is tested on the ZuBuD dataset.

We obtain 99% accuracy on building versus non-building classification and 75.6% on specific-building classification. We conclude that the CLC method performs best in the first case, while the Vocabulary Tree performs best in the second case, although Randomized Trees perform similarly (72%).

More details can be found in our paper and our slides.

Sample images

CLC overlaid on a building

Pixel locations for randomized tree

MSER features overlaid on a building

[ paper | slides ]



Assignment 1 - Supervised learning

We had to evaluate 5 different machine learning algorithms on 2 datasets. The algorithms are:

  • Decision trees with some form of pruning
  • Neural networks
  • Boosting
  • Support Vector Machines
  • k-Nearest neighbors
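
To give a flavor of the simplest of these five algorithms, here is a minimal k-nearest-neighbors sketch in plain Python. It is not the code from the assignment: the function name `knn_predict`, the toy points in `train` and the choice of Euclidean distance are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs (hypothetical toy data).
    """
    # Sort the training points by Euclidean distance to the query, keep the k closest.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Majority vote over the labels of those k points.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 2-D data: two well-separated groups (made up, not one of the assignment datasets).
train = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
    ((1.0, 1.0), "b"), ((0.9, 1.1), "b"), ((1.1, 0.9), "b"),
]

print(knn_predict(train, (0.05, 0.05)))
print(knn_predict(train, (1.0, 0.95)))
```

On real data the value of k and the distance measure would normally be chosen by cross-validation rather than fixed as they are here.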

I used two datasets to evaluate the five machine learning algorithms: a set of images of handwritten digits and a set of cell nucleus properties used to improve breast tumor diagnosis.

Sample images

Digit sample
Snakes fitted around cell nucleus
Search for optimal SVM parameters


More details can be found in the analysis.

[ analysis | source ]



Assignment 2 - Randomized optimization

We had to evaluate 4 different optimization techniques on several optimization problems. The optimization techniques are:

  • randomized hill climbing
  • simulated annealing
  • a genetic algorithm

Sample images

Rastrigin’s function
Solutions found by the genetic algorithm for Rastrigin’s function
TSP solved by the genetic algorithm


More details can be found in the analysis.

[ analysis | source ]



Assignment 3 - Unsupervised learning and dimension reduction

We had to compare two clustering algorithms

  • k-means clustering
  • Expectation Maximization

and four dimension reduction techniques:

  • PCA
  • ICA
  • Randomized Projections
  • naive dimension reduction via downsampling and mean intensity

on two datasets. I re-used the datasets from assignment 1.
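
For readers unfamiliar with the first clustering algorithm, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is not the assignment code: the function name `kmeans`, the random initialization from the data points and the toy `points` are assumptions for illustration only.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Cluster `points` (tuples of numbers) into k groups with Lloyd's algorithm."""
    rng = random.Random(seed)
    # Assumed initialization: pick k distinct data points as the starting centers.
    centers = rng.sample(points, k)
    clusters = []
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[idx].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers, clusters


# Toy data: two well-separated groups (made up for this example).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, clusters = kmeans(points, k=2)
print(centers)
```

Expectation Maximization for a Gaussian mixture follows the same alternating pattern, but replaces the hard assignment with per-cluster membership probabilities.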

Sample images

First eigenimage for digit dataset
5th independent image for digit dataset
Digit 9 projected on the first 319 random


More details can be found in the analysis.

[ analysis | source ]



Assignment 4 - Markov Decision Processes

We had to compare policy iteration and value iteration on two interesting Markov Decision Processes. I chose one process with only a few states (a grid world) and one with many states (a car racing example).
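
As a rough illustration of value iteration, here is a sketch on a small deterministic grid world. This is a hypothetical toy MDP, not the one analyzed in the assignment: the 4x4 size, the goal cell, the step reward of -1 and the discount factor `gamma` are all assumed values.

```python
def value_iteration(n=4, goal=(3, 3), step_reward=-1.0, gamma=0.9, tol=1e-6):
    """Value iteration on a deterministic n x n grid world with one terminal goal."""
    moves = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
    states = [(r, c) for r in range(n) for c in range(n)]
    values = {s: 0.0 for s in states}

    def step(state, move):
        # Moving off the grid leaves the agent where it is.
        r, c = state[0] + move[0], state[1] + move[1]
        return (r, c) if 0 <= r < n and 0 <= c < n else state

    def q(state, action):
        # Value of taking `action` in `state` under the current value estimates.
        return step_reward + gamma * values[step(state, moves[action])]

    while True:
        delta = 0.0
        for s in states:
            if s == goal:
                continue  # terminal state keeps value 0
            best = max(q(s, a) for a in moves)  # Bellman optimality backup
            delta = max(delta, abs(best - values[s]))
            values[s] = best  # in-place update
        if delta < tol:
            break

    # Greedy policy with respect to the converged values.
    policy = {s: max(moves, key=lambda a: q(s, a)) for s in states if s != goal}
    return values, policy


values, policy = value_iteration()
print(values[(3, 2)], policy[(3, 2)])
```

Policy iteration instead alternates between fully evaluating a fixed policy and improving it greedily; on a small problem like this one both methods arrive at the same policy.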

Sample images

Solution for easy
Solution for 20x20 grid
Solution for the hard car track


More details can be found in the analysis.

[ analysis | source ]