We implemented a variety of techniques to improve the classification of images and blogs by age and gender for the Machine Learning 2009 competition. The starter files provided baseline implementations that used simple versions of Naive Bayes for blogs and an SVM for images. We improved blog classification by adding features to Naive Bayes and by stemming the dictionary of unique words. We improved image classification by moving to the HSV colorspace, introducing Haar wavelets, and implementing PCA. A perceptron and generated "typical faces" were also investigated, but both were discarded because of low performance.
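To make the blog side of this concrete, the following is a minimal sketch of Naive Bayes over a stemmed vocabulary. It is not the competition code: the `crude_stem` suffix stripper is a hypothetical stand-in for a real stemmer, the `NaiveBayes` class and its `alpha` smoothing parameter are illustrative names, and the toy documents and the "younger"/"older" labels are invented for the example.

```python
import math
from collections import Counter, defaultdict


def crude_stem(word):
    """Strip a few common English suffixes (stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lowercase, split on whitespace, keep alphabetic words, and stem them."""
    return [crude_stem(w) for w in text.lower().split() if w.isalpha()]


class NaiveBayes:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.word_counts = defaultdict(Counter)  # label -> stem -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()                       # dictionary of unique stems

    def fit(self, docs, labels):
        for text, label in zip(docs, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = (sum(self.word_counts[label].values())
                     + self.alpha * len(self.vocab))
            for tok in tokens:
                if tok in self.vocab:  # ignore stems never seen in training
                    count = self.word_counts[label][tok]
                    score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data, invented for illustration only.
docs = [
    "playing games after school",
    "homework and school tests",
    "meeting clients about mortgages",
    "paying taxes and mortgages",
]
labels = ["younger", "younger", "older", "older"]
model = NaiveBayes().fit(docs, labels)
```

Stemming shrinks the dictionary by mapping inflected forms such as "playing" and "played" to one entry, so their counts are pooled rather than split across several sparse features.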