Predict whether a donor will donate blood again, based on their donation history. The aim was to get the best possible performance out of a small dataset.
(Done as a part of TCS Digital Highway Challenge) The data has 5 attributes, and the task is to predict the 5th one: R (Recency: months since last donation), F (Frequency: total number of donations), M (Monetary: total blood donated in c.c.), T (Time: months since first donation), and a binary variable representing whether the donor gave blood in March 2007 (1 stands for donating blood; 0 stands for not donating blood). The dataset was checked for inconsistencies, and the attributes were normalised to put them on comparable scales. Donors who gave blood are the low-incidence (positive, minority) class, so the number of these examples was increased using synthetic minority oversampling to try to improve accuracy. Of the available two-class classification algorithms, decision forest gave the best result.
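
The normalisation and oversampling steps described above can be sketched in plain Python. This is only an illustration under assumptions: the tooling used in the project is not stated here, min-max scaling is one possible reading of "normalised", the helper names `min_max_normalise` and `smote` are hypothetical, and the sample rows are made up rather than taken from the dataset. The classifier step is left out.

```python
import math
import random


def min_max_normalise(rows):
    """Scale each column of `rows` to the range [0, 1] (assumed scheme)."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def smote(minority, n_new, k=5, seed=0):
    """Create `n_new` synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of `base`, excluding `base` itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Made-up rows in the order R, F, M, T (illustrative values only).
features = [
    [3, 10, 2500, 40],
    [1, 8, 2000, 30],
    [2, 12, 3000, 50],
    [4, 6, 1500, 35],
    [23, 1, 250, 23],
    [14, 2, 500, 20],
    [21, 3, 750, 45],
    [16, 1, 250, 16],
    [11, 4, 1000, 60],
    [9, 2, 500, 28],
]
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # 1 = donated, 0 = did not donate

scaled = min_max_normalise(features)
donors = [row for row, y in zip(scaled, labels) if y == 1]
n_missing = labels.count(0) - labels.count(1)  # synthetic donors needed
balanced_x = scaled + smote(donors, n_missing, k=2)
balanced_y = labels + [1] * n_missing
```

In practice, oversampling should be applied only to the training split, so that synthetic points do not leak into the data used for evaluation.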