# HR Data - End Term

December 4, 2019
Algorithms

Ans 1) Another Position, Unhappy, More Money, Career Change.

Ans 2) 16

Ans 3) 11

Ans 4) Career Builder, Pay Per Click, MBTA Ads, On-Campus Recruiting.

Ans 5) 82.76% for the "Employee Referral" category, keeping future joiners out of the calculation.

Ans 6) To build a linear regression model, I have taken pay rate as the dependent variable, because it is the rate of pay an employee earns, and position, manager name, employee source, performance score and department as the independent variables. The model indicates that pay depends mostly on position, department and manager, which is consistent with the general rule of how salaries are set.

Ans 7) To build a logistic regression model, I have taken employment status, which is categorical, as the dependent variable. This helps me understand which variables affect the status of employment. The factors taken for the logistic regression are pay rate, department, position, manager name and performance score. These variables are considered because they affect an employee's status to a large degree.

Ans 8) If we had to choose CART, I would use Reason for Termination in addition to the variables used for the logistic regression, in order to predict the results more accurately, since CART is a process of answering a sequence of questions in a hierarchical manner.

Ans 9) The strength of the association between two variables is indicated by the lift ratio. The larger the lift ratio, the stronger the association: a lift above 1 means that the two items occur together more often than would be expected if they were independent. In the first case, performance score = fully meets and terminated for cause are strongly associated.
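The measures used in Ans 9 (support, confidence and lift) can be sketched in a few lines of Python. This is a minimal illustration using the standard association-rule definitions; the records, item labels and function names are hypothetical and are not taken from the HR data set or from the tool used for the analysis.

```python
# Toy records: made-up values for illustration only, NOT the HR data set.
records = [
    {"marital=divorced", "status=voluntarily terminated"},
    {"marital=divorced", "status=voluntarily terminated"},
    {"marital=married", "status=voluntarily terminated"},
    {"marital=married", "status=active"},
    {"marital=married", "status=active"},
    {"marital=single", "status=active"},
    {"marital=single", "status=active"},
    {"marital=single", "status=active"},
]


def support(items, data):
    """Share of records that contain every item in `items`."""
    items = set(items)
    return sum(items <= record for record in data) / len(data)


def confidence(antecedent, consequent, data):
    """Share of records matching the antecedent that also match the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, data) / support(antecedent, data)


def lift(antecedent, consequent, data):
    """Confidence of the rule divided by the baseline share of the consequent."""
    return confidence(antecedent, consequent, data) / support(consequent, data)


# Hypothetical rule: marital=divorced -> status=voluntarily terminated
rule_if = {"marital=divorced"}
rule_then = {"status=voluntarily terminated"}

print("support   :", support(rule_if | rule_then, records))
print("confidence:", confidence(rule_if, rule_then, records))
print("lift      :", round(lift(rule_if, rule_then, records), 3))
```

In this toy data every divorced employee is voluntarily terminated, so the confidence is 1, while only 3 of 8 employees overall are voluntarily terminated, so the lift is above 1.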
All the association rules have a confidence of 1, which indicates, for example, high confidence that managers whose marital status is single are terminated for cause. Similarly, employees whose marital status is "divorced" take voluntary termination. The support value is equal (0.0108) in 4 out of 5 rules, which signifies that the item sets in those 4 rules appear equally often in the data set. Lift is highest for the single manager whose employment status is voluntary termination.

A lift ratio larger than 1 implies that the relationship between marital description = divorced and employee status = voluntarily terminated is stronger than would be expected if the two values were independent. The count of 3 shows three cases in which a divorced person voluntarily left the organisation.

If the employee's performance score is 90 days meets, the employee is voluntarily terminated. A lift ratio larger than 1 implies that the relationship between performance score = 90 days meets and employee status = voluntarily terminated is stronger than would be expected if the two items were independent. The count of 3 shows three cases in which a person with this performance score voluntarily left the organisation.

If the employee's marital description is divorced and the performance score is "N/A - too early to review", the employee status is voluntarily terminated. Confidence measures the reliability of the inference made. In this case the confidence is 100% at a count of 4, which means that all 4 employees in the data who are divorced and have this performance score were voluntarily terminated.

When the employee is male and a Production Technician II, the employee status is voluntarily terminated. The lift value is 3.349, which is the ratio of the target response (the share of voluntarily terminated employees among those matching position and gender) to the average response (the share of voluntarily terminated employees in the whole data set). This value of lift tells us the degree to which the two occurrences depend on one another, and makes these rules potentially useful for predicting the outcome in similar future cases.
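The relation between the confidence and lift values quoted above can be written out as a short worked equation. It uses the standard association-rule definition of lift, and it assumes that the confidence of 1 and the lift of 3.349 both refer to the rule A = (male, Production Technician II), B = (voluntarily terminated). The resulting baseline share is implied by those two numbers and is not read from the data set.

```latex
\[
\operatorname{lift}(A \Rightarrow B)
  = \frac{\operatorname{confidence}(A \Rightarrow B)}{\operatorname{support}(B)}
  = \frac{P(B \mid A)}{P(B)}
\]
\[
3.349 = \frac{1}{P(B)}
\quad\Longrightarrow\quad
P(B) = \frac{1}{3.349} \approx 0.2986
\]
```

Under these assumptions, roughly 30% of all employees in the data are voluntarily terminated, and an employee matching the rule is about 3.3 times as likely to be in that group as an employee picked at random.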