# End term Submission_Manisha

December 4, 2019
Algorithms

1. Reasons for termination:
   - Another position
   - Unhappy
   - More money
   - Career change
2. Number of female employees who left: 14
3. Number of people joining in the near future: 11
4. The most expensive recruitment channels are:
   - Career builder: 7790
   - Pay per click: 1323
   - MBTA ads: 646
   - On campus recruiter: 625
5. Retention rate of “Employee Referral”: 82.76%
6. To build the linear regression model, pay rate was taken as the dependent variable because it is the rate of money an employee earns. Position, manager name, employee source, performance score and department were used as independent variables. The model indicates that pay depends mostly on position, department and manager, which agrees with the general rule of how salaries are set.
7. To build the logistic regression model, employment status, which is categorical in nature, was taken as the dependent variable. This helps us understand which variables have an impact on employment status. The factors used are pay rate, department, position, manager name and performance score. These variables were chosen because they strongly affect the status of any employee.
8. In CART, we use the same setup as in the logistic regression, i.e. EmpStatus ID is modelled on sex, unit, age, efficiency, director and marital status. The number of decision trees was reduced to 1 and the results were evaluated for better analysis.
9. The interpretation of the association rules is given below:
   - The strength of the association between two items is measured by the lift ratio: the higher the lift ratio, the stronger the association. A lift ratio of 1 means the two items are independent, while a value above 1 means they occur together more often than independence would predict. Two items that appear to have nothing in common can therefore still show a high degree of association. In the first rule, Performance score = “fully meets” and Employee status = “terminated for cause” are highly associated.
   - A lift ratio larger than 1 shows that “Marital description” = “divorced” and “Employee status” = “voluntarily terminated” occur together more often than would be expected if the two items were independent of each other. The count of 3 means there are three cases in which a divorced person voluntarily left the organisation.
   - If the employee reaches the required performance score in 90 days, the employee status is voluntarily terminated. A lift ratio larger than 1 signifies that Performance score = “90 days meets” and Employee status = “voluntarily terminated” occur together more often than would be expected if the two items were independent. The count of 3 means there are three transactions in which a person met the performance score and voluntarily left the organisation.
   - If the employee's marital description is divorced and the performance score ranges from N/A to too early to review, the employee status comes out as voluntarily terminated. Confidence measures the reliability of the inference. Here the confidence is 100% at a count of 4, which means that all four employees in the data who match both conditions were voluntarily terminated.
   - When the employee is male and holds the position Production Technician II, the employee status is voluntarily terminated. The lift value is 3.349, which is the ratio of the target response (the rate of voluntary termination among employees matching position and gender) to the average response (the rate of voluntary termination across all employees). The lift value tells us how strongly the two occurrences depend on one another, which makes such rules potentially useful for predicting the outcome of similar cases in the future.
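
The count, confidence and lift figures quoted in the rules above can be reproduced with a few lines of Python. The following is only a minimal sketch under stated assumptions: the column names (`MaritalDesc`, `EmploymentStatus`), the helper `rule_metrics` and the ten toy records are hypothetical and are not taken from the dataset used in this submission.

```python
from typing import Dict, List

Record = Dict[str, str]


def rule_metrics(records: List[Record],
                 antecedent: Record,
                 consequent: Record) -> Dict[str, float]:
    """Compute count, support, confidence and lift for antecedent -> consequent."""

    def matches(rec: Record, cond: Record) -> bool:
        # A record matches when every (column, value) pair in cond is present.
        return all(rec.get(col) == val for col, val in cond.items())

    n = len(records)
    n_a = sum(matches(r, antecedent) for r in records)
    n_c = sum(matches(r, consequent) for r in records)
    n_ac = sum(matches(r, antecedent) and matches(r, consequent)
               for r in records)

    support = n_ac / n if n else 0.0            # share of all records matching both sides
    confidence = n_ac / n_a if n_a else 0.0     # "target response"
    base_rate = n_c / n if n else 0.0           # "average response"
    lift = confidence / base_rate if base_rate else 0.0
    return {"count": n_ac, "support": support,
            "confidence": confidence, "lift": lift}


# Hypothetical toy data: 10 employees, 5 of whom left voluntarily.
records = (
    [{"MaritalDesc": "Divorced", "EmploymentStatus": "Voluntarily Terminated"}] * 3
    + [{"MaritalDesc": "Married", "EmploymentStatus": "Voluntarily Terminated"}] * 2
    + [{"MaritalDesc": "Single", "EmploymentStatus": "Active"}] * 5
)

metrics = rule_metrics(
    records,
    antecedent={"MaritalDesc": "Divorced"},
    consequent={"EmploymentStatus": "Voluntarily Terminated"},
)
print(metrics)
```

In this toy example all three divorced employees left voluntarily, so the confidence is 1.0, while the overall voluntary termination rate is 0.5. The lift is therefore 2.0, meaning the rule's consequent is twice as likely among divorced employees as among employees in general.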