Loy Taxi fare

June 15, 2019
How much is the taxi fare? Using Boosted Decision Tree Regression to predict taxi fare
**Problem**<br>
The goal is to predict the fare of a taxi trip in New York City. At first glance, the fare may seem to depend simply on the distance traveled. However, taxi vendors in New York charge varying amounts for other factors, such as additional passengers or paying with a credit card instead of cash. This prediction could help taxi providers give passengers and drivers fare estimates.
<br><br>
**Using Boosted Decision Tree Regression to predict taxi fare**<br>
<br>
The finished model<br>
https://raw.githubusercontent.com/laploy/ML.NET/master/GitHub-Issue/github-issue-azureML-model.JPG <br>
Question and Data<br>
Question: How much is the taxi fare?<br>
<br>
Dataset:<br>
taxi-fare-train.csv<br>
https://raw.githubusercontent.com/laploy/ML.NET/master/Taxi-fair/taxi-fare-train.csv<br>
<br>
taxi-fare-score.csv<br>
https://raw.githubusercontent.com/laploy/ML.NET/master/Taxi-fair/taxi-fare-score.csv<br>
<br>
taxi-fare-batch.csv<br>
https://raw.githubusercontent.com/laploy/ML.NET/master/Taxi-fair/taxi-fare-batch.csv<br>
<br>
**Dataset description**<br>
vendor_id: A code indicating the TPEP provider that provided the record.<br>
rate_code: The final rate code in effect at the end of the trip.<br>
1 = Standard rate<br>
2 = JFK<br>
3 = Newark<br>
4 = Nassau or Westchester<br>
5 = Negotiated fare<br>
6 = Group ride<br>
passenger_count: The number of passengers in the vehicle.<br>
trip_time_in_secs: The duration of the trip in seconds.<br>
trip_distance: The elapsed trip distance in miles reported by the taximeter.<br>
payment_type: A numeric code signifying how the passenger paid for the trip.<br>
1 = Credit card<br>
2 = Cash<br>
3 = No charge<br>
4 = Dispute<br>
5 = Unknown<br>
6 = Voided trip<br>
fare_amount: The time-and-distance fare calculated by the meter.<br>
<br>
**Evaluation Metrics**<br>
Mean absolute error (MAE)<br>
The average of all the model errors, where a model error is the absolute difference between the predicted label value and the correct label value.
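As a minimal sketch of the MAE definition above, the following Python snippet averages the absolute differences between predicted and actual fares. The function name `mean_absolute_error` and the sample fare values are illustrative assumptions; they are not taken from the taxi dataset or from the tool used to build the model.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical fares in dollars (made up for illustration only).
actual_fares = [10.0, 20.0, 15.0, 8.0]
predicted_fares = [12.0, 18.0, 15.0, 9.0]

# Absolute errors are 2, 2, 0 and 1, so the average is 5 / 4.
mae = mean_absolute_error(actual_fares, predicted_fares)
print(mae)  # 1.25
```

A lower MAE means the predicted fares are, on average, closer to the metered fares; the value is expressed in the same unit as `fare_amount`.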
<br><br> Coefficient of determination<br>
Measures how well the model fits the data. Typically ranges from 0 to 1. A value of 0 means that the data is random or otherwise cannot be fit to the model. A value of 1 means that the model exactly matches the data.
<br><br>
![Model][1]
<br><br>
![Score][2]
![Metrics][3]

[1]: https://raw.githubusercontent.com/laploy/ML.NET/master/Taxi-fare/taxi-fare-model.JPG
[2]: https://raw.githubusercontent.com/laploy/ML.NET/master/Taxi-fare/taxi-fare-score.JPG
[3]: https://raw.githubusercontent.com/laploy/ML.NET/master/Taxi-fare/taxi-fare-metrix.JPG
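The coefficient of determination described above can be sketched in a few lines of Python. This example assumes the common definition, one minus the ratio of the residual sum of squares to the total sum of squares; whether the tool that produced the screenshots computes it in exactly this way is an assumption. The function name `r_squared` and the fare values are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: squared errors of the model.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: squared deviations from the mean of the actual values.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values are constant; the ratio is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical fares in dollars (made up for illustration only).
actual_fares = [10.0, 20.0, 15.0, 8.0]
predicted_fares = [12.0, 18.0, 15.0, 9.0]

r2 = r_squared(actual_fares, predicted_fares)
print(round(r2, 4))
```

Under this definition, perfect predictions give exactly 1, and always predicting the mean of the actual fares gives 0. A model that does worse than predicting the mean produces a negative value.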