When evaluating model accuracy in Machine Learning, several related concepts come into play. The key ones are:
1. True Positive (TP): The number of positive instances the model correctly predicts as positive. For instance, correctly identifying a disease when the person actually has it.
2. True Negative (TN): The number of negative instances the model correctly predicts as negative. For example, correctly classifying a healthy person as disease-free.
3. False Positive (FP): The number of negative instances the model incorrectly predicts as positive, also known as a Type I error. For instance, diagnosing a healthy person as having the disease.
4. False Negative (FN): The number of positive instances the model incorrectly predicts as negative, also known as a Type II error. For example, failing to detect the disease in a person who actually has it.
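To make these four counts concrete, here is a minimal sketch using scikit-learn's confusion_matrix on a small set of made-up labels (y_true and y_pred are hypothetical arrays, not from any real dataset):

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels: 1 = has the disease, 0 = healthy
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

# For binary labels, sklearn lays the 2x2 matrix out as
# [[TN, FP], [FN, TP]], so ravel() unpacks it in that order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")  # TP=3, TN=4, FP=1, FN=2
```

Together these four counts form the confusion matrix, and every metric below can be derived from them.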
5. Accuracy: The overall correctness of the model's predictions, calculated as the ratio of correct predictions (TP + TN) to the total number of predictions.
Accuracy = (TP + TN) / (TP + TN + FP + FN)
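As a quick sketch, accuracy can be computed either directly from the formula or with scikit-learn's accuracy_score; the labels reuse the hypothetical arrays from the confusion-matrix example above:

```python
from sklearn.metrics import accuracy_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

# Counts for these labels: TP=3, TN=4, FP=1, FN=2
tp, tn, fp, fn = 3, 4, 1, 2
manual = (tp + tn) / (tp + tn + fp + fn)  # 7 / 10 = 0.7

print(manual, accuracy_score(y_true, y_pred))  # both print 0.7
```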
6. Precision: Precision measures the proportion of correctly predicted positive instances (TP) out of all instances predicted as positive (TP + FP). It indicates the model’s ability to minimize false positives.
Precision = TP / (TP + FP)
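Continuing with the same hypothetical labels, precision_score matches the result of applying the formula by hand:

```python
from sklearn.metrics import precision_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

# TP=3 and FP=1 here, so precision = 3 / (3 + 1) = 0.75
print(precision_score(y_true, y_pred))  # 0.75
```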
7. Recall (Sensitivity or True Positive Rate): Recall represents the proportion of correctly predicted positive instances (TP) out of all actual positive instances (TP + FN). It indicates the model’s ability to minimize false negatives.
Recall = TP / (TP + FN)
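A matching sketch for recall on the same made-up labels:

```python
from sklearn.metrics import recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

# TP=3 and FN=2 here, so recall = 3 / (3 + 2) = 0.6
print(recall_score(y_true, y_pred))  # 0.6
```

Note that this hypothetical model trades recall (0.6) for precision (0.75): it misses some true cases but raises few false alarms.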
8. F1-Score: The harmonic mean of precision and recall, combining both into a single balanced metric. It is especially useful when the classes are imbalanced and accuracy alone is misleading.
F1-Score = 2 * (Precision * Recall) / (Precision + Recall)
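Plugging the precision and recall from the sketches above into the harmonic-mean formula agrees with scikit-learn's f1_score:

```python
from sklearn.metrics import f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

precision, recall = 0.75, 0.6  # values computed in the earlier sketches
manual = 2 * (precision * recall) / (precision + recall)  # = 0.666...

print(manual, f1_score(y_true, y_pred))  # both print ~0.667
```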
9. Specificity (True Negative Rate): Specificity measures the proportion of correctly predicted negative instances (TN) out of all actual negative instances (TN + FP). It indicates the model's ability to correctly rule out negatives and avoid false positives.
Specificity = TN / (TN + FP)
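scikit-learn has no dedicated specificity function, but since specificity is simply recall computed on the negative class, recall_score with pos_label=0 does the job (same hypothetical labels as above):

```python
from sklearn.metrics import recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

# TN=4 and FP=1 here, so specificity = 4 / (4 + 1) = 0.8
print(recall_score(y_true, y_pred, pos_label=0))  # 0.8
```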
10. Receiver Operating Characteristic (ROC) Curve: The ROC curve is a graphical representation of the performance of a binary classification model at various thresholds. It shows the tradeoff between the true positive rate (sensitivity) and the false positive rate (1 – specificity).
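Unlike the metrics above, the ROC curve needs predicted scores or probabilities rather than hard labels. Here is a minimal sketch with made-up probabilities (y_scores is a hypothetical array):

```python
from sklearn.metrics import roc_curve

y_true   = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_scores = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3, 0.2, 0.45]

# Each threshold contributes one (FPR, TPR) point on the curve
fpr, tpr, thresholds = roc_curve(y_true, y_scores)
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:.2f}  FPR={f:.2f}  TPR={t:.2f}")
```

Plotting tpr against fpr traces the curve; a model no better than chance hugs the diagonal from (0, 0) to (1, 1).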
11. Area Under the ROC Curve (AUC-ROC): AUC-ROC quantifies the overall performance of a classification model. It represents the probability that a randomly chosen positive instance will be ranked higher than a randomly chosen negative instance. A higher AUC-ROC indicates better model performance.
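With the same hypothetical scores, roc_auc_score summarizes the curve as a single number:

```python
from sklearn.metrics import roc_auc_score

y_true   = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_scores = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3, 0.2, 0.45]

# The probability that a random positive is scored above a random negative
print(roc_auc_score(y_true, y_scores))  # 0.92 for these made-up scores
```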
Together, these concepts provide complementary views of a Machine Learning model's performance: how often it is right overall, which kinds of errors it makes, and how it behaves on imbalanced datasets. Understanding them aids in selecting appropriate evaluation metrics and in optimizing model performance.