APEX · Body Performance Intelligence

Participant Profile

11 inputs required

Simulation mode: approximated RF decision boundaries

01 — Demographics

Age yrs

35

Gender

02 — Body Composition

Height cm

170

Weight kg

70

Body Fat %

20

BMI: 24.2 (Normal)
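The BMI readout above follows from the height and weight sliders. A minimal sketch of that calculation is shown here; the function names are illustrative, and the category cut-offs are assumed to be the standard WHO adult bands, which the dashboard does not state.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index: weight in kg divided by height in metres squared."""
    height_m = height_cm / 100
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    # Assumption: standard WHO adult bands; the dashboard may use others.
    if value < 18.5:
        return "Underweight"
    if value < 25:
        return "Normal"
    if value < 30:
        return "Overweight"
    return "Obese"


value = bmi(70, 170)
print(f"BMI: {value:.1f} ({bmi_category(value)})")  # BMI: 24.2 (Normal)
```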

03 — Cardiovascular

Diastolic BP mmHg

80

Systolic BP mmHg

120

04 — Fitness Test Results

Grip Force kg

35

Sit & Bend cm

20

Sit-ups reps

35

Broad Jump cm

160

AWAITING INPUT

Adjust sliders and click Analyze

Random Forest Prediction

— cm

Linear Regression Prediction

— cm

Age/Gender Benchmark: —

Performance Profile Radar

Model Performance

Leaderboard

#	Model	Accuracy	Precision	Recall	F1	CV Mean ± Std
1	Random Forest	74.26%	74.71%	74.26%	74.12%	73.32% ± 0.78
2	Neural Network (MLP)	74.06%	75.19%	74.06%	74.16%	74.00% ± 1.12
3	SVM (RBF Kernel)	71.62%	72.12%	71.62%	71.57%	70.90% ± 0.69
4	Decision Tree	65.17%	66.79%	65.17%	65.32%	64.75% ± 0.80
5	Logistic Regression	62.36%	62.05%	62.36%	61.97%	61.67% ± 0.83
6	KNN (k=11)	61.84%	63.81%	61.84%	61.91%	62.08% ± 0.72
R	Linear Regression (OLS)	77.88% (R²)	N/A	N/A	N/A	RMSE 18.80

Split Stability Analysis

80/20 · 70/30 · 50/50

Accuracy Across Train/Test Splits ALL 6 MODELS · 3 RATIOS

Grouped bar chart — each cluster is one split ratio · Y-axis starts at 55% for readability

Most Stable: RF. Accuracy shifts by only Δ 1.90% from the 80/20 to the 50/50 split, and RF remains the top performer regardless of data volume.

70/30 Is Best Split. All 6 models peak or improve at 70/30 relative to 80/20, making it the best of the three ratios tested on this dataset.

KNN Most Sensitive. KNN degrades consistently as training data shrinks, as expected for a distance-based method that relies on data volume.

K-Fold Cross-Validation

5-Fold · Generalisation

CV Mean ± Std Dev 5-Fold · Full Dataset

Lower std = more consistent across folds · RF's 5-fold result (73.32% ± 0.78) indicates its ranking is not the product of a lucky split

#	Model	CV Mean	± Std Dev	Performance Bar

Finding: SVM has the lowest std (±0.69%) — most consistent across folds. Neural Network has the highest std (±1.12%) — sensitive to fold composition. Random Forest achieves 73.32% ± 0.78% CV, confirming its robustness over any single split result.
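The consistency ranking in this finding can be reproduced from the leaderboard's CV column. The sketch below copies the mean and standard deviation values from the leaderboard; the `summarise` helper and its fold scores are hypothetical, and the choice of population standard deviation is an assumption because the dashboard does not say which estimator it uses.

```python
from statistics import fmean, pstdev

# (CV mean %, CV std) per model, copied from the leaderboard.
cv_results = {
    "Random Forest": (73.32, 0.78),
    "Neural Network (MLP)": (74.00, 1.12),
    "SVM (RBF Kernel)": (70.90, 0.69),
    "Decision Tree": (64.75, 0.80),
    "Logistic Regression": (61.67, 0.83),
    "KNN (k=11)": (62.08, 0.72),
}

# Lowest std = most consistent across folds; highest = most fold-sensitive.
most_consistent = min(cv_results, key=lambda m: cv_results[m][1])
least_consistent = max(cv_results, key=lambda m: cv_results[m][1])
print(most_consistent, least_consistent)


def summarise(fold_scores):
    """Mean and population std of per-fold accuracies (assumed estimator)."""
    return fmean(fold_scores), pstdev(fold_scores)


# Hypothetical fold accuracies, for illustration only.
mean_acc, std_acc = summarise([73.0, 74.0, 72.5, 74.5, 73.0])
print(f"{mean_acc:.2f}% ± {std_acc:.2f}")
```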

Classification Models

Detailed Analysis

Random Forest

Best overall: a bagging ensemble of 200 decision trees.

Accuracy: 74.26%

F1 Score: 74.12%

✓ Highest accuracy; robust to outliers and less prone to overfitting than a single tree.

✓ Most stable across 80/20 & 50/50 splits.

✕ Less interpretable than a single tree.

Neural Network

Multi-Layer Perceptron (128, 64) modeling highly non-linear boundaries.

Accuracy: 74.06%

F1 Score: 74.16%

✓ Top-tier accuracy on non-linear class boundaries.

✕ Sensitive to data volume; drops significantly on the 50/50 split.

✕ Least interpretable (black-box).

Support Vector Machine

Hyperplane separation using the RBF (Gaussian) kernel.

Accuracy: 71.62%

F1 Score: 71.57%

✓ Very effective in high-dimensional feature spaces.

✓ Strong non-linear capture with C=10.

✕ Requires feature scaling, because the RBF kernel depends on distances between samples.
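The scaling requirement matters because a feature measured in large units (such as broad jump in cm) would otherwise dominate any distance-based kernel. A minimal sketch of z-score standardization follows; the sample heights are hypothetical, and the use of population standard deviation mirrors the common StandardScaler convention rather than anything stated in the report.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


heights_cm = [160.0, 170.0, 180.0]  # hypothetical sample
scaled = standardize(heights_cm)
print([round(z, 3) for z in scaled])
```

In practice the mean and standard deviation would be estimated on the training split only and then reused on the test split, so that no test-set statistics leak into training.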

Logistic Regression

Linear multinomial classification optimized via L-BFGS.

Accuracy: 62.36%

F1 Score: 61.97%

✓ Highly interpretable through its feature coefficients.

✓ Outputs reasonably well-calibrated class probabilities.

✕ Cannot model non-linear class separations.

Decision Tree

Recursive partitioning with splits that maximize node purity (Gini).

Accuracy: 65.17%

F1 Score: 65.32%

✓ Fully interpretable branching; no feature scaling required.

✕ High variance; prone to overfitting when grown deeper than max_depth=8.
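The Gini criterion named on this card can be sketched in a few lines. The helper names below are illustrative and the labels are toy data; the formula itself (one minus the sum of squared class proportions) is the standard definition.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 - sum of squared class proportions (0 = pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(left, right):
    """Size-weighted Gini impurity of a candidate split; lower is better."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


print(gini_impurity(["A", "A", "B", "B"]))     # 0.5
print(split_impurity(["A", "A"], ["B", "B"]))  # 0.0
```

A tree picks, at each node, the feature threshold whose split impurity is lowest, which is why a perfectly separating split scores 0.0 here.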

K-Nearest Neighbors

Instance-based learner computing Minkowski distances (k=11).

Accuracy: 61.84%

F1 Score: 61.91%

✓ Simple lazy learner; training only stores the data.

✕ Lowest accuracy overall, likely because distances become less informative across 11 features.

✕ Drops sharply on smaller data splits.
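To make the k-nearest-neighbours mechanism concrete, here is a small pure-Python sketch. The training points are hypothetical two-feature examples, and `p=2` (Euclidean distance) is an assumption, since the card gives only the Minkowski family and k=11.

```python
from collections import Counter


def minkowski(a, b, p=2):
    """Minkowski distance; p=2 is Euclidean, p=1 is Manhattan."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def knn_predict(train, query, k=11, p=2):
    """Majority vote among the k training points closest to the query."""
    neighbours = sorted(train, key=lambda row: minkowski(row[0], query, p))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical, already-scaled two-feature examples.
train = [
    ((0.0, 0.0), "A"), ((0.0, 1.0), "A"), ((1.0, 0.0), "A"),
    ((5.0, 5.0), "D"), ((5.0, 6.0), "D"), ((6.0, 5.0), "D"),
]
print(knn_predict(train, (0.5, 0.5), k=3))  # A
print(knn_predict(train, (5.5, 5.5), k=3))  # D
```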

Confusion Matrix & Error Analysis

Random Forest · 70/30 Split

RF Performance by Class True vs Predicted

Interactive 4×4 grid revealing specific misclassification boundaries

		Predicted Class
		A	B	C	D
True Class	A	882 87.8%	94	22	6
	B	136	603 60.1%	240	25
	C	35	175	660 65.7%	135
	D	8	24	170	802 79.9%

High B↔C Confusion
The most common error is confusing Class B with Class C (240 + 175 = 415 errors). These intermediate fitness levels have heavily overlapping feature distributions, which makes them hard to separate.

Excellent Extreme Accuracy
The model rarely makes catastrophic errors. Only 6 Class A participants were misclassified as D, and 8 D's as A. Extreme fitness tiers are highly distinct.
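The percentages on the diagonal and the error counts quoted above can be recomputed from the matrix itself. The sketch below copies the 4×4 counts from the table (rows are true classes, columns are predicted classes); the variable names are illustrative.

```python
classes = ["A", "B", "C", "D"]

# Rows = true class, columns = predicted class, copied from the table.
matrix = [
    [882, 94, 22, 6],
    [136, 603, 240, 25],
    [35, 175, 660, 135],
    [8, 24, 170, 802],
]

# Per-class recall: correct predictions divided by the row total.
recall = {c: matrix[i][i] / sum(matrix[i]) for i, c in enumerate(classes)}
for c in classes:
    print(f"{c}: {recall[c]:.1%}")  # A: 87.8%, B: 60.1%, C: 65.7%, D: 79.9%

# B predicted as C plus C predicted as B.
b_c_errors = matrix[1][2] + matrix[2][1]
# A predicted as D plus D predicted as A.
extreme_errors = matrix[0][3] + matrix[3][0]
print(b_c_errors, extreme_errors)  # 415 14
```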

Per-Class F1 Scores

Class	F1	Rating
A	0.78	Best
B	0.63	Lowest
C	0.69	Moderate
D	0.86	Outstanding

Regression Models

Broad Jump Prediction

RF Regressor

Ensemble averaging across 200 jump-predicting trees.

R² Score: 0.7842

RMSE / MAE: 18.57 / 13.82 cm

✓ Best fit; captures interactions between fitness metrics.

✓ Resilient to outliers.

Neural Network Regressor

MLP (128, 64) non-linear regression for jump prediction.

R² Score: 0.7837

RMSE / MAE: 18.59 / 13.88 cm

✓ Scores close to Linear Regression (R² 0.7837 vs 0.7788), consistent with a largely linear association between the inputs and jump distance.

✕ Does not fully capture explosive peak-performance thresholds.

SVR (RBF Kernel)

Support Vector Regression with an ε-insensitive tube (ε=0.1).

R² Score: 0.7796

RMSE / MAE: 18.76 / 13.92 cm

✓ Produces smooth, continuous predictions, unlike decision trees.

✕ Highly sensitive to the choice of hyperparameter C.
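The ε-insensitive tube means that errors smaller than ε cost nothing during training. A minimal sketch of that loss is below; the function name is illustrative, and the example targets are hypothetical values in whatever units the target was trained in (the report does not say whether the jump distance was scaled).

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Zero inside the tube of half-width epsilon, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


print(epsilon_insensitive_loss(1.00, 1.05))  # inside the tube: 0.0
print(epsilon_insensitive_loss(1.00, 1.50))  # outside the tube: about 0.4
```

The C hyperparameter then controls how heavily the losses outside the tube are penalized relative to model smoothness, which is why the fit is sensitive to it.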

Linear Regression

Ordinary Least Squares (OLS) identifying linear variable associations.

R² Score: 0.7788

RMSE / MAE: 18.80 / 14.12 cm

✓ Near-instant training; coefficients show each feature's global effect directly.

✕ Assumes linearity where fitness datasets often show non-linearities.
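For reference, the three metrics reported on the regression cards can be computed as follows. This is a sketch using their standard definitions; the jump distances in the example are hypothetical and are not taken from the dataset.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return (R², RMSE, MAE) for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return r2, rmse, mae


# Hypothetical broad-jump distances in cm.
y_true = [150.0, 160.0, 170.0, 180.0]
y_pred = [155.0, 160.0, 165.0, 180.0]
r2, rmse, mae = regression_metrics(y_true, y_pred)
print(f"R² = {r2:.4f}, RMSE = {rmse:.2f} cm, MAE = {mae:.2f} cm")
```

RMSE penalizes large misses more than MAE does, so the gap between the two (for example 18.57 vs 13.82 cm for the RF Regressor) reflects how much of the error comes from a few large residuals.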

Feature Importance

Permutation Importance

Top Predictors RF · Permutation

Feature	Permutation Importance
sit_and_bend_forward_cm	0.258
sit-ups counts	0.231
age	0.132
weight_kg	0.071
body_fat_%	0.058
gripForce	0.050

Lower Predictors Ranked 7–11

Feature	Permutation Importance
gender	0.050
broad_jump_cm	0.028
height_cm	0.010
systolic	0.006
diastolic	−0.002

Key Finding: Flexibility (sit-and-bend) and core endurance (sit-ups) have a combined permutation importance of 0.489, about 55% of the importance summed over all 11 features, far ahead of body composition metrics.
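The arithmetic behind the key finding can be checked directly from the listed scores. The sketch below copies the eleven permutation importances from the two lists above and computes both the combined score of the top two features and their share of the summed importance.

```python
# Permutation importances copied from the Top and Lower Predictors lists.
importances = {
    "sit_and_bend_forward_cm": 0.258,
    "sit-ups counts": 0.231,
    "age": 0.132,
    "weight_kg": 0.071,
    "body_fat_%": 0.058,
    "gripForce": 0.050,
    "gender": 0.050,
    "broad_jump_cm": 0.028,
    "height_cm": 0.010,
    "systolic": 0.006,
    "diastolic": -0.002,
}

total = sum(importances.values())
top_two = importances["sit_and_bend_forward_cm"] + importances["sit-ups counts"]
share = top_two / total

print(f"combined importance: {top_two:.3f}")      # 0.489
print(f"summed importance:   {total:.3f}")        # 0.892
print(f"share of the sum:    {share:.1%}")        # 54.8%
```

Permutation importances are not normalized to sum to 1, so the raw combined score (0.489) and the share of the total (54.8%) are different quantities.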

Dataset Overview

Body Performance

13,393 Total Records

12 Columns (11 features + class target)

11 Input Features

4 Performance Classes

0 Missing Values

~3,348 Records per Class

ML Pipeline

End-to-End

1. Load: CSV · 13,393 rows
2. Audit: Physiological Laws
3. EDA: Bivariate Analysis
4. Prep: IQR Capping
5. Split: 80/20 · 70/30 · 50/50
6. Train: 6 classifiers · 3 regressors
7. Evaluate: Gini · Permutation
8. Deploy: APEX Dashboard
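The IQR capping used in the Prep stage can be sketched as follows. The 1.5 × IQR fence multiplier is the conventional choice and is an assumption here, as the pipeline summary does not give it; the sample values are hypothetical, and the quartile method is Python's default, which may differ slightly from the one used in the notebook.

```python
from statistics import quantiles


def iqr_cap(values, k=1.5):
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR] instead of dropping them."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, lower), upper) for v in values]


# Hypothetical measurements with one extreme outlier.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(iqr_cap(sample))  # the 100 is pulled down to the upper fence, 15.0
```

Capping keeps every record in the dataset, which preserves the class balance, while limiting the leverage of extreme readings on scale-sensitive models.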

Data Quality Audit

Physiological Laws

Blood Pressure Constraint

Enforced the physiological law Systolic > Diastolic. Records that violated it, where the diastolic (resting) reading was not below the systolic (beating) reading, were flagged as illogical and removed to ensure data integrity.
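A minimal sketch of this constraint is given below. It uses the `systolic` and `diastolic` column names from the schema; the function name and the example records are hypothetical.

```python
def enforce_bp_law(records):
    """Split records into those that satisfy systolic > diastolic and the rest."""
    valid = [r for r in records if r["systolic"] > r["diastolic"]]
    flagged = [r for r in records if not r["systolic"] > r["diastolic"]]
    return valid, flagged


# Hypothetical records; the second one violates the constraint.
records = [
    {"age": 35, "systolic": 120, "diastolic": 80},
    {"age": 41, "systolic": 70, "diastolic": 95},
    {"age": 29, "systolic": 110, "diastolic": 70},
]
valid, flagged = enforce_bp_law(records)
print(len(valid), len(flagged))  # 2 1
```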

Duplicate Rectification

Identified and purged exact row duplicates. This prevents "Data Leakage" where the model might "memorize" identical participants across training and testing splits, artificially inflating accuracy.
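Exact-duplicate removal can be sketched as below. The rows are hypothetical tuples in schema column order (shortened for brevity), and the function name is illustrative. The key point is that deduplication runs before the train/test split, so an identical row can never land on both sides.

```python
def drop_exact_duplicates(rows):
    """Keep the first occurrence of each row, preserving the original order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical rows: (age, gender, height_cm, weight_kg, class).
rows = [
    (35, "M", 170.0, 70.0, "B"),
    (28, "F", 162.0, 55.0, "A"),
    (35, "M", 170.0, 70.0, "B"),  # exact duplicate of the first row
]
deduped = drop_exact_duplicates(rows)
print(len(rows), "->", len(deduped))  # 3 -> 2
```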

Column Definitions

Schema

Column	Type	Description	Valid Range	ML Role
age	INT	Participant age in years	18 – 80	Feature
gender	CAT	Biological sex — M or F	M / F	Feature (encoded)
height_cm	FLOAT	Standing height in centimetres	100 – 220	Feature
weight_kg	FLOAT	Body weight in kilograms	20 – 250	Feature
body fat_%	FLOAT	Body fat percentage	3 – 65%	Feature
diastolic	INT	Diastolic blood pressure	40 – 130 mmHg	Feature
systolic	INT	Systolic blood pressure	70 – 200 mmHg	Feature
gripForce	FLOAT	Hand grip strength	0 – 70 kg	Feature (high importance)
sit_and_bend_forward_cm	FLOAT	Flexibility: sit-and-reach test	−25 – 200 cm	Feature (top importance)
sit-ups counts	INT	Number of sit-ups completed	0 – 80	Feature (high importance)
broad jump_cm	FLOAT	Standing broad jump distance	50 – 300 cm	Feature + Regression Target
class	CAT	Performance band — A (best) to D (worst)	A / B / C / D	Classification Target

Executive Summary

Final Report

03_Gharieb_Team
Body Performance
Final Analytics

The Gharieb Team from the Military Technical College presents a body performance intelligence system. Our methodology follows a 5-stage pipeline (Data Cleaning, EDA, Multi-Split Modeling, Cross-Validation, and Production Deployment) to classify fitness tiers (A–D).

13,393 Records Analyzed

6 Classifiers Trained

3 Regression Models

74.26% Best Accuracy (RF)

Through statistical auditing (physiological BP laws, IQR outlier capping, and permutation importance analysis), we built a classifier that captures the non-linear relationship between physical metrics and performance grades.

Key System Insights

Feature Analysis

#1

Body Composition Dominance

Predictor: FAT %

Individuals in Class 'A' consistently exhibit significantly lower body fat, which makes fat percentage a clear marker of the top fitness grade. In the Random Forest permutation ranking, however, it places fifth (0.058), behind sit-and-bend, sit-ups, age, and weight.

#2

Strength-Agility Synergy

Indicators: JUMP / GRIP

Our analytics show that broad jump distance, sit-ups, and grip force are strongly positively correlated. A high score in one usually signals high scores in the others, consistent with a shared strength and power factor.

#3

Gender Threshold Scaling

Condition: M/F Balanced

Distributions across Classes A–D are closely balanced between genders. This suggests that the grading criteria scale with sex-specific standards, so male and female participants are classified on comparable terms.

#4

Blood Pressure Indifference

Predictor: Low Corr.

While vital for health monitoring, systolic and diastolic blood pressure showed minimal correlation with performance class (permutation importance 0.006 and −0.002). In this dataset, resting blood pressure adds little predictive signal for fitness grade.

Methodology & Results

Performance

Random Forest Classifier

74.26%

Classification Accuracy

Precision: 74.71%

Recall: 74.26%

F1 Score: 74.12%

Training Split: 70 / 30

Cross-Validation: 5-fold

Best Classifier

RF Regressor (Jump)

0.7842

R² Score

Task: Broad Jump (cm)

RMSE: 18.57 cm

MAE: 13.82 cm

Best Regressor

Methodology

Gharieb 5S

Audit Pipeline

1. Audit: BP logic + Duplicates

2. EDA: Bivariate Correlation

3. Prep: Standard Scaler + IQR

4. Model: Grid Search + 5-Fold

5. Deploy: Interactive Dashboard

5-Stage Flow

Strategic Roadmap

Future Work

Ensemble Stacking

Combine RF, SVM, and MLP into a single meta-model to minimize residual variance.

SHAP Integration

Add Shapley-value (game-theoretic) explanations to give a local rationale for every prediction.

Deep Regression

Use Keras/PyTorch ANN architectures to push Broad Jump R² beyond the current best of 0.7842.

Grid Search Pro

Execute exhaustive hyperparameter optimization across all split ratios simultaneously.
