I am an experienced educator, researcher, and designer with a passion for machine learning. I hold a Bachelor's degree in Manufacturing Engineering and a Master of Applied Science in Mechanical Engineering from the University of Ontario Institute of Technology. My background in both software and mechanical engineering lets me approach problem-solving from a unique perspective. Outside of my technical work, I enjoy discussing historical and geopolitical topics, working out, and spending time with family and friends.
-An unsupervised learning project that categorizes people based on their fitness data.
-The K-Means clustering algorithm is used to group individuals based on features that include age, gender, height, weight, and many more.
-An extensive EDA concluded that the data may be grouped into 10 clusters. PCA was then applied to make the clusters easier to visualize.
-PCA with two principal components was used for data visualization. The two components explained 65% of the variance.
-With PCA the data appeared to be easily separable, which suggests the data is suitable for future supervised models.
-PCA was able to separate the data based on overall performance and gender.
-Technologies Utilized:
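To illustrate what the "explained variance" figure in the fitness clustering project refers to, here is a minimal sketch. It is not the project's code: it assumes only two made-up features (the height and weight values are invented), so the covariance matrix is 2x2 and its eigenvalues have a closed form. With many features, a library PCA implementation would normally be used instead.

```python
import math

def explained_variance_ratio_2d(xs, ys):
    """Return the share of total variance captured by each principal
    component of a two-feature dataset (largest component first)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance matrix entries [[a, b], [b, c]].
    a = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    c = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (a + c) / 2
    spread = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    eigenvalues = [half_trace + spread, half_trace - spread]
    total = a + c
    return [ev / total for ev in eigenvalues]

# Invented toy data, for illustration only.
heights_cm = [160.0, 165.0, 170.0, 175.0, 180.0, 185.0]
weights_kg = [55.0, 62.0, 64.0, 72.0, 75.0, 83.0]

ratios = explained_variance_ratio_2d(heights_cm, weights_kg)
print([round(r, 3) for r in ratios])
```

Because the two toy features are strongly correlated, almost all of the variance falls on the first component. The two ratios always sum to 1.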
-An unsupervised learning project that categorizes 167 countries to identify which countries are most in need of help from NGOs.
-The K-Means clustering algorithm is used to group the countries based on each country's child mortality rate per 1000 births, exports, health spending, imports, income, inflation, life expectancy, total fertility, and GDP per person.
-An extensive EDA concluded that countries in need have a high child mortality rate, high fertility, high inflation, low income, and low GDP per person. Based on the "Elbow Method", the optimum values of K (i.e., the number of clusters) were 9 and 16.
-Technologies Utilized:
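The "Elbow Method" mentioned for the country clustering project can be sketched as follows. This is a simplified illustration, not the project's code: it uses a basic pure-Python version of Lloyd's k-means, and the data points are invented (child mortality, income) pairs. The idea is to compute the inertia (within-cluster sum of squared distances) for several values of K and look for the point where adding clusters stops helping much.

```python
import random

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_inertia(points, k, iterations=20, seed=0):
    """Run a basic Lloyd's k-means and return the final inertia
    (sum of squared distances from each point to its nearest centroid)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: group points by nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)

# Invented (child mortality, income in thousands) pairs in three loose groups.
toy_points = [
    (5.0, 40.0), (6.0, 42.0), (4.0, 38.0),
    (40.0, 10.0), (42.0, 12.0), (38.0, 9.0),
    (90.0, 1.0), (95.0, 2.0), (88.0, 1.5),
]

inertias = {k: kmeans_inertia(toy_points, k) for k in range(1, 6)}
for k, value in inertias.items():
    print(k, round(value, 2))
```

Plotting these inertia values against K and looking for the bend in the curve is the usual way to read off a candidate K. In practice the features would also typically be scaled first, since they are measured in very different units.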
-An NLP project that predicts whether an individual is potentially suffering from psychological stress, based on the text they input.
-The ML algorithms are trained on historical data from multiple subreddits.
-The goal of the analysis is to reduce the number of people who have psychological stress but whom the algorithm predicts otherwise (i.e., reduce false negatives).
-Three algorithms were utilized for this NLP project: Logistic Regression, Naive Bayes (Multinomial), and Naive Bayes (Bernoulli).
-In terms of accuracy, all algorithms performed very similarly, in the mid-70% range. However, when it came to recall, Naive Bayes (Bernoulli) was the best performer.
-Based on the results, the best performing algorithm is Bernoulli Naive Bayes with an alpha of 3, because it produced a low number of false negatives and an acceptable number of true positives. Even though Logistic Regression with recall as the metric produced 0 false negatives, it resulted in a very low number of true positives.
-Technologies Utilized:
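The distinction between accuracy and recall in the stress prediction project can be made concrete with a small sketch. The labels and predictions below are invented for illustration and are not the project's results. They show how two models can have the same accuracy while one misses far more positive cases (false negatives) than the other.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 is the positive class."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 0 and predicted == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def accuracy(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)

def recall(y_true, y_pred):
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)

# Invented labels: 1 = stressed, 0 = not stressed.
y_true  = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
model_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]  # misses two stressed people
model_b = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]  # misses one, raises more false alarms

for name, preds in (("model_a", model_a), ("model_b", model_b)):
    print(name, accuracy(y_true, preds), recall(y_true, preds))
```

Both toy models score 70% accuracy, but the second has the higher recall, which is the property that matters when the goal is to reduce false negatives.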
-A supervised regression ML project that predicts the medical insurance charges of patients.
-The features were Age, Sex, BMI, Number of Children, Smoking Status, and Region. Based on an extensive EDA, Age and Smoking Status were the most influential features on the charges label.
-The following algorithms were used to predict the medical insurance charges: Linear Regression, Polynomial Regression, KNN, SVM, Decision Trees, and Random Forest.
-GridSearchCV was utilized during the training of each algorithm to ensure the optimum hyperparameters were selected to minimize the RMSE.
-The top performing algorithm was Decision Trees with gradient boosting, with an RMSE of 33.86%.
-Technologies Utilized:
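For reference, the RMSE metric used in the insurance charges project can be computed as in the sketch below. The charge values are invented. The project reports its RMSE as a percentage without stating what it is normalized by, so the `rmse_percent_of_mean` helper is only one possible reading (RMSE divided by the mean of the actual values) and should be treated as an assumption.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def rmse_percent_of_mean(actual, predicted):
    """Hypothetical normalization: RMSE as a percentage of the mean actual value."""
    mean_actual = sum(actual) / len(actual)
    return 100.0 * rmse(actual, predicted) / mean_actual

# Invented charges, for illustration only.
actual_charges = [1000.0, 2000.0, 3000.0, 4000.0]
predicted_charges = [1100.0, 1900.0, 3200.0, 3800.0]

print(round(rmse(actual_charges, predicted_charges), 2))
print(round(rmse_percent_of_mean(actual_charges, predicted_charges), 2))
```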
-A supervised classification ML project that predicts whether an individual will have a stroke.
-The goal of the analysis is to decrease the false negatives. Therefore, recall was chosen as the metric to optimize during the GridSearchCV of each algorithm. Multiple ML algorithms were utilized, including Logistic Regression, KNN, SVM, Decision Trees, and Random Forests.
-Age seemed to be the most influential parameter in predicting whether a person will have a stroke. In conclusion, the top performing algorithms, with the lowest false negatives, were Decision Trees with AdaBoost and SVM.
-Technologies Utilized:
-Utilized the Requests and Beautiful Soup libraries of Python to scrape the prices of used vehicles from the website of a major automotive dealership in Mississauga, Ontario.
-The web application scraped the data from every web page. The data was then stored in a CSV file and in PostgreSQL.
-Technologies Utilized:
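The parse-then-export flow of the scraping project can be sketched roughly as below. To keep the example self-contained, it swaps in Python's built-in `html.parser` and `csv` modules instead of Requests and Beautiful Soup, and it parses an inline HTML string rather than a live page. The markup, class names, and vehicle listings are all hypothetical; the real site's structure is not known from this description.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical listing markup, invented for illustration.
SAMPLE_HTML = """
<div class="vehicle"><span class="title">2015 Honda Civic</span><span class="price">$12,500</span></div>
<div class="vehicle"><span class="title">2018 Toyota Corolla</span><span class="price">$17,900</span></div>
"""

class VehicleParser(HTMLParser):
    """Collect the title and price text from each vehicle block."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None
        self._current = {}

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in ("title", "price"):
            self._field = css_class

    def handle_data(self, data):
        if self._field:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "div" and self._current:
            self.rows.append(self._current)
            self._current = {}

def parse_vehicles(html):
    parser = VehicleParser()
    parser.feed(html)
    parser.close()
    # Convert "$12,500" style strings to integers.
    return [
        {"title": r["title"], "price": int(r["price"].replace("$", "").replace(",", ""))}
        for r in parser.rows
    ]

def to_csv(rows):
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "price"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

vehicles = parse_vehicles(SAMPLE_HTML)
csv_text = to_csv(vehicles)
print(csv_text)
```

The same rows could be inserted into a PostgreSQL table instead of, or in addition to, being written as CSV.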
-Developed a command line application using Python that replicates the game Blackjack (21).
-Technologies Utilized:
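One core piece of any Blackjack implementation is scoring a hand, where an Ace counts as 11 unless that would push the total over 21. The sketch below shows one common way to do this. The function and constant names are illustrative and are not taken from the project.

```python
CARD_VALUES = {
    "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9,
    "10": 10, "J": 10, "Q": 10, "K": 10, "A": 11,
}

def hand_value(cards):
    """Return the best Blackjack total for a list of card ranks."""
    total = sum(CARD_VALUES[card] for card in cards)
    aces = cards.count("A")
    # Downgrade Aces from 11 to 1 while the hand would otherwise bust.
    while total > 21 and aces:
        total -= 10
        aces -= 1
    return total

print(hand_value(["A", "K"]))       # Ace counted as 11
print(hand_value(["A", "A", "9"]))  # one Ace counted as 1
print(hand_value(["K", "Q", "5"]))  # bust, no Aces to downgrade
```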
-Developed a command line application using Python that replicates the game Tic Tac Toe.
-Technologies Utilized:
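For a Tic Tac Toe game, the win check is the central piece of logic. A minimal sketch, assuming the board is stored as a flat list of nine cells holding "X", "O", or a space (the representation and names are assumptions, not the project's actual design):

```python
# Index triples for every row, column, and diagonal on a 3x3 board.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

def winner(board):
    """Return "X" or "O" if that player has three in a line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

board = [
    "X", "O", "O",
    " ", "X", " ",
    " ", " ", "X",
]
print(winner(board))
```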
-Utilized HTML, CSS, and JS to create an interactive animal sounds web application.
-The web application is able to play animal sounds when the user either clicks on an animal's image or types the animal's name. The page is live on GitHub and can be found by clicking this link.
-Technologies Utilized:
-Utilized HTML, CSS, and JS to create an interactive dice game.
-The web application lets two players play against each other. The page is live on GitHub and can be found by clicking this link.
-Technologies Utilized:
-Utilized HTML and CSS to create a single-page front-end web application.
-The Bootstrap CSS library was utilized to create a navbar, a carousel, and many other features. The page is live on GitHub and can be found by clicking this link.
-Technologies Utilized: