A/B testing

Will changing the website layout attract more users?

In this project, A/B testing is applied to answer whether changing the layout of the course description on an e-learning platform will attract new users. Three metrics are used in this analysis: enrollment rate, average classroom time, and completion rate. A Type I error rate (alpha) of 0.05 is used. To combine the results of the three tests (one per metric) while keeping the overall Type I error rate at 0.05, the Bonferroni correction is applied.
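A minimal sketch of how a Bonferroni correction can be applied across the three metrics. The function names and the p-values below are hypothetical and only illustrate the mechanics; the pooled two-proportion z-test is one common way to test a rate metric and is an assumption here, not necessarily the test used in the project.

```python
import math


def two_proportion_p_value(success_a, n_a, success_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test (illustrative helper)."""
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (success_b / n_b - success_a / n_a) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


def bonferroni_decisions(p_values, alpha=0.05):
    """Compare each p-value against alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    decisions = {name: p < threshold for name, p in p_values.items()}
    return threshold, decisions


# Hypothetical p-values, one per metric (not the project's results).
p_values = {
    "enrollment_rate": 0.03,
    "avg_classroom_time": 0.01,
    "completion_rate": 0.04,
}
threshold, decisions = bonferroni_decisions(p_values, alpha=0.05)
print(f"per-test threshold: {threshold:.4f}")
for metric, significant in decisions.items():
    print(metric, "significant" if significant else "not significant")
```

With three tests, each p-value is compared against 0.05 / 3 rather than 0.05, so a metric that looks significant on its own may no longer pass after the correction.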

keywords: A/B testing, p-value, Alpha, Bonferroni correction, data Analytics

Code

Risk modelling using tree-based models

Predicting whether a patient will die within 10 years

In this project, patients' medical data such as age, blood pressure, sedimentation rate, and race are analyzed, visualized, and used to build decision tree (DT) and random forest (RF) models. Several techniques for dealing with missing values are compared: complete-case analysis, mean imputation, and iterative imputation. All model variations are evaluated using the c-index.
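For a binary outcome, the c-index can be read as the fraction of patient pairs with different outcomes in which the patient who died received the higher risk score, with ties counted as half. The sketch below assumes that definition; the function name and the toy labels and scores are hypothetical and are not taken from the project's data.

```python
def c_index(y_true, risk_scores):
    """Concordance index for binary outcomes (1 = event, 0 = no event)."""
    concordant = 0.0
    permissible = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            # Only pairs with different outcomes can be ranked.
            if y_true[i] == y_true[j]:
                continue
            permissible += 1
            pos, neg = (i, j) if y_true[i] == 1 else (j, i)
            if risk_scores[pos] > risk_scores[neg]:
                concordant += 1.0
            elif risk_scores[pos] == risk_scores[neg]:
                concordant += 0.5  # tied scores count as half
    if permissible == 0:
        raise ValueError("need at least one pair with different outcomes")
    return concordant / permissible


# Toy example: 4 patients, 2 events.
labels = [1, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.6]
print(c_index(labels, scores))  # 3 of 4 permissible pairs are concordant: 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which makes the metric convenient for comparing models trained with different imputation strategies.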

keywords: risk-model, DT, RF, imputation, c-index, SHAP

Code

Democracy, unemployment, and other things

Working on Gapminder datasets to draw conclusions

Gapminder is a great global project that aims to gather datasets on literally everything. In this project, I use datasets on the unemployment rate, the democracy index, and the number of internet users to answer questions and understand trends.

keywords: pandas, numpy, matplotlib, ILO, Gapminder

Simpson's paradox

When numbers lie!

In this project, Simpson's paradox is explored. It is a phenomenon in probability and statistics in which a trend appears in several different groups of data but disappears or reverses when these groups are combined. The dataset used is a university admissions dataset in which female and male admission rates are compared at different levels (overall rate vs. major-specific rate), and the results change from one level to another.
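The reversal can be reproduced with a small sketch. The counts below are made up purely for illustration (they are not the project's admissions data), and the major names "A" and "B" are hypothetical.

```python
# Hypothetical (admitted, applied) counts per major and gender.
applications = {
    "A": {"male": (80, 100), "female": (9, 10)},
    "B": {"male": (2, 10), "female": (30, 100)},
}

# Admission rate within each major.
per_major = {
    major: {gender: admitted / applied
            for gender, (admitted, applied) in groups.items()}
    for major, groups in applications.items()
}

# Admission rate after pooling both majors.
overall = {}
for gender in ("male", "female"):
    admitted = sum(applications[m][gender][0] for m in applications)
    applied = sum(applications[m][gender][1] for m in applications)
    overall[gender] = admitted / applied

for major, rates in per_major.items():
    print(major, {g: round(r, 3) for g, r in rates.items()})
print("overall", {g: round(r, 3) for g, r in overall.items()})
```

In this toy data, women have the higher admission rate in each major, yet the lower rate overall, because most women applied to the major with the lower admission rate.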