Law of Large Numbers demonstration
This application illustrates the Law of Large Numbers. The experiment is based on the expected face value of a tossed six-sided die. Users can change the sample size and observe how the sample mean behaves as the sample size increases.
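The idea can be sketched in a few lines of Python. This is an illustrative simulation, not the application's own code: it assumes a fair die (so the expected face value is 3.5), and the function name `running_mean_of_die_rolls` and the seed are hypothetical choices made for this example.

```python
import random

def running_mean_of_die_rolls(n_rolls, seed=0):
    """Return the sample mean after each of n_rolls tosses of a fair die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    means = []
    for i in range(1, n_rolls + 1):
        total += rng.randint(1, 6)  # one toss: integer from 1 to 6 inclusive
        means.append(total / i)     # sample mean of the first i tosses
    return means

means = running_mean_of_die_rolls(100_000)

# Print the sample mean at a few sample sizes to see it settle near 3.5.
for n in (10, 100, 1_000, 100_000):
    print(n, round(means[n - 1], 3))
```

For small sample sizes the printed mean can sit well away from 3.5; as the sample size grows it should drift towards 3.5 and fluctuate less.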
Central Limit Theorem demonstration
This application illustrates the Central Limit Theorem. Users can observe how the distribution of the sample mean changes as the sample size increases. The population distribution can also be changed, with options for uniform, right-skewed and left-skewed distributions. This illustrates that the Central Limit Theorem holds even when the population distribution is not normal.
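A rough way to reproduce this behaviour outside the application is sketched below. It is an assumption-laden illustration: the right-skewed population is taken to be an exponential distribution with rate 1 (the application's actual choice is not stated), and the helper names `draw_right_skewed`, `sample_means` and `skewness` are hypothetical.

```python
import random
import statistics

def draw_right_skewed(rng):
    """One draw from an exponential(1) population (assumed right-skewed example)."""
    return rng.expovariate(1.0)

def sample_means(draw, sample_size, n_samples, rng):
    """Means of n_samples independent samples, each of size sample_size."""
    return [
        statistics.fmean([draw(rng) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

def skewness(xs):
    """Population skewness: the mean cubed z-score (0 for a symmetric shape)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 3 for x in xs])

rng = random.Random(1)
results = {}
for n in (1, 5, 30, 100):
    means = sample_means(draw_right_skewed, n, 5000, rng)
    # Store the centre, spread and skewness of the sample-mean distribution.
    results[n] = (statistics.fmean(means), statistics.pstdev(means), skewness(means))
    print(n, [round(v, 3) for v in results[n]])
```

As the sample size grows, the spread of the sample means should shrink roughly like 1/sqrt(n) and the skewness should move towards 0, which is the bell-shaped behaviour the theorem describes. Negating the draw would give a left-skewed population, and `rng.random()` a uniform one.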
Effect of Changing Correlation demonstration
This application demonstrates how changing the correlation affects visual plots of the data. Users can change the level of correlation between two variables and study the resulting changes in the scatter plot and the heat map, both of which are commonly used to visualize possible relationships.
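To experiment with the same idea in code, one common construction is shown below. It is a sketch under stated assumptions rather than the application's method: both variables are assumed to be standard normal, and the functions `correlated_pairs` and `pearson` are hypothetical helpers written for this example.

```python
import math
import random

def correlated_pairs(rho, n, rng):
    """Generate n (x, y) pairs of standard normals with target correlation rho."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        z = rng.gauss(0, 1)  # independent noise
        # Mixing x with independent noise gives Corr(x, y) = rho in theory.
        y = rho * x + math.sqrt(1 - rho ** 2) * z
        xs.append(x)
        ys.append(y)
    return xs, ys

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(7)
estimates = {}
for rho in (-0.8, 0.0, 0.8):
    xs, ys = correlated_pairs(rho, 5000, rng)
    estimates[rho] = pearson(xs, ys)
    print(rho, round(estimates[rho], 3))
```

Plotting `xs` against `ys` for each value of `rho` would show the point cloud tightening around a line as the correlation moves away from 0.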
Regression Analysis Tools
This application allows users to upload data files in either .csv or .xlsx format and perform regression analysis. Once the data is uploaded, the dashboard reports descriptive statistics and a correlation table, and produces scatter plots of the numerical data columns. Users can define their regression model of choice under the “Regression” tab. Once a regression model is estimated, residual diagnostic plots are populated. Predictions can also be generated from the estimated regression model and user-supplied predictor values.
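The estimate, residual and predict steps can be sketched for the simplest case of one predictor. This is a minimal illustration using ordinary least squares, which is assumed here because the application's estimation method is not stated; the data values and the names `fit_simple_ols` and `predict` are made up for the example.

```python
def fit_simple_ols(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def predict(intercept, slope, x):
    """Predicted response for a user-supplied predictor value."""
    return intercept + slope * x

# Hypothetical data standing in for two numerical columns of an uploaded file.
xs = [1, 2, 3, 4, 5]
ys = [2.2, 4.1, 6.2, 7.9, 10.1]

intercept, slope = fit_simple_ols(xs, ys)

# Residuals are what the diagnostic plots would display.
residuals = [y - predict(intercept, slope, x) for x, y in zip(xs, ys)]

print("intercept:", round(intercept, 3), "slope:", round(slope, 3))
print("residuals:", [round(r, 3) for r in residuals])
print("prediction at x = 6:", round(predict(intercept, slope, 6), 3))
```

With an intercept in the model the residuals sum to zero up to rounding, so diagnostic plots typically look at their pattern against the fitted values rather than their average.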
Logistic Regression Tools
This application allows users to construct logistic regressions for binary outcomes. The application contains a default demonstration data set, but also allows users to upload their own data files in .csv or .xlsx format. Data exploration features include a tool that allows users to transform specific columns of data, automated descriptive statistics, scatter plots and a correlation table. Users can construct their own logistic regression models, with an option to fill in missing values using the k-nearest neighbours algorithm and an option to split their data into training and test sets for model evaluation. The model evaluation tab provides the hit and miss table with a user-controlled cut-off probability, as well as lift charts. Finally, the prediction tab allows users to view and download the predicted probabilities implied by the model.
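The workflow of splitting the data, fitting the model and building a hit and miss table at a chosen cut-off can be sketched as follows. Everything here is an assumption made for illustration: the data is synthetic, the model has a single predictor, the fit uses plain gradient descent (the application's fitting method is not stated), and the names `fit_logistic` and `hit_miss_table` are hypothetical. Missing-value imputation and lift charts are left out.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit P(y = 1) = sigmoid(b0 + b1 * x) by gradient descent on the log loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y  # gradient of log loss w.r.t. the logit
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

def hit_miss_table(probs, ys, cutoff):
    """Count true/false positives and negatives at the given cut-off probability."""
    table = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for p, y in zip(probs, ys):
        predicted = 1 if p >= cutoff else 0
        if predicted == 1:
            table["TP" if y == 1 else "FP"] += 1
        else:
            table["TN" if y == 0 else "FN"] += 1
    return table

# Synthetic binary outcomes generated from an assumed true model: logit = -1 + 2x.
rng = random.Random(42)
xs = [rng.gauss(0, 1) for _ in range(400)]
ys = [1 if rng.random() < sigmoid(-1 + 2 * x) else 0 for x in xs]

# Training and test split: first 300 rows to fit, last 100 to evaluate.
train_x, test_x = xs[:300], xs[300:]
train_y, test_y = ys[:300], ys[300:]

b0, b1 = fit_logistic(train_x, train_y)
probs = [sigmoid(b0 + b1 * x) for x in test_x]  # predicted probabilities
table = hit_miss_table(probs, test_y, cutoff=0.5)
print(table)
```

Raising the cut-off makes the model predict the positive class less often, trading false positives for false negatives, which is why the cut-off is left under the user's control.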