Machine Learning

Introduction

The purpose of this project is to explore machine learning applications including supervised, unsupervised, and deep learning methodologies and identify trends that cannot be visualized in easily in 2-D and 3-D space.

These projects will be ongoing as optimal machine learning models require trial and error to improve upon.

S&P500 Price, Volume, Volatility Analysis

Unsupervised machine learning methods were used to evaluate clusters within the data set. The methods used for this analysis include principle components analysis (PCA), k-Means elbow curve, and t-SNE for clustering visualization.

Raw daily S&P500 price and volume data was taken from 1928 to 2021. The data was cleaned, formatted, and calculations for daily price changes, average true range (ATR), and relative volume were computed. ATR is a measure of volatility based on price spread fluctuations. Also, columns were added which logged the number of days in a row that price or volatility increased.

The elbow curve derived from k-Means analysis suggests the optimal number of clusters to be 4. The 4 clusters were input to a k-Means prediction model to produce the class labels for the original data set.

After applying the machine learning assigned classes to the original dataset and color coding as shown below, there is no clear predictive ability of the current classification schema to choose good buying and selling opportunities. The first graph shown below is the S&P500 price data from 1928 to 2021. The second graph is S&P500 price data from December 2020 to December 2021.

Recommendations for further improving the model with the aim of identifying high probability buy and sell points are to measure rates of change over time of the prices based on various simple or exponential moving averages. Also, the number of days in a row that price or volatility has gone up can be replaced with other more descriptive meaningful metrics to improve the predictions of the model.

Current analysis completed and published as of December 22nd, 2021