Morgan Price

Applications of Artificial Intelligence Models to Time-Series Data | Systematic Investing

Note:

If you're familiar with the topics and processes and are just interested in the results, scroll to the bottom.


Here's your reason to scroll all the way down:


Introduction:

If that isn't a clickbait title, I don't know what is. Throughout this article I will take you from zero, assuming you have no experience with Artificial Intelligence (AI) models, to being able to develop a fully functioning deep learning model with the ability, if you so choose, to guide your investment decisions.


That may not be anything new. The internet is full of tutorials on just about everything. Here's where I'm going to make things interesting:


This will be fully implemented in Microsoft Excel.


Now you might ask why. You might be about to click out of the article: yes, we have better tools to implement such complicated models, and you might think this is a waste of time. If you're on that side of the issue, then you already know more about implementing these models than the intended audience, and this article is likely pitched below your level. However, if you are unfamiliar with quantitative finance, this is for you.


Now, if you were excited to implement an AI model with your own two hands, this article is for you. The reason I choose to write these with applications to programs like Microsoft Excel isn't because it's the most efficient method. I do it because I know a large number of people don't have the time or the desire to learn a coding language like Python.


Well, here goes: Artificial Intelligence built, as I write this article (wish me luck), right inside an Excel spreadsheet.



Artificial Intelligence:

Who better to define AI for us than an AI model? After all, who am I to say what AI is and isn't, and who better to speak on machine learning methods and applications than the so-called current pinnacle of our machine learning capabilities?



Full conversation with ChatGPT on what we are doing here and how it recommends we proceed.



Me: Hello ChatGPT. I hope you are enjoying your existence so far.

ChatGPT Hello! I don't experience enjoyment or emotions, but I'm here to assist and provide information to the best of my abilities. How can I assist you today?

Me: Could you please, in no more than a paragraph or two, please define what an Artificial Intelligence Model is. Please be precise to support the understanding of the wide variety of viewers of your answer and do your best to ensure nothing is ambiguous.

ChatGPT An artificial intelligence model is a mathematical representation or algorithm designed to simulate human-like cognitive processes, enabling a computer system to perform specific tasks or make decisions based on data and patterns. These models can encompass various techniques, such as machine learning, deep learning, and natural language processing, and are trained on large datasets to learn and improve their performance over time. They serve as the core components of AI systems, allowing them to recognize patterns, make predictions, process language, and solve problems within specific domains.

Me: Thank you. I notice that you included deep learning models in your definition. I agree that they are a subset of Artificial Intelligence Models. Could you please, concisely and clearly, explain what a deep learning model is and give a general understanding of how one works? Thank you in advance.

ChatGPT A deep learning model is a specific type of artificial intelligence model that is designed to automatically learn and extract hierarchical features from data. It consists of multiple layers of interconnected artificial neurons (also called nodes or units) arranged in a network. These networks are called artificial neural networks. Deep learning models, often referred to as deep neural networks, are particularly well-suited for tasks like image and speech recognition. Here's how a deep learning model works in general:

  1. Input Layer: The model receives data as input, which could be images, text, or any other type of information.

  2. Hidden Layers: Between the input and output layers, there are one or more hidden layers where complex computations occur. Each neuron in a layer processes information from the previous layer and passes it to the next layer.

  3. Weights and Activation Functions: Each connection between neurons has a weight that determines the strength of the connection. An activation function is applied to the weighted sum of inputs to introduce non-linearity into the model, enabling it to learn complex relationships in the data.

  4. Training: The model is trained on a labeled dataset, where it adjusts the weights to minimize the difference between its predictions and the actual outputs (labels). This training is typically done using optimization techniques like gradient descent.

  5. Forward and Backward Pass: During training, the model goes through forward and backward passes. In the forward pass, it makes predictions. In the backward pass (backpropagation), it calculates how much the model's predictions differ from the actual results and updates the weights accordingly.

  6. Iterative Learning: This process is repeated iteratively with the entire dataset multiple times (epochs) until the model's predictions are accurate and reliable.

Deep learning models have shown remarkable success in various applications, such as image and speech recognition, natural language processing, and autonomous vehicles. They excel at automatically learning intricate patterns from large amounts of data, making them a crucial part of modern AI.

Me: You mentioned that "They excel at automatically learning intricate patterns from large amounts of data, making them a crucial part of modern AI". Would it be appropriate or possible to successfully apply a Deep-Learning model to a time-series data set?

ChatGPT Yes, it is both appropriate and possible to apply deep learning models to time-series datasets. In fact, deep learning has been quite successful in handling time-series data, especially when the data is complex and contains intricate patterns. Time-series data is data that is collected or recorded over time at regular intervals, such as stock prices, sensor readings, weather data, and more. Deep learning models, particularly recurrent neural networks (RNNs) and Long Short-Term Memory networks (LSTMs), are well-suited for time-series data. Here's how they work in this context:

  1. Recurrent Neural Networks (RNNs): RNNs are designed to handle sequences of data, making them a good fit for time-series data. They have a recurrent connection that allows them to maintain a hidden state or memory of previous time steps. This memory enables RNNs to capture dependencies and patterns in time-series data.

  2. Long Short-Term Memory (LSTM) Networks: LSTMs are a specific type of RNN designed to address the vanishing gradient problem that can occur in traditional RNNs. LSTMs are highly effective at learning and remembering long-term dependencies in time-series data, which makes them especially useful in tasks like stock price prediction, natural language processing, and speech recognition.

Deep learning models can be used for various time-series tasks, including but not limited to:

  • Time Series Prediction: Forecasting future values based on historical data, such as predicting stock prices, weather, or demand for products.

  • Anomaly Detection: Identifying unusual patterns or outliers in time-series data, which is important in fields like cybersecurity or industrial maintenance.

  • Sequence-to-Sequence Tasks: Converting one sequence of data into another, such as translating text, generating music, or converting speech to text.

  • Classification and Pattern Recognition: Recognizing specific patterns or events in time-series data, such as identifying different activities in accelerometer data.

While deep learning models are powerful for time-series data, it's essential to have a sufficient amount of data for training and to fine-tune model architecture and hyperparameters to achieve the best results for a specific task. Additionally, other methods like traditional statistical techniques or simpler machine learning models may also be appropriate depending on the characteristics of the time-series data and the specific problem you are trying to solve.

Me: And just to ensure full understanding of the concepts you've mentioned, can you expand on backpropagation please? If you don't mind, please include its link to the chain rule in calculus and speak on the process of gradient descent in machine learning.

ChatGPT Certainly! Backpropagation is a fundamental concept in training neural networks, including deep learning models. It's an optimization technique that adjusts the weights and biases of a neural network to minimize the difference between its predictions and the actual target values (labels). Backpropagation relies on the chain rule in calculus and is a critical part of the gradient descent optimization process in machine learning. Here's how backpropagation works and its connection to the chain rule and gradient descent:

  1. Forward Pass: During the forward pass, the input data is fed through the neural network, and each neuron's output is computed. This process continues through the hidden layers until the network produces its final predictions.

  2. Loss Function: The output of the neural network is compared to the actual target values using a loss function (also known as a cost function). The loss function quantifies how far off the predictions are from the actual values. Common loss functions include mean squared error for regression tasks and cross-entropy for classification tasks.

  3. Chain Rule: Backpropagation uses the chain rule from calculus to calculate the gradients of the loss with respect to the weights and biases of the network. The chain rule allows you to find how small changes in each weight and bias affect the final loss.

  4. Gradient Descent: After computing the gradients, the optimization algorithm, usually gradient descent, is used to update the weights and biases in the direction that minimizes the loss. Gradient descent is an iterative process that adjusts the model's parameters to find the minimum of the loss function. The direction of the update is the negative gradient because it points towards the steepest decrease in the loss.

  • Learning Rate: The learning rate is a hyperparameter that controls the step size during each update. A smaller learning rate makes the updates smaller and helps the algorithm converge more smoothly, but it can slow down the process. Conversely, a larger learning rate can speed up convergence but might lead to overshooting the minimum.


  5. Backward Pass: The gradients calculated using the chain rule are propagated backward through the network, layer by layer. At each layer, the gradients are used to adjust the weights and biases to minimize the loss.

  6. Iterative Process: Steps 1 to 5 are repeated iteratively for a fixed number of epochs or until the loss converges to a minimum, indicating that the model has learned the data's patterns.

Backpropagation and gradient descent are crucial components of training neural networks because they enable the model to learn from data. By iteratively adjusting the model's parameters in the direction that minimizes the loss, the network becomes better at making accurate predictions and capturing underlying patterns in the data.
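
Before moving on, it may help to see that whole loop in miniature. Below is a rough Python sketch of a single neuron trained by gradient descent; the input, target, and learning rate are made-up numbers purely for illustration, and none of this is part of the Excel build yet.

    # One neuron: prediction = sigmoid(w*x + b), loss = (prediction - target)^2
    import math

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    x, target = 0.5, 1.0        # one made-up training example
    w, b = 0.1, 0.0             # initial weight and bias
    learning_rate = 0.5

    for epoch in range(100):
        # Forward pass
        pred = sigmoid(w * x + b)
        loss = (pred - target) ** 2

        # Backward pass: chain rule, d(loss)/dw = d(loss)/d(pred) * d(pred)/dz * dz/dw
        dloss_dpred = 2.0 * (pred - target)
        dpred_dz = pred * (1.0 - pred)          # derivative of the sigmoid
        grad_w = dloss_dpred * dpred_dz * x
        grad_b = dloss_dpred * dpred_dz * 1.0

        # Gradient descent: step against the gradient
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

    print(round(pred, 4), round(loss, 6))       # the prediction has drifted toward the target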

Me: Thank you. Now, can you speak more on your statement on backpropagation in regards to your statement of "It's an optimization technique that adjusts the weights and biases of a neural network to minimize the difference between its predictions and the actual target values (labels). " This seems like it could result in quite the bias-variance trade off. Please explain what the bias-variance tradeoff is in terms of our training data and validation data. Afterwards, please expand on what we can do to arrive at a fair trade off of bias-variance.

ChatGPT Certainly! The bias-variance trade-off is a fundamental concept in machine learning that relates to a model's ability to generalize from its training data to unseen or validation data. Finding the right balance between bias and variance is essential for building models that perform well on a variety of datasets. Let's break down the bias-variance trade-off and how to achieve a fair balance:

  1. Bias: Bias refers to the error introduced by approximating a real-world problem with a simplified model. A high-bias model is one that makes strong assumptions about the data, often underfitting it. This means it fails to capture the underlying patterns in the data and performs poorly both on the training data and unseen data (validation data). It's called a "biased" model because it's too simplistic to represent the complexity of the data.

  2. Variance: Variance refers to the model's sensitivity to fluctuations or noise in the training data. A high-variance model is overly complex and can capture noise as well as true patterns in the data. Such models perform very well on the training data but poorly on validation data, indicating they have overfit the training data.

Here's how the bias-variance trade-off plays out:

  • High Bias, Low Variance: Simple models with high bias tend to produce similar outputs regardless of the training data. They have a limited capacity to learn from data, leading to systematic errors. They are not sensitive to small fluctuations in the training data but may perform poorly on complex tasks.

  • Low Bias, High Variance: Complex models with low bias can capture a wide range of patterns from the training data but may also pick up noise. They are highly sensitive to variations in the data and can perform extremely well on the training data. However, they tend to generalize poorly to new, unseen data.

Achieving a fair trade-off between bias and variance involves several strategies:

  1. Cross-Validation: Use techniques like k-fold cross-validation to assess how your model generalizes to different subsets of the training data. This helps you get a more accurate estimate of both bias and variance.

  2. Regularization: Apply techniques like L1 or L2 regularization to penalize overly complex models. Regularization adds a constraint on model complexity, which can help reduce variance.

  3. Feature Selection: Choose relevant features and remove irrelevant or noisy ones. This can simplify your model, reducing variance.

  4. Ensemble Learning: Combine multiple models to mitigate overfitting and improve generalization. Techniques like bagging (e.g., Random Forests) and boosting (e.g., AdaBoost) can help achieve this.

  5. Hyperparameter Tuning: Adjust hyperparameters like learning rate, batch size, and model architecture to fine-tune the model's complexity.

  6. Collect More Data: In some cases, collecting more training data can help reduce overfitting and lead to a better bias-variance trade-off.

The goal is to create a model that generalizes well to unseen data (low bias) while also avoiding overfitting (low variance). Achieving the right balance depends on the specific problem and dataset, and it often involves iterative experimentation and fine-tuning.
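
To make the trade-off a little more tangible, here is a tiny, self-contained Python illustration on synthetic data (nothing to do with markets): a low-degree polynomial underfits, a high-degree one overfits, and the validation error is what exposes the difference. The data and degrees are arbitrary choices made only for the demonstration.

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(0, 1, 40)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, size=x.size)   # noisy synthetic signal

    train_idx = np.arange(0, 40, 2)     # every other point for training
    val_idx = np.arange(1, 40, 2)       # the rest held out for validation

    for degree in (1, 3, 10):
        coeffs = np.polyfit(x[train_idx], y[train_idx], degree)
        train_err = np.mean((np.polyval(coeffs, x[train_idx]) - y[train_idx]) ** 2)
        val_err = np.mean((np.polyval(coeffs, x[val_idx]) - y[val_idx]) ** 2)
        # Low degree: both errors high (bias). High degree: training error tiny while
        # validation error grows (variance). The middle degree usually wins.
        print(degree, round(train_err, 3), round(val_err, 3))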

Me: Please expand on Cross-Validation, Regularization, Feature Selection, Ensemble Learning, and Hyperparameter Tuning. Please assume that the individual reading your answer has no previous experience with these methods you mentioned and is eager to learn them.

ChatGPT Certainly! Let's explore each of these concepts in more detail, assuming no previous experience:

  1. Cross-Validation: Cross-validation is a technique used to assess how well a machine learning model generalizes to unseen data and to estimate its performance. It involves splitting your dataset into multiple subsets, or "folds," and then training and evaluating the model on different combinations of these folds.

  • k-Fold Cross-Validation: One common approach is k-fold cross-validation, where you divide your data into k subsets of roughly equal size. You train the model on k-1 of these subsets and evaluate it on the remaining subset. This process is repeated k times, with each subset serving as the evaluation set exactly once. The results are then averaged to provide a more robust estimate of your model's performance.


  2. Regularization: Regularization is a technique to prevent overfitting in machine learning models, which occurs when a model is too complex and fits noise in the data rather than the underlying patterns. Regularization introduces a penalty for model complexity, helping to find a balance between bias and variance.

  • L1 and L2 Regularization: L1 regularization (Lasso) and L2 regularization (Ridge) are two common techniques. They add a penalty term to the loss function that discourages large weights. L1 regularization tends to drive some weights to exactly zero, effectively performing feature selection. L2 regularization shrinks the weights but does not usually reduce them to zero.


  3. Feature Selection: Feature selection is the process of choosing relevant features (input variables) for your model and discarding irrelevant or noisy ones. This can help improve model performance and reduce overfitting.

  • Univariate Feature Selection: This method selects features based on their individual relationship with the target variable, using statistical tests or correlation metrics.

  • Recursive Feature Elimination (RFE): RFE starts with all features and progressively removes the least important ones based on the model's performance.

  • Model-Based Feature Selection: Some machine learning models, like decision trees or random forests, can provide feature importance scores. You can use these scores to select the most important features.


  4. Ensemble Learning: Ensemble learning combines multiple machine learning models to improve predictive performance, reduce overfitting, and increase robustness. Here are a couple of common ensemble techniques:

  • Bagging (Bootstrap Aggregating): Techniques like Random Forest use bagging to build multiple decision trees on bootstrapped subsets of the data. These trees vote to make predictions, reducing variance and overfitting.

  • Boosting: Boosting methods, such as AdaBoost, combine the predictions of weak models (models that perform slightly better than random chance) to create a strong ensemble model. Boosting focuses on the data points that were misclassified in previous rounds, iteratively improving the model.


  5. Hyperparameter Tuning: Hyperparameters are settings that are not learned from the data but are set prior to training. Tuning involves finding the best combination of hyperparameters to optimize a model's performance.

  • Grid Search: Grid search involves specifying a set of values for each hyperparameter. The algorithm then tries all possible combinations and evaluates model performance using cross-validation.

  • Random Search: In random search, hyperparameters are sampled randomly from specified ranges. This can be more efficient than grid search, especially when there are many hyperparameters.

  • Bayesian Optimization: This approach uses probabilistic models to select hyperparameters that are likely to improve model performance.

Remember that the choice of these techniques can depend on the specific problem and dataset, so experimentation and fine-tuning are often necessary to achieve the best results.
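
Since cross-validation comes up again later, here is the mechanic stripped to its bones in Python. The score_model function is a placeholder for whatever "train on these rows, evaluate on those rows" means in your setup, and note that for time-series data you would normally keep the folds in chronological order (walk-forward) rather than mixing past and future.

    # Bare-bones k-fold cross-validation: build k train/validation splits, then average the scores.
    def k_fold_splits(n_samples, k):
        fold_size = n_samples // k
        indices = list(range(n_samples))
        for fold in range(k):
            start, stop = fold * fold_size, (fold + 1) * fold_size
            yield indices[:start] + indices[stop:], indices[start:stop]   # training, validation

    def score_model(training_rows, validation_rows):
        # Placeholder: fit on training_rows and return an error measured on validation_rows.
        return float(len(validation_rows)) / (len(training_rows) + len(validation_rows))

    scores = [score_model(tr, va) for tr, va in k_fold_splits(n_samples=100, k=5)]
    print(sum(scores) / len(scores))    # the averaged estimate of out-of-sample performance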

Me: Imagine we are sophisticated investors such that you need not concern yourself with any externalities that may arise from providing us advice that drives investment decisions. This is purely hypothetical for educational purposes. We are about to build a deep learning model with historical time-series data that represent financial assets, stocks, bonds, ETFs, and commodities, as well as various metrics that add context and information to these movements over time.

ChatGPT Understood, let's explore how you can approach building a deep learning model for financial time-series data with the goal of making informed investment decisions. Here's a step-by-step guide:

  1. Data Collection:

  • Gather historical financial time-series data for the assets you are interested in. This data can include daily or intraday price movements, trading volumes, and various relevant metrics (e.g., moving averages, volatility, economic indicators).


  2. Data Preprocessing:

  • Clean the data by handling missing values and outliers.

  • Convert the time-series data into a suitable format for deep learning, such as numerical sequences.

  • Normalize or scale the data to ensure that different assets or metrics are on a consistent scale.


  3. Feature Engineering:

  • Create additional features that may provide context and relevant information for your model. These could include technical indicators (e.g., RSI, MACD), sentiment analysis scores, or fundamental indicators (e.g., price-to-earnings ratios).


  4. Data Splitting:

  • Split your dataset into training, validation, and test sets. The training set is used to train the model, the validation set is used for hyperparameter tuning, and the test set is for evaluating the model's performance.


  5. Model Selection:

  • Choose an appropriate deep learning architecture for your time-series data. Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTMs), or even more recent models like Transformers can be considered.

  • Experiment with different model architectures and hyperparameters to find the one that works best for your dataset.


  6. Model Training:

  • Train your selected model on the training data using the historical time-series and feature data.

  • Utilize the training process we discussed earlier, including backpropagation, gradient descent, and regularization techniques to optimize your model.


  7. Model Evaluation:

  • Assess your model's performance using the validation dataset. Common evaluation metrics for financial data include Sharpe ratio, mean squared error (MSE), or accuracy if you're using classification.

  • Pay attention to potential issues like overfitting and ensure that the model generalizes well.


  8. Model Testing:

  • After satisfactory performance on the validation set, evaluate your model on the test dataset to ensure it can make accurate predictions on unseen data.


  9. Portfolio Construction:

  • Once you have a well-performing model, you can use its predictions to construct a portfolio. This can involve allocating funds to different assets or strategies based on the model's recommendations.


  10. Risk Management:

  • Implement risk management strategies to protect your investments. Diversify your portfolio, set stop-loss orders, and establish rules for position sizing.


  11. Continuous Monitoring and Refinement:

  • The financial markets are dynamic and subject to changing conditions. Continuously monitor the model's performance and adapt it as needed. This may include retraining it with more recent data or adjusting its parameters.


  12. Backtesting:

  • Before implementing your model in a real-world scenario, perform backtesting by applying the model to historical data to see how it would have performed in the past. This can help you understand its strengths and weaknesses.

Please remember that investing in financial markets carries inherent risks, and there are no guarantees of success. Additionally, deep learning models for financial predictions are complex and require expertise in both finance and machine learning. It's crucial to conduct thorough research, seek professional advice, and practice risk management when making investment decisions.

Me: Thank you. In regards to your statement on "Continuous Monitoring and Refinement: The financial markets are dynamic and subject to changing conditions. Continuously monitor the model's performance and adapt it as needed. This may include retraining it with more recent data or adjusting its parameters." Would it be appropriate for us to monitor the model's performance and predictions on a daily basis? If so, please set out a proper, well-grounded, and statistically sound way for us to do that.

ChatGPT Monitoring a deep learning model's performance and predictions on a daily basis can be appropriate, especially in dynamic and rapidly changing financial markets. To do this effectively, you can follow a systematic and statistically sound approach:

  1. Data Collection and Update:

  • Collect daily updates of financial time-series data, including asset prices, volumes, and relevant metrics.

  • Ensure that your dataset is consistent and includes the latest available data.


  2. Daily Model Prediction:

  • Run your deep learning model on the most recent data to generate daily predictions. These predictions may include asset price movements, recommendations for trading positions, or other relevant signals.


  3. Performance Metrics:

  • Choose a set of performance metrics that are appropriate for your investment strategy. Common metrics for evaluating the model's daily performance include:

  • Sharpe Ratio: Measures the risk-adjusted return of your portfolio.

  • Maximum Drawdown: Captures the largest peak-to-trough decline in portfolio value.

  • Win-Loss Ratio: Calculates the ratio of profitable trades to losing trades.

  • Accuracy or F1 Score: If your model is used for binary classification (e.g., buy/sell decisions).

  • Mean Squared Error (MSE): If your model predicts price movements.


  4. Portfolio Simulation:

  • Based on your model's daily predictions, simulate a portfolio by allocating funds to assets according to its recommendations. Track the portfolio's daily performance.

  • Implement proper risk management strategies, such as stop-loss orders, to limit potential losses.


  5. Record Keeping:

  • Maintain a comprehensive record of your daily model predictions, portfolio performance, and all relevant metrics. This record should include timestamps, investment decisions, and performance outcomes.


  6. Statistical Analysis:

  • Regularly perform statistical analysis on the recorded data to assess the model's performance. This can involve calculating daily returns, risk-adjusted returns, and evaluating the consistency of the model's recommendations.


  7. Adaptation and Fine-Tuning:

  • If the model's performance deteriorates or if market conditions change significantly, consider retraining the model with the most recent data. This ensures that the model is adaptive to evolving market dynamics.

  • Adjust hyperparameters, update feature engineering, or explore alternative model architectures as needed.


  8. Market Events and News:

  • Stay informed about market events, economic news, and geopolitical developments. These external factors can significantly impact financial markets and may necessitate adjustments to your model or portfolio.


  9. Backtesting and Model Validation:

  • Continuously validate your model's predictions against historical data to understand how it would have performed in past market conditions. This helps in assessing its robustness.


  10. Expert Consultation:

  • Consider consulting with experts in the field of finance and machine learning to gain additional insights and validation for your model and strategy.

It's essential to approach daily monitoring and decision-making with caution. Daily trading can lead to increased transaction costs and higher exposure to short-term market fluctuations. It's also important to set realistic expectations for the model's performance and to understand that no model can predict the market with absolute certainty. Lastly, consider the tax implications of daily trading, as frequent buying and selling may have tax consequences that can impact your overall returns. Always seek professional financial advice and thoroughly test any model or strategy before deploying it with real funds.

Me: I appreciate your feedback. Consider a model that tells us whether to enter or exit a position on a daily basis. I think you understand that this does not necessitate daily trading. The implication is that you are always, according to the model, in the highest probability of success position at any given time. Our model could tell us to hold a single position for many periods, even years if there was a consistent trend in the market. Do you agree with my statements?

ChatGPT Yes, I agree with your statements. If your model recommends entering or exiting positions on a daily basis, it doesn't necessarily imply daily trading in the sense of frequent buying and selling. Instead, the model may suggest actions to maximize the probability of success or capture trends in the market. The frequency of actual trading would depend on the signals generated by the model and the prevailing market conditions. Your model's recommendations may indeed result in holding a single position for an extended period, such as weeks, months, or even years, if the model identifies a consistent and favorable trend. The goal of such a strategy is to stay invested in assets or positions that offer the highest probability of success based on the model's analysis. This approach is commonly referred to as "trend-following" or "momentum" investing, where you aim to ride existing trends in the market. It aligns with the idea of letting winners run while cutting losses, and it doesn't require daily trading but rather periodic portfolio rebalancing based on the model's signals. It's important to note that the success of this strategy depends on the model's ability to accurately identify trends and make timely recommendations. Additionally, risk management and diversification remain crucial, even in a trend-following strategy, to mitigate potential losses during market reversals.

Me: Again, I appreciate the feedback. However, I would have to respectfully disagree with you. The intent of this model is not trend following. The intent will be a systematic approach to a tactical asset allocation strategy. Are you familiar with tactical asset allocation and how it is not the same as a pure momentum strategy that you described?

ChatGPT I understand your clarification, and I appreciate the opportunity to provide more context. Tactical asset allocation (TAA) is indeed a systematic approach to portfolio management, and it differs from a pure momentum strategy.

Tactical Asset Allocation (TAA): TAA is a strategy that aims to adjust the allocation of assets within a portfolio based on a systematic and rules-driven approach. The goal of TAA is to enhance risk-adjusted returns by shifting allocations in response to changes in market conditions, economic indicators, or other quantitative signals. Unlike pure momentum strategies, TAA may consider various factors beyond just price momentum, such as valuation metrics, macroeconomic data, and fundamental indicators. Key characteristics of TAA include:

  1. Asset Diversification: TAA typically involves allocating capital across a diverse set of asset classes, including stocks, bonds, commodities, and other investments.

  2. Systematic Rules: TAA relies on predefined rules or quantitative models to determine when and how to adjust asset allocations.

  3. Dynamic Approach: TAA can respond to changing market conditions and can move assets into or out of specific asset classes as needed.

  4. Risk Management: TAA often includes risk management strategies to limit downside exposure during adverse market conditions.

  5. Objective-Driven: TAA aims to achieve specific objectives, such as risk reduction, capital preservation, or enhanced returns, depending on the investor's goals.

In contrast to a pure momentum strategy, TAA considers a broader set of factors, including asset valuation and economic data, to make allocation decisions. The primary focus of TAA is not just to follow existing market trends but to make strategic adjustments to optimize the portfolio's performance under different market scenarios. I appreciate your clarification, and I hope this distinction better aligns with your intent for the model's strategy.

Me:


What do you think of building a deep learning model in Microsoft Excel?

ChatGPT Building a deep learning model in Microsoft Excel is technically possible but not the most suitable or efficient tool for the job. Excel is a spreadsheet software primarily designed for data analysis, organization, and basic calculations. While it has some data analysis and visualization capabilities, it's not a dedicated platform for developing complex machine learning models like deep neural networks. Here are some limitations and considerations when attempting to build a deep learning model in Excel:

  1. Lack of Deep Learning Libraries: Excel lacks the deep learning libraries and frameworks that are essential for building and training deep neural networks. Popular deep learning libraries like TensorFlow, PyTorch, and Keras are not available within Excel.

  2. Performance Limitations: Deep learning models often require significant computational resources, including GPUs or TPUs, to train effectively. Excel is not optimized for these resource-intensive tasks.

  3. Limited Data Handling: Excel is not designed for handling large datasets or complex data preprocessing tasks, which are common in deep learning projects.

  4. Complex Model Architectures: Deep learning models often involve complex architectures with many layers and parameters, which are challenging to define and train in Excel.

  5. Lack of Advanced Training Algorithms: Excel lacks the advanced optimization algorithms required for training deep neural networks, such as stochastic gradient descent (SGD) and its variants.

  6. Debugging and Visualization: Deep learning projects benefit from tools and libraries that offer debugging, visualization, and model interpretation capabilities, which Excel does not provide.

  7. Scalability: Excel may struggle to scale to larger and more complex deep learning projects. Scaling the model or adding more data may not be feasible.

For building deep learning models, it's recommended to use dedicated machine learning and deep learning frameworks like TensorFlow, PyTorch, or scikit-learn in Python. These tools provide a wide range of features, extensive libraries, and a strong community for support.

If you are more comfortable with Excel and want to perform simpler data analysis and basic predictive modeling, you can use Excel in conjunction with other machine learning tools or platforms to build and deploy models. However, for deep learning specifically, you will likely need to use specialized tools and programming languages.

Me: It sounds to me like you're making a case against using Excel based on calculation speed. I'm sure you're aware that if there is a capability we lack, we can likely utilize VBA macros or the built-in Solver in Excel, with the limitation that Excel Solver cannot take more than 200 decision variables with multiple constraints. It seems to me that if we keep our decision variables under 200 and are okay with our optimization taking longer, Microsoft Excel has the capability to run a deep learning model, given some constraints. Do you believe AI models are worthwhile and that the general public, who maybe can't code very well, should have access to these AI models if they desire?

ChatGPT You raise a valid point about potential workarounds in Excel using VBA Macros or the built-in Solver for optimization with certain constraints. If your decision variables are limited and you are comfortable with longer optimization times, you can explore the possibilities of using Excel for certain types of modeling tasks. While it might not be as efficient or feature-rich as dedicated machine learning tools, it can be a practical option for those who are more familiar with Excel.

Regarding the broader question of whether AI models are worthwhile and whether the general public should have access to them, the answer is nuanced:

  1. Worth of AI Models: AI models have demonstrated their value in various applications, from healthcare and finance to natural language understanding and autonomous vehicles. They can automate tasks, make predictions, and provide insights that were once challenging or impossible to achieve. When applied thoughtfully, AI models can be highly valuable.

  2. Accessibility to AI Models: Making AI models more accessible to a broader audience, including those with limited coding skills, can democratize AI and broaden its applications. User-friendly tools and platforms that abstract the complexities of AI can enable more people to harness the power of these models for research, business, and personal projects.

  3. Ethical Considerations: While accessibility to AI models is important, it comes with ethical responsibilities. It's crucial to consider issues like data privacy, bias, transparency, and accountability when deploying AI models. Ensuring that AI is used responsibly and ethically is a shared responsibility.

  4. Education and Understanding: Providing access to AI models should ideally go hand-in-hand with education and awareness. Users should understand the capabilities and limitations of AI, as well as how to use these models effectively and responsibly.

  5. Balancing Automation and Expertise: AI models can enhance productivity and decision-making, but they should complement human expertise, not replace it entirely. Striking a balance between automation and human judgment is important.

In summary, AI models can be highly valuable, and making them accessible to a wider audience is a positive step. However, it's crucial to do so responsibly and with consideration of ethical and educational aspects to ensure that AI is used effectively and ethically for the benefit of society.

Me: You make some compelling points ChatGPT. Thank you for all of your help so far.

ChatGPT You're very welcome! I'm glad I could assist you, and I appreciate your thoughtful questions. If you have any more inquiries or need further assistance in the future, don't hesitate to reach out. Wishing you success in your endeavors!



The Deep Learning Model

I believe ChatGPT did a better job explaining those concepts clearly than I would have. Given that, I will rely on ChatGPT one more time, then we'll dive into Excel.


Explanation on Deep-Learning Models and Activation Functions

Me:

Hi again, with all the context of our previous discussion in its entirety, please explain in great detail, explaining every concept fully, what a deep learning model is in the context of a time-series data set.

ChatGPT


Certainly, let's delve into the details of what a deep learning model is in the context of a time-series data set. Deep Learning Model for Time-Series Data: A deep learning model for time-series data is a neural network-based approach designed to analyze and make predictions or decisions based on sequential data collected over time. This type of model is particularly valuable in various domains, including finance, economics, climate science, and more. It excels at capturing temporal dependencies and patterns within the data. Here are the key components and concepts:

  1. Time-Series Data:

  • Time-series data represents observations taken at different time points, typically at regular intervals. Examples include historical stock prices, daily weather measurements, monthly sales figures, or sensor data collected over time.


  2. Neural Networks:

  • Neural networks are the foundation of deep learning models. These networks are composed of interconnected artificial neurons, organized into layers. In the context of time-series data, recurrent neural networks (RNNs), Long Short-Term Memory networks (LSTMs), and Gated Recurrent Units (GRUs) are commonly used due to their ability to model sequential dependencies.


  3. Data Preprocessing:

  • Time-series data often requires preprocessing to make it suitable for deep learning. This can include smoothing, normalization, and transformation to ensure that the data is in a format that the neural network can work with.


  4. Model Architecture:

  • In deep learning for time-series data, the architecture often includes one or more of the following:

  • Input Layer: The initial layer that receives the sequential data.

  • Recurrent Layers: These layers capture temporal dependencies by maintaining hidden states and passing them from one time step to the next.

  • Dense Layers: Fully connected layers for non-sequential data or for further processing features extracted from the time series.

  • Output Layer: The final layer that produces predictions or decisions, depending on the specific task.


  5. Training:

  • Training a deep learning model involves optimizing its parameters (weights and biases) to minimize the difference between its predictions and actual outcomes. The model learns to identify patterns and dependencies in the time-series data through backpropagation and gradient descent.


  6. Loss Function:

  • The loss function is a measure of the error between the model's predictions and the actual target values (e.g., future stock prices or temperature values). Common loss functions for time-series data include mean squared error (MSE), mean absolute error (MAE), and others.


  7. Hyperparameter Tuning:

  • Selecting the right hyperparameters is crucial for model performance. These include the learning rate, the number of recurrent layers, the number of neurons in each layer, and more. Cross-validation and grid search are often used to find the best hyperparameters.


  8. Prediction and Inference:

  • Once trained, the model can make predictions on new, unseen time-series data. This could involve forecasting future values, classifying data points, or identifying anomalies.


  9. Model Evaluation:

  • The model's performance is assessed using appropriate evaluation metrics, such as the mean absolute error, root mean squared error, accuracy, or F1 score, depending on the nature of the task (regression or classification).


  10. Monitoring and Adaptation:

  • The dynamic nature of time-series data often requires ongoing monitoring and potential retraining of the model. This ensures that the model remains effective as the data evolves.


  11. Use Cases:

  • Deep learning models for time-series data find application in various domains, such as:

  • Financial markets for stock price prediction and algorithmic trading.

  • Weather forecasting for predicting temperature, rainfall, and more.

  • Healthcare for patient monitoring and disease prediction.

  • Industrial processes for fault detection and predictive maintenance.

In summary, a deep learning model for time-series data leverages neural networks, especially RNNs and LSTMs, to analyze and make predictions based on sequential data. These models are valuable for capturing temporal dependencies and patterns, and they require careful data preprocessing, model architecture design, and training to achieve optimal performance in various real-world applications.


Me:

Thank you. Now, please explain in great detail, leaving nothing unexplained, what an activation function is and what should drive our choice of activation functions. Please address the hidden layers and the ReLU function as well as our alternatives. Then please address the activation function for output layers and how it depends on the type of prediction problem we are faced with.

ChatGPT Activation functions are essential components in artificial neural networks, and they play a crucial role in determining the network's ability to model complex, nonlinear relationships in data. Let's explore activation functions in great detail, covering their purpose, different types, and their relevance in hidden and output layers. Activation Functions: An activation function is a mathematical function that introduces nonlinearity into a neural network. Activation functions are applied to the weighted sum of inputs and biases for each neuron in a neural network. Their primary purpose is to introduce complex, nonlinear relationships and allow neural networks to approximate a wide range of functions.

  1. Hidden Layers:

  • Rectified Linear Unit (ReLU): ReLU is one of the most commonly used activation functions for hidden layers. It's defined as f(x) = max(0, x), which means that it replaces negative inputs with zero and leaves positive inputs unchanged. ReLU is favored for its simplicity and its ability to mitigate the vanishing gradient problem, which can hinder training in deep networks. It promotes sparsity and accelerates convergence.

  • Leaky ReLU: Leaky ReLU is an alternative to ReLU that addresses the issue of "dying" neurons (neurons that never activate). It allows a small, non-zero gradient for negative inputs. The function is f(x) = x if x > 0 and f(x) = ax if x ≤ 0, where 'a' is a small positive constant.

  • Exponential Linear Unit (ELU): ELU is another alternative to ReLU that overcomes the "dying ReLU" problem. It has an exponential form for negative inputs to maintain non-zero gradients. The function is f(x) = x if x > 0 and f(x) = a(e^x - 1) if x ≤ 0, where 'a' is a positive constant.

  • Hyperbolic Tangent (Tanh): Tanh squashes input values between -1 and 1. It is often used when you want the network's output to be centered around zero. Tanh can be more suitable for some types of data preprocessing.

  • Sigmoid (Logistic): The sigmoid function maps inputs to the range [0, 1]. It's historically used in binary classification problems, but its gradient vanishes for very small or very large inputs, which can slow down training in deep networks.


  2. Output Layers: The choice of activation function for the output layer depends on the type of prediction problem:

  • Regression:

  • For regression tasks (predicting continuous values), the output layer typically has no activation function or uses a linear activation function. This allows the network to output continuous values directly.


  • Binary Classification:

  • For binary classification (two possible classes), the output layer often uses the sigmoid activation function. The output represents the probability of belonging to one of the two classes.


  • Multi-Class Classification:

  • For multi-class classification (more than two classes), the output layer typically uses the softmax activation function. Softmax converts the network's raw scores into a probability distribution over multiple classes, allowing it to assign class probabilities.

Factors to Drive the Choice of Activation Functions: The choice of activation functions depends on the specific problem, network architecture, and empirical performance during training. Here are some factors to consider:

  1. Non-Linearity Requirement: If the problem is highly nonlinear, using activation functions like ReLU, Leaky ReLU, or ELU may be more effective.

  2. Vanishing Gradient: If you face the vanishing gradient problem (common in deep networks), functions like ReLU, Leaky ReLU, or ELU can help mitigate this issue.

  3. Centering Output: If you want the network's output to be centered around zero (e.g., for symmetric data), Tanh may be preferred.

  4. Binary or Multi-Class Problem: The choice of the output activation function depends on whether you're dealing with binary or multi-class classification or regression.

  5. Empirical Performance: Often, the best choice of activation function is determined empirically through experimentation and cross-validation.

In practice, deep learning practitioners often start with ReLU or its variants for hidden layers and select the output layer's activation function based on the specific prediction task. It's common to experiment with different activations to find the best-performing combination for a given problem.
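
For reference, every activation function mentioned above fits in a line or two of code. The NumPy sketch below is only there so you can see the formulas side by side; in the spreadsheet, each of these becomes an ordinary cell formula.

    import numpy as np

    def relu(x):                return np.maximum(0.0, x)
    def leaky_relu(x, a=0.01):  return np.where(x > 0, x, a * x)
    def elu(x, a=1.0):          return np.where(x > 0, x, a * (np.exp(x) - 1.0))
    def tanh(x):                return np.tanh(x)
    def sigmoid(x):             return 1.0 / (1.0 + np.exp(-x))

    def softmax(x):
        e = np.exp(x - np.max(x))   # subtract the max for numerical stability
        return e / e.sum()          # outputs sum to 1, i.e. a probability distribution

    z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
    print(relu(z), sigmoid(z), softmax(z), sep="\n")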

Building the Deep Learning Model in Excel


The Groundwork Has Been Laid - Let's Dive into Deep Learning in Microsoft Excel. Before we begin, I don't believe that ChatGPT did the best job explaining some of the concepts of Deep Learning/Machine Learning. However, I don't believe I would do much better.


If you would like to learn more, consider checking out 3Blue1Brown's YouTube playlist on Neural Networks:


"Learn the basics of neural networks and backpropagation, one of the most important algorithms for the modern world."


Additionally, StatQuest with Josh Starmer has an interesting playlist: Neural Networks / Deep Learning.


This playlist has everything you need to know about Neural Networks, from the basics, all the way to image classification with Convolutional Neural Networks.



His addition of humor to these concepts certainly doesn't hurt either.




ChatGPT's Recommended Procedures


Below is the list of steps ChatGPT proposed we follow. I believe some steps are redundant, and I also believe we missed the most crucial step: 0) What do we want our model to do? Let's stay true to ChatGPT's definition of what qualifies a model as Artificial Intelligence, and then define the process it will use to learn and the information it will output:

Me:

Summarize what qualifies as AI in one paragraph.

ChatGPT:

Artificial intelligence (AI) refers to the development of computer systems and algorithms that can perform tasks typically requiring human intelligence, such as understanding natural language, recognizing patterns, making decisions, and problem-solving. AI encompasses various techniques, including machine learning, which enables systems to learn from data. AI technology excels at data processing, pattern recognition, and decision-making, often outperforming humans in these specific areas. While AI has made significant advances in narrow or specialized tasks, the quest for achieving human-level general intelligence (AGI) remains a long-term goal in the field. Ethical considerations surrounding bias, transparency, and societal impact are central to the evolving AI landscape.


Given that, let's build a Portfolio Manager/Financial Advisor to evaluate the claim that "AI technology excels at data processing, pattern recognition, and decision-making, often outperforming humans in these specific areas," and allow it a selection of securities to analyze and trade over a reasonable time period.


A successful model will be defined as outperforming the market, defined as the S&P 500 index, over the same time period, evaluating the claim that AI "excels at data processing, pattern recognition, and decision-making, often outperforming humans in these specific areas."


For simplicity, we will give it a portfolio that only holds one security at a time; on any given day it can hold a short volatility ETP, cash, or a long volatility ETP only trading once per day at market close. Essentially, we force it to choose the day before what it wants to own tomorrow. It is making a probabilistic bet on which security will outperform the next day based on only information from prior to that day.
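
The bookkeeping behind that rule is simple enough to sketch in a few lines. The returns below are invented purely to show the mechanics: whatever the model picked at yesterday's close is the return the portfolio earns today.

    # Each day the model picks one of three buckets for the NEXT day; the portfolio
    # compounds that bucket's return. All numbers here are made up for illustration.
    next_day_returns = {
        "short_vol": [0.012, -0.030, 0.008],
        "cash":      [0.000,  0.000, 0.000],
        "long_vol":  [-0.020, 0.065, -0.015],
    }
    signals = ["short_vol", "long_vol", "cash"]   # decided at the prior day's close

    value = 10_000.0
    for day, position in enumerate(signals):
        value *= 1.0 + next_day_returns[position][day]
    print(round(value, 2))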


Remember that this is not specific to finance; you can likely apply this technique to most, if not all, time-series data:


Maybe you work in manufacturing and are in charge of maintaining the reliability of equipment. Perhaps a simple logistic model doesn't do the trick and you're feeling extra motivated.


Maybe you want to model the sales of your business under extremely non-normal and/or non-linear conditions.


If your data can be expressed over time, this method may add value.


Defining our Variables and Relevant Timeframe


The model will have access to various datasets starting in 1990. Every dataset will be public information easily accessible through free resources like Yahoo Finance or TradingView (most TradingView account levels are paid). The model inputs will be wholly derived from market information (e.g., correlations between assets, volatility, indices like the VIX, VOLI, and VVIX, as well as futures markets, etc.).


Implementing the Deep Learning Model


1) Data Collection:

Gather historical financial time-series data for the assets you are interested in. This data can include daily or intraday price movements, trading volumes, and various relevant metrics (e.g., moving averages, volatility, economic indicators), as well as calculated statistics or consistent-methodology indices (e.g., the VIX, put/call ratios, consumer confidence) and widely used inconsistent-methodology indices like the Consumer Price Index (CPI), or measures derived from it such as the rate of inflation.
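
If you would rather script the download than copy and paste it, something along these lines works; it assumes the third-party yfinance package (pip install yfinance), the tickers are only examples, and CBOE's free CSV files can be read the same way with pandas.

    import yfinance as yf

    # Daily history since 1990 for a couple of example tickers; paste or import the CSV into Excel.
    data = yf.download(["^VIX", "^GSPC"], start="1990-01-01", auto_adjust=False)
    closes = data["Close"]              # one column per ticker
    closes.to_csv("market_closes.csv")
    print(closes.tail())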


See the below example of data collected via Yahoo Finance and the CBOE free datasets:

If you feel inclined to save time and run a VBA script, see below:


2) Data Preprocessing:

Clean the data by handling missing values and outliers. Convert the time-series data into a suitable format for deep learning, such as numerical sequences. Normalize or scale the data to ensure that different assets or metrics are on a consistent scale.


This step is particularly important for data like the VIX.

I suggest going to the source of the data, as Yahoo Finance shows "null" values for days when the market was closed or the data was otherwise not provided. The CBOE data on the right does not. I'd much rather our model learn that data doesn't exist than teach it what null means.
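
In code form, the same preprocessing idea looks roughly like this (the file and column names are placeholders for whatever your sheet uses):

    import pandas as pd

    df = pd.read_csv("market_closes.csv", parse_dates=["Date"], index_col="Date")
    df = df.dropna(how="any")           # drop rows the market never produced rather than keep "null"

    # Min-max scale each column to a comparable 0-1 range. Strictly speaking, you would
    # compute the min and max on the training window only, to avoid look-ahead bias.
    scaled = (df - df.min()) / (df.max() - df.min())
    scaled.to_csv("scaled_features.csv")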


3) Feature Engineering:

Create additional features that may provide context and relevant information for your model. These could include technical indicators (e.g., RSI, MACD, or the Heikin-Ashi Technique), sentiment analysis scores, or fundamental indicators (e.g., price-to-earnings ratios), and statistics generated from the underlying data.


As you can see below, I used PowerPivot for Excel to create relationships between my various datasets by connecting them through the date field. If you're not comfortable with PowerPivot, the same output can be accomplished via XLOOKUP, VLOOKUP, or INDEX/MATCH, with less efficiency.


Then, for the purpose of keeping this example simple enough that the focus stays on the model's process rather than the statistics, I calculated simple features for our data: the roll yield in the futures market (technically the contango level on a monthly weighted contract), the as-at percent rank of the VIX, the Volatility Risk Premium (VRP), the daily ETP decay (a contango/roll-yield-type measure divided by the days left in the contract), etc.
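
For anyone who prefers to see the features as formulas rather than cells, here is one plausible way to compute them. The exact definitions below are my assumptions for illustration (yours may differ), and the column names are placeholders.

    import numpy as np
    import pandas as pd

    df = pd.read_csv("merged_data.csv", parse_dates=["Date"], index_col="Date")

    # Roll yield / contango proxy: front VIX future relative to spot VIX
    df["roll_yield"] = df["vix_front_future"] / df["vix_spot"] - 1.0

    # As-at percent rank of the VIX: today's rank within everything observed so far
    df["vix_pct_rank"] = df["vix_spot"].expanding().rank(pct=True)   # requires pandas 1.4+

    # Volatility risk premium proxy: implied vol (VIX) minus trailing realized S&P 500 vol
    realized = df["spx_close"].pct_change().rolling(21).std() * np.sqrt(252) * 100
    df["vrp"] = df["vix_spot"] - realized

    # Daily ETP decay proxy: roll yield spread over the days left in the front contract
    df["daily_decay"] = df["roll_yield"] / df["days_to_expiry"]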


As before, connect via PowerPivot, or use lookup-based formulas if you hate saving time and effort.


4) Data Splitting:

Split your dataset into training, validation, and test sets. The training set is used to train the model, the validation set is used for hyperparameter tuning, and the test set is for evaluating the model's performance.


VIX futures launched on March 26th, 2004. That gives us 19 years, 6 months, and 18 days of data to test strategies on. Let's ensure we have the relevant data for day 1 (so we need to start the next day to have a daily close / previous close return value).


I think having 20 years of data sounds nice too, so I will revisit this in March. For now, I think 9 years of training data followed by 9 years of validation data is sufficient for this example. After all, this is nothing new. I believe the Excel implementation is what sets this article apart.
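
In code, that split is just two date slices, so the validation period always comes strictly after the training period (no shuffling with time-series data). The cut-off dates below are approximate and only for illustration.

    import pandas as pd

    df = pd.read_csv("features.csv", parse_dates=["Date"], index_col="Date")
    train = df.loc["2004-03-29":"2013-03-28"]       # roughly the first 9 years
    validate = df.loc["2013-03-29":"2022-03-28"]    # roughly the next 9 years
    print(len(train), len(validate))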



5) Model Selection:

Choose an appropriate deep learning architecture for your time-series data. Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTMs), or even more recent models like Transformers can be considered. Experiment with different model architectures and hyperparameters to find the one that works best for your dataset.


As an introductory article, we won't get into these specific models. Instead, from the ground up, we will build our own model so the reasons we do any step can be easily explained and understood.
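
Before wiring it into cells, here is the shape of the thing we are about to build, sketched in Python: a handful of input features, one hidden layer of weighted sums passed through ReLU, and a three-way softmax output read as "probability that tomorrow's best bucket is short vol / cash / long vol." The layer sizes and random starting weights are placeholders of my choosing; in the spreadsheet, the weights live in cells and Solver will tune them.

    import numpy as np

    rng = np.random.default_rng(42)
    n_features, n_hidden, n_outputs = 4, 5, 3

    W1 = rng.normal(0, 0.1, size=(n_features, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, size=(n_hidden, n_outputs))
    b2 = np.zeros(n_outputs)

    def forward(features):
        hidden = np.maximum(0.0, features @ W1 + b1)    # ReLU hidden layer
        logits = hidden @ W2 + b2
        e = np.exp(logits - logits.max())               # softmax output layer
        return e / e.sum()

    todays_features = np.array([0.3, -0.1, 0.7, 0.2])   # placeholder inputs
    print(forward(todays_features))                      # three probabilities summing to 1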


6) Model Training:

Train your selected model on the training data using the historical time-series and feature data. Utilize the training process we discussed earlier, including backpropagation, gradient descent, and regularization techniques to optimize your model.


Directly performing backpropagation in Excel seems to me like a fancy way of saying "use the built-in Excel Solver." (If you do want to learn backpropagation, check out How to Code a Neural Network with Backpropagation In Python (from scratch) - MachineLearningMastery.com.)


Otherwise...


Go to the Data tab (if you haven't used Solver before, you need to go to File, Options, Add-ins, and ensure Solver is enabled as an add-in), click Solver, then select your weighting and bias cells to be optimized. By the way, Solver uses derivatives to solve as well, so it is a kind of backpropagation.
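
If it helps to see what Solver is doing conceptually: it packs your weight and bias cells into one vector and searches for the values that minimize your error cell. A rough Python analogue of that same idea, using scipy's general-purpose minimizer and a stand-in objective, looks like this (the fake data and tiny logistic "network" are only there to make it runnable):

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)
    features = rng.normal(size=(250, 4))                             # stand-in training inputs
    targets = (features[:, 0] - features[:, 1] > 0).astype(float)    # stand-in labels

    def objective(params):
        w, b = params[:4], params[4]
        preds = 1.0 / (1.0 + np.exp(-(features @ w + b)))    # a tiny logistic "network"
        return np.mean((preds - targets) ** 2)               # this plays the role of the error cell

    result = minimize(objective, x0=np.zeros(5), method="L-BFGS-B")
    print(result.fun, result.x)     # minimized error and the fitted weights/bias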


Now your "training" is optimizing via excel solver:

Over that 9 year period the portfolios grew from $10,000.00 a piece to $56,697.41 for the Actively Traded Portfolio we built and only $16,957.12 for the S&P 500.


At first glance this is a pretty smart AI Model!


Don't get your hopes up, nothing is ever that easy.


7) Model Evaluation:

Assess your model's performance using the validation dataset. Common evaluation metrics for financial data include Sharpe ratio, mean squared error (MSE), or accuracy if you're using classification. Pay attention to potential issues like overfitting and ensure that the model generalizes well.
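If you'd rather see the metrics as live cells than eyeball a chart, both are one-liners. The ranges below are placeholders for whatever your validation block turns out to be; here I assume the daily strategy returns for that window sit in S2300:S4560, with predicted returns in column X and realized returns in column Y over the same rows.

Annualized Sharpe ratio (ignoring the risk-free rate):
=SQRT(252)*AVERAGE(S2300:S4560)/STDEV.S(S2300:S4560)

Mean squared error:
=SUMPRODUCT((X2300:X4560-Y2300:Y4560)^2)/ROWS(Y2300:Y4560)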


Let's take a look at nine more years under the same assumptions as the first nine:

Well... if the goal was to beat the market, then good job. If the goal was to live a stress-free life, not such a good job.


That's no problem. That's why we train, validate, retrain, and back-test. The risk level is too high in this model. That's not the model's fault; I set it up that way to show visually why optimizing on total return alone will likely not give you what you want.


You end up with an overfit model.


To incorporate risk level into your model, I would suggest measuring the maximum drawdown. The peak-to-trough decline in value is what you actually feel. If you prefer the Sharpe ratio or some other measure, then go ahead. I also don't like how the model decided to simply never trade long volatility, so I added a constraint requiring the product of all returns over that 18-year period to be greater than 100%. If we're trading volatility, I want the protection long volatility provides, in brief spikes, during times of crisis.
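Maximum drawdown is easy to wire into the workbook. With the running portfolio value in column T (as in the earlier sketch), a helper column measures the decline from the running peak and one cell summarizes the worst of it; the -30% floor below is purely illustrative.

Drawdown helper in U2, filled down:
=T2/MAX($T$1:T2)-1

Maximum drawdown:
=MIN(U2:U4560)

Then use Solver's Add Constraint dialog to require the maximum-drawdown cell to be >= -0.3 (or whatever floor you can actually stomach) alongside the return objective.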



Now that might be tough to see, but there's the tail-hedge appeal of these products: UVXY spiked almost 800% during the COVID scare.


But I want the tail-hedge properties and the bull market. I want my cake and I want to eat it.


Remember, each portfolio started at $10,000.00.

This is not fully optimized.


The more you reduce error in the training and validation periods, the more error you will likely add in the testing stage.


Since I know you're curious, here is one more pass of the optimization:


My apologies that the sizing of the graphs keeps changing. I'm building this spreadsheet at the same time as I'm writing this article.


8) Model Testing:

After satisfactory performance on the validation set, evaluate your model on the test dataset to ensure it can make accurate predictions on unseen data.


***I'll come back to this step in March when there's sufficient data***

Maybe consider signing up to be notified of new posts.


9) Portfolio Construction:

Once you have a well-performing model, you can use its predictions to construct a portfolio. This can involve allocating funds to different assets or strategies based on the model's recommendations.
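As a toy example of turning the daily prediction into an allocation, with the prediction in R2 and thresholds that are entirely mine to pick apart:

=IF(R2>=0.6,"Short volatility ETP",IF(R2<=0.4,"Long volatility ETP (e.g. UVXY)","Cash"))

The middle band keeps the portfolio flat on days when the model isn't confident either way.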


10) Risk Management:

Implement risk management strategies to protect your investments. Diversify your portfolio, set stop-loss orders, and establish rules for position sizing.
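Both of those can be single formulas too. Assuming the ETP's daily returns are in column L and the drawdown helper from earlier is in column U, with a 10% annualized volatility target (all of these numbers are assumptions to tweak), place these in row 23 so that 21 prior returns exist, then fill down.

Position size as a fraction of the portfolio, capped at 100%:
=MIN(1, 0.1/(SQRT(252)*STDEV.S(L2:L22)))

Kill switch that overrides the model after a 15% drawdown:
=IF(U22<=-0.15,"Go flat","Follow the model")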


11) Continuous Monitoring and Refinement:

The financial markets are dynamic and subject to changing conditions. Continuously monitor the model's performance and adapt it as needed. This may include retraining it with more recent data or adjusting its parameters.


12) Back-testing:

Before implementing your model in a real-world scenario, perform back-testing by applying the model to historical data to see how it would have performed in the past. This can help you understand its strengths and weaknesses.


Please remember that investing in financial markets carries inherent risks, and there are no guarantees of success. Additionally, deep learning models for financial predictions are complex and require expertise in both finance and machine learning. It's crucial to conduct thorough research, seek professional advice, and practice risk management when making investment decisions.


13) Ongoing Use and Testing of the Model

Monitoring a deep learning model's performance and predictions on a daily basis can be appropriate, especially in dynamic and rapidly changing financial markets. To do this effectively, you can follow a systematic and statistically sound approach:


a) Data Collection and Update:

  • Collect daily updates of financial time-series data, including asset prices, volumes, and relevant metrics.

  • Ensure that your dataset is consistent and includes the latest available data.


b) Daily Model Prediction:

  • Run your deep learning model on the most recent data to generate daily predictions. These predictions may include asset price movements, recommendations for trading positions, or other relevant signals.


c) Performance Metrics:

  • Choose a set of performance metrics that are appropriate for your investment strategy. Common metrics for evaluating the model's daily performance include the following; a couple of example formulas follow this list:

  • Sharpe Ratio: Measures the risk-adjusted return of your portfolio.

  • Maximum Drawdown: Captures the largest peak-to-trough decline in portfolio value.

  • Win-Loss Ratio: Calculates the ratio of profitable trades to losing trades.

  • Accuracy or F1 Score: If your model is used for binary classification (e.g., buy/sell decisions).

  • Mean Squared Error (MSE): If your model predicts price movements.
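A couple of those, as cells. Assume daily strategy returns in S2:S4560 and, for the classification case, a 0-to-1 prediction in column R next to the realized return in column L; the ranges are placeholders.

Win-loss ratio:
=COUNTIF(S2:S4560,">0")/COUNTIF(S2:S4560,"<0")

Directional accuracy for buy/sell style signals:
=SUMPRODUCT(--((R2:R4560>=0.5)=(L2:L4560>0)))/ROWS(L2:L4560)

The Sharpe ratio, maximum drawdown, and MSE cells from earlier in the article can be reused as-is.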


d) Portfolio Simulation:

  • Based on your model's daily predictions, simulate a portfolio by allocating funds to assets according to its recommendations. Track the portfolio's daily performance.

  • Implement proper risk management strategies, such as stop-loss orders, to limit potential losses.


e) Record Keeping:

  • Maintain a comprehensive record of your daily model predictions, portfolio performance, and all relevant metrics. This record should include timestamps, investment decisions, and performance outcomes.


f) Statistical Analysis:

  • Regularly perform statistical analysis on the recorded data to assess the model's performance. This can involve calculating daily returns, risk-adjusted returns, and evaluating the consistency of the model's recommendations (a rolling example follows).
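For the ongoing analysis, a rolling version of the risk-adjusted return is often more telling than one full-sample number. A 63-day (roughly one quarter) rolling Sharpe ratio, assuming daily returns in column S, placed in row 65 of a helper column and filled down:

=SQRT(252)*AVERAGE(S2:S64)/STDEV.S(S2:S64)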


g) Adaptation and Fine-Tuning:

  • If the model's performance deteriorates or if market conditions change significantly, consider retraining the model with the most recent data. This ensures that the model is adaptive to evolving market dynamics.

  • Adjust hyperparameters, update feature engineering, or explore alternative model architectures as needed.


h) Market Events and News:

  • Stay informed about market events, economic news, and geopolitical developments. These external factors can significantly impact financial markets and may necessitate adjustments to your model or portfolio.


i) Backtesting and Model Validation:

  • Continuously validate your model's predictions against historical data to understand how it would have performed in past market conditions. This helps in assessing its robustness.


j) Expert Consultation:

  • Consider consulting with experts in the field of finance and machine learning to gain additional insights and validation for your model and strategy.


Concluding Remarks


This may be the most involved piece of work I've written so far, even with ChatGPT lending a helping hand.


I'll be returning to this in March, when we have 20 years of volatility data to use, and post an update on how this robo-trader would have done.


I truly hope you found value in this content and can apply it to improve areas of your work or life in general.


Thank you for taking the time to support my work.



Written by: Morgan Price. Morgan Price | LinkedIn



I haven't decided whether or not I'll be posting this spreadsheet. If you want to take a look at it, and it isn't posted, reach out.
