Predicting the NASDAQ 100 with Hyperparameter Tuning

Pressure on the markets has intensified ahead of the Wednesday deadline for trade deals. We will model monthly Nasdaq 100 data with the Federal Funds Effective Rate and the Unemployment Rate. We will use Boosted trees with hyperparameter tuning.

library(tidyverse)
library(tidymodels)
library(modeltime)
library(timetk)
library(tidyquant)

#Unemployment Rate (UNRATE)
df_unrate <- 
  tq_get("UNRATE", get = "economic.data") %>% 
  select(date, unrate = price)
  
#Federal Funds Effective Rate (FEDFUNDS)
df_fedfunds <- 
  tq_get("FEDFUNDS", get = "economic.data") %>% 
  select(date, fedfunds = price)

#Nasdaq 100
df_nasdaq <- 
  tq_get("^NDX") %>% 
  tq_transmute(select = close,
               mutate_fun = to.monthly,
               col_rename = "nasdaq") %>% 
  mutate(date = as.Date(date))


#Merging the datasets
df_merged <- 
  df_unrate %>% 
  left_join(df_fedfunds) %>% 
  left_join(df_nasdaq) %>% 
  drop_na() 


#Split Data 
splits <- 
  time_series_split(
  df_merged,
  assess     = "1 year",
  cumulative = TRUE
)

df_train <- training(splits)
df_test <- testing(splits)



#Recipe
recipe_ml <- 
  recipe(nasdaq ~ ., df_train) %>%
  step_date(date, features = "month", ordinal = FALSE) %>%
  step_dummy(all_nominal_predictors(), one_hot = TRUE) %>% 
  step_mutate(date_num = as.numeric(date)) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_rm(date)

#Model spec
mod_spec <- 
  boost_tree(trees = tune(),
             min_n = tune(),
             tree_depth = tune(),
             learn_rate = tune()) %>%
  set_engine("xgboost") %>% 
  set_mode("regression")

#Hyperparameter Tuning
mod_param <- extract_parameter_set_dials(mod_spec)

set.seed(1234)
model_tbl <- 
  mod_param %>% 
  grid_random(size = 50) %>%
  create_model_grid(
    f_model_spec = boost_tree,
    engine_name  = "xgboost",
    mode         = "regression"
  )

#Extracting the model list
model_list <- model_tbl$.models

#Workflowsets
model_wfset <- 
  workflow_set(
  preproc = list(recipe_ml),
  models = model_list, 
  cross = TRUE
)


#Fitting Using Parallel Backend
model_parallel_tbl <- 
  model_wfset %>%
  modeltime_fit_workflowset(
    data    = df_train,
    control = control_fit_workflowset(
      verbose   = TRUE,
      allow_par = TRUE
    )
  )



#Accuracy 
model_parallel_tbl %>% 
  modeltime_calibrate(new_data = df_test) %>% 
  modeltime_accuracy() %>%
  table_modeltime_accuracy()



#Calibration to the test set for the best model
calibration_tbl <- 
  model_parallel_tbl %>%
  filter(.model_desc == "RECIPE_BOOST_TREE_27") %>% 
  modeltime_calibrate(df_test)
  


#Prediction Intervals
calibration_tbl %>% 
  modeltime_forecast(new_data = df_test, 
                     actual_data = df_merged %>%
                                   filter(date >= as.Date("2024-07-01"))) %>%
  plot_modeltime_forecast(.interactive = FALSE,
                          .legend_show = FALSE,
                          .line_size = 1.5,
                          .color_lab = "",
                          .title = "NASDAQ 100") +
  
  labs(subtitle = "<span style = 'color:dimgrey;'>Predictive Intervals</span><br><span style = 'color:red;'>ML Model</span>") + 
  scale_x_date(expand = expansion(mult = c(.1, .15)),
               labels = scales::label_date(format = "%b'%y")) +
  scale_y_continuous(labels = scales::label_currency()) +
  theme_minimal(base_family = "Roboto Slab", base_size = 20) +
  theme(legend.position = "none",
        plot.background = element_rect(fill = "azure", 
                                       color = "azure"),
        plot.title = element_text(face = "bold"),
        axis.text = element_text(face = "bold"),
        plot.subtitle = ggtext::element_markdown(face = "bold"))

According to the model, it would be better to wait until tariff uncertainty comes to an end to enter the market.

One response to “Predicting the NASDAQ 100 with Hyperparameter Tuning”

  1. Foma Ovolevor Avatar

    Nice work….The libraries should include tidyquant package inorder for the function tq_gt(
    ) to run in the code.

    Liked by 1 person

Leave a reply to Foma Ovolevor Cancel reply

I’m Selcuk Disci

Welcome to DataGeeek.com, dedicated to data science and machine learning with R, mostly based on financial data.

Let’s connect