Predicting Gold Prices: Backtesting of ML Models – DataGeeek

Economics, finance, ML, R, Time Series Machine Learning

Published by Selcuk Disci

Fitch projects that gold prices will decline by about 30% in 2026. The easing of the trade war and of the Israel-Iran conflict may support this view. We will forecast where prices could go by the end of the year.

We will use the modeltime.resample package to backtest the forecasting models.
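
To make the backtesting idea concrete, here is a minimal base-R sketch of how rolling-origin slices can be laid out, starting from the most recent observation and stepping backward. The helper `rolling_origin_slices()` is hypothetical, and the row counts assume roughly 252 trading days per year; it only illustrates the idea and is not the implementation used by `time_series_cv()`.

```r
# Hypothetical helper: build rolling-origin slices backward from the newest row.
# n           = number of observations
# initial     = rows in each training window
# assess      = rows in each test (assessment) window
# skip        = rows to step back between consecutive slices
# slice_limit = maximum number of slices to return
rolling_origin_slices <- function(n, initial, assess, skip, slice_limit) {
  slices   <- list()
  test_end <- n
  while (length(slices) < slice_limit) {
    test_start  <- test_end - assess + 1
    train_end   <- test_start - 1
    train_start <- train_end - initial + 1
    if (train_start < 1) break  # not enough history for a full training window
    slices[[length(slices) + 1]] <- list(
      train = c(train_start, train_end),
      test  = c(test_start, test_end)
    )
    test_end <- test_end - skip
  }
  slices
}

# Assumed row counts: ~10 years of data, 5-year training window,
# 6-month test window, 1-year step, at most 4 slices
slices <- rolling_origin_slices(
  n = 2520, initial = 1260, assess = 126, skip = 252, slice_limit = 4
)

# One row per slice: training range followed by test range
do.call(rbind, lapply(slices, function(s) c(train = s$train, test = s$test)))
```

Each slice trains on a fixed-length window that ends just before its test window, so every model is scored only on data it has not seen.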

library(tidymodels)
library(modeltime)
library(modeltime.resample)
library(tidyverse)
library(tidyquant)
library(timetk)

#Gold Futures
df_gold <- 
  tq_get("GC=F") %>% 
  select(date, close) %>% 
  drop_na()


#Make a Cross-Validation Training Plan
resamples_tscv <- 
  time_series_cv(
  data        = df_gold,
  assess      = "6 months",
  initial     = "5 years",
  skip        = "1 years",
  slice_limit = 4
)

#Visualize the Cross-Validation Plan
resamples_tscv %>%
  tk_time_series_cv_plan() %>%
  plot_time_series_cv_plan(date,
                           close, 
                           .facet_ncol = 2, 
                           .interactive = FALSE)


#Model 1: auto_arima
model_arima <- 
  arima_reg() %>%
  set_engine(engine = "auto_arima") %>% 
  fit(close ~ date, data = df_gold)

#Model 2: prophet
model_prophet <- 
  prophet_reg() %>%
  set_engine(engine = "prophet") %>% 
  fit(close ~ date, data = df_gold)

#Model 3: glmnet
model_glmnet <- 
  linear_reg(penalty = 0.2) %>%
  set_engine("glmnet")

rec_glmnet <- 
  recipe(close ~ ., data = df_gold) %>% 
  step_mutate(date_num = as.numeric(date)) %>% 
  step_date(date, features = "month") %>% 
  step_rm(date) %>% 
  step_dummy(all_nominal_predictors(), one_hot = TRUE) %>% 
  step_normalize(all_numeric_predictors()) 

glmnet_fit <- 
  workflow() %>% 
  add_recipe(rec_glmnet) %>% 
  add_model(model_glmnet) %>% 
  fit(df_gold)
  
#Modeltime Table
gold_models <- 
  modeltime_table(
    model_arima,
    model_prophet,
    glmnet_fit
  )

#Generate Resample Predictions
resamples_fitted <- 
  gold_models %>%
  modeltime_fit_resamples(
    resamples = resamples_tscv,
    control   = control_resamples(verbose = FALSE)
  )

#Accuracy Table
resamples_fitted %>%
  modeltime_resample_accuracy(summary_fns = mean) %>%
  table_modeltime_accuracy(.interactive = FALSE)


#Calibration for the Prophet Model (in-sample: uses the same data the model was fit on)
calibration_prophet <- 
  model_prophet %>% 
  modeltime_calibrate(new_data = df_gold)

#In-sample accuracy of the selected Prophet model
calibration_prophet %>%
  modeltime_accuracy(metric_set = metric_set(rmse, rsq, mape))


#Forecast Forward
calibration_prophet %>% 
  modeltime_forecast(h = "6 months", 
                     actual_data = df_gold %>% 
                                   filter(date >= as.Date("2025-01-01"))) %>%
  plot_modeltime_forecast(.interactive = FALSE,
                          .legend_show = FALSE,
                          .line_size = 1.5,
                          .color_lab = "",
                          .title = "Gold Futures") +
  labs(subtitle = "<span style = 'color:dimgrey;'>Predictive Intervals</span><br><span style = 'color:red;'>ML Model</span>") + 
  scale_x_date(expand = expansion(mult = c(.1, .1)),
               labels = scales::label_date(format = "%b'%y")) +
  scale_y_continuous(labels = scales::label_currency()) +
  theme_minimal(base_family = "Roboto Slab", base_size = 20) +
  theme(legend.position = "none",
        plot.background = element_rect(fill = "azure", 
                                       color = "azure"),
        plot.title = element_text(face = "bold"),
        axis.text = element_text(face = "bold"),
        plot.subtitle = ggtext::element_markdown(face = "bold"))
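
The accuracy step above requests rmse, rsq, and mape. As a rough reminder of what these numbers mean, the sketch below computes them by hand in base R. The `actual` and `predicted` vectors are made-up toy values, not gold data, and the formulas assume the yardstick conventions as we understand them: rsq as the squared correlation and mape expressed in percent.

```r
# Toy values for illustration only (not real gold prices)
actual    <- c(100, 110, 120, 130)
predicted <- c(102, 108, 123, 127)

# Root mean squared error: typical error size, in the units of the series
rmse_val <- sqrt(mean((actual - predicted)^2))

# Mean absolute percentage error, in percent
mape_val <- mean(abs((actual - predicted) / actual)) * 100

# R-squared as the squared correlation between actual and predicted values
rsq_val <- cor(actual, predicted)^2

round(c(rmse = rmse_val, mape = mape_val, rsq = rsq_val), 3)
```

Lower rmse and mape indicate a closer fit, while rsq closer to 1 indicates that predictions move together with the actual values.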

According to the Prophet model, the gold price appears unlikely to return to its peak by the end of the year.

2 responses to “Predicting Gold Prices: Backtesting of ML Models”

Alan

June 30, 2025

Hi Selcuk, I think your code is missing the tidyquant library.

Keep up the great work 🙂

Alan

1. Selcuk Disci
  
  June 30, 2025
  
  Thank you.
  