Differential Machine Learning with Twin Networks in R: Bitcoin Forecasting with Volatility Proxies

[This article was first published on DataGeeek, and kindly contributed to R-bloggers].

Introduction

Differential machine learning (DML), as described in the recent arXiv article "Differential machine learning for 0DTE options with stochastic volatility and jumps", extends supervised learning by incorporating not only the values of functions but also their derivatives. In financial contexts, this often means sensitivities such as the Greeks. However, when direct derivatives are not available, we can approximate market dynamics using volatility indicators.

In this project, we adapt DML to Bitcoin price prediction. Instead of derivatives, we use RSI, MACD and Bollinger Bands as volatility proxies. These indicators capture momentum, trend strength and price dispersion, providing a practical way to incorporate uncertainty into the learning process. To implement this, we design a twin-network architecture in Keras: one network learns price dynamics from temporal features, while the other learns volatility signals. Finally, we combine them via a stacking ensemble to obtain robust forecasts with confidence intervals.

Why volatility proxies rather than derivatives?

  • RSI (Relative Strength Index): Measures momentum and overbought/oversold conditions.
  • MACD (Moving Average Convergence Divergence): Captures the direction and strength of the trend.
  • Bollinger Bands (upper/lower bands, %B): Quantify price dispersion and volatility.

These indicators act as empirical substitutes for theoretical derivatives. Although DML in its pure form requires sensitivities, in practice these volatility indicators provide similar information about how prices respond to market forces.
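As a rough illustration of how these three proxies can be computed, here is a minimal base-R sketch. The helper names (`sma`, `ema`, `rsi`, `macd`, `bollinger`), the window lengths (14, 12/26/9, and 20 with 2 standard deviations) and the simulated price series are all assumptions for illustration, not the settings used in this project. The RSI here uses simple averages rather than Wilder's smoothing, so its values will differ somewhat from packaged implementations such as those in TTR.

```r
# Trailing simple moving average (NA until the window is full)
sma <- function(x, n) as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))

# Exponential moving average, seeded with the first observation
ema <- function(x, n) {
  alpha <- 2 / (n + 1)
  out <- numeric(length(x))
  out[1] <- x[1]
  for (i in seq_along(x)[-1]) {
    out[i] <- alpha * x[i] + (1 - alpha) * out[i - 1]
  }
  out
}

# RSI with simple averages of gains and losses (not Wilder's smoothing)
rsi <- function(x, n = 14) {
  d <- c(NA, diff(x))
  avg_gain <- sma(pmax(d, 0), n)
  avg_loss <- sma(pmax(-d, 0), n)
  100 - 100 / (1 + avg_gain / avg_loss)
}

# MACD line, signal line, and histogram
macd <- function(x, n_fast = 12, n_slow = 26, n_sig = 9) {
  line <- ema(x, n_fast) - ema(x, n_slow)
  signal <- ema(line, n_sig)
  data.frame(macd = line, signal = signal, hist = line - signal)
}

# Bollinger Bands and %B from a rolling mean and rolling standard deviation
bollinger <- function(x, n = 20, k = 2) {
  mid <- sma(x, n)
  roll_sd <- c(rep(NA, n - 1), apply(embed(x, n), 1, sd))
  upper <- mid + k * roll_sd
  lower <- mid - k * roll_sd
  data.frame(lower = lower, mid = mid, upper = upper,
             pct_b = (x - lower) / (upper - lower))
}

set.seed(1)
price <- 100 * exp(cumsum(rnorm(200, sd = 0.02)))  # simulated prices, not BTC data

features <- cbind(
  data.frame(price = price, rsi = rsi(price)),
  macd(price),
  bollinger(price)
)
tail(features, 3)
```

The leading rows contain NA values until each rolling window is full, so they would need to be dropped before the features are fed to a network.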

Why twin networks?

The idea is to separate the learning tasks:

  • The main network models the continuous component of the pricing process.
  • The auxiliary network models the volatility/jump component.

Together they mimic the decomposition found in stochastic models such as Bates or Heston, but implemented in a flexible neural framework.

Ensembling via Stacking

Once the two networks are trained, their predictions are combined using a linear regression meta-model. This stacking ensemble learns the optimal weighting between the main and auxiliary outputs. The result is a forecast that integrates both trend and volatility signals, significantly improving accuracy compared to either network alone.
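The stacking step can be sketched in a few lines of base R. Everything below is an assumption made for illustration: the data are simulated, and `pred_main` and `pred_aux` are hypothetical stand-ins for the two networks' outputs, deliberately made informative but badly scaled so that the effect of the meta-model is visible.

```r
set.seed(42)
n <- 300
actual <- 60000 + cumsum(rnorm(n, sd = 500))   # simulated target, not BTC data

# Stand-ins for the two networks' outputs: each is informative but
# badly scaled and noisy on its own (an assumption for illustration)
pred_main <- 0.01 * actual + rnorm(n, sd = 5)
pred_aux  <- 0.02 * actual + rnorm(n, sd = 20)

train <- 1:200
test  <- 201:n
stack_df <- data.frame(actual, pred_main, pred_aux)

# Meta-model: linear regression on the base predictions, fit on the training rows
meta <- lm(actual ~ pred_main + pred_aux, data = stack_df[train, ])
stacked <- predict(meta, newdata = stack_df[test, ])

rmse <- function(truth, estimate) sqrt(mean((truth - estimate)^2))
c(main    = rmse(actual[test], pred_main[test]),
  aux     = rmse(actual[test], pred_aux[test]),
  stacked = rmse(actual[test], stacked))
```

The meta-model is fit on the training rows only and scored on held-out rows, so the comparison is not flattered by in-sample fitting.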

Evaluation

  • Metrics: RMSE and MAPE, calculated with the yardstick package.
  • Results:
    • Individual networks → RMSE ~76,000, MAPE ~99%.
    • Stacking ensemble → RMSE ~3,030, MAPE ~3.65%.

The stacking ensemble cuts RMSE by roughly a factor of 25 relative to either network alone, which shows the value of combining price and volatility signals in a unified framework.
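For reference, the two metrics follow the standard definitions below, where $y_t$ is the actual price, $\hat{y}_t$ the forecast and $n$ the number of test observations (the symbols are notation introduced here). MAPE is expressed in percent, which is how yardstick reports it.

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2}, \qquad \mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right|$$

Read this way, a MAPE near 99% means the typical absolute error is about as large as the price itself, while 3.65% corresponds to an average miss of a few percent of the price.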

Confidence intervals

To quantify the uncertainty, we calculate confidence intervals based on residuals around point forecasts:

$$\hat{y}_t \pm 1.96\,\sigma_{\text{residual}}$$

This approach uses the standard deviation of the training residuals to generate 95% confidence bands. It provides interpretable uncertainty estimates without requiring explicit probabilistic modeling.
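A minimal base-R sketch of this calculation follows. The data are simulated and the variable names (`fitted`, `sigma_resid`, `forecast`) are hypothetical. The 1.96 multiplier implicitly assumes that residuals are roughly normal with constant variance, which may not hold for cryptocurrency prices.

```r
set.seed(7)
n <- 120
actual <- 50000 + cumsum(rnorm(n, sd = 400))   # simulated prices, not BTC data
fitted <- actual + rnorm(n, sd = 800)          # stand-in for point forecasts

train <- 1:90
test  <- 91:n

# Standard deviation of the training residuals
sigma_resid <- sd(actual[train] - fitted[train])

# 95% band around each test-period point forecast
forecast <- data.frame(
  t     = test,
  y_hat = fitted[test],
  lower = fitted[test] - 1.96 * sigma_resid,
  upper = fitted[test] + 1.96 * sigma_resid
)

# Share of actual test values that fall inside the band
coverage <- mean(actual[test] >= forecast$lower & actual[test] <= forecast$upper)
coverage
```

Checking the empirical coverage on held-out data is a quick sanity test: a value far below 0.95 would suggest the band is too narrow for the test period.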

Visualization

The forecasts are visualized with ggplot2:

  • Gray ribbon → confidence intervals.
  • Red line → stacking ensemble forecast.
  • Black line → actual BTC prices.

This design clearly communicates both the central forecast and the uncertainty range. The chart at the end of the post shows exactly this: a red forecast line, black actuals, and a gray confidence band, illustrating how the ensemble incorporates volatility information into the predictive intervals.
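A plot with this layout could be built roughly as follows. The data frame `plot_df`, its column names and the residual standard deviation of 800 are assumptions for illustration, and the series is simulated rather than real BTC data.

```r
library(ggplot2)

set.seed(7)
n <- 60
plot_df <- data.frame(
  date   = seq(as.Date("2024-01-01"), by = "day", length.out = n),
  actual = 50000 + cumsum(rnorm(n, sd = 400))   # simulated, not BTC data
)
plot_df$forecast <- plot_df$actual + rnorm(n, sd = 800)
sigma_resid <- 800                               # assumed residual sd
plot_df$lower <- plot_df$forecast - 1.96 * sigma_resid
plot_df$upper <- plot_df$forecast + 1.96 * sigma_resid

p <- ggplot(plot_df, aes(x = date)) +
  geom_ribbon(aes(ymin = lower, ymax = upper), fill = "grey70", alpha = 0.5) +
  geom_line(aes(y = forecast), colour = "red") +
  geom_line(aes(y = actual), colour = "black") +
  labs(x = NULL, y = "Price (USD)",
       title = "Stacked forecast vs. actual (simulated data)")
p
```

Drawing the ribbon first keeps both lines visible on top of the band.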

Keras3 in R: Flexible Deep Learning for Financial Forecasting

What is Keras3?

Keras3 is the modern R interface to the Keras deep learning library, used here with TensorFlow as the backend. It allows R users to define, train, and evaluate neural networks with concise syntax while leveraging the computational power of TensorFlow. Unlike the earlier R interface, Keras3 tracks Keras 3 and works with TensorFlow 2.x, which should help with long-term support and compatibility.

How we used Keras3

In our workflow, Keras3 was the backbone of the twin-network implementation.
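A minimal sketch of what such a twin setup might look like with the keras3 package is shown below. The layer sizes, learning rate, epoch count and feature matrices are assumptions made for illustration, `build_net` is a hypothetical helper, and the data are random noise rather than BTC features. Running it requires a working Keras backend such as TensorFlow.

```r
library(keras3)

# Small feed-forward regressor; layer sizes are illustrative assumptions
build_net <- function(n_features, units = c(32, 16)) {
  model <- keras_model_sequential(input_shape = n_features) |>
    layer_dense(units = units[1], activation = "relu") |>
    layer_dense(units = units[2], activation = "relu") |>
    layer_dense(units = 1)                     # linear output for regression
  model |> compile(optimizer = optimizer_adam(learning_rate = 0.001),
                   loss = "mse")
  model
}

set.seed(1)
n <- 200
x_time <- matrix(rnorm(n * 4), ncol = 4)  # stand-in for temporal features
x_vol  <- matrix(rnorm(n * 5), ncol = 5)  # stand-in for RSI/MACD/Bollinger features
y      <- rnorm(n)                        # stand-in for a scaled price target

main_net <- build_net(ncol(x_time))       # learns from temporal features
aux_net  <- build_net(ncol(x_vol))        # learns from volatility proxies

main_net |> fit(x_time, y, epochs = 5, batch_size = 32, verbose = 0)
aux_net  |> fit(x_vol,  y, epochs = 5, batch_size = 32, verbose = 0)

# One prediction per row from each network, ready for a stacking meta-model
pred_main <- as.numeric(predict(main_net, x_time, verbose = 0))
pred_aux  <- as.numeric(predict(aux_net,  x_vol,  verbose = 0))
```

Both networks share the same structure and differ only in their inputs, which keeps the comparison between the price and volatility branches straightforward.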

Why ReLU?

  • ReLU (Rectified Linear Unit) is the activation function used in hidden layers.
  • Formula: ReLU(x) = max(0, x).
  • Benefits:
    • Introduces nonlinearity, allowing the network to learn complex relationships.
    • Computationally efficient and helps mitigate vanishing gradients.
    • Well suited to financial data where signals can be sparse and directional.

Why Adam?

  • Adam (adaptive moment estimation) is the chosen optimizer.
  • Combines momentum (using past gradients to speed up learning) with adaptive learning rates (a separate step size for each parameter).
  • Benefits:
    • Robust for noisy and non-stationary data such as cryptocurrency prices.
    • Requires minimal tuning, making it ideal for plug-and-play workflows.
    • Widely adopted in academic and applied machine learning.
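To make the momentum and adaptive step size concrete, here is a small base-R sketch of the standard Adam update applied to a toy one-parameter problem. The function name `adam_step`, the toy objective and the learning rate of 0.1 are assumptions for illustration; in practice Keras handles these updates internally.

```r
# One Adam update for a parameter vector; defaults follow common conventions
adam_step <- function(theta, grad, state,
                      lr = 0.001, beta1 = 0.9, beta2 = 0.999, eps = 1e-8) {
  state$t <- state$t + 1
  state$m <- beta1 * state$m + (1 - beta1) * grad      # momentum (first moment)
  state$v <- beta2 * state$v + (1 - beta2) * grad^2    # squared-gradient average
  m_hat <- state$m / (1 - beta1^state$t)               # bias correction
  v_hat <- state$v / (1 - beta2^state$t)
  theta <- theta - lr * m_hat / (sqrt(v_hat) + eps)    # per-parameter step size
  list(theta = theta, state = state)
}

# Toy problem: minimise f(theta) = (theta - 3)^2, with gradient 2 * (theta - 3)
theta <- 0
state <- list(t = 0, m = 0, v = 0)
for (i in 1:500) {
  step  <- adam_step(theta, grad = 2 * (theta - 3), state, lr = 0.1)
  theta <- step$theta
  state <- step$state
}
theta
```

Dividing by the square root of the squared-gradient average is what scales the step for each parameter, so parameters with consistently large gradients take proportionally smaller steps.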

Contribution to the R ecosystem

Keras3 bridges the gap between R's tidyverse/tidymodels ecosystem and modern deep learning:

  • Integrates seamlessly into data preprocessing pipelines (recipes, timetk).
  • Allows financial analysts and data scientists to stay in R while accessing the deep learning capabilities of TensorFlow.
  • Encourages reproducibility: models can be defined, trained and evaluated entirely in R, without switching to Python.
  • Expands the role of R beyond traditional statistical modeling into cutting-edge AI applications.

Why it matters to DML

Using Keras3:

  • We could separate learning tasks into a primary network (trend/seasonality) and an auxiliary network (volatility/momentum).
  • Both networks were trained with ReLU activations and the Adam optimizer, which helped keep training stable and efficient.
  • Their outputs were combined in a stacking ensemble, producing forecasts that incorporate both price dynamics and volatility signals.

This demonstrates how Keras3 allows R users to implement advanced architectures such as twin networks, making differential machine learning concepts practical in financial forecasting.

Conclusion

This case study shows how differential machine learning concepts can be adapted for financial forecasting in R:

  • Volatility indicators are convenient substitutes for derivatives.
  • A twin-network architecture built in Keras captures both trends and volatility.
  • Ensemble stacking significantly improves predictive performance.
  • Confidence intervals based on residuals provide interpretable uncertainty estimates.

By combining academic ideas with reproducible R workflows, we can create robust forecasting pipelines that connect theory and practice.

