Distance to Default (DtD) using the Merton model in R

4 minute read

Published: April 03, 2025

The Merton model, introduced by Robert C. Merton in 1974, conceptualizes a company’s equity as a call option on its assets, with the debt’s face value serving as the strike price. This framework is instrumental in assessing a firm’s credit risk by estimating the probability of default.

In this post I will explain how to calculate the Distance to Default (DtD) using the Merton model in R. The DD measures how many standard deviations the firm’s asset value is away from the default point, providing a summary of the firm’s credit risk.

To implement the Merton DtD model in R, I have drawn inspiration from the SAS code provided by Bharath and Shumway (2008) and the methodologies discussed by Mingze Gao. The process involves estimating the firm’s asset value and its volatility iteratively, then calculating the DtD based on these estimates.

Step 1: Setting Up the Environment

The script starts by clearing the environment and setting options for numerical precision:

rm(list=ls())  # Clear all user-defined objects
options(digits=4, stringsAsFactors = FALSE, scipen=999)  # Set numerical precision

Then, it loads necessary libraries:

library(tidyverse)
library(dplyr)
library(tidyr)
library(roll)

These packages are essential for data manipulation, time series calculations, and rolling-window statistics.

Step 2: Loading and Preparing Data

2.1 Load CRSP/COMPUSTAT Daily Data

The script loads daily stock market data from CRSP/COMPUSTAT and calculates market equity (E) and rolling volatility:

ccm_data <- read.csv("crsp_compustat_daily_sascha.csv") %>%
  mutate(datadate = as.Date(datadate, format = "%d/%m/%Y")) %>%
  mutate(E = cshoc * prccd) %>%  # Market value of equity
  group_by(GVKEY) %>%
  mutate(returns = (prccd - lag(prccd, 1)) / lag(prccd, 1)) %>%
  mutate(sigma_E = roll::roll_sd(returns, width = 252, min_obs = 30))

The sigma_E variable represents the rolling standard deviation of stock returns, which approximates equity volatility.

2.2 Load Quarterly COMPUSTAT Data

Quarterly firm liabilities (debt) are imported and interpolated:

cmp_data <- read.csv("compustat_qtrl_sascha.csv") %>%
  mutate(date = as.Date(datadate, format = "%d/%m/%Y")) %>%
  mutate(F = (dlcq + dlttq) * 1000000) %>%  # Total firm debt
  fill(F, .direction = "down") %>%  # Forward-fill missing values
  replace_na(list(F = 0))  # Set missing values to zero

The debt values are joined with the daily dataset and forward-filled between quarters.

2.3 Load Risk-Free Rate Data

The script loads and processes 3-month Treasury bill rates:

frb_rates_monthly <- read.csv("frb_rates_monthly.csv") %>%
  mutate(date = as.Date(date, format = "%d/%m/%Y"), r = as.numeric(RIFLGFCM03_N.B) / 100) %>%
  fill(r, .direction = "down") %>%
  mutate(r = ifelse(r == 0, 0.01, r))  # Set zero rates to 0.1%

This provides the risk-free interest rate (r), a key input for the Merton model.

2.4 Data Cleaning and Finalization

The dataset is cleaned and filtered:

panel_data <- ccm_data %>%
  filter(!is.na(sigma_E) & !is.na(E) & !is.na(F) & F > 0) %>%
  mutate(E = E/1000000, F = F/1000000)  # Convert values to millions

Step 3: Implementing the Iterative KMV-Merton Model

The Merton model treats equity as a call option on the firm’s assets. The script estimates asset value (V) and volatility (sigma_V) iteratively:

3.1 Defining the Iteration Function

The function compute_group estimates asset values and default probabilities:

compute_group <- function(hist_data) {
  end_obs <- hist_data %>% filter(date == max(date)) %>% slice(1)
  E_end <- end_obs$E; F_val <- end_obs$F; r_val <- end_obs$r; sigma_E_val <- end_obs$sigma_E
  sigma_V <- sigma_E_val * E_end / (E_end + F_val)  # Initial guess

  iter <- 0; tol <- 1e-3; max_iter <- 100
  repeat {
    iter <- iter + 1
    solve_V <- function(E_daily) {
      f_equity <- function(V) {
        d1 <- (log(V/F_val) + (r_val + 0.5*sigma_V^2)) / (sigma_V)
        d2 <- d1 - sigma_V
        V * pnorm(d1) - exp(-r_val) * F_val * pnorm(d2) - E_daily
      }
      uniroot(f_equity, lower=F_val, upper=10*E_daily)$root
    }
    V_series <- map_dbl(hist_data$E, solve_V)
    sigma_V_new <- sqrt(252) * sd(diff(log(V_series)), na.rm=TRUE)
    if (abs(sigma_V_new - sigma_V) < tol || iter >= max_iter) break
    sigma_V <- sigma_V_new
  }

  V_end <- tail(V_series, 1)
  DD_val <- (log(V_end/F_val) + (r_val - 0.5 * sigma_V^2)) / sigma_V
  EDF_val <- 100 * pnorm(-DD_val)
  tibble(V=V_end, sigma_V=sigma_V, DD=DD_val, EDF=EDF_val, iterations=iter)
}

This function estimates:

V: Firm’s asset value
sigma_V: Asset volatility
DD: Distance to Default (DtD)
EDF: Expected Default Frequency

A safe wrapper ensures errors return NA values.

3.2 Applying the Model Firm-by-Firm

The iterative procedure is applied to all firms:

results <- panel_data %>%
  group_by(gvkey) %>%
  arrange(date) %>%
  group_modify(~ {
    tibble(date = unique(.x$date)) %>%
      mutate(calc = map(date, ~ safe_compute_group(filter(.x, date <= . & date >= . - 365)))) %>%
      unnest(calc)
  }) %>%
  ungroup()

The function calculates DtD for each firm, using a 365-day rolling window.

Step 4: Exporting the Results

Finally, the computed results are saved:

write.csv(final_results, "updated_dtd.csv")

This file contains firm-level DtD values over time

Share on

Twitter Facebook LinkedIn

Alessandro Spina