Distance to Default (DtD) using the Merton model in R
Published:
The Merton model, introduced by Robert C. Merton in 1974, conceptualizes a company’s equity as a call option on its assets, with the debt’s face value serving as the strike price. This framework is instrumental in assessing a firm’s credit risk by estimating the probability of default.
In this post I will explain how to calculate the Distance to Default (DtD) using the Merton model in R. The DD measures how many standard deviations the firm’s asset value is away from the default point, providing a summary of the firm’s credit risk.
To implement the Merton DtD model in R, I have drawn inspiration from the SAS code provided by Bharath and Shumway (2008) and the methodologies discussed by Mingze Gao. The process involves estimating the firm’s asset value and its volatility iteratively, then calculating the DtD based on these estimates.
Step 1: Setting Up the Environment
The script starts by clearing the environment and setting options for numerical precision:
rm(list=ls()) # Clear all user-defined objects
options(digits=4, stringsAsFactors = FALSE, scipen=999) # Set numerical precision
Then, it loads necessary libraries:
library(tidyverse)
library(dplyr)
library(tidyr)
library(roll)
These packages are essential for data manipulation, time series calculations, and rolling-window statistics.
Step 2: Loading and Preparing Data
2.1 Load CRSP/COMPUSTAT Daily Data
The script loads daily stock market data from CRSP/COMPUSTAT and calculates market equity (E) and rolling volatility:
ccm_data <- read.csv("crsp_compustat_daily_sascha.csv") %>%
mutate(datadate = as.Date(datadate, format = "%d/%m/%Y")) %>%
mutate(E = cshoc * prccd) %>% # Market value of equity
group_by(GVKEY) %>%
mutate(returns = (prccd - lag(prccd, 1)) / lag(prccd, 1)) %>%
mutate(sigma_E = roll::roll_sd(returns, width = 252, min_obs = 30))
The sigma_E
variable represents the rolling standard deviation of stock returns, which approximates equity volatility.
2.2 Load Quarterly COMPUSTAT Data
Quarterly firm liabilities (debt) are imported and interpolated:
cmp_data <- read.csv("compustat_qtrl_sascha.csv") %>%
mutate(date = as.Date(datadate, format = "%d/%m/%Y")) %>%
mutate(F = (dlcq + dlttq) * 1000000) %>% # Total firm debt
fill(F, .direction = "down") %>% # Forward-fill missing values
replace_na(list(F = 0)) # Set missing values to zero
The debt values are joined with the daily dataset and forward-filled between quarters.
2.3 Load Risk-Free Rate Data
The script loads and processes 3-month Treasury bill rates:
frb_rates_monthly <- read.csv("frb_rates_monthly.csv") %>%
mutate(date = as.Date(date, format = "%d/%m/%Y"), r = as.numeric(RIFLGFCM03_N.B) / 100) %>%
fill(r, .direction = "down") %>%
mutate(r = ifelse(r == 0, 0.01, r)) # Set zero rates to 0.1%
This provides the risk-free interest rate (r), a key input for the Merton model.
2.4 Data Cleaning and Finalization
The dataset is cleaned and filtered:
panel_data <- ccm_data %>%
filter(!is.na(sigma_E) & !is.na(E) & !is.na(F) & F > 0) %>%
mutate(E = E/1000000, F = F/1000000) # Convert values to millions
Step 3: Implementing the Iterative KMV-Merton Model
The Merton model treats equity as a call option on the firm’s assets. The script estimates asset value (V) and volatility (sigma_V) iteratively:
3.1 Defining the Iteration Function
The function compute_group
estimates asset values and default probabilities:
compute_group <- function(hist_data) {
end_obs <- hist_data %>% filter(date == max(date)) %>% slice(1)
E_end <- end_obs$E; F_val <- end_obs$F; r_val <- end_obs$r; sigma_E_val <- end_obs$sigma_E
sigma_V <- sigma_E_val * E_end / (E_end + F_val) # Initial guess
iter <- 0; tol <- 1e-3; max_iter <- 100
repeat {
iter <- iter + 1
solve_V <- function(E_daily) {
f_equity <- function(V) {
d1 <- (log(V/F_val) + (r_val + 0.5*sigma_V^2)) / (sigma_V)
d2 <- d1 - sigma_V
V * pnorm(d1) - exp(-r_val) * F_val * pnorm(d2) - E_daily
}
uniroot(f_equity, lower=F_val, upper=10*E_daily)$root
}
V_series <- map_dbl(hist_data$E, solve_V)
sigma_V_new <- sqrt(252) * sd(diff(log(V_series)), na.rm=TRUE)
if (abs(sigma_V_new - sigma_V) < tol || iter >= max_iter) break
sigma_V <- sigma_V_new
}
V_end <- tail(V_series, 1)
DD_val <- (log(V_end/F_val) + (r_val - 0.5 * sigma_V^2)) / sigma_V
EDF_val <- 100 * pnorm(-DD_val)
tibble(V=V_end, sigma_V=sigma_V, DD=DD_val, EDF=EDF_val, iterations=iter)
}
This function estimates:
V
: Firm’s asset valuesigma_V
: Asset volatilityDD
: Distance to Default (DtD)EDF
: Expected Default Frequency
A safe wrapper ensures errors return NA
values.
3.2 Applying the Model Firm-by-Firm
The iterative procedure is applied to all firms:
results <- panel_data %>%
group_by(gvkey) %>%
arrange(date) %>%
group_modify(~ {
tibble(date = unique(.x$date)) %>%
mutate(calc = map(date, ~ safe_compute_group(filter(.x, date <= . & date >= . - 365)))) %>%
unnest(calc)
}) %>%
ungroup()
The function calculates DtD for each firm, using a 365-day rolling window.
Step 4: Exporting the Results
Finally, the computed results are saved:
write.csv(final_results, "updated_dtd.csv")
This file contains firm-level DtD values over time