Title: | Regularization Ensemble for Robust Portfolio Optimization |
---|---|
Description: | Portfolio optimization is achieved through a combination of regularization techniques and ensemble methods that are designed to generate stable out-of-sample return predictions, particularly in the presence of strong correlations among assets. The package includes functions for data preparation, parallel processing, and portfolio analysis using methods such as Mean-Variance, James-Stein, LASSO, Ridge Regression, and Equal Weighting. It also provides visualization tools and performance metrics, such as the Sharpe ratio, volatility, and maximum drawdown, to assess the results. |
Authors: | Hardik Dixit [aut], Shijia Wang [aut], Bonsoo Koo [aut, cre], Cash Looi [aut], Hong Wang [aut] |
Maintainer: | Bonsoo Koo <[email protected]> |
License: | AGPL (>= 3) |
Version: | 0.1.0 |
Built: | 2024-12-10 07:00:35 UTC |
Source: | CRAN |
This function performs hierarchical clustering on asset correlations and returns the clustered groups.
buh.clust(x)
x |
A numeric matrix of asset returns. |
A list of asset clusters.
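As an illustration of the idea only, the following base-R sketch groups assets by hierarchical clustering on their correlations. It is not the package's implementation: the helper name cluster_assets, the distance 1 - correlation, the average linkage, and the choice of k = 2 clusters are all assumptions.

```r
# Hypothetical helper (not part of the package): group assets by correlation.
cluster_assets <- function(x, k = 2) {
  d <- as.dist(1 - cor(x))                # assumed distance: 1 - correlation
  tree <- hclust(d, method = "average")   # assumed linkage
  membership <- cutree(tree, k = k)
  # Return a list with one vector of column indices per cluster
  unname(split(seq_len(ncol(x)), membership))
}

# Simulated returns: assets 1-2 share one factor, assets 3-4 share another
set.seed(1)
f1 <- rnorm(200)
f2 <- rnorm(200)
x <- cbind(f1 + 0.1 * rnorm(200), f1 + 0.1 * rnorm(200),
           f2 + 0.1 * rnorm(200), f2 + 0.1 * rnorm(200))
groups <- cluster_assets(x, k = 2)
print(groups)
```

Assets driven by the same simulated factor end up in the same cluster, which is the kind of grouping a list of asset clusters would encode.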
A dataset containing financial data for portfolio analysis with 25 portfolios formed on size and book-to-market ratios.
FF25
A data frame with 25,670 rows and 27 variables:
Date in the format YYYYMMDD (int)
An additional column whose contents are not documented
Portfolio returns for small firms with low book-to-market ratios (numeric)
Portfolio returns for small firms with book-to-market group 2 (numeric)
Portfolio returns for small firms with book-to-market group 3 (numeric)
Portfolio returns for small firms with book-to-market group 4 (numeric)
Portfolio returns for small firms with high book-to-market ratios (numeric)
Portfolio returns for firms in the second size and first book-to-market group (numeric)
Portfolio returns for firms in the second size and second book-to-market group (numeric)
Portfolio returns for firms in the second size and third book-to-market group (numeric)
Portfolio returns for firms in the second size and fourth book-to-market group (numeric)
Portfolio returns for firms in the second size and fifth book-to-market group (numeric)
Portfolio returns for firms in the third size and first book-to-market group (numeric)
Portfolio returns for firms in the third size and second book-to-market group (numeric)
Portfolio returns for firms in the third size and third book-to-market group (numeric)
Portfolio returns for firms in the third size and fourth book-to-market group (numeric)
Portfolio returns for firms in the third size and fifth book-to-market group (numeric)
Portfolio returns for firms in the fourth size and first book-to-market group (numeric)
Portfolio returns for firms in the fourth size and second book-to-market group (numeric)
Portfolio returns for firms in the fourth size and third book-to-market group (numeric)
Portfolio returns for firms in the fourth size and fourth book-to-market group (numeric)
Portfolio returns for firms in the fourth size and fifth book-to-market group (numeric)
Portfolio returns for large firms with low book-to-market ratios (numeric)
Portfolio returns for large firms with book-to-market group 2 (numeric)
Portfolio returns for large firms with book-to-market group 3 (numeric)
Portfolio returns for large firms with book-to-market group 4 (numeric)
Portfolio returns for large firms with high book-to-market ratios (numeric)
data(FF25)
head(FF25)
This function inserts specified values at given positions in a vector.
insert.at(a, pos, ...)
a |
A vector. |
pos |
A numeric vector specifying the positions to insert the values. |
... |
Values to be inserted at the specified positions. |
A vector with the inserted values.
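The source does not say whether insert.at places values before or after each position. The sketch below shows one possible behaviour in base R, inserting each value after the given position; the helper name insert_after and the "after" convention are assumptions, not the package's definition.

```r
# Hypothetical helper (not the package's insert.at): insert one value after
# each position in pos, where positions refer to the original vector.
insert_after <- function(a, pos, ...) {
  values <- list(...)
  stopifnot(length(values) == length(pos))
  # Work from the largest position down so earlier inserts do not shift later ones
  for (i in order(pos, decreasing = TRUE)) {
    a <- append(a, values[[i]], after = pos[i])
  }
  a
}

out <- insert_after(1:5, c(2, 4), 99, 100)
print(out)
```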
This function performs portfolio analysis using methods including Mean-Variance (MV), James-Stein (JM), LASSO, Ridge Regression, and Equal Weighting (EW). For each method it calculates weights, turnover, returns, Sharpe ratio, volatility, and maximum drawdown.
perform_analysis(x, mon, count, Date, num_cores = 7)
x |
A numeric matrix where each column represents asset returns and rows represent time periods. |
mon |
A numeric vector representing the number of months since the start date for each time period. |
count |
A numeric vector indicating the number of entries per month. |
Date |
A vector of Date objects representing the dates of the time periods. |
num_cores |
The number of cores to use for parallel processing. Default is 7. |
The function iterates through different time periods and calculates portfolio weights, turnover, and returns for multiple methods including Mean-Variance (MV), James-Stein (JM), and various regularization techniques. It also computes performance metrics like the Sharpe ratio, volatility, maximum drawdown, and cumulative turnover for each method. Visualization of the cumulative returns and turnover is generated using ggplot2.
A list containing the following components:
A ggplot object representing the cumulative returns for each method.
A ggplot object representing the cumulative turnover for each method.
A numeric vector of the mean turnover for each method.
A numeric vector of the Sharpe ratio for each method.
A numeric vector of the annualized volatility for each method.
A numeric vector of the maximum drawdown for each method.
The mean turnover for the volume-weighted (VW) portfolio.
The Sharpe ratio for the VW portfolio.
The annualized volatility for the VW portfolio.
The maximum drawdown for the VW portfolio.
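To make the reported metrics concrete, here is a minimal base-R sketch of how such quantities are commonly computed from a return series. The helper names, the 252-period annualization, the zero risk-free rate, and the turnover definition (sum of absolute weight changes) are assumptions; the source does not state the exact conventions used by perform_analysis.

```r
# Hypothetical helper: summary metrics for a vector of periodic simple returns.
# Assumes daily data (252 periods per year) and a zero risk-free rate.
performance_metrics <- function(r, periods_per_year = 252) {
  wealth <- c(1, cumprod(1 + r))            # wealth path starting from 1
  drawdown <- 1 - wealth / cummax(wealth)   # fractional drop from the running peak
  c(sharpe = mean(r) / sd(r) * sqrt(periods_per_year),
    volatility = sd(r) * sqrt(periods_per_year),
    max_drawdown = max(drawdown))
}

# Hypothetical helper: turnover between two consecutive weight vectors.
turnover <- function(w_new, w_old) sum(abs(w_new - w_old))

r <- c(0.10, -0.50, 0.20)
metrics <- performance_metrics(r)
print(metrics)
print(turnover(c(0.5, 0.5), c(0.2, 0.8)))
```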
# Create a larger example dataset that aligns with the function's expectations
set.seed(123)
x <- matrix(runif(700), ncol = 10)    # 10 columns (assets), 70 rows (observations)
mon <- rep(1:10, each = 7)            # Example month identifiers, 7 observations per month
count <- rep(7, 10)                   # Example count per month (7 entries per month)
Date <- as.Date('2020-01-01') + 0:69  # Example date sequence (70 days)
# Run the analysis with 2 cores
result <- perform_analysis(x, mon, count, Date, num_cores = 2)
# Display results
print(result$cumulative_return_plot)
print(result$cumulative_turnover_plot)
This function performs LASSO, Ridge, or Elastic Net regression for portfolio optimization.
po.avg(y0, x0, method = "LASSO")
y0 |
A numeric vector of response values. |
x0 |
A numeric matrix of predictors. |
method |
The regularization method: "LASSO", "RIDGE", or "EN" (Elastic Net). |
A numeric vector of optimized portfolio weights.
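As a rough illustration of what regularization does to regression coefficients, the sketch below computes ridge estimates in closed form with base R. It is not the package's po.avg: the helper name ridge_coef and the fixed penalty lambda are assumptions, and how coefficients are turned into portfolio weights is not shown. LASSO and Elastic Net have no closed form and would normally require an iterative solver.

```r
# Hypothetical helper: closed-form ridge coefficients on centred data,
# so the intercept is not penalized.
ridge_coef <- function(y0, x0, lambda) {
  xc <- scale(x0, center = TRUE, scale = FALSE)
  yc <- y0 - mean(y0)
  p <- ncol(xc)
  drop(solve(crossprod(xc) + lambda * diag(p), crossprod(xc, yc)))
}

set.seed(42)
x0 <- matrix(rnorm(300), ncol = 3)
y0 <- drop(x0 %*% c(0.5, 0.3, 0.2)) + rnorm(100, sd = 0.1)
b_ols <- ridge_coef(y0, x0, lambda = 0)     # no penalty: ordinary least squares
b_ridge <- ridge_coef(y0, x0, lambda = 50)  # penalty shrinks coefficients toward 0
print(rbind(ols = b_ols, ridge = b_ridge))
```

With lambda = 0 the result coincides with ordinary least squares; increasing lambda shrinks the coefficient vector, which is the stabilizing effect regularization is meant to provide when predictors are strongly correlated.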
This function performs portfolio optimization using clustering and LASSO regularization.
po.bhu(y0, x0, group, rep)
y0 |
A numeric vector of response values. |
x0 |
A numeric matrix of predictors. |
group |
A list of asset clusters. |
rep |
The number of repetitions for optimization. |
A numeric vector of optimized portfolio weights.
This function uses ordinary least squares (OLS) linear regression to perform portfolio optimization.
po.cols(y0, x0)
y0 |
A numeric vector of response values. |
x0 |
A numeric matrix of predictors. |
A numeric vector of optimized portfolio weights.
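One well-known way to obtain portfolio weights from a linear regression is the regression form of the global minimum-variance problem: regress one asset's return on its spreads over the other assets. The sketch below demonstrates that equivalence on simulated data. Whether po.cols constructs y0 and x0 this way is an assumption; the source does not say.

```r
set.seed(7)
r <- matrix(rnorm(500 * 4, sd = 0.02), ncol = 4)  # simulated returns, 4 assets
n <- ncol(r)

# Regression form: the response is the last asset, predictors are its spreads
# over each of the other assets (assumed construction of y0 and x0)
y0 <- r[, n]
x0 <- r[, n] - r[, -n]
beta <- unname(coef(lm(y0 ~ x0))[-1])   # drop the intercept
w_reg <- c(beta, 1 - sum(beta))         # last weight closes the budget constraint

# Closed-form global minimum-variance weights for comparison
s_inv_one <- solve(cov(r), rep(1, n))
w_gmv <- s_inv_one / sum(s_inv_one)

print(round(cbind(w_reg, w_gmv), 4))
```

The two columns agree up to numerical precision, because minimizing the residual variance of this regression is the same problem as minimizing portfolio variance subject to weights summing to one.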
This function uses covariance shrinkage techniques for portfolio optimization.
po.covShrink(y0, x0)
y0 |
A numeric vector of response values. |
x0 |
A numeric matrix of predictors. |
A numeric vector of optimized portfolio weights.
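A minimal sketch of the covariance-shrinkage idea, assuming a scaled-identity target and a hand-picked shrinkage intensity delta. The helper names and both of those choices are assumptions; po.covShrink takes y0 and x0 and may use a different target and a data-driven intensity.

```r
# Hypothetical helper: shrink the sample covariance toward a scaled identity.
# delta in [0, 1] is fixed by hand here; data-driven choices exist.
shrink_cov <- function(x, delta = 0.3) {
  s <- cov(x)
  target <- diag(mean(diag(s)), ncol(x))   # average variance on the diagonal
  (1 - delta) * s + delta * target
}

# Global minimum-variance weights for a given covariance matrix
gmv_weights <- function(sigma) {
  z <- solve(sigma, rep(1, ncol(sigma)))
  z / sum(z)
}

set.seed(11)
x <- matrix(rnorm(60 * 5, sd = 0.02), ncol = 5)
w_sample <- gmv_weights(cov(x))
w_shrunk <- gmv_weights(shrink_cov(x, delta = 0.3))
print(round(rbind(w_sample, w_shrunk), 4))
```

At delta = 1 the covariance is a multiple of the identity and the weights collapse to equal weighting, so delta can be read as interpolating between the sample-based portfolio and the equal-weighted one.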
This function performs gross exposure portfolio optimization using LASSO.
po.grossExp(y0, x0, method = "NOSHORT")
y0 |
A numeric vector of response values. |
x0 |
A numeric matrix of predictors. |
method |
The regularization method: "NOSHORT" or "EQUAL". |
A numeric vector of optimized portfolio weights.
This function uses James-Stein estimation for portfolio optimization.
po.JM(x0)
x0 |
A numeric matrix of asset returns. |
A numeric vector of optimized portfolio weights.
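The source does not state which James-Stein variant po.JM uses or how the shrunk estimates are mapped to weights. As a hedged illustration, the sketch below implements one common estimator of this family, the Bayes-Stein form usually associated with Jorion, which shrinks sample mean returns toward the mean return of the minimum-variance portfolio. The helper name shrink_means is hypothetical.

```r
# Hypothetical helper: Bayes-Stein style shrinkage of sample mean returns
# toward the mean return of the minimum-variance portfolio.
shrink_means <- function(x0) {
  t_obs <- nrow(x0)
  n <- ncol(x0)
  stopifnot(t_obs > n + 2)
  mu <- colMeans(x0)
  sigma <- cov(x0) * (t_obs - 1) / (t_obs - n - 2)  # rescaled covariance estimate
  s_inv_one <- solve(sigma, rep(1, n))
  mu0 <- sum(s_inv_one * mu) / sum(s_inv_one)       # shrinkage target
  d <- mu - mu0
  q <- sum(d * solve(sigma, d))                     # squared distance to the target
  phi <- (n + 2) / ((n + 2) + t_obs * q)            # shrinkage intensity in (0, 1]
  list(mu = (1 - phi) * mu + phi * mu0, intensity = phi)
}

set.seed(3)
x0 <- matrix(rnorm(120 * 5, mean = 0.001, sd = 0.02), ncol = 5)
res <- shrink_means(x0)
print(res$intensity)
print(rbind(sample = colMeans(x0), shrunk = res$mu))
```

The shrunk means are pulled toward a common value, which reduces the influence of estimation error in individual sample means on any subsequent optimization.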
This function performs stochastic weight portfolio optimization.
po.SW(x0, b, sample)
x0 |
A numeric matrix of asset returns. |
b |
Number of assets to select in each sample. |
sample |
Number of random samples to generate. |
A numeric vector of optimized portfolio weights.
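Reading the arguments literally (b assets per draw, a given number of random draws), one plausible scheme is to optimize on random subsets of assets and average the results. The sketch below does this with minimum-variance weights on each subset. The helper name, the per-subset optimizer, and the simple averaging are assumptions and may differ from what po.SW actually does.

```r
# Hypothetical helper: average minimum-variance weights over random asset subsets.
subset_average_weights <- function(x0, b, sample) {
  n <- ncol(x0)
  stopifnot(b >= 2, b <= n)
  w_sum <- numeric(n)
  for (s in seq_len(sample)) {
    idx <- sample.int(n, b)                  # pick b assets at random
    z <- solve(cov(x0[, idx]), rep(1, b))
    w_sum[idx] <- w_sum[idx] + z / sum(z)    # subset weights sum to 1
  }
  w_sum / sample                             # average over all draws
}

set.seed(5)
x0 <- matrix(rnorm(100 * 6, sd = 0.02), ncol = 6)
w <- subset_average_weights(x0, b = 3, sample = 200)
print(round(w, 4))
```

Because each subset's weights sum to one, the averaged weights also sum to one, and each small optimization involves only a b-by-b covariance matrix.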
This function performs portfolio optimization using LASSO regularization and stochastic weight selection.
po.SW.lasso(y0, x0, b, sample)
y0 |
A numeric vector of response values. |
x0 |
A numeric matrix of predictors. |
b |
Number of assets to select in each sample. |
sample |
Number of random samples to generate. |
A numeric vector of optimized portfolio weights.
This function performs portfolio optimization using the TZT method.
po.TZT(x0, gamma)
x0 |
A numeric matrix of asset returns. |
gamma |
A numeric parameter for the TZT method. |
A numeric vector of optimized portfolio weights.
This function prepares the input data by filtering based on a specified date range, removing the date column, and handling missing values. It also generates time-related columns and returns the processed data.
prepare_data(dat, date_column_index = 1, start_date = "19990101", end_date = "20231231")
dat |
A data frame or matrix containing a date column (by default the first column) and the data in the remaining columns. |
date_column_index |
The index of the date column in the input data. Default is 1. |
start_date |
A character string specifying the start date for filtering the data in 'YYYYMMDD' format. Default is '19990101'. |
end_date |
A character string specifying the end date for filtering the data in 'YYYYMMDD' format. Default is '20231231'. |
A list containing the following components:
A matrix of the filtered data with missing values handled.
A vector of integers representing the number of months from the first date in the data.
A vector of the number of entries per month.
A vector of Date objects representing the filtered dates.
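The steps described above can be sketched in base R as follows. This is not the package's code: treating -99.99 as the missing-value code (suggested by the package example), starting the month index at 1, and leaving missing entries as NA are all assumptions.

```r
# Hypothetical sketch of the described steps (not the package's prepare_data).
raw <- data.frame(Date = c("19990101", "19990115", "19990201", "19990301"),
                  Var1 = c(1, 2, -99.99, 4),
                  Var2 = c(3, -99.99, 6, 7))

# Parse YYYYMMDD strings and keep rows inside the requested date range
dates <- as.Date(raw$Date, format = "%Y%m%d")
keep <- dates >= as.Date("1999-01-01") & dates <= as.Date("1999-02-28")
dates <- dates[keep]

# Drop the date column and recode the assumed missing-value code
x <- as.matrix(raw[keep, -1])
x[x == -99.99] <- NA

# Month index relative to the first retained date (assumed to start at 1)
ym <- as.integer(format(dates, "%Y")) * 12L + as.integer(format(dates, "%m"))
mon <- ym - ym[1] + 1L
count <- as.vector(table(mon))   # number of observations in each month

print(list(x = x, mon = mon, count = count, Date = dates))
```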
data <- data.frame(Date = c("19990101", "19990115", "19990201", "19990301", "19990315", "19990401"),
                   Var1 = c(1, 2, -99.99, 4, 5, -99.99),
                   Var2 = c(3, -99.99, 6, 7, 8, 9),
                   Var3 = c(10, 11, 12, 13, -99.99, 15))
result <- prepare_data(data, date_column_index = 1, start_date = '19990101', end_date = '19990430')
print(result)
This function integrates data preparation, parallel setup, and portfolio analysis. It takes raw data as input, prepares it using 'prepare_data', sets up parallel processing using 'setup_parallel', and performs the analysis using 'perform_analysis'.
ren(dat, date_column_index = 1, start_date = "19990101", end_date = "20231231", num_cores = 2)
dat |
A data frame or matrix containing a date column (by default the first column) and the data in the remaining columns. |
date_column_index |
The index of the date column in the input data. Default is 1. |
start_date |
A character string specifying the start date for filtering the data in 'YYYYMMDD' format. Default is '19990101'. |
end_date |
A character string specifying the end date for filtering the data in 'YYYYMMDD' format. Default is '20231231'. |
num_cores |
The number of cores to use for parallel processing. Default is 2. |
The results from 'perform_analysis', including plots and performance metrics.
## Not run:
# Load the sample data
data(FF25)
# Run the main function
result <- ren(FF25)
# Display results
print(result$cumulative_return_plot)
print(result$cumulative_turnover_plot)
## End(Not run)
This function sets up parallel processing by loading necessary libraries, allowing the user to specify the number of cores to use, and creating a parallel backend for faster computation.
setup_parallel(num_cores = 7)
num_cores |
The number of cores to use for parallel processing. Default is 7. |
This function allows the user to specify the number of cores for parallel processing either through the num_cores argument or via interactive user input. The function also loads a set of libraries required for portfolio analysis.
A parallel cluster object that can be used with functions that support parallel computation.
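For orientation, the sketch below shows how a cluster object of this kind is typically created, used, and released with R's bundled parallel package. It does not call setup_parallel and is not its implementation; the toy task and the choice of two workers are for illustration only.

```r
library(parallel)

cl <- makeCluster(2)                   # start two worker processes
squares <- tryCatch(
  parLapply(cl, 1:4, function(i) i^2), # run a toy task on the workers
  finally = stopCluster(cl)            # always release the workers
)
print(unlist(squares))
```

Stopping the cluster in a finally clause ensures the worker processes are shut down even if the parallel task fails.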
# Set up parallel processing with a specified number of cores
cl <- setup_parallel(num_cores = 2)  # Use 2 cores for the example
print(cl)                            # Print the cluster information
parallel::stopCluster(cl)            # Stop the cluster after use to clean up