• 1 The building blocks
    • 1.1 Introduction
    • 1.2 The portfolio weights
    • 1.3 The portfolio return
    • 1.4 PerformanceAnalytics
  • 2 Analyzing performance
  • 3 Performance drivers
  • 4 Optimizing the portfolio

1 The building blocks

1.1 Introduction

1.1.1 Getting a grasp of the basics

The goal of every investor is to make profits while limiting the amount of risk involved. Investors create strategies to reduce these risks. Diversified portfolios reduce risk by offsetting loss with a potential gain in another asset!

1.1.2 Get a feel for the data

The choice of investment matters even when the underlying risky assets are similar. As a first example, let us consider the stock price of the Coca Cola Company and the PepsiCo company from January 2003 until the end of August 2016.

The time series plot shows you the value evolution of one dollar invested in each company. As an exercise, plot the time series showing the relative value of an investment in the Coca Cola company, compared to the value of an investment in PepsiCo. To do this exercise, you can use the corresponding price series, available as the variables ko and pep in your workspace.

Three important packages that you will use in this course are the xts and zoo packages (these are popular when working with time series data), and the PerformanceAnalytics package. Define ko_pep as the ratio expressing the value of the share price of the Coca Cola company in terms of the share price of PepsiCo.

Use plot.zoo() to visualize the variation in this ratio over time. Note that when the value of the ratio is larger than 1, the performance of ko since January 2003 is higher than that of pep.

As a reference line, use abline() to include a horizontal line at h=1. Note that where the value of the ratio is larger than one, the Coca Cola Company outperforms Pepsico and vice versa.

library(zoo)
package 㤼㸱zoo㤼㸲 was built under R version 4.0.3
Attaching package: 㤼㸱zoo㤼㸲

The following objects are masked from 㤼㸱package:base㤼㸲:

    as.Date, as.Date.numeric
library(xts)
package 㤼㸱xts㤼㸲 was built under R version 4.0.3
stock01 <- readRDS("data/re_prices.RData")
stock02 <- readRDS("data/eq_prices.RData")
plot(stock01)
lines(stock02, col = "red")

# Define ko_pep 
ko_pep <- stock02/stock01

# Make a time series plot of ko_pep
plot.zoo(ko_pep)  
  
# Add as a reference, a horizontal line at 1
abline(h = 1)  

The horizontal reference indicates when Coca Cola Company and PepsiCo are performing equally.

1.2 The portfolio weights

1.2.1 Calculating portfolio weights when component values are given

In the video, it was shown that you can easily compute portfolio weights if you have a given amount of money invested in certain assets. If you want to start investing in a portfolio, but you have budget restraints, you can also impose weights yourself. Depending on what these are, you will invest a certain amount of money in each of the assets based on their weight.

When given asset values, calculating the weights is quite simple. Recall from the video that weights are calculated by taking the value of an asset divided by the sum of values from all assets.

In this exercise, you will learn to calculate weights when individual asset values are given! For this example, an investor has 4000 USD invested in equities, 4000 USD invested in bonds, and 2000 USD invested in commodities. Compute the weights as the proportion invested in each of those three assets.

# Define the vector values
values <- c(4000, 4000, 2000)

# Define the vector weights
weights <- values/sum(values)

# Print the resulting weights
print(weights)
[1] 0.4 0.4 0.2

Remember that the weight of an asset is calculated by taking the value of the asset divided by the sum of values of all assets. You can also pre-define weights as a vector. For example weights <- c(0.2, 0.2, 0.6).

1.2.2 The weights of a market capitalization-weighted portfolio

In a market capitalization-weighted portfolio, the weights are given by the individual assets’ market capitalization (or market value), divided by the sum of the market capitalizations of all assets. A typical example is the S&P 500 portfolio invested in the 500 largest companies listed on the US stock exchanges (NYSE, Nasdaq). Note that by dividing by the sum of asset values across all portfolio assets, the portfolio weights sum to unity (one).

As an exercise, inspect the distribution of market capitalization based weights when the portfolio is invested in 10 stocks. For this exercise, you can use market capitalizations of 5, 8, 9, 20, 25, 100, 100, 500, 700, and 2000 million USD.

# Define marketcaps
marketcaps <- c(5, 8, 9, 20, 25, 100, 100, 500, 700, 2000)
  
# Compute the weights
weights <- marketcaps/sum(marketcaps)
  
# Inspect summary statistics
summary(weights)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
0.001442 0.003389 0.018027 0.100000 0.115374 0.576868 
  
# Create a bar plot of weights
barplot(weights)

1.3 The portfolio return

1.3.1 Calculation of portfolio returns

For your first exercise on calculating portfolio returns, you will verify that a portfolio return can be computed as the weighted average of the portfolio component returns. In other words, this means that a portfolio return is calculated by taking the sum of simple returns multiplied by the portfolio weights. Remember that simple returns are calculated as the final value minus the initial value, divided by the initial value.

Assume that you invested in three assets. Their initial values are 1000 USD, 5000 USD, 2000 USD, respectively. Over time, the values change to 1100 USD, 4500 USD, and 3000 USD.

# Vector of initial value of the assets
in_values <- c(1000, 5000, 2000)
  
# Vector of final values of the assets
fin_values <- c(1100, 4500, 3000)

# Weights as the proportion of total value invested in each asset
weights <- in_values/sum(in_values)

# Vector of simple returns of the assets 
returns <- (fin_values - in_values)/in_values
returns
[1]  0.1 -0.1  0.5
# Compute portfolio return using the portfolio return formula
preturns <- sum(weights*returns)
preturns
[1] 0.075

Note that the weighted average portfolio return equals the percentage change in the total portfolio value. Remember that portfolio returns are calculated by taking the sum of simple returns, multiplied by the portfolio weights.

1.3.2 From simple to gross and multi-period returns

The simple return R expresses the percentage change in the value of an investment. The corresponding so-called “gross” return is defined as the future value of 1 USD invested in the asset for one period and is thus equal to 1+R.

The gross return over two periods is obtained similarly. Let R1 be the return in the first period and R2 the return in the second period. Then the end-value of a 1 USD investment is (1+R1)∗(1+R2).

The corresponding simple return over those two periods is: (1+R1)∗(1+R2)−1.

Suppose that you have an investment horizon of two periods. In the first period, you make a 10% return. But in the second period, you take a loss of 5%.

1000*((1 + 0.1)*(1 - 0.05) - 1)
[1] 45

End-value = 1045

1.3.3 The asymmetric impact of gains and losses

It is important to be aware of the fact that a positive and negative return of the same relative magnitude do not compensate each other in terms of terminal wealth. Mathematically, this can be seen from the identity (1+x)∗(1−x)=1−x2, which is less than one. A 50% loss is thus not compensated by a 50% gain. After a loss of 50%, what is the return needed to be at par again? 100%

1.4 PerformanceAnalytics

1.4.1 Buy-and-hold versus (daily) rebalancing

The choice of investment matters, even when the underlying risky assets are similar. As an example, you will now consider the stock price of Apple and Microsoft from January 2006 until the end of August 2016. The time series plot shows you the value evolution of one dollar invested in each of them. For this exercise, your portfolio approach will be to invest half of your budget in Apple stock, and the other half of your budget in Microsoft stock. Over time, the portfolio weights will change. You will have two choices as an investor. The first choice is to be passive and not trade any further. This is called a buy and hold strategy. The second choice is to buy and trade at the close of each day that results in a rebalance of the portfolio such that your portfolio is equally invested in shares of Microsoft and Apple. This is a rebalanced portfolio.

  • By investing in a portfolio of risky assets, the portfolio payoff will be in between the minimum and maximum of the payoff of the underlying risky assets.
  • By investing in a portfolio of risky assets, the risk of the portfolio payoff will be less than the maximum risk of the underlying risky assets.
  • Because the prices of Microsoft and Apple have evolved differently, the investor who spent initially half of his wealth on Microsoft and Apple ends up with a portfolio that is no longer equally invested.

1.4.2 The time series of asset returns

Calculating the returns for one period is pretty straightforward to do in R. When the returns need to be calculated for different dates the functions Return.calculate() and Return.portfolio(), in the R package PerformanceAnalytics are extremely helpful. They require the input data to be of the xts-time series class, which is pre-loaded. You will explore the functionality of the PerformanceAnalytics package in this exercise.

The object prices which contains the Apple (aapl) and Microsoft (msft) stocks is available in the workspace.

prices <- readRDS("data/aapl_msft.RData")
plot(prices$AdjClose.aapl)
lines(prices$AdjClose.msft, col = "red")

# Load package PerformanceAnalytics 
library(PerformanceAnalytics)
package 㤼㸱PerformanceAnalytics㤼㸲 was built under R version 4.0.3
Attaching package: 㤼㸱PerformanceAnalytics㤼㸲

The following objects are masked _by_ 㤼㸱.GlobalEnv㤼㸲:

    prices, weights

The following object is masked from 㤼㸱package:graphics㤼㸲:

    legend
# Print the first six rows and last six rows of prices
head(prices)
           AdjClose.aapl AdjClose.msft
2006-01-03     1.0000000      1.000000
2006-01-04     1.0029432      1.004843
2006-01-05     0.9950502      1.005589
2006-01-06     1.0207357      1.002608
2006-01-09     1.0173913      1.000745
2006-01-10     1.0817391      1.005961
tail(prices)
           AdjClose.aapl AdjClose.msft
2016-08-24      11.04996      2.766979
2016-08-25      11.00291      2.777483
2016-08-26      10.93847      2.770798
2016-08-29      10.92620      2.774141
2016-08-30      10.84232      2.764114
2016-08-31      10.85255      2.743582
# Create the variable returns using Return.calculate()  
returns <- Return.calculate(prices)
  
# Print the first six rows of returns. Note that the first observation is NA because there is no prior price.
head(returns)
           AdjClose.aapl AdjClose.msft
2006-01-03            NA            NA
2006-01-04   0.002943179  0.0048434781
2006-01-05  -0.007869844  0.0007415587
2006-01-06   0.025813336 -0.0029640370
2006-01-09  -0.003276507 -0.0018580304
2006-01-10   0.063247893  0.0052122172
  
# Remove the first row of returns
returns <- returns[-1, ]
head(returns)
           AdjClose.aapl AdjClose.msft
2006-01-04   0.002943179  0.0048434781
2006-01-05  -0.007869844  0.0007415587
2006-01-06   0.025813336 -0.0029640370
2006-01-09  -0.003276507 -0.0018580304
2006-01-10   0.063247893  0.0052122172
2006-01-11   0.037595789  0.0107407490

Notice that you removed the first row of your returns because they were NA values!

1.4.3 The time series of portfolio returns

In the previous exercise, you created a variable called returns from the daily prices of stocks of Apple and Microsoft. In this exercise, you will create two portfolios using the return series you previously created. The two portfolios will differ in one way, and that is the weighting of the assets.

In the last video, you were introduced to two weighting strategies: the buy and hold strategy, and a monthly rebalancing strategy. In this exercise, you will create a portfolio in which you don’t rebalance, and one where you rebalance monthly. You will then visualize the portfolio returns of both.

You will use the function Return.portfolio() for your calculations. For this function, you will provide three arguments: R, weights, and rebalance_on. R is a time series of returns, weights is a vector containing asset weights, and rebalance_on specifies which calendar-period to rebalance on. If you need help, be sure to check the documentation.

# Create the weights
eq_weights <- c(0.5, 0.5)

# Create a portfolio using buy and hold
pf_bh <- Return.portfolio(R = returns, weights = eq_weights)

# Create a portfolio rebalancing monthly 
pf_rebal <- Return.portfolio(R = returns, weights = eq_weights, rebalance_on = "months")

# Plot the time-series
par(mfrow = c(2, 1), mar = c(2, 4, 2, 2))
plot.zoo(pf_bh)
plot.zoo(pf_rebal)

You will see based on the plots that rebalancing a portfolio can lead to less extreme returns in either direction.

1.4.4 The time series of weights

In the previous exercise, you explored the functionality of the Return.portfolio() function and created portfolios using two strategies. However, by setting the argument verbose = TRUE in Return.portfolio() you can create a list of beginning of period (BOP) and end of period (EOP) weights and values in addition to the portfolio returns, and contributions.

You can access these from the resultant list-object created from Return.portfolio(). The resultant list contains $returns, $contributions, $BOP.Weight, $EOP.Weight, $BOP.Value, and $EOP.Value.

In this exercise, you will recreate the portfolios you made in the last exercise but extend it by creating a list of calculations using verbose = TRUE. You will then visualize the end of period weights of Apple.

# Create the weights
eq_weights <- c(0.5, 0.5)

# Create a portfolio using buy and hold
pf_bh <- Return.portfolio(returns, weights = eq_weights, verbose = TRUE )

# Create a portfolio that rebalances monthly
pf_rebal <- Return.portfolio(returns, weights = eq_weights, rebalance_on = "months", verbose = TRUE  )

# Create eop_weight_bh
eop_weight_bh <- pf_bh$EOP.Weight

# Create eop_weight_rebal
eop_weight_rebal <- pf_rebal$EOP.Weight

# Plot end of period weights
par(mfrow = c(2, 1), mar=c(2, 4, 2, 2))
plot.zoo(eop_weight_bh$AdjClose.aapl)
plot.zoo(eop_weight_rebal$AdjClose.aapl)

Look at the two plots. Notice how the weights the rebalanced portfolio stay close to the initial weight of 50%, and how they buy and hold strategy has Apple stock with about 80% weight

2 Analyzing performance

2.1 Dimensions of portfolio performance

2.1.1 Exploring the monthly S&P 500 returns

For the upcoming exercises, you will examine the monthly performance of the S&P 500. A picture is worth a thousand words. This is why most analyses of performance start by studying the time series plot of the value of an investment.

A plot of the S&P 500 for the period of 1986 until August 2016 is shown to your right. Each observation on this plot is an end-of-day value. This chart shows a number of booms and busts. Take a look at the chart, do you see why the 2000s are so often called the lost decade of investing? Your job is to describe the monthly performance of the S&P 500. To do so, you will first need to aggregate daily price series to end-of-month prices. You will then calculate the monthly returns and visualize them in a table.

sp500 <- readRDS("data/sp500.RData")
sp500 <- as.xts(sp500)
plot(sp500)

# Convert the daily frequency of sp500 to monthly frequency: sp500_monthly
#sp500_monthly <- to.monthly(sp500, indexAt = "endof")

# Print the first six rows of sp500_monthly
#head(sp500_monthly)

# Create sp500_returns using Return.calculate using the closing prices
sp500_returns <- Return.calculate(sp500) 
sp500_returns <- sp500_returns[-1,]
# Time series plot
plot.zoo(sp500_returns)


# Produce the year x month table
table.CalendarReturns(sp500_returns)
ABCDEFGHIJ0123456789
 
 
Jan
<dbl>
Feb
<dbl>
Mar
<dbl>
Apr
<dbl>
May
<dbl>
Jun
<dbl>
Jul
<dbl>
Aug
<dbl>
Sep
<dbl>
19860.27.15.3-1.45.01.4-5.97.1-8.5
198713.23.72.6-1.10.64.84.83.5-2.4
19884.04.2-3.30.90.34.3-0.5-3.94.0
19897.1-2.92.15.03.5-0.88.81.6-0.7
1990-6.90.92.4-2.79.2-0.9-0.5-9.4-5.1
19914.26.72.20.03.9-4.84.52.0-1.9
1992-2.01.0-2.22.80.1-1.73.9-2.40.9
19930.71.01.9-2.52.30.1-0.53.4-1.0
19943.3-3.0-4.61.21.2-2.73.13.8-2.7
19952.43.62.72.83.62.13.20.04.0

2.1.2 The monthly mean and volatility

You have now created month to month returns for the S&P 500. Next, you will need to do a descriptive analysis of the returns.

In this exercise, you will calculate the mean (arithmetic and geometric) and volatility (standard deviation) of the returns. These returns are available in your workspace as the variable sp500_returns.

# Compute the mean monthly returns
mean(sp500_returns)
[1] 0.007320322
# Compute the geometric mean of monthly returns
mean.geometric(sp500_returns)
                  AdjClose
Geometric Mean 0.006350886
# Compute the standard deviation
sd(sp500_returns)
[1] 0.04364617

The volatility and mean return are very important metrics in every investor’s decision making!

2.2 The (annualized) Sharpe ratio

2.2.1 Excess returns and the portfolio’s Sharpe ratio

You just learned how to create descriptive statistics of your portfolio returns. Now you will learn how to evaluate your portfolio’s performance!

Performance evaluation involves comparing your investment choices with an alternative investment choice. Most often, the performance is compared with investing in an (almost) risk-free asset, such as the U.S. Treasury Bill. The return from a U.S. Treasury Bill is known as a risk-free rate because Treasury Bills (T-Bills) are backed by the U.S. Government.

In this exercise, you will be asked to annualize the risk-free rate by using the compound interest formula. The yearly compounded interest rate is given by (1+y)121. The annual rate is used to estimate a yearly return and is very useful for forecasting.

The Sharpe Ratio is an important metric that tells us the return-to-volatility ratio. It is calculated by taking the mean of excess returns (returns - risk-free rate), divided by the volatility of the returns.

#download 13 week Ttreasury Bill (ticker: ^IRX)
library(tseries)
package 㤼㸱tseries㤼㸲 was built under R version 4.0.3Registered S3 method overwritten by 'quantmod':
  method            from
  as.zoo.data.frame zoo 

    㤼㸱tseries㤼㸲 version: 0.10-47

    㤼㸱tseries㤼㸲 is a package for time series analysis and computational finance.

    See 㤼㸱library(help="tseries")㤼㸲 for details.
con <- url("https://finance.yahoo.com")
if(!inherits(try(open(con), silent = TRUE), "try-error")) {
  close(con)
  tb <- get.hist.quote(instrument = "^IRX", start = "1986-01-01", end = "2016-08-31",
                      quote = "Close", compression = "d")
  plot(tb)
}
㤼㸱getSymbols㤼㸲 currently uses auto.assign=TRUE by default, but will
use auto.assign=FALSE in 0.5-0. You will still be able to use
㤼㸱loadSymbols㤼㸲 to automatically load data. getOption("getSymbols.env")
and getOption("getSymbols.auto.assign") will still be checked for
alternate defaults.

This message is shown once per session and may be disabled by setting 
options("getSymbols.warning4.0"=FALSE). See ?getSymbols for details.

^IRX contains missing values. Some functions will not work if objects contain missing values in the middle of the series. Consider using na.omit(), na.approx(), na.fill(), etc to remove or replace them.
time series starts 1986-01-02
time series ends   2016-08-30

tb_monthly <- to.monthly(tb, indexAt = "startof")
missing values removed from data
# monthly interest for risk free purposes
rf <- tb_monthly$tb.Close/12/100
rf <- as.xts(rf)
# Compute the annualized risk free rate
annualized_rf <- (1 + rf)^12 - 1

# Plot the annualized risk-free rate
plot(annualized_rf)


# Compute the series of excess portfolio returns 
sp500_excess <- sp500_returns - rf

# Compare the mean of sp500_excess and sp500_returns 
mean(sp500_excess)
[1] 0.004555307
mean(sp500_returns)
[1] 0.007320322
# Compute the Sharpe ratio
sp500_sharpe <- mean(sp500_excess) / sd(sp500_returns)
sp500_sharpe
[1] 0.104369

Remember that the Sharpe Ratio is calculated by taking the mean of excess returns and dividing by the volatility of those returns.

2.2.2 Annualized mean and volatility

The mean and volatility of monthly returns correspond to the average and standard deviation over a monthly investment horizon. Investors annualize those statistics to show the performance over an annual investment horizon.

To do so, the package PerformanceAnalytics has the function Return.annualized() and StdDev.annualized() to compute the (geometrically) annualized mean return and annualized standard deviation for you. Remember that the Sharpe ratio is found by taking the mean excess returns subtracted by the risk-free rate, and then divided by the volatility!

# Compute the annualized mean
Return.annualized(sp500_returns)
                    AdjClose
Annualized Return 0.07892983
# Compute the annualized standard deviation
StdDev.annualized(sp500_returns)
                               AdjClose
Annualized Standard Deviation 0.1511948
# Compute the annualized Sharpe ratio: ann_sharpe
ann_sharpe <- Return.annualized(sp500_returns) / StdDev.annualized(sp500_returns)

# Compute all of the above at once using table.AnnualizedReturns()
table.AnnualizedReturns(sp500_returns)
ABCDEFGHIJ0123456789
 
 
AdjClose
<dbl>
Annualized Return0.0789
Annualized Std Dev0.1512
Annualized Sharpe (Rf=0%)0.5220

Annualizing returns allows you to look at performance over a longer period of time.

2.3 Rolling annualized mean and volatility

In previous exercises, you have already familiarized yourself with the Return.annualized() and StdDev.annualized() functions. In this exercise, you will also use the function SharpeRatio.annualized() to calculate the annualized Sharpe Ratio. This function takes the arguments R, and Rf. The R argument takes an xts, vector, matrix, data.frame, timeSeries, or zoo object of asset returns. The Rf argument is necessary in SharpeRatio.annualized(), as it takes into account the risk-free rate in the same period of your returns. For this example, you can use the object rf to fulfill the Rf argument.

The function chart.RollingPerformance() makes it easy to visualize the rolling estimates of performance in R. Familiarize yourself first with the syntax of this function. It requires you to specify the time series of portfolio returns (by setting the argument R), the length of the window (width), and the function used to compute the performance (argument FUN). To see all three plots together, PerformanceAnalytics provides a shortcut function charts.RollingPerformance(). Note the charts instead of chart. This function creates all of the previous charts at once and does not use the argument FUN!

# Calculate the mean, volatility, and Sharpe ratio of sp500_returns
returns_ann <- Return.annualized(sp500_returns)
sd_ann <- StdDev.annualized(sp500_returns)
sharpe_ann <-SharpeRatio.annualized(sp500_returns, Rf = rf)

# Plotting the 12-month rolling annualized mean
chart.RollingPerformance(R = sp500_returns, width = 12, FUN = "Return.annualized")


# Plotting the 12-month rolling annualized standard deviation
chart.RollingPerformance(R = sp500_returns, width = 12, FUN = "StdDev.annualized")


# Plotting the 12-month rolling annualized Sharpe ratio
chart.RollingPerformance(R = sp500_returns, width = 12, FUN = "SharpeRatio.annualized", Rf = rf)

Note that the dynamics in the return, risk, and risk-adjusted returns are recurrent. Investors can thus compare current rolling risk and return statistics with past values to pinpoint the past periods in which performance has been similar.

2.3.1 Subperiod performance analysis and the function window

In the previous exercise, you computed the performance measure on each possible sample of a fixed size by rolling through time. Often investors are interested in the performance of a specific subwindow. You can create subsets of a time series in R using the function window(). The first argument is the return series that needs subsetting. The second argument is the starting date of the subset in the form “YYYY-MM-DD”, and the third argument is the ending date in the same format.

con <- url("https://finance.yahoo.com")
if(!inherits(try(open(con), silent = TRUE), "try-error")) {
  close(con)
  sp500 <- get.hist.quote(instrument = "^GSPC", start = "1986-01-01", end = "2016-08-31",
                      quote = "Close", compression = "d")
}
time series starts 1986-01-02
time series ends   2016-08-30
head(sp500)
            Close
1986-01-02 209.59
1986-01-03 210.88
1986-01-06 210.65
1986-01-07 213.80
1986-01-08 207.97
1986-01-09 206.11
sp500_returns  <- Return.calculate(sp500$Close)
sp500_returns <- sp500_returns[-1,]
# Fill in window for 2008
sp500_2008 <- window(sp500_returns, start = "2008-01-01", end = "2008-12-31")

# Create window for 2014
sp500_2014 <- window(sp500_returns, start = "2014-01-01", end = "2014-12-31")

# Plotting settings
par(mfrow = c(1, 2) , mar=c(3, 2, 2, 2))
names(sp500_2008) <- "sp500_2008"
names(sp500_2014) <- "sp500_2014"

Plot histogram of returns in 2008 using the function chart.Histogram(). Set the argument methods = c(“add.density”, “add.normal”) to visualize the non-parametric estimate of the density and the density under an assumed normal distribution.

# Plot histogram of 2008
chart.Histogram(sp500_2008, methods = c("add.density", "add.normal"))


# Plot histogram of 2014
chart.Histogram(sp500_2014, methods = c("add.density", "add.normal"))

2.4 Non-normality of the return distribution

2.4.1 Balancing risk and reward

The x-axis corresponds to the return of an asset, and the y-axis is the corresponding density, indicating how likely that return value is. Take a moment to look over these charts.

Notice that the distributions are all symmetric, and the center of the distribution corresponds to the average return.

Return distributions A and B have the same expected return. But distribution B has higher volatility and therefore has a wider range of possible outcomes. This indicates that the risk of a large deviation from the mean return is higher for investing in B than for investing in A. Distribution C has the same level of volatility as Distribution A but a higher mean return. Distribution C has the highest average return with the lowest volatility.

2.4.2 Detecting non-normality using skewness and kurtosis

Returns are most often non-normal in nature. Two metrics key to understanding the distribution of non-normal returns are skewness and kurtosis. The skewness will help you identify whether or not negative or positive returns occur more frequently. Negative skewness indicates that large negative returns occur more often than large positive ones, and vice versa.

Kurtosis will be positive if there are fat tails in your distribution. This means that large positive or negative returns will happen more often than can be assumed under a normal distribution.

sp500_daily <- sp500_returns
sp500_monthly <- to.monthly(sp500$Close, indexAt = "endof")
sp500_monthly <- CalculateReturns(sp500_monthly)
sp500_monthly <- sp500_monthly[-1, 4]
par(mfrow = c(1, 2) , mar=c(3, 2, 2, 2))
# Plot histogram of 2008
chart.Histogram(sp500_daily, methods = c("add.density", "add.normal"))
# Plot histogram of 2008
chart.Histogram(sp500_monthly, methods = c("add.density", "add.normal"))

The histograms in the plot environment compare the daily and monthly returns of the S&P 500 over the period of 1986 until today. There seems to be a negative skewness() in these plots, and a somewhat greater than normal kurtosis(). Note that, by default, kurtosis() reports the excess kurtosis (that is, the kurtosis minus three). Let’s see if the numbers match our observations!

#  Compute the skewness 
skewness(sp500_daily)
[1] -0.8254765
skewness(sp500_monthly) 
[1] -0.7919844
# Compute the excess kurtosis 
kurtosis(sp500_daily)
[1] 20.88271
kurtosis(sp500_monthly)
[1] 2.423106

The excess kurtosis can be thought of as how fat the tails of the return distribution are compared to the normal distribution.

2.4.3 Downside risk measures

The standard deviation gives equal weight to positive and negative returns in calculating the return variability. When the return distribution is asymmetric (skewed), investors use additional risk measures that focus on describing the potential losses. One such measure of downside risk is the Semi-Deviation. The Semi-Deviation is the calculation of the variability of returns below the mean return.

Another more popular measure is the so-called Value-at-Risk (or VaR). Loosely speaking, the VaR corresponds to the 5% quantile of the return distribution, meaning that a more negative return can only happen with a probability of 5%. For example you might ask: “what is the largest loss I could potentially take within the next quarter such that I only have 5% probability of observing an even larger loss?”

The expected shortfall is another measure of risk that focuses on the average loss below the 5% VaR quantile.

In this exercise, you will examine the potential risk of the monthly returns of the S&P 500. You will use the functions SemiDeviation(), VaR(), and ES(). These functions all require the argument R, which is an xts, vector, matrix, data.frame, timeSeries, or zoo object of asset returns. However the VaR(), and ES() require another argument p, or loss probability level, which is p = 0.05 in case of the 5% VaR and ES.

Use the function VaR() to estimate the 5% and 2.5% value-at-risk of a monthly investment in S&P 500 returns.

Use the function ES() to estimate the 5% and 2.5% expected shortfall of a monthly investment in S&P 500 returns.

# Calculate the SemiDeviation
SemiDeviation(sp500_monthly)
               sp500$Close.Close
Semi-Deviation        0.03339926
# Calculate the value at risk
VaR(sp500_monthly, p = 0.05)
    sp500$Close.Close
VaR       -0.07162533
VaR(sp500_monthly, p = 0.025)
    sp500$Close.Close
VaR       -0.09783902
# Calculate the expected shortfall
ES(sp500_monthly, p = 0.05)
   sp500$Close.Close
ES        -0.1180273
ES(sp500_monthly, p = 0.025)
   sp500$Close.Close
ES        -0.1653079

You now know how to calculate the potential risk of your investments! Each statistic tells you something slightly different. It is important that you remember what your risk measures are!

2.4.4 Drawdowns due to buying high, selling low

The volatility, semi-deviation, value-at-risk, and expected shortfall are all measures that describe risk over one period. These metrics do not do a great job at describing the worst-case risk of buying at a peak and selling at a trough. This sort of risk can be quantified by analyzing the portfolio’s drawdowns, or peak-to-trough decline in cumulative returns. The function table.Drawdowns() in PerformanceAnalytics reports the five largest drawdown episodes over a sample period. The package also has another function chart.Drawdown() to visualize the evolution of the drawdowns from each peak over time. You will now explore the historic drawdown episodes of the monthly returns of the S&P 500. The returns are loaded into your workspace as sp500_monthly.

# Table of drawdowns
table.Drawdowns(sp500_monthly)
ABCDEFGHIJ0123456789
From
<date>
Trough
<date>
To
<date>
Depth
<dbl>
Length
<dbl>
To Trough
<dbl>
Recovery
<dbl>
2007-11-302009-02-272013-03-28-0.5256651649
2000-09-292002-09-302007-05-31-0.4628812556
1987-09-301987-11-301989-07-31-0.301723320
1990-06-291990-10-311991-02-28-0.1584954
1998-07-311998-08-311998-11-30-0.1557523

# Plot of drawdowns
chart.Drawdown(sp500_monthly)

3 Performance drivers

3.1 Drivers in the case of two assets

3.1.1 Driver 1: The assets’ individual performance

three types of drivers for a portfolio’s performance: (i) the individual performance of the assets in terms of risk and return, (ii) the portfolio weight of each of the assets, (iii) the correlation between the asset returns.

Now let’s analyze some data to get a better grasp of this result! In this example, you will consider investing – with monthly frequency – in U.S. equities and U.S. bonds. Each assets’ returns are stored in your workspace as returns_equities, and returns_bonds. In addition, the portfolio is invested 60/40, meaning that every month you invest 60% in equities and 40% bonds. These portfolio returns are stored as returns_6040.

eq_prices <- readRDS("data/eq_prices.RData")
head(eq_prices)
           AdjClose
2006-07-03 1.000000
2006-08-01 1.021822
2006-09-01 1.049413
2006-10-02 1.082487
2006-11-01 1.104013
2006-12-01 1.118775
bond_prices <- readRDS("data/bond_prices.RData")
head(bond_prices)
           AdjClose
2006-07-03 1.000000
2006-08-01 1.015910
2006-09-01 1.026186
2006-10-02 1.033223
2006-11-01 1.044195
2006-12-01 1.037925
returns <- merge(eq_prices, bond_prices)
weights <- c(0.60, 0.40)
portfolio <- eq_prices * 0.60 + bond_prices *0.40
head(portfolio)
           AdjClose
2006-07-03 1.000000
2006-08-01 1.019457
2006-09-01 1.040122
2006-10-02 1.062781
2006-11-01 1.080086
2006-12-01 1.086435
plot(eq_prices)
lines(bond_prices, col = "red")
lines(as.zoo(portfolio), col = "blue")
legend(x = "topleft", legend = c("Equities", "Bonds", "60/40 Portfolio"),
       col = c("black", "red", "blue"),
       lty = c(1, 1, 1))

The portfolio mean return equals the position weighted sum of the component mean returns. You can test this using: all.equal(mean(returns_6040), 0.6 * mean(returns_equities) + 0.4 * mean(returns_bonds))

3.1.2 Driver 2: The choice of portfolio weights

Investors can optimize the choice of weight to obtain the highest risk-adjusted return, as measured by the portfolio Sharpe ratio.

In the special case of investing the total portfolio value in only two assets, there is only one weight to determine, because the weight on the second asset equals one minus the weight of the first asset.

Let us do this in the case of a portfolio invested in U.S. equities and U.S. bonds. We will be using the brute force approach of trying a large number of possible weights and keeping the weight that yields the highest value for the portfolio Sharpe ratio (assuming a zero risk-free rate).

Create a vector called grid using seq() that begins at 0, ends at 1, by an increment of 0.01. Initialize empty vector vsharpe with the same length as grid. A popular way of doing this is by creating a vector that contains NA’s using function rep(). You will replace these NA’s in the loop that you will create next. In the for loop, you will compute the Sharpe ratio for each of the possible weights in grid. The first command in the for-loop selects the i-th element of grid and stores it in object weight, which changes in each iteration. You want to see how the portfolio return changes with a changing weight. Create an object preturns that equals the sum of weight times returns_equities, and (1-weight) times returns_bonds. Next, you will replace the NAs in vsharpe with the annualized Sharpe ratio (SharpeRatio.annualized()) of preturns. Fill in the plot function where potential weights (grid) is plotted on the x-axis and the Sharpe ratios on the y-axis.

returns_equities <- Return.calculate(eq_prices)
returns_bonds <- Return.calculate(bond_prices)
# Create a grid
grid <- seq(from = 0, to = 1, by = 0.01)

# Initialize an empty vector for Sharpe ratios
vsharpe <- rep(NA, times = length(grid))

# Create a for loop to calculate Sharpe ratios
for(i in 1:length(grid)) {
    weight <- grid[i]
    preturns <- weight * returns_equities + (1 - weight) * returns_bonds
    vsharpe[i] <- SharpeRatio.annualized(preturns)
}

# Plot weights and Sharpe ratio
plot(grid, vsharpe, xlab = "Weights", ylab= "Ann. Sharpe ratio")
abline(v = grid[vsharpe == max(vsharpe)], lty = 3)

The portfolio that is 11% invested in equities, 89% in bonds has the largest possible Sharpe ratio.

3.1.3 Driver 3: The correlation between the asset returns

The third driver of portfolio performance is the correlation between asset returns. Generally speaking, the correlation tells you how two asset returns tend to move together.

The correlation of assets has important consequences in overall portfolio performance. This correlation is important because it can reduce volatility through diversification, or reducing overall correlation. In fact, the lower the correlation, the more successful the portfolio tends to be in regards to partially offsetting large losses in one asset with only a minor loss, or even a gain, in another asset.

In the extreme case of two identical asset returns, the correlation will be 1, and there is no diversification potential. In the other extreme case where, if one asset return is above average, and the other is almost always below average, the correlation is negative. The correlation is 0 when the asset returns are linearly independent of each other. Note that interdependency can still exist on a non-linear level even when the correlation is 0. As an exercise, suppose you have an equally weighted portfolio of two assets. Their correlation jumps from 0 to 0.5. What happens with the variance?

returns_equities <- Return.calculate(eq_prices)
returns_equities <- returns_equities[-1,]
returns_bonds <- Return.calculate(bond_prices)
returns_bonds <- returns_bonds[-1,]

pf_var <- function (x){
    return((0.5^2) * var(returns_equities) + (0.5^2) * var(returns_bonds) + 
        2 * 0.5 * 0.5 * sd(returns_equities) * sd(returns_bonds) * x)
}

function pf_var(x) that calculates the portfolio variance for the equally weighted portfolio of equities and bonds, when their correlation is assumed to be equal to x. Having correlated assets can increase potential risk. Correlated asset prices rise and fall together. So when you suffer a loss in one asset, there is a greater chance of suffering a loss on the other asset!

pf_var(0)
[1] 0.0005106223
pf_var(0.5)
[1] 0.0006317595

3.1.4 Interpreting correlation

Now you will learn how to compute the correlation between bond returns and equity returns. Just like volatilities, these correlations are dynamic. Therefore you need to distinguish between a static analysis that calculates correlations over a complete sample and a dynamic analysis that calculates correlations over a rolling sample. This is a similar analysis as you did for the time-varying performance evaluation in terms of mean return and volatility.

In this exercise you will learn 3 new functions from the PerformanceAnalytics package: chart.Scatter(), chart.Correlation(), and chart.RollingCorrelation().

# Create a scatter plot
chart.Scatter(returns_bonds, returns_equities, xlab = "bond returns", ylab = "equity returns", main = "bond-equity returns")


# Find the correlation
cor(returns_bonds, returns_equities)
[1] 0.06211472
# Merge returns_bonds and returns_equities 
returns <- merge(returns_bonds, returns_equities)

# Find and visualize the correlation using chart.Correlation
chart.Correlation(returns) 


# Visualize the rolling estimates using chart.RollingCorrelation
chart.RollingCorrelation(returns_bonds, returns_equities, width = 24)

3.2 Using matrix notation

3.2.1 Making a risk-reward scatter diagram

In this example, you have decided to extend your investment opportunity by creating a portfolio that consists of US equities ETF (SPY), US bonds ETF (AGG), a real estate investment trust (VNQ), and an ETF tracking in the GSCI commodities index (GSG). The plot in the environment displays the performance of these investments.

comm_prices <- readRDS("data/comm_prices.RData")
re_prices <- readRDS("data/re_prices.RData")
plot(eq_prices)
lines(bond_prices, col = "red")
lines(re_prices, col = "blue")
lines(comm_prices, col = "purple")
legend(x = "topleft", legend = c("Equities", "Bonds", "Real estate", "Commodities"),
       col = c("black", "red", "blue", "purple"),
       lty = c(1, 1, 1, 1))

You can also visualize “relative” attractiveness of the investments by making a scatter plot of the average returns against the portfolio volatilities. In order to do this, you need to calculate the averages and volatilities for each asset. This corresponds to each column in the return series returns.

These calculations are made easy by using the function apply() with the first argument as the return data, the second argument is the value of 2 indicating that the calculation should be column-wise, while the third argument is the name of the function that needs to be applied on each column.

eq_returns <- Return.calculate(eq_prices)
eq_returns <- eq_returns[-1,]
bond_returns <- Return.calculate(bond_prices)
bond_returns <- bond_returns[-1,]
comm_returns <- Return.calculate(comm_prices)
comm_returns <- comm_returns[-1,]
re_returns <- Return.calculate(re_prices)
re_returns <- re_returns[-1,]
returns <- merge(eq_returns, bond_returns, comm_returns, re_returns)
colnames(returns) <- c("Equities", "Bonds", "Commodities", "Real Estate")
# Create a vector of returns 
means <- apply(returns, 2, "mean")
  
# Create a vector of standard deviation
sds <- apply(returns, 2, "sd")

# Create a scatter plot
plot(sds, means)
text(sds, means, labels = colnames(returns), cex = 0.7)
abline(h = 0, lty = 3)

3.2.2 The covariance matrix

The covariance matrix is crucial in determining the portfolio variance in the general case of N assets. Remember that an element on row i and column j correspond to the covariance of the i th and j th return. Recall also that the covariance of two return series is the product between their volatilities and their correlation, and that the covariance of an asset return with itself is its variance.

In this exercise, you will compute and analyze the covariance and correlation matrix on the monthly returns of the four asset classes from the previous exercise. To refresh, these asset classes are equities, bonds, real estate, and commodities. To create these matrices, you will use the standard functions cov() and cor().

In your workspace are the monthly investments as returns, and the vector of standard deviations sds that you created previously. Create a diagonal matrix that contains the variances on the diagonal. You can use the function diag() to do this, using a squared sds^2 as the only argument. Call this diag_cov.

sds <- c(sd(returns[,1]), sd(returns[,2]), sd(returns[,3]), sd(returns[,4]))
# Create a matrix with variances on the diagonal
diag_cov <- diag(sds^2)
diag_cov
            [,1]         [,2]        [,3]        [,4]
[1,] 0.001920218 0.0000000000 0.000000000 0.000000000
[2,] 0.000000000 0.0001222713 0.000000000 0.000000000
[3,] 0.000000000 0.0000000000 0.004797972 0.000000000
[4,] 0.000000000 0.0000000000 0.000000000 0.005447762
# Create a covariance matrix of returns
cov_matrix <- cov(returns)
cov_matrix
                Equities         Bonds   Commodities  Real Estate
Equities    1.920218e-03  3.009761e-05  1.510994e-03 0.0024063778
Bonds       3.009761e-05  1.222713e-04 -6.860821e-05 0.0002632232
Commodities 1.510994e-03 -6.860821e-05  4.797972e-03 0.0013291423
Real Estate 2.406378e-03  2.632232e-04  1.329142e-03 0.0054477625
# Create a correlation matrix of returns
cor_matrix <- cor(returns)
cor_matrix
              Equities       Bonds Commodities Real Estate
Equities    1.00000000  0.06211472  0.49780426   0.7440112
Bonds       0.06211472  1.00000000 -0.08957463   0.3225172
Commodities 0.49780426 -0.08957463  1.00000000   0.2599762
Real Estate 0.74401117  0.32251718  0.25997618   1.0000000
# Verify covariances equal the product of standard deviations and correlation
all.equal(cov_matrix[1,2], cor_matrix[1,2] * sds[1] * sds[2])
[1] TRUE

These matrices can help you in your calculations, and are also a good reference to see the relationship between your assets.

3.2.3 Matrix-based calculation of portfolio mean and variance

When w is the column-matrix of portfolio weights, μ the column-matrix of expected returns, and Σ the return covariance matrix. Then the portfolio expected return is wμ, and the portfolio variance is wΣw. Remember that the portfolio’s volatility is the square root of its variance.

You will practice matrix multiplication in R using the %*% function, instead of the standard *. In addition, you will transpose the matrices by using the standard function t(). Remember that transposing a matrix is simply changing the rows of the matrix to the columns.

The weights, vector of means, and the covariance matrix are pre-loaded in your workspace as weights, vmeans, and sigma, respectively.

weights <- c(0.4, 0.4, 0.1, 0.1)
vmeans <- c(c(equities = 0.00706924247456836, bonds = 0.00400950651113212, 
realestate = 0.00894954808678432, commodities = -0.0082960697150902
))
sigma <- matrix(c(0.00192021805592667, 3.00976104465011e-05, 0.00240637780809405, 
0.00151099397208086, 3.00976104465011e-05, 0.000122271264776651, 
0.000263223185975344, -6.86082072578803e-05, 0.00240637780809405, 
0.000263223185975344, 0.00544776248878884, 0.0013291423299705, 
0.00151099397208086, -6.86082072578803e-05, 0.0013291423299705, 
0.00479797179115701), ncol = 4)
# Create a weight matrix w
w <- as.matrix(weights)

# Create a matrix of returns
mu <- as.matrix(vmeans)

# Calculate portfolio mean monthly returns
t(w) %*% mu

# Calculate portfolio volatility
sqrt(t(w) %*% sigma %*% w)

3.3 Portfolio risk budget

3.3.1 Who did it?

You saw the difference between a capital allocation budget and a risk budget. In this exercise, you will construct a risk budget, and discover how large each asset’s percent risk contribution is in the total portfolio volatility. For this last exercise, you will calculate the risk contributions for a portfolio that is again invested 40% in equities, 40% in bonds, 10% in real estate, and 10% in commodities. The function StdDev() plays an important role in this exercise. The StdDev() function creates a list of the assets’ standard deviation ($StdDev), their risk contribution ($contribution), and their percent risk contribution ($pct_contrib_StdDev). You will be using three arguments in the StdDev() function to do this calculation. The first is R, a vector, matrix, data frame, time series, or zoo object of returns. The second is portfolio_method, which you will set to component, and the third is weights.

# Create portfolio weights
weights <- c(0.4, 0.4, 0.1, 0.1)

# Create volatility budget
vol_budget <- StdDev(returns, portfolio_method = "component", weights = weights)

# Make a table of weights and risk contribution
weights_percrisk <- cbind(weights, vol_budget$pct_contrib_StdDev)
colnames(weights_percrisk) <- c("weights", "perc vol contrib")

# Print the table
weights_percrisk
            weights perc vol contrib
Equities        0.4       0.59004087
Bonds           0.4       0.04048648
Commodities     0.1       0.14975111
Real Estate     0.1       0.21972154

Note that the percentage volatility risk caused by the bonds is much less than their portfolio weight.

4 Optimizing the portfolio

4.1 Modern portfolio theory of Harry Markowitz

Mean-variance based investing in DJIA stocks In this chapter, you will explore portfolio optimization. For the upcoming exercises, you will focus on the Dow Jones Industrial Average (DJIA) stocks. Like the S&P 500 portfolio studied in Chapter 2, the Dow Jones Industrial Average portfolio is an important reference portfolio for the U.S. equity market. It is invested in 30 large publicly owned companies based in the United States.

You will be using the sample of monthly returns from January 1991 until December 2015. These return data are available in the workspace as the variable returns.

Assume you are an investor who likes high returns and dislikes risk. You are also constrained to be fully invested in the 30 DJIA stocks, and only positive weights are allowed. Which of the following statements is false:

djia <- readRDS("data/prices.rds")
head(djia)
                 AA     AAPL      AXP       BA      BAC      CAT      CVX        DD      DIS       GE
1990-12-31 4.534478 1.345204 3.445906 13.77601 2.740819 3.259666 7.440307  7.680170 6.428398 2.355960
1991-01-31 5.115716 1.736251 3.800936 14.99043 3.369860 3.473104 7.312247  7.575678 6.866336 2.627997
1991-02-28 5.095843 1.794772 4.009782 14.71930 3.504654 3.813431 7.795698  7.953312 7.825717 2.812929
1991-03-31 5.205114 2.131782 4.824266 14.33797 4.199830 3.342211 7.976993  7.847620 7.548209 2.880146
1991-04-30 5.399917 1.724235 4.210390 13.95664 4.472743 3.346168 8.102942  8.798847 7.313112 2.926684
1991-05-31 5.689909 1.477213 4.315652 15.02803 5.109541 3.635996 7.736412 10.086981 7.368695 3.217015
                 HD      HPQ     INTC      IBM      JNJ      JPM       KO      MCD      MMM      MRK
1990-12-31 2.042869 1.004518 0.841387 18.27009 5.103177 1.651136 3.398343 4.583788 10.70052 6.468881
1991-01-31 2.287483 1.225118 0.999835 20.49322 5.441018 2.015923 3.562779 4.485423 10.59133 6.585841
1991-02-28 2.578379 1.469351 1.043544 21.01702 5.866060 2.534304 3.827706 4.977246 11.13890 7.350589
1991-03-31 2.857695 1.579714 1.021690 18.58884 6.848207 2.706823 3.983028 5.483410 11.13890 7.635325
1991-04-30 3.042916 1.615256 1.076322 16.81362 6.749990 3.193658 3.872898 5.286165 11.21757 7.798163
1991-05-31 3.486123 1.713990 1.218378 17.52841 6.501354 3.329975 4.203288 5.537626 12.09569 8.603312
               MSFT      NKE      PFE       PG      TRV      UTX       VZ      WMT      XOM        T
1990-12-31 0.723928 1.024851 1.674406 5.991295 7.683583 3.471237 7.835626 5.455634 6.233811 4.643250
1991-01-31 0.943993 1.215822 1.863621 5.515581 7.806031 3.444045 7.167901 5.951600 6.218753 4.387233
1991-02-28 0.998104 1.222185 2.191377 5.654775 8.112150 3.641321 7.241796 6.379935 6.728324 4.513183
1991-03-31 1.020955 1.174849 2.233116 5.924462 8.607386 3.549830 7.592801 6.996564 7.140262 4.712602
1991-04-30 0.952409 1.209873 2.274859 5.871020 8.715557 3.366849 7.480882 7.312538 7.262318 4.539377
1991-05-31 1.055830 1.012474 2.471658 5.958511 8.004714 3.466449 7.050730 7.741360 7.191589 4.401176
returns <- Return.calculate(djia)
returns <- returns[-1,]
# Create a vector of returns 
means <- apply(returns, 2, "mean")
  
# Create a vector of standard deviation
sds <- apply(returns, 2, "sd")

# Create a scatter plot
plot(sds, means)
text(sds, means, labels = colnames(returns), cex = 0.7)
abline(h = 0, lty = 3)

You would rather invest all of your capital in Home Depot (ticker: HD) than in Caterpillar (ticker: CAT) because Home Depot has a higher average return and lower risk.

By combining investments, it may be possible to obtain a portfolio with lower variance than the variance of the Exxon Mobil Corporation’s stock (ticker: XOM).

4.1.1 Exploring monthly returns of the 30 DJIA stocks

The 1991-2015 monthly returns on the 30 DJIA stocks are available in the workspace as the variable returns. This exercise will help you get comfortable with the data that you will be using for the remainder of the exercises.

Recall that if we compute the means column by column using colMeans(returns), or apply(returns, 2, “mean”), you then obtain the average return per asset. In this exercise, you will be computing the mean per row. You can do this similarly using the function rowMeans() and supplying the argument 1 instead of two to indicate row-wise calculations in the apply() function. By doing so, you obtain the time series of returns for the equally weighted portfolio.

# Verify the class of returns 
class(returns)
[1] "xts" "zoo"
# Investigate the dimensions of returns
dim(returns)
[1] 300  30
# Create a vector of row means
ew_preturns <- rowMeans(returns)

# Cast the numeric vector back to an xts object
ew_preturns <- xts(ew_preturns, order.by = time(returns))

# Plot ew_preturns
plot.zoo(ew_preturns)

The chart you created visualizes the returns for an equally weighted portfolio.

4.1.2 Finding the mean-variance efficient portfolio

A mean-variance efficient portfolio can be obtained as the solution of minimizing the portfolio variance under the constraint that the portfolio expected return equals a target return. A convenient R function for doing so is the function portfolio.optim() in the R package tseries. Its default implementation finds the mean-variance efficient portfolio weights under the constraint that the portfolio return equals the return on the equally-weighted portfolio. The only argument needed is the monthly return data on the portfolio components for which the weights need to be determined.

The variable returns containing the monthly returns of the DJIA stocks is already loaded in the console.

Create a mean-variance efficient portfolio of monthly returns using the default of portfolio.optim() targeting the equally-weighted portfolio return, and assign the output to the variable opt.

# Load tseries
library(tseries)

# Create an optimized portfolio of returns
opt <- portfolio.optim(returns)

# Create pf_weights
pf_weights <- opt$pw

# Assign asset names
names(pf_weights) <- colnames(returns)

# Select optimum weights opt_weights
opt_weights <- pf_weights[pf_weights >= 0.01]

# Bar plot of opt_weights
barplot(opt_weights)


# Print expected portfolio return and volatility
opt$pm
[1] 0.01249652
opt$ps
[1] 0.0355865

The portfolio you created has an average return of 1.2%!

4.1.3 Effect of the return target

This exercise will show the effect of increasing your target return on the volatility of your mean-variance efficient portfolio.

The function portfolio.optim has arguments that allow for more general specifications. The arguments are as follows:

portfolio.optim(x, pm = mean(x), shorts = FALSE, reshigh = NULL)

The argument pm sets the target return, the argument reshigh specifies the upper constraints on the portfolio weights, and the argument shorts is a logical statement specifying whether negative weights are forbidden or not, by default shorts = FALSE.

You will create a portfolio that is optimized for a target return that equals the average value of the return series returns. Then you will create a portfolio that has a target return that is 10% higher than the mean return series and calculate the proportion change in risk.

# Create portfolio with target return of average returns 
pf_mean <- portfolio.optim(returns, pm = mean(returns))

# Create portfolio with target return 10% greater than average returns
pf_10plus <- portfolio.optim(returns, pm = 1.1 * mean(returns))

# Print the standard deviations of both portfolios
print(pf_mean$ps)
[1] 0.0355865
print(pf_10plus$ps)
[1] 0.03842654
# Calculate the proportion increase in standard deviation
(pf_10plus$ps - pf_mean$ps) / (pf_mean$ps)
[1] 0.07980684

By increasing your target return, your portfolio’s volatility increased by 8%!

4.1.4 Imposing weight constraints

Investors are often constrained by the maximum values allowed for the portfolio weights. These constraints can actually be an advantage. The advantage of a maximum weight constraint is that the subsequent portfolio will be less concentrated in certain assets. There is a disadvantage to this, though. The disadvantage is that the same target return may no longer be possible or will be obtained at the expense of higher volatility.

Remember from the previous exercise that the function portfolio.optim() allows you to set weight constraints within the reshigh argument. reshigh requires a vector of maximum weights for each asset.

In this exercise, you will create three portfolios with different maximum weight constraints. For this exercise, it is important to know the output of the portfolio.optim() function. This function creates a list containing four components: (i) $pw: the portfolio weights, (ii) $px: the returns of the overall portfolio, (iii) $pm: the expected return portfolio, (iv) $ps: the standard deviation of the portfolio returns. Create three vectors of maximum weights for each asset (column) in returns using the rep() function. The first vector will contain maximum weights of 100%, the second 10%, and the third 5%. Call these max_weights1, max_weights2, max_weights3, respectively.

# Create vectors of maximum weights
max_weights1 <- rep(1, ncol(returns))
max_weights2 <- rep(0.1, ncol(returns))
max_weights3 <- rep(0.05, ncol(returns))

# Create an optimum portfolio with max weights of 100%
opt1 <- portfolio.optim(returns, reshigh = max_weights1)

# Create an optimum portfolio with max weights of 10%
opt2 <- portfolio.optim(returns, reshigh = max_weights2)

# Create an optimum portfolio with max weights of 5%
opt3 <- portfolio.optim(returns, reshigh = max_weights3)

# Calculate how many assets have a weight that is greater than 1% for each portfolio
sum(opt1$pw > .01)
[1] 15
sum(opt2$pw > .01)
[1] 17
sum(opt3$pw > .01)
[1] 22
# Print portfolio volatilites 
opt1$ps
[1] 0.0355865
opt2$ps
[1] 0.03616089
opt3$ps
[1] 0.03798736

By imposing weight constraints you can ensure that your portfolio isn’t completely dominated by just a few assets.

4.2 The efficient frontier

4.2.1 Computing the efficient frontier using a grid of target returns

As you have seen, one approach to compute the efficient frontier is to first define the grid of target returns and then, for each target return, find the portfolio that has an expected return equal to the target return at the lowest possible variance.

But what is a reasonable grid of target returns? You will set the maximum target return to the maximum average return of the stocks. Ideally, you would set the minimum target return to the return of the minimum variance portfolio. Because you don’t know this minimum variance portfolio return yet, you create a grid using the minimum of average returns from all stocks.

In this exercise, you will use a for loop to calculate your grid of potential portfolio means, deviations, and weights.

# Calculate each stocks mean returns
stockmu <- colMeans(returns)

# Create a grid of target values
grid <- seq(from = 0.01, to = max(stockmu), length.out = 50)

# Create empty vectors to store means and deviations
vpm <- vpsd <- rep(NA, length(grid))

# Create an empty matrix to store weights
mweights <- matrix(NA, 50, 30)

# Create your for loop
for(i in 1:length(grid)) {
  opt <- portfolio.optim(x = returns, pm = grid[i])
  vpm[i] <- opt$pm
  vpsd[i] <- opt$ps
  mweights[i, ] <- opt$pw
}

The for loop that you just created is extremely helpful in being able to explore how target returns can affect weights, volatility, and portfolio returns.

4.2.2 Interpreting the efficient frontier

efficient frontier

The portfolio with the greatest excess return per unit of portfolio volatility is the tangency portfolio.

4.2.3 Properties of the efficient frontier

The curve shown in the previous plot shows the solution of minimizing the variance under the constraint that the portfolio expected return equals the target return. When the target return is below the minimum variance return, the obtained portfolio is not efficient because a higher return at lower risk can be obtained by investing in the minimum variance portfolio. The optimized portfolios often are also not realistic because they tend to invest large weights in only a few assets. A practical solution to avoid this is to impose weight constraints.

The effect of such a weight constraint is shown in the workspace, where you can see the efficient frontier plots obtained for the DJIA stocks under a maximum 100% weight constraint (black line), a maximum 10% weight constraint (red line) and a maximum 5% weight constraint (blue).

  • Imposing a more strict weight constraint leads to a shift to the right of the efficient frontier.
  • The portfolios on the dashed lines are not mean-variance efficient.

4.2.4 The minimum variance and maximum Sharpe ratio portfolio

In the previous exercises, you computed the efficient frontier using a grid of target returns. The output of your calculation of the efficient frontier was a series of two vectors, vpm (vector of portfolio means), and vpsd (vector of standard deviations, or volatilities), and a matrix of weights called mweights. You will use these outputs to identify the portfolios with the least volatility, and the greatest Sharpe ratio, and then plot their weight allocation.

As a reminder, the Sharpe Ratio is found by taking the excess returns less than the risk-free rate, divided by the portfolio volatility.

colnames(mweights) <- colnames(returns)
# Create weights_minvar as the portfolio with the least risk
weights_minvar <- mweights[vpsd == min(vpsd), ]

# Calculate the Sharpe ratio
vsr <- (vpm - 0.0075) / vpsd

# Create weights_max_sr as the portfolio with the maximum Sharpe ratio
weights_max_sr <- mweights[vsr == max(vsr)]

# Create bar plot of weights_minvar and weights_max_sr
par(mfrow = c(2, 1), mar = c(3, 2, 2, 1))
barplot(weights_minvar[weights_minvar > 0.01])
barplot(weights_max_sr[weights_max_sr > 0.01])

4.3 In-sample vs. out-of-sample evaluation

4.3.1 Split-sample evaluation

In chapter 2, you used the function window() to subset your returns for graphical purposes. In this exercise, you will use the window() to create two samples: an estimation sample and an evaluation sample. This exercise will illustrate how portfolio weights can differ when changing the estimation window.

To remind you, the function window() has argument of x, start, and end. Where start and end are in the format “YYYY-MM-DD”.

# Create returns_estim 
returns_estim <- window(returns, start = "1991-01-01", end = "2003-12-31")

# Create returns_eval
returns_eval <- window(returns, start = "2004-01-01", end = "2015-12-31")

# Create vector of max weights
max_weights <- rep(0.1, ncol(returns))

# Create portfolio with estimation sample 
pf_estim <- portfolio.optim(returns_estim, reshigh = max_weights)

# Create portfolio with evaluation sample
pf_eval <- portfolio.optim(returns_eval, reshigh = max_weights)

# Create a scatter plot with evaluation portfolio weights on the vertical axis
plot(pf_estim$pw, pf_eval$pw)
abline(a = 0, b = 1, lty = 3)

If portfolio weights are identical, they should be on the 45-degree line. By creating two samples using the window() function you can backtest, or evaluate the performance of a portfolio on historical data!

4.3.2 Out of sample performance evaluation

This example will illustrate how your returns can change based on the weighting created by an optimized portfolio. You will use the estimation portfolio (pf_estim) to evaluate the performance of your portfolio on the estimation sample of returns (returns_eval).

How severe is the optimality loss? Let’s compare, for the portfolio weights in pf_estim, the performance you expected using the evaluation sample (returns_estim) with the actual return on the out-of-sample period (returns_eval).

Calculate the returns of the portfolio with monthly rebalance weights pf_estim$pw on the estimation sample returns_estim. Call this returns_pf_estim

# Create returns_pf_estim
returns_pf_estim <- Return.portfolio(returns_estim, pf_estim$pw, rebalance_on = "months")


# Create returns_pf_eval
returns_pf_eval <- Return.portfolio(returns_eval, pf_estim$pw, rebalance_on = "months")

# Print a table for your estimation portfolio
table.AnnualizedReturns(returns_pf_estim)
ABCDEFGHIJ0123456789
 
 
portfolio.returns
<dbl>
Annualized Return0.1940
Annualized Std Dev0.1325
Annualized Sharpe (Rf=0%)1.4634

# Print a table for your evaluation portfolio
table.AnnualizedReturns(returns_pf_eval)
ABCDEFGHIJ0123456789
 
 
portfolio.returns
<dbl>
Annualized Return0.0864
Annualized Std Dev0.1242
Annualized Sharpe (Rf=0%)0.6954

The results from the pf_eval are what you may expect in a real performance. You have now successfully evaluated a split sample, this wouldn’t be possible without the use of historical data

4.3.3 It ain’t over

This journey showed you how to compute returns, analyze portfolio performance, and optimize portfolios. These are important skills since whenever you hold or manage more than one asset, you have a portfolio to consider.

It is now time to compound and get an extra return by exploring other useful R packages for portfolio analysis like PortfolioAnalytics. Instead of optimizing the portfolio by minimizing the variance under a return target constraint, this package allows you to optimize virtually all objective functions under all possible constraints. Even if the solution to the problem is not feasible, it will give you the best possible solution using heuristics, such as differential evolution in the R package DEoptim.

---
title: "Introduction to Portfolio Analysis in R"
output:
  html_notebook:
    toc: true
    toc_float: true
    toc_collapsed: false
    number_sections: true
    
toc_depth: 3
---
# The building blocks

## Introduction

### Getting a grasp of the basics

The goal of every investor is to make profits while limiting the amount of risk involved. Investors create strategies to reduce these risks.
Diversified portfolios reduce risk by offsetting loss with a potential gain in another asset!

### Get a feel for the data

The choice of investment matters even when the underlying risky assets are similar. As a first example, let us consider the stock price of the Coca Cola Company and the PepsiCo company from January 2003 until the end of August 2016.

The time series plot shows you the value evolution of one dollar invested in each company. As an exercise, plot the time series showing the relative value of an investment in the Coca Cola company, compared to the value of an investment in PepsiCo. To do this exercise, you can use the corresponding price series, available as the variables ko and pep in your workspace.

Three important packages that you will use in this course are the xts and zoo packages (these are popular when working with time series data), and the PerformanceAnalytics package.
Define ko_pep as the ratio expressing the value of the share price of the Coca Cola company in terms of the share price of PepsiCo.

Use plot.zoo() to visualize the variation in this ratio over time. Note that when the value of the ratio is larger than 1, the performance of ko since January 2003 is higher than that of pep.

As a reference line, use abline() to include a horizontal line at h=1. Note that where the value of the ratio is larger than one, the Coca Cola Company outperforms Pepsico and vice versa.
```{r}
library(zoo)
library(xts)
```

```{r}
stock01 <- readRDS("data/re_prices.RData")
stock02 <- readRDS("data/eq_prices.RData")
plot(stock01)
lines(stock02, col = "red")
```
```{r}
# Define ko_pep 
ko_pep <- stock02/stock01

# Make a time series plot of ko_pep
plot.zoo(ko_pep)  
  
# Add as a reference, a horizontal line at 1
abline(h = 1)  
```
The horizontal reference indicates when Coca Cola Company and PepsiCo are performing equally.

## The portfolio weights

### Calculating portfolio weights when component values are given

In the video, it was shown that you can easily compute portfolio weights if you have a given amount of money invested in certain assets. If you want to start investing in a portfolio, but you have budget restraints, you can also impose weights yourself. Depending on what these are, you will invest a certain amount of money in each of the assets based on their weight.

When given asset values, calculating the weights is quite simple. Recall from the video that weights are calculated by taking the value of an asset divided by the sum of values from all assets.

In this exercise, you will learn to calculate weights when individual asset values are given! For this example, an investor has 4000 USD invested in equities, 4000 USD invested in bonds, and 2000 USD invested in commodities. Compute the weights as the proportion invested in each of those three assets.
```{r}
# Define the vector values
values <- c(4000, 4000, 2000)

# Define the vector weights
weights <- values/sum(values)

# Print the resulting weights
print(weights)
```
Remember that the weight of an asset is calculated by taking the value of the asset divided by the sum of values of all assets. You can also pre-define weights as a vector. For example weights <- c(0.2, 0.2, 0.6).

### The weights of a market capitalization-weighted portfolio

In a market capitalization-weighted portfolio, the weights are given by the individual assets' market capitalization (or market value), divided by the sum of the market capitalizations of all assets. A typical example is the S&P 500 portfolio invested in the 500 largest companies listed on the US stock exchanges (NYSE, Nasdaq). Note that by dividing by the sum of asset values across all portfolio assets, the portfolio weights sum to unity (one).

As an exercise, inspect the distribution of market capitalization based weights when the portfolio is invested in 10 stocks. For this exercise, you can use market capitalizations of 5, 8, 9, 20, 25, 100, 100, 500, 700, and 2000 million USD.
```{r}
# Define marketcaps
marketcaps <- c(5, 8, 9, 20, 25, 100, 100, 500, 700, 2000)
  
# Compute the weights
weights <- marketcaps/sum(marketcaps)
  
# Inspect summary statistics
summary(weights)
  
# Create a bar plot of weights
barplot(weights)
```
## The portfolio return

### Calculation of portfolio returns

For your first exercise on calculating portfolio returns, you will verify that a portfolio return can be computed as the weighted average of the portfolio component returns. In other words, this means that a portfolio return is calculated by taking the sum of simple returns multiplied by the portfolio weights. Remember that simple returns are calculated as the final value minus the initial value, divided by the initial value.

Assume that you invested in three assets. Their initial values are 1000 USD, 5000 USD, 2000 USD, respectively. Over time, the values change to 1100 USD, 4500 USD, and 3000 USD.
```{r}
# Vector of initial value of the assets
in_values <- c(1000, 5000, 2000)
  
# Vector of final values of the assets
fin_values <- c(1100, 4500, 3000)

# Weights as the proportion of total value invested in each asset
weights <- in_values/sum(in_values)

# Vector of simple returns of the assets 
returns <- (fin_values - in_values)/in_values
returns
# Compute portfolio return using the portfolio return formula
preturns <- sum(weights*returns)
preturns
```
Note that the weighted average portfolio return equals the percentage change in the total portfolio value. Remember that portfolio returns are calculated by taking the sum of simple returns, multiplied by the portfolio weights.

### From simple to gross and multi-period returns

The simple return R expresses the percentage change in the value of an investment. The corresponding so-called "gross" return is defined as the future value of 1 USD invested in the asset for one period and is thus equal to 1+R.

The gross return over two periods is obtained similarly. Let $R1$ be the return in the first period and $R2$ the return in the second period. Then the end-value of a 1 USD investment is (1+$R1$)∗(1+$R2$).

The corresponding simple return over those two periods is: (1+$R1$)∗(1+$R2$)−1.

Suppose that you have an investment horizon of two periods. In the first period, you make a 10% return. But in the second period, you take a loss of 5%.
```{r}
1000*((1 + 0.1)*(1 - 0.05) - 1)
```
End-value = 1045

### The asymmetric impact of gains and losses

It is important to be aware of the fact that a positive and negative return of the same relative magnitude do not compensate each other in terms of terminal wealth. Mathematically, this can be seen from the identity (1+x)∗(1−x)=1−x2, which is less than one. A 50% loss is thus not compensated by a 50% gain. After a loss of 50%, what is the return needed to be at par again? 100%

## PerformanceAnalytics

### Buy-and-hold versus (daily) rebalancing

The choice of investment matters, even when the underlying risky assets are similar. As an example, you will now consider the stock price of Apple and Microsoft from January 2006 until the end of August 2016. The time series plot shows you the value evolution of one dollar invested in each of them.
For this exercise, your portfolio approach will be to invest half of your budget in Apple stock, and the other half of your budget in Microsoft stock. Over time, the portfolio weights will change. You will have two choices as an investor. The first choice is to be passive and not trade any further. This is called a buy and hold strategy. The second choice is to buy and trade at the close of each day that results in a rebalance of the portfolio such that your portfolio is equally invested in shares of Microsoft and Apple. This is a rebalanced portfolio.

-	By investing in a portfolio of risky assets, the portfolio payoff will be in between the minimum and maximum of the payoff of the underlying risky assets.
-	By investing in a portfolio of risky assets, the risk of the portfolio payoff will be less than the maximum risk of the underlying risky assets.
-	Because the prices of Microsoft and Apple have evolved differently, the investor who spent initially half of his wealth on Microsoft and Apple ends up with a portfolio that is no longer equally invested.

### The time series of asset returns

Calculating the returns for one period is pretty straightforward to do in R. When the returns need to be calculated for different dates the functions Return.calculate() and Return.portfolio(), in the R package PerformanceAnalytics are extremely helpful. They require the input data to be of the xts-time series class, which is pre-loaded. You will explore the functionality of the PerformanceAnalytics package in this exercise.

The object prices which contains the Apple (aapl) and Microsoft (msft) stocks is available in the workspace.
```{r}
prices <- readRDS("data/aapl_msft.RData")
```

```{r}
plot(prices$AdjClose.aapl)
lines(prices$AdjClose.msft, col = "red")
```

```{r}
# Load package PerformanceAnalytics 
library(PerformanceAnalytics)

# Print the first six rows and last six rows of prices
head(prices)
tail(prices)

# Create the variable returns using Return.calculate()  
returns <- Return.calculate(prices)
  
# Print the first six rows of returns. Note that the first observation is NA because there is no prior price.
head(returns)
  
# Remove the first row of returns
returns <- returns[-1, ]
head(returns)
```
Notice that you removed the first row of your returns because they were NA values!

### The time series of portfolio returns

In the previous exercise, you created a variable called returns from the daily prices of stocks of Apple and Microsoft. In this exercise, you will create two portfolios using the return series you previously created. The two portfolios will differ in one way, and that is the weighting of the assets.

In the last video, you were introduced to two weighting strategies: the buy and hold strategy, and a monthly rebalancing strategy. In this exercise, you will create a portfolio in which you don’t rebalance, and one where you rebalance monthly. You will then visualize the portfolio returns of both.

You will use the function Return.portfolio() for your calculations. For this function, you will provide three arguments: R, weights, and rebalance_on. R is a time series of returns, weights is a vector containing asset weights, and rebalance_on specifies which calendar-period to rebalance on. If you need help, be sure to check the documentation.

```{r}
# Create the weights
eq_weights <- c(0.5, 0.5)

# Create a portfolio using buy and hold
pf_bh <- Return.portfolio(R = returns, weights = eq_weights)

# Create a portfolio rebalancing monthly 
pf_rebal <- Return.portfolio(R = returns, weights = eq_weights, rebalance_on = "months")

# Plot the time-series
par(mfrow = c(2, 1), mar = c(2, 4, 2, 2))
plot.zoo(pf_bh)
plot.zoo(pf_rebal)
```
You will see based on the plots that rebalancing a portfolio can lead to less extreme returns in either direction.

### The time series of weights

In the previous exercise, you explored the functionality of the Return.portfolio() function and created portfolios using two strategies. However, by setting the argument verbose = TRUE in Return.portfolio() you can create a list of beginning of period (BOP) and end of period (EOP) weights and values in addition to the portfolio returns, and contributions.

You can access these from the resultant list-object created from Return.portfolio(). The resultant list contains \$returns, \$contributions, \$BOP.Weight, \$EOP.Weight, \$BOP.Value, and \$EOP.Value.

In this exercise, you will recreate the portfolios you made in the last exercise but extend it by creating a list of calculations using verbose = TRUE. You will then visualize the end of period weights of Apple.
```{r}
# Create the weights
eq_weights <- c(0.5, 0.5)

# Create a portfolio using buy and hold
pf_bh <- Return.portfolio(returns, weights = eq_weights, verbose = TRUE )

# Create a portfolio that rebalances monthly
pf_rebal <- Return.portfolio(returns, weights = eq_weights, rebalance_on = "months", verbose = TRUE  )

# Create eop_weight_bh
eop_weight_bh <- pf_bh$EOP.Weight

# Create eop_weight_rebal
eop_weight_rebal <- pf_rebal$EOP.Weight

# Plot end of period weights
par(mfrow = c(2, 1), mar=c(2, 4, 2, 2))
plot.zoo(eop_weight_bh$AdjClose.aapl)
plot.zoo(eop_weight_rebal$AdjClose.aapl)
```
Look at the two plots. Notice how the weights the rebalanced portfolio stay close to the initial weight of 50%, and how they buy and hold strategy has Apple stock with about 80% weight

# Analyzing performance

## Dimensions of portfolio performance

### Exploring the monthly S&P 500 returns

For the upcoming exercises, you will examine the monthly performance of the S&P 500. A picture is worth a thousand words. This is why most analyses of performance start by studying the time series plot of the value of an investment.

A plot of the S&P 500 for the period of 1986 until August 2016 is shown to your right. Each observation on this plot is an end-of-day value. This chart shows a number of booms and busts. Take a look at the chart, do you see why the 2000s are so often called the lost decade of investing?
Your job is to describe the monthly performance of the S&P 500. To do so, you will first need to aggregate daily price series to end-of-month prices. You will then calculate the monthly returns and visualize them in a table.

```{r}
sp500 <- readRDS("data/sp500.RData")
sp500 <- as.xts(sp500)
plot(sp500)
```
```{r}
# Convert the daily frequency of sp500 to monthly frequency: sp500_monthly
#sp500_monthly <- to.monthly(sp500, indexAt = "endof")

# Print the first six rows of sp500_monthly
#head(sp500_monthly)

# Create sp500_returns using Return.calculate using the closing prices
sp500_returns <- Return.calculate(sp500) 
sp500_returns <- sp500_returns[-1,]
# Time series plot
plot.zoo(sp500_returns)

# Produce the year x month table
table.CalendarReturns(sp500_returns)
```
### The monthly mean and volatility

You have now created month to month returns for the S&P 500. Next, you will need to do a descriptive analysis of the returns.

In this exercise, you will calculate the mean (arithmetic and geometric) and volatility (standard deviation) of the returns. These returns are available in your workspace as the variable sp500_returns.
```{r}
# Compute the mean monthly returns
mean(sp500_returns)

# Compute the geometric mean of monthly returns
mean.geometric(sp500_returns)

# Compute the standard deviation
sd(sp500_returns)
```
The volatility and mean return are very important metrics in every investor's decision making!

## The (annualized) Sharpe ratio

### Excess returns and the portfolio's Sharpe ratio

You just learned how to create descriptive statistics of your portfolio returns. Now you will learn how to evaluate your portfolio's performance!

Performance evaluation involves comparing your investment choices with an alternative investment choice. Most often, the performance is compared with investing in an (almost) risk-free asset, such as the U.S. Treasury Bill. The return from a U.S. Treasury Bill is known as a risk-free rate because Treasury Bills (T-Bills) are backed by the U.S. Government.

In this exercise, you will be asked to annualize the risk-free rate by using the compound interest formula. The yearly compounded interest rate is given by $(1+y)^{12}−1$. The annual rate is used to estimate a yearly return and is very useful for forecasting.

The Sharpe Ratio is an important metric that tells us the return-to-volatility ratio. It is calculated by taking the mean of excess returns (returns - risk-free rate), divided by the volatility of the returns.
```{r}
#download 13 week Ttreasury Bill (ticker: ^IRX)
library(tseries)
con <- url("https://finance.yahoo.com")
if(!inherits(try(open(con), silent = TRUE), "try-error")) {
  close(con)
  tb <- get.hist.quote(instrument = "^IRX", start = "1986-01-01", end = "2016-08-31",
                      quote = "Close", compression = "d")
  plot(tb)
}
tb_monthly <- to.monthly(tb, indexAt = "startof")
# monthly interest for risk free purposes
rf <- tb_monthly$tb.Close/12/100
rf <- as.xts(rf)
```

```{r}
# Compute the annualized risk free rate
annualized_rf <- (1 + rf)^12 - 1

# Plot the annualized risk-free rate
plot(annualized_rf)

# Compute the series of excess portfolio returns 
sp500_excess <- sp500_returns - rf

# Compare the mean of sp500_excess and sp500_returns 
mean(sp500_excess)
mean(sp500_returns)

# Compute the Sharpe ratio
sp500_sharpe <- mean(sp500_excess) / sd(sp500_returns)
sp500_sharpe
```
Remember that the Sharpe Ratio is calculated by taking the mean of excess returns and dividing by the volatility of those returns.

### Annualized mean and volatility

The mean and volatility of monthly returns correspond to the average and standard deviation over a monthly investment horizon. Investors annualize those statistics to show the performance over an annual investment horizon.

To do so, the package PerformanceAnalytics has the function Return.annualized() and StdDev.annualized() to compute the (geometrically) annualized mean return and annualized standard deviation for you.
Remember that the Sharpe ratio is found by taking the mean excess returns subtracted by the risk-free rate, and then divided by the volatility!
```{r}
# Compute the annualized mean
Return.annualized(sp500_returns)

# Compute the annualized standard deviation
StdDev.annualized(sp500_returns)

# Compute the annualized Sharpe ratio: ann_sharpe
ann_sharpe <- Return.annualized(sp500_returns) / StdDev.annualized(sp500_returns)

# Compute all of the above at once using table.AnnualizedReturns()
table.AnnualizedReturns(sp500_returns)
```
Annualizing returns allows you to look at performance over a longer period of time.

## Rolling annualized mean and volatility
In previous exercises, you have already familiarized yourself with the Return.annualized() and StdDev.annualized() functions. In this exercise, you will also use the function SharpeRatio.annualized() to calculate the annualized Sharpe Ratio. This function takes the arguments R, and Rf. The R argument takes an xts, vector, matrix, data.frame, timeSeries, or zoo object of asset returns. The Rf argument is necessary in SharpeRatio.annualized(), as it takes into account the risk-free rate in the same period of your returns. For this example, you can use the object rf to fulfill the Rf argument.

The function chart.RollingPerformance() makes it easy to visualize the rolling estimates of performance in R. Familiarize yourself first with the syntax of this function. It requires you to specify the time series of portfolio returns (by setting the argument R), the length of the window (width), and the function used to compute the performance (argument FUN). To see all three plots together, PerformanceAnalytics provides a shortcut function charts.RollingPerformance(). Note the charts instead of chart. This function creates all of the previous charts at once and does not use the argument FUN!
```{r}
# Calculate the mean, volatility, and Sharpe ratio of sp500_returns
returns_ann <- Return.annualized(sp500_returns)
sd_ann <- StdDev.annualized(sp500_returns)
sharpe_ann <-SharpeRatio.annualized(sp500_returns, Rf = rf)

# Plotting the 12-month rolling annualized mean
chart.RollingPerformance(R = sp500_returns, width = 12, FUN = "Return.annualized")

# Plotting the 12-month rolling annualized standard deviation
chart.RollingPerformance(R = sp500_returns, width = 12, FUN = "StdDev.annualized")

# Plotting the 12-month rolling annualized Sharpe ratio
chart.RollingPerformance(R = sp500_returns, width = 12, FUN = "SharpeRatio.annualized", Rf = rf)
```
Note that the dynamics in the return, risk, and risk-adjusted returns are recurrent. Investors can thus compare current rolling risk and return statistics with past values to pinpoint the past periods in which performance has been similar.

### Subperiod performance analysis and the function window

In the previous exercise, you computed the performance measure on each possible sample of a fixed size by rolling through time. Often investors are interested in the performance of a specific subwindow. You can create subsets of a time series in R using the function window(). The first argument is the return series that needs subsetting. The second argument is the starting date of the subset in the form "YYYY-MM-DD", and the third argument is the ending date in the same format.
```{r}
con <- url("https://finance.yahoo.com")
if(!inherits(try(open(con), silent = TRUE), "try-error")) {
  close(con)
  sp500 <- get.hist.quote(instrument = "^GSPC", start = "1986-01-01", end = "2016-08-31",
                      quote = "Close", compression = "d")
}
head(sp500)
```

```{r}
sp500_returns  <- Return.calculate(sp500$Close)
sp500_returns <- sp500_returns[-1,]
# Fill in window for 2008
sp500_2008 <- window(sp500_returns, start = "2008-01-01", end = "2008-12-31")

# Create window for 2014
sp500_2014 <- window(sp500_returns, start = "2014-01-01", end = "2014-12-31")

# Plotting settings
par(mfrow = c(1, 2) , mar=c(3, 2, 2, 2))
names(sp500_2008) <- "sp500_2008"
names(sp500_2014) <- "sp500_2014"
```

Plot histogram of returns in 2008 using the function chart.Histogram(). Set the argument methods = c("add.density", "add.normal") to visualize the non-parametric estimate of the density and the density under an assumed normal distribution.

```{r}
# Plot histogram of 2008
chart.Histogram(sp500_2008, methods = c("add.density", "add.normal"))

# Plot histogram of 2014
chart.Histogram(sp500_2014, methods = c("add.density", "add.normal"))
```
## Non-normality of the return distribution

### Balancing risk and reward


The x-axis corresponds to the return of an asset, and the y-axis is the corresponding density, indicating how likely that return value is. Take a moment to look over these charts.

Notice that the distributions are all symmetric, and the center of the distribution corresponds to the average return.

Return distributions A and B have the same expected return. But distribution B has higher volatility and therefore has a wider range of possible outcomes. This indicates that the risk of a large deviation from the mean return is higher for investing in B than for investing in A. Distribution C has the same level of volatility as Distribution A but a higher mean return.
Distribution C has the highest average return with the lowest volatility.

### Detecting non-normality using skewness and kurtosis


Returns are most often non-normal in nature. Two metrics key to understanding the distribution of non-normal returns are skewness and kurtosis. The skewness will help you identify whether or not negative or positive returns occur more frequently. Negative skewness indicates that large negative returns occur more often than large positive ones, and vice versa.

Kurtosis will be positive if there are fat tails in your distribution. This means that large positive or negative returns will happen more often than can be assumed under a normal distribution.

```{r}
sp500_daily <- sp500_returns
sp500_monthly <- to.monthly(sp500$Close, indexAt = "endof")
sp500_monthly <- CalculateReturns(sp500_monthly)
sp500_monthly <- sp500_monthly[-1, 4]
par(mfrow = c(1, 2) , mar=c(3, 2, 2, 2))
# Plot histogram of 2008
chart.Histogram(sp500_daily, methods = c("add.density", "add.normal"))
# Plot histogram of 2008
chart.Histogram(sp500_monthly, methods = c("add.density", "add.normal"))
```
The histograms in the plot environment compare the daily and monthly returns of the S&P 500 over the period of 1986 until today. There seems to be a negative skewness() in these plots, and a somewhat greater than normal kurtosis(). Note that, by default, kurtosis() reports the excess kurtosis (that is, the kurtosis minus three). Let's see if the numbers match our observations!
```{r}
#  Compute the skewness 
skewness(sp500_daily)
skewness(sp500_monthly) 

# Compute the excess kurtosis 
kurtosis(sp500_daily)
kurtosis(sp500_monthly)
```
The excess kurtosis can be thought of as how fat the tails of the return distribution are compared to the normal distribution.

### Downside risk measures

The standard deviation gives equal weight to positive and negative returns in calculating the return variability. When the return distribution is asymmetric (skewed), investors use additional risk measures that focus on describing the potential losses. One such measure of downside risk is the Semi-Deviation. The Semi-Deviation is the calculation of the variability of returns below the mean return.

Another more popular measure is the so-called Value-at-Risk (or VaR). Loosely speaking, the VaR corresponds to the 5% quantile of the return distribution, meaning that a more negative return can only happen with a probability of 5%. For example you might ask: "what is the largest loss I could potentially take within the next quarter such that I only have 5% probability of observing an even larger loss?"

The expected shortfall is another measure of risk that focuses on the average loss below the 5% VaR quantile.

In this exercise, you will examine the potential risk of the monthly returns of the S&P 500. You will use the functions SemiDeviation(), VaR(), and ES(). These functions all require the argument R, which is an xts, vector, matrix, data.frame, timeSeries, or zoo object of asset returns. However the VaR(), and ES() require another argument p, or loss probability level, which is p = 0.05 in case of the 5% VaR and ES.

Use the function VaR() to estimate the 5% and 2.5% value-at-risk of a monthly investment in S&P 500 returns.

Use the function ES() to estimate the 5% and 2.5% expected shortfall of a monthly investment in S&P 500 returns.
```{r}
# Calculate the SemiDeviation
SemiDeviation(sp500_monthly)

# Calculate the value at risk
VaR(sp500_monthly, p = 0.05)
VaR(sp500_monthly, p = 0.025)

# Calculate the expected shortfall
ES(sp500_monthly, p = 0.05)
ES(sp500_monthly, p = 0.025)
```
You now know how to calculate the potential risk of your investments! Each statistic tells you something slightly different. It is important that you remember what your risk measures are!

### Drawdowns due to buying high, selling low

The volatility, semi-deviation, value-at-risk, and expected shortfall are all measures that describe risk over one period. These metrics do not do a great job at describing the worst-case risk of buying at a peak and selling at a trough. This sort of risk can be quantified by analyzing the portfolio's drawdowns, or peak-to-trough decline in cumulative returns.
The function table.Drawdowns() in PerformanceAnalytics reports the five largest drawdown episodes over a sample period. The package also has another function chart.Drawdown() to visualize the evolution of the drawdowns from each peak over time.
You will now explore the historic drawdown episodes of the monthly returns of the S&P 500. The returns are loaded into your workspace as sp500_monthly.

```{r}
# Table of drawdowns
table.Drawdowns(sp500_monthly)

# Plot of drawdowns
chart.Drawdown(sp500_monthly)
```
# Performance drivers

## Drivers in the case of two assets

### Driver 1: The assets' individual performance

three types of drivers for a portfolio's performance: (i) the individual performance of the assets in terms of risk and return, (ii) the portfolio weight of each of the assets, (iii) the correlation between the asset returns.

Now let's analyze some data to get a better grasp of this result! In this example, you will consider investing – with monthly frequency – in U.S. equities and U.S. bonds. Each assets' returns are stored in your workspace as returns_equities, and returns_bonds. In addition, the portfolio is invested 60/40, meaning that every month you invest 60% in equities and 40% bonds. These portfolio returns are stored as returns_6040.
```{r}
eq_prices <- readRDS("data/eq_prices.RData")
head(eq_prices)
bond_prices <- readRDS("data/bond_prices.RData")
head(bond_prices)
returns <- merge(eq_prices, bond_prices)
weights <- c(0.60, 0.40)
portfolio <- eq_prices * 0.60 + bond_prices *0.40
head(portfolio)
```
```{r}
plot(eq_prices)
lines(bond_prices, col = "red")
lines(as.zoo(portfolio), col = "blue")
legend(x = "topleft", legend = c("Equities", "Bonds", "60/40 Portfolio"),
       col = c("black", "red", "blue"),
       lty = c(1, 1, 1))
```
The portfolio mean return equals the position weighted sum of the component mean returns. You can test this using: all.equal(mean(returns_6040), 0.6 * mean(returns_equities) + 0.4 * mean(returns_bonds))

### Driver 2: The choice of portfolio weights

Investors can optimize the choice of weight to obtain the highest risk-adjusted return, as measured by the portfolio Sharpe ratio.

In the special case of investing the total portfolio value in only two assets, there is only one weight to determine, because the weight on the second asset equals one minus the weight of the first asset.

Let us do this in the case of a portfolio invested in U.S. equities and U.S. bonds. We will be using the brute force approach of trying a large number of possible weights and keeping the weight that yields the highest value for the portfolio Sharpe ratio (assuming a zero risk-free rate).

Create a vector called grid using seq() that begins at 0, ends at 1, by an increment of 0.01.
Initialize empty vector vsharpe with the same length as grid. A popular way of doing this is by creating a vector that contains NA's using function rep(). You will replace these NA's in the loop that you will create next.
In the for loop, you will compute the Sharpe ratio for each of the possible weights in grid. The first command in the for-loop selects the i-th element of grid and stores it in object weight, which changes in each iteration.
You want to see how the portfolio return changes with a changing weight. Create an object preturns that equals the sum of weight times returns_equities, and (1-weight) times returns_bonds.
Next, you will replace the NAs in vsharpe with the annualized Sharpe ratio (SharpeRatio.annualized()) of preturns.
Fill in the plot function where potential weights (grid) is plotted on the x-axis and the Sharpe ratios on the y-axis.

```{r}
returns_equities <- Return.calculate(eq_prices)
returns_bonds <- Return.calculate(bond_prices)
# Create a grid
grid <- seq(from = 0, to = 1, by = 0.01)

# Initialize an empty vector for Sharpe ratios
vsharpe <- rep(NA, times = length(grid))

# Create a for loop to calculate Sharpe ratios
for(i in 1:length(grid)) {
    weight <- grid[i]
    preturns <- weight * returns_equities + (1 - weight) * returns_bonds
    vsharpe[i] <- SharpeRatio.annualized(preturns)
}

# Plot weights and Sharpe ratio
plot(grid, vsharpe, xlab = "Weights", ylab= "Ann. Sharpe ratio")
abline(v = grid[vsharpe == max(vsharpe)], lty = 3)
```
The portfolio that is 11% invested in equities, 89% in bonds has the largest possible Sharpe ratio.

### Driver 3: The correlation between the asset returns

The third driver of portfolio performance is the correlation between asset returns. Generally speaking, the correlation tells you how two asset returns tend to move together.

The correlation of assets has important consequences in overall portfolio performance. This correlation is important because it can reduce volatility through diversification, or reducing overall correlation. In fact, the lower the correlation, the more successful the portfolio tends to be in regards to partially offsetting large losses in one asset with only a minor loss, or even a gain, in another asset.

In the extreme case of two identical asset returns, the correlation will be 1, and there is no diversification potential. In the other extreme case where, if one asset return is above average, and the other is almost always below average, the correlation is negative. The correlation is 0 when the asset returns are linearly independent of each other. Note that interdependency can still exist on a non-linear level even when the correlation is 0.
As an exercise, suppose you have an equally weighted portfolio of two assets. Their correlation jumps from 0 to 0.5. What happens with the variance?
```{r}
returns_equities <- Return.calculate(eq_prices)
returns_equities <- returns_equities[-1,]
returns_bonds <- Return.calculate(bond_prices)
returns_bonds <- returns_bonds[-1,]

pf_var <- function (x){
    return((0.5^2) * var(returns_equities) + (0.5^2) * var(returns_bonds) + 
        2 * 0.5 * 0.5 * sd(returns_equities) * sd(returns_bonds) * x)
}
```
function pf_var(x) that calculates the portfolio variance for the equally weighted portfolio of equities and bonds, when their correlation is assumed to be equal to x.
Having correlated assets can increase potential risk. Correlated asset prices rise and fall together. So when you suffer a loss in one asset, there is a greater chance of suffering a loss on the other asset!
```{r}
pf_var(0)
pf_var(0.5)
```
### Interpreting correlation

Now you will learn how to compute the correlation between bond returns and equity returns. Just like volatilities, these correlations are dynamic. Therefore you need to distinguish between a static analysis that calculates correlations over a complete sample and a dynamic analysis that calculates correlations over a rolling sample. This is a similar analysis as you did for the time-varying performance evaluation in terms of mean return and volatility.

In this exercise you will learn 3 new functions from the PerformanceAnalytics package: chart.Scatter(), chart.Correlation(), and chart.RollingCorrelation().
```{r}
# Create a scatter plot
chart.Scatter(returns_bonds, returns_equities, xlab = "bond returns", ylab = "equity returns", main = "bond-equity returns")

# Find the correlation
cor(returns_bonds, returns_equities)

# Merge returns_bonds and returns_equities 
returns <- merge(returns_bonds, returns_equities)

# Find and visualize the correlation using chart.Correlation
chart.Correlation(returns) 

# Visualize the rolling estimates using chart.RollingCorrelation
chart.RollingCorrelation(returns_bonds, returns_equities, width = 24)
```
## Using matrix notation

### Making a risk-reward scatter diagram

In this example, you have decided to extend your investment opportunity by creating a portfolio that consists of US equities ETF (SPY), US bonds ETF (AGG), a real estate investment trust (VNQ), and an ETF tracking in the GSCI commodities index (GSG). The plot in the environment displays the performance of these investments.
```{r}
comm_prices <- readRDS("data/comm_prices.RData")
re_prices <- readRDS("data/re_prices.RData")
```
```{r}
plot(eq_prices)
lines(bond_prices, col = "red")
lines(re_prices, col = "blue")
lines(comm_prices, col = "purple")
legend(x = "topleft", legend = c("Equities", "Bonds", "Real estate", "Commodities"),
       col = c("black", "red", "blue", "purple"),
       lty = c(1, 1, 1, 1))
```
You can also visualize “relative” attractiveness of the investments by making a scatter plot of the average returns against the portfolio volatilities. In order to do this, you need to calculate the averages and volatilities for each asset. This corresponds to each column in the return series returns.

These calculations are made easy by using the function apply() with the first argument as the return data, the second argument is the value of 2 indicating that the calculation should be column-wise, while the third argument is the name of the function that needs to be applied on each column.

```{r}
eq_returns <- Return.calculate(eq_prices)
eq_returns <- eq_returns[-1,]
bond_returns <- Return.calculate(bond_prices)
bond_returns <- bond_returns[-1,]
comm_returns <- Return.calculate(comm_prices)
comm_returns <- comm_returns[-1,]
re_returns <- Return.calculate(re_prices)
re_returns <- re_returns[-1,]
returns <- merge(eq_returns, bond_returns, comm_returns, re_returns)
colnames(returns) <- c("Equities", "Bonds", "Commodities", "Real Estate")
# Create a vector of returns 
means <- apply(returns, 2, "mean")
  
# Create a vector of standard deviation
sds <- apply(returns, 2, "sd")

# Create a scatter plot
plot(sds, means)
text(sds, means, labels = colnames(returns), cex = 0.7)
abline(h = 0, lty = 3)
```
### The covariance matrix

The covariance matrix is crucial in determining the portfolio variance in the general case of N assets. Remember that an element on row i and column j correspond to the covariance of the i th and j th return. Recall also that the covariance of two return series is the product between their volatilities and their correlation, and that the covariance of an asset return with itself is its variance.

In this exercise, you will compute and analyze the covariance and correlation matrix on the monthly returns of the four asset classes from the previous exercise. To refresh, these asset classes are equities, bonds, real estate, and commodities. To create these matrices, you will use the standard functions cov() and cor().

In your workspace are the monthly investments as returns, and the vector of standard deviations sds that you created previously.
Create a diagonal matrix that contains the variances on the diagonal. You can use the function diag() to do this, using a squared sds^2 as the only argument. Call this diag_cov.
```{r}
sds <- c(sd(returns[,1]), sd(returns[,2]), sd(returns[,3]), sd(returns[,4]))
# Create a matrix with variances on the diagonal
diag_cov <- diag(sds^2)
diag_cov

# Create a covariance matrix of returns
cov_matrix <- cov(returns)
cov_matrix

# Create a correlation matrix of returns
cor_matrix <- cor(returns)
cor_matrix

# Verify covariances equal the product of standard deviations and correlation
all.equal(cov_matrix[1,2], cor_matrix[1,2] * sds[1] * sds[2])
```
These matrices can help you in your calculations, and are also a good reference to see the relationship between your assets.

### Matrix-based calculation of portfolio mean and variance

When w is the column-matrix of portfolio weights, $μ$ the column-matrix of expected returns, and $Σ$ the return covariance matrix. Then the portfolio expected return is $w′μ$, and the portfolio variance is $w′Σw$. Remember that the portfolio's volatility is the square root of its variance.

You will practice matrix multiplication in R using the %\*% function, instead of the standard \*. In addition, you will transpose the matrices by using the standard function t(). Remember that transposing a matrix is simply changing the rows of the matrix to the columns.

The weights, vector of means, and the covariance matrix are pre-loaded in your workspace as weights, vmeans, and sigma, respectively.
```{r}
weights <- c(0.4, 0.4, 0.1, 0.1)
vmeans <- c(c(equities = 0.00706924247456836, bonds = 0.00400950651113212, 
realestate = 0.00894954808678432, commodities = -0.0082960697150902
))
sigma <- matrix(c(0.00192021805592667, 3.00976104465011e-05, 0.00240637780809405, 
0.00151099397208086, 3.00976104465011e-05, 0.000122271264776651, 
0.000263223185975344, -6.86082072578803e-05, 0.00240637780809405, 
0.000263223185975344, 0.00544776248878884, 0.0013291423299705, 
0.00151099397208086, -6.86082072578803e-05, 0.0013291423299705, 
0.00479797179115701), ncol = 4)
# Create a weight matrix w
w <- as.matrix(weights)

# Create a matrix of returns
mu <- as.matrix(vmeans)

# Calculate portfolio mean monthly returns
t(w) %*% mu

# Calculate portfolio volatility
sqrt(t(w) %*% sigma %*% w)
```
## Portfolio risk budget

### Who did it?

You saw the difference between a capital allocation budget and a risk budget. In this exercise, you will construct a risk budget, and discover how large each asset's percent risk contribution is in the total portfolio volatility.
For this last exercise, you will calculate the risk contributions for a portfolio that is again invested 40% in equities, 40% in bonds, 10% in real estate, and 10% in commodities. The function StdDev() plays an important role in this exercise. The StdDev() function creates a list of the assets' standard deviation (\$StdDev), their risk contribution (\$contribution), and their percent risk contribution (\$pct_contrib_StdDev).
You will be using three arguments in the StdDev() function to do this calculation. The first is R, a vector, matrix, data frame, time series, or zoo object of returns. The second is portfolio_method, which you will set to component, and the third is weights.
```{r}
# Create portfolio weights
weights <- c(0.4, 0.4, 0.1, 0.1)

# Create volatility budget
vol_budget <- StdDev(returns, portfolio_method = "component", weights = weights)

# Make a table of weights and risk contribution
weights_percrisk <- cbind(weights, vol_budget$pct_contrib_StdDev)
colnames(weights_percrisk) <- c("weights", "perc vol contrib")

# Print the table
weights_percrisk
```
Note that the percentage volatility risk caused by the bonds is much less than their portfolio weight.

# Optimizing the portfolio

## Modern portfolio theory of Harry Markowitz

Mean-variance based investing in DJIA stocks
In this chapter, you will explore portfolio optimization. For the upcoming exercises, you will focus on the Dow Jones Industrial Average (DJIA) stocks. Like the S&P 500 portfolio studied in Chapter 2, the Dow Jones Industrial Average portfolio is an important reference portfolio for the U.S. equity market. It is invested in 30 large publicly owned companies based in the United States.

You will be using the sample of monthly returns from January 1991 until December 2015. These return data are available in the workspace as the variable returns.

Assume you are an investor who likes high returns and dislikes risk. You are also constrained to be fully invested in the 30 DJIA stocks, and only positive weights are allowed. Which of the following statements is false:

```{r}
djia <- readRDS("data/prices.rds")
head(djia)
```
```{r}
returns <- Return.calculate(djia)
returns <- returns[-1,]
```
```{r}
# Create a vector of returns 
means <- apply(returns, 2, "mean")
  
# Create a vector of standard deviation
sds <- apply(returns, 2, "sd")

# Create a scatter plot
plot(sds, means)
text(sds, means, labels = colnames(returns), cex = 0.7)
abline(h = 0, lty = 3)
```
You would rather invest all of your capital in Home Depot (ticker: HD) than in Caterpillar (ticker: CAT) because Home Depot has a higher average return and lower risk.

By combining investments, it may be possible to obtain a portfolio with lower variance than the variance of the Exxon Mobil Corporation’s stock (ticker: XOM).

### Exploring monthly returns of the 30 DJIA stocks

The 1991-2015 monthly returns on the 30 DJIA stocks are available in the workspace as the variable returns. This exercise will help you get comfortable with the data that you will be using for the remainder of the exercises.

Recall that if we compute the means column by column using colMeans(returns), or apply(returns, 2, "mean"), you then obtain the average return per asset. In this exercise, you will be computing the mean per row. You can do this similarly using the function rowMeans() and supplying the argument 1 instead of two to indicate row-wise calculations in the apply() function. By doing so, you obtain the time series of returns for the equally weighted portfolio.
```{r}
# Verify the class of returns 
class(returns)

# Investigate the dimensions of returns
dim(returns)

# Create a vector of row means
ew_preturns <- rowMeans(returns)

# Cast the numeric vector back to an xts object
ew_preturns <- xts(ew_preturns, order.by = time(returns))

# Plot ew_preturns
plot.zoo(ew_preturns)
```
The chart you created visualizes the returns for an equally weighted portfolio.

### Finding the mean-variance efficient portfolio

A mean-variance efficient portfolio can be obtained as the solution of minimizing the portfolio variance under the constraint that the portfolio expected return equals a target return. A convenient R function for doing so is the function portfolio.optim() in the R package tseries. Its default implementation finds the mean-variance efficient portfolio weights under the constraint that the portfolio return equals the return on the equally-weighted portfolio. The only argument needed is the monthly return data on the portfolio components for which the weights need to be determined.

The variable returns containing the monthly returns of the DJIA stocks is already loaded in the console.

Create a mean-variance efficient portfolio of monthly returns using the default of portfolio.optim() targeting the equally-weighted portfolio return, and assign the output to the variable opt.

```{r}
# Load tseries
library(tseries)

# Create an optimized portfolio of returns
opt <- portfolio.optim(returns)

# Create pf_weights
pf_weights <- opt$pw

# Assign asset names
names(pf_weights) <- colnames(returns)

# Select optimum weights opt_weights
opt_weights <- pf_weights[pf_weights >= 0.01]

# Bar plot of opt_weights
barplot(opt_weights)

# Print expected portfolio return and volatility
opt$pm
opt$ps
```
The portfolio you created has an average return of 1.2%!

### Effect of the return target

This exercise will show the effect of increasing your target return on the volatility of your mean-variance efficient portfolio.

The function portfolio.optim has arguments that allow for more general specifications. The arguments are as follows:

    portfolio.optim(x, pm = mean(x), shorts = FALSE, reshigh = NULL)
    
The argument pm sets the target return, the argument reshigh specifies the upper constraints on the portfolio weights, and the argument shorts is a logical statement specifying whether negative weights are forbidden or not, by default shorts = FALSE.

You will create a portfolio that is optimized for a target return that equals the average value of the return series returns. Then you will create a portfolio that has a target return that is 10% higher than the mean return series and calculate the proportion change in risk.
```{r}
# Create portfolio with target return of average returns 
pf_mean <- portfolio.optim(returns, pm = mean(returns))

# Create portfolio with target return 10% greater than average returns
pf_10plus <- portfolio.optim(returns, pm = 1.1 * mean(returns))

# Print the standard deviations of both portfolios
print(pf_mean$ps)
print(pf_10plus$ps)

# Calculate the proportion increase in standard deviation
(pf_10plus$ps - pf_mean$ps) / (pf_mean$ps)
```
By increasing your target return, your portfolio's volatility increased by 8%!

### Imposing weight constraints

Investors are often constrained by the maximum values allowed for the portfolio weights. These constraints can actually be an advantage. The advantage of a maximum weight constraint is that the subsequent portfolio will be less concentrated in certain assets. There is a disadvantage to this, though. The disadvantage is that the same target return may no longer be possible or will be obtained at the expense of higher volatility.

Remember from the previous exercise that the function portfolio.optim() allows you to set weight constraints within the reshigh argument. reshigh requires a vector of maximum weights for each asset.

In this exercise, you will create three portfolios with different maximum weight constraints. For this exercise, it is important to know the output of the portfolio.optim() function. This function creates a list containing four components: (i) \$pw: the portfolio weights, (ii) \$px: the returns of the overall portfolio, (iii) \$pm: the expected return portfolio, (iv) \$ps: the standard deviation of the portfolio returns.
Create three vectors of maximum weights for each asset (column) in returns using the rep() function. The first vector will contain maximum weights of 100%, the second 10%, and the third 5%. Call these max_weights1, max_weights2, max_weights3, respectively.
```{r}
# Create vectors of maximum weights
max_weights1 <- rep(1, ncol(returns))
max_weights2 <- rep(0.1, ncol(returns))
max_weights3 <- rep(0.05, ncol(returns))

# Create an optimum portfolio with max weights of 100%
opt1 <- portfolio.optim(returns, reshigh = max_weights1)

# Create an optimum portfolio with max weights of 10%
opt2 <- portfolio.optim(returns, reshigh = max_weights2)

# Create an optimum portfolio with max weights of 5%
opt3 <- portfolio.optim(returns, reshigh = max_weights3)

# Calculate how many assets have a weight that is greater than 1% for each portfolio
sum(opt1$pw > .01)
sum(opt2$pw > .01)
sum(opt3$pw > .01)

# Print portfolio volatilites 
opt1$ps
opt2$ps
opt3$ps
```
By imposing weight constraints you can ensure that your portfolio isn't completely dominated by just a few assets.

## The efficient frontier

### Computing the efficient frontier using a grid of target returns

As you have seen, one approach to compute the efficient frontier is to first define the grid of target returns and then, for each target return, find the portfolio that has an expected return equal to the target return at the lowest possible variance.

But what is a reasonable grid of target returns? You will set the maximum target return to the maximum average return of the stocks. Ideally, you would set the minimum target return to the return of the minimum variance portfolio. Because you don't know this minimum variance portfolio return yet, you create a grid using the minimum of average returns from all stocks.

In this exercise, you will use a for loop to calculate your grid of potential portfolio means, deviations, and weights.
```{r}
# Calculate each stocks mean returns
stockmu <- colMeans(returns)

# Create a grid of target values
grid <- seq(from = 0.01, to = max(stockmu), length.out = 50)

# Create empty vectors to store means and deviations
vpm <- vpsd <- rep(NA, length(grid))

# Create an empty matrix to store weights
mweights <- matrix(NA, 50, 30)

# Create your for loop
for(i in 1:length(grid)) {
  opt <- portfolio.optim(x = returns, pm = grid[i])
  vpm[i] <- opt$pm
  vpsd[i] <- opt$ps
  mweights[i, ] <- opt$pw
}
```
The for loop that you just created is extremely helpful in being able to explore how target returns can affect weights, volatility, and portfolio returns.

### Interpreting the efficient frontier

![efficient frontier](frontier_01.png)

The portfolio with the greatest excess return per unit of portfolio volatility is the tangency portfolio.

### Properties of the efficient frontier

![](frontier_02.png)
The curve shown in the previous plot shows the solution of minimizing the variance under the constraint that the portfolio expected return equals the target return. When the target return is below the minimum variance return, the obtained portfolio is not efficient because a higher return at lower risk can be obtained by investing in the minimum variance portfolio. The optimized portfolios often are also not realistic because they tend to invest large weights in only a few assets. A practical solution to avoid this is to impose weight constraints.

The effect of such a weight constraint is shown in the workspace, where you can see the efficient frontier plots obtained for the DJIA stocks under a maximum 100% weight constraint (black line), a maximum 10% weight constraint (red line) and a maximum 5% weight constraint (blue).

-	Imposing a more strict weight constraint leads to a shift to the right of the efficient frontier.
-	The portfolios on the dashed lines are not mean-variance efficient.

### The minimum variance and maximum Sharpe ratio portfolio

In the previous exercises, you computed the efficient frontier using a grid of target returns. The output of your calculation of the efficient frontier was a series of two vectors, vpm (vector of portfolio means), and vpsd (vector of standard deviations, or volatilities), and a matrix of weights called mweights. You will use these outputs to identify the portfolios with the least volatility, and the greatest Sharpe ratio, and then plot their weight allocation.

As a reminder, the Sharpe Ratio is found by taking the excess returns less than the risk-free rate, divided by the portfolio volatility.
```{r}
colnames(mweights) <- colnames(returns)
# Create weights_minvar as the portfolio with the least risk
weights_minvar <- mweights[vpsd == min(vpsd), ]

# Calculate the Sharpe ratio
vsr <- (vpm - 0.0075) / vpsd

# Create weights_max_sr as the portfolio with the maximum Sharpe ratio
weights_max_sr <- mweights[vsr == max(vsr)]

# Create bar plot of weights_minvar and weights_max_sr
par(mfrow = c(2, 1), mar = c(3, 2, 2, 1))
barplot(weights_minvar[weights_minvar > 0.01])
barplot(weights_max_sr[weights_max_sr > 0.01])
```
## In-sample vs. out-of-sample evaluation

### Split-sample evaluation

In chapter 2, you used the function window() to subset your returns for graphical purposes. In this exercise, you will use the window() to create two samples: an estimation sample and an evaluation sample. This exercise will illustrate how portfolio weights can differ when changing the estimation window.

To remind you, the function window() has argument of x, start, and end. Where start and end are in the format "YYYY-MM-DD".
```{r}
# Create returns_estim 
returns_estim <- window(returns, start = "1991-01-01", end = "2003-12-31")

# Create returns_eval
returns_eval <- window(returns, start = "2004-01-01", end = "2015-12-31")

# Create vector of max weights
max_weights <- rep(0.1, ncol(returns))

# Create portfolio with estimation sample 
pf_estim <- portfolio.optim(returns_estim, reshigh = max_weights)

# Create portfolio with evaluation sample
pf_eval <- portfolio.optim(returns_eval, reshigh = max_weights)

# Create a scatter plot with evaluation portfolio weights on the vertical axis
plot(pf_estim$pw, pf_eval$pw)
abline(a = 0, b = 1, lty = 3)
```
If portfolio weights are identical, they should be on the 45-degree line.
By creating two samples using the window() function you can backtest, or evaluate the performance of a portfolio on historical data!

### Out of sample performance evaluation

This example will illustrate how your returns can change based on the weighting created by an optimized portfolio. You will use the estimation portfolio (pf_estim) to evaluate the performance of your portfolio on the estimation sample of returns (returns_eval).

How severe is the optimality loss? Let's compare, for the portfolio weights in pf_estim, the performance you expected using the evaluation sample (returns_estim) with the actual return on the out-of-sample period (returns_eval).

Calculate the returns of the portfolio with monthly rebalance weights pf_estim$pw on the estimation sample returns_estim. Call this returns_pf_estim

```{r}
# Create returns_pf_estim
returns_pf_estim <- Return.portfolio(returns_estim, pf_estim$pw, rebalance_on = "months")


# Create returns_pf_eval
returns_pf_eval <- Return.portfolio(returns_eval, pf_estim$pw, rebalance_on = "months")

# Print a table for your estimation portfolio
table.AnnualizedReturns(returns_pf_estim)

# Print a table for your evaluation portfolio
table.AnnualizedReturns(returns_pf_eval)
```
The results from the pf_eval are what you may expect in a real performance.
You have now successfully evaluated a split sample, this wouldn't be possible without the use of historical data

### It ain't over

This journey showed you how to compute returns, analyze portfolio performance, and optimize portfolios. These are important skills since whenever you hold or manage more than one asset, you have a portfolio to consider.

It is now time to compound and get an extra return by exploring other useful R packages for portfolio analysis like PortfolioAnalytics. Instead of optimizing the portfolio by minimizing the variance under a return target constraint, this package allows you to optimize virtually all objective functions under all possible constraints. Even if the solution to the problem is not feasible, it will give you the best possible solution using heuristics, such as differential evolution in the R package [DEoptim]( https://journal.r-project.org/archive/2011-1/RJournal_2011-1_Ardia~et~al.pdf).
