This article examines whether variables constructed from one-minute high-frequency Bitcoin trading data can predict Bitcoin returns. Over the 2012–2018 training period, LASSO is used to select the most powerful predictors. We then use the LASSO-selected predictors to forecast Bitcoin returns in the 2018–2019 test sample. An investment strategy based on these return forecasts outperforms a simple buy-and-hold strategy as well as strategies based on Ordinary Least Squares and neural network predictions.
Keywords: Bitcoin; high-frequency; investment strategy; LASSO; neural networks
Being the most successful application of the blockchain technology, Bitcoin has shown speculation, complementary-currency and diversification characteristics resembling an investable asset class (Bouri et al. [
In line with studies emphasizing the importance of high-frequency data (Catania and Sandholdt [
The data consist of intraday trading records from the Bitstamp exchange at one-minute intervals. Our sample period spans 1 January 2012 to 12 August 2019. The raw data fields are timestamps expressed in Unix time, minute-by-minute OHLC (open, high, low, close) prices, volumes denominated in both Bitcoin and fiat currency, and a weighted Bitcoin price.[
We then aggregate the minute-frequency variables to the daily level. Specifically, the daily OHLC prices are taken from their respective minute-level counterparts over each day. For transaction volume and the count of NaNs, we sum the minute-level observations within a trading day. For the weighted price, we take the average across all minute values. We further expand the set of forecasting variables by computing the intraday volatility of each daily statistic (except the number of NaNs). We also calculate cross-day differences for all of the above variables by subtracting their day t-1 values from their day t values. In total, we obtain 30 high-frequency predictors, which are scaled to a consistent order of magnitude. The prediction target is the continuously compounded (cc) Bitcoin return between t and t + 1, evaluated at daily open prices. Our data treatment and results remain robust when we use an alternative data source: trades on Coinbase from 1 December 2014 to 9 January 2019.
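The aggregation just described can be sketched in a few lines of standard-library Python. This is a minimal illustration rather than the exact procedure: the field names, the use of the first and last valid minute for the daily open and close, and the use of the standard deviation of minute values as "intraday volatility" are all assumptions, and the final rescaling step is omitted.

```python
import math
from collections import defaultdict
from datetime import datetime, timezone
from statistics import pstdev

# Hypothetical field names for a minute-level record; not taken from the source data.
FIELDS = ("open", "high", "low", "close", "vol_btc", "vol_fiat", "wprice")


def aggregate_daily(bars):
    """Collapse minute bars (dicts with 'ts' in Unix seconds plus FIELDS) to daily stats.

    A minute without trades is assumed to carry NaN in every field.
    """
    by_day = defaultdict(list)
    for bar in bars:
        day = datetime.fromtimestamp(bar["ts"], tz=timezone.utc).date()
        by_day[day].append(bar)

    daily = {}
    for day in sorted(by_day):
        rows = sorted(by_day[day], key=lambda b: b["ts"])
        valid = [b for b in rows if not math.isnan(b["close"])]
        if not valid:
            continue  # skip days with no usable minute at all
        stats = {
            "open": valid[0]["open"],    # assumption: first valid minute of the day
            "high": max(b["high"] for b in valid),
            "low": min(b["low"] for b in valid),
            "close": valid[-1]["close"],  # assumption: last valid minute of the day
            "vol_btc": sum(b["vol_btc"] for b in valid),
            "vol_fiat": sum(b["vol_fiat"] for b in valid),
            "wprice": sum(b["wprice"] for b in valid) / len(valid),
            "nans": len(rows) - len(valid),
        }
        for field in FIELDS:
            # assumption: intraday volatility = std. dev. of the minute values
            stats[field + "_vol"] = pstdev([b[field] for b in valid])
        daily[day] = stats
    return daily


def build_rows(daily):
    """Attach cross-day differences and the next-day cc return at open prices.

    Consecutive entries of the sorted day list are treated as days t-1, t, t+1.
    """
    days = sorted(daily)
    rows = []
    for prev, cur, nxt in zip(days, days[1:], days[2:]):
        row = dict(daily[cur])
        for key, value in daily[cur].items():
            row[key + "_diff"] = value - daily[prev][key]
        row["target"] = math.log(daily[nxt]["open"] / daily[cur]["open"])
        rows.append((cur, row))
    return rows


def _bar(ts, price):
    """Build one synthetic minute bar (illustrative data only)."""
    return {"ts": ts, "open": price, "high": price + 1.0, "low": price - 1.0,
            "close": price, "vol_btc": 1.0, "vol_fiat": price, "wprice": price}


# Three synthetic days of three minutes each, plus one empty minute on the second day.
bars = [_bar(d * 86400 + m * 60, 100.0 + 10.0 * d + m) for d in range(3) for m in range(3)]
bars.append({"ts": 86400 + 180, **{f: float("nan") for f in FIELDS}})

day, row = build_rows(aggregate_daily(bars))[0]
print(day, len(row) - 1, row["target"])
```

Each row holds 15 daily statistics (eight levels and seven volatilities) and their 15 cross-day differences, which reproduces the count of 30 predictors, plus the target return.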
We consider two periods separately: the training period, from 1 January 2012 through 17 May 2018, and the test period, from 18 May 2018 to 12 August 2019. Using the training subsample, we estimate LASSO models to select powerful Bitcoin-return predictors from the pool of high-frequency candidate variables. In machine learning, LASSO performs both variable selection and regularization to enhance prediction accuracy and interpretability (Bühlmann and Van De Geer [
Graph
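For reference, the textbook form of the LASSO estimator can be written as follows. The symbols are illustrative assumptions and need not match the notation used elsewhere in this article: r_{t+1} is the next-day return, x_{j,t} is the j-th of p predictors on day t, N is the number of training days, and λ ≥ 0 is the penalty parameter.

```latex
\left(\hat{\beta}_0, \hat{\beta}\right)
  = \arg\min_{\beta_0,\,\beta}
    \left\{
      \frac{1}{2N} \sum_{t=1}^{N}
        \Bigl( r_{t+1} - \beta_0 - \sum_{j=1}^{p} \beta_j x_{j,t} \Bigr)^{2}
      + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
    \right\}
```

The absolute-value penalty is what allows coefficients to be set exactly to zero; replacing it with a squared penalty gives Ridge regression, which shrinks coefficients without eliminating them.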
The variable selection function makes LASSO preferred over Ridge regression in that some coefficients can take the value of zero, indicating that the corresponding variables are not contributing to the model. The larger the value of
Graph
Graph
- Lvol_1: 1 day lagged volatility of the low price
- Vb_1: 1 day lagged daily volume of Bitcoin transaction
- Vbvol_diff: cross-day difference in the volatility of Bitcoin volume
- Open_diff: cross-day difference in the open price
- Close_diff: cross-day difference in the close price
- Vb_diff: cross-day difference in daily Bitcoin trading volume
- Vc_diff: cross-day difference in volume denominated in fiat currency
- NaNs_diff: cross-day difference in NaNs count
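How a LASSO fit arrives at such a short list can be illustrated with a small coordinate-descent solver. This is a sketch under stated assumptions, not the estimation code behind the list above: the solver, the synthetic data, and the penalty value `lam=0.1` are all hypothetical, and the predictors are assumed to be already scaled.

```python
import random


def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma, returning exactly 0 inside [-gamma, gamma]."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n) * sum((y - b0 - X b)^2) + lam * sum(|b_j|).

    X is a list of rows; returns (intercept, list of coefficients).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    resid = [y[i] - intercept for i in range(n)]
    for _ in range(n_iter):
        # Re-centre the intercept on the current residuals.
        shift = sum(resid) / n
        intercept += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant-zero column carries no information
            # Covariance of column j with the residual that excludes its own fit.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_beta
    return intercept, beta


# Synthetic example: only the first of three predictors drives the target.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]
y = [0.5 * row[0] + 0.01 * rng.gauss(0.0, 1.0) for row in X]

intercept, beta = lasso_coordinate_descent(X, y, lam=0.1)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(selected, beta)
```

Predictors whose coefficients come out exactly zero are dropped; the remaining ones play the role of the selected variables. In practice the penalty would be chosen by cross-validation on the training sample rather than fixed by hand.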
Table 1 shows the results of regressing Bitcoin returns on the predictors selected by LASSO. As can be seen, Vb_diff and NaNs_diff are significantly negatively associated with returns. In contrast, Close_diff and Vc_diff have a positive association with returns at the 1% significance level. These results confirm that the variables surviving LASSO selection indeed possess strong predictive power.
Table 1. Regression of Bitcoin returns on predictors selected by LASSO

| | Dependent variable: Returns |
|---|---|
| Lvol_1 | −0.10 |
| Vb_1 | −0.10 |
| Vbvol_diff | −0.05 |
| Open_diff | −0.13 |
| Close_diff | 2.39*** |
| Vb_diff | −1.07*** |
| Vc_diff | 0.50*** |
| NaNs_diff | −0.40*** |
| Constant | 0.003*** |
| Observations | 2,259 |
| Log Likelihood | 3,809.08 |
| Akaike Inf. Crit. | −7,600.17 |
Having confirmed this predictive power, we go one step further and use it to construct a simple investment strategy for Bitcoin. Unlike the buy-and-hold strategy, we dynamically adjust coin holdings according to the steps below. First, on each day of the test sample, using the predictors selected by LASSO and the coefficients estimated on the training sample, the predictive regression is run over the period from the first day of the test sample up to the day before the day in question, yielding an estimated return for the next day. Second, a return threshold is set based on outside investment options, say 0.0433, which equals the five-year moving average of long-term corporate bond returns.
Third, when the predicted return exceeds the threshold, we convert all available cash (if any remains from a previous Bitcoin liquidation) into Bitcoin on day t + 1, or do nothing if we hold no cash on day t. When the predicted return falls below the negative of the threshold, we sell all Bitcoin on day t + 1 if we held Bitcoin on day t, or keep holding cash if our coin balance is zero on day t. In all other cases, i.e., when the return estimate lies between the negative and positive thresholds, we keep our existing position. Figure 1 plots the gross return of the buy-and-hold strategy against that of our proposed strategy over the same test period (the dash-dotted and solid lines). Our strategy, which exploits the predictors selected by LASSO, generates a consistent and sizable premium.
Figure 1. Comparing gross returns of Bitcoin investment strategies based on alternative prediction methods, May 2018 to August 2019.
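The trading rule can be sketched as a small state machine that switches between holding Bitcoin and holding cash. This is a minimal illustration under several assumptions that the text does not settle: the portfolio starts fully invested, cash earns nothing, there are no transaction costs, and the position chosen from a forecast earns the realized return that the forecast refers to. The function name and the sample numbers are hypothetical.

```python
import math


def run_threshold_strategy(predicted, realized, threshold=0.0433):
    """Return the gross-return path of the all-in / all-out threshold rule.

    predicted[i] is the forecast cc return for period i and realized[i] the
    cc return actually earned over that period.
    """
    invested = True  # assumption: start holding Bitcoin, like buy-and-hold
    wealth = 1.0
    path = []
    for pred, ret in zip(predicted, realized):
        if pred > threshold:
            invested = True    # move any cash into Bitcoin
        elif pred < -threshold:
            invested = False   # liquidate any Bitcoin into cash
        # otherwise the forecast is inside the band: keep the current position
        if invested:
            wealth *= math.exp(ret)
        path.append(wealth)
    return path


# Illustrative numbers only; they are not taken from the test sample.
predicted = [0.05, 0.0, -0.05, 0.01, 0.06]
realized = [0.10, -0.10, -0.20, 0.30, 0.10]

strategy_path = run_threshold_strategy(predicted, realized)
buy_and_hold = math.exp(sum(realized))
print(strategy_path[-1], buy_and_hold)
```

Because cc returns add across days, the buy-and-hold benchmark is simply the exponential of the summed realized returns, which makes the two gross-return paths directly comparable.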
We next compare the effectiveness of LASSO-based investing with alternatives relying on OLS or neural network (NNET) estimation. Because neither OLS nor NNET provides a mechanism for eliminating potential predictors, we use all of the constructed intraday variables so that as much information as possible is extracted. We find that the in-sample root mean square errors (RMSE) for LASSO, OLS and NNET are 0.0447, 0.0444, and 0.0476, respectively, whereas the corresponding out-of-sample RMSEs are 0.0153, 0.0168, and 0.0286.[
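For concreteness, the RMSE used in this comparison is the square root of the mean squared forecast error. A minimal sketch follows; the function name and the sample values are illustrative, not taken from our data.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between realized and predicted returns."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative values only: two days with forecast errors of 0.03 and 0.04.
example = rmse([0.0, 0.0], [0.03, -0.04])
print(example)
```

The same function applied to the training and test samples gives the in-sample and out-of-sample figures, respectively.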
This paper considers 30 variables constructed from high-frequency Bitcoin trading data that may foreshadow next-day Bitcoin returns. We apply LASSO to find those with the strongest predictive power. In our sample, the cross-day differences in the close price and in fiat-denominated trading volume are significantly positively related to Bitcoin cc returns. In contrast, the cross-day differences in Bitcoin-denominated volume and in the daily NaNs count are negatively associated with returns. We then build an investment strategy on these return predictors. During the test period from 18 May 2018 to 12 August 2019, our strategy outperforms the benchmark buy-and-hold strategy as well as strategies derived from OLS and NNET estimators or a general-to-specific variable-fitting procedure.
No potential conflict of interest was reported by the author(s).
This work is supported by the General Program of National Natural Science Foundation of China (No. 72071211).
By Weige Huang and Xiang Gao