Deep Learning Systems for Bitcoin 1

Since December 2017, bitcoins can not only be traded at more or less dubious exchanges, but also as futures at the CME and CBOE. Several trading systems have already popped up for bitcoin and other cryptocurrencies. None of them can claim big success, with one exception. There is a very simple strategy that easily surpasses all other bitcoin systems and probably also all known historical trading systems. Its name: Buy and Hold. In the light of the extreme success of that particular bitcoin strategy, do we really need any other trading system for cryptos?

Bitcoin – hodl??

A buy and hold strategy works extremely well when a price bubble grows, and extremely badly when it bursts. And indeed, apparently all finance and economy gurus (well, all but John McAfee) tell you that the cryptocurrency market, and especially bitcoin, is a bubble, even a “scam with no substantial worth”, and will soon experience a crash “worse than the 17th century tulip mania” or the “18th century South Sea Company fraud”.

By definition, a bubble is a price largely above the ‘real value’ or ‘fair value’ of an asset, and it bursts when people realize that. So what is the fair value of a bitcoin? Obviously not zero, since blockchain based currencies have (aside from their disadvantages) several advantages over traditional currencies, on the economic level as well as on the private level. Such as:

They break the link of money and debt. Cryptocurrencies don’t require the bank credit mechanism for money creation.
They can be used where normal money would be impractical, such as fee transfers between machines or trading in multiplayer games.
They allow low-cost and anonymous money transactions. At least in theory.
They replace banks for storing and mattresses for stashing money.

I’m ready to believe that blockchain is the future of money transfer and storage. But that does not mean an ever-rising bitcoin price. Hundreds of cryptocurrencies came out in the last two years, every single one of them with a better blockchain technology than bitcoin, and any good programmer can add a new coin anytime. Few will survive. Countries or big companies might sooner or later issue their own crypto tokens, as Venezuela is already attempting. The release of an official blockchain Dollar, Yuan, or Euro would leave the old bitcoin with its energy hungry transaction algorithm hanging in thin air. Thus, when investing in bitcoin, we should not hope for a rosy future, but look for its present ‘real value’.

Due to its extreme volatility, bitcoin can not replace bank vaults. But it is already used in some situations for reducing money transfer costs, since the miners are rewarded in bitcoin for every transaction. And above all, anonymity can be a substantial motive to own it. When you need a hacker to delete your drunk driving record, pay her in bitcoin. But how big is the online market for illegal hacker jobs, kill contracts, money laundering, drugs, weapons, or pro-Trump facebook advertisements? No one knows, but when we compare it with cash, another form of anonymous payment, we get interesting results.

The cash currently in circulation in the US is approximately $1.5 trillion. The current bitcoin supply, about 17 million bitcoins, represents a total value of about $250 billion. Which means that you can already replace about 17% of all US cash with bitcoin! Not to mention all the other cryptos. I fear that this supply already exceeds the demand for anonymous online payment today and in the near future.
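The comparison is simple arithmetic on the rounded figures quoted above. A small R sketch, with variable names chosen here for illustration only:

```r
# Rounded figures as quoted in the text
us_cash_usd    <- 1.5e12   # US cash in circulation, in USD
btc_supply     <- 17e6     # number of bitcoins in existence
btc_market_cap <- 250e9    # total value of that supply, in USD

implied_price <- btc_market_cap / btc_supply    # USD per bitcoin
cash_share    <- btc_market_cap / us_cash_usd   # fraction of US cash

cat(sprintf("Implied bitcoin price: $%.0f\n", implied_price))
cat(sprintf("Share of US cash: %.1f%%\n", 100 * cash_share))
```

With these inputs the bitcoin supply corresponds to about one sixth of the US cash in circulation, at an implied price of a bit under $15,000 per coin.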

For those reasons, a bitcoin “hodl” system, despite its extreme historical performance, is high risk. We don’t know when and how the bubble will burst – maybe bitcoin will go up to $100,000 first – but we have some reason to suspect that sooner or later the bitcoin price might drop like a stone down to its ‘real value’. Which is unknown, but for practical purposes is probably not in the $15,000 area, but more like $15.

So we need some other method to tackle the cryptocurrency trading problem. The first question: Has the crypto market already developed price curve inefficiencies that can be exploited in a trading system? In (1) we see some tests with basic bitcoin strategies. Our own tests came to the same results. Momentum based strategies can work, and mean-variance optimizing portfolio systems can achieve even extreme returns with cryptocurrencies – up to 10 times higher than “hodl”. But that’s not really surprising given the high momentum and volatility of crypto coins. The problem is that all crypto portfolios are exposed to high risk. Other conventional model-based strategies don’t work well with cryptos anyway.

When we concentrate on bitcoin, our proposed system must be a fast trading, trend-agnostic strategy. That means it holds positions for only a few minutes, and is not exposed to the bubble risk. I can already tell that short-term mean reversion – even with a more sophisticated system than in (1) – produces no good result with cryptos. So only a few possibilities remain. One of them is exploiting short-term price patterns. This is the strategy that we will develop. And I can already tell that it works. But for this we’ll need a deep machine learning system for detecting the patterns and determining their rules.

Selecting a machine learning library

The basic structure of such a machine learning system is described here. Due to the low signal-to-noise ratio and to ever-changing market conditions, analyzing price series is one of the most ambitious tasks for machine learning. Compared with other AI algorithms, deep learning systems have the highest success rate. Since we can connect any Zorro based trading script to the data analysis software R, we’ll use an R based deep learning package. Many are available by now. Here’s the choice:

Deepnet, a lightweight and straightforward neural net library with a stacked autoencoder and a Boltzmann machine. Produces good results when the feature set is not too complex. The basic train and predict functions for using a deepnet autoencoder in a Zorro strategy:

```
library('deepnet') 

neural.train = function(model,XY) 
{
  XY <- as.matrix(XY)
  X <- XY[,-ncol(XY)]
  Y <- XY[,ncol(XY)]
  Y <- ifelse(Y > 0,1,0)
  Models[[model]] <<- sae.dnn.train(X,Y,
      hidden = c(30), 
      learningrate = 0.5, 
      momentum = 0.5, 
      learningrate_scale = 1.0, 
      output = "sigm", 
      sae_output = "linear", 
      numepochs = 100, 
      batchsize = 100)
}

neural.predict = function(model,X) 
{
  if(is.vector(X)) X <- t(X)
  return(nn.predict(Models[[model]],X))
}
```
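All train functions in this article expect the same input: a data set XY whose last column is the trade result and whose other columns are the features. The following base R sketch illustrates that convention, and the conversion of a single sample to a one-row matrix, on made-up data. The helper name split.xy and the numbers are assumptions for illustration and not part of any package.

```r
# Hypothetical helper, for illustration only: split a training set
# into the feature matrix X and the binary target Y.
split.xy <- function(XY)
{
  XY <- as.matrix(XY)
  X <- XY[,-ncol(XY),drop=FALSE]  # all columns but the last: features
  Y <- XY[,ncol(XY)]              # last column: trade result
  Y <- ifelse(Y > 0,1,0)          # 1 = winning trade, 0 = losing trade
  list(X = X, Y = Y)
}

# Made-up data: 5 samples, 3 random features, 1 result column
set.seed(365)
XY <- cbind(matrix(rnorm(15),5,3), c(0.7,-1.2,0.1,-0.3,2.5))
S <- split.xy(XY)
print(dim(S$X))  # 5 3
print(S$Y)       # 1 0 1 0 1

# A single sample arrives as a plain vector; t() turns it into
# the one-row matrix that the predict functions work with.
x <- c(0.1,0.2,0.3)
if(is.vector(x)) x <- t(x)
print(dim(x))    # 1 3
```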

H2O, an open-source software package with the ability to run on distributed computer systems. Coded in Java, so the latest version of the JDK is required. Aside from deep autoencoders, many other machine learning algorithms are supported, such as random forests. Features can be preselected, and ensembles can be created. Disadvantage: While batch training is fast, predicting a single sample, as usually needed in a trading strategy, is relatively slow due to the server/client concept. The basic H2O train and predict functions for Zorro:
```
library('h2o') 
# also install the Java JDK

neural.train = function(model,XY) 
{
  XY <- as.h2o(XY)
  Models[[model]] <<- h2o.deeplearning(
    -ncol(XY),ncol(XY),XY,
    hidden = c(30),  seed = 365)
}

neural.predict = function(model,X) 
{
  if(is.vector(X)) X <- as.h2o(as.data.frame(t(X)))
  else X <- as.h2o(X)
  Y <- h2o.predict(Models[[model]],X)
  return(as.vector(Y))
}
```

Tensorflow in its Keras incarnation, a neural network kit by Google. Supports CPU and GPU and comes with all needed modules for tensor arithmetic, activation and loss functions, convolution kernels, and backpropagation algorithms. So you can build your own neural net structure. Keras offers a simple interface for that.

Keras is available as an R library, but installing it also requires a Python environment. First install Anaconda from www.anaconda.com. Open the Anaconda Navigator and install the RStudio application (installing Keras outside an Anaconda environment fails on some PCs with an error message). Then open RStudio inside the Navigator, install the Keras package, and finally execute library('keras') and install_keras(). These steps usually succeed.

The Keras train and predict functions for Zorro:

```
library('keras')
#needs Python 3.6 and Anaconda
#call install_keras() after installing the package

neural.train = function(model,XY) 
{
  X <- data.matrix(XY[,-ncol(XY)])
  Y <- XY[,ncol(XY)]
  Y <- ifelse(Y > 0,1,0)
  Model <- keras_model_sequential() 
  Model %>% 
    layer_dense(units=30,activation='relu',input_shape = c(ncol(X))) %>% 
    layer_dropout(rate = 0.2) %>% 
    layer_dense(units = 1, activation = 'sigmoid')
  
  Model %>% compile(
    loss = 'binary_crossentropy',
    optimizer = optimizer_rmsprop(),
    metrics = c('accuracy'))
  
  Model %>% fit(X, Y, 
    epochs = 20, batch_size = 20, 
    validation_split = 0, shuffle = FALSE)
  
  Models[[model]] <<- Model
}

neural.predict = function(model,X) 
{
  if(is.vector(X)) X <- t(X)
  X <- as.matrix(X)
  # predict() returns the sigmoid output, i.e. the class probability;
  # predict_proba() did the same, but was removed from newer Keras versions
  Y <- Models[[model]] %>% predict(X)
  return(ifelse(Y > 0.5,1,0))
}
```

MxNet, Amazon’s answer to Google’s Tensorflow. It also offers tensor arithmetic and neural net building blocks on CPU and GPU, as well as high level network functions similar to Keras (the next Keras version will also support MxNet). Just as with Tensorflow, CUDA is supported, but not (yet) OpenCL, so you’ll need an Nvidia graphics card to enjoy GPU support. In direct comparison (2), MxNet was reported to be less resource hungry and a bit faster than Tensorflow, but so far I could not confirm this. The standard train and predict functions:

```
# how to install the CPU version:
#cran <- getOption("repos")
#cran["dmlc"] <- "https://s3-us-west-2.amazonaws.com/apache-mxnet/R/CRAN/"
#options(repos = cran)
#install.packages('mxnet')
library('mxnet')

neural.train = function(model,XY) 
{
  X <- data.matrix(XY[,-ncol(XY)])
  Y <- XY[,ncol(XY)]
  Y <- ifelse(Y > 0,1,0)
  Models[[model]] <<- mx.mlp(X,Y,
       hidden_node = c(30), 
       out_node = 2, 
       activation = "sigmoid",
       out_activation = "softmax",
       num.round = 20,
       array.batch.size = 20,
       learning.rate = 0.05,
       momentum = 0.9,
       eval.metric = mx.metric.accuracy)
}

neural.predict = function(model,X) 
{
  if(is.vector(X)) X <- t(X)
  X <- data.matrix(X)
  Y <- predict(Models[[model]],X)
  return(ifelse(Y[1,] > Y[2,],0,1))
}
```

By replacing the neural.train and neural.predict functions, and other functions for saving and loading models that are not listed here, you can run the same strategy with different deep learning packages and compare. We’re currently using Keras for most machine learning strategies, and I’ll also use it for the short-term bitcoin trading system presented in the upcoming 2nd part of this article. There is no bitcoin futures data available yet, so tick based price data from several bitcoin exchanges will have to do for the backtest.
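The functions for saving and loading models are not listed in this article. As a rough idea of what they might look like, here is a minimal sketch that keeps all trained models in the global list Models, as the train functions above do, and stores that list with base R's save and load. The names neural.init, neural.save, and neural.load merely follow the naming pattern of the functions above and are assumptions here, as are dummy.train and the temporary file. Models from packages such as Keras or H2O cannot simply be saved this way and need the serialization functions of their package.

```r
# Sketch only: generic model bookkeeping with base R.
neural.init = function()
{
  set.seed(365)
  Models <<- vector("list")     # global list, filled by neural.train
}

neural.save = function(name)
{
  save(Models,file=name)        # write the whole list to one file
}

neural.load = function(name)
{
  load(name,envir=globalenv())  # restore the global Models list
}

# Stand-in for a real neural.train: stores a dummy 'model'
dummy.train = function(model)
{
  Models[[model]] <<- list(weights = c(0.1,0.2,0.3))
}

# Round trip: train, save, wipe, load
neural.init()
dummy.train(1)
File <- tempfile(fileext = ".ml")
neural.save(File)
neural.init()
neural.load(File)
print(Models[[1]]$weights)  # 0.1 0.2 0.3
```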

I’ve uploaded the interface scripts for Deepnet, H2O, Tensorflow/Keras, and MxNet to the 2018 script repository, so you can run your own deep learning experiments and compare the packages. Here’s a Zorro script for downloading bitcoin prices from Quandl – EOD only, though, since the exchanges demand dear payment for their tick data.

```
void main()
{
  assetHistory("BITFINEX/BTCUSD",FROM_QUANDL);
}
```

You can also get Bitcoin M1 data from Kaggle in CSV format. Here’s a Zorro script for converting it to a Zorro T6 dataset:

```
void main()
{
	string InName = "History\\bitstampUSD_1-min_data_2012-01-01_to_2019-03-13.csv";
	string Format = "+%t,f3,f1,f2,f4,f6";
	dataParse(1,Format,InName); 
	dataSave(1,"History\\BTCUSD.t6");
}
```

42 thoughts on “Deep Learning Systems for Bitcoin 1”

miner says:

December 27, 2017 at 17:48

hodl
Lorenzo says:

December 27, 2017 at 19:36

Tried many things about ai and btc, but nothing beat buy and hold
Ludo says:

December 27, 2017 at 20:21

soon good i hope to see part 2 quickly 🙂
Nigel Haynes says:

December 29, 2017 at 16:36

I wouldn’t focus on Bitcoin, would be better to look at others, especially Ripple (XRP)
madpower2000 says:

December 30, 2017 at 21:29

Now TensorFlow have experimental feature allow to compile your model to binary or to C++ source code: https://youtu.be/kAOanJczHA0

So, you potentially can deploy your model in R, save it to file and later make fast prediction straight from Zorro, if you able to bind TF runtime/C++ with Zorro.

But for the other hand for trivial models, as in your article, why you not to add simple dense layers functionality to Zorro, since you already made PERCEPTRON? There a lot C++ source code of deep nets implementation available, also don’t forget about OpenBLAS and your prediction engine would be blazing fast.
jcl says:

December 31, 2017 at 09:24

That’s possible, but it had no substantial speed advantage. Prediction would be about 50% faster, but the bottleneck is training. Since we normally have no large feature set in trading systems, prediction is just a few matrix multiplications, and is often anyway faster than many standard indicators with large lookback periods.
Kris says:

January 1, 2018 at 05:19

Nice one Johann! Very interested to hear your ideas about trading cryptos, particularly now that we can throw the futures contract into the mix. Its trading volume wasn’t exactly spectacular leading up to the Christmas break, but no doubt there are many watching with a lot of interest.

As a nice coincidence, I also just launched a blog series about using deep learning in trading systems. I’ll be using Keras, and of course Zorro.

Thanks for sharing your work.
jcl says:

January 1, 2018 at 15:44

Sounds promising – I’m looking forward to the rest of your blog series. And don’t work too much on holidays!
madpower2000 says:

January 21, 2018 at 10:52

From your post, it’s not clear how often you retrain your model and witch time frame you trade. For FX you previously suggest 1H timeframe, and 25 day retrain period, so there no speed bottleneck for any R deep learning framework at all. What about crypto market? Which timeframe you use and how often retrain your model?
jcl says:

January 21, 2018 at 15:31

The timeframe is one minute, retraining every 2 weeks. All this will be covered in the second part of the article.
madpower2000 says:

January 22, 2018 at 14:50

Your every post worth a hundred posts all others authors, you always source of trading wisdom for me, thanks for your sharing.
Waiting for part 2 impatiently!
But I still don’t figure out why you point to taring time as bottleneck, if you retraining only every 2 weeks?
jcl says:

January 22, 2018 at 16:41

Because the time consuming part is the testing, not live trading, where retraining happens in the background anyway. But in walk forward tests the system is training many times, maybe thousands of times when you also do preselection or optimization. That’s where you need multiple cores, GPU support, and any processing power that you can get.
Andy says:

January 31, 2018 at 12:21

There is a way round the slow prediction issue for H2O. A helpful reply on this from Erin LeDell of H2O can be found here:
https://stackoverflow.com/questions/47759418/alternative-to-as-h20-for-small-data
(this works for predictions but sadly not for an autoencoder, which was my original problem/question). A
snipe75 says:

February 17, 2018 at 22:47

Also can’t wait for part 2 of this article 🙂 Trying to design a trading bot myself, so I find this blog very interesting.
However, I think you should read some more about crypto. For example Bitcoin isn’t very anonymous, unlike Monero for example. Also you underestimate the true value of Bitcoin, based on its supply cap 😉
Potential problems with bit-euro or bit-yuan would be same as fiat – if you can print/issue unlimited amounts, it’s not a very good store of value.
Nick WONG says:

February 27, 2018 at 10:53

If you need commodities, market index, currencies or crypto data, here is the data-api website I made. Not free but it’s cheap and with 90 days trial. Have a try ?
https://quantapi.co/
Vincenzo says:

April 9, 2018 at 11:07

Hi CJ,
If can help I made this tool in C# to download crypto history from CryptoCompare in Zorro format (all free).
There is about :
– 2 weeks of 1 minute hystory
– 5 years of hourly history
– All history of daily

https://github.com/vinsom68/CryptoCompareHistoryAPI
jcl says:

April 9, 2018 at 11:43

Zorro S already supports CryptoCompare, but not the free Zorro version, so I think this tool will come handy for users of the free version.
Glenn says:

June 5, 2018 at 09:52

Could you share the link to the 2018 script repo? Thanks!
jcl says:

June 5, 2018 at 17:06

The link is on the sidebar under “links”.
JB says:

August 22, 2018 at 23:57

How have your results been?

Relatively simple trend/momentum strategies with various twists perform quite well in backtests, even with commission fees. Though they perform less well since March/April, market is quite choppy now.

I have a simple bot integrated with exchange API, but my main issue now is limit/market order execution optimization… and I see that order entry is its own heavily researched academic field with some rather advanced math: https://www.cis.upenn.edu/~mkearns/papers/rlexec.pdf

Any advice?
JB says:

August 23, 2018 at 00:01

Another paper on the order entry optimization problem “Optimal placement in a limit order book: an analytical
approach” 2017: http://www.ieor.berkeley.edu/~xinguo/papers/GuoLarrardRuan2.pdf

Are there any open source tools that try to optimize order entry? If only there were a way to know in advance which limits would be filled and when to just market order 😉 perhaps the AI approach can improve results.
jcl says:

August 23, 2018 at 13:53

Thank you for the link. For optimizing the order limit, you need depth data from the order book. Live order book data is available free on several crypto exchanges, for instance Bittrex, but the order book history is not free. – We’ve meanwhile tested several network structures and got definitely better results than buy and hold, but I had not had time yet to write the second part of the article about it.
Abhay Aluri says:

September 25, 2018 at 02:26

Will there be another blog post on this? I am interested in what trading strategies were most successful when trading cryptocurrencies.
jcl says:

September 25, 2018 at 07:56

Yes, there will be. In fact the strategy was finished long ago, but it turned out that I would need a different one for the article and had not the yet time to blog about it.
j2ee says:

October 4, 2018 at 02:01

Short and “hodl” is the best for now, these crap coins are done.
Brad Nickel says:

December 4, 2018 at 04:36

Would love to see the 2nd part of this. Is it coming out soon?
jcl says:

December 5, 2018 at 12:15

Possibly in February. I had a large project this year and not much time for the blog, and there’s another article to be released before.
Chickenlasers says:

March 2, 2019 at 01:16

Loved the article! Any updates on if a part 2 could be out any time soon? 🙂
Eric Jiang says:

April 6, 2019 at 10:39

looking forward to the new post.
luca says:

September 3, 2019 at 09:09

Hi, have a doubt about the following statement

Y 0,1,0)

in my dataset Y is equal to the close therefore always positive, should not be the percentage of variation?
thanks for any clarification.
Bye
luca says:

September 3, 2019 at 09:10

sorry, here correct statement Y 0,1,0)
luca says:

September 3, 2019 at 09:11

here ifelse(Y > 0,1,0), sorry again
jcl says:

September 9, 2019 at 10:14

If your Ys are always positive, simply use ifelse(Y > Threshold,1,0). Select Threshold so that 1 and 0 are equally distributed.
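A minimal sketch of such a threshold, assuming the median of the targets is used, since by definition half of the samples lie above it. The sample values are made up:

```r
# Made-up, always-positive targets, e.g. closing prices
Y <- c(101.5,99.8,100.2,103.1,98.7,100.9)

Threshold <- median(Y)             # splits the samples in two halves
Y01 <- ifelse(Y > Threshold,1,0)   # 1 above the median, 0 otherwise

print(Y01)                         # 1 0 0 1 0 1
```

With many identical target values the two classes can still end up unbalanced, so it is worth checking the counts with table(Y01).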
faustf says:

December 13, 2020 at 14:15

hi i installed R and zorro and modify Zorro.ini

but why when i run this script
library(‘deepnet’)

neural.train = function(model,XY)
{
XY <- as.matrix(XY)
X <- XY[,-ncol(XY)]
Y <- XY[,ncol(XY)]
Y <- ifelse(Y > 0,1,0)
Models[[model]] <<- sae.dnn.train(X,Y,
hidden = c(30),
learningrate = 0.5,
momentum = 0.5,
learningrate_scale = 1.0,
output = "sigm",
sae_output = "linear",
numepochs = 100,
batchsize = 100)
}

neural.predict = function(model,X)
{
if(is.vector(X)) X <- t(X)
return(nn.predict(Models[[model]],X))
}

return me error library undecleared identifier deepnet ??
jcl says:

December 13, 2020 at 17:58

Some places where you can get help for problems with C or R:

https://r-dir.com/community/forums.html
https://opserver.de/ubb7/
https://manual.zorro-project.com/
support@opgroup.de
Octavian says:

January 26, 2021 at 03:02

This is excellent. Any chance of the part 2 of this article happening?
jcl says:

January 27, 2021 at 11:04

The system that I had in mind for part 2 turned out unsuited for publication. But part 2 will eventually come, I only don’t yet know when.
Paul Jones says:

June 21, 2023 at 07:13

The Deep Learning “time-series prediction” field is moving fast. Any word when https://financial-hacker.com/deep-learning-systems-for-bitcoins-part-2/ will become reality? Or is it indefinitely on hold?
jcl says:

June 21, 2023 at 08:24

Since I had not found the time for doing the second part in the last 2 years, I must admit that the statistical probability is slim that I will in the next 2 years.