Online Portfolio Selection

Finance & Investing / Machine Learning & Pattern Recognition

With the aim to sequentially determine optimal allocations across a set of
assets, Online Portfolio Selection (OLPS) has significantly reshaped the
financial investment landscape. Online Portfolio Selection: Principles
and Algorithms supplies a comprehensive survey of existing OLPS
principles and presents a collection of innovative strategies that leverage
machine learning techniques for financial investment.

The book presents four new algorithms based on machine learning
techniques that were designed by the authors, as well as a new back-test
system they developed for evaluating trading strategy effectiveness.
The book uses simulations with real market data to illustrate the trading
strategies in action and to provide readers with the confidence to deploy
the strategies themselves. The book is presented in five sections that:

I. Introduce and formulate OLPS as a sequential decision task
II. Present key OLPS principles, including benchmarks, follow the
winner, follow the loser, pattern matching, and meta-learning
III. Detail four innovative OLPS algorithms based on cutting-edge
machine learning techniques
IV. Provide a toolbox for evaluating the OLPS algorithms and present
empirical studies comparing the proposed algorithms with the
state of the art
V. Investigate possible future directions

Complete with a back-test system that uses historical data to evaluate
the performance of trading strategies, as well as MATLAB® code for the
back-test systems, this book is an ideal resource for graduate students in
finance, computer science, and statistics. It is also suitable for researchers
and engineers interested in computational investment.

Readers are encouraged to visit the authors’ website for updates:
[Link]

K23731

Bin Li and Steven C.H. Hoi

ISBN: 978-1-4822-4963-7

an informa business
6000 Broken Sound Parkway, NW, Suite 300, Boca Raton, FL 33487
711 Third Avenue, New York, NY 10017
2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN, UK
[Link]


Online Portfolio
Selection
Principles and Algorithms

T&F Cat #K23731 — K23731_C000 — page i — 10/13/2015 — 16:23


Bin Li and Steven C.H. Hoi



MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks does
not warrant the accuracy of the text or exercises in this book. This book’s use or discussion of
MATLAB® software or related products does not constitute endorsement or sponsorship by The
MathWorks of a particular pedagogical approach or particular use of the MATLAB® software.

CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2016 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works


Version Date: 20151001

International Standard Book Number-13: 978-1-4822-4964-4 (eBook - PDF)

This book contains information obtained from authentic and highly regarded sources. Reasonable
efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors and
publishers have attempted to trace the copyright holders of all material reproduced in this publication
and apologize to copyright holders if permission to publish in this form has not been obtained. If any
copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access
[Link] ([Link]) or contact the Copyright Clearance Center, Inc. (CCC), 222
Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that
provides licenses and registration for a variety of users. For organizations that have been
granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
[Link]
and the CRC Press Web site at
[Link]
Contents

List of Figures

List of Tables

List of Notations

Preface

Acknowledgments

Authors

I Introduction
1 Introduction
1.1 Background
1.1.1 Challenge 1: Voluminous Financial Instruments
1.1.2 Challenge 2: Human Behavioral Biases
1.1.3 Challenge 3: High-Frequency Trading
1.1.4 Algorithmic Trading and Machine Learning
1.2 What Is Online Portfolio Selection?
1.3 Methodology
1.4 Book Overview

2 Problem Formulation
2.1 Problem Settings
2.2 Transaction Costs and Margin Buying Models
2.3 Evaluation
2.4 Summary

II Principles
3 Benchmarks
3.1 Buy-and-Hold Strategy
3.2 Best Stock Strategy
3.3 Constant Rebalanced Portfolios

4 Follow the Winner
4.1 Universal Portfolios
4.2 Exponential Gradient
4.3 Follow the Leader
4.4 Follow the Regularized Leader
4.5 Summary

5 Follow the Loser
5.1 Mean Reversion
5.2 Anticorrelation
5.3 Summary

6 Pattern Matching
6.1 Sample Selection Techniques
6.2 Portfolio Optimization Techniques
6.3 Combinations
6.4 Summary

7 Meta-Learning
7.1 Aggregating Algorithms
7.2 Fast Universalization
7.3 Online Gradient and Newton Updates
7.4 Follow the Leading History
7.5 Summary

III Algorithms
8 Correlation-Driven Nonparametric Learning
8.1 Preliminaries
8.1.1 Motivation
8.2 Formulations
8.3 Algorithms
8.4 Analysis
8.5 Summary

9 Passive–Aggressive Mean Reversion
9.1 Preliminaries
9.1.1 Related Work
9.1.2 Motivation
9.2 Formulations
9.3 Algorithms
9.4 Analysis
9.5 Summary

10 Confidence-Weighted Mean Reversion
10.1 Preliminaries
10.1.1 Motivation
10.2 Formulations
10.3 Algorithms
10.4 Analysis
10.5 Summary

11 Online Moving Average Reversion
11.1 Preliminaries
11.1.1 Related Work
11.1.2 Motivation
11.2 Formulations
11.3 Algorithms
11.4 Analysis
11.5 Summary

IV Empirical Studies
12 Implementations
12.1 The OLPS Platform
12.1.1 Preprocess
12.1.2 Algorithmic Trading
12.1.3 Postprocess
12.2 Data
12.3 Setups
12.3.1 Comparison Approaches and Their Setups
12.4 Performance Metrics
12.5 Summary

13 Empirical Results
13.1 Experiment 1: Evaluation of Cumulative Wealth
13.2 Experiment 2: Evaluation of Risk and Risk-Adjusted Return
13.3 Experiment 3: Evaluation of Parameter Sensitivity
13.3.1 CORN’s Parameter Sensitivity
13.3.2 PAMR’s Parameter Sensitivity
13.3.3 CWMR’s Parameter Sensitivity
13.3.4 OLMAR’s Parameter Sensitivity
13.4 Experiment 4: Evaluation of Practical Issues
13.5 Experiment 5: Evaluation of Computational Time
13.6 Experiment 6: Descriptive Analysis of Assets and Portfolios
13.7 Summary

14 Threats to Validity
14.1 On Model Assumptions
14.2 On Mean Reversion Assumptions
14.3 On Theoretical Analysis
14.4 On Back-Tests
14.5 Summary

V Conclusion
15 Conclusions
15.1 Conclusions
15.2 Future Directions
15.2.1 On Existing Work
15.2.2 On Practical Issues
15.2.3 Learning for Index Tracking

Appendix A OLPS: A Toolbox for Online Portfolio Selection

Appendix B Proofs and Derivations

Appendix C Supplementary Data and Portfolio Statistics

Bibliography

Index


List of Figures

1.1 Book organization
8.1 A motivating example to illustrate the limitation of Euclidean distance
12.1 Structure of the OLPS toolbox
13.1 Trends of cumulative wealth achieved by various strategies on the six datasets
13.2 Risk and risk-adjusted performance of various strategies on the six datasets
13.3 Parameter sensitivity of CORN-U with respect to ρ with fixed W (W = 5)
13.4 Parameter sensitivity of CORN-U with respect to w (W) with fixed ρ (ρ = 0.1)
13.5 Parameter sensitivity of PAMR with respect to ε
13.6 Parameter sensitivity of PAMR-1 (or PAMR-2) with respect to C with fixed ε (ε = 0.5)
13.7 Parameter sensitivity of CWMR with respect to ε
13.8 Parameter sensitivity of OLMAR-1 with respect to ε with fixed w (w = 5)
13.9 Parameter sensitivity of OLMAR-1 with respect to w with fixed ε (ε = 10)
13.10 Parameter sensitivity of OLMAR-2 with respect to α with fixed ε (ε = 10)
13.11 Scalability of the proposed strategies with respect to the transaction cost rate (γ)
13.12 Distributions of portfolio weights
A.1 Starting the Trading Manager
A.2 Various components of the Algorithm Analyser
A.3 Various components of the Experimenter
A.4 Results Manager for Algorithm Analyser
A.5 Results Manager for Experimenter
A.6 Configuration Manager


List of Tables

II.1 Principles and representative online portfolio selection algorithms
5.1 Motivating example to show the mean reversion trading idea
6.1 Pattern matching-based approaches: sample selection and portfolio optimization
9.1 Motivating example to compare BCRP and PAMR
10.1 Summary of mean reversion statistics on real markets
10.2 A running example of CWMR-Stdev on the Cover’s game
10.3 Summary of time complexity analysis
11.1 Summary of existing optimization formulations and their underlying predictions
11.2 Illustration of the mean reversion strategies on toy markets
12.1 Summary of the six datasets from real markets
12.2 Summary of the performance metrics used in the evaluations
13.1 Cumulative wealth achieved by various trading strategies on the six datasets and their
reversed datasets
13.2 Statistical t-test of the proposed algorithms on the six datasets
13.3 Cumulative wealth achieved by various strategies on the six datasets without and with
margin loans (MLs)
13.4 Computational time cost (in seconds) on the six real datasets
13.5 Some descriptive statistics on the NYSE (O) dataset
13.6 Top five (average) allocation weights of some strategies on NYSE (O)
A.1 All implemented strategies in the toolbox
A.2 All included datasets in the toolbox
A.3 Controlling variables
A.4 Vector of the analyzed results
C.1 Some descriptive statistics on the NYSE (N) dataset
C.2 Some descriptive statistics on the SP500 dataset
C.3 Some descriptive statistics on the MSCI dataset
C.4 Some descriptive statistics on the DJIA dataset
C.5 The top five (average) allocation weights of the proposed strategies on five datasets


List of Notations

$\alpha$    Decaying factor for MAR
$\bar{x}_t$    Arithmetic mean of t-th (predicted or realized) price relatives
    Element-wise product of two vectors
$\Delta_m$    Simplex domain
$\ell_\epsilon(\mathbf{b}; \mathbf{x}_t)$    $\epsilon$-insensitive loss function
$\epsilon$    Sensitivity parameter for PAMR
$\gamma$    Proportional transaction cost rate
$\int$    Integral
$\mathbb{R}^m_+$    Domain of m-dimensional vectors with positive real elements
$\mathbf{1}$    Vector of all ones
$\mathbf{b} \cdot \mathbf{x}$    Dot product of vector b and vector x
$\mathbf{b}^\top \mathbf{x}$    Dot product of vector b and vector x
$\mathbf{b}_1^n$    A portfolio strategy for n periods
$\mathbf{b}_t$    Portfolio vector for t-th period
$I$    Identity matrix
$\mathbf{p}_t$    Vector of closing prices for t-th period
$\mathbf{x}_1^n$    Whole market windows for n periods
$\mathbf{x}_s^e$    Market windows from period s to e
$\mathbf{x}_t$    Vector of price relatives (simple gross return) for t-th period
$N(\mu, \Sigma)$    Normal distribution with mean $\mu$ and covariance matrix $\Sigma$
$\det(\Sigma)$    Determinant of matrix $\Sigma$
$\mathrm{EMA}_t(\alpha)$    Exponential moving average
$\mathrm{regret}_n(\mathrm{Alg})$    Regret of Alg strategy with respect to BCRP
$\mathrm{SMA}_t(w)$    Simple moving average
$\mu(\cdot)$    A distribution on the space of valid portfolios
$\phi$    Confidence parameter
$\Phi(\cdot)$    Cumulative function of normal distribution
$\rho$    Correlation coefficient threshold
$\tilde{\mathbf{x}}_{t+1}$    Predicted price relative vector
$C_t(w, \rho)$    Correlation-similar set
$i$    Index of an asset
$m$    The number of assets
$n$    The number of trading periods
$P$    Maximum correlation coefficient
$S_t$    Cumulative return till the end of t-th period
$s_t$    Daily return for t-th period
$S_t(\mathrm{Alg})$    Cumulative return achieved by Alg until the end of t-th period
$t$    Index of a trading period
$W$    Maximum window size
$w$    Window size
$W_t(\mathrm{Alg})$    Exponential growth rate achieved by Alg until the t-th period
$C$    Aggressive parameter for PAMR
CORN($w$, $\rho$)    Correlation-driven nonparametric expert
CORN-K    CORN algorithm with top-K aggregation
CORN-U    CORN algorithm with uniform aggregation


Preface

Introduction
Computational intelligence techniques, including machine learning and data mining,
have significantly reshaped the financial investment community over recent decades.
Examples include high-frequency trading and algorithmic trading. This book studies a
fundamental problem in computational finance, namely online portfolio selection (OLPS),
which aims to sequentially determine optimal allocations across a set of assets. The
book investigates this problem by conducting a comprehensive survey of existing
principles and presenting a family of new strategies based on machine-learning
techniques. A back-test system using historical data has been developed to evaluate the
performance of trading strategies.
Our goal in writing this monograph is to present a self-contained text for a wide
range of audiences, including graduate students in finance, computer science, and
statistics, as well as researchers and engineers who are interested in computational
investment. Readers are encouraged to visit our project website for updates:
[Link]

Organization
Part I introduces the OLPS problem. Chapter 1 introduces the background and
summarizes the contributions of this book. Chapter 2 formally formulates OLPS as a
sequential decision task.
Part II presents some key principles for this task. Chapter 3 summarizes three
benchmarks: the Buy-and-Hold strategy, Best Stock strategy, and Constant Rebalanced
Portfolios. Chapter 4 presents the principle of Follow the Winner, which moves
weights from losing assets to winning assets. Chapter 5 presents the opposite principle,
called Follow the Loser, which moves weights from winners to losers. Chapter 6
demonstrates the principle of Pattern Matching, which exploits similar patterns among
historical markets. Chapter 7 discusses Meta-Learning, which treats strategies
as assets and thus builds hyperstrategies over them.
Part III designs four novel algorithms to solve the OLPS problem. All of the
algorithms apply state-of-the-art machine-learning techniques to the task. Chapter 8
designs a new strategy named CORrelation-driven Nonparametric (CORN) learning,
which overcomes the limitations of existing pattern matching–based strategies
that use Euclidean distance to measure the similarity between two patterns. Chapter 9
develops Passive–Aggressive Mean Reversion (PAMR), which is based on the
first-order passive–aggressive online learning method, and Chapter 10 designs
Confidence-Weighted Mean Reversion (CWMR), which is based on the second-order
confidence-weighted online learning method. Chapter 11 assumes multiple-period
mean reversion, or so-called Moving Average Reversion (MAR), and presents a new
OLPS strategy named Online Moving Average Reversion (OLMAR), which exploits
MAR by applying online learning techniques.
Part IV presents empirical studies for benchmarking the performance of the
proposed algorithms. Chapter 12 discusses issues related to the implementation of
a back-test system, which is widely used in the evaluation of trading strategies.
Chapter 13 shows the empirical results on six historical markets. Our empirical results
show that (i) the proposed algorithms generally outperform the state of the art in terms
of the cumulative return and risk-adjusted return and (ii) the proposed algorithms
are highly efficient and scalable for large-scale OLPS in real-world applications.
Chapter 14 discusses threats to validity, including the various assumptions made
during the study.
Part V concludes the book and presents some potential future directions.

MATLAB is a registered trademark of The MathWorks, Inc. For product information,
please contact:
The MathWorks, Inc.
3 Apple Hill Drive
Natick, MA 01760-2098, USA
Tel: 508-647-7000
Fax: 508-647-7001
E-mail: info@[Link]
Web: [Link]



Acknowledgments

We thank Peilin Zhao, Doyen Sahoo, Vivekanand Gopalkrishnan, and Dingjiang
Huang, who participated in different stages of our online portfolio selection
project and contributed to the related research in this area that forms the core
foundation of this book. We gratefully acknowledge research
support from both Wuhan University and Singapore Management University. This
book was supported in part by the National Natural Science Foundation of China
(71401128); the Scientific Research Foundation for the Returned Overseas Chinese
Scholars, State Education Ministry; the Fundamental Research Funds for the Central
Universities; and the MOE tier-1 research grant from Singapore Management
University. Special thanks also go to Nanyang Technological University, where the
authors started the early initiatives of this project and completed it.

Bin Li
Economics and Management School
Wuhan University, People’s Republic of China

Steven C.H. Hoi
School of Information Systems
Singapore Management University, Singapore


Authors

Dr. Bin Li received a bachelor’s degree in computer science from Huazhong University
of Science and Technology, Wuhan, China, and a bachelor’s degree in economics
from Wuhan University, Wuhan, China, in 2006. He earned a PhD degree from the
School of Computer Engineering of Nanyang Technological University, Singapore, in
2013. He completed the CFA Program in 2013. He is currently an associate professor of
finance at the Economics and Management School of Wuhan University. Dr. Li was a
postdoctoral research fellow at the Nanyang Business School of Nanyang Technological
University. His research interests are computational finance and machine learning.
He has published several academic papers in premier conferences and journals.

Dr. Steven C.H. Hoi received his bachelor’s degree in computer science from
Tsinghua University, Beijing, China, in 2002, and both his master’s and PhD degrees in
computer science and engineering from The Chinese University of Hong Kong, Hong
Kong, China, in 2004 and 2006, respectively. He is currently an associate professor in
the School of Information Systems, Singapore Management University, Singapore.
Prior to joining SMU, he was a tenured associate professor in the School of Computer
Engineering, Nanyang Technological University, Singapore. His research interests
are machine learning and data mining and their applications to real-world big
data challenges across varied domains, including computational finance, multimedia
information retrieval, social media, web search and data mining, computer vision and
pattern recognition, and so on. Dr. Hoi has published more than 150 refereed articles
in premier international journals and conferences. As an active researcher in his
research communities, he has served as general co-chair for ACM SIGMM Workshops
on Social Media (WSM’09-11), program co-chair for Asian Conference on Machine
Learning (ACML’12), editor for Social Media Modeling and Computing, guest editor
for journals such as Machine Learning and ACM TIST, associate editor-in-chief
of Neurocomputing, associate editor for several reputable journals, area chair/senior
PC member for conferences, including ACM Multimedia 2012 and ACML’11–’15,
technical PC member for many international conferences, and referee for top journals
and magazines. He has often been invited for external grant review by worldwide
funding agencies, including the US NSF funding agency, Hong Kong RGC funding
agency, and so on. He is a senior member of IEEE and a member of AAAI and ACM.



Part I

Introduction



Chapter 1

Introduction

Wall Street is notorious for not learning from its mistakes. Maybe machines can do better.
– Letting the Machines Decide

Investments in financial markets comprise fundamental and challenging tasks in both
financial academia and industry. For example, mutual funds invest the raised capital
among a collection of investment opportunities so as to create value for the
fund investors; insurance companies invest the premiums in the financial market
so as to satisfy future insurance claims. Typically, investors analyze and
explore investment opportunities via fundamental and technical analyses using various
instruments and tools, often manually. To keep pace with the rapid development
of investment opportunities (cf. the three challenges in Section 1.1), quantitative analysis
has been emerging as a new way for investment analysis and automated trading.
Computational finance (CF), which leverages financial theory via computational
techniques, has been emerging and evolving rapidly in recent years. One of the heavily
studied areas in CF is investment, as computers help to automate various tasks
and make decisions in investments. For example, by using advanced computational
tools, investment analysts can analyze huge amounts of data and identify
underpriced stocks. In addition, investment strategists can back-test and compare strategies
using historical data so that they have confidence in a strategy before facing an unknown
future.
One crucial investment task is the allocation of capital, or so-called “portfolio
selection” (Markowitz 1952). Despite the theoretical elegance of such models, estimation
errors have constrained their application in real investment. According
to DeMiguel et al. (2009), a naive 1/N strategy can outperform various portfolio
selection models. Such estimation errors motivate portfolio strategies without estimation,
that is, online portfolio selection (OLPS), pioneered by Cover (1991). We
follow this approach and study the problem of OLPS.
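
As a rough illustration of why the naive 1/N strategy needs no estimation, the sketch below
rebalances to equal weights every period and tracks cumulative wealth. It is only a minimal
sketch: the function name and the two-asset toy market are hypothetical and are not taken
from DeMiguel et al. (2009), and transaction costs are ignored.

```python
def uniform_rebalanced_wealth(price_relatives, initial_wealth=1.0):
    """Cumulative wealth of a 1/N portfolio rebalanced at every period.

    price_relatives: list of periods; each period is a list holding, for every
    asset, the ratio of its closing price to the previous closing price.
    """
    wealth = initial_wealth
    history = []
    for x_t in price_relatives:
        # With equal weights 1/N, the portfolio's gross return is the plain
        # average of the assets' price relatives; no parameter is estimated.
        period_return = sum(x_t) / len(x_t)
        wealth *= period_return
        history.append(wealth)
    return history


# Hypothetical toy market: asset 0 is cash, asset 1 alternately doubles and halves.
toy_market = [[1.0, 2.0], [1.0, 0.5], [1.0, 2.0], [1.0, 0.5]]
wealth_path = uniform_rebalanced_wealth(toy_market)
```

In this toy market neither asset gains anything over the four periods when held alone, yet
the rebalanced equal-weight portfolio ends above its starting wealth.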
This chapter first introduces the background of OLPS and then briefly outlines the
contents to be covered in this book.

1.1 Background
The financial investment management industry often faces various challenges and
requires new solutions for the task. Below we introduce three representative
challenges and briefly propose how to tackle these challenges using machine-learning
techniques.

1.1.1 Challenge 1: Voluminous Financial Instruments


One recent challenge is the increasing number of financial instruments,∗ in terms of
both categories and assets in each category. On the one hand, financial innovations
(Miller 1986) in the past decade created various types of financial instruments, such
as interest rate swaps, credit default swaps, and options. On the other hand, with the
development of the global economy, thousands of companies and trading instruments
are listed on various exchanges.† The “big data” generated by these instruments and
companies are very hard for human investors to process and analyze.

1.1.2 Challenge 2: Human Behavioral Biases


The second challenge is human behavioral biases in decision making (Barberis and
Thaler 2003). Due to humans’ subjective nature, many traditional investment strategies
suffer from these biases and produce suboptimal decisions when greed and fear
interact. In fact, exploiting consistent biases in markets is one source of profits for
many traders (Reinganum 1983; Dimson 1988; Jegadeesh 1990). Thus, for an individual
investor or institution, it is better to avoid such behavioral biases or even exploit
other biases, which is hard for most human investors.

1.1.3 Challenge 3: High-Frequency Trading


The development of information technology has significantly sped up the trading
industry. One example is high-frequency trading (HFT) (Aldridge 2010), which
completes the buy and sell within a time ranging from seconds to one day. On the one
hand, intraday data are much more voluminous and arrive much faster than low-frequency
data and, thus, require high-speed tools and methodologies. On the other hand, due to the
high speed, HFT requires a quick response to market behaviors; otherwise, the
opportunities will disappear. While human investors can sometimes spot the opportunities,
they are too slow to open trade positions. Both characteristics of HFT call for new
tools and methodologies.

∗ Financial instruments refer to any tradable assets, such as stocks, futures, and bonds.
† Exchanges provide services, such as trading financial instruments, for traders and brokers. For example,
the New York Stock Exchange (NYSE) is a stock exchange.

1.1.4 Algorithmic Trading and Machine Learning

To tackle the above three challenges, algorithmic trading techniques, which assist
investment activities via computational techniques, have emerged. On the one hand, with
the advance of computational techniques, machines nowadays can handle a much
larger quantity of instruments and companies than humans can. They also process
data at a much higher speed than humans and are thus suited for HFT scenarios.
On the other hand, a machine is free from human behavioral biases and produces
exactly the same results if the inputs are the same. There are mainly two areas of
algorithmic trading (Harris 2003): one is on the sell side∗ and the other is on the
buy side.† Sell-side algorithmic trading (Bertsimas and Lo 1998; Almgren and
Chriss 2000; Nevmyvaka et al. 2006; Bayraktar 2011) concerns automatically slicing
a large order into smaller ones, such that the market impact incurred by the large order
is minimized, while buy-side algorithmic trading (Qian et al. 2007; Chan 2008;
Durbin 2010; Kearns et al. 2010) makes intelligent investment decisions to achieve
certain targets, such as profit maximization, risk minimization, or both.
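
To make the sell-side idea of order slicing concrete, the following sketch splits one large
parent order into equally sized child orders. This is a deliberately naive, time-weighted
split chosen only for illustration; it is not the optimized execution schedule studied in the
works cited above, and the function name and order sizes are hypothetical.

```python
def slice_order(total_shares, n_slices):
    """Split a parent order into n_slices child orders of near-equal size."""
    if n_slices <= 0:
        raise ValueError("n_slices must be positive")
    base, remainder = divmod(total_shares, n_slices)
    # The first `remainder` child orders carry one extra share, so the child
    # orders always add up exactly to the parent order.
    return [base + 1 if i < remainder else base for i in range(n_slices)]


# Hypothetical parent order: 100,000 shares worked in 8 equal time buckets.
child_orders = slice_order(100_000, 8)
```

Each child order would then be submitted in its own time bucket, so that no single
submission carries the full size of the parent order.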
Machine learning (Mitchell 1997), a scientific discipline of designing algorithms
that can identify complex relationships among huge amounts of historical data and
make intelligent decisions upon new data, has been successfully applied to a variety
of areas (Manning and Schütze 1999; Baldi and Brunak 2001), including algorithmic
trading in finance. For the sell side, there are several patterns among the submitted
orders (Harris 2003). To optimally execute one client’s large order, machine-learning
techniques (Nevmyvaka et al. 2006; Agarwal et al. 2010; Ganchev et al. 2010) can
take advantage of the patterns and submit smaller time/volume weighted orders to
exchanges, such that the market impacts are minimized. For the buy side, several
patterns in financial markets (or, in jargon, “market anomalies”) (Dimson 1988; Cont
2001), such as calendar anomalies (Haugen and Lakonishok 1987),‡ fundamental
anomalies (Fama and French 1992),§ and technical anomalies (Bondt and Thaler
1985; Chan et al. 1996),¶ are well documented. To generate profits from these patterns,
several machine-learning algorithms (El-Yaniv 1998; Yan and Ling 2007; Györfi et al.
2012) have been proposed for buy-side algorithmic trading. Their basic idea is to
identify the patterns via machine-learning techniques and obtain profit by trading the
patterns.

∗ Sell side often refers to investment banks that sell investment services, such as routing orders to
exchanges, to asset management firms.
† Buy side usually refers to the asset management firms that buy the services from the sell side. For
example, Citadel, an asset management firm (buy side), may send its purchase orders via Goldman Sachs,
an investment bank (sell side).
‡ Calendar anomalies refer to the patterns in asset returns from year to year, or month to month. One
famous example is the January effect.
§ Fundamental anomalies are the patterns in asset returns related to the fundamental values of a company,
such as the size effect and the value effect.
¶ Technical anomalies are patterns related to historical prices, such as momentum and contrarian effects.

1.2 What Is Online Portfolio Selection?

This book studies a core problem in buy-side algorithmic trading, named “Online
Portfolio Selection” (Cover 1991; Ordentlich and Cover 1996), which sequentially
allocates capital among a set of assets, aiming to maximize the final return of investment
in the long run. OLPS plays a crucial role in a wide range of financial investment
applications, such as automated wealth management, hedge fund management, and
quantitative trading. In the following, to better understand the idea, we begin by
introducing a concrete example of real-life OLPS applications.
Suppose Bin, a 30-year-old man, has a capital of $10,000, and he wants to increase
the capital to $1,000,000∗ by the time he retires at 60 years old, such that he can maintain
his current living standards. Assume he has no extra income for investment and relies purely
on the initial capital. He would like to achieve this target via investments
in financial markets. Assume that his investment consists of three assets:
Microsoft (stock, symbol: “MSFT”), Goldman Sachs (stock, symbol: “GS”), and
Treasury bills.† All historical records on the three assets, mainly price quotes, are
publicly available. Then, every month,‡ Bin receives updated information about the
three assets and has to face a crucial decision-making challenge, that is, “How
to allocate (rebalance) his capital§ among the three assets every month such that his
capital is more likely to increase in the future?” The idea of exploring OLPS
technology is to help Bin automate the sequence of allocation/rebalancing decisions
so as to maximize his investment return in the long run.
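
The arithmetic behind Bin's target, and a single monthly rebalancing step, can be sketched
as follows. This is only an illustration: the function names and the one-month price
relatives are made up, the target weights follow the $5000/$3000/$2000 split used as an
example in the footnote, and transaction costs are ignored.

```python
def required_growth_rate(initial, target, periods):
    """Constant per-period growth rate that turns `initial` into `target`."""
    return (target / initial) ** (1.0 / periods) - 1.0


def apply_period(capital, weights, price_relatives):
    """Hold `weights` for one period; return end capital and drifted weights."""
    holdings = [capital * b * x for b, x in zip(weights, price_relatives)]
    end_capital = sum(holdings)
    drifted = [h / end_capital for h in holdings]
    return end_capital, drifted


def rebalancing_trades(capital, current_weights, target_weights):
    """Dollar amount to buy (+) or sell (-) per asset to restore the target."""
    return [capital * (t - c) for c, t in zip(current_weights, target_weights)]


# Growing $10,000 into $1,000,000 over 30 years (360 monthly periods).
annual_rate = required_growth_rate(10_000, 1_000_000, 30)
monthly_rate = required_growth_rate(10_000, 1_000_000, 30 * 12)

target = [0.5, 0.3, 0.2]              # MSFT, GS, Treasury bills
hypothetical_x = [1.10, 0.95, 1.002]  # made-up price relatives for one month
capital, drifted = apply_period(10_000, target, hypothetical_x)
trades = rebalancing_trades(capital, drifted, target)
```

Under these assumptions the required rate works out to roughly 16.6% per year. After the
hypothetical month, MSFT has drifted above its target weight, so the rebalancing trade
sells part of it and buys the other two assets.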
In the literature, there are two major schools of principles and theories for portfolio
selection: (i) Markowitz’s mean variance theory (Markowitz 1952, 1959), which trades
off between the expected return (mean) and risk (variance) of a portfolio and is
suitable for single-period portfolio selection, and (ii) capital growth theory (or Kelly
investment) (Kelly 1956; Breiman 1961; Thorp 1971; Finkelstein and Whitley 1981),
which aims to maximize the expected log return of a portfolio and naturally addresses
multiple-period investment. Due to the sequential nature of a real-world portfolio
selection task, many recent OLPS techniques design algorithms by following
the second family of principles and theories.
Note that this book is focused on the algorithmic aspects, rather than the the-
ory (Breiman 1960; Thorp 1969, 1997; Hakansson 1970, 1971; MacLean et al. 2011).
Our study is often concerned with investment management involving multiple types
of assets, which may include fixed income securities, equities, and derivatives. Our
study is also different from another great body of existing work, which attempted to
forecast financial time series by applying computational intelligence techniques and
conduct single-stock trading (Katz and McCormick 2000; Huang et al. 2011), such as
reinforcement learning (Moody et al. 1998; Moody and Saffell 2001), online predic-
tion (Koolen and Vovk 2012), boosting and expert weighting (Creamer 2007, 2012;
Creamer and Freund 2007, 2010; Creamer and Stolfo 2009), neural networks (Kimoto
et al. 1993; Dempster et al. 2001), decision trees (Tsang et al. 2004), and support
vector machines (Tay and Cao 2001; Cao and Tay 2003; Lu et al. 2009). Finally,
we emphasize the nature of “online” algorithms for addressing the portfolio selec-
tion problem, in which the algorithms must be computationally efficient enough

∗ Here, one million is an arbitrary number; of course, the more the better.
† Treasury bill is often regarded as a risk-free asset, earning a guaranteed risk-free return. Once he does
not want to buy any stocks, he can put all money in Treasury bills, instead of cash.
‡ Here, “month” represents a period, which can be one day, one week, or one month, etc.
§ For example, he may buy $5000 MSFT stock, $3000 GS stocks, and $2000 Treasury bills.

for handling large-scale applications (e.g., high-frequency trading), although our
algorithms are not restricted to high-frequency trading.

1.3 Methodology
OLPS for real-world trading tasks is challenging in that the market information
(mainly the market data) arrives sequentially, and a portfolio manager has to make a
decision immediately based on the known information. The problem is inherently
online. Two types of machine-learning methodologies have been explored to design
strategies for this task.
The first methodology is batch learning, where the model is trained from a
batch of training instances. In this way, we assume that all price information (and
maybe other information) is complete at one decision point, and thus one can deploy
batch-learning methods to learn the portfolios. In this mode, each decision is
independent of previous decisions. In particular, we adopt such a mode in one proposed
algorithm, which deploys nonparametric learning (or instance-based learning, or
case-based learning; Aha 1991; Aha et al. 1991; Cherkassky and Mulier 1998). With
an effective trading principle, such a mode can achieve the goal of our project.
The second methodology is online learning (or incremental learning), where the
model is trained from a single instance at a time in a sequential manner (Shalev-Shwartz 2012;
Loveless et al. 2013). Online learning is the process of solving a sequence of
problems, given the (possibly partial) solutions to previous problems and possibly additional
side information. This definition naturally fits our problem, which is innately online.
Contrary to the batch mode, in this mode, one decision is often connected to previous
decisions. In particular, in the remaining three of the four algorithms, we adopt two
types of online learning techniques (Crammer et al. 2006, 2008, 2009; Dredze et al.
2008) to solve the problem. Besides, to achieve the target of our project, it is also
important to exploit an effective trading principle when designing a specific strategy.
In this book, we will introduce a variety of classical and modern trading principles
that are commonly used for designing OLPS strategies.
After designing a trading strategy, we need to evaluate the effectiveness of the
proposed strategy using a back-test methodology. In particular, we feed the historical
market data into the testbed to evaluate the strategy and examine how it performs.
Through an extensive set of evaluation and analysis of the back-testing performance,
we can decide how likely the proposed trading strategy may survive in real-life appli-
cations. In this book, we developed an open-source back-testing system, named Online
Portfolio Selection, which allows us to benchmark empirical performances of differ-
ent strategies and algorithms on the same platform. Throughout the book, all the
algorithms and strategies will be evaluated on this platform.

1.4 Book Overview


This book consists of five parts: introduction, principles, algorithms,
empirical studies, and conclusion. Figure 1.1 gives an overview of the organization
of the parts and chapters. The major contents covered in each part and each chapter
are given below.

[Figure 1.1: diagram of the book organization. Part I, Introduction: problem formulation. Part II, Principles: benchmarks, follow the winner, follow the loser, pattern matching–based, meta-algorithms. Part III, Algorithms: CORN (correlation-driven nonparametric learning), PAMR (passive–aggressive mean reversion), CWMR (confidence-weighted mean reversion), OLMAR (online moving average reversion). Part IV, Empirical studies: implementations, empirical results, threats of validity. Part V: Conclusion.]

Figure 1.1 Book organization.

Part I introduces the background, motivations, and basic definitions of the OLPS
problem. Specifically, Chapter 1 introduces the background of computational finance,
algorithmic trading, and machine learning and their connections to OLPS. Chapter 2
formally formulates the problem of OLPS as a scientific task.
Part II summarizes the main principles and algorithms of OLPS. In particular,
Chapter 3 introduces a family of strategies commonly known as the benchmark
principles for OLPS. Chapter 4 introduces the principle of “follow the winner,” which
covers strategies that exploit the “trend following” assumption for
investment. Chapter 5 introduces the principle of “follow the loser,” which
covers strategies that exploit the “mean reversion” assumption for investment.
Chapter 6 introduces the principle of pattern matching for OLPS. Finally, Chapter 7
introduces the principle of meta-learning, which attempts to explore the combination
of multiple principles and strategies for OLPS.
Part III proposes four OLPS algorithms belonging to two categories, that is,
the pattern matching–based approach and follow the loser approach. The first algo-
rithm is a pattern-matching algorithm, “CORrelation-driven Nonparametric learning”
(CORN), in Chapter 8. The other three algorithms are mean reversion algorithms.
That is, we propose the “passive–aggressive mean reversion” (PAMR) algorithm
in Chapter 9, the “confidence-weighted mean reversion” (CWMR) algorithm in
Chapter 10, and the “online moving average reversion” (OLMAR) in Chapter 11.
Part IV presents our empirical studies. Chapter 12 introduces the method of empir-
ical studies, and Chapter 13 extensively evaluates the proposed algorithms on real
datasets and compares with a set of existing algorithms. Chapter 14 defends the
methodologies used in the model setting and empirical studies. Finally, Chapter 15
concludes the book with some future directions.

Chapter 2

Problem Formulation

This chapter introduces the problem setting of online portfolio selection (OLPS)
and formally formulates the problem mathematically as a sequential decision task.
We further relax the problem setting by adding two practical constraints: transaction
costs and margin buying. Finally, we introduce the idea of how to evaluate a strategy’s
performance.
Specifically, this chapter is organized as follows. Section 2.1 formally formulates
the OLPS task as a sequential decision problem. Section 2.2 relaxes the transaction
costs and margin buying constraints. Section 2.3 introduces several evaluation metrics
for the task. Finally, Section 2.4 summarizes this chapter.

2.1 Problem Settings


Consider an OLPS task: assume an investor aims to invest his capital on a finite
number of \(m \ge 2\) investment assets∗ for a finite number of \(n \ge 1\) trading periods.†
At the t-th period (\(t = 1, \ldots, n\)), the asset (close) prices are represented by a
vector \(\mathbf{p}_t \in \mathbb{R}_+^m\), and each element \(p_{t,i}\), \(i = 1, \ldots, m\), represents the close price of
asset i. Their price changes are represented by a price relative vector \(\mathbf{x}_t \in \mathbb{R}_+^m\), each
component of which denotes the ratio of the t-th close price to the last close price,
that is, \(x_{t,i} = \frac{p_{t,i}}{p_{t-1,i}}\).‡ Thus, an investment in asset i throughout period t increases
by a factor of \(x_{t,i}\).§ Let us denote \(\mathbf{x}_1^n = \{\mathbf{x}_1, \ldots, \mathbf{x}_n\}\) as a sequence of price relative
vectors for n periods, and \(\mathbf{x}_s^e = \{\mathbf{x}_s, \ldots, \mathbf{x}_e\}\), \(1 \le s < e \le n\), as a market window of
price relative vectors ranging from period s to period e.
An investment in the market for the t-th period is specified by a portfolio vector
\(\mathbf{b}_t = (b_{t,1}, \ldots, b_{t,m})\), where \(b_{t,i}\), \(i = 1, \ldots, m\), represents the proportion of wealth
invested in asset i at the beginning of the t-th period. Typically, the portfolio is
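As a small illustration of the price relative definition, the following Python sketch converts a table of close prices into price relative vectors. The function name `price_relatives` and the sample prices are hypothetical and serve only as an example; row 0 of the input plays the role of the assumed initial price \(p_{0,i}\).

```python
from typing import List


def price_relatives(prices: List[List[float]]) -> List[List[float]]:
    """Convert close prices p_0, ..., p_n into price relatives x_1, ..., x_n.

    prices[t][i] is the close price of asset i at period t (row 0 holds p_0).
    Each output entry is x_{t,i} = p_{t,i} / p_{t-1,i}.
    """
    relatives = []
    for prev, curr in zip(prices, prices[1:]):
        relatives.append([c / p for p, c in zip(prev, curr)])
    return relatives


# Hypothetical close prices for m = 2 assets over periods 0, 1, 2.
prices = [[10.0, 20.0], [11.0, 19.0], [12.1, 19.0]]
x = price_relatives(prices)
print(x)  # two price relative vectors, one per trading period
```

A value above 1 means the asset gained over the period, and a value below 1 means it lost.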

∗ When m = 1, the problem is reduced to single-stock trading, which is out of the scope of this book.
†A period can be a week, a day, an hour, or even a second in high-frequency trading.
‡ Here we adopt simple gross return, while one may choose simple net return, i.e., \(\frac{p_{t,i} - p_{t-1,i}}{p_{t-1,i}}\).
For the calculation of the first period, suppose we have \(p_{0,i}\).
§ For example, \(x_{t,i} = 2\) means that the investment on an asset will increase by 100%, or double its
initial investment. \(x_{t,i} = 1\) means that the capital will remain at its initial level.

self-financed and no margin/short is allowed; therefore, each entry of a portfolio
is nonnegative and adds up to one, that is, \(\mathbf{b}_t \in \Delta_m\), where \(\Delta_m = \left\{\mathbf{b}_t : \mathbf{b}_t \succeq 0, \sum_{i=1}^{m} b_{t,i} = 1\right\}\).∗
The investment procedure is represented by a portfolio strategy,
that is, \(\mathbf{b}_1 = \left(\frac{1}{m}, \ldots, \frac{1}{m}\right)\), and the following sequence of mappings:

\[\mathbf{b}_t : \mathbb{R}_+^{m(t-1)} \to \Delta_m, \quad t = 2, 3, \ldots,\]

where \(\mathbf{b}_t = \mathbf{b}_t\left(\mathbf{x}_1^{t-1}\right)\) is the portfolio determined at the beginning of the t-th period
upon observing past market behaviors. We denote by \(\mathbf{b}_1^n = \{\mathbf{b}_1, \ldots, \mathbf{b}_n\}\) the strategy
for n periods, which is the output of an OLPS strategy.
At the t-th period, a portfolio \(\mathbf{b}_t\) produces a portfolio period return \(s_t\), that is, the
wealth increases by a factor of \(s_t = \mathbf{b}_t^\top \mathbf{x}_t = \sum_{i=1}^{m} b_{t,i} x_{t,i}\).† Since we reinvest and
adopt relative prices, the wealth would grow multiplicatively. Thus, after n periods,
a portfolio strategy \(\mathbf{b}_1^n\) will produce a portfolio cumulative wealth of \(S_n\), which
increases the initial wealth by a factor of \(\prod_{t=1}^{n} \mathbf{b}_t^\top \mathbf{x}_t\), that is,

\[S_n(\mathbf{b}_1^n, \mathbf{x}_1^n) = S_0 \prod_{t=1}^{n} \mathbf{b}_t^\top \mathbf{x}_t,\]

where \(S_0\) denotes initial wealth and is usually set to $1 for convenience.


We present the framework of the above task in Protocol 2.1. In this task, a portfolio
manager’s goal is to produce a portfolio strategy (\(\mathbf{b}_1^n\)) upon the market price relatives
(\(\mathbf{x}_1^n\)), aiming to achieve certain targets. The manager computes the portfolios in a
sequential manner. For each period t, the manager has access to the sequence of
past price relative vectors \(\mathbf{x}_1^{t-1}\). The manager computes a new portfolio \(\mathbf{b}_t\) for the next
price relative vector \(\mathbf{x}_t\), where the decision criterion varies among different managers.
Then traders will rebalance to the new portfolio, via buying and selling the underlying
stocks. At the end of a trading day, the market will reveal \(\mathbf{x}_t\). The resulting portfolio \(\mathbf{b}_t\)
is scored based on portfolio period return \(s_t\). This procedure is repeated until the final
period, and the portfolio strategy is scored by its portfolio cumulative wealth \(S_n\).
It is important to note that we make several general and common assumptions in
the above model:
1. Transaction cost: no explicit or implicit transaction costs‡ exist.
2. Market liquidity: one can buy and sell the required amount, even fractional, at the
last close price of any given trading period.
3. Market impact: any portfolio selection strategy shall not influence the market or
any other stocks’ prices.
∗ \(\mathbf{b}_t \succeq 0\) denotes that each element of the vector is nonnegative.
† For example, Bin buys MSFT with 50% of his capital ($5000), GS with 30% of his capital ($3000),
and a T-bill with the remaining 20% ($2000). If MSFT goes up by a factor of 2, GS goes down by a
factor of 0.5, and the T-bill remains at 1, then his capital will increase by a factor of 0.5 × 2 + 0.3 × 0.5 +
0.2 × 1 = 1.35, or increase by 35%.
‡ Explicit costs include commissions, taxes, stamp duties, and fees. Implicit costs include the bid–ask
spread, opportunity costs, and slippage costs.


Protocol 2.1: Online portfolio selection.

Input: \(\mathbf{x}_1^n\): Historical market price relative sequence
Output: \(S_n\): Final cumulative wealth

Initialize \(S_0 = 1\), \(\mathbf{b}_1 = \left(\frac{1}{m}, \ldots, \frac{1}{m}\right)\)
for \(t = 1, 2, \ldots, n\) do
    Portfolio manager learns a portfolio \(\mathbf{b}_t\);
    Market reveals a price relative vector \(\mathbf{x}_t\);
    Portfolio incurs period return \(s_t = \mathbf{b}_t^\top \mathbf{x}_t\) and updates cumulative return \(S_t = S_{t-1} \times \left(\mathbf{b}_t^\top \mathbf{x}_t\right)\);
    Portfolio manager updates his or her decision rules;
end
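The loop of Protocol 2.1 can be sketched in a few lines of Python. This is a minimal illustration rather than the book's back-testing system: the names `run_olps` and `keep_uniform` are hypothetical, and the update rule shown (always returning the same portfolio) is only a placeholder for whatever learning rule a manager might use.

```python
from typing import Callable, List, Sequence

Portfolio = List[float]


def run_olps(relatives: Sequence[Sequence[float]],
             update: Callable[[Portfolio, Sequence[float]], Portfolio]) -> List[float]:
    """Run the protocol loop and return the wealth path [S_1, ..., S_n], with S_0 = 1."""
    m = len(relatives[0])
    b = [1.0 / m] * m   # b_1 = (1/m, ..., 1/m)
    wealth = 1.0        # S_0 = 1
    path = []
    for x in relatives:  # the market reveals x_t only after b_t is fixed
        s = sum(bi * xi for bi, xi in zip(b, x))  # period return s_t = b_t^T x_t
        wealth *= s                               # S_t = S_{t-1} * s_t
        path.append(wealth)
        b = update(b, x)  # decision rule produces b_{t+1} from what has been observed
    return path


def keep_uniform(b: Portfolio, x: Sequence[float]) -> Portfolio:
    """Hypothetical placeholder rule: rebalance back to the same portfolio."""
    return list(b)


# Two assets that alternately double and halve.
path = run_olps([[2.0, 0.5], [0.5, 2.0]], keep_uniform)
print(path)  # each period return is 0.5*2 + 0.5*0.5 = 1.25, so S_2 = 1.25 * 1.25
```

Passing the update rule as a function keeps the protocol loop separate from the strategy, which mirrors how the protocol leaves the decision criterion to each manager.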

The preceding assumptions are nontrivial. We will further analyze and discuss their
implications and effects for our empirical studies in Sections 13.4 and 14.1.
Finally, as we are going to design intelligent learning algorithms that fit the above
model, let us fix the objective of the proposed learning algorithms. For a portfolio
selection task, one can choose to maximize risk-adjusted return (Markowitz 1952;
Sharpe 1964) or to maximize cumulative return (Kelly 1956; Thorp 1971) at the end
of a period. Since the model is online and spans multiple periods, we choose
to maximize the cumulative return (Hakansson 1971),∗ which is also the objective of
most existing algorithmic studies.

2.2 Transaction Costs and Margin Buying Models


While our model is concise and simple to understand, it ignores some practical issues
in real trading scenarios. We now relax two constraints to address these issues.
In reality, an important and unavoidable issue is the transaction cost, which
includes the commission fees and taxes imposed by brokers and governments, dur-
ing the rebalance activities.† Note that the transaction cost is imposed by markets,
and a portfolio’s behavior cannot change the properties of transaction costs, such
as commission rates or tax rates. To handle the issue, the first way, which is com-
monly adopted by existing strategies, is that a portfolio selection model does not
take transaction costs into account, and the second way is to directly integrate the
costs in the model (Györfi and Vajda 2008). In this book, we take the first way and
adopt the simplified proportional transaction cost model (Blum and Kalai 1999;
Borodin et al. 2004).‡ To be specific, rebalancing a portfolio incurs transaction costs
∗ However, such an objective does not prohibit us from comparing different strategies via risk-adjusted
terms, such as the Sharpe ratio and Calmar ratio.
† Besides commission and taxes, some other factors, such as bid–ask spreads, also implicitly incur
transaction costs to a portfolio.
‡ Blum and Kalai (1999, p. 195) also provide the precise model, with which our algorithms’ per-
formance is almost the same as the simplified one, but the precise model costs much more time to
compute.

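on every buy and sell operation, based upon a transaction cost rate of \(\gamma \in (0, 1)\).
At the beginning of period t, the portfolio manager rebalances his or her wealth to
a new portfolio \(\mathbf{b}_t\), from the last close price adjusted portfolio \(\hat{\mathbf{b}}_{t-1}\), each component of
which is calculated as \(\hat{b}_{t-1,i} = \frac{b_{t-1,i} \times x_{t-1,i}}{\mathbf{b}_{t-1}^\top \mathbf{x}_{t-1}}\). Such a rebalance incurs a transaction cost
of \(\frac{\gamma}{2} \times \sum_{i=1}^{m} \left| b_{t,i} - \hat{b}_{t-1,i} \right|\), where the initial portfolio is set to (0, . . . , 0). Thus, the
cumulative wealth after n periods can be expressed as

\[S_n^{\gamma} = S_0 \prod_{t=1}^{n} \left[ (\mathbf{b}_t \cdot \mathbf{x}_t) \times \left( 1 - \frac{\gamma}{2} \times \sum_{i=1}^{m} \left| b_{t,i} - \hat{b}_{t-1,i} \right| \right) \right].\]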
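To make the proportional cost model concrete, the sketch below evaluates the cost-adjusted cumulative wealth for a given sequence of portfolios. The function name `wealth_with_costs` and the sample inputs are hypothetical; the code follows the formula as stated here, including the convention that the portfolio before the first period is (0, . . . , 0).

```python
from typing import List, Sequence


def wealth_with_costs(portfolios: Sequence[Sequence[float]],
                      relatives: Sequence[Sequence[float]],
                      gamma: float,
                      s0: float = 1.0) -> float:
    """Cumulative wealth under the simplified proportional transaction cost model."""
    m = len(relatives[0])
    b_hat: List[float] = [0.0] * m  # portfolio before period 1 is (0, ..., 0)
    wealth = s0
    for b, x in zip(portfolios, relatives):
        # Cost is charged on the total weight moved from b_hat to the new b_t.
        turnover = sum(abs(bi - hi) for bi, hi in zip(b, b_hat))
        s = sum(bi * xi for bi, xi in zip(b, x))  # period return b_t . x_t
        wealth *= s * (1.0 - gamma / 2.0 * turnover)
        # Price movement drifts the holdings: b_hat_{t,i} = b_{t,i} x_{t,i} / (b_t . x_t).
        b_hat = [bi * xi / s for bi, xi in zip(b, x)]
    return wealth


portfolios = [[0.5, 0.5], [0.5, 0.5]]
relatives = [[2.0, 1.0], [1.0, 1.0]]
no_cost = wealth_with_costs(portfolios, relatives, gamma=0.0)
with_cost = wealth_with_costs(portfolios, relatives, gamma=0.01)
print(no_cost, with_cost)  # the cost-adjusted wealth is slightly lower
```

In the second period the market is flat, so the only loss comes from rebalancing the drifted holding (2/3, 1/3) back to (1/2, 1/2).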

Another practical issue is margin buying, which allows the portfolio managers
to buy securities with cash borrowed from securities brokers, using their own equity
positions as collateral. Following existing studies (Cover 1991; Helmbold et al. 1998;
Agarwal et al. 2006), we relax this constraint and evaluate it empirically. We assume
the margin setting to be 50% down and a 50% loan,∗ at an annual interest rate
of 6% (equivalently, the corresponding daily interest rate of borrowing, c, is set to
0.000238). With such a setting, a new asset named “margin component” is generated
for each asset, and its price relative for period t equals \(2 \times x_{t,i} - 1 - c\). In the case
of \(x_{t,i} \le \frac{1+c}{2}\), which means the stock drops more than half, we simply set its
margin component to 0 (Li et al. 2012).† As a result, if margin buying is allowed, the
total number of assets becomes 2m. By adding such a “margin component,” we can
magnify both the potential profit and the potential loss on the i-th asset.‡
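The construction of the margin components can be sketched as follows. The helper name `add_margin_components` is hypothetical; the default interest rate is the daily value c = 0.000238 quoted in this section, and the clipping to 0 follows the rule described for large drops.

```python
from typing import List, Sequence


def add_margin_components(x: Sequence[float], c: float = 0.000238) -> List[float]:
    """Extend a price relative vector of length m to length 2m with margin components."""
    margin = []
    for xi in x:
        if xi <= (1.0 + c) / 2.0:
            margin.append(0.0)  # the stock lost more than half: margin component is set to 0
        else:
            margin.append(2.0 * xi - 1.0 - c)  # 50% down, 50% loan, less daily interest
    return list(x) + margin


extended = add_margin_components([1.1, 0.9])
print(extended)  # original relatives followed by their leveraged counterparts

ruined = add_margin_components([0.4])
print(ruined)    # the margin component is clipped to 0.0
```

A portfolio over the extended vector still sums to one, so existing algorithms can be applied to the 2m assets without modification.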

2.3 Evaluation
One standard criterion to evaluate an OLPS strategy is its portfolio cumulative wealth
at the end of trading periods. As we set the initial wealth \(S_0 = 1\), \(S_n\) also
denotes the portfolio cumulative return, which is the ratio of final portfolio cumulative
wealth to initial wealth. Another equivalent criterion, which considers the
compounding effect, is annualized percentage yield (APY), that is, \(\mathrm{APY} = \sqrt[y]{S_n} - 1\),
where y is the number of years corresponding to n periods.§ APY measures the average
wealth increment that a strategy could achieve in a year. Typically, the higher the
portfolio cumulative wealth or annualized percentage yield, the better the strategy’s
performance is.
Besides the absolute return metrics, it is also important to evaluate a strategy’s
risk and risk-adjusted return (Sharpe 1963, 1994). One common criterion is the annu-
alized standard deviation of portfolio period returns to measure volatility risk and
∗ That is, if one has $100 stock (down or collateral) one can borrow at most $100 cash (loan).
† Such a measure is not perfect since it manually changes the margin component, although this happens fewer than
5 times per dataset. One may refer to Györfi et al. (2012, Chapter 4) for other solutions to the possibility of ruin.
‡ For example, assume two assets with price relatives of (1.1, 0.9). After adjustment (ignoring the small interest c),
the price relative vector becomes (1.1, 0.9, 1.2, 0.8). Putting wealth on the latter two margin components magnifies
the portfolio’s profit or loss. That is, 10% profit (1.1) becomes 20% (1.1 × 2 − 1 − c) and 10% loss (0.9) also
becomes 20% (0.9 × 2 − 1 − c). Note that the portfolio vector representing the proportions of capital
still lies in a simplex.
§ One year consists of 252 trading days or 50 trading weeks.

the annualized Sharpe ratio (SR) (Sharpe 1966) to evaluate volatility risk-adjusted
return. To obtain the annualized standard deviation, we calculate the standard
deviation of daily returns and multiply by \(\sqrt{252}\).∗ For volatility risk-adjusted return, we
calculate the annualized SR as

\[\mathrm{SR} = \frac{\mathrm{APY} - R_f}{\sigma_p},\]

where \(R_f\) is the risk-free return† and \(\sigma_p\) is the annualized standard deviation. The
higher the annualized SR, the better the strategy’s (volatility) risk-adjusted return is.
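The return and volatility metrics above can be computed from a sequence of portfolio period returns \(s_t\). The sketch below is one possible reading of the definitions: the function names are hypothetical, the use of the sample (rather than population) standard deviation is an assumption, and the defaults of 252 periods per year and a 4% risk-free rate are taken from the footnotes.

```python
import math
import statistics
from typing import Sequence


def apy(final_wealth: float, n_periods: int, periods_per_year: int = 252) -> float:
    """Annualized percentage yield: the y-th root of S_n, minus 1."""
    years = n_periods / periods_per_year
    return final_wealth ** (1.0 / years) - 1.0


def annualized_std(period_returns: Sequence[float], periods_per_year: int = 252) -> float:
    """Standard deviation of period returns scaled by sqrt(periods per year).

    Assumption: the sample standard deviation is used.
    """
    return statistics.stdev(period_returns) * math.sqrt(periods_per_year)


def sharpe_ratio(period_returns: Sequence[float],
                 risk_free: float = 0.04,
                 periods_per_year: int = 252) -> float:
    """Annualized Sharpe ratio SR = (APY - R_f) / sigma_p, with S_0 = 1."""
    final_wealth = math.prod(period_returns)
    yield_ = apy(final_wealth, len(period_returns), periods_per_year)
    return (yield_ - risk_free) / annualized_std(period_returns, periods_per_year)


# Hypothetical daily returns covering exactly one trading year (252 days).
returns = [1.01, 0.99, 1.02, 1.0] * 63
two_year_apy = apy(1.21, 504)  # wealth of 1.21 after two years
sr = sharpe_ratio(returns)
print(two_year_apy, annualized_std(returns), sr)
```

For weekly data, `periods_per_year` would be changed to match the corresponding number of periods, as the footnote on other frequencies suggests.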
The portfolio management community often conducts drawdown analysis (Magdon-
Ismail and Atiya 2004) to measure the decline from a historical peak of portfolio
cumulative wealth. Formally, a strategy’s drawdown (DD) at period t is defined as
\(\mathrm{DD}(t) = \sup\left[0, \sup_{i \in (0,t)} S_i - S_t\right]\). Its maximum drawdown (MDD) is the maximum
of drawdowns over all periods and can effectively measure a strategy’s downside
risk. Formally, maximum drawdown for a horizon of n, MDD(n), is defined as

\[\mathrm{MDD}(n) = \sup_{t \in (0,n)} \left[\mathrm{DD}(t)\right].\]

Moreover, practitioners also adopt the Calmar ratio (CR) (Young 1991) to measure
a strategy’s drawdown risk-adjusted return:

\[\mathrm{CR} = \frac{\mathrm{APY}}{\mathrm{MDD}}.\]

The smaller the maximum drawdown, the more drawdown risk the strategy can
tolerate. The higher the Calmar ratio, the better the strategy’s (drawdown)
risk-adjusted return is.
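The drawdown definitions translate directly into a running-peak computation. The following sketch follows the formula as written here, where drawdown is measured in wealth units with \(S_0 = 1\); the function names are hypothetical. Some practitioners instead divide by the peak to obtain a relative drawdown, which this sketch does not do.

```python
from typing import List, Sequence


def drawdowns(wealth_path: Sequence[float]) -> List[float]:
    """DD(t) = max(0, running peak of S minus S_t) for each period t."""
    peak = wealth_path[0]
    result = []
    for s in wealth_path:
        peak = max(peak, s)      # highest wealth seen up to and including period t
        result.append(peak - s)  # nonnegative because peak >= s
    return result


def max_drawdown(wealth_path: Sequence[float]) -> float:
    """MDD(n): the largest drawdown over the whole horizon."""
    return max(drawdowns(wealth_path))


def calmar_ratio(apy_value: float, mdd: float) -> float:
    """CR = APY / MDD; returns infinity when there was no drawdown at all."""
    return float("inf") if mdd == 0 else apy_value / mdd


# Hypothetical cumulative wealth path.
path = [1.0, 1.2, 0.9, 1.1, 1.5]
dd = drawdowns(path)
mdd = max_drawdown(path)
print(dd, mdd, calmar_ratio(0.15, mdd))
```

The largest decline in this path is from the peak of 1.2 down to 0.9; the later rise to 1.5 sets a new peak and resets the drawdown to zero.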
To test whether simple luck can generate the return achieved by a strategy, portfo-
lio management practitioners (Grinold and Kahn 1999) can conduct statistical tests.
Since all test datasets are just samples of the market population, such tests can
validate a strategy for future use. We conduct a Student’s t-test to determine the likelihood
that the observed profitability is due to chance alone (under the assumption that a
strategy is not profitable in the population). Since the sample profitability is being
compared with no profitability, 0 is subtracted from the sample mean profit/loss. Note
that (daily) profit/loss equals (daily) return minus 1. The standard error of mean is cal-
culated as the standard deviation divided by square root of the number of periods. The
t-statistic is the sample profit mean‡ divided by the sample standard error. Finally,
the probability of the t-statistic can be calculated with degrees of freedom equal to the
number of periods minus 1. Note that the Student’s t-test assumes that the underlying
distribution of data is normal. According to the central limit theorem, as the sample
size increases, the distribution of the sample mean approaches normal. If a sample
∗ Here, 252 denotes the average number of annual trading days. For other frequencies, we can choose
their corresponding numbers.
† Typically, it equals the return of Treasury bills, and we fix it at 4% per year, or 0.000159 per day.
‡ Suppose we compare the sample profit mean with 0.

dataset contains a large number of trading transactions, which is often the case in our
empirical evaluations, we could regard the distribution of profit/loss as normal. The
smaller the probability, the higher the confidence we have toward the strategy.
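The steps of the test can be sketched with the Python standard library. The function names are hypothetical. One deliberate simplification is labeled here and in the code: instead of the Student's t distribution, the p-value uses a standard normal approximation, which is reasonable only for the large sample sizes discussed above; a statistics package would be needed for the exact t-distribution probability.

```python
import math
import statistics
from typing import Sequence, Tuple


def t_statistic(period_returns: Sequence[float]) -> Tuple[float, int]:
    """Return (t-statistic, degrees of freedom) for the mean profit/loss against 0."""
    profits = [r - 1.0 for r in period_returns]  # profit/loss = return minus 1
    n = len(profits)
    mean_profit = statistics.mean(profits)
    std_error = statistics.stdev(profits) / math.sqrt(n)  # standard error of the mean
    return mean_profit / std_error, n - 1


def approx_p_value(t_stat: float) -> float:
    """One-sided p-value under a NORMAL approximation (assumption: large sample).

    This replaces the Student's t distribution with the standard normal.
    """
    return 1.0 - statistics.NormalDist().cdf(t_stat)


# Hypothetical period returns; far too few for the approximation, shown only for the arithmetic.
returns = [1.01, 1.02, 0.99, 1.03, 1.00]
t_stat, dof = t_statistic(returns)
p = approx_p_value(t_stat)
print(t_stat, dof, p)
```

A small p-value suggests that the observed mean profit is unlikely to be due to chance alone under the assumption of no profitability.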

2.4 Summary
Online portfolio selection (OLPS) is a fundamental and practical computational
finance problem. It can be mathematically formulated as a sequential decision task
that aims to decide the best sequence of decisions to maximize the investment goals in
the long run. It has been extensively studied in the literature, and recent years have wit-
nessed a rapid growth of fruitful research achievements. The next part will introduce
a family of important principles widely used for solving this challenging task.



Part II

Principles

Existing online portfolio selection (OLPS) approaches follow the preceding problem
formulations and derive explicit portfolio update schemes. Table II.1 (Li and Hoi
2014) summarizes the main principles and several representative algorithms, and
four of the algorithms are illustrated in detail in later chapters. In particular, first
we introduce several benchmark algorithms. Then, we introduce three categories of
principles or algorithms with explicit portfolio update schemes, which are classified
according to the directions of weight transfer. The first approach, follow the winner,
increases the weights of more successful experts or stocks, often based on their his-
torical performance. Contrarily, the second approach, follow the loser, increases the
weights of less successful experts or stocks, or transfers the weights from winners
to losers. The third category, the pattern matching–based approach, constructs
portfolios based on similar historical patterns and has no explicit direction of weight
transfer. Finally, we survey some related meta-algorithms applying to a set of experts,
each of which is equipped with any algorithm in the preceding three categories.
This part is organized as follows. Chapter 3 surveys the benchmarks used in this
study. Chapter 4 surveys the first principle, follow the winner. Chapter 5 surveys the
second principle, follow the loser. Then, Chapter 6 introduces the third principle, pat-
tern matching–based approaches. The final principle, meta-algorithms, is introduced
in Chapter 7.

Table II.1 Principles and representative online portfolio selection algorithms

Classifications      Algorithms                                           Representative References
Benchmarks           Buy and Hold
                     Best Stock
                     Constant Rebalanced Portfolios                       Kelly (1956); Cover (1991)
Follow the winner    Universal Portfolios                                 Cover (1991)
                     Exponential Gradient                                 Helmbold et al. (1998)
                     Follow the Leader                                    Gaivoronski and Stella (2000)
                     Follow the Regularized Leader                        Agarwal et al. (2006)
                     Aggregating-Type Algorithms                          Vovk and Watkins (1998)
Follow the loser     Anticorrelation                                      Borodin et al. (2004)
                     Passive–Aggressive Mean Reversion                    Li et al. (2012)
                     Confidence Weighted Mean Reversion                   Li et al. (2013)
                     Online Moving Average Reversion                      Li et al. (2015)
                     Robust Median Reversion                              Huang et al. (2013)
Pattern matching–    Nonparametric Histogram Log-Optimal Strategy         Györfi et al. (2006)
based approaches     Nonparametric Kernel-Based Log-Optimal Strategy      Györfi et al. (2006)
                     Nonparametric Nearest Neighbor Log-Optimal Strategy  Györfi et al. (2008)
                     Correlation-Driven Nonparametric Learning Strategy   Li et al. (2011a)
                     Nonparametric Kernel-Based Semi-Log-Optimal Strategy Györfi et al. (2007)
                     Nonparametric Kernel-Based Markowitz-Type Strategy   Ottucsák and Vajda (2007)
                     Nonparametric Kernel-Based GV-Type Strategy          Györfi and Vajda (2008)
Meta-algorithms      Aggregating Algorithm                                Vovk (1990); Vovk and Watkins (1998)
                     Fast Universalization Algorithm                      Akcoglu et al. (2005)
                     Online Gradient Updates                              Das and Banerjee (2011)
                     Online Newton Updates                                Das and Banerjee (2011)
                     Follow the Leading History                           Hazan and Seshadhri (2009)
Source: Li and Hoi (2014).



Chapter 3

Benchmarks

3.1 Buy-and-Hold Strategy


The most common baseline is the Buy-and-Hold (BAH) strategy, in which one invests
wealth in the market with an initial portfolio of \(\mathbf{b}_1\) and holds the portfolio till the
end. The manager only buys the assets at the beginning of the first period and does not
rebalance in subsequent periods, while the portfolio holdings are implicitly changed
following the market fluctuations. In particular, at the end of period t, the portfolio
holding becomes \(\frac{\mathbf{b}_t \odot \mathbf{x}_t}{\mathbf{b}_t^\top \mathbf{x}_t}\), where \(\odot\) denotes the element-wise product.∗ BAH’s
final cumulative wealth is the initial portfolio-weighted average of individual asset
returns, that is,

\[S_n(\mathrm{BAH}(\mathbf{b}_1)) = \mathbf{b}_1 \cdot \left( \bigodot_{t=1}^{n} \mathbf{x}_t \right),\]

where \(\mathbf{b} \cdot \mathbf{x}\) denotes the inner product \(\mathbf{b}^\top \mathbf{x}\). The BAH strategy with uniform portfolio
\(\mathbf{b}_1 = \left(\frac{1}{m}, \ldots, \frac{1}{m}\right)\) is referred to as the uniform BAH strategy, which is usually adopted
as a market strategy to produce a market index.†
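The two BAH quantities, the drifted holding after one period and the final cumulative wealth, can be sketched as below. The function names `drifted_holding` and `bah_wealth` are hypothetical, and the inputs are illustrative.

```python
from typing import List, Sequence


def drifted_holding(b: Sequence[float], x: Sequence[float]) -> List[float]:
    """Holding at the end of a period without rebalancing: (b ⊙ x) / (b^T x)."""
    s = sum(bi * xi for bi, xi in zip(b, x))
    return [bi * xi / s for bi, xi in zip(b, x)]


def bah_wealth(b1: Sequence[float], relatives: Sequence[Sequence[float]]) -> float:
    """S_n(BAH(b_1)): weighted average of each asset's cumulative price relative."""
    growth = [1.0] * len(b1)
    for x in relatives:
        growth = [g * xi for g, xi in zip(growth, x)]  # element-wise product over periods
    return sum(bi * gi for bi, gi in zip(b1, growth))


# One asset doubles while the other stays flat.
holding = drifted_holding([0.5, 0.5], [2.0, 1.0])
print(holding)  # weights shift toward the asset that rose

wealth = bah_wealth([0.5, 0.5], [[2.0, 1.0], [2.0, 1.0]])
print(wealth)   # 0.5 * 4 + 0.5 * 1 = 2.5
```

The first call reproduces the drift from equal weights to roughly two-thirds and one-third when one asset doubles and the other is unchanged.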

3.2 Best Stock Strategy


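Another common benchmark is the Best Stock (Best) strategy, which is a special BAH
strategy that invests all capital on the best stock in hindsight. Its initial portfolio \(\mathbf{b}^{\circ}\) can
be calculated as \(\mathbf{b}^{\circ} = \arg\max_{\mathbf{b} \in \Delta_m} \mathbf{b} \cdot \left( \bigodot_{t=1}^{n} \mathbf{x}_t \right)\), which is thus a hindsight strategy.
The strategy’s final cumulative wealth equals

\[S_n(\mathrm{Best}) = \max_{\mathbf{b} \in \Delta_m} \mathbf{b} \cdot \left( \bigodot_{t=1}^{n} \mathbf{x}_t \right) = S_n(\mathrm{BAH}(\mathbf{b}^{\circ})).\]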

∗ For example, assuming two assets with a price relative vector of (2, 1) and a portfolio at the beginning of
a period of (0.5, 0.5), the actual weights at the end of the period become \(\frac{(0.5 \times 2,\ 0.5 \times 1)}{0.5 \times 2 + 0.5 \times 1} = (0.67, 0.33)\).
† Market index can also be calculated using other methods, such as capitalization weighted index and
market share weighted index.

3.3 Constant Rebalanced Portfolios
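One challenging benchmark is the Constant Rebalanced Portfolios (CRP) strategy,
which rebalances to a fixed portfolio \(\mathbf{b}\) every period.∗ In particular, the portfolio
strategy can be represented as \(\mathbf{b}_1^n = \{\mathbf{b}, \mathbf{b}, \ldots, \mathbf{b}\}\), in which \(\mathbf{b}\) is a predefined portfolio.
Thus, CRP’s final cumulative portfolio wealth can be calculated as

\[S_n(\mathrm{CRP}(\mathbf{b})) = \prod_{t=1}^{n} \mathbf{b}^\top \mathbf{x}_t.\]

One special CRP with uniform portfolio \(\mathbf{b} = \left(\frac{1}{m}, \ldots, \frac{1}{m}\right)\) is named Uniform
Constant Rebalanced Portfolios (UCRP). Another special CRP is the optimal offline†
CRP strategy, whose portfolio can be calculated as

\[\mathbf{b}^{\star} = \arg\max_{\mathbf{b}^n \in \Delta_m} S_n(\mathrm{CRP}(\mathbf{b})) = \arg\max_{\mathbf{b} \in \Delta_m} \prod_{t=1}^{n} \mathbf{b}^\top \mathbf{x}_t,\]

which, after taking logarithms, is a convex optimization problem and can be efficiently solved. The CRP with \(\mathbf{b}^{\star}\) is denoted as Best
Constant Rebalanced Portfolios (BCRP), which achieves a final cumulative wealth of

\[S_n(\mathrm{BCRP}) = \max_{\mathbf{b} \in \Delta_m} S_n(\mathrm{CRP}(\mathbf{b})) = S_n(\mathrm{CRP}(\mathbf{b}^{\star})).\]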

Note that BCRP is a hindsight strategy, which can only be calculated with com-
plete market sequences. Cover (1991) proved that BCRP is the best strategy in an
independent and identically distributed (i.i.d.) market and showed its benefits as a
target, that is, BCRP exceeds the Best Stock strategy, Value Line Index (geometric
mean of asset returns), and Dow Jones Index (arithmetic mean of asset returns, or
BAH). In addition, BCRP is invariant under permutations of market sequence, that
is, it does not depend on the order in which x1 , x2 , . . . , xn occur.
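A small numerical sketch can illustrate the CRP, Best Stock, and BCRP benchmarks together. The function names are hypothetical, and the BCRP search below is a brute-force grid over two-asset portfolios, chosen purely for readability; it is not the efficient convex solver referred to in the text. The market used is a stylized one in which the first asset is flat and the second alternately doubles and halves.

```python
import math
from typing import List, Sequence, Tuple


def crp_wealth(b: Sequence[float], relatives: Sequence[Sequence[float]]) -> float:
    """S_n(CRP(b)): product over periods of b^T x_t."""
    return math.prod(sum(bi * xi for bi, xi in zip(b, x)) for x in relatives)


def best_stock_wealth(relatives: Sequence[Sequence[float]]) -> float:
    """S_n(Best): cumulative wealth of the single best asset in hindsight."""
    m = len(relatives[0])
    return max(math.prod(x[i] for x in relatives) for i in range(m))


def bcrp_two_assets(relatives: Sequence[Sequence[float]],
                    steps: int = 1000) -> Tuple[List[float], float]:
    """Grid search for the best b = (w, 1 - w); an illustrative stand-in for a convex solver."""
    best_b, best_wealth = [0.0, 1.0], -1.0
    for k in range(steps + 1):
        w = k / steps
        wealth = crp_wealth([w, 1.0 - w], relatives)
        if wealth > best_wealth:
            best_b, best_wealth = [w, 1.0 - w], wealth
    return best_b, best_wealth


# Asset 1 is flat; asset 2 alternately doubles and halves (10 periods).
market = [[1.0, 2.0], [1.0, 0.5]] * 5
b_star, s_bcrp = bcrp_two_assets(market)
s_best = best_stock_wealth(market)
s_shuffled = crp_wealth(b_star, list(reversed(market)))
print(b_star, s_bcrp, s_best, s_shuffled)
```

Neither asset gains anything when held alone in this market, yet the rebalanced portfolio grows, and reversing the order of the periods leaves its wealth unchanged, consistent with the permutation invariance noted above.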
One desired theoretical result for an OLPS algorithm is universality (Cover 1991;
Ordentlich 2010). An algorithm Alg is universal if the average (external) regret
(Stoltz and Lugosi 2005; Blum and Mansour 2007) for n periods asymptotically
approaches 0, that is,

\[\frac{1}{n}\,\mathrm{regret}_n(\mathrm{Alg}) = \frac{1}{n}\left(\log S_n(\mathrm{BCRP}) - \log S_n(\mathrm{Alg})\right) \xrightarrow{\,n \to \infty\,} 0. \tag{3.1}\]

In other words, for an arbitrary sequence of price relatives, a universal algorithm
asymptotically approaches the same exponential growth rate as the BCRP strategy.
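The average regret in Equation 3.1 is straightforward to compute once both wealth values are known. In the sketch below, the function name `average_regret` is hypothetical, and the algorithm's wealth is a purely hypothetical one that trails BCRP by a factor of n + 1; it is used only to show how a polynomial gap in wealth translates into vanishing average regret.

```python
import math


def average_regret(s_bcrp: float, s_alg: float, n: int) -> float:
    """(1/n) * (log S_n(BCRP) - log S_n(Alg)), as in Equation 3.1."""
    return (math.log(s_bcrp) - math.log(s_alg)) / n


# Hypothetical scenario: BCRP grows by 12.5% per period, and the algorithm's
# wealth is smaller by a factor of (n + 1).
regrets = []
for n in (10, 100, 1000):
    s_bcrp = 1.125 ** n
    s_alg = s_bcrp / (n + 1)
    regrets.append(average_regret(s_bcrp, s_alg, n))
print(regrets)  # decreasing toward 0 as n grows
```

Because the gap enters only through log(n + 1)/n, the two strategies share the same exponential growth rate in the limit even though their wealth differs by a growing factor.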
Since CRP rebalances to a fixed portfolio each period, its frequent transactions
will incur high transaction costs. Helmbold et al. (1998) proposed a Semi-Constant
Rebalanced Portfolio, which rebalances on selected periods rather than every period.
∗ CRP differs from BAH as the former actively rebalances to a predefined portfolio for every period,
while the latter does not rebalance during the entire trading period. However, the portfolio holding of BAH
passively changes as the stock prices fluctuate.
† Contrary to the online case, offline assumes that all price relatives over the n periods are available.



Chapter 4

Follow the Winner

The first principle, follow the winner, is characterized by increasing the weights of
more successful experts or stocks. Rather than targeting market and best stock, algo-
rithms in this category often aim to track the BCRP strategy, that is, their target is to
be universal.
This chapter is organized as follows. Section 4.1 introduces Cover’s universal
portfolios (UP) algorithm, and Section 4.2 details the exponential gradient (EG) algo-
rithm. Sections 4.3 and 4.4 introduce the follow the leader (FTL) and follow the
regularized leader (FTRL) approaches, respectively. Finally, Section 4.5 summarizes
the follow the winner principle.

4.1 Universal Portfolios


The basic idea of Universal Portfolios–type (UP-type) algorithms is to assign the
capital to base experts of a single class, let the experts run, and finally pool their wealth.
They are analogous to the Buy-and-Hold (BAH) strategy. In particular, BAH’s base
experts belong to a special strategy class that invests in a single asset, and thus the
number of experts equals the number of assets. In other words, BAH buys individual
stocks, lets the stocks go, and finally pools their individual wealth. On the other hand,
the base experts in UP-type algorithms can come from any single strategy class that
invests over the whole market. UP-type algorithms are also similar to the meta-algorithms
(MA) in Chapter 7, although the latter apply to base experts of multiple classes.
Cover (1991) proposed the universal portfolio strategy, and Cover and Ordentlich
(1996) further refined the algorithm as the μ-weighted universal portfolio, in which μ
denotes a given distribution on the space of valid portfolios $\Delta_m$. Intuitively, Cover’s
UP operates similarly to a fund of funds (FOF),∗ and its main idea is to buy and hold
parameterized CRP strategies over the whole simplex domain. In particular, it initially
invests a proportion of wealth $d\mu(b)$ in each portfolio manager operating a CRP strategy
with $b \in \Delta_m$, and lets the CRP managers run. Then, at the end, each manager will
grow his wealth to $S_n(b)\,d\mu(b)$. Finally, Cover’s UP pools the individual experts’
wealth over the continuum of portfolio strategies. Note that $S_n(b) = e^{nW_n(b)}$, which
means that the portfolio grows at an exponential rate of $W_n(b)$.

∗ An FOF holds a portfolio of other investment funds, rather than directly investing in stocks, futures,
etc.

Formally, its update scheme (Cover and Ordentlich 1996, Definition 1) can
be interpreted as a historical performance-weighted average of all valid constant
rebalanced portfolios,

$$b_{t+1} = \frac{\int_{\Delta_m} b\, S_t(b)\, d\mu(b)}{\int_{\Delta_m} S_t(b)\, d\mu(b)}.$$

Note that at the beginning of period t + 1, one CRP manager’s wealth (historical
performance) equals $S_t(b)\,d\mu(b)$. Incorporating the initial wealth of $S_0 = 1$, the final
cumulative wealth is the weighted average of CRP managers’ wealth (Cover and
Ordentlich 1996, Eq. (24)):

$$S_n(\mathrm{UP}) = \int_{\Delta_m} S_n(b)\, d\mu(b). \tag{4.1}$$

One special case is when μ is the uniform distribution, for which the portfolio update
reduces to Cover’s UP (Cover 1991, Eq. (1.3)). Another special case is the
Dirichlet$\left(\frac{1}{2}, \ldots, \frac{1}{2}\right)$-weighted UP (Cover and Ordentlich 1996), which is proved to
yield a better allocation.
Alternatively, if a loss function is defined as the negative logarithmic function
of portfolio return, Cover’s UP is actually an exponentially weighted average
forecaster (Cesa-Bianchi and Lugosi 2006). The regret (Cover 1991) achieved by Cover’s
UP is $O(m \log n)$, and its time complexity is $O(n^m)$, where m denotes the number
of stocks and n refers to the number of periods. Cover and Ordentlich (1996) proved
that the Dirichlet$\left(\frac{1}{2}, \ldots, \frac{1}{2}\right)$-weighted UP has the same scale of regret bound, but a better
constant term (Cover and Ordentlich 1996, Theorem 2).
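
The performance-weighted average in the update scheme can be illustrated numerically. The following is a minimal sketch, assuming a plain Monte Carlo approximation of the integrals with portfolios drawn uniformly from the simplex (that is, μ uniform); it is not the authors’ implementation, and the names `crp_wealth`, `sample_simplex`, and `up_portfolio` are hypothetical.

```python
import random


def crp_wealth(b, price_relatives):
    """Cumulative wealth S_t(b) of a CRP with weights b, starting from S_0 = 1."""
    wealth = 1.0
    for x in price_relatives:
        wealth *= sum(bi * xi for bi, xi in zip(b, x))
    return wealth


def sample_simplex(m, rng):
    """Draw one portfolio uniformly from the simplex via normalized exponentials."""
    draws = [rng.expovariate(1.0) for _ in range(m)]
    total = sum(draws)
    return [d / total for d in draws]


def up_portfolio(price_relatives, m, num_samples=2000, seed=0):
    """Monte Carlo estimate of b_{t+1}: wealth-weighted average of sampled CRPs."""
    rng = random.Random(seed)
    numerator = [0.0] * m   # approximates the integral of b * S_t(b)
    denominator = 0.0       # approximates the integral of S_t(b)
    for _ in range(num_samples):
        b = sample_simplex(m, rng)
        s = crp_wealth(b, price_relatives)
        for i in range(m):
            numerator[i] += b[i] * s
        denominator += s
    return [v / denominator for v in numerator]


# Hypothetical two-asset history in which asset 0 doubles every period.
history = [(2.0, 1.0)] * 5
print(up_portfolio(history, m=2))
```

Because CRPs that put more weight on the better asset have accumulated more wealth, the estimate tilts toward that asset. The sampling error shrinks only slowly with `num_samples`, so this sketch is for intuition on small m rather than a substitute for the efficient schemes cited in this section.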
As Cover’s UP is based on an ideal market model, one research direction is to
extend the algorithm to handle various realistic assumptions. Cover and Ordentlich
(1996) considered side information, including experts’ opinions and fundamental
data. Cover and Ordentlich (1998) extended the algorithm to handle short selling and
margin, and Blum and Kalai (1999) took account of transaction costs.
Another research direction is to generalize Cover’s UP with different base classes,
rather than the CRP strategy. Jamshidian (1992) generalized the algorithm for
continuous time markets and presented its long-term performance. Vovk and Watkins
(1998) applied the aggregating algorithm (AA) (Vovk 1990) to a finite number of
arbitrary investment strategies, of which Cover’s UP becomes a specialized case
when applied to an infinite number of CRPs. Ordentlich and Cover (1998) analyzed
the minimal ratio of final wealth achieved by any nonanticipating investment strategy
to that of BCRP and presented a strategy to achieve such an optimal ratio. Cross and
Barron (2003) generalized Cover’s UP from the CRP strategy class to any parameterized
target class and proposed a computationally favorable universal strategy. Akcoglu
et al. (2005) extended Cover’s UP from a parameterized CRP class to a wide class
of investment strategies, including trading strategies operating on a single stock and
portfolio strategies operating on the whole stock market. Kozat and Singer (2011)
proposed a similar universal algorithm based on the class of semiconstant rebalanced
portfolios (Helmbold et al. 1998), which provides good asymptotic performance in
case of nonzero transaction costs.
Besides our intuitive analysis, various works have discussed the
connection between Cover’s UP and universal prediction (Feder et al. 1992), data
compression (Rissanen 1983), and Markowitz’s mean variance theory (Markowitz
1952). Algoet (1992) discussed a universal scheme for prediction, gambling, and
portfolio selection. Cover (1996) and Ordentlich (1996) discussed the connection
between UP selection and data compression. Belentepe (2005) presented a statistical
view of Cover’s UP strategy and claimed its approximate equivalence to the mean
variance portfolio theory.
Although Cover’s UP has a tight regret bound, its running time is exponential
in the number of stocks, which restricts its practical applicability. To handle the
computational issue, Kalai and Vempala (2002) presented an efficient implementation
based on rapidly mixing nonuniform random walks, improving the running time from
the original $O(n^m)$ to $O(m^7 n^8)$.

4.2 Exponential Gradient


Algorithms of the exponential gradient type (EG type) focus on the following
optimization formulation:

$$b_{t+1} = \arg\max_{b \in \Delta_m}\; \eta \log b \cdot x_t - R(b, b_t), \tag{4.2}$$

where R(b, bt ) denotes a regularization term and η > 0 denotes a learning rate. One
straightforward interpretation is to track the best stock in the last period while keeping
previous portfolio information via a regularization term.
Helmbold et al. (1998) proposed the EG strategy, which is based on the same
algorithm for mixture estimation (Helmbold et al. 1997). Following Equation 4.2,
EG adopts relative entropy as its regularization term, that is,

$$R(b, b_t) = \sum_{i=1}^{m} b_i \log \frac{b_i}{b_{t,i}}.$$

EG’s formulation is convex in b; however, it is hard to solve since the log function is
nonlinear. Thus, the authors adopted log’s first-order Taylor expansion at $b_t$, that is,

$$\log b \cdot x_t \approx \log(b_t \cdot x_t) + \frac{x_t}{b_t \cdot x_t} \cdot (b - b_t).$$

Then the nonlinear log term becomes linear and the optimization is easy to solve.
Solving the optimization, we can obtain EG’s update rule as

$$b_{t+1,i} = b_{t,i} \exp\left(\eta \frac{x_{t,i}}{b_t \cdot x_t}\right) \Big/ Z, \qquad i = 1, \ldots, m,$$

where Z denotes the normalization term such that the portfolio weights sum to 1.
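
The multiplicative update above translates almost directly into code. The following is a minimal sketch under the assumption that portfolios and price relatives are plain Python lists; the function name `eg_update` and the default learning rate are hypothetical choices for illustration.

```python
import math


def eg_update(b, x, eta=0.05):
    """One EG step: b_{t+1,i} is proportional to b_{t,i} * exp(eta * x_{t,i} / (b_t . x_t))."""
    portfolio_return = sum(bi * xi for bi, xi in zip(b, x))
    unnormalized = [bi * math.exp(eta * xi / portfolio_return)
                    for bi, xi in zip(b, x)]
    z = sum(unnormalized)  # normalization term Z
    return [u / z for u in unnormalized]


# Hypothetical period in which asset 0 doubles and asset 1 halves.
print(eg_update([0.5, 0.5], [2.0, 0.5]))
```

The weight of the asset with the larger price relative increases, and the size of the shift is controlled by η. With η = 0 the portfolio is left unchanged.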

Besides the multiplicative update rule (EG), the optimization problem can also
be solved using the gradient projection (GP) and expectation–maximization (EM)
(Helmbold et al. 1997). Rather than EG’s relative entropy, GP adopts an L2-norm
regularization, and EM adopts a $\chi^2$ regularization term, that is,

$$R(b, b_t) = \begin{cases} \dfrac{1}{2} \sum_{i=1}^{m} (b_i - b_{t,i})^2 & \text{GP} \\[2ex] \dfrac{1}{2} \sum_{i=1}^{m} \dfrac{(b_i - b_{t,i})^2}{b_{t,i}} & \text{EM.} \end{cases}$$

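Solving the corresponding optimization problems, we can obtain GP’s update rule as

$$b_{t+1,i} = b_{t,i} + \eta \left( \frac{x_{t,i}}{b_t \cdot x_t} - \frac{1}{m} \sum_{j=1}^{m} \frac{x_{t,j}}{b_t \cdot x_t} \right),$$

and EM’s update rule as

$$b_{t+1,i} = b_{t,i} \left( \eta \left( \frac{x_{t,i}}{b_t \cdot x_t} - 1 \right) + 1 \right).$$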

The latter can also be viewed as EG’s first-order approximation.
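
The GP and EM updates can be sketched in the same style. This is an illustrative sketch only: the names `gp_update` and `em_update` are hypothetical, and the GP step here omits any projection, so a large η could push a weight below zero.

```python
def _portfolio_return(b, x):
    """Period return b . x of portfolio b under price relatives x."""
    return sum(bi * xi for bi, xi in zip(b, x))


def gp_update(b, x, eta=0.05):
    """GP step: add eta times the gradient, centered so the weights still sum to 1."""
    r = _portfolio_return(b, x)
    grad = [xi / r for xi in x]
    mean_grad = sum(grad) / len(grad)
    return [bi + eta * (gi - mean_grad) for bi, gi in zip(b, grad)]


def em_update(b, x, eta=0.05):
    """EM step: b_{t+1,i} = b_{t,i} * (eta * (x_{t,i} / (b_t . x_t) - 1) + 1)."""
    r = _portfolio_return(b, x)
    return [bi * (eta * (xi / r - 1.0) + 1.0) for bi, xi in zip(b, x)]


# Hypothetical period in which asset 0 doubles and asset 1 halves.
print(gp_update([0.5, 0.5], [2.0, 0.5]))
print(em_update([0.5, 0.5], [2.0, 0.5]))
```

Both rules keep the weights summing to 1 without an explicit normalization term: the GP correction is centered at zero, and the EM factors average to 1 under the current weights.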


One key parameter for the EG-type algorithms is the learning rate η. To achieve a
universal regret bound, η has to be small. However, as η → 0, its update approaches
uniform,∗ which degrades to UCRP. Such an analysis will be empirically verified in
Section 13.6.
EG has a regret bound of $O(\sqrt{n \log m})$ and a running time of $O(mn)$. The regret
is not as tight as Cover’s UP; however, its running time, linear in both m and n, is
substantially better than that of UP. Besides, the authors also proposed a variant by
transforming all price relatives, which has a tight regret bound of $O(m \log n)$. Though
not officially proposed for online portfolio selection (Helmbold et al. 1997), GP can
straightforwardly achieve a regret of $O(\sqrt{mn})$, which is significantly worse than EG.
Das and Banerjee (2011) generalized EG-type algorithms to an MA named online
gradient updates, which combine underlying experts such that the overall system
performs no worse than any convex combination of its base experts.

4.3 Follow the Leader


FTL strategies directly track the BCRP up to time t:

$$b_{t+1} = b_t^* = \arg\max_{b \in \Delta_m} \sum_{\tau=1}^{t} \log(b \cdot x_\tau). \tag{4.3}$$

Intuitively, this category follows the BCRP leader over the known periods, and the
ultimate leader is the BCRP over all periods.
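
To make Equation 4.3 concrete, the BCRP to date can be approximated numerically. The sketch below is an assumption of this example rather than a solver from the works cited in this section: it uses a multiplicative fixed-point iteration for the log-optimal portfolio, and the name `bcrp_to_date` and the iteration count are hypothetical.

```python
def bcrp_to_date(price_relatives, iterations=500):
    """Approximate arg max over the simplex of sum_tau log(b . x_tau).

    Each pass applies b_i <- b_i * mean_tau(x_{tau,i} / (b . x_tau)),
    which keeps b on the simplex.
    """
    n = len(price_relatives)
    m = len(price_relatives[0])
    b = [1.0 / m] * m
    for _ in range(iterations):
        returns = [sum(bi * xi for bi, xi in zip(b, x)) for x in price_relatives]
        b = [bi * sum(x[i] / r for x, r in zip(price_relatives, returns)) / n
             for i, bi in enumerate(b)]
    return b


# FTL plays the BCRP of the history observed so far (hypothetical data).
history = [(1.0, 0.5), (1.0, 3.0)]
print(bcrp_to_date(history))
```

For this two-period history the maximizer can be checked by hand: setting the derivative of log(0.5 + 0.5 b) + log(3 − 2b) to zero gives b = 1/4 for the first asset, which the iteration approaches.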

∗ In case of η = 0, $b_{t+1,i} = b_{t,i} = \cdots = b_{1,i} = \frac{1}{m}$.

Ordentlich (1996, Chapter 4.4) briefly mentioned a strategy to obtain portfolios
by mixing the BCRP to date and the uniform portfolio:

$$b_{t+1} = \frac{t}{t+1}\, b_t^* + \frac{1}{t+1} \frac{1}{m} \mathbf{1}.$$

However, its worst-case regret bound is worse than that of Cover’s UP.
Gaivoronski and Stella (2000) proposed successive constant rebalanced portfolios
(SCRPs) and weighted successive constant rebalanced portfolios (WSCRPs)
for stationary markets. For each period, SCRP directly adopts the BCRP to date,
$b_{t+1} = b_t^*$. The authors further solved the optimal portfolio $b_t^*$ via stochastic
optimization (Birge and Louveaux 1997), resulting in the updates (Gaivoronski and Stella
2000, Algorithm 1). On the other hand, WSCRP outputs a convex combination of the
SCRP and the previous portfolio:

$$b_{t+1} = (1 - \gamma)\, b_t^* + \gamma\, b_t,$$

where γ ∈ [0, 1] represents a trade-off parameter. The regret bounds achieved by
SCRP and WSCRP are $O(m \log n)$, which is the same as that of Cover’s UP.
Rather than assuming a stationary market, some algorithms in this category assume
that the historical market is nonstationary. Gaivoronski and Stella (2000) proposed
variable rebalanced portfolios (VRP), which calculate the BCRP over the latest sliding
window. To be specific, VRP updates the portfolio as

$$b_{t+1} = \arg\max_{b \in \Delta_m} \sum_{\tau=t-W+1}^{t} \log(b \cdot x_\tau),$$

where W denotes a specified window size.


Gaivoronski and Stella (2003) further proposed adaptive portfolio selection
(APS). By changing the objective, APS can handle three portfolio selection tasks,
that is, adaptive Markowitz portfolio, log-optimal constant rebalanced portfolio, and
index tracking. To handle the transaction cost issue, they further proposed threshold
portfolio selection, which only rebalances the portfolio if the expected return of a new
portfolio exceeds that of the last portfolio by a threshold.

4.4 Follow the Regularized Leader


The FTRL approach adds a regularization term to Equation 4.3:

$$b_{t+1} = \arg\max_{b \in \Delta_m} \sum_{\tau=1}^{t} \log(b \cdot x_\tau) - \frac{\beta}{2} R(b), \tag{4.4}$$

where β denotes a trade-off parameter and R(b) is the regularization term on b. Note
that the first term includes all historical information; thus, the regularization term only
relates to the next portfolio, which is different from the EG algorithm. One typical
regularization is the L2-norm, that is, $R(b) = \|b\|^2$.

Agarwal et al. (2006) proposed the online Newton step (ONS), by solving the
optimization problem (4.4) with L2-norm regularization via online convex optimization
(Zinkevich 2003; Hazan 2006; Hazan et al. 2006, 2007). Similar to the regular
offline Newton method, the basic idea of ONS is to replace the log term via its
second-order Taylor expansion at $b_t$, and then solve for the closed-form updates. Finally, the
update rule of ONS is

$$b_1 = \left( \frac{1}{m}, \ldots, \frac{1}{m} \right), \qquad b_{t+1} = \Pi_{\Delta_m}^{A_t} \left( \delta A_t^{-1} c_t \right),$$

with $A_t = \sum_{\tau=1}^{t} \frac{x_\tau x_\tau^\top}{(b_\tau \cdot x_\tau)^2} + I_m$ and $c_t = \left(1 + \frac{1}{\beta}\right) \sum_{\tau=1}^{t} \frac{x_\tau}{b_\tau \cdot x_\tau}$, where β is a trade-off
parameter, δ is a scaling term, $I_m$ denotes the m × m identity matrix, and $\Pi_{\Delta_m}^{A_t}(\cdot)$ is
an exact projection to the simplex domain.
ONS’s regret bound is $O(m^{1.5} \log(mn))$, which is slightly worse than that of
Cover’s UP. Since it iteratively updates the first- and second-order information, it
costs $O(m^3)$ per period, which is independent of the number of periods. To sum up, its
total time cost is $O(m^3 n)$.
While FTRL focuses on worst-case investing, Hazan and Kale (2009, 2012)
linked worst-case investing with the widely used average-case investing model,
that is, the geometric Brownian motion (GBM) model (Bachelier 1900; Osborne 1959;
Cootner 1964). The authors designed an investment strategy that is universal in the
worst case and is capable of exploiting the GBM model. The algorithm, called
Exp-Concave-FTL, follows a similar formulation to ONS, that is,

$$b_{t+1} = \arg\max_{b \in \Delta_m} \sum_{\tau=1}^{t} \log(b \cdot x_\tau) - \frac{1}{2} \|b\|^2.$$

The optimization problem can be solved via online convex optimization,
which typically requires a high time complexity (i.e., similar to the ONS). If the stock
price follows the GBM model, the regret bound becomes $O(m \log Q)$, where Q is a
quadratic variability calculated as n − 1 times the sample variance of price relative
vectors. Since Q is typically much smaller than n, the regret bound is significantly
improved from the previous $O(m \log n)$.
Besides the improved regret bound, the authors also discussed the relationship
between their algorithm and trading frequency. The authors asserted that increasing
the trading frequency would decrease the variance of minimum-variance CRP, while
the regret stays the same. Therefore, it is expected to see improved performance
as the trading frequency increases, which is empirically observed by Agarwal et al.
(2006).
Das and Banerjee (2011) further extended the FTRL approach to a generalized MA
termed online Newton update (ONU), which guarantees that the overall performance
is no worse than any convex combination of the base experts.

4.5 Summary
Follow the winner is the main principle of online portfolio selection research. While
most algorithms in this category come with theoretical guarantees (regret bounds), their
empirical performance is not outstanding (cf. the empirical results in Chapter 13).
We believe that the main reason for this phenomenon is that their target in hindsight,
BCRP, assumes that price relatives are i.i.d., which may contradict
the empirical evidence of real markets. The next chapter will introduce a different
principle, follow the loser, which makes a different assumption regarding market
behaviors.

Chapter 5

Follow the Loser

The Best Constant Rebalanced Portfolios (BCRP) strategy is optimal if the market
is independent and identically distributed (i.i.d.; Cover 1991); however, this assumption
may not fit the real market and thus may lead to the inferior performance of the
“follow the winner” category. Rather than tracking the winners, the follow the loser
approach is often characterized by transferring the wealth from winners to losers. The
underlying assumption is the mean reversion (contrarian) idea (Bondt and Thaler
1985), which means that good (poor)-performing assets will perform poorly (well)
in the subsequent periods. Thus, follow the loser approaches are often characterized
by transferring capital from good-performing assets (winners) to poor-performing
assets (losers). Although this principle is heavily investigated in finance journals,
it has not been widely disseminated in the topic of online portfolio selection. However,
some algorithms do follow this principle. One famous example is the CRP
benchmark. Moreover, Cover’s UP, which buys and holds CRP strategies, can also be
viewed as a follow the loser approach from the underlying stocks’ perspective, while
we categorize it as follow the winner from the experts’ perspective.
This chapter is organized as follows. Section 5.1 illustrates the mean rever-
sion idea, which is the key underlying the “follow the loser” principle. Section 5.2
introduces a representative strategy in this category, or the Anticor strategy. Finally,
Section 5.3 summarizes the follow the loser principle.

5.1 Mean Reversion


Besides the momentum-related idea that assumes that the stock price will continue
its previous trend, there exists another different idea, or the mean reversion (con-
trarian) idea, which assumes that the assets’ prices will revert to their means. Thus,
the follow the loser algorithms will transfer the wealth from outperforming assets to
underperforming assets.
This section illustrates a simple but convincing example to show the mean reversion
idea. Consider a fluctuating market with two assets (A, B), and the price relative
sequence is $\left(\frac{1}{2}, 2\right), \left(2, \frac{1}{2}\right), \ldots$, where each asset is not going anywhere but actively
moving within a range (Table 5.1). Obviously, in the long run, a market strategy cannot
achieve any abnormal return since the cumulative wealth of each stock remains
the same after 2n periods. However, BCRP in hindsight can grow its wealth by a factor of
$\left(\frac{5}{4}\right)^n$ over n trading periods.

Table 5.1 Motivating example to show the mean reversion trading idea

Period #   Market (A,B)   BCRP         BCRP Return   Adjusted Weights   Notes
1          (1/2, 2)       (1/2, 1/2)   5/4           (1/5, 4/5)         B → A
2          (2, 1/2)       (1/2, 1/2)   5/4           (4/5, 1/5)         A → B
3          (1/2, 2)       (1/2, 1/2)   5/4           (1/5, 4/5)         B → A
...        ...            ...          ...           ...                ...

Now let us analyze the BCRP’s behaviors to show the underlying mean reversion
trading idea (Table 5.1). Suppose the initial portfolio is $\left(\frac{1}{2}, \frac{1}{2}\right)$ and at the end of
period 1, the close price adjusted portfolio distribution becomes $\left(\frac{1}{5}, \frac{4}{5}\right)$ and cumulative
wealth increases by a factor of $\frac{5}{4}$. At the beginning of period 2, the portfolio manager
rebalances to the initial portfolio $\left(\frac{1}{2}, \frac{1}{2}\right)$ by transferring the wealth from a better-performing
asset (B) to a worse-performing asset (A). At the beginning of period 3, the wealth
transfer with the mean reversion trading idea continues. Although the market strategy
gains nothing, BCRP can achieve a growth rate of $\frac{5}{4}$ per period with the underlying
mean reversion trading idea, which assumes that if one asset performs worse,
it tends to perform better in the subsequent trading period. It actually gains profit
via the volatility of the market, or so-called volatility pumping (Luenberger 1998,
Chapter 15).
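
The arithmetic of this example can be reproduced exactly with rational numbers. The sketch below is illustrative only: the function name `simulate` is hypothetical, and the market strategy is assumed to be a uniform buy-and-hold of the two assets.

```python
from fractions import Fraction


def simulate(num_periods):
    """Return (CRP wealth, buy-and-hold wealth) for the alternating two-asset market."""
    half = Fraction(1, 2)
    market = [(half, Fraction(2)), (Fraction(2), half)]  # price relatives (A, B)
    crp_wealth = Fraction(1)
    bah_holdings = [half, half]  # wealth held in A and B, never rebalanced
    for t in range(num_periods):
        x = market[t % 2]
        # CRP rebalances to (1/2, 1/2) before every period.
        crp_wealth *= half * x[0] + half * x[1]
        # Buy-and-hold lets each holding drift with its asset.
        bah_holdings = [h * xi for h, xi in zip(bah_holdings, x)]
    return crp_wealth, sum(bah_holdings)


for n in (1, 2, 4):
    print(n, simulate(n))
```

After every even number of periods the buy-and-hold wealth is back at 1, while the rebalanced portfolio has been multiplied by 5/4 in each period.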
Though extensive studies in finance show that mean reversion is a plausible idea
to be used in trading (Chan 1988; Poterba and Summers 1988; Lo and MacKinlay
1990; Conrad and Kaul 1998), its counterintuitive nature hides it from the OLPS
community. While the “follow the winner” strategies are sound in theory, they often
perform poorly when using real data, which will be shown in the empirical studies
in Part IV. Perhaps the reason is that their momentum principle does not fit the real
market, especially on the tested trading frequency (such as daily). It is thus natural
to utilize the mean reversion idea in developing new strategies so as to boost the
empirical performance.

5.2 Anticorrelation
Borodin et al. (2004) proposed a follow the loser strategy named Anticorrelation
(Anticor). Instead of making no distributional assumption like Cover’s UP, Anticor
assumes that the market follows the mean reversion principle. To exploit the property,
it statistically makes bets on the consistency of positive lagged cross-correlation and
negative autocorrelation.

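Anticor adopts logarithmic price relatives (Hull 1997) in two specific market
windows, that is, $y_1 = \log\left(x_{t-2w+1}^{t-w}\right)$ and $y_2 = \log\left(x_{t-w+1}^{t}\right)$. It then calculates a
cross-correlation matrix between $y_1$ and $y_2$,

$$M_{\mathrm{cov}}(i, j) = \frac{1}{w-1} \left( y_{1,i} - \bar{y}_1 \right)^\top \left( y_{2,j} - \bar{y}_2 \right),$$

$$M_{\mathrm{cor}}(i, j) = \begin{cases} \dfrac{M_{\mathrm{cov}}(i, j)}{\sigma_1(i) \times \sigma_2(j)} & \sigma_1(i), \sigma_2(j) \neq 0 \\[2ex] 0 & \text{otherwise.} \end{cases}$$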

Following the mean reversion principle, Anticor transfers weights from the assets
that increased more to the assets that increased less, and the corresponding amounts are
adjusted by the cross-correlation matrix. In particular, if asset i increases more than
asset j and they are positively correlated, Anticor claims a transfer from asset i to
j with the amount equaling the cross-correlation ($M_{\mathrm{cor}}(i, j)$) minus their negative
autocorrelations ($\min\{0, M_{\mathrm{cor}}(i, i)\}$ and $\min\{0, M_{\mathrm{cor}}(j, j)\}$). Finally, these claims
are normalized to keep the portfolio in the simplex domain.
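
The two matrices can be computed directly from the definitions. The sketch below covers only this correlation step, not the full Anticor transfer rule. It assumes each window is given as a list of w rows of m log price relatives, and that $\sigma_1$ and $\sigma_2$ are sample standard deviations with a w − 1 denominator; the function name `cross_correlation` is hypothetical.

```python
import math


def cross_correlation(y1, y2):
    """Return (M_cov, M_cor) between the columns of two w-by-m windows."""
    w, m = len(y1), len(y1[0])

    def column(y, i):
        return [row[i] for row in y]

    def mean(v):
        return sum(v) / len(v)

    def std(v):
        mu = mean(v)
        return math.sqrt(sum((a - mu) ** 2 for a in v) / (len(v) - 1))

    m_cov = [[0.0] * m for _ in range(m)]
    m_cor = [[0.0] * m for _ in range(m)]
    for i in range(m):
        c1 = column(y1, i)
        mu1, s1 = mean(c1), std(c1)
        for j in range(m):
            c2 = column(y2, j)
            mu2, s2 = mean(c2), std(c2)
            cov = sum((a - mu1) * (b - mu2) for a, b in zip(c1, c2)) / (w - 1)
            m_cov[i][j] = cov
            # Correlation is defined as 0 when either standard deviation vanishes.
            m_cor[i][j] = cov / (s1 * s2) if s1 != 0 and s2 != 0 else 0.0
    return m_cov, m_cor


# Hypothetical windows of log price relatives for two assets (w = 3).
y1 = [[0.1, -0.1], [-0.1, 0.1], [0.2, -0.2]]
y2 = [[0.1, -0.1], [-0.1, 0.1], [0.2, -0.2]]
print(cross_correlation(y1, y2)[1])
```

In this toy input the two assets move in exactly opposite directions, so the diagonal entries of the correlation matrix are close to 1 and the off-diagonal entries are close to −1.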
With the mean reversion nature, it is difficult to obtain a useful regret bound for
Anticor. Although heuristic and without theoretical guarantee, Anticor empirically
outperforms all other strategies at the time. On the other hand, though Anticor obtains
good performance, its heuristic nature cannot fully exploit mean reversion. Thus,
exploiting the property via systematic learning algorithms is highly desired, which
motivates one part of our research.

5.3 Summary
Although counterintuitive, the follow the loser principle is quite useful in obtaining
a high cumulative return in the empirical studies. This may be attributed to the fact
that many financial research studies have validated that the market behaviors follow
the mean reversion principle. Thus, to better exploit the market, a trading strategy has
to incorporate the market behaviors. We further propose three novel mean reversion-
based algorithms in Chapters 9, 10, and 11, respectively.

Chapter 6

Pattern Matching

Besides follow the winner and follow the loser, another category utilizes both
winners and losers, and it is based on pattern matching. This category mainly covers
nonparametric sequential investment strategies, which guarantee an optimal growth
of capital under minimal assumptions on the market, that is, stationarity and ergodicity
of the financial time series. Based on nonparametric prediction (Györfi and Schäfer
2003), this category consists of several pattern matching–based investment strategies
(Györfi et al. 2006, 2007, 2008; Li et al. 2011a). Note that in the data-mining
communities, some researchers focus on detecting important signals or patterns in
time series (Mcinish and Wood 1992; Berndt and Clifford 1994; Agrawal and Srikant
1995; Srikant and Agrawal 1996; Ting et al. 2006; Cañete et al. 2008; Du et al. 2009),
which is beyond our discussion.
In general, the pattern matching–based approaches (Györfi et al. 2006) consist of
two steps, that is, the sample selection and portfolio optimization steps. Suppose we are
choosing a portfolio for period t + 1. First, the sample selection step selects a set $C_t$ of
similar historical indices, whose corresponding price relatives will be used to predict
the next one. Then, each price relative vector $x_i$, $i \in C_t$, is assigned a probability
$P_i$. Existing methods often choose the uniform probability $P_i = \frac{1}{|C_t|}$, where $|\cdot|$
denotes the cardinality of a set. Second, the portfolio optimization step learns an
optimal portfolio based on the selected set, that is,

$$b_{t+1} = \arg\max_{b \in \Delta_m} U(b, C_t),$$

where $U(\cdot)$ is a specified utility function, such as log utility. In case of an empty
sample set, a uniform portfolio is adopted.
In this chapter, we concretize the sample selection step in Section 6.1 and the
portfolio optimization step in Section 6.2. We finally combine the two steps to formulate
specific online portfolio selection algorithms in Section 6.3. Based on the principle,
we further propose the correlation-driven nonparametric learning (CORN) algorithm
in Chapter 8.

6.1 Sample Selection Techniques
The general idea of this step is to select similar samples from historical price relatives
by comparing two preceding market windows. Suppose we are locating the price
relative vectors that are similar to the next vector $x_{t+1}$. The basic routine is to iterate over all
historical price relatives $x_i$, $i = w + 1, \ldots, t$, and count $x_i$ as one similar vector if its
preceding market window $x_{i-w}^{i-1}$ is similar to the latest market window $x_{t-w+1}^{t}$. A set
$C_t$ contains the indexes of similar price relatives. Note that the market window
is a w × m matrix and the similarity is typically calculated on the concatenated
wm-dimensional vectors. Algorithm 6.1 further illustrates the procedure.

Algorithm 6.1: Sample selection procedure ($C(x_1^t, w)$).

Input: $x_1^t$: Historical market sequence; w: window size.
Output: $C_t$: Index set of similar price relatives.
Initialize $C_t = \emptyset$;
if t ≤ w then
    return;
end
for i = w + 1, w + 2, . . . , t do
    if $x_{i-w}^{i-1}$ is similar to $x_{t-w+1}^{t}$ then
        $C_t = C_t \cup \{i\}$;
    end
end
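
Algorithm 6.1 can be written as a short function. This is a sketch under two assumptions: the history is a Python list whose element at position i − 1 holds $x_i$, and the notion of similarity is passed in as a caller-supplied predicate. The names `select_samples` and `is_similar` are hypothetical.

```python
def select_samples(history, w, is_similar):
    """Return the 1-based indices i whose preceding window resembles the latest one."""
    t = len(history)
    similar = []
    if t <= w:
        return similar
    latest = history[t - w:t]  # latest market window x_{t-w+1}^{t}
    for i in range(w + 1, t + 1):
        window = history[i - w - 1:i - 1]  # preceding window x_{i-w}^{i-1}
        if is_similar(window, latest):
            similar.append(i)
    return similar


# Hypothetical one-asset history that alternates between 1.0 and 2.0.
history = [(1.0,), (2.0,), (1.0,), (2.0,), (1.0,)]
print(select_samples(history, w=1, is_similar=lambda u, v: u == v))
```

With exact equality as the similarity test, the selected indices are those of price relatives that followed a window identical to the latest one, so the corresponding $x_i$ serve as candidates for the next price relative.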

A nonparametric histogram-based sample selection (Györfi and Schäfer 2003)
predefines a set of discretized partitions, partitions both the latest market window
($x_{t-w+1}^{t}$) and historical market windows ($x_{i-w}^{i-1}$, $i = w + 1, \ldots, t$), and finally
chooses price relatives $x_i$ whose preceding market window ($x_{i-w}^{i-1}$) is in the same
partition as $x_{t-w+1}^{t}$. In particular, given a partition $P = \{A_j,\ j = 1, 2, \ldots, d\}$, which
discretizes $\mathbb{R}_+^m$ into d disjoint sets, and a corresponding discretization function
$G(x) = j$ for $x \in A_j$, we can define the similarity set as

$$C_H\left(x_1^t, w\right) = \left\{ w < i < t + 1 : G\left(x_{t-w+1}^{t}\right) = G\left(x_{i-w}^{i-1}\right) \right\}.$$
Nonparametric kernel-based sample selection (Györfi et al. 2006) identifies the
similarity set by evaluating the Euclidean distance between two market windows,

$$C_K\left(x_1^t, w\right) = \left\{ w < i < t + 1 : \left\| x_{t-w+1}^{t} - x_{i-w}^{i-1} \right\| \le \frac{c}{\ell} \right\},$$

where c and $\ell$ are thresholds used to control the number of similar samples.
Nonparametric nearest neighbor-based sample selection (Györfi et al. 2008)
searches price relatives whose preceding market windows are within the k nearest
neighbors of the latest market window, that is,

$$C_N\left(x_1^t, w\right) = \left\{ w < i < t + 1 : x_{i-w}^{i-1} \text{ is among the } k \text{ NNs of } x_{t-w+1}^{t} \right\},$$

where k is a threshold parameter.
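
The kernel-based and nearest neighbor-based rules differ only in how distances are turned into a selection. The following sketch assumes the same list-based history layout as before (position i − 1 holds $x_i$) and folds the two kernel thresholds into a single `radius` argument; all function names are hypothetical.

```python
import math


def window_distance(u, v):
    """Euclidean distance between two market windows, treated as flattened vectors."""
    return math.sqrt(sum((a - b) ** 2
                         for row_u, row_v in zip(u, v)
                         for a, b in zip(row_u, row_v)))


def kernel_select(history, w, radius):
    """Indices i whose preceding window lies within `radius` of the latest window."""
    t = len(history)
    if t <= w:
        return []
    latest = history[t - w:]
    return [i for i in range(w + 1, t + 1)
            if window_distance(history[i - w - 1:i - 1], latest) <= radius]


def nearest_neighbor_select(history, w, k):
    """Indices i whose preceding windows are the k closest to the latest window."""
    t = len(history)
    if t <= w:
        return []
    latest = history[t - w:]
    ranked = sorted(range(w + 1, t + 1),
                    key=lambda i: window_distance(history[i - w - 1:i - 1], latest))
    return sorted(ranked[:k])


# Hypothetical one-asset history.
history = [(1.0,), (1.5,), (1.1,), (0.7,), (1.0,)]
print(kernel_select(history, w=1, radius=0.2))
print(nearest_neighbor_select(history, w=1, k=3))
```

The kernel rule can return an empty set when no window is close enough, whereas the nearest neighbor rule always returns k indices once enough history is available.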

6.2 Portfolio Optimization Techniques
The second step of pattern matching–based approaches is to construct an optimal
portfolio based on the sample set Ct . Two main principles are Kelly’s (1956) capital
growth portfolio and Markowitz’s (1952) mean variance portfolio.
Györfi et al. (2006) proposed to figure out a log-optimal (Kelly) portfolio, based
on similar price relatives, which clearly follows the capital growth portfolio theory.
Given a sample set $C_t$, the log-optimal utility function is defined as

$$U_L(b, C_t) = E\{\log b \cdot x \mid x_i, i \in C_t\} = \sum_{i \in C_t} P_i \log b \cdot x_i,$$

where $P_i$ denotes the probability assigned to $x_i$, $i \in C_t$. Györfi et al. (2006) assumed
a uniform probability, thus equivalently,

$$U_L(b, C_t) = \sum_{i \in C_t} \log b \cdot x_i. \tag{6.1}$$

Maximizing the above function results in a BCRP portfolio (Cover 1991) over the
similar price relatives.
Györfi et al. (2007) introduced the semi-log-optimal utility function, which
approximates the log utility in Equation 6.1 in order to reduce its computational cost;
and Vajda (2006) presented the corresponding theoretical analysis and proved its
universality. The semi-log-optimal utility function is defined as

$$U_S(b, C_t) = E\{f(b \cdot x) \mid x_i, i \in C_t\} = \sum_{i \in C_t} P_i f(b \cdot x_i),$$

where $f(\cdot)$ is the second-order Taylor expansion of $\log z$ around $z = 1$, that
is,

$$f(z) = z - 1 - \frac{1}{2}(z - 1)^2.$$

Györfi et al. (2007) adopted a uniform probability $P_i$; thus, equivalently,

$$U_S(b, C_t) = \sum_{i \in C_t} f(b \cdot x_i).$$
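
A quick numerical comparison shows how close the two utilities are when period returns stay near 1. This is only an illustrative sketch with uniform probabilities; the function names and the sample data are hypothetical.

```python
import math


def semi_log(z):
    """Second-order Taylor expansion of log z around z = 1."""
    return (z - 1.0) - 0.5 * (z - 1.0) ** 2


def portfolio_return(b, x):
    """Period return b . x."""
    return sum(bi * xi for bi, xi in zip(b, x))


def log_utility(b, samples):
    """Log-optimal utility over the selected price relatives (uniform weights)."""
    return sum(math.log(portfolio_return(b, x)) for x in samples)


def semi_log_utility(b, samples):
    """Semi-log-optimal utility over the same samples (uniform weights)."""
    return sum(semi_log(portfolio_return(b, x)) for x in samples)


# Hypothetical daily-scale price relatives, all close to 1.
samples = [(1.01, 0.99), (0.98, 1.03)]
b = (0.5, 0.5)
print(log_utility(b, samples), semi_log_utility(b, samples))
```

Because f is quadratic in b, maximizing the semi-log utility avoids evaluating logarithms for every candidate portfolio, and the approximation error is of third order in the deviation of the return from 1.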

Ottucsák and Vajda (2007) proposed a Markowitz-type utility function, which
further generalizes the semi-log-optimal strategy. The basic idea is to trade off between
portfolio mean and variance, which is similar to Markowitz’s mean variance theory.
To be specific, its utility function is defined as

$$\begin{aligned} U_M(b, C_t) &= E\{b \cdot x \mid x_i, i \in C_t\} - \lambda \operatorname{Var}\{b \cdot x \mid x_i, i \in C_t\} \\ &= E\{b \cdot x \mid x_i, i \in C_t\} - \lambda E\{(b \cdot x)^2 \mid x_i, i \in C_t\} + \lambda \left(E\{b \cdot x \mid x_i, i \in C_t\}\right)^2, \end{aligned}$$

where λ is a trade-off parameter. In particular, simple numerical transformations show
that the semi-log-optimal portfolio is one special case of this utility function.

To solve the problem with transaction costs, Györfi and Vajda (2008) proposed
a GV-type utility function∗ by incorporating the transaction costs (Györfi and Walk
2012),

$$U_T(b, C_t) = E\{\log b \cdot x + \log w(b_t, b, x_t)\},$$

where $w(\cdot) \in (0, 1)$ is the transaction cost factor, which represents the remaining
proportion after transaction costs. With a uniform probability assumption, it is equivalent
to calculate:

$$U_T(b, C_t) = \sum_{i \in C_t} \left( \log b \cdot x_i + \log w(b_t, b, x_t) \right).$$

In any of the above procedures, if the similarity set is non-empty, we can obtain an
optimal portfolio based on the similar price relatives and their assumed probability. In
the case of an empty set, we can choose either a uniform portfolio or the last portfolio.

6.3 Combinations
Finally, let us combine the two steps and describe specific algorithms in the pattern
matching–based approach. Table 6.1 summarizes all existing combinations.
One default utility function is the log-optimal function. Györfi and Schäfer (2003)
introduced the nonparametric histogram-based log-optimal investment strategy ($B^H$),
which combines the histogram-based sample selection and log-optimal utility
function. Györfi et al. (2006) presented the nonparametric kernel-based log-optimal
investment strategy ($B^K$), which combines the kernel-based sample selection and
log-optimal utility function. Györfi et al. (2008) proposed the nonparametric nearest
neighbor log-optimal investment strategy ($B^{NN}$), which combines the nearest
neighbor sample selection and log-optimal utility function.
Besides the log-optimal utility function, several algorithms using different
utility functions have been proposed. Györfi et al. (2007) proposed the nonparametric
kernel-based semi-log-optimal investment strategy ($B^S$) by combining the kernel-based
sample selection and semi-log-optimal utility function, which greatly eases
the computation of $B^K$. Ottucsák and Vajda (2007) proposed the nonparametric
kernel-based Markowitz-type investment strategy ($B^M$) by combining the kernel-based
sample selection and Markowitz-type utility function. Györfi and Vajda (2008)
proposed the nonparametric kernel-based GV-type investment strategy ($B^{GV}$) by
combining the kernel-based sample selection and GV-type utility function to select
portfolios in case of nonzero transaction costs.

Table 6.1 Pattern matching–based approaches: sample selection and portfolio optimization

                          Sample Selection Techniques
Portfolio Optimization    Histogram          Kernel              Nearest Neighbor
Log-optimal               B^H: C_H + U_L     B^K: C_K + U_L      B^NN: C_N + U_L
Correlation-driven        —                  CORN                —
Semi-log-optimal          —                  B^S: C_K + U_S      —
Markowitz-type            —                  B^M: C_K + U_M      —
GV-type                   —                  B^GV: C_K + U_T     —
Note: —, no algorithm in the combinations.

∗ Algorithm 2 in Györfi and Vajda (2008).

6.4 Summary
This chapter summarizes the pattern matching–based principle, which mainly includes
the sample selection and portfolio optimization steps. Empirically, these algorithms
exploit recurring patterns over the history and produce good performance.
One of the key problems is to identify the recurring patterns, which leads to our CORN
strategy in Chapter 8.

Chapter 7

Meta-Learning

Another research topic in online portfolio selection (OLPS) is meta-learning,
or meta-algorithms (MAs) (Das and Banerjee 2011), which is closely related to
expert learning (Cesa-Bianchi and Lugosi 2006). This is directly applicable to the
“fund of funds” (FOF),∗ which delegates portfolio capital to other funds. In general,
an MA defines several base experts, each of which is equipped with strategies from the
same strategy class or different classes, or even MAs. Each expert outputs a portfolio
vector, and the MA combines these portfolios to form a final portfolio, which is used for
rebalancing. The whole system can achieve the best performance among the experts
in hindsight, which is thus desirable for some nonuniversal algorithms. MAs are similar
to Cover’s UP algorithm in the follow the winner approach; however, they are
proposed to handle different classes of experts, among which UP’s CRP becomes a
special case. On the one hand, MAs can be used to smooth the final performance with
respect to all experts, especially when base experts are sensitive to certain
environments/parameters. On the other hand, combining universal algorithms with heuristic
algorithms, for which a theoretical regret bound is not easy to obtain, can provide the
universality property for the whole system. Finally, MAs can be applied to all existing
approaches and thus have much broader areas of application.

7.1 Aggregating Algorithms


Though BCRP is optimal for an independent and identically distributed (i.i.d.) market,
which is often suspected in real markets, the optimal portfolio may not belong to CRP.
Several algorithms have been proposed to track a different set of experts. The base
experts in this category belong to a special class rather than complex experts from
multiple classes.
Vovk and Watkins (1998) applied the aggregating algorithm (AA) (Vovk 1990,
1997, 1999, 2001) to the OLPS task, of which Cover’s UP is a special case. The general
setting for AA is to define a countable or finite set of base experts and sequentially
allocate the resource among multiple base experts to achieve a good performance that

∗ FOF selects portfolios on different fund managers, rather than on assets. For example, an FOF
manager may evenly split his fund, and put one part to fund A and the other to Fund B.

41

T&F Cat #K23731 — K23731_C007 — page 41 — 9/28/2015 — 21:15


42 META-LEARNING
is no worse than any fixed combination of underlying experts. Its portfolio update
formula (Vovk and Watkins 1998, Algorithm 1) for OLPS is
 η
b t−1
m i=1 (b · xt ) P0 (db)
bt+1 =  t−1 .
η
m i=1 (b · xt ) P0 (db)

As a special case, Cover’s UP corresponds to AA with uniform prior distribution


and η = 1.
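As a rough illustration of the AA update, the following sketch assumes a market of only two assets, so that the simplex reduces to b = (p, 1 − p) with p in [0, 1], and approximates both integrals by a midpoint sum under a uniform prior. The function name `aa_portfolio`, the grid size, and the sample price relatives are illustrative assumptions and are not taken from Vovk and Watkins (1998).

```python
import math

def aa_portfolio(price_relatives, eta=1.0, grid=1000):
    """Approximate the AA portfolio b_{t+1} for a two-asset market.

    price_relatives: list of (x1, x2) tuples observed in periods 1..t.
    eta: learning parameter; eta = 1 with a uniform prior gives Cover's UP.
    grid: number of midpoints used to discretize b = (p, 1 - p), p in [0, 1].
    """
    numerator = 0.0    # approximates the integral of p * prod_i (b . x_i)^eta
    denominator = 0.0  # approximates the integral of prod_i (b . x_i)^eta
    for k in range(grid):
        p = (k + 0.5) / grid  # midpoint rule; the uniform prior weight is constant
        log_wealth = sum(math.log(p * x1 + (1.0 - p) * x2)
                         for x1, x2 in price_relatives)
        weight = math.exp(eta * log_wealth)
        numerator += p * weight
        denominator += weight
    p_next = numerator / denominator
    return (p_next, 1.0 - p_next)

# Hypothetical price relatives in which the first asset does better every period.
history = [(1.10, 0.95), (1.05, 1.00), (1.20, 0.90)]
print(aa_portfolio(history, eta=1.0))  # approximates Cover's UP
print(aa_portfolio(history, eta=5.0))  # a larger eta tilts further toward asset 1
```

For more than two assets the grid would have to be replaced by sampling over the simplex, which this sketch does not attempt.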
Several further algorithms have been proposed. Singer (1997) proposed
switching portfolios (SP), which switch among a set of strategies handling different
regimes. The author proposed two switching schemes, both of which assume that
the duration of base strategies is geometrically distributed. While the first scheme
assumes a fixed distribution, the second assumes that the distribution changes
dynamically. The author further presented a lower bound on its logarithmic wealth
with respect to the best switching regime. Empirical evaluations show that SP can
outperform UP, EG, and BCRP.
Levina and Shafer (2008) proposed the Gaussian random walk strategy, which
switches among base experts according to a Gaussian distribution. Kozat and Singer
(2007) extended SP to piecewise fixed fraction strategies, which partition the periods
into different segments and transition among these segments. Kozat and Singer
(2008) extended Kozat and Singer (2007) to the case of transaction costs. Kozat and
Singer (2009, 2010) further generalized the approach to sequential decision problems.
Kozat et al. (2008) proposed another piecewise universal portfolio selection strategy
via context trees, and Kozat et al. (2011) also generalized it to sequential decision
problems via tree weighting.
SP adopts the notion of regime switching (Hamilton 1994, 2008), which seems to
be more plausible than an i.i.d. market assumption. Regime switching is also applied to
some state-of-the-art trading strategies (Hardy 2001). However, the existing geometric
and Gaussian distributions do not seem to fit the market well, which motivates the
search for other distributions that fit the markets better.

7.2 Fast Universalization


Akcoglu et al. (2005) proposed fast universalization (FU), which extends Cover’s
(1991) UP from a parameterized CRP class to a wide class of investment strate-
gies, including trading strategies operating on a single stock and portfolio strategies
allocating wealth among the whole market. FU’s basic idea is to evenly split the
wealth among base experts, let these experts operate on their own, and finally pool
their wealth. FU’s update is the same as that of Cover’s UP, and it also asymptotically
achieves a growth rate equal to that of an optimal fixed convex combination of base
experts. When all experts are CRPs, FU reduces to Cover’s UP.
Formally, FU’s investment can be described as

\[
b_t = \frac{\int_W S_t(w) R_t(w) \, d\mu(w)}{\int_W R_t(w) \, d\mu(w)}, \tag{7.1}
\]

where $R_0(w) = 1$ for all w ∈ W. Note that the mean is the same as in Cover’s UP, which
equally splits the money among different strategies and lets them run.
Besides universalization in the continuous parameter space, discrete buy-and-hold
(BAH) combinations have been adopted by various existing algorithms. The update
follows straightforwardly from rewriting Cover’s UP in its discrete form. For example,
Borodin et al. (2004) adopted the BAH strategy to combine Anticor experts with respect
to a finite number of window sizes (or parameters). Moreover, all pattern
matching–based approaches adopted BAH to combine their underlying experts, also with
a finite number of window sizes (or parameters).

7.3 Online Gradient and Newton Updates


Das and Banerjee (2011) proposed two meta-optimization algorithms, named online
gradient update (OGU) and online Newton update (ONU), which are extended
from exponential gradient (EG) and online Newton step (ONS), respectively. Since
their updates and proofs are similar to those of their predecessors, we omit them here.
Theoretically, OGU and ONU can achieve the same growth rate as the optimal convex
combination of underlying experts. In particular, if any base expert is universal,
then the final system enjoys the universality property. This property is useful, as an
MA can combine a heuristic algorithm and a universal algorithm, and the final system
can enjoy both superior heuristic performance and the universality property.

7.4 Follow the Leading History


Hazan and Seshadhri (2009) proposed the follow the leading history (FLH) algorithm
for changing environments. FLH can incorporate various universal base experts, such
as the ONS algorithm. Its basic idea is to maintain a finite working set of experts,
which are dynamically added and dropped, and to allocate the weights among the active
working experts with an MA, for example, the Herbster–Warmuth algorithm (Herbster
and Warmuth 1998). Unlike other MAs, in which all experts operate from the
beginning, FLH adopts experts starting from different periods. Theoretically, FLH
based on universal algorithms is also universal, and empirically, FLH equipped with
ONS can significantly outperform ONS.

7.5 Summary
Meta-learning is another widely discussed principle in online portfolio selection
(OLPS) research. It derives from base algorithms but treats these experts as the
underlying assets. Thus, from this aspect, meta-algorithms (MAs) can be widely
applied to all strategies discussed in previous chapters. We are interested in this
principle because practical trading systems usually contain multiple strategies, and
meta-learning can be used to combine these strategies in an effective way.


Part III

Algorithms



Chapter 8

Correlation-Driven Nonparametric
Learning

As described in Part II, several approaches have been proposed to select portfolios
from financial markets. The pattern matching–based approach, which is intuitive in
nature, achieves the best performance to date. However, one key challenge for this
approach is to effectively locate a set of trading days whose price relative vectors
are similar to the coming one. As detailed in Section 6.1, existing strategies often
adopt Euclidean distance to measure the similarity between two preceding market
windows. Euclidean distance can measure the similarity to some extent; however, it
simply considers the neighborhood of the latest market windows and ignores the linear
or nonlinear relationship between two market windows, which is important for price
relative estimation. In this chapter, we propose to exploit similar patterns via a
correlation coefficient, which effectively measures the linear relationship, and
further propose a novel pattern matching–based online portfolio selection algorithm,
“CORrelation-driven Nonparametric learning” (CORN) (Li et al. 2011a). The proposed
CORN algorithm can better locate a similarity set and thus can output portfolios that
are more effective than those of existing pattern matching–based strategies.
Moreover, we also prove CORN’s universal consistency,∗ which is a nice property
for pattern matching–based algorithms. Further, in Part IV, we extensively evaluate
the algorithm on several real stock markets, where the encouraging results show that
the proposed algorithm can substantially beat both the market index and the best
stock (without or with small transaction costs) and can also significantly surpass a
variety of state-of-the-art techniques.
This chapter is organized as follows. Section 8.1 motivates the proposed correla-
tion metric for selecting similarity sets. Section 8.2 details the ideas of the proposed
online portfolio selection algorithm, and then Section 8.3 illustrates the proposed
algorithms. Section 8.4 proves CORN’s universal consistency and further analyzes
the proposed algorithms. Finally, Section 8.5 summarizes this chapter and indicates
future directions.

∗ This property is missing in Li et al. (2011a).

8.1 Preliminaries
8.1.1 Motivation
One main idea of existing approaches is to optimize portfolios by mining similar
patterns and information from historical market sequences. Anticor (Borodin et al.
2004) attempts to find statistical relations between pairs of stocks, such as positive
autocovariance and negative cross-covariance, while pattern matching–based
strategies (Györfi et al. 2006, 2008) try to discover similar appearances among
historical markets. Though successful in mining statistical relations among stocks,
Anticor ignores market movements, which are crucial for a portfolio selection task.
Moreover, Anticor is heuristic in nature, which could lead to suboptimal solutions.
On the other hand, existing pattern matching–based strategies (Györfi et al. 2006,
2008) rely on Euclidean distance to measure the similarity between two market
windows. Though their empirical performance is excellent, the Euclidean distance
cannot exploit the directional information between the two market windows. Therefore,
it may detect some useful price relatives, but it often includes some potentially
useless or even harmful price relatives and excludes many beneficial ones. Such a
similarity set will ultimately weaken the subsequent portfolio optimization step,
resulting in less effective portfolios.
To better understand the drawbacks of Euclidean distance in measuring the
similarity between two market windows, we give a motivating example in Figure 8.1. Let
us assume that all market windows consist of two price relatives, such as a market of
one asset with a window size of two, or a market of two assets with a window size of
one. Let the latest market window for the t-th period be $x_{t-2}^{t-1} = (1.10, 1.20)$.
Clearly, $x_{t-2}^{t-1}$ shows an increasing trend, and we aim to locate similar market
windows that also show increasing trends. Suppose we have three possible pairs of
market windows: A1: (0.90, 0.80), A2: (0.80, 0.90); B1: (1.2, 1.1), B2: (1.1, 1.2);
C1: (1.4, 1.3), C2: (1.3, 1.4). Note that in a long-only portfolio, relative trends,
rather than absolute trends, determine the allocations of capital.∗ For example,
although A2 contains two decreasing price relatives (both 0.90 and 0.80 are less than
1), the market sequence is relatively increasing (0.90 > 0.80). If the vectors contain
two assets, then for the recent market window $x_{t-2}^{t-1}$ it is better to allocate
more capital to the second asset (1.20 > 1.10), which is also the case in A2. However,
this is not the case in B1 or C1, though their absolute price relatives are all
increasing.† Among the three pairs, A2, B2, and C2 show increasing trends, while A1,
B1, and C1 show decreasing trends. Thus, a good similarity measure should classify A2,
B2, and C2 as similar appearances, which will benefit the next step, and A1, B1, and
C1 as dissimilar appearances, which will harm the subsequent portfolio optimization
step.
Now let us classify these market sequences via a Euclidean distance measure with
a radius of 0.2,‡ that is, $\| x_{i-2}^{i-1} - x_{t-2}^{t-1} \| \le 0.2$. According to
Figure 8.1c, a Euclidean measure will classify B1 and B2 to the similarity set, since
they are both located within the Euclidean ball of $x_{t-2}^{t-1}$ (with a radius of
0.2). Such a classification is clearly suboptimal, as it includes the harmful B1 and
excludes the beneficial A2 and C2. As a consequence of the imperfect similarity set,
the subsequent portfolio optimization will suffer considerably from irrelevant or
even harmful market windows (such as market window B1) and from the neglect of
beneficial market windows (such as market windows A2 and C2). This motivates us to
overcome the limitation by exploring a more effective similarity measure.

∗ In our problem setting, there are no cash or risk-free assets. In reality, a weaker
constraint (e.g., at most 90% of capital can be put in assets) may appear in mutual
funds.
† Because their first asset is more favorable than the second one, which is different
from the latest $x_{t-2}^{t-1}$.
‡ The radius is arbitrarily chosen to limit the number of selected price relatives.

[Figure 8.1(a): plot of price relatives (vertical axis, 0.80 to 1.40) against periods
(horizontal axis: i−2, i−1, ..., t−2, t−1), showing the market windows A1, A2, B1, B2,
C1, C2, and x_{t-2}^{t-1}.]
(a)

Sequence          Price relatives
x_{t-2}^{t-1}     (1.1, 1.2)
A1                (0.9, 0.8)
A2                (0.8, 0.9)
B1                (1.2, 1.1)
B2                (1.1, 1.2)
C1                (1.4, 1.3)
C2                (1.3, 1.4)
(b)

Compared with x_{t-2}^{t-1}   A1     A2     B1     B2     C1     C2
Euclidean distances           0.45   0.42   0.14   0      0.32   0.28
Similar? (Y/N)                N      N      Y      Y      N      N
Correlation coefficients      −1     1      −1     1      −1     1
Similar? (Y/N)                N      Y      N      Y      N      Y
(c)

Figure 8.1 A motivating example to illustrate the limitation of Euclidean distance.
(a) Market windows A1, A2, B1, B2, C1, C2, and $x_{t-2}^{t-1}$, each of which contains
two price relatives. (b) The price relative vectors. (c) Classification of the market
windows via Euclidean distance and correlation coefficient.
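The comparison in Figure 8.1(c) can be checked with a few lines of code. The sketch below is illustrative only: the helper names `euclidean` and `pearson` are assumptions, the radius of 0.2 is the one used in the text, and a correlation threshold of 0 is assumed for the correlation-based classification.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def pearson(a, b):
    """Pearson correlation coefficient; returns 0.0 if either vector is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

latest = (1.1, 1.2)  # the latest market window x_{t-2}^{t-1}
windows = {
    "A1": (0.9, 0.8), "A2": (0.8, 0.9),
    "B1": (1.2, 1.1), "B2": (1.1, 1.2),
    "C1": (1.4, 1.3), "C2": (1.3, 1.4),
}
RADIUS = 0.2     # Euclidean radius used in the text
THRESHOLD = 0.0  # correlation threshold (assumed for this illustration)

euclid_similar = [k for k, v in windows.items() if euclidean(v, latest) <= RADIUS]
corr_similar = [k for k, v in windows.items() if pearson(v, latest) >= THRESHOLD]
print("Euclidean-similar:", euclid_similar)
print("Correlation-similar:", corr_similar)
```

Because each window here has only two entries, every correlation is +1 or −1 up to rounding; longer windows generally give intermediate values.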

8.2 Formulations
The proposed algorithm is mainly inspired by the idea of exploiting statistical correla-
tions between two market windows, and also driven by the consideration of exploring
the powerful nonparametric learning techniques to effectively optimize a portfolio.
Traditional portfolio selection methods in finance often try to estimate a target
function based on past data and build portfolios based on the learned function. How-
ever, since the financial market is complex and accurate modeling of its movements
is a difficult task, we adopt a nonparametric learning approach (or instance-based
learning, or case-based learning) (Aha 1991; Aha et al. 1991; Cherkassky and Mulier
1998). Nonparametric learning makes no assumptions about the data distribution (or
market distribution), and it captures the knowledge from stored training data without
building any target functions. In particular, at the beginning of every period, the
proposed algorithm locates similar price relatives among all past price relatives, and
then maximizes the expected multiplicative portfolio return directly based on the
similar appearances. Without estimating any global functions of the market movements,
the proposed algorithm estimates a target value of the next price relative.
To overcome the limitation of Euclidean distance in mining historical market
windows and the neglect of whole-market movements in all existing strategies,
we propose to employ the Pearson product–moment correlation coefficient, which is
an effective tool for measuring statistical linear relationships. Note that it measures
the statistical correlations between market windows of all assets, rather than between
pairs of assets as Anticor does. Since market windows of all assets represent the
whole-market movements in a period, they could be more effective in matching similar
price relatives with respect to the whole market.
We now define a correlation-similar set that contains the historical trading days
whose previous market windows are statistically correlated to the latest one, formally,

\[
C_t(w, \rho) = \left\{ w < i < t + 1 \;\middle|\;
\frac{\operatorname{cov}\left(x_{i-w}^{i-1}, x_{t-w+1}^{t}\right)}
     {\operatorname{std}\left(x_{i-w}^{i-1}\right) \operatorname{std}\left(x_{t-w+1}^{t}\right)}
\ge \rho \right\},
\]

where w denotes the window size, −1 ≤ ρ ≤ 1 is a correlation coefficient threshold,
cov(A, B) denotes the covariance between market windows A and B, and std(A)
denotes the standard deviation of market window A. If either std term equals 0, that
is, the market has zero volatility in a specific window, we simply set the correlation
coefficient to 0. In the above calculation, both matrices $x_{i-w}^{i-1}$ and
$x_{t-w+1}^{t}$ are concatenated into (m × w)-dimensional vectors, so that we obtain
the univariate correlation coefficient between the two market windows.
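The definition of the correlation-similar set can be sketched in code as follows. This is a minimal illustration rather than the authors' implementation: the function names, the list-of-tuples data layout (with `X[0]` holding x_1), and the sample market sequence are assumptions made for the example.

```python
import math

def pearson(a, b):
    """Pearson correlation; set to 0.0 when either window has zero volatility."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def flatten(window):
    """Concatenate a window of w price relative vectors into one (m * w)-vector."""
    return [x for vec in window for x in vec]

def correlation_similar_set(X, w, rho):
    """Return C_t(w, rho) as a list of 1-based period indices i.

    X: list of price relative vectors x_1, ..., x_t (X[0] is x_1).
    """
    t = len(X)
    if t <= w:
        return []
    latest = flatten(X[t - w:t])            # window x_{t-w+1}^{t}
    similar = []
    for i in range(w + 1, t + 1):           # all i with w < i < t + 1
        past = flatten(X[i - w - 1:i - 1])  # window x_{i-w}^{i-1}
        if pearson(past, latest) >= rho:
            similar.append(i)
    return similar

# Hypothetical two-asset market that alternates between two patterns.
X = [(1.0, 1.1), (1.1, 1.0), (1.0, 1.1), (1.1, 1.0), (1.0, 1.1)]
print(correlation_similar_set(X, w=1, rho=0.5))
```

Each returned index i marks a period whose preceding window resembles the latest one, so the price relatives x_i at those indices serve as the samples for the next period.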
The correlation coefficient measure distinguishes the proposed algorithm from
previous nonparametric learning strategies, which measure the similarity via
Euclidean distance. First, Euclidean distance only considers the magnitude between
two market windows, while the proposed correlation coefficient measures their linear
similarity in both magnitude and direction. On the one hand, it considers the direction.
For example, ρ1 = 0.8 and ρ2 = −0.8 intuitively correspond to equivalent magnitudes
of linear dependence or similarity; however, they are in opposite directions, that
is, the first market window follows the same trend as the latest market window, and
the other follows the opposite trend. On the other hand, it also considers the
magnitude. For example, ρ1 = 0.8 and ρ2 = 0.2 clearly indicate that the first market
window is more suitable than the second one (ρ1 > ρ2). Thus, the correlation
coefficient measure considers not only magnitude but also direction, which are
appropriately balanced. With such linear dependence, we can better identify similar
price relatives, thus leading to superior performance. Euclidean distance may also be
used to measure directional information indirectly, for example, by using the slope
of two centralized points. However, such a method only measures the directional
information but ignores the magnitude.
Second, to calculate the univariate correlation coefficient, we will calculate the
arithmetic mean of both m × w-dimensional vectors. This mean return is uniformly
distributed among m assets over w periods, which is in essence the market strategy.
As a result, the mean return actually reflects the whole-market movements during the
window. The correlation coefficient measures the linear dependency between two mar-
ket windows, whose means represent the whole-market movements. This, therefore,
distinguishes the proposed strategy from Anticor strategy and existing nonparametric
learning strategies, which ignore the whole market.
Now let us return to the preceding motivating example and select a similarity
set via correlation coefficient metric, with a threshold of 0. Figure 8.1(c) clearly
shows that the metric can correctly classify these market windows, whose results are
identical to our intuitive analysis. In particular, A2, B2, and C2 are classified as similar,
and A1, B1, and C1 are classified as dissimilar. Note that our example is extremely
straightforward and thus results in extreme values (either +1 or −1), which is not
always the case.

8.3 Algorithms
Next, we present the proposed CORN algorithm, which exploits the correlation-
similar set in optimizing portfolios for actively rebalancing.
We start by defining a set of W × P experts, each indexed by (w, ρ), that is,

{E(w, ρ) : w ≥ 1, −1 ≤ ρ ≤ 1},

where W represents the maximum window size and P represents the number of
correlation coefficient thresholds. Each expert E(w, ρ) represents a CORN expert
learning algorithm and outputs a portfolio, denoted as E(w, ρ) = b(w, ρ).
As summarized in Algorithm 8.1, a CORN expert learning algorithm consists of
two major steps. The first step, as illustrated in Section 8.2, is to locate a
correlation-similar set via the correlation coefficient metric, and the second step is
to obtain an optimized portfolio that maximizes the expected return, which is the main
target of our research. After calculating the correlation-similar set $C_t(w, \rho)$ at
the end of period t, we propose to learn an optimal portfolio following the idea of
BCRP (Cover 1991), which maximizes the expected multiplicative return over the
sequence of similar price relatives, that is,

\[
b_{t+1}(w, \rho) = \arg\max_{b \in \Delta_m} \prod_{i \in C_t(w, \rho)} (b \cdot x_i), \tag{8.1}
\]

where $\Delta_m$ represents the m-dimensional simplex. If $C_t(w, \rho)$ is empty
(especially for a large ρ value), we simply adopt the uniform portfolio
(1/m, ..., 1/m). Note that the correlation-similar set usually contains a large number
of correlated price relatives. If one similar price relative vector has occurred
frequently in history, it will also appear multiple times in the correlation-similar
set. In other words, Equation 8.1 more or less accounts for the occurrence/confidence
of the correlated price relative vectors, which avoids simply taking an extreme case
in history.

Algorithm 8.1: CORrelation-driven Nonparametric expert: CORN(w, ρ).
Input: w: window size; ρ: correlation coefficient threshold; x_1^t: historical
       market sequence; t: index of current period.
Output: b_{t+1}: expert’s portfolio for period t + 1.
begin
    Initialize C_t = ∅, b_{t+1} = (1/m, ..., 1/m);
    if t ≤ w then
        return b_{t+1};
    end
    for i = w + 1, w + 2, ..., t do
        if corrcoef(x_{i-w}^{i-1}, x_{t-w+1}^{t}) ≥ ρ then
            C_t = C_t ∪ {i};
        end
    end
    if C_t ≠ ∅ then
        Search for an optimal portfolio: b_{t+1} = arg max_{b ∈ Δ_m} ∏_{i ∈ C_t} (b · x_i);
    end
    return b_{t+1};
end
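The text does not prescribe a particular solver for the maximization in Equation 8.1. As one possible sketch, the code below maximizes the equivalent concave objective, the sum of log(b · x_i) over the similar price relatives, with a multiplicative fixed-point update of the kind usually attributed to Cover (1984) for log-optimal portfolios. The function name, the iteration count, and the sample inputs are illustrative assumptions.

```python
def optimize_portfolio(similar_relatives, m, iterations=500):
    """Approximate arg max over the simplex of prod_i (b . x_i).

    similar_relatives: list of price relative vectors x_i with i in C_t(w, rho).
    m: number of assets.
    Returns the uniform portfolio when the similarity set is empty.
    """
    b = [1.0 / m] * m
    if not similar_relatives:
        return b
    n = len(similar_relatives)
    for _ in range(iterations):
        # ratio[j] accumulates sum_i x_ij / (b . x_i).
        ratio = [0.0] * m
        for x in similar_relatives:
            period_return = sum(bj * xj for bj, xj in zip(b, x))
            for j in range(m):
                ratio[j] += x[j] / period_return
        # Multiplicative update; the weights keep summing to 1 by construction.
        b = [bj * rj / n for bj, rj in zip(b, ratio)]
    return b

# Hypothetical similarity set in which asset 1 always outperforms asset 2.
print(optimize_portfolio([(1.1, 1.0), (1.1, 1.0)], m=2))
# Symmetric case in which the uniform portfolio is already optimal.
print(optimize_portfolio([(2.0, 0.5), (0.5, 2.0)], m=2))
```

A general-purpose constrained optimizer could be substituted here; the multiplicative update is chosen only because it stays on the simplex without an explicit projection step.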
We further combine all experts according to their historical performance
$S_t(w, \rho)$ and a probability distribution function q(w, ρ). Specifically, CORN
combines experts’ portfolios and calculates the final portfolio for period t + 1 as

\[
b_{t+1} = \frac{\sum_{w,\rho} q(w, \rho) S_t(w, \rho) b_{t+1}(w, \rho)}
               {\sum_{w,\rho} q(w, \rho) S_t(w, \rho)}, \tag{8.2}
\]

where $b_{t+1}(w, \rho)$ represents the portfolio computed by expert E(w, ρ) and
$S_t(w, \rho)$ represents its historical performance. For an individual expert, the
higher its historical return, the higher its weight assigned in the final portfolio.

After releasing the price relative vector $x_{t+1}$, CORN updates the cumulative
wealth,

\[
S_{t+1} = S_t \times (b_{t+1} \cdot x_{t+1}).
\]

For the underlying experts, CORN updates their cumulative wealth,

\[
S_{t+1}(w, \rho) = S_t(w, \rho) \times (b_{t+1}(w, \rho) \cdot x_{t+1}),
\]

where $S_t(w, \rho)$ represents the cumulative wealth achieved by expert E(w, ρ) till
period t.
Therefore, it is straightforward that the cumulative wealth achieved by the
proposed CORN strategy after n periods is equivalent to a q-weighted sum of all
experts’ returns,

\[
S_n = \sum_{w,\rho} q(w, \rho) S_n(w, \rho). \tag{8.3}
\]

Clearly, the final cumulative return is affected by all underlying experts, and the
portions of contributions made by each expert are determined by the predefined
distribution q(w, ρ) and expert’s performance $S_n(w, \rho)$.
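The step from Equation 8.2 to Equation 8.3 can be made explicit by induction on t. The derivation below assumes that q(w, ρ) is fixed over time and sums to 1 (as in a uniform combination) and that all wealth starts at S_0 = S_0(w, ρ) = 1, so that the base case S_0 = Σ q(w, ρ) S_0(w, ρ) holds.

```latex
\begin{align*}
S_{t+1} &= S_t \, (b_{t+1} \cdot x_{t+1})
         = S_t \, \frac{\sum_{w,\rho} q(w,\rho)\, S_t(w,\rho)\, \bigl(b_{t+1}(w,\rho) \cdot x_{t+1}\bigr)}
                       {\sum_{w,\rho} q(w,\rho)\, S_t(w,\rho)} \\
        &= S_t \, \frac{\sum_{w,\rho} q(w,\rho)\, S_{t+1}(w,\rho)}
                       {\sum_{w,\rho} q(w,\rho)\, S_t(w,\rho)}
         = \sum_{w,\rho} q(w,\rho)\, S_{t+1}(w,\rho).
\end{align*}
```

The first line substitutes Equation 8.2 into the wealth update, the second uses the experts' wealth update, and the last equality applies the induction hypothesis S_t = Σ q(w, ρ) S_t(w, ρ). If q is changed between periods, this identity need not hold exactly.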
Ideally, we can choose CORN experts, indexed by (w, ρ), such that they cover all
possible parameter settings, thus eliminating the effects of the parameters. However,
the computational cost of such a combination is prohibitively high. To boost the
efficiency, we can choose finite discrete dimensions of the parameters, that is, a
specified number of (w, ρ) combinations.
The selection of experts also trades off an individual expert’s performance and its
computational time. First, Equation 8.3 clearly shows that each expert contributes to
the final cumulative wealth by its performance; thus, choosing a worse expert may
lower the final performance. Second, the mixture’s computation time is generally the
summation of all experts’ individual time. In other words, choosing too many experts,
which cost too much time, may affect its practical scalability.
In this study, we first adopted uniform combination, which chooses a uniform
distribution of q(w, ρ), and named it “CORN uniform combination” (CORN-U).
Algorithm 8.2 shows the details of the proposed CORN-U algorithm. In particular,
we assign the same weights to all CORN experts, although the weights can be adjusted
if we have more information. Moreover, CORN-U only considers P = 1 and chooses
a specific value of ρ.
The above uniform combination algorithm may include some poor experts, leading
to the degradation of overall performance. To overcome such limitations, the
second algorithm, “CORN top-K combination” (CORN-K), combines only the top
K best experts. Algorithm 8.3 illustrates the proposed CORN-K algorithm. In particular,
it chooses the top K experts with the highest historical returns and uniformly
combines them. That is, the strategy assigns the set of top K experts a uniform
distribution q(w, ρ) = 1/K, while the weights assigned to other experts are simply
set to 0. Moreover, for the proposed CORN-K algorithm, we define P ≥ 1 associated
experts, each of which has a different ρ value.
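The top-K weighting can be sketched as follows. The function name `top_k_weights`, the dictionary layout keyed by (w, ρ), and the sample wealth values are assumptions for illustration; ties are broken by insertion order here, which the text does not specify.

```python
def top_k_weights(wealth, k):
    """Assign weight 1/K to the K experts with the highest cumulative wealth.

    wealth: dict mapping an expert key (w, rho) to its wealth S_t(w, rho).
    k: number of experts to keep; capped at the number of experts available.
    Returns a dict with the same keys; experts outside the top K get weight 0.
    """
    k = min(k, len(wealth))
    ranked = sorted(wealth, key=wealth.get, reverse=True)  # best expert first
    top = set(ranked[:k])
    return {expert: (1.0 / k if expert in top else 0.0) for expert in wealth}

# Hypothetical cumulative wealth of three experts with different window sizes.
wealth = {(1, 0.0): 1.2, (2, 0.0): 0.9, (3, 0.0): 1.5}
print(top_k_weights(wealth, k=2))
```

With k = 1 the rule keeps only the single best expert so far.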


Algorithm 8.2: Online portfolio selection with CORN uniform algorithm (CORN-U).
Input: W: maximum window size; ρ: correlation coefficient threshold;
       x_1^n = (x_1, ..., x_n): historical market sequence.
Output: S_n: final cumulative wealth.
begin
    Initialize S_0 and W experts: S_0 = 1, b_1 = (1/m, ..., 1/m), q(w, ρ) = 1/W,
        S_0(w, ρ) = 1, b_1(w, ρ) = (1/m, ..., 1/m), w = 1, ..., W;
    for t = 1 to n do
        Rebalance the portfolio to b_t;
        Receive current price relatives: x_t;
        Update the cumulative wealth: S_t = S_{t-1} × (b_t · x_t);
        Update the experts’ cumulative wealth:
            S_t(w, ρ) = S_{t-1}(w, ρ) × (b_t(w, ρ) · x_t);
        Update next portfolio:
        begin
            for w = 1 to W do
                CORN expert (Algorithm 8.1) finds a portfolio:
                    b_{t+1}(w, ρ) = CORN(w, ρ)
            end
            Combine experts’ portfolios:
                b_{t+1} = Σ_w q(w, ρ) S_t(w, ρ) b_{t+1}(w, ρ) / Σ_w q(w, ρ) S_t(w, ρ)
        end
    end
end

Algorithm 8.3: Online portfolio selection with CORN top-K algorithm (CORN-K).
Input: W: maximum window size; P: the number of correlation coefficient
       thresholds; K: the number of top experts; x_1^n = (x_1, ..., x_n):
       historical market sequence.
Output: S_n: final cumulative wealth.
begin
    Initialize S_0 and W × P experts: S_0 = 1, 𝒫 = {0, 1/P, ..., (P − 1)/P},
        q(w, ρ) = 1/(W × P), S_0(w, ρ) = 1, w = 1, ..., W, ρ ∈ 𝒫;
    for t = 1 to n do
        Rebalance the portfolio to b_t;
        Receive current price relatives: x_t;
        Update the cumulative wealth: S_t = S_{t-1} × (b_t · x_t);
        Update the experts’ cumulative wealth:
            S_t(w, ρ) = S_{t-1}(w, ρ) × (b_t(w, ρ) · x_t);
        Update experts’ weights:
        begin
            Select top K experts 𝒦 = {(w, ρ)} w.r.t. S_t(w, ρ);
            Set weights for the top K experts: q(w, ρ) = 1/K, (w, ρ) ∈ 𝒦;
            Set zero weights for other experts: q(w, ρ) = 0, (w, ρ) ∉ 𝒦;
        end
        Update next portfolio:
        begin
            for w = 1 to W do
                for ρ ∈ 𝒫 do
                    CORN expert (Algorithm 8.1) finds a portfolio:
                        b_{t+1}(w, ρ) = CORN(w, ρ)
                end
            end
            Combine top K experts’ portfolios:
                b_{t+1} = Σ_{w,ρ} q(w, ρ) S_t(w, ρ) b_{t+1}(w, ρ) / Σ_{w,ρ} q(w, ρ) S_t(w, ρ)
        end
    end
end

Remarks on Aggregation: Note that the aggregation or combination rule
described in Equation 8.2 is a special case of the general concept of exponential
weighting. For a learning parameter η > 0, put

\[
b_{t+1} = \frac{\sum_{w,\rho} q(w, \rho) \, e^{\eta \log S_t(w, \rho)} \, b_{t+1}(w, \rho)}
               {\sum_{w,\rho} q(w, \rho) \, e^{\eta \log S_t(w, \rho)}}.
\]

If η = 1, then one gets the rule in Equation 8.2. The proof of universal consistency of
$B^H$, $B^K$, and $B^{NN}$ works without any difficulties if η ≤ 1. However, the results
of exponential weighting are superior if η is much larger than 1, although there is no
theoretical support for this phenomenon. A large η corresponds to CORN-K with K = 1
(i.e., this rule is the follow the winner rule, which is called follow the leader in
the machine-learning literature) (Cesa-Bianchi and Lugosi 2006). It would be nice to
prove or disprove that the follow the leader aggregation results in universally
consistent strategies (i.e., that it is asymptotically growth optimal for any
stationary and ergodic market process).
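The effect of the learning parameter η in the exponential weighting rule can be illustrated with a small sketch. The function name, the dictionary layout keyed by expert, and the two hypothetical experts below are assumptions; subtracting the largest log-wealth is only a numerical safeguard and cancels in the ratio.

```python
import math

def exp_weighted_portfolio(q, wealth, portfolios, eta=1.0):
    """Combine expert portfolios with weights q * exp(eta * log S_t).

    q, wealth, portfolios: dicts keyed by expert; wealth values must be positive.
    eta = 1 reproduces wealth-proportional weighting; a large eta approaches
    following the single best expert so far.
    """
    active = [e for e in q if q[e] > 0.0]
    m = len(portfolios[active[0]])
    max_log = max(math.log(wealth[e]) for e in active)
    numerator = [0.0] * m
    denominator = 0.0
    for e in active:
        weight = q[e] * math.exp(eta * (math.log(wealth[e]) - max_log))
        denominator += weight
        for j in range(m):
            numerator[j] += weight * portfolios[e][j]
    return [value / denominator for value in numerator]

# Two hypothetical experts: "a" holds only asset 1 and has doubled its wealth,
# "b" holds only asset 2 and has kept its wealth unchanged.
q = {"a": 0.5, "b": 0.5}
wealth = {"a": 2.0, "b": 1.0}
portfolios = {"a": (1.0, 0.0), "b": (0.0, 1.0)}
for eta in (0.0, 1.0, 50.0):
    print(eta, exp_weighted_portfolio(q, wealth, portfolios, eta))
```

In this example η = 0 ignores performance, η = 1 weights the experts in proportion to wealth, and η = 50 puts almost all weight on the leading expert.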

8.4 Analysis
In this section, we first analyze CORN’s universal consistency with respect to the
class of all ergodic processes.

Theorem 8.1 The portfolio scheme CORN is universal with respect to the class of
all ergodic processes such that E{| log Xj |} < ∞, for j = 1, . . . , m.

Proof The proof can be found in Appendix B.1.1.


In the CORN expert learning procedure, there are two key parameters: the cor-
relation coefficient threshold ρ and the window size w. Below, we analyze how they
affect the algorithms.
As shown in the motivating example, the correlation coefficient threshold ρ is
critical to a correlation-similar set. If ρ is negative, the correlation-similar set would
contain some negatively correlated price relative vectors or irrelevant vectors. On
the other hand, if ρ is too large, for example, ρ ≥ 0.5, the correlation-similar set
would neglect some positively correlated vectors. Since the correlation-similar set is
crucial in selecting optimal portfolios, it would harm the learning performance if it
either contains negatively correlated vectors/irrelevant vectors or discards positively
correlated vectors. Empirically, we found that the optimal ρ value is often dataset
dependent, but often close to 0, which will be verified in Section 13.3.1. Moreover,
we note that CORN degenerates to a special case as ρ → 1: fewer market windows
are highly positively correlated to the latest window. In the extreme case of ρ = 1,
$C_t(w, \rho)$ becomes almost empty, and CORN thus reduces to the uniform
CRP strategy.∗
Another key parameter for the CORN expert learning process is the window size.
Since the calculation of the correlation coefficient treats each market window as a
vector, the window size does not have a significant impact on the final portfolio.
Even when certain experts perform very badly, the final result tends to be relatively
stable, since the proposed combination methods (viz., CORN-U and CORN-K) reduce the
impact of these experts and thus provide a stable portfolio. We will numerically
analyze the effect of window size in Section 13.3.1, which shows that the proposed
combination can effectively smooth the performance curve.
The simplicity and effectiveness of CORN raise a fundamental question: Is it
reasonable to select a portfolio using only market price information?” While our
goal is not to resolve the philosophical debates between fundamental and techni-
cal analysts, we believe this work goes a long way to provide empirical evidence
endorsing the effectiveness of technical analysis. Moreover, note that the success of
CORN depends on three basic assumptions that form the basis of most technical anal-
ysis methods, including: (i) market action discounts everything; (ii) price moves in
trends; and (iii) history tends to repeat itself. The first point assumes that stock prices
at any given time reflect everything that has or could affect a company, including
fundamental factors. And the second and third points directly lead to our proposed
CORN algorithm and existing pattern matching–based approach. All these assumptions
allow us to construct a portfolio using only similar appearances of historical
market prices, without considering other factors, either technical or fundamental.

∗ This is not a general case, which depends on the initial portfolio and default values if a similarity set
is empty.

8.5 Summary
This chapter proposed a novel “CORrelation-driven Nonparametric learning”
(CORN) strategy for online portfolio selection, which effectively exploits the sta-
tistical correlations hidden in stock markets, and benefits from the exploration of
powerful nonparametric learning techniques. The proposed CORN algorithm is sim-
ple in nature and easy to implement, and has parameters that are easy to set. It also
enjoys the universal consistency property. Our empirical studies on real markets, in
Part IV, show that CORN can substantially beat the market index and the best stock,
and also consistently surpasses a variety of state-of-the-art algorithms.
Currently, the proposed CORN can capture the linear relationship between two
market windows, and it is possible to further capture their nonlinear relationship.
Although high return strategies are often associated with high risk, it would be more
attractive to develop a strategy that can manage the risk properly without slashing
too much return. As an extension to this work, we are currently developing such risk-
limiting strategies for CORN. In future, we plan to investigate theoretical insights
of the algorithm and examine its extensions to improve the performance with high
transaction costs.



Chapter 9

Passive–Aggressive Mean Reversion

This chapter proposes a novel online portfolio selection (OLPS) strategy named
“passive–aggressive mean reversion” (PAMR) (Li et al. 2012). Unlike traditional
trend-following approaches, the proposed approach relies upon the mean reversion
relation of financial markets. We are the first to devise a loss function that reflects the
mean reversion principle. Further equipped with passive–aggressive online learning
(Crammer et al. 2006), the proposed strategy can effectively exploit mean reversion.
By analyzing PAMR’s update scheme, we find that it nicely trades portfolio return
with volatility risk and reflects the mean reversion principle. We conduct extensive
numerical experiments in Part IV to evaluate the proposed algorithms on various real
datasets. In most cases, the proposed PAMR strategy outperforms all benchmarks and
almost all state-of-the-art strategies under various performance metrics. In addition
to superior performance, the proposed PAMR runs extremely fast and thus is very
suitable for real-life online trading applications.
This chapter is organized as follows. Section 9.1 briefly reviews the ideas of
existing trend-following strategies and motivates the proposed strategy. Section 9.2
formulates the proposed PAMR strategy, and Section 9.3 derives the algorithms.
Section 9.4 further analyzes and discusses the algorithms. Finally, Section 9.5
summarizes this chapter and indicates future directions.

9.1 Preliminaries
9.1.1 Related Work
One popular trading idea in reality is trend following or momentum, which assumes
that historically outperforming stocks would still perform better than others in future.
Some existing algorithms, such as EG and ONS, approximate the expected loga-
rithmic daily return and logarithmic cumulative return, respectively, using historical
price relatives. Though this idea is easy to understand and makes fortunes for many
of the best traders and investors, trend following is hard to implement effectively. In
addition, in the short term, the stock price relatives may not follow previous
trends (Jegadeesh 1990; Lo and MacKinlay 1990).

Besides trend following, another widely adopted approach is mean rever-
sion (Cover and Gluss 1986; Cover 1991; Borodin et al. 2004), which is also termed
as contrarian. This approach stems from the CRP strategy (Cover and Gluss 1986),
which rebalances to an initial portfolio every period. The idea behind this approach is
that if one stock performs worse than others, it tends to perform better in the following
periods. As a result, a contrarian strategy is characterized by the purchase of securities
that have performed poorly and the sale of securities that have performed well, or,
quite simply, “Sell the winner, buy the loser.” According to Lo and MacKinlay (1990),
the effectiveness of mean reversion is due to positive cross-autocovariances across
securities. Among existing algorithms, CRP, UP,∗ and Anticor adopt this idea. How-
ever, CRP and UP passively revert to the mean, while empirical evidence from the
Anticor algorithm (Borodin et al. 2004) shows that active reversion to the mean may
better exploit the fluctuation of financial markets and is likely to obtain much higher
profit. On the other hand, although Anticor actively reverts to the mean, it is a heuristic
method based on statistical correlations. In other words, it may not effectively exploit
the mean reversion property.
Pattern matching–based nonparametric learning algorithms ($B^K$, $B^{NN}$, and
CORN, etc.) can identify many market conditions, including both mean reversion
and trend following. However, when searching similar price relatives, they may locate
both mean reversion and trend-following price relatives, whose patterns are essen-
tially opposite, thus weakening the following maximization of expected cumulative
wealth.
In summary, both trend following and mean reversion can generate profit in
the financial markets, if appropriately used. In the following, we will propose an
active mean reversion–based portfolio selection method. Though simple in update
rules, it empirically outperforms the existing strategies in most back-tests† with real
market data, indicating that it appropriately takes advantage of the mean reversion
trading idea.

9.1.2 Motivation
The proposed approach is motivated by the CRP strategy (Cover and Gluss 1986), which
adopts the mean reversion trading idea. As shown in Chapter 5, the mean reversion
principle has not been widely investigated for OLPS.
Another motivation of the proposed algorithm is that, in a financial crisis, all
stocks drop synchronously or certain stocks drop significantly. Under such situations,
active rebalancing may be inappropriate since it puts too much wealth on “mine”
stocks, such as Bear Stearns‡ during the subprime crisis. To avoid potential risk
concerning such “mine” stocks, it is better to stick to the previous portfolio, which
constitutes the CRP strategy. Here, the reason to choose a passive CRP strategy is that
these “mine” stocks are usually known only in hindsight, so identifying them a priori
is almost impossible. Thus, to avoid suffering too much from such situations, the
proposed approach alternates between “aggressive” and “passive” reversion depending
on market conditions. The passive mean reversion avoids the high risk of aggressive
mean reversion, which would put most wealth on these “mine” stocks.

∗ From the expert level, UP follows the winner. However, since its experts belong to CRP, it also follows
the loser in stock level. In the preceding survey, we classify it following the expert level.
† Back-test refers to testing a trading strategy via historical market data.
‡ Bear Stearns was a US company whose stock price collapsed in September 2008.
In the following, we propose a novel trading strategy named “passive–aggressive
mean reversion,” or PAMR for short. On the one hand, the underlying assumption
is that better-performing assets would perform worse than others in the next period.
On the other hand, if the market drops too much, we would stop actively rebalanc-
ing portfolios to avoid certain “mine” stocks and their associated risk. To exploit
these intuitions, we suggest adopting passive–aggressive (PA) online learning
(Crammer et al. 2006), which was originally proposed for classification. The basic
idea of PA is that it passively keeps the previous solution if the loss is zero, while it
aggressively updates the solution whenever the suffered loss is nonzero.
We now describe the proposed PAMR strategy in detail. Firstly, if the portfolio
period return is below a threshold, we will try to keep the previous portfolio such
that it passively reverts to the mean to avoid potential “mine” stocks. Secondly, if the
portfolio period return is above the threshold, we will actively rebalance the portfolio
to ensure that the expected portfolio daily return is below the threshold, in the belief
that the next price relatives will revert. This sounds a bit counterintuitive, but it is
indeed reasonable, because if the price relative reverts, keeping the expected port-
folio return below the threshold enables one to maintain a high portfolio return in the
next period. Here, the expected portfolio return is calculated with respect to historical
price relatives, for example, in our study, the last price relative (Helmbold et al. 1998).
To further illustrate that aggressive reversion to the mean can be more effective
than a passive one, let us continue the example that has a market going nowhere but
actively fluctuating. In such a market, the proposed strategy is much more powerful
than the best constant rebalanced portfolio (BCRP), a passive mean reversion trading
strategy in hindsight, as shown in Table 9.1. As the motivating example shows,
BCRP grows to $\left(\frac{5}{4}\right)^n$ over $n$ trading periods, while at the same time, PAMR grows
to $\frac{5}{4} \times \left(\frac{3}{2}\right)^{n-1}$ (the details of the calculation/algorithm will be presented in the next
section). We intuitively explain the success of PAMR below.
Assume the threshold for a PAMR update is set to 1, that is, if the portfolio period
return is below 1, we do nothing but keep the existing portfolio. Our strategy begins
with a portfolio $\left(\frac{1}{2}, \frac{1}{2}\right)$. For period 1, the return is $\frac{5}{4} > 1$. Then, at the beginning
of period 2, we rebalance the portfolio such that an approximate portfolio return
based on last price relatives is below the threshold of 1, and the resulting portfolio is
$\left(\frac{2}{3}, \frac{1}{3}\right)$. As the mean reversion principle suggests, although we are building a portfolio
performing below the threshold in the current period, we are actually maximizing the
next portfolio return. As we can observe, the return for period 2 is $\frac{3}{2} > 1$. Then,
following the same rule, we will rebalance the portfolio to $\left(\frac{1}{3}, \frac{2}{3}\right)$. As a result, in such
a market, PAMR’s cumulative wealth is $\frac{5}{4} \times \left(\frac{3}{2}\right)^{n-1}$ after $n$ periods, which is superior to
BCRP’s $\left(\frac{5}{4}\right)^n$.

Table 9.1 Motivating example to compare BCRP and PAMR

Period #  Relatives  BCRP portfolio  BCRP return  PAMR portfolio  PAMR return  Notes
1         (1/2, 2)   (1/2, 1/2)      5/4          (1/2, 1/2)      5/4          Rebalance to (2/3, 1/3)
2         (2, 1/2)   (1/2, 1/2)      5/4          (2/3, 1/3)      3/2          Rebalance to (1/3, 2/3)
3         (1/2, 2)   (1/2, 1/2)      5/4          (1/3, 2/3)      3/2          Rebalance to (2/3, 1/3)
4         (2, 1/2)   (1/2, 1/2)      5/4          (2/3, 1/3)      3/2          Rebalance to (1/3, 2/3)
...       ...        ...             ...          ...             ...          ...
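The entries of Table 9.1 can be reproduced with exact rational arithmetic. The sketch below assumes the threshold is 1 and uses the closed-form step derived later in Section 9.3 (Equations 9.5 and 9.6); the simplex projection is left out because every portfolio in this example already lies in the simplex. The names pamr_step and simulate are illustrative.

```python
from fractions import Fraction as F

def pamr_step(b, x, eps):
    """One PAMR update (Equations 9.5 and 9.6), simplex projection omitted."""
    x_bar = sum(x) / len(x)                     # market return
    dev = [xi - x_bar for xi in x]              # returns relative to the market
    loss = max(F(0), sum(bi * xi for bi, xi in zip(b, x)) - eps)
    sq_norm = sum(d * d for d in dev)
    tau = loss / sq_norm if sq_norm else F(0)   # step size
    return [bi - tau * d for bi, d in zip(b, dev)]

def simulate(n, eps=F(1)):
    """Replay the alternating two-asset market of Table 9.1 for n periods."""
    market = [(F(1, 2), F(2)), (F(2), F(1, 2))]
    b = [F(1, 2), F(1, 2)]
    pamr_wealth, bcrp_wealth = F(1), F(1)
    rows = []
    for t in range(n):
        x = market[t % 2]
        period_return = sum(bi * xi for bi, xi in zip(b, x))
        pamr_wealth *= period_return
        bcrp_wealth *= (x[0] + x[1]) / 2        # BCRP holds (1/2, 1/2)
        rows.append((t + 1, tuple(b), period_return))
        b = pamr_step(b, x, eps)
    return rows, pamr_wealth, bcrp_wealth

rows, pamr_wealth, bcrp_wealth = simulate(4)
for period, portfolio, period_return in rows:
    print(period, [str(weight) for weight in portfolio], str(period_return))
print("PAMR:", pamr_wealth, "BCRP:", bcrp_wealth)
```

Using Fraction rather than floats keeps the portfolios and returns in the same form as the table, so the rows can be compared entry by entry.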

Remarks on Motivations: Although the motivating example in Table 9.1 demonstrates
the effectiveness of PAMR over BCRP, PAMR may not always outperform
BCRP. In general, PAMR is an online algorithm, whereas BCRP is an optimal offline
algorithm for i.i.d. markets (Cover and Thomas 1991, Theorem 15.3.1). Now, we
discuss some possible situations where PAMR may fail to outperform BCRP.
Consider a special case where one stock crashes and the other explodes, for example,
a market sequence of two stocks as $\left(\frac{1}{2}, 2\right), \left(\frac{1}{2}, 2\right), \ldots$. In this market, BCRP
increases at an exponential rate of $2^n$ as it wholly invests in the second asset, while
PAMR keeps a fixed wealth of $\frac{5}{4}$ over the trading period. Obviously, in such a situation,
PAMR performs much worse than BCRP, that is, PAMR’s $\frac{5}{4}$ versus BCRP’s $2^n$ over
$n$ periods. Though not shining in this example, PAMR still bounds its losses. Moreover,
such a market, which violates the mean reversion assumption, is occasional,
at least from the viewpoint of our empirical studies.
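The fixed wealth of 5/4 in this case can be checked by hand. The worked steps below assume a threshold of 1 and the closed-form update of Section 9.3 (Equations 9.5 and 9.6). In period 1 the uniform portfolio earns

$$\mathbf{b}_1 \cdot \mathbf{x}_1 = \tfrac{1}{2} \cdot \tfrac{1}{2} + \tfrac{1}{2} \cdot 2 = \tfrac{5}{4}, \qquad \tau_1 = \frac{5/4 - 1}{\left\|\left(-\tfrac{3}{4}, \tfrac{3}{4}\right)\right\|^2} = \frac{1/4}{9/8} = \tfrac{2}{9},$$

so the portfolio moves to

$$\mathbf{b}_2 = \left(\tfrac{1}{2}, \tfrac{1}{2}\right) - \tfrac{2}{9}\left(-\tfrac{3}{4}, \tfrac{3}{4}\right) = \left(\tfrac{2}{3}, \tfrac{1}{3}\right).$$

Because the same price relative vector repeats, the next return is

$$\mathbf{b}_2 \cdot \mathbf{x}_2 = \tfrac{2}{3} \cdot \tfrac{1}{2} + \tfrac{1}{3} \cdot 2 = 1,$$

which does not exceed the threshold. The loss is therefore zero, the portfolio is kept unchanged in every later period, and the cumulative wealth stays at $\tfrac{5}{4} \times 1^{n-1} = \tfrac{5}{4}$.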

9.2 Formulations
Now we shall formally devise the proposed PAMR strategy for the OLPS task.
PAMR is based on a loss function that exploits the mean reversion idea, which is
our innovation, and is equipped with the PA online learning technique (Crammer
et al. 2006).∗

∗ In fact, with the loss function, we can adopt any learning methods to exploit the mean reversion
property. We choose PA for its simplicity and effectiveness. Certainly, other learning techniques can be
adopted, if the new method can provide some new insights.

First of all, given a portfolio vector $\mathbf{b}$ and a price relative vector $\mathbf{x}_t$, we define an
$\epsilon$-insensitive loss function for the $t$-th period as

$$\ell_\epsilon(\mathbf{b}; \mathbf{x}_t) = \begin{cases} 0 & \mathbf{b} \cdot \mathbf{x}_t \le \epsilon \\ \mathbf{b} \cdot \mathbf{x}_t - \epsilon & \text{otherwise} \end{cases}, \tag{9.1}$$

where $\epsilon \ge 0$ is a sensitivity parameter that controls the mean reversion threshold.
Since portfolio daily return fluctuates around 1,∗ we empirically choose $\epsilon \le 1$ to
buy underperforming assets. The $\epsilon$-insensitive loss is zero when the return does not
exceed the threshold $\epsilon$, and otherwise grows linearly with respect to portfolio return. For
conciseness, let us use $\ell_\epsilon^t$ to denote $\ell_\epsilon(\mathbf{b}; \mathbf{x}_t)$. By defining this loss function, we can
distinguish the preceding two motivating cases.
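The loss of Equation 9.1 is short enough to state directly in code. The following is a minimal sketch with plain Python lists; the function name and the sample price relatives are illustrative.

```python
def eps_insensitive_loss(b, x, eps):
    """Equation 9.1: zero when the portfolio return b.x is at most eps,
    and the excess of the return over eps otherwise."""
    period_return = sum(bi * xi for bi, xi in zip(b, x))
    return max(0.0, period_return - eps)

b = [0.5, 0.5]
# Return 0.85 is below the threshold: zero loss, the passive case.
passive = eps_insensitive_loss(b, [0.9, 0.8], eps=1.0)
# Return 1.25 is above the threshold: positive loss, the aggressive case.
aggressive = eps_insensitive_loss(b, [0.5, 2.0], eps=1.0)
```

A zero loss corresponds to keeping the previous portfolio, and a positive loss is what triggers an update.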
Then, we will formulate the proposed strategy and will propose specific algorithms
to solve them. Recalling that bt denotes the portfolio vector for the period t, the first
proposed method for PAMR is formulated as a constrained optimization.
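Optimization Problem 1: PAMR

$$\mathbf{b}_{t+1} = \arg\min_{\mathbf{b} \in \Delta_m} \frac{1}{2} \left\|\mathbf{b} - \mathbf{b}_t\right\|^2 \quad \text{s.t.} \quad \ell_\epsilon(\mathbf{b}; \mathbf{x}_t) = 0. \tag{9.2}$$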

The above formulation attempts to find an optimal portfolio by minimizing the
deviation from the last portfolio $\mathbf{b}_t$ subject to the constraint of zero loss. On the
one hand, the above approach passively keeps the last portfolio, that is, $\mathbf{b}_{t+1} = \mathbf{b}_t$,
whenever the loss is zero, that is, the portfolio daily return does not exceed the threshold $\epsilon$. On
the other hand, whenever the loss is nonzero, it aggressively updates the solution
by forcing it to strictly satisfy the constraint, that is, $\ell_\epsilon(\mathbf{b}_{t+1}; \mathbf{x}_t) = 0$. Clearly, this
formulation is able to address the two motivations.
Although the above formulation is reasonable to address our concerns, it may have
some undesirable properties when noisy price relatives exist, which are common in
real-world financial markets. For example, a noisy price relative in a trending sequence
may suddenly change the portfolio in a wrong direction due to the aggressive update.
To avoid such problems, we propose two variants of PAMR that are able to trade off
between aggressiveness and passiveness. The idea of the two variants is similar to soft
margin support vector machines by introducing some nonnegative slack variables into
optimization. Specifically, for the first variant, we modify the objective function by
introducing a term that scales linearly with respect to a slack variable ξ and formulate
the following optimization.
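Optimization Problem 2: PAMR-1

$$\mathbf{b}_{t+1} = \arg\min_{\mathbf{b} \in \Delta_m} \left\{ \frac{1}{2} \left\|\mathbf{b} - \mathbf{b}_t\right\|^2 + C\xi \right\} \quad \text{s.t.} \quad \ell_\epsilon(\mathbf{b}; \mathbf{x}_t) \le \xi \ \text{and} \ \xi \ge 0, \tag{9.3}$$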

where C is a positive parameter to control the influence of the slack variable on the
objective function. We refer to this parameter as an aggressiveness parameter similar
to PA learning (Crammer et al. 2006) and call this variant “PAMR-1.”
Instead of a linear slack variable, for the second variant, we modify the objec-
tive function by introducing a term that scales quadratically with respect to a slack
variable ξ, which results in the following optimization problem.
Optimization Problem 3: PAMR-2

$$\mathbf{b}_{t+1} = \arg\min_{\mathbf{b} \in \Delta_m} \left\{ \frac{1}{2} \left\|\mathbf{b} - \mathbf{b}_t\right\|^2 + C\xi^2 \right\} \quad \text{s.t.} \quad \ell_\epsilon(\mathbf{b}; \mathbf{x}_t) \le \xi. \tag{9.4}$$

We refer to this variant as “PAMR-2.”

∗ Here we use simple gross return, as defined in Section 9.2. Financial literature often adopts simple net
return (Tsay 2002), which fluctuates around 0.


Remarks on Loss Function: In our loss function of Equation 9.1, we use the
portfolio return b · xt , while it is possible to use log return log(b · xt ) (Latané 1959).∗
With the log utility, optimization problems Equations 9.2 through 9.4 are all nonconvex
and nonlinear, and thus difficult to solve. One way to solve them is to use log’s
first-order Taylor expansion at the last portfolio and ignore higher order terms, that is,

$$\log(\mathbf{b} \cdot \mathbf{x}_t) \approx \log(\mathbf{b}_t \cdot \mathbf{x}_t) + \frac{\mathbf{x}_t}{\mathbf{b}_t \cdot \mathbf{x}_t} \cdot (\mathbf{b} - \mathbf{b}_t).$$
After the approximation, the nonlinear term becomes linear, and the optimization
problems are thus convex and can be efficiently solved. However, such linear approx-
imation may have some drawbacks. First of all, there is no way to justify the goodness
of linear approximation. With log utility, the loss function is flat, then sharply rises
and finally flattens out. While linear approximation is good in the two flat regimes, it
is terrible at the point of nondifferentiability and subpar in the sharply rising region.
Moreover, linear approximation yields an upper bound on regret of log utility loss
function. For the loss function in the form of Equation 9.1 without log utility or
with log’s linear approximation, the best possible regret, in a minimax sense, is at
most $O(\sqrt{n})$ (Abernethy et al. 2009), while a true log loss minimization algorithm
can routinely achieve $O(\log n)$. However, our loss function is not a traditional loss
function maximizing return (or minimizing the loss of $-\log(\mathbf{b} \cdot \mathbf{x}_t)$), but only a tool
to realize mean reversion. Thus, the regret achieved using our loss function does not
represent a regret about return, and may not be as meaningful as a traditional regret
bound. In any case, although PAMR works well in empirical evaluations, anyone who
cares about its theoretical aspects should be aware of the possibly worse bound,
which may not be revealed by the empirical evaluations.
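The quality of the first-order expansion of the log return discussed above can be inspected numerically. The sketch below compares the exact log return with its linearization at the last portfolio; the price relatives and portfolios are illustrative values, not taken from the book's experiments.

```python
import math

def log_return(b, x):
    """Exact log return log(b . x)."""
    return math.log(sum(bi * xi for bi, xi in zip(b, x)))

def linearized_log_return(b, b_t, x):
    """First-order Taylor expansion of log(b . x) around the last portfolio b_t."""
    base = sum(bi * xi for bi, xi in zip(b_t, x))
    slope = sum(xi * (bi - bti) for xi, bi, bti in zip(x, b, b_t)) / base
    return math.log(base) + slope

x = [0.5, 2.0]
b_t = [0.5, 0.5]
near = [0.55, 0.45]   # small move away from b_t
far = [1.0, 0.0]      # large move away from b_t

gap_near = linearized_log_return(near, b_t, x) - log_return(near, x)
gap_far = linearized_log_return(far, b_t, x) - log_return(far, x)
```

Because the logarithm is concave, the linearization never falls below the exact value, and in this example the gap grows as the candidate portfolio moves further from the last portfolio.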
Remarks on Formulations: Although our formulations mainly focus on portfolio
return without explicitly dealing with risk (e.g., volatility of daily returns), the final
algorithms can be nicely interpreted as certain trade-offs between risk and return, as
discussed in Section 9.4. Such an interesting observation is further verified by our
empirical evaluation, which shows that the proposed PAMR algorithms achieve good
risk-adjusted returns in terms of two risk-related metrics (volatility risk and drawdown
risk).
Similar to existing studies, our formulations avoid incorporating transaction cost,
which simplifies and highlights PAMR’s key ingredients. As shown in Sections 2.2
and 13.4, it is straightforward to evaluate the impact of transaction costs. In Chapter 13,
we present results on both cases: with and without transaction costs. The results
show that in most markets, the proposed algorithms work well without or even with
moderate transaction costs.

∗ Empirically, log utility does not help much, since log(b · xt) and b · xt are both small. However,
theoretically, using log utility may help. We remark on it so as to attract theoretical interest from other
researchers.

9.3 Algorithms
We now derive the solutions for the three PAMR formulations using standard tech-
niques from convex analysis (Boyd and Vandenberghe 2004) and present the proposed
PAMR algorithms. Specifically, the following three propositions summarize their
closed-form solutions.

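Proposition 9.1 The solution to optimization problem 1 (PAMR) without considering
the nonnegativity constraint ($\mathbf{b} \succeq 0$) is expressed as

$$\mathbf{b} = \mathbf{b}_t - \tau_t \left(\mathbf{x}_t - \bar{x}_t \mathbf{1}\right), \tag{9.5}$$

where $\bar{x}_t = \frac{\mathbf{x}_t \cdot \mathbf{1}}{m}$ denotes market return, and $\tau_t$ is computed as

$$\tau_t = \max\left\{0, \frac{\mathbf{b}_t \cdot \mathbf{x}_t - \epsilon}{\left\|\mathbf{x}_t - \bar{x}_t \mathbf{1}\right\|^2}\right\}. \tag{9.6}$$

Proof The proof can be found in Appendix B.2.1.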

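Proposition 9.2 The solution to optimization problem 2 (PAMR-1) without considering
the nonnegativity constraint ($\mathbf{b} \succeq 0$) is expressed as

$$\mathbf{b} = \mathbf{b}_t - \tau_t \left(\mathbf{x}_t - \bar{x}_t \mathbf{1}\right),$$

where $\bar{x}_t = \frac{\mathbf{x}_t \cdot \mathbf{1}}{m}$ denotes market return, and $\tau_t$ is computed as

$$\tau_t = \max\left\{0, \min\left\{C, \frac{\mathbf{b}_t \cdot \mathbf{x}_t - \epsilon}{\left\|\mathbf{x}_t - \bar{x}_t \mathbf{1}\right\|^2}\right\}\right\}. \tag{9.7}$$

Proof The proof can be found in Appendix B.2.2.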

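Proposition 9.3 The solution to optimization problem 3 (PAMR-2) without considering
the nonnegativity constraint ($\mathbf{b} \succeq 0$) is expressed as

$$\mathbf{b} = \mathbf{b}_t - \tau_t \left(\mathbf{x}_t - \bar{x}_t \mathbf{1}\right),$$

where $\bar{x}_t = \frac{\mathbf{x}_t \cdot \mathbf{1}}{m}$ denotes the market return, and $\tau_t$ is computed as

$$\tau_t = \max\left\{0, \frac{\mathbf{b}_t \cdot \mathbf{x}_t - \epsilon}{\left\|\mathbf{x}_t - \bar{x}_t \mathbf{1}\right\|^2 + \frac{1}{2C}}\right\}. \tag{9.8}$$

Proof The proof can be found in Appendix B.2.3.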
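The three closed-form step sizes differ only in how the common ratio is capped or damped, which a short sketch makes visible. It follows Equations 9.6 through 9.8 and the shared update direction, and stops before the simplex projection; the function names, the variant strings, and the rule of returning a zero step when all assets move identically are assumptions made for illustration.

```python
def pamr_step_size(b_t, x_t, eps, C, variant):
    """Step size tau_t for 'PAMR', 'PAMR-1', or 'PAMR-2' (Equations 9.6 to 9.8)."""
    x_bar = sum(x_t) / len(x_t)                        # market return
    sq_norm = sum((xi - x_bar) ** 2 for xi in x_t)     # squared norm of x_t - x_bar * 1
    loss = max(0.0, sum(bi * xi for bi, xi in zip(b_t, x_t)) - eps)
    if variant == "PAMR-2":
        return loss / (sq_norm + 1.0 / (2.0 * C))
    if sq_norm == 0.0:
        return 0.0  # assumption: the direction is zero, so no transfer is made
    if variant == "PAMR-1":
        return min(C, loss / sq_norm)
    return loss / sq_norm

def pamr_raw_update(b_t, x_t, tau):
    """Equation 9.5 before projection: b_t - tau * (x_t - x_bar * 1)."""
    x_bar = sum(x_t) / len(x_t)
    return [bi - tau * (xi - x_bar) for bi, xi in zip(b_t, x_t)]

b_t, x_t = [0.5, 0.5], [0.5, 2.0]
tau = pamr_step_size(b_t, x_t, eps=1.0, C=0.1, variant="PAMR")
tau_1 = pamr_step_size(b_t, x_t, eps=1.0, C=0.1, variant="PAMR-1")
tau_2 = pamr_step_size(b_t, x_t, eps=1.0, C=0.1, variant="PAMR-2")
b_next = pamr_raw_update(b_t, x_t, tau)
```

With a small C, both variants take a smaller step than the original PAMR: PAMR-1 clips the ratio at C, and PAMR-2 enlarges the denominator by 1/(2C).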

Algorithm 9.1 details the proposed PAMR algorithms, and Algorithm 9.2 summarizes
the OLPS procedure utilizing PAMR. Firstly, with no historical information,
the initial portfolio is set to uniform, $\mathbf{b}_1 = \left(\frac{1}{m}, \ldots, \frac{1}{m}\right)$. At the beginning of period $t$,
we rebalance the portfolio following the decision made at the end of period $t - 1$.
At the end of the $t$-th period, the market reveals a stock price relative vector, which
represents the market movements. Since both portfolio and price relatives are already
known, the portfolio manager computes the portfolio daily return $\mathbf{b}_t \cdot \mathbf{x}_t$ and the loss
$\ell_\epsilon(\mathbf{b}_t; \mathbf{x}_t)$ as defined in Equation 9.1. Then, we calculate an optimal step size $\tau_t$ based
on last portfolio and stock price relatives. Given an optimal step size $\tau_t$, we can update
the portfolio for the next period. Finally, by projecting the updated portfolio into the
simplex domain, we normalize the final portfolio.
Moreover, our algorithms have two key parameters, viz., the sensitivity parameter
$\epsilon$ and the aggressiveness parameter $C$. In practice, their values could affect the
performance of the proposed algorithms. To achieve a good performance in a spe-
cific market, the parameters have to be finely tuned. We will thoroughly examine
the two parameters on real-life datasets and suggest their empirical selections in
Section 13.3.2.

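Algorithm 9.1: Passive–Aggressive Mean Reversion: PAMR($\epsilon$, $C$, $\mathbf{b}_t$, $\mathbf{x}_1^t$, $t$).

Input: $\epsilon \in [0, 1]$: sensitivity parameter; $C$: aggressiveness parameter; $\mathbf{b}_t$:
current portfolio; $\mathbf{x}_1^t$: historical market sequence; $t$: index of current
trading period.
Output: $\mathbf{b}_{t+1}$: next portfolio.
begin
    Suffer loss: $\ell_\epsilon^t = \max\{0, \mathbf{b}_t \cdot \mathbf{x}_t - \epsilon\}$;
    Set parameters:

    $$\tau_t = \begin{cases} \dfrac{\ell_\epsilon^t}{\left\|\mathbf{x}_t - \bar{x}_t \mathbf{1}\right\|^2} & \text{(PAMR)} \\[2ex] \min\left\{C, \dfrac{\ell_\epsilon^t}{\left\|\mathbf{x}_t - \bar{x}_t \mathbf{1}\right\|^2}\right\} & \text{(PAMR-1)} \\[2ex] \dfrac{\ell_\epsilon^t}{\left\|\mathbf{x}_t - \bar{x}_t \mathbf{1}\right\|^2 + \frac{1}{2C}} & \text{(PAMR-2)} \end{cases}$$

    Update portfolio: $\mathbf{b}_{t+1} = \mathbf{b}_t - \tau_t \left(\mathbf{x}_t - \bar{x}_t \mathbf{1}\right)$;
    Normalize portfolio: $\mathbf{b}_{t+1} = \arg\min_{\mathbf{b} \in \Delta_m} \left\|\mathbf{b} - \mathbf{b}_{t+1}\right\|^2$
end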
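The normalization step of Algorithm 9.1 is a Euclidean projection onto the simplex. The sketch below uses a common sort-based projection, which takes O(m log m) time; it is offered as one straightforward way to realize the step and is not the linear-time routine used in the book's implementation. The function name is illustrative.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {b : b >= 0, sum(b) = 1}.

    Sort-based variant: find the largest prefix of the sorted entries that
    stays positive after a common shift theta, then shift and clip.
    """
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, uj in enumerate(u, start=1):
        cumulative += uj
        candidate = (cumulative - 1.0) / j
        if uj - candidate > 0:
            theta = candidate      # keep the shift of the last valid prefix
    return [max(vi - theta, 0.0) for vi in v]

# An update that overshoots the simplex (negative weight on the first asset).
overshoot = project_to_simplex([-0.2, 1.2])
# A vector with nonnegative entries that do not sum to one.
rescaled = project_to_simplex([0.2, 0.5, 0.6])
# A vector already in the simplex is left unchanged up to rounding.
unchanged = project_to_simplex([0.15, 0.85])
```

The projection removes negative weights and restores the unit sum, so the result is always a valid long-only portfolio.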


Algorithm 9.2: Online portfolio selection with PAMR.

Input: $\epsilon \in [0, 1]$: sensitivity parameter; $C$: aggressiveness parameter;
$\mathbf{x}_1^n$: historical market sequence.
Output: $S_n$: final cumulative wealth.
begin
    Initialize $\mathbf{b}_1 = \left(\frac{1}{m}, \ldots, \frac{1}{m}\right)$, $S_0 = 1$;
    for $t = 1, \ldots, n$ do
        Rebalance the portfolio to $\mathbf{b}_t$;
        Receive stock price relatives: $\mathbf{x}_t = (x_{t,1}, \ldots, x_{t,m})$;
        Calculate the daily return and cumulative return: $S_t = S_{t-1} \times (\mathbf{b}_t \cdot \mathbf{x}_t)$;
        Update the portfolio: $\mathbf{b}_{t+1} = \text{PAMR}(\epsilon, C, \mathbf{b}_t, \mathbf{x}_1^t, t)$;
    end
end
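The outer loop of Algorithm 9.2 is independent of how the portfolio is updated, which the following sketch emphasizes by taking the update rule as a function argument. The names run_olps and update are illustrative; the example passes a stand-in rule that keeps the portfolio unchanged (a uniform CRP), and any function mapping the current portfolio and the latest price relatives to the next portfolio, such as an implementation of Algorithm 9.1, could be supplied instead.

```python
def run_olps(market, update):
    """Online portfolio selection loop.

    market : list of price relative vectors, one per trading period.
    update : function (b_t, x_t) -> b_{t+1}.
    Returns the final cumulative wealth and the wealth after each period.
    """
    m = len(market[0])
    b = [1.0 / m] * m            # uniform initial portfolio
    wealth = 1.0                 # S_0 = 1
    history = []
    for x in market:
        wealth *= sum(bi * xi for bi, xi in zip(b, x))   # S_t = S_{t-1} * (b_t . x_t)
        history.append(wealth)
        b = update(b, x)         # decide the portfolio for the next period
    return wealth, history

# Stand-in update rule: never change the portfolio (uniform CRP).
market = [[0.5, 2.0], [2.0, 0.5]] * 2
final_wealth, history = run_olps(market, lambda b, x: b)
```

Separating the loop from the update rule also makes it easy to compare several strategies on the same market sequence.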

9.4 Analysis
To reflect the mean reversion trading idea, we are interested in analyzing PAMR’s
update rules, which mainly involve portfolio bt+1 and step size τt . In particular, we
want to examine how the update rules are related to return and risk—the two most
important concerns in a portfolio selection task.
First of all, we analyze the portfolio update rule for the three algorithms, that is,

$$\mathbf{b}_{t+1} = \mathbf{b}_t - \tau_t \left(\mathbf{x}_t - \bar{x}_t \mathbf{1}\right).$$

The step size $\tau_t$ is nonnegative, and $\bar{x}_t$ is mean return or market return. The term $\mathbf{x}_t - \bar{x}_t \mathbf{1}$
represents stock abnormal returns with respect to the market on period $t$. We can
further interpret it as a directional vector for the weight transfer. The negative sign
before the term indicates that the update scheme is consistent with our motivation, that
is, to transfer weights from outperforming stocks (with positive abnormal returns) to
underperforming stocks (with negative abnormal returns).
It is interesting that the second part of the update,

$$\mathbf{a}_t = -\tau_t \left(\mathbf{x}_t - \bar{x}_t \mathbf{1}\right),$$

coincides with the general form (Lo and MacKinlay 1990, Eq. (1)) of return-based
contrarian strategies (Conrad and Kaul 1998; Lo 2008), except for a changing multiplier
$\tau_t$. This part represents an arbitrage (zero-cost) portfolio, since its elements
always sum to 0, that is, at · 1 = 0. Adding the arbitrage portfolio to the last portfolio,
bt , results in the next portfolio. The long elements of the arbitrage portfolio (at,i > 0)
increase the corresponding elements of the whole portfolio, and the short elements
(at,i < 0) decrease the corresponding elements. Such an explanation is similar to the
analysis in the last paragraph and connects PAMR’s update with the general form of
return-based contrarian strategies.
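The zero-cost property follows in one line from the definition of the market return, $\bar{x}_t = \frac{\mathbf{x}_t \cdot \mathbf{1}}{m}$, and the fact that $\mathbf{1} \cdot \mathbf{1} = m$:

$$\mathbf{a}_t \cdot \mathbf{1} = -\tau_t \left(\mathbf{x}_t \cdot \mathbf{1} - \bar{x}_t \, \mathbf{1} \cdot \mathbf{1}\right) = -\tau_t \left(m \bar{x}_t - m \bar{x}_t\right) = 0.$$

Consequently, the updated vector $\mathbf{b}_t + \mathbf{a}_t$ still sums to 1 whenever $\mathbf{b}_t$ does, and only the nonnegativity constraint can be violated before the projection step.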
Besides, another important update is the step size $\tau_t$ calculated as Equations 9.6
through 9.8 for three PAMR methods, respectively. The step size $\tau_t$ adaptively controls
the weights to be transferred by scaling the directional vector. One common term
in $\tau_t$ is $\frac{\ell_\epsilon^t}{\left\|\mathbf{x}_t - \bar{x}_t \mathbf{1}\right\|^2}$. Its numerator denotes the $\epsilon$-insensitive loss for period $t$, which
equals the $t$-th portfolio return minus a mean reversion threshold, or zero. Assuming
other variables are constant, if the return is high (low), it leads to a large (small)
value of $\tau_t$, which would aggressively transfer more (less) wealth from outperforming
assets to underperforming assets. The denominator is essentially the market quadratic
variability, that is, the number of assets times market variance of period $t$. In modern
portfolio theory (Markowitz 1952), the variance of assets returns typically measures
volatility risk for a portfolio. As indicated by the denominator, if the risk is high (low),
the step size $\tau_t$ would be small (large). Consequently, the weight transfer made by the
update scheme will be weakened (strengthened). This is consistent with our intuition
that prediction would not be accurate in drastically dropping markets, and we opt to
make less transfer to reduce risk. Moreover, PAMR-1 caps the step size by a constant
$C$, while PAMR-2 decreases the step size by adding a constant $\frac{1}{2C}$ to its denominator.
Both mechanisms can prevent drastic weight transfers in case of noisy price relatives,
which is consistent with their motivations.
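The reading of the denominator as the number of assets times the market variance can be confirmed numerically. The sketch below compares the squared norm with m times the population variance of the price relatives; the function name and the sample vector are illustrative.

```python
from statistics import pvariance

def market_quadratic_variability(x):
    """Squared Euclidean norm of x minus its mean, i.e. ||x - x_bar * 1||^2."""
    x_bar = sum(x) / len(x)
    return sum((xi - x_bar) ** 2 for xi in x)

x_t = [0.5, 2.0, 1.25, 0.8]
squared_norm = market_quadratic_variability(x_t)
m_times_variance = len(x_t) * pvariance(x_t)   # population variance (divides by m)
```

The two quantities agree up to floating-point rounding, so a more dispersed cross-section of price relatives directly shrinks the step size.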
From the above analysis on the updates of portfolio and step size, we can conclude
that PAMR nicely balances between return and risk and clearly reflects the mean reversion
trading idea. To the best of our knowledge, such an important trade-off has only
been considered by the nonparametric kernel-based Markowitz-type strategy (Ottucsák
and Vajda 2007). While that strategy trades off return and risk with respect to a set of
similar historical price relatives, the proposed PAMR explicitly trades off return and
risk with respect to last price relatives. This nice property distinguishes the proposed
approach from most existing approaches that often cater to return, but ignore risk,
and are therefore undesirable.
One objective of PAMR-1 and PAMR-2 is to prevent a portfolio from being
affected too much by noisy price relatives, which might drastically change the
portfolio. In this part, let us exemplify the benefits of PAMR’s variants. Let $\mathbf{x}_t =
(1.00, 0.01)$, whose second value is a noise, and $\mathbf{b}_t = (1, 0)$. Setting $\epsilon = 0.30$ and
$C = 1.00$, we can calculate the next portfolio $\mathbf{b}_{t+1}$. This market sequence describes
a situation in which certain stocks drop significantly, which is common during a financial crisis.
Without tuning, PAMR would transfer a large proportion to the second asset. This can
be verified by calculating PAMR’s portfolio; in other words, PAMR calculates the
update step size $\tau_t = 1.43$ and obtains the subsequent portfolio $\mathbf{b}_{t+1} = (0.29, 0.71)$.
However, to avoid such noise, a natural choice is to transfer a smaller proportion
to the second asset. Indeed, PAMR-1 and PAMR-2 obtain the step
sizes of $\tau_t = 1.00$ and $\tau_t = 0.71$, respectively, which are smaller than the original
PAMR’s. Accordingly, we obtain the next portfolios $\mathbf{b}_{t+1} = (0.50, 0.50)$ and
$\mathbf{b}_{t+1} = (0.65, 0.35)$ for PAMR-1 and PAMR-2, respectively. Clearly, the variants
transfer less wealth to the second asset than the original PAMR does. Thus, PAMR-1
and PAMR-2 suffer less from noisy price relatives, though they cannot completely
avoid such situations.
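The figures in this example can be traced step by step from Equations 9.5 through 9.8. The quantities shared by the three methods are

$$\mathbf{b}_t \cdot \mathbf{x}_t = 1.00, \qquad \ell_\epsilon^t = 1.00 - 0.30 = 0.70, \qquad \bar{x}_t = 0.505, \qquad \left\|\mathbf{x}_t - \bar{x}_t \mathbf{1}\right\|^2 = 2 \times 0.495^2 = 0.49005.$$

The step sizes then follow as

$$\tau_t^{\text{PAMR}} = \frac{0.70}{0.49005} \approx 1.43, \qquad \tau_t^{\text{PAMR-1}} = \min\{1.00, 1.43\} = 1.00, \qquad \tau_t^{\text{PAMR-2}} = \frac{0.70}{0.49005 + \frac{1}{2 \times 1.00}} \approx 0.71.$$

Since $\mathbf{x}_t - \bar{x}_t \mathbf{1} = (0.495, -0.495)$, each update moves $0.495\,\tau_t$ of wealth from the first asset to the second:

$$\mathbf{b}_{t+1} = \left(1 - 0.495\,\tau_t,\ 0.495\,\tau_t\right),$$

which gives approximately $(0.29, 0.71)$ for PAMR, $(0.505, 0.495)$ for PAMR-1 (about $(0.50, 0.50)$ up to rounding), and $(0.65, 0.35)$ for PAMR-2. All three results already lie in the simplex, so the projection step leaves them unchanged.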
Finally, let us analyze PAMR’s time complexity. Apart from the normalization/
projection step (Step 7 in Algorithm 9.1), PAMR takes O(m) per period. In our
implementation, we adopt a linear projection method (Duchi et al. 2008),∗ which
takes O(m) per period. In total, the time complexity is O(mn). Thus, PAMR has
the same time complexity as the EG algorithm and is superior to other methods
in this respect. Linear time complexity enables the proposed algorithm to handle transactions
in scenarios in which low latency is of crucial importance, such as high-frequency
trading (Aldridge 2010).

9.5 Summary
In this chapter, we proposed a novel online portfolio selection (OLPS) strategy,
passive–aggressive mean reversion (PAMR). Motivated by the idea of mean rever-
sion and passive–aggressive online learning, PAMR either aggressively updates the
portfolio following mean reversion, or passively keeps the previous portfolio. PAMR
executes in linear time, making it suitable for online applications. We also find that
its update scheme is based on the trade-off between return and volatility risk, which
is ignored by most existing strategies. This interesting property connects the PAMR
strategy with modern portfolio theory, which may provide further explanation from
the aspect of finance.
The proposed algorithms are still far from perfect and may be improved in the
following aspects. First of all, though the universality property may not be required
in real investment, PAMR’s universality is still an open question. Second, PAMR
sometimes fails if mean reversion does not exist in the market components. Thus, it is
crucial to locate asset sets exhibiting mean reversion. Finally, PAMR’s formulations
ignore transaction costs. Thus, directly incorporating the issue into formulations may
improve PAMR’s practical applicability.

∗ The precise MATLAB routine ProjectOntoSimplex can be found on [Link] [Link]/∼jduchi/projects/DuchiShSiCh08/


Chapter 10

Confidence-Weighted Mean Reversion

Empirical evidence (Borodin et al. 2004) shows that stock price relatives may follow
the mean reversion property, which has not been fully exploited by existing strategies.
Moreover, all existing online portfolio selection (OLPS) strategies only focus on the
first-order information of a portfolio vector, though second-order information may
also benefit a strategy. This chapter proposes a novel strategy named “confidence-
weighted mean reversion” (CWMR) (Li et al. 2011b, 2013). Inspired by the mean
reversion principle in finance and confidence-weighted (CW) online machine learning
technique (Crammer et al. 2008; Dredze et al. 2008), CWMR models the portfolio vec-
tor as a Gaussian distribution, and sequentially updates the distribution following the
mean reversion principle. Analysis of CWMR’s closed form updates clearly reflects
the mean reversion trading idea and the interaction of first-order and second-order
information. Extensive experiments, in Part IV, on various real markets show that
CWMR is able to effectively exploit the power of mean reversion and second-order
information, and is superior to the state-of-the-art techniques.
This chapter is organized as follows. Section 10.1 motivates the proposed CWMR
strategy. Section 10.2 formulates the strategy, and Section 10.3 derives the algorithms
based on the formulations. Section 10.4 further analyzes the algorithms. Finally,
Section 10.5 summarizes this chapter and indicates future directions.

10.1 Preliminaries
10.1.1 Motivation
The proposed method, similar to passive–aggressive mean reversion (PAMR), is based
on the mean reversion trading idea, which, in the context of portfolio or multiple assets,
implies that good-performing assets tend to perform worse than others in subsequent
periods, and poor-performing assets are inclined to perform better. Thus, to maximize
the next portfolio return, we could minimize the expected return with respect to
today’s price relatives since next price relatives tend to revert. This seems somewhat
counterintuitive, but, according to Lo and MacKinlay (1990), the effectiveness of
mean reversion is due to the positive cross-autocovariances across assets.

Besides the virtual example in Section 9.1.2, we empirically analyze real market
data to show that mean reversion does exist.∗ Although measuring mean reversion in
a single stock is well studied (Poterba and Summers 1988; Chaudhuri and Wu 2003;
Hillebrand 2003), the study of mean reversion in a portfolio is rare. Since, in our formu-
lation, the portfolio is long-only,† we focus on whether we can obtain a higher return
than the market by investing in poor-performing assets.‡ With a threshold δ, let At
be the set of poor-performing stocks (xt,i < δ), Bt be the set of mean reversion (MR)
stocks (xt,i < δ & xt+1,i > 1), Ct be the set of non–mean reversion (non–MR) stocks
(xt,i < δ & xt+1,i < 1), and Dt be the set of remaining stocks (xt,i < δ & xt+1,i = 1).
On period t, we calculate the percentage of a set U, which can be either A, B,
C, or D, as $P_t(U) = |U_t| / |A_t|$, where $|\cdot|$ denotes the cardinality of a set, and the
gain of uniform investment in the set as $G_t(U) = \sum_{i \in U_t} x_{t+1,i} / |U_t|$, that is, the
return realized on period t + 1. For a total of n periods, we further calculate their average values as
$\bar{P}(U) = \frac{1}{n-1} \sum_{t=1}^{n-1} P_t(U)$ and
$\bar{G}(U) = \frac{1}{n-1} \sum_{t=1}^{n-1} G_t(U)$, respectively. In particular, we refer to the percentage of
mean reversion stocks as P̄ (B), and the gain of mean reversion stocks as Ḡ(B). To
show whether buying poor-performing stocks is profitable, we calculate the average
gain of uniform investment on poor-performing stocks, denoted as Ḡ(A), and the
average gain of uniform investment in the whole market, denoted as Ḡ(Market).
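The statistics above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `mean_reversion_stats` and the toy data are hypothetical, the gain is taken from the next period's price relatives (which is what the values in Table 10.1 imply), and periods with an empty set A_t are skipped because P_t is undefined there.

```python
def mean_reversion_stats(X, delta=0.985):
    """Average mean reversion statistics for a list of price-relative vectors.

    X[t][i] is the price relative of asset i on period t (0-based here).
    """
    m = len(X[0])
    p_b, p_c, g_a, g_market = [], [], [], []
    for today, nxt in zip(X[:-1], X[1:]):
        g_market.append(sum(nxt) / m)           # uniform investment in all assets
        A = [i for i in range(m) if today[i] < delta]
        if not A:
            continue                             # assumption: skip periods with empty A_t
        B = [i for i in A if nxt[i] > 1.0]       # poor performers that reverted
        C = [i for i in A if nxt[i] < 1.0]       # poor performers that kept falling
        p_b.append(len(B) / len(A))
        p_c.append(len(C) / len(A))
        g_a.append(sum(nxt[i] for i in A) / len(A))

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return {"P_B": mean(p_b), "P_C": mean(p_c),
            "G_A": mean(g_a), "G_Market": mean(g_market)}


# Hypothetical toy market: 3 periods, 2 assets.
toy_market = [[0.98, 1.01], [1.02, 0.97], [0.99, 1.03]]
stats = mean_reversion_stats(toy_market)
```

On real data, comparing `stats["P_B"]` with `stats["P_C"]` and `stats["G_A"]` with `stats["G_Market"]` mirrors the two checks discussed for Table 10.1.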
Table 10.1 gives the statistics on six real market daily datasets.§ On the one hand,
except for the DJIA dataset (please refer to Chapter 12 for details), mean reversion
does exist (P̄ (B) > P̄ (C)),¶ and uniform investment on poor-performing stocks pro-
vides a greater profit∗∗ than the market (Ḡ(A) > Ḡ(Market)). On the other hand, the
test failed on the DJIA dataset, and in the following empirical evaluations, CWMR also
failed badly on the dataset, which motivates our next proposed method in Chapter 11.
Moreover, all state-of-the-art approaches only exploit first-order information of a
portfolio vector, while higher-order information may also benefit the portfolio selection
task (Harvey et al. 2010). Evidence (Chopra and Ziemba 1993) shows that in
portfolio selection, errors in variance have about 5% of the impact on the objective value
that errors in mean have. For simplicity, we exploit variance information while ignoring
covariance information, which has a much smaller impact on the final objective
value. To take advantage of both first- and second-order information, we adopt CW
online learning (Crammer et al. 2008; Dredze et al. 2008), which was originally pro-
posed for classification. CW’s basic idea is to maintain a Gaussian distribution for a

∗ The test program and datasets will be available at [Link]


† Long-only means if something is considered undervalued, managers would invest; if something is
considered overvalued, managers would avoid it.
‡ If short is allowed, we can also show whether shorting good-performing stocks provides a higher
return.
§ We list their details in Section 12.2. We empirically choose δ = 0.985 on all datasets. In our
tests, other thresholds yield similar observations. For tests on other frequencies, please refer to
Li et al. (2013).
¶ This indicates a higher probability of reversion, but we have no theoretical guarantee for the criteria.
∗∗ The absolute return in the daily scale is relatively small. However, considering their net return, such a
strategy makes much higher profit than the market does. Moreover, with compounding, such small absolute
differences will result in huge differences over time.

Table 10.1 Summary of mean reversion statistics on real markets

Dataset     P̄(B)     Ḡ(B)       P̄(C)     Ḡ(C)       P̄(D)     Ḡ(A)       Ḡ(Market)
TSE         42.89%   1.022370   41.63%   0.978395   15.48%   1.000598   1.000405
MSCI        54.19%   1.015737   45.05%   0.984046    0.76%   1.001107   1.000053
NYSE (O)    43.43%   1.021599   39.86%   0.981949   16.71%   1.002523   1.000620
NYSE (N)    47.87%   1.019624   43.19%   0.982050    8.93%   1.001644   1.000610
DJIA        48.54%   1.018545   50.57%   0.980843    0.90%   0.999398   0.999719
SP500       50.20%   1.020692   47.96%   0.980502    1.84%   1.000881   1.000488

classifier, and sequentially update the distribution similar to passive–aggressive (PA)
learning (Crammer et al. 2006). Thus, CW learning can take advantage of both first-
and second-order information of the classifier.
To address the above two concerns, we present a novel OLPS method named
CWMR. To exploit the first- and second-order information of a portfolio vector,
we model the portfolio vector as a Gaussian distribution, which is probably the most
widely studied distribution and can satisfy our motivations. We do not consider higher
orders or other distributions because of their complexity. Then, we sequentially update the
distribution following the mean reversion principle. On the one hand, we keep the
previous distribution if the portfolio is profitable by using mean reversion. On the other
hand, we move to a new distribution that is
expected to make a profit while staying close to the previous distribution. Different
from CRP and Anticor, CWMR actively exploits the mean reversion property of
financial markets with a powerful learning method. Moreover, compared with all
existing algorithms, including PAMR, which only consider the first-order information,
CWMR exploits both the first- and second-order information of a portfolio vector.

10.2 Formulations
We model b as a Gaussian distribution with mean $\mu \in \mathbb{R}^m$ and diagonal covariance
matrix $\Sigma \in \mathbb{R}^{m \times m}$ with nonzero diagonal elements and zero off-diagonal elements.
The i-th element of μ represents the proportion of the i-th asset. The i-th diagonal
term of Σ stands for the confidence on the i-th proportion. The smaller the diagonal
term, the higher the confidence we have in the corresponding μ.
At the beginning of period t, we figure out a b based on the distribution $N(\mu, \Sigma)$,
that is, $b \sim N(\mu, \Sigma)$. Then, after $x_t$ is revealed, the wealth increases by a factor
of $b^\top x_t$. It is straightforward that the return $D = b^\top x_t$ can be viewed as a random
variable of the following univariate Gaussian distribution:

$$D \sim N\left(\mu^\top x_t,\; x_t^\top \Sigma x_t\right).$$

Its mean is the return of the mean vector, and its variance is proportional to the projection
of $x_t$ on Σ.

According to the mean reversion idea, the probability of a profitable b with respect
to a predefined mean reversion threshold ε is defined as

$$\Pr_{b \sim N(\mu, \Sigma)}[D \le \epsilon] = \Pr_{b \sim N(\mu, \Sigma)}\left[b^\top x_t \le \epsilon\right].$$

For simplicity, we write $\Pr[b^\top x_t \le \epsilon]$ instead. Note that we are considering the mean
reversion profitability in a portfolio consisting of multiple stocks; thus, this definition
is equivalent to the motivating idea of buying poor-performing stocks or, equivalently,
selling good-performing stocks.
The algorithm adjusts the distribution to ensure that the probability of a mean
reversion profitable b is at least a confidence-level parameter θ ∈ [0, 1]:

$$\Pr\left[b^\top x_t \le \epsilon\right] \ge \theta.$$

This is somewhat counterintuitive but reasonable with respect to the mean reversion
idea. If it is highly probable that the portfolio return $b^\top x_t$ is less than a threshold, it
is also highly probable that its next return based on $x_{t+1}$ tends to be higher since $x_{t+1}$
will revert.
Then, following the intuition underlying PA algorithms (Crammer et al. 2006),
our algorithm chooses a distribution closest to the current distribution $N(\mu_t, \Sigma_t)$ in
terms of Kullback–Leibler (KL) divergence (Kullback and Leibler 1951). As a result,
at the end of period t, the algorithm updates the distribution by solving the following
optimization problem.

The Raw Optimization Problem: CWMR

$$
\begin{aligned}
(\mu_{t+1}, \Sigma_{t+1}) = \arg\min_{\mu, \Sigma}\ & D_{\mathrm{KL}}\big(N(\mu, \Sigma) \,\|\, N(\mu_t, \Sigma_t)\big) \\
\text{s.t.}\ & \Pr\left[b^\top x_t \le \epsilon\right] \ge \theta, \\
& \mu \in \Delta_m.
\end{aligned}
\tag{10.1}
$$

The optimization problem (10.1) clearly reflects our motivation. On the one hand,
if the current $\mu_t$ is mean reversion profitable, that is, the first constraint is satisfied,
CWMR chooses the same distribution, resulting in a passive CRP strategy. On the
other hand, if $\mu_t$ does not satisfy the mean reversion constraint, CWMR tries to
figure out a new distribution, which is expected to profit and is not far from the current
distribution.
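Let us reformulate the objective and constraints. For the objective part, the KL
divergence between two Gaussian distributions can be rewritten as

$$D_{\mathrm{KL}}\big(N(\mu, \Sigma) \,\|\, N(\mu_t, \Sigma_t)\big) = \frac{1}{2}\left(\log\frac{\det \Sigma_t}{\det \Sigma} + \mathrm{Tr}\left(\Sigma_t^{-1}\Sigma\right) + (\mu_t - \mu)^\top \Sigma_t^{-1} (\mu_t - \mu) - d\right),$$

where d is the dimension of the distributions (here, d = m).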
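For a diagonal covariance matrix, which is the case considered in this chapter, the KL divergence reduces to sums over the diagonal entries. The following is a minimal sketch under that assumption; the function name `kl_diag_gaussians` and the representation of each covariance by a list of variances are illustrative choices, not part of the original formulation.

```python
import math


def kl_diag_gaussians(mu, var, mu_t, var_t):
    """KL( N(mu, diag(var)) || N(mu_t, diag(var_t)) ) for diagonal covariances."""
    d = len(mu)
    # log(det Sigma_t / det Sigma) is a sum of log-ratios for diagonal matrices
    log_det_ratio = sum(math.log(vt / v) for v, vt in zip(var, var_t))
    # Tr(Sigma_t^{-1} Sigma)
    trace_term = sum(v / vt for v, vt in zip(var, var_t))
    # (mu_t - mu)^T Sigma_t^{-1} (mu_t - mu)
    quad_term = sum((mt - m) ** 2 / vt for m, mt, vt in zip(mu, mu_t, var_t))
    return 0.5 * (log_det_ratio + trace_term + quad_term - d)
```

The divergence is zero when the two distributions coincide, which corresponds to the passive case in which CWMR keeps the previous distribution.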

For the constraint part, since $b \sim N(\mu, \Sigma)$, $b^\top x_t$ has a univariate Gaussian
distribution with mean $\mu_D = \mu^\top x_t$ and variance $\sigma_D^2 = x_t^\top \Sigma x_t$. Thus, the probability
of a return less than ε is

$$\Pr[D \le \epsilon] = \Pr\left[\frac{D - \mu_D}{\sigma_D} \le \frac{\epsilon - \mu_D}{\sigma_D}\right].$$

In the preceding equation, $\frac{D - \mu_D}{\sigma_D}$ is a standard normally distributed random variable; thus,
the probability equals $\Phi\left(\frac{\epsilon - \mu_D}{\sigma_D}\right)$, where Φ is the cumulative distribution function of
the standard Gaussian distribution. As a result, we can rewrite the constraint as $\frac{\epsilon - \mu_D}{\sigma_D} \ge \Phi^{-1}(\theta)$.
Substituting $\mu_D$ and $\sigma_D$ by their definitions and rearranging the terms, we can obtain

$$\epsilon - \mu^\top x_t \ge \phi \sqrt{x_t^\top \Sigma x_t},$$

where $\phi = \Phi^{-1}(\theta)$. Clearly, we require that the weighted sum of the return and its
standard deviation, $\mu^\top x_t + \phi \sqrt{x_t^\top \Sigma x_t}$, is no greater than the threshold ε. We can now rewrite the preceding
optimization problem.

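The Revised Optimization Problem: CWMR

$$
\begin{aligned}
(\mu_{t+1}, \Sigma_{t+1}) = \arg\min_{\mu, \Sigma}\ & \frac{1}{2}\left(\log\frac{\det \Sigma_t}{\det \Sigma} + \mathrm{Tr}\left(\Sigma_t^{-1}\Sigma\right) + (\mu_t - \mu)^\top \Sigma_t^{-1} (\mu_t - \mu)\right) \\
\text{s.t.}\ & \epsilon - \mu^\top x_t \ge \phi \sqrt{x_t^\top \Sigma x_t}, \\
& \mu^\top \mathbf{1} = 1, \quad \mu \succeq 0.
\end{aligned}
\tag{10.2}
$$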
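The equivalence between the probabilistic constraint and its closed form can be checked numerically. The sketch below assumes a diagonal Σ stored as a list of variances and uses only the Python standard library; the function names are hypothetical and chosen for illustration.

```python
import math
from statistics import NormalDist


def return_moments(mu, var, x):
    """Mean and standard deviation of D = b^T x when b ~ N(mu, diag(var))."""
    mean_d = sum(m * xi for m, xi in zip(mu, x))
    std_d = math.sqrt(sum(v * xi * xi for v, xi in zip(var, x)))
    return mean_d, std_d


def reversion_probability(mu, var, x, eps):
    """Pr[b^T x <= eps], evaluated through the Gaussian CDF."""
    mean_d, std_d = return_moments(mu, var, x)
    return NormalDist(mean_d, std_d).cdf(eps)


def constraint_satisfied(mu, var, x, eps, theta):
    """Closed form: eps - mu^T x >= phi * sqrt(x^T Sigma x), with phi = Phi^{-1}(theta)."""
    phi = NormalDist().inv_cdf(theta)
    mean_d, std_d = return_moments(mu, var, x)
    return eps - mean_d >= phi * std_d
```

For any θ strictly between 0 and 1, `reversion_probability(...) >= theta` and `constraint_satisfied(...)` should agree, apart from floating-point ties exactly at the boundary.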

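For the optimization problem (10.2), the first constraint is not convex in Σ; therefore,
we have two ways to handle it. The first way (Dredze et al. 2008) is to linearize it
by omitting the square root, that is, $\epsilon - \mu^\top x_t \ge \phi\, x_t^\top \Sigma x_t$. As a result, we can finalize
the first optimization problem, named CWMR-Var.

The Final Optimization Problem 1: CWMR-Var

$$
\begin{aligned}
(\mu_{t+1}, \Sigma_{t+1}) = \arg\min_{\mu, \Sigma}\ & \frac{1}{2}\left(\log\frac{\det \Sigma_t}{\det \Sigma} + \mathrm{Tr}\left(\Sigma_t^{-1}\Sigma\right) + (\mu_t - \mu)^\top \Sigma_t^{-1} (\mu_t - \mu)\right) \\
\text{s.t.}\ & \epsilon - \mu^\top x_t \ge \phi\, x_t^\top \Sigma x_t, \\
& \mu^\top \mathbf{1} = 1, \quad \mu \succeq 0.
\end{aligned}
\tag{10.3}
$$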

The second reformulation (Crammer et al. 2008) is to decompose the positive
semidefinite (PSD) Σ, that is, $\Sigma = \Upsilon^2$ with $\Upsilon = Q\,\mathrm{diag}\left(\lambda_1^{1/2}, \ldots, \lambda_m^{1/2}\right) Q^\top$, where
Q is orthonormal and $\lambda_1, \ldots, \lambda_m$ are the eigenvalues of Σ, and thus Υ is also PSD. This
reformulation yields the second final optimization problem, named CWMR-Stdev.

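The Final Optimization Problem 2: CWMR-Stdev

$$
\begin{aligned}
(\mu_{t+1}, \Upsilon_{t+1}) = \arg\min_{\mu, \Upsilon}\ & \frac{1}{2}\left(\log\frac{\det \Upsilon_t^2}{\det \Upsilon^2} + \mathrm{Tr}\left(\Upsilon_t^{-2}\Upsilon^2\right) + (\mu_t - \mu)^\top \Upsilon_t^{-2} (\mu_t - \mu)\right) \\
\text{s.t.}\ & \epsilon - \mu^\top x_t \ge \phi \left\|\Upsilon x_t\right\|, \quad \Upsilon \text{ is PSD}, \\
& \mu^\top \mathbf{1} = 1, \quad \mu \succeq 0.
\end{aligned}
\tag{10.4}
$$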

Clearly, the revised optimization problem (10.2) is equivalent to the raw optimiza-
tion problem (10.1). From the revised problem, we proposed two final optimization
problems, Equations 10.3 and 10.4, which are convex and thus can be efficiently
solved by convex optimization (Boyd and Vandenberghe 2004). The first variation,
CWMR-Var, linearizes the constraint; thus, it results in an approximate solution for the
revised and the raw optimizations. In contrast, the second variation, CWMR-Stdev, is
equivalent to the revised optimization problem (10.2) and results in an exact solution
for both the revised and raw optimization problems.
Remarks on Formulations: Note that the short version of this chapter (Li et al.
2011b) assumes log utility (Bernoulli 1954; Latané 1959) on $\mu^\top x_t$ and is slightly
different from this version (Li et al. 2013). Since both ε and φ are adjustable, they have
similar effects on μ. Assuming other parameters are constant except μ, as $\mu^\top x_t >
\log \mu^\top x_t$, the current linear form can move μ toward the mean reversion profitable
portfolio more than the log form can. However, the log form in this constraint causes
another convexity issue besides the standard deviation on the right-hand side. To solve
the optimization problem with log, Li et al. (2011b) chose to replace the log term by
its linear approximation, which may converge to a different solution. Moreover, the
current form and log’s linear approximation are essentially the same.∗ Thus, we adopt
return without log, which avoids the above convexity issues concerning log and its linear
approximation.

10.3 Algorithms
Now, let us devise the proposed algorithms based on the optimization problems in
Equations 10.3 and 10.4. Their solutions are shown in Propositions 10.1 and 10.2,
respectively. The proofs are presented in Appendices B.3.1 and B.3.2, respectively.

Proposition 10.1 The solution to the final optimization problem (10.3) (CWMR-Var)
without considering the non-negativity constraint ($\mu \succeq 0$) is expressed as

$$\mu_{t+1} = \mu_t - \lambda_{t+1} \Sigma_t (x_t - \bar{x}_t \mathbf{1}), \qquad \Sigma_{t+1}^{-1} = \Sigma_t^{-1} + 2\lambda_{t+1} \phi\, x_t x_t^\top,$$

where $\lambda_{t+1}$ corresponds to the Lagrangian multiplier calculated as Equation B.10 in
Appendix B.3.1 and $\bar{x}_t = \frac{\mathbf{1}^\top \Sigma_t x_t}{\mathbf{1}^\top \Sigma_t \mathbf{1}}$ denotes the CW average of $x_t$.

∗ Current form is $\mu^\top x_t$. Li et al. (2011b) uses approximation, that is, $\log \mu^\top x_t \approx \log \mu_t^\top x_t + \frac{x_t \cdot (\mu - \mu_t)}{\mu_t^\top x_t}$.
Both terms are linear, although their scales are different.

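Proposition 10.2 The solution to the final optimization problem (10.4) (CWMR-
Stdev) without considering the non-negativity constraint ($\mu \succeq 0$) is expressed as

$$\mu_{t+1} = \mu_t - \lambda_{t+1} \Sigma_t (x_t - \bar{x}_t \mathbf{1}), \qquad \Sigma_{t+1}^{-1} = \Sigma_t^{-1} + \lambda_{t+1} \frac{\phi}{\sqrt{U_t}}\, x_t x_t^\top,$$

where $\lambda_{t+1}$ denotes the Lagrangian multiplier calculated as Equation B.14 in
Appendix B.3.2, $\bar{x}_t = \frac{\mathbf{1}^\top \Sigma_t x_t}{\mathbf{1}^\top \Sigma_t \mathbf{1}}$ represents the CW average of $x_t$, and $V_t = x_t^\top \Sigma_t x_t$ and
$U_t$, with $\sqrt{U_t} = \frac{-\lambda_{t+1} \phi V_t + \sqrt{\lambda_{t+1}^2 \phi^2 V_t^2 + 4 V_t}}{2}$, denote the return variances for period t and t + 1,
respectively.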

Initially, with no information available for the task, we simply initialize $\mu_1$ to
uniform and each diagonal element of the covariance matrix $\Sigma_1$ to $\frac{1}{m^2}$, or equivalently
a standard deviation of $\frac{1}{m}$. Note that we solve the optimization problems by ignoring
the non-negativity constraint ($\mu \succeq 0$), which is a typical way to reduce the complexity
(Helmbold et al. 1998; Agarwal et al. 2006). To handle the issue that μ can
be negative, we simply project the resulting μ to the simplex domain (Agarwal et al.
2006). In the context of investment, this means that we first allow shorting, and later
lower the leverage∗ by a simplex projection. Another remaining issue is that, although
the covariance matrix is nonsingular in theory, in real computation, Σ sometimes may
be singular due to computer precision. To avoid this problem and be consistent with
the projection of μ, we rescale Σ by normalizing the sum of its diagonal elements to $\frac{1}{m}$, which
equals the sum of the diagonal elements of $\Sigma_1$ (and each element of $\mu_1$). Note that we arbitrarily choose $\frac{1}{m}$, while one
can choose other values, which generally do not affect the performance too much.
The final CWMR algorithms are presented in Algorithm 10.1, and OLPS with both
deterministic and stochastic CWMR algorithms is illustrated in Algorithm 10.2.
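The two normalization steps can be sketched as follows. This is an illustrative sketch, not the authors' MATLAB routine: `project_to_simplex` uses the simpler sort-based variant of the projection described by Duchi et al. (2008), which runs in O(m log m) rather than linear time, and both function names are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {b : b >= 0, sum(b) = 1} (sort-based variant)."""
    u = sorted(v, reverse=True)
    cumulative, tau = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumulative += uj
        candidate = (cumulative - 1.0) / j
        if uj - candidate > 0:          # holds exactly for the first rho sorted entries
            tau = candidate
    return [max(vi - tau, 0.0) for vi in v]


def normalize_variances(var):
    """Rescale diagonal variances so that they sum to 1/m."""
    m = len(var)
    total = sum(var)
    return [v / (m * total) for v in var]


# A mean vector that left the simplex (one negative weight) is pulled back onto it.
projected = project_to_simplex([-2.049, 3.049])
rescaled = normalize_variances([1.0, 4.0])
```

In this example the negative weight is clipped to zero and the remaining asset receives the full weight, which is the sparse behavior seen in the running example of Table 10.2.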
The algorithms have two parameters, that is, confidence parameter φ and
mean reversion parameter ε. Typically, the first parameter, φ, can be 1.28, 1.64, 1.95,
or 2.57, which are the (truncated) two-sided Gaussian critical values for confidence levels
of 80%, 90%, 95%, or 99%; under the one-sided definition $\phi = \Phi^{-1}(\theta)$, they correspond
to θ values of 90%, 95%, 97.5%, or 99.5%. As we have tested,
φ does not affect the final performance too much. On the contrary, ε has a significant
impact on the final performance. As our model is long-only,† we put more weights on
the poor-performing assets; thus, ε is often in the range of [0, 1]. On the one hand, if the
value is too large, such as ε ≥ 1.2, the last portfolio distribution can always satisfy
the constraint and requires no updates. In such a situation, with an initial uniform
portfolio, CWMR will degrade to uniform CRP. On the other hand, if the value is
too small, such as ε ≤ 0.5, the last distribution always violates the constraint
and has to be updated every period. In between, CWMR updates the distribution when
the last distribution cannot satisfy the constraint. We will further validate the above
analysis by evaluating its parameter effect in Section 13.3.3.
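Under the one-sided definition $\phi = \Phi^{-1}(\theta)$ used in the formulation, the confidence parameter can be computed from θ with the standard library. The helper name `confidence_parameter` is hypothetical; the θ values in the loop are the ones for which the inverse CDF reproduces the quoted φ values up to truncation.

```python
from statistics import NormalDist


def confidence_parameter(theta):
    """phi = Phi^{-1}(theta), the standard normal quantile at level theta (0 < theta < 1)."""
    return NormalDist().inv_cdf(theta)


for theta in (0.90, 0.95, 0.975, 0.995):
    print(f"theta = {theta:.3f} -> phi = {confidence_parameter(theta):.4f}")
```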

∗ In investment, notional leverage denotes total holding assets plus total notional amount of liability
divided by equity. If shorting is allowed, the notional leverage equals $\sum_i |b_i|$ to 1. The problem setting in
our study is long-only, in which we do not allow shorting/margin; thus, the leverage is always 1 to 1.
† Long-only means no shorting/margin is allowed; thus, the notional leverage is always 1 to 1.


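Algorithm 10.1: Confidence-Weighted Mean Reversion: CWMR(φ, ε, ($\mu_t$, $\Sigma_t$), $x_1^t$, t).
Input: φ: Confidence parameter; ε ∈ [0, 1]: Mean reversion parameter;
       ($\mu_t$, $\Sigma_t$): Current portfolio distribution; $x_1^t$: Historical market
       sequence; t: Index of current trading period.
Output: ($\mu_{t+1}$, $\Sigma_{t+1}$): Next portfolio distribution.
begin
    Calculate the following variables:
        $M_t = \mu_t^\top x_t$, $V_t = x_t^\top \Sigma_t x_t$, $W_t = x_t^\top \Sigma_t \mathbf{1}$, $\bar{x}_t = \frac{\mathbf{1}^\top \Sigma_t x_t}{\mathbf{1}^\top \Sigma_t \mathbf{1}}$
    Update the portfolio distribution:
        CWMR-Var:
            $\lambda_{t+1}$ as in Equation B.10 in Appendix B.3.1
            $\mu_{t+1} = \mu_t - \lambda_{t+1} \Sigma_t (x_t - \bar{x}_t \mathbf{1})$
            $\Sigma_{t+1} = \left(\Sigma_t^{-1} + 2\lambda_{t+1} \phi\, \mathrm{diag}^2(x_t)\right)^{-1}$
        CWMR-Stdev:
            $\lambda_{t+1}$ as in Equation B.14 in Appendix B.3.2
            $\sqrt{U_t} = \frac{-\lambda_{t+1} \phi V_t + \sqrt{\lambda_{t+1}^2 \phi^2 V_t^2 + 4 V_t}}{2}$
            $\mu_{t+1} = \mu_t - \lambda_{t+1} \Sigma_t (x_t - \bar{x}_t \mathbf{1})$
            $\Sigma_{t+1} = \left(\Sigma_t^{-1} + \lambda_{t+1} \frac{\phi}{\sqrt{U_t}}\, \mathrm{diag}^2(x_t)\right)^{-1}$
    Normalize $\mu_{t+1}$ and $\Sigma_{t+1}$:
        $\mu_{t+1} = \arg\min_{\mu \in \Delta_m} \|\mu - \mu_{t+1}\|^2$, $\Sigma_{t+1} = \frac{\Sigma_{t+1}}{m\,\mathrm{Tr}(\Sigma_{t+1})}$
end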
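The update step of Algorithm 10.1 (before normalization) can be sketched as below for a diagonal Σ. Note the assumptions: the Lagrangian multiplier `lam` is passed in by the caller, because its closed form (Equations B.10 and B.14) lives in the appendix and is not reproduced here; the function name `cwmr_update` and the `variant` flag are hypothetical.

```python
import math


def cwmr_update(mu, var, x, lam, phi, variant="var"):
    """One unnormalized CWMR update; var holds the diagonal of Sigma_t.

    lam is the Lagrangian multiplier lambda_{t+1}, supplied by the caller.
    """
    # Confidence-weighted average of the price relatives
    x_bar = sum(v * xi for v, xi in zip(var, x)) / sum(var)
    # Mean update: move against the excess return, scaled by each variance
    new_mu = [m - lam * v * (xi - x_bar) for m, v, xi in zip(mu, var, x)]
    if variant == "var":
        phi_prime = 2.0 * phi
    else:  # "stdev"
        V = sum(v * xi * xi for v, xi in zip(var, x))
        sqrt_U = (-lam * phi * V + math.sqrt(lam ** 2 * phi ** 2 * V ** 2 + 4.0 * V)) / 2.0
        phi_prime = phi / sqrt_U
    # Inverse-covariance update restricted to the diagonal
    new_var = [1.0 / (1.0 / v + lam * phi_prime * xi * xi) for v, xi in zip(var, x)]
    return new_mu, new_var


# First row of the running example later in this chapter (Table 10.2): lambda = 40.78.
# phi = 2.0 is an arbitrary illustrative choice, since the table does not state it.
new_mu, new_var = cwmr_update([0.5, 0.5], [0.25, 0.25], [1.0, 0.5],
                              lam=40.78, phi=2.0, variant="stdev")
```

Because $\mathbf{1}^\top \Sigma_t (x_t - \bar{x}_t \mathbf{1}) = 0$, the updated mean still sums to one; only the non-negativity constraint can be violated, which is why the subsequent simplex projection is needed.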

10.4 Analysis
In this section, we analyze and interpret the proposed algorithms. Firstly, we compare
CWMR with CW learning (Crammer et al. 2008; Dredze et al. 2008). Then, we
analyze CWMR’s update schemes, that is, μ and Σ, with running examples. Further,
we describe the behavior of stochastic CWMR. Finally, we show its computational
time and compare it with existing work.
The proposed CWMR algorithms are partially motivated by CW learning, thus
their formulations and subsequent derivations are similar. However, they address
different problems, as CWMR handles OLPS while CW focuses on classification.
Although both objectives adopt KL divergence to measure the closeness between two
distributions, their constraints reflect that they are oriented toward different problems.
To be specific, CW’s constraint is the probability of a correct classification, while


Algorithm 10.2: Online portfolio selection with CWMR.
Input: $\phi = \Phi^{-1}(\theta)$: Confidence parameter; ε ∈ [0, 1]: Mean reversion
       parameter; $x_1^n$: Historical market sequence.
Output: $S_n$: Final cumulative wealth.
begin
    Initialization: t = 1, $\mu_1 = \frac{1}{m}\mathbf{1}$, $\Sigma_1 = \frac{1}{m^2} I$, $S_0 = 1$;
    for t = 1, . . . , n do
        Draw a portfolio $b_t$ from $N(\mu_t, \Sigma_t)$:
            Deterministic CWMR: $b_t = \mu_t$
            Stochastic CWMR: $\tilde{b}_t \sim N(\mu_t, \Sigma_t)$, $b_t = \arg\min_{b \in \Delta_m} \|b - \tilde{b}_t\|^2$
        Receive stock price relatives: $x_t = (x_{t1}, \ldots, x_{tm})$;
        Calculate the daily return and cumulative return: $S_t = S_{t-1} \times (b_t^\top x_t)$;
        Update the portfolio distribution:
            $(\mu_{t+1}, \Sigma_{t+1}) = \mathrm{CWMR}(\phi, \epsilon, (\mu_t, \Sigma_t), x_1^t, t)$;
    end
end

CWMR’s constraints are the probability of an underperforming portfolio plus the
simplex constraint. If there is mean reversion, the portfolio should be a profitable
one in the next period. The differences in their formulations result in different subsequent
derivations.
Then, we provide a preliminary analysis on μ, which is the main concern for
CWMR, to reflect its underlying mean reversion idea. Both CWMR-Var and CWMR-
Stdev share the same update on μ, that is,

$$\mu_{t+1} = \mu_t - \lambda_{t+1} \Sigma_t (x_t - \bar{x}_t \mathbf{1}).$$

Straightforwardly, we can rewrite its i-th term as $\mu_{t+1,i} = \mu_{t,i} - \lambda_{t+1} \sigma_{t,i}^2 (x_{t,i} - \bar{x}_t)$, where
$\sigma_{t,i}^2$ is the i-th diagonal element of $\Sigma_t$. Obviously, $\lambda_{t+1}$ is non-negative and $\Sigma_t$ is PSD. The term $x_t - \bar{x}_t \mathbf{1}$ denotes the excess return
vector for period t, where $\bar{x}_t$ is the CW average of $x_t$. Holding other terms constant,
$\mu_{t+1}$ tends to move toward $\mu_t$, while the magnitude is negatively related to the last
excess return, which is the mean reversion principle. Meanwhile, these movements
are dynamically adjusted by $\lambda_{t+1}$, the last covariance matrix $\Sigma_t$ and mean $\mu_t$, which
catch both first- and second-order information. To the best of our knowledge, none
of the existing algorithms has explicitly exploited the second-order information of b,
even though the second-order information could benefit the proposed algorithms.
Let us continue to analyze Σ. With only nonzero diagonal elements, we can write
the update of the i-th variance as

$$\sigma_{t+1,i}^2 = \frac{\sigma_{t,i}^2}{1 + \lambda_{t+1} \phi' x_{t,i}^2 \sigma_{t,i}^2},$$

where $\phi' = 2\phi$ for CWMR-Var and $\phi' = \frac{\phi}{\sqrt{U_t}}$ for CWMR-Stdev. Since both $\lambda_{t+1}$
and $\phi'$ are positive, poor-performing assets (with lower values of $x_{t,i}$) have higher
variance terms than good-performing ones (with higher $x_{t,i}$). Note that Σ denotes the
covariance matrix of b rather than x. Thus, a higher value means that the corresponding
mean is more volatile than others. Since we move the weights from good-performing
assets to poor-performing ones, the latter would change more than the former, that
is, the latter has higher volatility. In the next update of μ, assets with high volatility
would actively magnify the movement magnitude.
To better illustrate the updates, we give running updates based on a classic example
(Cover and Gluss 1986). Let a market consist of cash and one volatile asset, and
the sequence of x is $(1, \tfrac{1}{2}), (1, 2), (1, \tfrac{1}{2}), \ldots$. Obviously, the market strategy can gain
nothing since no asset grows in the long run. The best CRP strategy, with $b = (\tfrac{1}{2}, \tfrac{1}{2})$,
grows to $(\tfrac{9}{8})^{n/2}$ at the end of n periods. However, starting with $\mu_0 = (\tfrac{1}{2}, \tfrac{1}{2})$, the CWMR
strategy can grow to $\tfrac{3}{4} \times 2^{(n-1)/2}$ after the n-th period. Table 10.2 shows the running details
for the initial five periods, and further details can be derived. On each period t + 1,
the mean moves toward the last mean and also moves far away by the excess return
vector $(x_t - \bar{x}_t \mathbf{1})$, and its magnitude is determined by both $\lambda_t$ and $\Sigma_t$. Note that, in
this example, μ before projection is out of the simplex domain and is forced sparse
via normalization, which is not a usual case in real tests. In summary, both the first-
and second-order information contribute to CWMR’s success.
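The growth rates quoted for this example can be reproduced by replaying the portfolios. In the sketch below, the CWMR portfolios are copied from the b_t column of Table 10.2 rather than computed, and the helper names are hypothetical.

```python
def cumulative_wealth(portfolios, price_relatives):
    """Multiply the per-period returns b_t^T x_t, starting from unit wealth."""
    wealth = 1.0
    for b, x in zip(portfolios, price_relatives):
        wealth *= sum(bi * xi for bi, xi in zip(b, x))
    return wealth


def covers_game(n):
    """Price relatives (cash, volatile asset): the asset halves, then doubles, and so on."""
    return [(1.0, 0.5) if t % 2 == 0 else (1.0, 2.0) for t in range(n)]


# Best CRP over an even number of periods: (9/8)^(n/2)
crp_wealth = cumulative_wealth([(0.5, 0.5)] * 4, covers_game(4))

# CWMR portfolios b_1, ..., b_5 as listed in Table 10.2
cwmr_portfolios = [(0.5, 0.5), (0.0, 1.0), (1.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
cwmr_wealth = cumulative_wealth(cwmr_portfolios, covers_game(5))
```

After the first period, CWMR holds the volatile asset exactly when it doubles and holds cash when it halves, so its wealth doubles every two periods while the best CRP gains only a factor of 9/8.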
Then, let us compare deterministic CWMR with the stochastic version (Line 4
in Algorithm 10.2), which draws a portfolio based on both the mean and covariance
matrix. Interestingly, Σ negatively affects CWMR’s performance in several aspects.
Firstly, a stochastic b drawn from the distribution is always different from the optimal
mean μ, which obviously causes performance divergences. Given that Σ converges
to the zero matrix (see the recursive updates in the two propositions), the distribution
of b conditioning on the data converges to the point mass at the mean parameter
value, $\mu = \lim_t \mu_t$. Thus, drawing weights b from the distribution (the stochastic
version) is suboptimal, since we already have an estimate of μ. It is better to choose b
as either the mode or mean (incidentally, the same for the Gaussian case), which is
actually deterministic CWMR. Another effect caused by the stochastic behavior is

Table 10.2 A running example of CWMR-Stdev on the Cover’s game

t    x_t          b_t          b_t⊤x_t   λ_t     x_t − x̄_t 1      diag(Σ_t)      μ_t
0                                                                 (0.25, 0.25)   (0.5, 0.5)
1    (1.0, 0.5)   (0.5, 0.5)   0.75      40.78   (0.25, −0.25)    (0.10, 0.40)   (0.0, 1.0)
2    (1.0, 2.0)   (0.0, 1.0)   2.00      61.61   (−0.80, 0.20)    (0.40, 0.10)   (1.0, 0.0)
3    (1.0, 0.5)   (1.0, 0.0)   1.00      75.56   (0.10, −0.40)    (0.10, 0.40)   (0.0, 1.0)
4    (1.0, 2.0)   (0.0, 1.0)   2.00      61.61   (−0.80, 0.20)    (0.40, 0.10)   (1.0, 0.0)
5    (1.0, 0.5)   (1.0, 0.0)   1.00      75.56   (0.10, −0.40)    (0.10, 0.40)   (0.0, 1.0)
...  ...          ...          ...       ...     ...              ...            ...

Table 10.3 Summary of time complexity analysis

Methods   Time Complexity         Methods              Time Complexity
UP        $O(n^m)$/$O(m^7 n^8)$   SP/GRW/M0            $O(mn)$
EG        $O(mn)$                 Anticor              $O(N^3 m^2 n)$
ONS       $O(m^3 n)$              $B^K$/$B^{NN}$/CORN  $O(N^2 m n^2) + O(N m n^2)$
PAMR      $O(mn)$                 CWMR                 $O(mn)$

the additional projection, as sometimes the stochastic b may be out of the simplex
domain. To better understand the two aspects, let us continue the Cover’s game in
Table 10.2. For the first case, let μ = (0.5, 0.5) and diag(Σ) = (0.25, 0.25). We draw
a stochastic b 10,000 times, and the average b after projection is (0.5038, 0.4962)
(before projection, the value is (0.5070, 0.4993)), which slightly deviates from the
optimal mean and will result in different performance. For the second case, let μ =
(0, 1) and diag(Σ) = (0.1, 0.4). We draw and project 10,000 stochastic b, and get an
average b of (0.1391, 0.8609), which is far from the optimal mean (0, 1). In both cases,
stochastic CWMR tends to deviate from the optimal mean, and thus underperforms
the deterministic one, which is shown in the related experiments (Li et al. 2013,
Table VII).
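The second sampling experiment can be approximated with a short simulation. This is a sketch under stated assumptions: diag(Σ) is interpreted as variances, the projection is the sort-based variant of Duchi et al. (2008), the seed is arbitrary, and the function names are hypothetical. Exact averages depend on the random draws, so they will differ slightly from the figures quoted above.

```python
import math
import random


def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based variant)."""
    u = sorted(v, reverse=True)
    cumulative, tau = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumulative += uj
        candidate = (cumulative - 1.0) / j
        if uj - candidate > 0:
            tau = candidate
    return [max(vi - tau, 0.0) for vi in v]


def average_projected_draw(mu, var, draws=10_000, seed=0):
    """Average of projected samples b ~ N(mu, diag(var))."""
    rng = random.Random(seed)
    totals = [0.0] * len(mu)
    for _ in range(draws):
        sample = [rng.gauss(m, math.sqrt(v)) for m, v in zip(mu, var)]
        for i, bi in enumerate(project_to_simplex(sample)):
            totals[i] += bi
    return [s / draws for s in totals]


average_b = average_projected_draw([0.0, 1.0], [0.1, 0.4])
```

The projection can only push weight away from the boundary mean (0, 1), so the average lands strictly inside the simplex, which illustrates why the stochastic version deviates from the deterministic one.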
Since computational time is of crucial importance for certain trading scenarios,
such as high-frequency trading (Aldridge 2010), which can occur in fractions of a
second, we finally show CWMR’s time complexity. In the implementation, we only
consider the diagonal elements of Σ; thus, its inverse costs linear time. Moreover,
the projection (Line 3 in Algorithm 10.1) can be implemented in O(m) time (Duchi
et al. 2008). Thus, in total, CWMR algorithms (Algorithm 10.1) take O(m) time per
period. Straightforwardly, OLPS with CWMR (Algorithm 10.2) takes O(mn) time.
Table 10.3 compares CWMR’s time complexity with that of existing strategies.∗
Clearly, CWMR takes no more time than any others.

10.5 Summary
In this chapter, we proposed a novel online portfolio selection (OLPS) strategy named
confidence-weighted mean reversion (CWMR), which effectively learns portfolios
by exploiting the mean reversion property in financial markets and the second-order
information of a portfolio. CWMR’s update schemes are obtained by solving two
optimization problems that consider both first- and second-order information of a
portfolio vector, which goes beyond any existing approaches that only consider first-
order information. As shown in Part IV, the proposed approach beats a number of
∗ Nonparametric learning approaches ($B^K$, $B^{NN}$, and CORN) need to solve a nonlinear optimization
each period, that is, $b_{t+1} = \arg\max_{b \in \Delta_m} \prod_i (b^\top x_i)$, whose time complexity is generally high. To produce
an approximate solution, batch gradient projection algorithms (Helmbold et al. 1997) take $O(mn)$, while the
batch convex Newton method (Agarwal et al. 2006) takes $O(m^3 n)$. In the table, we assume this step takes $O(mn)$
time. In our implementation, we adopt MATLAB Optimization Toolbox™ (function fmincon
with active-set) to obtain exact solutions.

competing state-of-the-art approaches on various up-to-date datasets collected from
the real market.
In the future, we first plan to study in detail the cause behind the existence of the mean
reversion property in the financial markets. This will help us to further understand the
nature of the markets. Second, we also intend to explore the possibility of combining
both the trend-following and mean reversion principles to provide more practically
effective solutions. Finally, we note that an interesting future direction is to extend
our analysis to long-short portfolios.∗

∗ Long-short portfolios can have negative weights, which denote the short positions.



Chapter 11

Online Moving Average Reversion

Empirical evidence shows that a stock’s high and low prices are temporary, and stock
price relatives are likely to follow the mean reversion phenomenon. While exist-
ing mean reversion strategies can achieve good empirical performance on many real
datasets, they often make a single-period mean reversion assumption, which is not
always satisfied, leading to poor performance on some real datasets. To overcome the
limitation, this chapter (Li et al. 2015) proposes a multiple-period mean reversion,
or the so-called moving average reversion (MAR), and a new online portfolio selec-
tion (OLPS) strategy named the online moving average reversion (OLMAR), which
exploits MAR by applying powerful online learning techniques. Our empirical eval-
uations in Part IV show that OLMAR can overcome the drawbacks of existing mean
reversion algorithms and achieve significantly better results, especially on the datasets
where existing mean reversion algorithms failed. In addition to superior performance,
OLMAR also runs extremely fast, further supporting its practical applicability to a
wide range of applications.
This chapter is organized as follows. Section 11.1 analyzes existing works and
motivates the proposed strategy. Section 11.2 formulates the strategy, and Section 11.3
solves the formulations and derives the algorithms. Section 11.4 further analyzes the
proposed algorithm. Finally, Section 11.5 summarizes this chapter and indicates future
directions.

11.1 Preliminaries
11.1.1 Related Work
Most existing formulations follow the basic routine of Kelly-based portfolio selection
(Kelly 1956; Thorp 1971). In particular, a portfolio manager predicts $\tilde{x}_{t+1}$ in terms
of k possible values $\tilde{x}_{t+1}^1, \ldots, \tilde{x}_{t+1}^k$ and their corresponding probabilities $p_1, \ldots, p_k$.
Note that each $\tilde{x}_{t+1}^i$ denotes one possible combination vector of individual price relative
predictions. Then, he or she can figure out a portfolio by maximizing the expected
log return,

$$b_{t+1} = \arg\max_{b \in \Delta_m} \sum_{i=1}^{k} p_i \log\left(b \cdot \tilde{x}_{t+1}^i\right).$$
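As a small illustration of the Kelly-style objective, the sketch below evaluates the expected log return for a given portfolio and finds the best two-asset portfolio by grid search. The grid search is only a stand-in for a proper convex solver, it is limited to two assets, and all function names and the scenario data are hypothetical.

```python
import math


def expected_log_return(b, scenarios, probs):
    """sum_i p_i * log(b . x_i) over the predicted price-relative vectors x_i."""
    return sum(p * math.log(sum(bi * xi for bi, xi in zip(b, x)))
               for x, p in zip(scenarios, probs))


def best_two_asset_portfolio(scenarios, probs, steps=1000):
    """Grid search over b = (1 - w, w) with w in [0, 1]; two assets only."""
    best_w = max((k / steps for k in range(steps + 1)),
                 key=lambda w: expected_log_return((1.0 - w, w), scenarios, probs))
    return (1.0 - best_w, best_w)


# Two equally likely predictions: the second asset either halves or doubles.
scenarios = [(1.0, 0.5), (1.0, 2.0)]
probs = [0.5, 0.5]
kelly_portfolio = best_two_asset_portfolio(scenarios, probs)
```

With these two symmetric scenarios, the maximizer splits wealth evenly between the two assets.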

Based on the methods to predict $\tilde{x}_{t+1}^i$ and $p_i$, most existing algorithms can be classified
into three categories. Table 11.1 summarizes their optimization formulations
and underlying prediction schemes, whose details can be found in their respective
studies. Note that we have transformed certain formulations without changing their
key ideas.
Now let us introduce the three categories according to their empirical perfor-
mance. The second category, which consists of successive constant rebalanced
portfolio (SCRP) and online Newton step (ONS), assumes that the predictions consist
of all historical price relatives with uniform distribution. That is, at period t + 1, the
price relative vector may be $x_i$, i = 1, . . . , t, with a probability of $\frac{1}{t}$. In other words,
this category aims to model the next price relatives as their historical average. The
algorithms in this category all present good theoretical regret bound and are universal.
However, their empirical results show that such an assumption may be inappropriate
to model the market behaviors. The third category, which mainly consists of the pat-
tern matching–based algorithms, models the next price relatives as a sampled set of
similar price relatives. In particular, denoting the similar index set as Ct , it models the
next price relative vector as xi , i ∈ Ct , with a uniform probability of |C1t | . The algo-
rithms in this category (except correlation-driven nonparametric learning [CORN])
enjoy the universal consistency property, and their empirical results also show that
such an assumption can explain the markets well.
Algorithms in the first category, which consists of exponential gradient (EG),
passive–aggressive mean reversion (PAMR), and confidence-weighted mean reversion
(CWMR), assume a single prediction value with a probability of 100% and
maintain previous portfolio information via regularization techniques. In particular,
EG assumes $\tilde{x}^1_{t+1} = x_t$ with $p_1 = 100\%$, while PAMR and CWMR assume
$\tilde{x}^1_{t+1} = \frac{1}{x_t}$∗ with $p_1 = 100\%$, which is in essence mean reversion. Note that
the formulations of PAMR and CWMR ignore the log utility due to the single-value prediction
and the consideration of convexity and computation. Though all three algorithms
assume that all information is fully reflected by $x_t$, their performance diverges and
suggests that mean reversion may better explain the markets. On the one hand, even
with a decent theoretical result, EG always performs poorly. On the other hand, though
without theoretical guarantees, PAMR and CWMR have produced the best results in
certain real markets. However, when such a single-period mean reversion assumption
is not satisfied, PAMR and CWMR would suffer from dramatic failures (Li et al.
2012, Table 4, the DJIA dataset), which motivates the following approach.

11.1.2 Motivation
Empirical results (Li et al. 2011b, 2012) show that mean reversion, which assumes
that a poorly performing stock may perform well in the subsequent periods, may better
explain the markets. PAMR and CWMR can exploit the mean reversion property well and
achieved good results on most datasets at the time, especially on the New York Stock
Exchange benchmark dataset (Cover 1991). However, they rely on a naïve assumption that
the next price relative $\tilde{x}_{t+1}$ will be inversely proportional to the last price
relative $x_t$. In particular, they implicitly assume that the next price $\tilde{p}_{t+1}$ will
revert to the last price $p_{t-1}$,

$$\tilde{x}_{t+1} = \frac{1}{x_t} \implies \frac{\tilde{p}_{t+1}}{p_t} = \frac{p_{t-1}}{p_t} \implies \tilde{p}_{t+1} = p_{t-1}.$$

Note that both $x$ and $p$ are vectors and the above operations are element-wise.

∗ This assumption requires some transformations. That is, given $x_t \in \mathbb{R}^m_+$,
minimizing $b \cdot x_t$ is equivalent to maximizing $b \cdot \frac{1}{x_t}$. The latter follows
the analysis framework here.

Table 11.1 Summary of existing optimization formulations and their underlying predictions

Categories    Methods              Formulations                                                                                      Prediction ($\tilde{x}^i_{t+1}$)   Probability ($p_i$)
In hindsight  BCRP                 $b_{t+1} = \arg\max_{b \in \Delta_m} \sum_{i=1}^{n} \frac{1}{n} \log(b \cdot x_i)$                $x_i$, $i = 1, \ldots, n$          $1/n$
1             EG                   $b_{t+1} = \arg\max_{b \in \Delta_m} \log(b \cdot x_t) - \lambda R(b, b_t)$                       $x_t$                              1.00
1             PAMR                 $b_{t+1} = \arg\min_{b \in \Delta_m} b \cdot x_t + \lambda R(b, b_t)$                             $1/x_t$                            1.00
1             CWMR                 $b_{t+1} = \arg\min_{b \in \Delta_m} \mathrm{Prob}(b \cdot x_t) + \lambda R(b, b_t)$              $1/x_t$                            1.00
2             SCRP                 $b_{t+1} = \arg\max_{b \in \Delta_m} \sum_{i=1}^{t} \frac{1}{t} \log(b \cdot x_i)$                $x_i$, $i = 1, \ldots, t$          $1/t$
2             ONS                  $b_{t+1} = \arg\max_{b \in \Delta_m} \sum_{i=1}^{t} \frac{1}{t} \log(b \cdot x_i) - \lambda R(b)$ $x_i$, $i = 1, \ldots, t$          $1/t$
3             $B^K$/$B^{NN}$/CORN  $b_{t+1} = \arg\max_{b \in \Delta_m} \sum_{i \in C_t} \frac{1}{|C_t|} \log(b \cdot x_i)$          $x_i$, $i \in C_t$                 $1/|C_t|$

Note: $R(\cdot)$ and $R(\cdot, \cdot)$ denote regularization terms, such as the L2 norm.
PAMR/CWMR's prediction is not strictly equivalent, which we do not prove.

Though empirically effective on most datasets, PAMR and CWMR’s single-
period assumption causes two potential problems. Firstly, both algorithms suffer
from frequently fluctuating raw prices, as they often contain a lot of noise. Secondly,
their assumption of single-period mean reversion may not always be satisfied in the
real world. Even two consecutive declining price relatives, which are common, can
deactivate or fail both algorithms. One real example (Li et al. 2012) is the DJIA
dataset (Borodin et al. 2004), on which PAMR performs the worst among the state of
the art. Thus, traders are more likely to predict prices using some long-term values.
Also on the DJIA dataset, Anticor, which exploits the multiperiod statistical corre-
lation, performs much better than others. However, due to its heuristic nature (Li
et al. 2011b, 2012), Anticor cannot fully exploit the mean reversion property. The
two problems caused by the single-period assumption and Anticor’s inability to fully
exploit mean reversion call for a more powerful approach to effectively exploit mean
reversion, especially in terms of multiple periods.
Now let us see a classic example (Cover and Gluss 1986) to illustrate the drawbacks
of single-period mean reversion, as shown in Table 11.2. The toy market consists
of cash and one volatile stock, whose market sequence follows A. It is easy to prove
that the best constant rebalanced portfolio (BCRP), $b = (\frac12, \frac12)$, can grow by a
factor of $(\frac98)^{n/2}$, while PAMR can grow by a better factor of
$\frac32 \times 2^{(n-1)/2}$. Note that this virtual sequence is essentially single-period mean
reversion, which perfectly fits with PAMR and CWMR's assumption. However, if the market
sequence does not satisfy such an assumption, both PAMR and CWMR would fail badly. Let us
extend the market sequence to a two-period reversion, that is, market sequence B. In such
a market, BCRP can achieve the same growth as before. In contrast, PAMR can only achieve a
constant wealth of $\frac32$, which has no growth! More generally, if we further extend to
$k$-period mean reversion, BCRP can still achieve the same growth, while PAMR will grow to
$\frac32 \times \left(\frac12\right)^{(n-1) \times \left(\frac12 - \frac1k\right)}$, which
definitely approaches bankruptcy if $k \ge 3$.
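The growth factors quoted above can be checked by direct simulation. The sketch below is illustrative only: `follow_last_loser_wealth` is a deliberately extreme stand-in for single-period mean reversion (start uniform, then hold only the asset with the lowest last price relative), not PAMR itself, and all function names are hypothetical.

```python
def toy_market(k, n):
    """Return n (cash, stock) price relatives: k up-days, k down-days, repeated."""
    cycle = [(1.0, 2.0)] * k + [(1.0, 0.5)] * k
    return [cycle[t % (2 * k)] for t in range(n)]

def crp_wealth(market, b):
    """Cumulative wealth of a constant rebalanced portfolio b."""
    wealth = 1.0
    for x in market:
        wealth *= sum(bj * xj for bj, xj in zip(b, x))
    return wealth

def follow_last_loser_wealth(market):
    """Start uniform, then put everything on the asset that did worst last period.

    An extreme stand-in for single-period mean reversion; not the PAMR update.
    """
    m = len(market[0])
    b = [1.0 / m] * m
    wealth = 1.0
    for x in market:
        wealth *= sum(bj * xj for bj, xj in zip(b, x))
        loser = min(range(m), key=lambda j: x[j])
        b = [1.0 if j == loser else 0.0 for j in range(m)]
    return wealth

for k in (1, 2, 3):
    crp = crp_wealth(toy_market(k, 12), (0.5, 0.5))
    loser = follow_last_loser_wealth(toy_market(k, 13))
    print(k, crp, loser)
```

With $n = 12$ the uniform CRP reaches $(\frac98)^6$ for every $k$, and with $n = 13$ the stand-in strategy ends at 96, 1.5, and 0.375 for $k = 1, 2, 3$, which agrees with $\frac32 \times (\frac12)^{(n-1)(\frac12 - \frac1k)}$ when $n - 1$ is a multiple of the cycle length $2k$.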

To better exploit the (multiperiod) mean reversion property, we proposed a new
type of algorithm, OLMAR, for OLPS. The essential idea is to exploit multiperiod
moving average (mean) reversion via powerful online machine learning. Rather than
$\tilde{p}_{t+1} = p_{t-1}$, OLMAR assumes that the next price will revert to a moving average
(MA), that is, $\tilde{p}_{t+1} = \mathrm{MA}_t$, where $\mathrm{MA}_t$ denotes the MA till the end
of period $t$. In time-series analysis, MA focuses on long-term trends and is typically used
to smooth short-term price fluctuations, and thus can address the two drawbacks of
existing mean reversion algorithms.

Table 11.2 Illustration of the mean reversion strategies on toy markets

Market sequences                                                                                                            BCRP                PAMR                                                    OLMAR
A: $(1,2), (1,\frac12), (1,2), (1,\frac12), \ldots$                                                                         $(\frac98)^{n/2}$   $\frac32 \times 2^{(n-1)/2}$                            $\frac98$
B: $(1,2), (1,2), (1,\frac12), (1,\frac12), (1,2), \ldots$                                                                  $(\frac98)^{n/2}$   $\frac32$                                               $\frac{9}{16} \times 2^{(n-4)/2}$
C: $(1,2), (1,2), (1,2), (1,\frac12), (1,\frac12), (1,\frac12), (1,2), \ldots$                                              $(\frac98)^{n/2}$   $\frac32 \times (\frac12)^{(n-1)/6}$                    $\frac98 \times 2^{(n-5)/6}$
D: $\underbrace{(1,2), \ldots, (1,2)}_{k=4}, \underbrace{(1,\frac12), \ldots, (1,\frac12)}_{k=4}, (1,2), \ldots$            $(\frac98)^{n/2}$   $\frac32 \times (\frac12)^{(n-1)/4}$                    $\frac94 \times 2^{(n-6)/8}$
E: $\underbrace{(1,2), \ldots, (1,2)}_{k=5}, \underbrace{(1,\frac12), \ldots, (1,\frac12)}_{k=5}, (1,2), \ldots$            $(\frac98)^{n/2}$   $\frac32 \times (\frac12)^{(n-1) \times \frac{3}{10}}$  $\frac92 \times 2^{(n-7)/10}$

Note: Since OLMAR is sensitive to the window size, we set its window to $k$. We calculate all
OLMAR values with a mean reversion threshold of 2.

Without detailing the calculation,∗ we list the growth of OLMAR in different
toy markets in Table 11.2. Clearly, OLMAR performs much better than PAMR in
multiperiod mean reversion, but PAMR performs better than OLMAR in single-period
reversion. Further empirical evaluations in Part IV show that the markets are more
likely to follow multiperiod reversion.

11.2 Formulations
In this chapter, we adopt two types of moving average. The first, the so-called simple
moving average (SMA), truncates the historical prices via a window and calculates
their arithmetic average:

$$\mathrm{SMA}_t(w) = \frac{1}{w} \sum_{i=t-w+1}^{t} p_i,$$

where $w$ denotes the window size and the summation is element-wise. Although
we can enlarge the window size such that SMA includes more historical prices,
the empirical evaluations in Part IV show that as the window size increases,
its performance drops.
To consider the entire price history rather than a window, the second type, exponential
moving average (EMA), adopts all historical prices, and each price is exponentially
weighted,

$$\begin{aligned}
\mathrm{EMA}_1(\alpha) &= p_1, \\
\mathrm{EMA}_t(\alpha) &= \alpha p_t + (1-\alpha)\,\mathrm{EMA}_{t-1}(\alpha) \\
&= \alpha p_t + (1-\alpha)\alpha p_{t-1} + (1-\alpha)^2 \alpha p_{t-2} + \cdots + (1-\alpha)^{t-1} p_1,
\end{aligned}$$

where $\alpha \in (0, 1)$ denotes a decaying factor.
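The two averages can be sketched in a few lines. This is a minimal illustration, assuming prices are stored as a list of per-period vectors $[p_1, \ldots, p_t]$; the function names `sma` and `ema` are hypothetical and the error handling is an assumption, not part of the definitions.

```python
def sma(prices, w):
    """Element-wise simple moving average of the last w price vectors."""
    if not 1 <= w <= len(prices):
        raise ValueError("window must satisfy 1 <= w <= len(prices)")
    window = prices[-w:]
    return [sum(col) / w for col in zip(*window)]

def ema(prices, alpha):
    """Element-wise exponential moving average with EMA_1 = p_1."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    avg = list(prices[0])
    for p in prices[1:]:
        # EMA_t = alpha * p_t + (1 - alpha) * EMA_{t-1}
        avg = [alpha * pj + (1.0 - alpha) * aj for pj, aj in zip(p, avg)]
    return avg

prices = [[1.0, 1.0], [2.0, 4.0], [4.0, 2.0]]  # p_1, p_2, p_3 for two assets
sma_3 = sma(prices, 2)    # average of p_2 and p_3
ema_3 = ema(prices, 0.5)  # recursion started from p_1
```

In this small example the window of two gives an SMA of (3, 3), while the EMA recursion gives (1.5, 2.5) after the second period and (2.75, 2.25) after the third.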


To this end, we can calculate the predicted price relative vector following the idea
of the so-called moving average reversion (MAR). Based on the two types of moving
average, we can infer two types of MAR.

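Moving Average Reversion: MAR-1

$$\tilde{x}_{t+1}(w) = \frac{\mathrm{SMA}_t(w)}{p_t}
= \frac{1}{w}\left(\frac{p_t}{p_t} + \frac{p_{t-1}}{p_t} + \cdots + \frac{p_{t-w+1}}{p_t}\right)
= \frac{1}{w}\left(\mathbf{1} + \frac{1}{x_t} + \cdots + \frac{1}{\bigotimes_{i=0}^{w-2} x_{t-i}}\right), \tag{11.1}$$

where $w$ is the window size and $\bigotimes$ denotes the element-wise product.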

∗ We calculate OLMAR’s growth using Algorithm 11.1. As the market sequences repeat themselves,
OLMAR will finally stabilize.

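Moving Average Reversion: MAR-2

$$\begin{aligned}
\tilde{x}_{t+1}(\alpha) &= \frac{\mathrm{EMA}_t(\alpha)}{p_t} = \frac{\alpha p_t + (1-\alpha)\,\mathrm{EMA}_{t-1}(\alpha)}{p_t} \\
&= \alpha \mathbf{1} + (1-\alpha)\,\frac{\mathrm{EMA}_{t-1}(\alpha)}{p_{t-1}}\,\frac{p_{t-1}}{p_t} \\
&= \alpha \mathbf{1} + (1-\alpha)\,\frac{\tilde{x}_t}{x_t},
\end{aligned} \tag{11.2}$$

where $\alpha \in (0, 1)$ denotes the decaying factor and the operations are all element-wise.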
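Equations 11.1 and 11.2 work directly on price relatives, so no price series needs to be stored. A minimal sketch under that reading follows; the names `mar1` and `mar2` are hypothetical, and raising an error when fewer than $w - 1$ relatives are available is an assumption about how to treat the first few periods.

```python
def mar1(relatives, w):
    """MAR-1 (Eq. 11.1): predict the next price relatives from the last w - 1 relatives.

    relatives is [x_1, ..., x_t], each a list of per-asset price relatives.
    """
    if w < 1 or len(relatives) < max(1, w - 1):
        raise ValueError("need at least w - 1 (and at least one) past price relatives")
    m = len(relatives[-1])
    total = [1.0] * m      # the p_t / p_t term
    inv_prod = [1.0] * m   # running 1 / (x_t * x_{t-1} * ...), element-wise
    for i in range(w - 1):
        x = relatives[-1 - i]
        inv_prod = [q / xj for q, xj in zip(inv_prod, x)]
        total = [s + q for s, q in zip(total, inv_prod)]
    return [s / w for s in total]

def mar2(x_t, x_tilde_t, alpha):
    """MAR-2 (Eq. 11.2): alpha * 1 + (1 - alpha) * x_tilde_t / x_t, element-wise."""
    return [alpha + (1.0 - alpha) * pred / xt for pred, xt in zip(x_tilde_t, x_t)]

# One asset with prices 1, 2, 4, 2, i.e., price relatives 2, 2, 0.5.
relatives = [[2.0], [2.0], [0.5]]
pred_sma = mar1(relatives, 3)          # equals SMA_t(3) / p_t = (8/3) / 2
pred_ema = mar2([2.0], [1.0], 0.5)     # 0.5 + 0.5 * (1 / 2)
```

The MAR-1 value can be cross-checked against the price form: the last three prices average to 8/3 and the last price is 2, giving 4/3.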
Based on the expected price relative vector in Equations 11.1 and 11.2, OLMAR
further adopts the idea of an effective online learning algorithm, that is,
passive–aggressive (PA) learning (Crammer et al. 2006), to exploit the MAR. Generally
proposed for classification, PA passively keeps the previous solution if the
classification is correct, while it aggressively approaches a new solution if the
classification is incorrect. After formulating the proposed OLMAR, we solve its closed-form
update and design specific algorithms.
The proposed formulation, OLMAR, is to exploit MAR via PA online learning.
The basic idea is to maximize the expected return $b \cdot \tilde{x}_{t+1}$ and keep the last
portfolio information via a regularization term. Thus, we follow an idea similar to PAMR
(Li et al. 2012) and formulate an optimization as follows.

Optimization Problem: OLMAR

$$b_{t+1} = \arg\min_{b \in \Delta_m} \frac{1}{2}\,\|b - b_t\|^2 \quad \text{s.t.} \quad b \cdot \tilde{x}_{t+1} \ge \epsilon.$$

Note that we adopt expected return rather than expected log return. According to
Helmbold et al. (1998), to solve the optimization with expected log return, one can
adopt the first-order Taylor expansion, which is essentially linear. Such discussions
are illustrated in Sections 9.2 and 10.2.
The above formulation explicitly reflects the basic idea of the proposed OLMAR.
On the one hand, if its constraint is satisfied, that is, the expected return is higher than
a threshold, then the resulting portfolio becomes equal to the previous portfolio. On
the other hand, if the constraint is not satisfied, then the formulation will figure out
a new portfolio such that the expected return is higher than the threshold, while the
new portfolio is not far from the last one.
Since OLMAR follows the same learning principle as PAMR, their formulations
look similar. However, the two formulations are essentially different. In particular,
PAMR's core constraint (i.e., $b \cdot x_t \le \epsilon$) adopts the raw price relative and has a
different inequality sign. After a certain transformation, PAMR may be written in a
similar form, as shown in Table 11.1. However, the prediction functions are different
(i.e., OLMAR adopts multiperiod mean reversion, while PAMR exploits single-period
mean reversion).

11.3 Algorithms
The preceding formulation is convex and thus straightforward to solve via convex
optimization (Boyd and Vandenberghe 2004). We now derive the OLMAR solution
as illustrated in Proposition 11.1.

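Proposition 11.1 The solution of OLMAR without considering the nonnegativity
constraint is

$$b_{t+1} = b_t + \lambda_{t+1}\left(\tilde{x}_{t+1} - \bar{x}_{t+1}\mathbf{1}\right),$$

where $\bar{x}_{t+1} = \frac{1}{m}\left(\mathbf{1} \cdot \tilde{x}_{t+1}\right)$ denotes the average predicted
price relative, and $\lambda_{t+1}$ is the Lagrangian multiplier calculated as

$$\lambda_{t+1} = \max\left\{0,\ \frac{\epsilon - b_t \cdot \tilde{x}_{t+1}}{\left\|\tilde{x}_{t+1} - \bar{x}_{t+1}\mathbf{1}\right\|^2}\right\}.$$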

Proof The proof can be found in Appendix B.4.1.
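As a quick sanity check on the closed form (not a substitute for the proof in Appendix B.4.1), one can verify that the update keeps the weights summing to one and meets the constraint with equality whenever $\lambda_{t+1} > 0$:

```latex
\begin{align*}
\mathbf{1} \cdot \left(\tilde{x}_{t+1} - \bar{x}_{t+1}\mathbf{1}\right)
  &= m\bar{x}_{t+1} - m\bar{x}_{t+1} = 0
  \;\Longrightarrow\; \mathbf{1} \cdot b_{t+1} = \mathbf{1} \cdot b_t = 1, \\
\tilde{x}_{t+1} \cdot \left(\tilde{x}_{t+1} - \bar{x}_{t+1}\mathbf{1}\right)
  &= \|\tilde{x}_{t+1}\|^2 - m\bar{x}_{t+1}^2
   = \left\|\tilde{x}_{t+1} - \bar{x}_{t+1}\mathbf{1}\right\|^2, \\
b_{t+1} \cdot \tilde{x}_{t+1}
  &= b_t \cdot \tilde{x}_{t+1}
   + \lambda_{t+1}\left\|\tilde{x}_{t+1} - \bar{x}_{t+1}\mathbf{1}\right\|^2
   = \epsilon \quad \text{when } \lambda_{t+1} > 0.
\end{align*}
```

The first line shows that the update direction has zero sum, so only the nonnegativity constraint can be violated; the last line shows that an aggressive step moves exactly as far as needed to reach the threshold.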


Following PAMR and CWMR, the above derivation first ignores the nonnegativity
constraint (Helmbold et al. 1998). Thus, it is possible that the resulting portfolio
goes out of the portfolio simplex domain. To maintain a proper portfolio, we finally
project the portfolio to the simplex domain (Duchi et al. 2008), which costs linear
time.
To this end, we can design the proposed algorithm based on the proposition.
The OLPS procedure utilizing the OLMAR algorithm is illustrated in Algorithm 11.1,
and the proposed OLMAR update procedure is demonstrated in Algorithm 11.2.

Algorithm 11.1: Online portfolio selection with OLMAR.

Input: $\epsilon > 1$: Reversion threshold; $w \ge 1$: Window size; $\alpha \in (0, 1)$: Decaying
       factor; $x_1^n$: Market sequence.
Output: $S_n$: Cumulative wealth after $n$ periods.
begin
    Initialization: $b_1 = \frac{1}{m}\mathbf{1}$, $S_0 = 1$, $\tilde{x}_1 = \mathbf{1}$;
    for $t = 1, \ldots, n$ do
        Rebalance the portfolio to $b_t$
        Receive stock price relatives: $x_t$
        Calculate daily return and cumulative return: $S_t = S_{t-1} \times (b_t \cdot x_t)$
        Predict next price relative vector:
            MAR-1: $\tilde{x}_{t+1} = \frac{1}{w}\left(\mathbf{1} + \frac{1}{x_t} + \cdots + \frac{1}{\bigotimes_{i=0}^{w-2} x_{t-i}}\right)$
            MAR-2: $\tilde{x}_{t+1} = \alpha \mathbf{1} + (1-\alpha)\,\frac{\tilde{x}_t}{x_t}$
        Update the portfolio: $b_{t+1} = \mathrm{OLMAR}(\epsilon, \tilde{x}_{t+1}, b_t)$
    end
end

Algorithm 11.2: Online Moving Average Reversion: $\mathrm{OLMAR}(\epsilon, \tilde{x}_{t+1}, b_t)$.

Input: $\epsilon > 1$: Reversion threshold; $\tilde{x}_{t+1}$: Predicted price relatives; $b_t$: Current
       portfolio.
Output: $b_{t+1}$: Next portfolio.
begin
    Calculate the following variables:
        $\bar{x}_{t+1} = \frac{\mathbf{1} \cdot \tilde{x}_{t+1}}{m}$, $\quad \lambda_{t+1} = \max\left\{0,\ \frac{\epsilon - b_t \cdot \tilde{x}_{t+1}}{\|\tilde{x}_{t+1} - \bar{x}_{t+1}\mathbf{1}\|^2}\right\}$
    Update the portfolio:
        $b_{t+1} = b_t + \lambda_{t+1}\left(\tilde{x}_{t+1} - \bar{x}_{t+1}\mathbf{1}\right)$
    Normalize $b_{t+1}$:
        $b_{t+1} = \arg\min_{b \in \Delta_m} \|b - b_{t+1}\|^2$
end
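The update step can be sketched as follows. This is an illustration under stated assumptions rather than the toolbox implementation: `project_to_simplex` uses the simpler sort-based variant of the Euclidean simplex projection, which costs $O(m \log m)$ instead of the linear time cited above, and the guard for a zero denominator (all predictions equal) is an assumption of this sketch. Both function names are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {b : b >= 0, sum(b) = 1} (sort-based)."""
    u = sorted(v, reverse=True)
    cumsum, theta = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumsum += uj
        t = (cumsum - 1.0) / j
        if uj - t > 0:      # holds for a prefix of the sorted values
            theta = t       # keep the shift from the last index that qualifies
    return [max(vj - theta, 0.0) for vj in v]

def olmar_update(eps, x_tilde, b):
    """One OLMAR step: passive-aggressive update, then simplex projection."""
    m = len(b)
    x_bar = sum(x_tilde) / m
    dev = [xj - x_bar for xj in x_tilde]
    norm_sq = sum(d * d for d in dev)
    expected = sum(bj * xj for bj, xj in zip(b, x_tilde))
    # Assumption: stay passive when every prediction is equal (zero denominator).
    lam = max(0.0, (eps - expected) / norm_sq) if norm_sq > 0 else 0.0
    stepped = [bj + lam * d for bj, d in zip(b, dev)]
    return project_to_simplex(stepped)

b_passive = olmar_update(1.0, [1.0, 1.5], [0.5, 0.5])     # constraint already met
b_mild = olmar_update(1.3, [1.0, 1.5], [0.5, 0.5])        # small aggressive step
b_extreme = olmar_update(2.0, [1.0, 1.5], [0.5, 0.5])     # step leaves the simplex
```

In the mild case the step lands at (0.4, 0.6), whose predicted return equals the threshold 1.3 exactly. In the extreme case the raw step is (−1, 2), and the projection returns (0, 1), so all wealth moves to the asset with the higher prediction.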

11.4 Analysis
The update of OLMAR is straightforward, that is, bt+1 = bt + λt+1 (x̃t+1 − x̄t+1 1).
This second part of the update formula, +λt+1 (x̃t+1 − x̄t+1 1), coincides with the
general form (Conrad and Kaul 1998, Eq. (1)) of return-based momentum strategies,
except the varying λt+1 . Intuitively, the update divides assets into two groups by pre-
diction average. For assets in the group with higher predictions than average, OLMAR
increases their proportions; for other assets, OLMAR decreases their proportions. The
transferred proportions are related to the surprise of predictions over their average
value and the nonnegative Lagrangian multiplier. This is consistent with the normal
portfolio selection procedure, that is, to transfer the wealth to assets with a better
prospect to grow.
Clearly, the OLMAR update costs linear time per period with respect to $m$, and
the normalization step can also be implemented in linear time (Duchi et al. 2008).
To the best of our knowledge, OLMAR's linear time is no worse than that of any existing
algorithm, which can be inferred from Table 10.3.

11.5 Summary
This chapter proposed a novel online portfolio selection (OLPS) strategy named
online moving average reversion (OLMAR), which exploits moving average rever-
sion (MAR) via online learning algorithms. The approach can solve the problems of
the state of the art caused by the assumption of single-period mean reversion and
achieve satisfying results in real markets. It also runs extremely fast and is suitable
for large-scale real applications.
In the future, we will further explore the theoretical aspect of mean reversion and
analyze the behaviors of mean reversion–based portfolios.



Part IV

Empirical Studies



Chapter 12

Implementations

As we have proposed several online portfolio selection (OLPS) algorithms, we are
interested in whether they work in real markets. To examine their empirical efficacy,
we conducted an extensive set of empirical studies on a variety of real datasets. In our
evaluations, we adopted six real datasets, which were collected from several diverse
financial markets. The performance metrics include cumulative wealth (return) and
risk-adjusted returns (based on volatility risk and drawdown risk). We also compared
the proposed algorithms with various existing algorithms. The results clearly demonstrate
that the proposed algorithms sequentially surpass the state-of-the-art techniques
in terms of either metric.
This chapter is organized as follows. Section 12.1 describes the experimental plat-
form or the OLPS platform. Section 12.2 details the experimental testbed, including
six real datasets. Section 12.3 sets up all the proposed algorithms and illustrates sev-
eral compared approaches. Section 12.4 introduces the performance metrics used for
the empirical studies. Finally, Section 12.5 summarizes this chapter.

12.1 The OLPS Platform


To evaluate the performance of a proposed algorithm, researchers and practitioners
usually implement a back-test system, which simulates the strategies using historical
market data. We also designed a back-test system, named “OLPS”, as follows, and
Appendix A describes the details of the OLPS toolbox. It implements a framework
for back-testing and various algorithms for online portfolio selection. Based
on MATLAB,∗ it is compatible with Windows, Linux, and Mac OS. Figure 12.1
illustrates the structure of the OLPS toolkit, which consists of three parts. The first
part, on the upper left, preprocesses data, that is, it loads a specified dataset and
initializes the trading environment, such as log files and timing variables. The second
part, on the lower level, calls OLPS algorithms and simulates the trading process for
strategies based on the data prepared in the first part. The third part, in the upper
right, postprocesses the outputs from the second part, that is, it statistically analyzes
the returns and calculates some risk-adjusted returns.

∗ More details are available at [Link]

Figure 12.1 Structure of the OLPS toolbox. The figure contains three blocks:
  OLPS: Preprocess: Data; Load data; Initialize log files.
  OLPS: Postprocess: Statistical t-test; Volatility risk and Sharpe ratio; Drawdown
    analysis and Calmar ratio.
  OLPS: Algorithmic trading:
    Benchmarks: Market; Best stock; BCRP.
    Follow the winner: Universal portfolios; Exponential gradient; Online Newton step;
      Switching portfolios.
    Follow the loser: Anticorrelation; Passive–aggressive mean reversion;
      Confidence-weighted mean reversion; Online moving average reversion.
    Pattern matching–based: Nonparametric kernel-based log-optimal; Nonparametric
      nearest neighbor log-optimal; Correlation-driven nonparametric learning.

12.1.1 Preprocess
This step aims to prepare trading environments. As existing datasets are often in MAT
files,∗ OLPS accepts datasets in MAT format. The dataset often contains an n × m
matrix, where n denotes the number of trading periods and m refers to the number of
assets. It is straightforward to incorporate market feeds† from real markets, such that
the toolkit can handle real-time data and conduct paper or even real trading.‡

12.1.2 Algorithmic Trading


This step conducts simulations based on historical real-market data. In our framework,
implementing a new strategy generally requires four files: a start file, a run file, a
kernel file, and an expert file. The start (entry) file extracts parameters and calls the
corresponding run file. The run file simulates a whole trading process and calls its
kernel file to construct a portfolio for each period, which is used for rebalancing.
The kernel file outputs a final portfolio and facilitates the development of
meta-algorithms, which effectively combine multiple experts’ portfolios. The expert file
outputs one portfolio depending on the input data and specific parameters. In the case of
only one expert, the kernel file is not necessary, and the run file directly calls the
expert file.
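The division of labor among the four files can be mimicked with four functions. The toolbox itself is written in MATLAB, so the following Python sketch is only an analogy: every name here (`start`, `run`, `kernel`, `uniform_expert`, the `params` dictionary) is hypothetical, and the averaging kernel is a placeholder for a real meta-algorithm.

```python
def uniform_expert(history, params):
    """Expert: output one portfolio from the data seen so far (here: uniform)."""
    m = params["num_assets"]
    return [1.0 / m] * m

def kernel(history, experts, params):
    """Kernel: combine the experts' portfolios into a final one (plain average)."""
    portfolios = [expert(history, params) for expert in experts]
    m = params["num_assets"]
    return [sum(p[j] for p in portfolios) / len(portfolios) for j in range(m)]

def run(data, experts, params):
    """Run: simulate the whole trading process, rebalancing once per period."""
    wealth, history = 1.0, []
    for x in data:
        b = kernel(history, experts, params)   # decided before x is revealed
        wealth *= sum(bj * xj for bj, xj in zip(b, x))
        history.append(x)
    return wealth

def start(data, **options):
    """Start (entry): extract parameters and call the run function."""
    params = {"num_assets": len(data[0])}
    params.update(options)
    return run(data, [uniform_expert], params)

final_wealth = start([(1.0, 2.0), (1.0, 0.5)])
```

With a single expert the kernel reduces to a pass-through, which mirrors the remark that the kernel file is unnecessary in that case. On the two-period toy data the uniform portfolio earns 1.5 and then 0.75, for a final wealth of 1.125.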
OLPS implements the following OLPS algorithms:
• Benchmarks (Market, Best stock, and BCRP).
• Follow the winner approaches (UP, EG, and ONS): make portfolio decisions following
the assumption that the next price relatives (or experts for UP) will follow
the previous one.
• Follow the loser approaches (Anticor, PAMR, CWMR, and OLMAR): make
portfolio decisions by assuming that next price relatives will revert to previous
trends.
• Pattern matching–based approaches ($B^K$, $B^{NN}$, and CORN): locate a set containing
similar price relatives and make optimal portfolios based on the set.
• Others: some are ad hoc algorithms, such as M0/T0.

∗ A full description of MAT files can be found at [Link]
matfile_format.pdf
† For example, Interactive Brokers ([Link] provides free APIs.
‡ Both paper and real trading require users to implement an order submission step, while
back-testing does not.

12.1.3 Postprocess
After the algorithmic trading simulation, this step processes the results by providing
the following performance metrics:
• Cumulative return: The most widely used in related studies;
• Volatility and Sharpe ratio: Typically used to measure risk-adjusted return in the
investment industry;
• Drawdown and Calmar ratio: Used to measure downside risk and related risk-
adjusted return;
• T-test statistics: Tests whether a strategy’s return is significantly different from that
of the market.

12.2 Data
In our study, we focus on historical daily closing prices in stock markets, which are
easy to obtain from public domains (such as Yahoo Finance and Google Finance∗ ),
and thus are publicly available to other researchers. Data from other types of markets,
such as high-frequency intraday quotes† and Forex markets, are either too expensive
or hard to obtain and process, and thus may reduce the experimental reproducibil-
ity. Summarized in Table 12.1, six real and diverse datasets from several financial
markets‡ are employed.
The first dataset, “NYSE (O),” is one “standard” dataset pioneered by Cover
(1991) and followed by others (Helmbold et al. 1998; Borodin et al. 2004; Agarwal
et al. 2006; Györfi et al. 2006, 2008). This dataset contains 5651 daily price relatives
of 36 stocks§ in the New York Stock Exchange (NYSE) for a 22-year period from
July 3, 1962, to December 31, 1984.
The second dataset is an extended version of the NYSE (O) dataset. For consistency,
we collected the latest data in the NYSE from January 1, 1985, to June 30,
2010, a period that consists of 6431 trading days. We denote this new dataset as
“NYSE (N).”¶ Note that the new dataset consists of 23 stocks rather than the previous
36 stocks owing to amalgamations and bankruptcies. All self-collected price
relatives are adjusted for splits and dividends, which is consistent with the previous
“NYSE (O)” dataset.

∗ Yahoo Finance: [Link] and Google Finance: [Link]
† We did evaluate certain algorithms using high-frequency data and weekly data, as in Li et al. (2013).
‡ All related codes and datasets, including their compositions, are available at [Link]
Borodin et al. (2004)’s datasets (NYSE (O), TSE, SP500, and DJIA) are also available at
[Link]
§ According to Helmbold et al. (1998), the dataset was originally collected by Hal Stern. The stocks are
mainly large cap stocks in NYSE; however, we do not know the criteria for choosing these stocks.
¶ The dataset before 2007 was collected by Gábor Gelencsér ([Link]
we collected the remaining data from 2007 to 2010 via Yahoo Finance.

Table 12.1 Summary of the six datasets from real markets

Dataset    Market  Region  Time Frame                           # Periods  # Assets
NYSE (O)   Stock   USA     July 3, 1962–December 31, 1984       5651       36
NYSE (N)   Stock   USA     January 1, 1985–June 30, 2010        6431       23
TSE        Stock   CA      January 4, 1994–December 31, 1998    1259       88
SP500      Stock   USA     January 2, 1998–January 31, 2003     1276       25
MSCI       Index   Global  April 1, 2006–March 31, 2010         1043       24
DJIA       Stock   USA     January 14, 2001–January 14, 2003    507        30

The third dataset, “TSE,” is collected by Borodin et al. (2004), and it consists
of 88 stocks from the Toronto Stock Exchange (TSE) containing price relatives of
1259 trading days, ranging from January 4, 1994, to December 31, 1998. The fourth
dataset, “SP500,” is collected by Borodin et al. (2004), and it consists of the 25 stocks
with the largest market capitalizations among the 500 SP500 components. It ranges from
January 2, 1998, to January 31, 2003, containing 1276 trading days.
The fifth dataset is “MSCI,” which is a collection of global equity indices that
constitute the MSCI World Index.∗ It contains 24 indices that represent the equity
markets of 24 countries around the world, and it consists of a total of 1043 trading
days, ranging from April 1, 2006, to March 31, 2010. The final dataset is the DJIA
dataset (Borodin et al. 2004), which consists of 30 Dow Jones composite stocks. DJIA
contains 507 trading days, ranging from January 14, 2001, to January 14, 2003.
Besides the six real-market datasets, in the main experiments (i.e., Experiment 1 in
Section 13.1), we also evaluate each dataset in its reversed form (Borodin et al.
2004). For each dataset, we create a reversed dataset, which reverses the original
order and inverts the price relatives. We denote these reverse datasets using a ‘−1’
superscript on the original dataset names. By nature, these reverse datasets are quite
different from the original datasets, and we are interested in the behaviors of the
proposed algorithms on such artificial datasets.
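Constructing a reversed dataset takes one line once the data is held as an $n \times m$ table of price relatives. The sketch below assumes a list-of-rows layout, one row per trading period; the function name `reverse_dataset` is hypothetical.

```python
def reverse_dataset(relatives):
    """Reverse the period order and invert every price relative."""
    return [[1.0 / x for x in row] for row in reversed(relatives)]

# Three periods, two assets.
original = [[1.0, 2.0], [1.0, 0.5], [1.25, 0.8]]
reversed_data = reverse_dataset(original)
restored = reverse_dataset(reversed_data)
```

Reading the reversed table forward corresponds to walking the original price paths backward in time, and applying the transformation twice recovers the original data up to rounding.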
∗ The constituents of the MSCI World Index are available on MSCI Barra ([Link]
accessed on 28 May 2010.

Unlike previous studies, the above testbed covers much longer trading periods,
from 1962 to 2010, and much more diversified markets, which enables us to
examine the behaviors of the proposed strategies under different events and crises.
For example, it covers several well-known events in the stock markets, such as the
dot-com bubble from 1995 to 2000 and the subprime mortgage crisis from 2007 to
2009. The five stock datasets are mainly chosen to test the capability of the proposed
algorithms on regional stock markets, while the index dataset aims to test their
capability on global indices, which may be potentially applicable to a fund of funds
(FOF).∗ As a remark, although we numerically test the proposed algorithms on stock
and exchange traded funds (ETF) markets, we note that the proposed strategies could
be generally applied to any type of financial market.

12.3 Setups
In our experiments, we implemented all the proposed approaches: CORN-U,
CORN-K, PAMR, PAMR-1, PAMR-2, CWMR-Var, CWMR-Stdev, OLMAR-1, and
OLMAR-2. For CWMR algorithms, we only present the results achieved by the
deterministic versions. The results of the stochastic versions are presented in Li et al.
(2013). Besides individual algorithms, we also designed their buy and hold (BAH)
versions, whose results can be found in their respective studies (Li et al. 2011b,
2012, 2013; Li and Hoi 2012). Without ambiguity, when referring to CORN, PAMR,
CWMR, and OLMAR, we often focus on their representative versions, that is,
CORN-U, PAMR, CWMR-Stdev, and OLMAR-1, respectively.
As the proposed algorithms are all online, we follow the existing work and simply
set the parameters empirically without tuning for each dataset separately. Note that
the best values for these parameters are often dataset dependent, and our choices are
not always the best, as we will further evaluate in Section 13.3. Below, we introduce
the parameter settings of the proposed algorithms.
For the proposed CORN experts, two possible parameters can affect their performance,
that is, the correlation coefficient threshold ρ and the window size W. In our
evaluations, we simply fix ρ = 0.1 and W = 5 for the CORN-U algorithm, which is
not always the best. For the CORN-K algorithm, we first fix W = 5, P = 10,
and K = 50, which means choosing all experts in the experiments, and denote it as
“CORN-K1.” We also provide “CORN-K2,” whose parameters are fixed as W = 5,
P = 10, and K = 5.
There are two key parameters in the proposed PAMR algorithms. One is the
sensitivity parameter ε, and the other is the aggressiveness parameter C. Specifically,
for all datasets and experiments, we set the sensitivity parameter ε to 0.5 in the
three algorithms, and set the aggressiveness parameter C to 500 in both PAMR-1
and PAMR-2, with which the cumulative wealth achieved tends to be stable on most
datasets. Our experiments on the parameter sensitivity show that the proposed PAMR
algorithms are quite robust with respect to different parameter settings.
CWMR has two key parameters, that is, the confidence parameter φ and the
sensitivity parameter ε. We set the sensitivity parameter ε to 0.5 and set the
confidence parameter φ to 2.0, or equivalently a 95% confidence level, in both CWMR-Var
and CWMR-Stdev. As the results show, the proposed CWMR algorithm is generally
robust with respect to different parameter settings and our choices are not always the
best.

∗ Note that not every index is tradable through ETFs.

For OLMAR, in all cases, we empirically set the mean reversion parameter,
that is, ε = 10, which provides consistent results. Individually, we set w = 5 for
OLMAR-1 and α = 0.5 for OLMAR-2. As the results show, it is easy to choose
satisfactory parameters for the proposed OLMAR algorithms.
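For reference, the settings described in this section can be collected in one configuration table. The values are those stated above; the dictionary layout and key names (`PROPOSED_SETUPS`, `epsilon`, `phi`, and so on) are hypothetical and do not correspond to the OLPS toolbox's own option names.

```python
# Parameter settings of the proposed algorithms, as stated in Section 12.3.
PROPOSED_SETUPS = {
    "CORN-U":     {"rho": 0.1, "W": 5},
    "CORN-K1":    {"W": 5, "P": 10, "K": 50},
    "CORN-K2":    {"W": 5, "P": 10, "K": 5},
    "PAMR":       {"epsilon": 0.5},
    "PAMR-1":     {"epsilon": 0.5, "C": 500},
    "PAMR-2":     {"epsilon": 0.5, "C": 500},
    "CWMR-Var":   {"epsilon": 0.5, "phi": 2.0},
    "CWMR-Stdev": {"epsilon": 0.5, "phi": 2.0},
    "OLMAR-1":    {"epsilon": 10, "w": 5},
    "OLMAR-2":    {"epsilon": 10, "alpha": 0.5},
}

# The same settings are used for every dataset; nothing is tuned per dataset.
olmar1 = PROPOSED_SETUPS["OLMAR-1"]
```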

12.3.1 Comparison Approaches and Their Setups


We compare the proposed algorithms with a number of benchmarks and representative
strategies. Below we summarize a list of compared algorithms, all of which provide
extensive empirical evaluations in their respective studies. Focusing on empirical stud-
ies, we ignore certain algorithms that focus on theoretical analysis and lack thorough
empirical evaluations.∗ All parameters are set following their original studies.†
1. Market: Market strategy, that is, uniform BAH strategy
2. Best-Stock: Best stock in the market, which is a strategy in hindsight
3. BCRP: Best constant rebalanced portfolios strategy in hindsight
4. UP: Cover’s universal portfolios implemented according to Kalai and Vempala
(2002), where the parameters are set as δ0 = 0.004, δ = 0.005, m = 100, and
S = 500
5. EG: Exponential gradient algorithm with the best learning rate η = 0.05 as
suggested by Helmbold et al. (1998)
6. ONS: Online Newton step with the parameters suggested by Agarwal et al. (2006),
that is, $\eta = 0$, $\beta = 1$, $\gamma = 1/8$
7. Anticor: $\mathrm{BAH}_{30}$(Anticor(Anticor)) as a variant of Anticor to smooth the
performance, which achieves the best performance among the three solutions proposed
by Borodin et al. (2004)
8. $B^K$: Nonparametric kernel-based moving window strategy with W = 5, L = 10,
and threshold c = 1.0, which has the best empirical performance according
to Györfi et al. (2006)
9. $B^{NN}$: Nonparametric nearest-neighbor-based strategy with parameters W = 5,
L = 10, and $p_\ell = 0.02 + 0.5\,\frac{\ell - 1}{L - 1}$, as the authors suggested
(Györfi et al. 2008)

12.4 Performance Metrics


We adopt the most common metric, cumulative wealth, to primarily compare different
trading strategies. In addition to the cumulative wealth, we also adopt the annual-
ized Sharpe ratio (SR) to compare the performance of different trading algorithms.
In general, higher values of the cumulative wealth and annualized SR indicate bet-
ter algorithms. Besides, we also adopt maximum drawdown (MDD) and the Calmar
ratio (CR) for analyzing a strategy’s downside risk. The lower the MDD values,
∗ Our OLPS platform provides these algorithms.
† We can tune their parameters for better performance, but it is beyond the scope of this book.

T&F Cat #K23731 — K23731_C012 — page 100 — 9/30/2015 — 16:46


Table 12.2 Summary of the performance metrics used in the evaluations
Criteria               Performance metrics
Absolute return        Cumulative wealth (Sn); annualized percentage yield
Risk                   Annualized standard deviation; maximum drawdown
Risk-adjusted return   Annualized Sharpe ratio (SR); Calmar ratio (CR)

the less the strategy’s (downside) risk. The higher the CR values, the better the strat-
egy’s (downside) risk-adjusted return. We summarize them in Table 12.2 and present
their details as follows.
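As a rough illustration of how the metrics in Table 12.2 are commonly computed from a series of per-period returns, consider the following minimal sketch. It rests on several assumptions that are ours and not stated in this section: 252 trading periods per year, a 4% annual risk-free rate, the Sharpe ratio taken as (annualized yield − risk-free rate) / annualized standard deviation, and the Calmar ratio taken as annualized yield / maximum drawdown. The function name is hypothetical.

```python
import math

def performance_metrics(returns, periods_per_year=252, risk_free=0.04):
    """returns: simple per-period returns, e.g. 0.01 for +1% (assumed input format)."""
    wealth, peak, mdd = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)                 # running maximum of wealth
        mdd = max(mdd, (peak - wealth) / peak)   # largest peak-to-trough loss
    years = len(returns) / periods_per_year
    apy = wealth ** (1.0 / years) - 1.0          # annualized percentage yield
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    ann_std = math.sqrt(var * periods_per_year)  # annualized standard deviation
    return {
        "cumulative_wealth": wealth,
        "apy": apy,
        "ann_std": ann_std,
        "mdd": mdd,
        "sharpe": (apy - risk_free) / ann_std if ann_std > 0 else float("nan"),
        "calmar": apy / mdd if mdd > 0 else float("inf"),
    }

print(performance_metrics([0.10, -0.50, 0.20]))
```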

12.5 Summary
A strategy has to be back-tested on historical market data before we can have
confidence that it will remain effective in unseen future markets. This
chapter introduces some implementation issues for the empirical studies, including the
platform, data, and various setups. In the future, we can further extend the online portfolio
selection (OLPS) system to use real-market feeds and to execute orders through a
paper trading account or a real trading account. The next chapter will demonstrate the
empirical results obtained from the implementation and corresponding back-tests.



Chapter 13

Empirical Results

This chapter presents the empirical results of the algorithms on the historical
market data. These results demonstrate the effectiveness of these strategies
and provide confidence in their practicability in real trading. We also relax some
constraints to evaluate their capability in real trading scenarios.
This chapter is organized as follows. Section 13.1 conducts the experiments
to evaluate the cumulative wealth for all the algorithms. Section 13.2 shows the
experimental results of risk-adjusted returns. Section 13.3 measures the sensitivity
of parameters for these algorithms. Section 13.4 relaxes transaction costs and mar-
gin buying constraints. Section 13.5 compares the computational times for different
algorithms. Section 13.6 further analyzes the behaviors of the proposed algorithms.
Finally, Section 13.7 summarizes this chapter and proposes some future directions.

13.1 Experiment 1: Evaluation of Cumulative Wealth


First, we compare the performance of the competing approaches based on their
cumulative wealth, which is the main metric of this study. From the experimental results
shown in Table 13.1, we can draw several observations.
First of all, we observe that most online portfolio selection (OLPS) strategies
perform better than the market and the best stock in a market, which indicates
that it is promising to investigate learning algorithms for portfolio selection. Second,
although the follow the winner approaches (UP, EG, and ONS) generally achieve higher
cumulative wealth than the market strategy, their performance is significantly lower than that
of the follow the loser approach (Anticor) or the pattern matching–based strategies
(BK and BNN ). Thus, to achieve better investment return, it is more powerful and
promising to exploit the latter two approaches. Third, on all original datasets (except
the DJIA dataset), the proposed strategies significantly outperform most competitors,
including Anticor, BK , and BNN , which are the state of the art. In particular, the
proposed algorithms successively improve on existing strategies. For example, on the benchmark
dataset NYSE (O), the state-of-the-art performance is 3.35E+11 achieved by BNN .
Our proposed algorithms achieve much better performances of 1.48E+13, 5.14E+15,
6.51E+15, and 3.68E+16 for CORN, PAMR, CWMR, and OLMAR, respectively.


T&F Cat #K23731 — K23731_C013 — page 103 — 9/28/2015 — 21:35


Table 13.1 Cumulative wealth achieved by various trading strategies on the six datasets and
their reversed datasets
Algorithms NYSE (O) NYSE (N) TSE SP500 MSCI DJIA
Market 14.50 18.06 1.61 1.34 0.91 0.76
Best-stock 54.14 83.51 6.28 3.78 1.50 1.19
BCRP 250.60 119.81 6.78 4.07 1.51 1.24
UP 26.68 31.49 1.60 1.62 0.92 0.81
EG 27.09 31.00 1.59 1.63 0.93 0.81
ONS 109.19 21.59 1.62 3.34 0.86 1.53
Anticor 2.41E+08 6.21E+06 39.36 5.89 3.22 2.29
BK 1.08E+09 4.64E+03 1.62 2.24 2.64 0.68
BNN 3.35E+11 6.80E+04 2.27 3.07 13.47 0.88
CORN-U 1.48E+13 5.37E+05 3.56 6.35 26.10 0.84
CORN-K1 3.19E+13 1.94E+05 1.65 4.64 16.32 0.79
CORN-K2 6.10E+13 4.86E+05 1.74 9.12 80.41 0.82
PAMR 5.14E+15 1.25E+06 264.86 5.09 15.23 0.68
PAMR-1 5.13E+15 1.26E+06 260.26 5.08 15.51 0.69
PAMR-2 4.88E+15 1.36E+06 249.95 5.00 16.87 0.71
CWMR-Var 6.51E+15 1.44E+06 328.61 5.94 17.27 0.69
CWMR-Stdev 6.49E+15 1.41E+06 332.62 5.90 17.28 0.68
OLMAR-1 3.68E+16 2.54E+08 424.80 5.83 16.39 2.12
OLMAR-2 1.02E+18 4.69E+08 732.44 9.59 22.51 1.16

Algorithms NYSE (O)−1 NYSE (N)−1 TSE−1 SP500−1 MSCI−1 DJIA−1


Market 0.12 1.27 1.67 0.88 1.26 1.44
Best-stock 0.33 24.59 37.65 1.65 3.45 2.77
BCRP 2.86 56.60 58.61 1.91 3.45 2.98
UP 0.23 0.3 1.18 1.10 1.26 1.54
EG 0.22 0.38 1.21 1.08 1.27 1.53
ONS 0.84 1.01 1.62 2.97 1.73 2.35
Anticor 1.38E+03 4.26E+04 7.24 9.64 6.31 4.58
BK 2.77E+07 162.74 8.81 1.01 4.47 1.43
BNN 4.60E+09 3.57E+04 66.09 1.89 30.06 1.85
CORN-U 1.74E+10 8.01E+03 53.06 1.81 36.05 1.83
CORN-K1 4.99E+09 6.79E+03 12.88 1.67 17.87 1.75
CORN-K2 3.19E+10 7.27E+03 40.87 2.63 66.64 1.66
PAMR 2.03E+04 3.07E+04 2.67 7.42 40.33 6.61
PAMR-1 2.02E+04 3.09E+04 2.68 7.43 39.82 6.62
PAMR-2 2.11E+04 3.21E+04 2.75 7.32 39.83 6.65
CWMR-Var 1.67E+04 6.35E+04 4.04 8.09 40.46 6.90
CWMR-Stdev 1.66E+04 6.49E+04 4.05 8.07 40.42 6.91
OLMAR-1 2.07E+04 3.99E+07 2.90 18.40 42.25 9.56
OLMAR-2 3.41E+04 2.26E+07 7.04 40.99 51.51 8.80
Note: Numbers in bold indicate the best results on the corresponding datasets.

This observation supports the effectiveness of the proposed algorithms, and, to the best
of our knowledge, no one has previously reported such performance.
Fourth, it is promising to see that all pattern matching–based algorithms,
especially CORN, have better performance than the benchmarks on most datasets. Moreover,
the proposed CORN significantly outperforms the existing pattern matching–based
algorithms, including BK and BNN , validating its motivating idea of improving the
matching process. Fifth, the encouraging results achieved by the last three strategies
(PAMR, CWMR, and OLMAR) validate the importance of exploiting mean reversion
in financial markets via an effective learning algorithm.
In addition, we can see that most algorithms perform poorly on the DJIA
dataset, including CORN, PAMR, and CWMR. While the failure of CORN remains
unexplained, the failure of the two mean reversion algorithms indicates that the
motivating (single-period) mean reversion may not exist in the dataset, as analyzed in
Sections 10.1 and 11.1.2. OLMAR, which is designed to exploit (multiple-period)
moving average reversion, achieves much better performance on this dataset, which thus
validates its motivating idea.
On the reversed datasets, though the results are not as striking as on the original datasets,
the proposed algorithms also perform well. In most cases, the proposed algorithms not
only beat the benchmarks, including the market and BCRP, but also achieve the best
performance; the exception in Table 13.1 is TSE−1, on which BNN achieves the highest
cumulative wealth. Note that these reversed datasets are artificial datasets, which never
exist in real markets. However, the algorithms’ behaviors on these datasets still provide
strong evidence that the proposed algorithms can effectively exploit the markets and
outperform the benchmarks and the state of the art.
Besides the final cumulative wealth, we are also interested in examining how the
cumulative wealth changes over the entire trading periods. Figure 13.1 shows the
trends of cumulative wealth by the proposed algorithms and four existing algorithms
(two benchmarks and two state-of-the-art algorithms). Note that PAMR and CWMR
almost overlap on the figures; thus, we only present the trends of PAMR. Clearly, the
proposed strategies consistently surpass the benchmarks and the competing strategies
over the entire trading periods on most datasets (except DJIA dataset), which again
validates the efficacy of the proposed techniques.
Finally, to measure whether such excess returns can be obtained by simple luck,
we conduct a statistical t-test as described in Section 12.4. Table 13.2 shows the
statistical results on the four proposed algorithms. The results show that the
observed excess return is very unlikely to be obtained by simple luck on most datasets. To be
specific, on the datasets other than DJIA, the probabilities of achieving the excess return by
luck are at most 2.62%, with the single exception of CORN on TSE (11.29%). On the DJIA
dataset, though PAMR and CWMR have a probability of about 40% of achieving the excess
return by luck, OLMAR has a probability of only 1.69%.
Overall, the results show that the proposed strategies are promising and can
achieve high returns with high confidence.
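The statistics reported in Table 13.2 can be approximated with an ordinary least-squares regression of the strategy's excess returns on the market's excess returns. The sketch below is only an illustration under our own assumptions: the exact test of Section 12.4 is not reproduced here, the function name is hypothetical, and the one-sided p-value uses a normal approximation to the t-distribution, which is reasonable only for samples as large as those in the table (several hundred periods or more).

```python
import math

def excess_return_test(strategy, market, risk_free=0.0):
    """strategy, market: per-period returns; risk_free: per-period risk-free return."""
    n = len(strategy)
    y = [s - risk_free for s in strategy]      # strategy excess returns
    x = [m - risk_free for m in market]        # market excess returns
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    sse = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    se_alpha = math.sqrt(sse / (n - 2) * (1.0 / n + x_bar ** 2 / sxx))
    t_stat = alpha / se_alpha
    # One-sided p-value for alpha > 0 (normal approximation, an assumption).
    p_value = 0.5 * math.erfc(t_stat / math.sqrt(2.0))
    winning_ratio = sum(s > m for s, m in zip(strategy, market)) / n
    return {"MER": y_bar, "WR": winning_ratio, "alpha": alpha,
            "beta": beta, "t": t_stat, "p": p_value}
```

A small p-value says that an intercept this large would rarely arise by chance if the true α were zero; it does not by itself guarantee future performance.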

13.2 Experiment 2: Evaluation of Risk and Risk-Adjusted Return


We now evaluate the volatility risk and drawdown risk, and the risk-adjusted return in
terms of an annualized Sharpe ratio (SR) and Calmar ratio (CR). Figure 13.2 shows

[Figure 13.1: six panels, (a) to (f), plotting total wealth achieved against trading days
for Market, BCRP, Anticor, BNN, CORN, PAMR, and OLMAR.]

Figure 13.1 Trends of cumulative wealth achieved by various strategies on the six datasets:
(a) NYSE (O); (b) NYSE (N); (c) TSE; (d) SP500; (e) MSCI; and (f) DJIA.

Table 13.2 Statistical t-test of the proposed algorithms on the six datasets
Algorithms Statistics NYSE (O) NYSE (N) TSE SP500 MSCI DJIA
Market Size 5651 6431 1259 1276 1043 507
MER 0.0005 0.0005 0.0004 0.0003 0.0000 −0.0004
CORN WR 53.78% 52.64% 51.55% 52.82% 62.22% 51.08%
MER 0.0058 0.0023 0.0014 0.0018 0.0033 −0.0002
α 0.0052 0.0017 0.0008 0.0014 0.0032 0.0002
β 1.2351 1.0552 1.5050 1.3096 0.8622 0.9047
t-statistics 13.8069 8.2069 1.2119 2.9073 9.8834 0.3466
p-value 0.0000 0.0000 0.1129 0.0019 0.0000 0.3645
PAMR WR 55.87% 51.75% 56.87% 53.37% 59.25% 51.87%
MER 0.0069 0.0026 0.0054 0.0017 0.0029 −0.0003
α 0.0063 0.0021 0.0049 0.0013 0.0029 0.0002
β 1.2095 1.1241 1.4982 1.2375 1.1177 1.2393
t-statistics 15.7829 5.9979 3.9241 2.0020 6.1358 0.2195
p-value 0.0000 0.0000 0.0000 0.0227 0.0000 0.4132
CWMR WR 56.17% 52.08% 56.00% 53.92% 59.44% 51.08%
MER 0.0070 0.0027 0.0057 0.0019 0.0030 −0.0003
α 0.0064 0.0021 0.0051 0.0015 0.0030 0.0002
β 1.2139 1.1325 1.5139 1.2512 1.1161 1.2476
t-statistics 15.9510 5.9496 3.9190 2.1806 6.4078 0.2482
p-value 0.0000 0.0000 0.0000 0.0147 0.0000 0.4020
OLMAR WR 56.91% 53.13% 55.12% 51.49% 58.39% 52.47%
MER 0.0074 0.0036 0.0061 0.0019 0.0030 0.0020
α 0.0068 0.0030 0.0056 0.0015 0.0030 0.0025
β 1.2965 1.1768 1.5320 1.2854 1.1763 1.2627
t-statistics 15.2405 7.3704 3.4583 1.9423 1.1763 1.2627
p-value 0.0000 0.0000 0.0003 0.0262 0.0000 0.0169
Note: MER denotes mean excess return, which equals the mean of daily returns over a risk-
free return. WR denotes winning ratio, which is the ratio of trading periods with a higher
return than the market.

the evaluation results on the six datasets. In addition to the proposed four algorithms,
we also plot two benchmarks (Market and BCRP) and two state-of-the-art algorithms
(Anticor and BNN ). In particular, Figures 13.2a and 13.2b depict the volatility risk
(standard deviation of daily returns) and the drawdown risk (maximum drawdown)
on the six stock datasets. Figures 13.2c and 13.2d compare their corresponding SRs
and CRs.
In the preceding results on cumulative wealth, we find that the proposed methods
achieve the highest cumulative return on most original datasets. However, high return
is associated with high risk, as no real financial instruments can guarantee high return

[Figure 13.2: four bar-chart panels over the six datasets, (a) volatility risk (%),
(b) MDD risk (%), (c) Sharpe ratio, and (d) Calmar ratio, for Market, BCRP, Anticor, BNN,
CORN, PAMR, CWMR, and OLMAR.]

Figure 13.2 Risk and risk-adjusted performance of various strategies on the six datasets.
In each diagram, the rightmost four bars represent the results of our proposed strategies:
(a) volatility risk; (b) drawdown risk; (c) Sharpe ratio; and (d) Calmar ratio.

without high risk.∗ Figure 13.2a shows that the four proposed methods have nearly the
highest volatility risk on most datasets. Likewise, Figure 13.2b shows that the proposed
methods also incur high drawdown risk on most datasets. These results validate the notion
that high return is often associated with high risk.
To further evaluate the return and risk, we examine the risk-adjusted return in
terms of an annualized SR and CR. Figures 13.2c and 13.2d clearly
show that CORN, PAMR, and CWMR achieve excellent performance in most cases,

∗ It is true for the long-only portfolio, which is our setting. However, such a statement may be suspect
in regard to long-short portfolios.

except the DJIA dataset; and OLMAR achieves excellent performance in all datasets.
These encouraging results show that the proposed methods are able to reach a good
trade-off between return and risk, even though we do not explicitly consider risk in
the method formulations.∗

13.3 Experiment 3: Evaluation of Parameter Sensitivity


In the following four subsections, we evaluate how different choices of parameters
affect the proposed four strategies.

13.3.1 CORN’s Parameter Sensitivity


The proposed CORN has two parameters, that is, the correlation coefficient
threshold ρ and the window size for the experts w (or W ).
First, let us see the effects of ρ with fixed W , in Figure 13.3. Clearly, the figures
validate the preliminary analysis in Section 8.4. In general, CORN achieves the best
performance when ρ is around 0, as the figures often peak around 0 or some small
positive values; and, when ρ approaches −1 or 1, CORN’s performance degrades.
Although CORN does not perform well on the TSE and DJIA datasets, on which its
cumulative wealth is often less than that of the BCRP strategy, it significantly outperforms
the two benchmarks on other datasets. Based on the above observation, choosing a
satisfactory ρ for CORN is straightforward, as some small positive values often give
good performance on all datasets.
We also examine the effects of W with fixed ρ in Figure 13.4. Note that here CORN
denotes the CORN experts with a specified w and CORN-U denotes the uniform
combination of CORN experts with w from 1 to W . Although the cumulative wealth
achieved by CORN experts fluctuates with different w’s, CORN-U’s cumulative
wealth is much more robust with respect to W . Such an observation validates the
effectiveness of the proposed CORN-U and eases the selection of a satisfactory W .
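One way to see why the uniform combination is more robust than any single expert is to write it out. The sketch below assumes, as a simplification on our part, that the combination acts like a buy-and-hold investment spread equally over the W experts, so that its final wealth is the average of the experts' final wealths and its portfolio at each period is the wealth-weighted average of the experts' portfolios. The function names and the input format (per-period gross returns such as 1.02 for +2%) are hypothetical.

```python
from math import prod

def uniform_combination_wealth(expert_gross_returns):
    """Final wealth of an equal-weight buy-and-hold over experts.

    expert_gross_returns[k] is the sequence of per-period gross returns
    (portfolio return b_t . x_t) achieved by expert k.
    """
    wealths = [prod(r) for r in expert_gross_returns]
    return sum(wealths) / len(wealths)

def combined_portfolio(expert_portfolios, expert_wealth_so_far):
    """Wealth-weighted average of the experts' portfolios for the next period."""
    total = sum(expert_wealth_so_far)
    n_assets = len(expert_portfolios[0])
    return [
        sum(s * b[i] for s, b in zip(expert_wealth_so_far, expert_portfolios)) / total
        for i in range(n_assets)
    ]

# Three hypothetical experts (e.g. w = 1, 2, 3) with different outcomes.
print(uniform_combination_wealth([[1.1, 1.2], [0.9, 1.0], [1.0, 1.0]]))
print(combined_portfolio([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0]))
```

Because the average is dominated by the better experts, a poor choice of one particular w has limited effect on the combined wealth.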

13.3.2 PAMR’s Parameter Sensitivity


In this section, we examine PAMR’s parameters, that is, the mean reversion threshold
ε for the three algorithms and the aggressiveness parameter C for the two variants.
First, we examine the effect of ε on PAMR’s cumulative wealth. When ε is greater
than 1, PAMR degrades to the uniform constant rebalanced portfolios (CRP) strategy, and
the wealth stabilizes at the constant value achieved by uniform CRP. Thus, we show the
effect of ε in the range of [0, 1.5]. Figure 13.5 shows the cumulative wealth achieved
by PAMR with varying ε and two benchmarks, that is, Market and BCRP. Results on
most datasets, except the DJIA dataset, show that the cumulative wealth achieved by
PAMR consistently grows as ε approaches 0. That is, the smaller the threshold, the
higher the cumulative wealth is, which validates that the motivating mean reversion
does exist on most stock markets. Moreover, in most cases, the cumulative wealth
tends to stabilize as ε crosses certain dataset-dependent thresholds. As stated before,
∗ We will study it in the future.


[Figure 13.3: six panels, (a) to (f), plotting total wealth achieved against ρ (from −1 to 1)
for CORN, Market, and BCRP.]

Figure 13.3 Parameter sensitivity of CORN-U with respect to ρ with fixed W (W = 5):
(a) NYSE (O); (b) NYSE (N); (c) TSE; (d) SP500; (e) MSCI; and (f) DJIA.

we choose ε = 0.5 in the experiments, with which the cumulative wealth stabilizes
in most cases. In contrast, on the DJIA dataset, as ε approaches 0, the cumulative
wealth achieved by PAMR drops. This phenomenon suggests that
the motivating (single-period) mean reversion does not exist on the dataset, at least in


[Figure 13.4: six panels, (a) to (f), plotting total wealth achieved against W (from 1 to 30)
for CORN-U, CORN, Market, and BCRP.]

Figure 13.4 Parameter sensitivity of CORN-U with respect to w (W) with fixed ρ (ρ = 0.1):
(a) NYSE (O); (b) NYSE (N); (c) TSE; (d) SP500; (e) MSCI; and (f) DJIA.

the sense of our motivation. We also note that, on some datasets, PAMR with ε = 0
achieves the best performance. Though ε = 0 means moving more weight to underperforming
stocks, it may not mean moving everything to the worst stock. On the one hand, the
objectives in the formulations would prevent the next portfolio from being far from the

[Figure 13.5: six panels, (a) to (f), plotting total wealth achieved against ε (from 0 to 1.5)
for PAMR, Market, and BCRP.]

Figure 13.5 Parameter sensitivity of PAMR with respect to ε: (a) NYSE (O); (b) NYSE (N);
(c) TSE; (d) SP500; (e) MSCI; and (f) DJIA.

last portfolio. On the other hand, PAMR-1 and PAMR-2 are designed to alleviate the
huge changes. In summary, the experimental results indicate that the proposed PAMR
is robust with respect to the mean reversion sensitivity parameter, in most cases.
Second, we evaluate the other important parameter for both PAMR-1 and
PAMR-2, that is, the aggressiveness parameter C. Figure 13.6 shows the effects on the

[Figure 13.6: six panels, (a) to (f), plotting total wealth achieved against C (from 50 to 5000)
for PAMR, Market, and BCRP.]

Figure 13.6 Parameter sensitivity of PAMR-1 (or PAMR-2) with respect to C with fixed ε
(ε = 0.5): (a) NYSE (O); (b) NYSE (N); (c) TSE; (d) SP500; (e) MSCI; and (f) DJIA.

cumulative wealth by varying C from 50 to 5000 on PAMR-1, with fixed ε = 0.5. We
only show the figures on PAMR-1, as the effects on PAMR-2 are similar to those on
PAMR-1. Clearly, the proposed PAMR is insensitive to C, and a wide range of values
correspond to the highest cumulative wealth. This again shows that the proposed

PAMR is robust with respect to its parameters. Similarly, the figure on DJIA again
indicates that the mean reversion effect may not exist on the dataset, in the sense of
our motivation.
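The roles of the threshold and of C can be made concrete with a small sketch of a passive-aggressive mean reversion update. This is an illustration under our own assumptions and not the implementation used in the experiments: it uses the loss max(0, b·x − ε), a step along x − x̄1, a step size capped by C for the first variant and damped by 1/(2C) for the second, and a Euclidean projection back onto the simplex. All function and argument names are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {b : b_i >= 0, sum(b) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:       # holds for a prefix of the sorted values
            theta = candidate
    return [max(x - theta, 0.0) for x in v]

def pamr_update(b, x, epsilon=0.5, C=None, variant=0):
    """One update step; variant 0 = basic, 1 = capped step, 2 = damped step."""
    loss = max(0.0, sum(bi * xi for bi, xi in zip(b, x)) - epsilon)
    x_bar = sum(x) / len(x)
    dev = [xi - x_bar for xi in x]
    sq = sum(d * d for d in dev)
    if variant == 2:
        tau = loss / (sq + 1.0 / (2.0 * C))
    elif sq == 0.0:
        tau = 0.0                      # all assets moved equally: nothing to exploit
    elif variant == 1:
        tau = min(C, loss / sq)
    else:
        tau = loss / sq
    return project_to_simplex([bi - tau * d for bi, d in zip(b, dev)])

b, x = [0.5, 0.5], [1.2, 0.8]
print(pamr_update(b, x, epsilon=0.5))                    # shifts weight to the loser
print(pamr_update(b, x, epsilon=1.5))                    # zero loss: portfolio unchanged
print(pamr_update(b, x, epsilon=0.5, C=500, variant=1))  # cap does not bind
```

In this toy step the cap C = 500 never binds, which is consistent with the insensitivity to C observed above, and a threshold above the realized portfolio return leaves the uniform portfolio untouched.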

13.3.3 CWMR’s Parameter Sensitivity


The proposed CWMR algorithms contain two parameters, that is, the confidence
parameter φ and the mean reversion sensitivity parameter ε. Across the
algorithms, the mean reversion sensitivity parameter decisively influences the final
performance, while the confidence parameter has little effect on the final
performance, similar to the effect of C on PAMR-1/2. Figure 13.7 depicts CWMR’s
robustness with respect to ε, plus the final cumulative wealth achieved by Market and
BCRP. The results first show that the final cumulative wealth increases as the
sensitivity parameter decreases and stabilizes after ε falls below certain data-dependent
thresholds, which means that the mean reversion idea has been completely exploited.
Then, the results again verify that the mean reversion trading idea works in the
financial markets and the proposed CWMR algorithm can successfully exploit it, which
generates significant final cumulative wealth on most datasets. Moreover, as
analyzed in Section 10.4, CWMR degrades to the uniform CRP strategy when ε is larger
than 1. Our empirical setting ε = 0.5 is not the best one; nevertheless, the
proposed CWMR still significantly surpasses existing approaches. Finally, similar to
PAMR, CWMR fails on DJIA, which still indicates that the motivating (single-period)
mean reversion may not exist on the dataset.

13.3.4 OLMAR’s Parameter Sensitivity


Now we evaluate OLMAR’s sensitivity to its parameters, that is, ε for both OLMAR-1
and OLMAR-2, and w and α for OLMAR-1 and OLMAR-2, respectively. Figure 13.8
shows OLMAR-1’s sensitivity to ε with fixed w = 5. Since OLMAR-2 and OLMAR-1
have similar figures for ε, we only show its effects on OLMAR-1. Figure 13.9 shows
OLMAR-1’s sensitivity to w with fixed ε = 10, and Figure 13.10 shows OLMAR-2’s sensitivity
to α with fixed ε = 10.
From Figure 13.8, we can observe that, in general, the cumulative wealth sharply
increases if ε approaches 1 and flattens if ε crosses a threshold. From Figure 13.9,
we can see that as w increases, the performance initially increases, spikes at a
data-dependent value, and then decreases. Regardless, its performance with most choices
of ε and w is much better than that of the market and BCRP. To smooth the volatility of
its performance, Figure 13.9 also shows OLMAR’s BAH versions (Li and Hoi 2012)
by combining a set of OLMAR experts with varying w’s. We can see that the BAH
version provides a much smoother cumulative return than its underlying experts. Note
that on DJIA, OLMAR performs much better than PAMR/CWMR as ε varies, which
validates the motivating multiperiod mean reversion. From Figure 13.10, we can
find that on most datasets, α provides significantly high cumulative wealth in a wide
range of values, except the two extreme endpoints, that is, 0 and 1. If α = 1, then
all expected price relatives are always 1 and OLMAR-2 outputs b_{t+1} = b_t , which is

[Figure 13.7: six panels, (a) to (f), plotting total wealth achieved against ε (from 0 to 1.5)
for CWMR, Market, and BCRP.]

Figure 13.7 Parameter sensitivity of CWMR with respect to ε: (a) NYSE (O); (b) NYSE (N);
(c) TSE; (d) SP500; (e) MSCI; and (f) DJIA.

initialized to uniform portfolio. If α = 0, then its expected price relative vector equals
$\tilde{\mathbf{x}}_{t+1} = \prod_{i=1}^{t} \frac{1}{\mathbf{x}_i}$. Such price relatives inversely relate to all of an asset’s historical price
relatives and produce bad results.
Nevertheless, all the above observations show that OLMAR’s performance is
robust to its parameters, and it is convenient to choose satisfactory parameters.
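The two endpoint cases of α can be checked numerically. The sketch below assumes the exponential moving average prediction follows the elementwise recursion x̃_{t+1} = α·1 + (1 − α)·x̃_t / x_t, started from x̃_1 = 1 (the starting value is our assumption), and that the simple moving average prediction over a window w is (1/w)(1 + 1/x_t + 1/(x_t x_{t−1}) + …). The function names are hypothetical and this is not the toolbox implementation.

```python
def ema_relatives(history, alpha):
    """Predicted price relatives under the assumed EMA recursion (OLMAR-2 style)."""
    pred = [1.0] * len(history[0])
    for x in history:
        pred = [alpha + (1.0 - alpha) * p / xi for p, xi in zip(pred, x)]
    return pred

def sma_relatives(history, w):
    """Predicted price relatives from a w-period moving average (OLMAR-1 style)."""
    recent = history[max(0, len(history) - (w - 1)):]   # last w - 1 periods
    pred = []
    for i in range(len(history[0])):
        term, total = 1.0, 1.0
        for x in reversed(recent):
            term /= x[i]            # 1/x_t, then 1/(x_t * x_{t-1}), ...
            total += term
        pred.append(total / (len(recent) + 1))
    return pred

history = [[2.0, 0.5], [0.5, 1.0]]
print(ema_relatives(history, 1.0))   # all ones: no signal, portfolio never moves
print(ema_relatives(history, 0.0))   # product of inverse price relatives
print(sma_relatives(history, 2))
```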

[Figure 13.8: six panels, (a) to (f), plotting total wealth achieved against ε (from 1 to 1000)
for OLMAR, Market, and BCRP.]

Figure 13.8 Parameter sensitivity of OLMAR-1 with respect to ε with fixed w (w = 5):
(a) NYSE (O); (b) NYSE (N); (c) TSE; (d) SP500; (e) MSCI; and (f) DJIA.

13.4 Experiment 4: Evaluation of Practical Issues


For real-world portfolio management, there are some important practical issues,
including transaction costs and margin buying. In this section, we examine how the
two issues affect the proposed strategies.

[Figure 13.9: six panels, (a) to (f), plotting total wealth achieved against W (up to 100)
for BAH (OLMAR), OLMAR, Market, and BCRP.]

Figure 13.9 Parameter sensitivity of OLMAR-1 with respect to w with fixed ε (ε = 10):
(a) NYSE (O); (b) NYSE (N); (c) TSE; (d) SP500; (e) MSCI; and (f) DJIA.


[Figure 13.10: six panels, (a) to (f), plotting total wealth achieved against α (from 0 to 1)
for OLMAR, Market, and BCRP.]

Figure 13.10 Parameter sensitivity of OLMAR-2 with respect to α with fixed ε (ε = 10):
(a) NYSE (O); (b) NYSE (N); (c) TSE; (d) SP500; (e) MSCI; and (f) DJIA.

First, the transaction cost is an important and unavoidable issue that should be
addressed in practice. To test the effects of transaction cost on the proposed strategies,
we adopt the proportional transaction cost model stated in Section 2.2. Figure 13.11
depicts the effects of proportional transaction cost when the algorithms are applied
on the six datasets, where the transaction cost rate γ varies from 0% to 1%. We
present only the results achieved by three representative algorithms (CORN, PAMR,

[Figure 13.11: six panels, (a) to (f), plotting total wealth achieved against transaction
costs (γ%, from 0 to 1) for Market, BCRP, Anticor, BNN, CORN, PAMR, and OLMAR.]

Figure 13.11 Scalability of the proposed strategies with respect to the transaction cost rate (γ):
(a) NYSE (O); (b) NYSE (N); (c) TSE; (d) SP500; (e) MSCI; and (f) DJIA.

and OLMAR) and ignore the results of CWMR, whose curves often overlap those
of PAMR. For comparison, we also plot the results achieved by two state-of-the-art
strategies (Anticor and BNN ) and two benchmarks (BCRP and Market). Since most
follow the winner approaches try to approach BCRP, we ignore their figures.
From the figures, we can observe that the proposed algorithms can withstand
reasonable transaction cost rates on most datasets. For example, the break-even rates
with respect to the market index vary from 0.2% to 0.8%, except on DJIA, on which only
OLMAR can withstand around 0.3%. As CORN and PAMR/CWMR fail to beat the
market on the DJIA dataset without transaction costs, their failures with transaction
costs can be naturally expected. On the other hand, the behaviors of the proposed
algorithms diverge. With a similar pattern-matching principle, CORN often performs
similarly to BNN , while both of them generally underperform the mean reversion
algorithms. Since the three mean reversion algorithms (PAMR, CWMR, and OLMAR)
revert to the mean more actively than Anticor and thus result in more drastic portfolio
rebalances, they surpass Anticor with low or medium transaction costs and
underperform Anticor with high transaction costs. Note that the transaction cost rate in the
real market is low;∗ thus, the results clearly indicate the practical applicability of the
proposed strategies even when we consider reasonable transaction costs.
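The effect of a proportional cost rate can be sketched as follows. The model of Section 2.2 is not reproduced in this chapter, so the form used here is an assumption: at each rebalance the wealth is reduced by (γ/2) times the total absolute change between the target weights and the weights held after the previous period's price drift, with the investor starting from cash. The function name is hypothetical.

```python
def wealth_with_costs(portfolios, relatives, gamma):
    """Cumulative wealth under an assumed proportional transaction cost model.

    portfolios[t] is the target portfolio for period t, relatives[t] the
    price relatives realized in period t, gamma the cost rate (0.01 = 1%).
    """
    wealth = 1.0
    held = [0.0] * len(relatives[0])            # start from cash (assumption)
    for b, x in zip(portfolios, relatives):
        turnover = sum(abs(bi - hi) for bi, hi in zip(b, held))
        wealth *= 1.0 - 0.5 * gamma * turnover  # pay costs on the rebalance
        growth = sum(bi * xi for bi, xi in zip(b, x))
        wealth *= growth
        held = [bi * xi / growth for bi, xi in zip(b, x)]  # weights after drift
    return wealth

toy = [[1.0, 2.0], [1.0, 0.5]] * 2
uniform = [[0.5, 0.5]] * len(toy)
print(wealth_with_costs(uniform, toy, 0.0))
print(wealth_with_costs(uniform, toy, 0.01))

# Cost rate implied by a fee of $0.005 per share at an average price of $50.
print(0.005 / 50.0)   # 0.0001, that is, 0.01%
```

Strategies that rebalance more drastically accumulate more turnover, which is why their advantage shrinks faster as γ grows.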
Second, margin buying is another practical concern for a real-world portfo-
lio selection task. To evaluate the impact of margin buying, we adopt the model
described in Section 2.2 and present the cumulative wealth achieved by the com-
peting approaches with or without margin buying in Table 13.3. The results clearly
show that if margin buying is allowed, the profitability of the proposed algorithms
on most datasets increases. Similar to the results without margin buying, certain pro-
posed algorithms often achieve the best results with margin buying. In summary, the
proposed strategies can be extended to handle the margin-buying issue and benefit
from it, and thus are practically applicable.
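A common way to model margin buying in this literature is to add, for each asset, a second "margin component" that can be held alongside the ordinary one. The sketch below assumes that form, which is our assumption since Section 2.2 is not reproduced here: a 50% down payment, so that the margin component of an asset with price relative x has price relative 2x − 1 − c, where c is the per-period interest on the loan. The default value of c and the function name are illustrative only, and the sketch assumes every margin component stays positive.

```python
def with_margin_components(x, daily_interest=0.000238):
    """Extend a price relative vector with one margin component per asset.

    Assumed model: borrowing an amount equal to the own capital invested,
    so the margin component returns 2*x_i - 1 - c per period.
    """
    return list(x) + [2.0 * xi - 1.0 - daily_interest for xi in x]

x = [1.1, 0.9]
print(with_margin_components(x, daily_interest=0.0))
# A strategy can then be run unchanged on the extended vector, for example
# on 2m components instead of m.
```

The margin components amplify both gains and losses, which is consistent with margin loans increasing wealth on most datasets in Table 13.3 while reducing it on others.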

13.5 Experiment 5: Evaluation of Computational Time


Our next experiment evaluates the computational time cost of the different
approaches, which is also important when developing a practical trading strategy.
As previously analyzed, CORN has a batch-learning step in each period and is time
consuming in both its sample selection step and its portfolio optimization step,† while
PAMR, CWMR, and OLMAR are online learning algorithms that cost linear time
per iteration. Table 13.4 presents the computational time cost (in seconds) of the
proposed algorithms and three performance-comparable approaches (Anticor, BK, and BNN)
on the six datasets. All the experiments were conducted on an Intel Core 2 Quad
2.66 GHz processor with 4 GB RAM, using MATLAB 2009b on Windows XP.‡
∗ For example, without considering taxes and the bid–ask spread, Interactive Brokers ([Link])
charges $0.005 per share. Since the average price of Dow Jones Composites is around $50.00 (as of June
2011), the transaction cost rate is about 0.01%.
† In its MATLAB implementation, the latter step costs more than 80% of the total time.
‡ We use the MATLAB functions tic/toc to measure the time. The time spent on preprocessing (such as
data loading and variable initialization) and postprocessing (such as result analysis) is excluded
from the time statistics in Table 13.4.
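The timing protocol described in the footnote (time the strategy only, exclude pre- and postprocessing) can be sketched in Python as follows. This is not the authors' MATLAB code; `timed_backtest` and the three callables are hypothetical stand-ins, and `time.perf_counter` plays the role of tic/toc.

```python
import time

def timed_backtest(load_data, run_strategy, analyze):
    """Time only the strategy run, excluding pre- and postprocessing."""
    data = load_data()                      # preprocessing: not timed
    start = time.perf_counter()             # "tic"
    result = run_strategy(data)
    elapsed = time.perf_counter() - start   # "toc"
    analyze(result)                         # postprocessing: not timed
    return result, elapsed

# Toy usage with stand-in callables (all hypothetical).
result, seconds = timed_backtest(
    load_data=lambda: [1.01, 0.99, 1.02],
    run_strategy=lambda xs: sum(xs) / len(xs),
    analyze=lambda r: None,
)
print(f"strategy time: {seconds:.6f} s")
```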

T&F Cat #K23731 — K23731_C013 — page 120 — 9/28/2015 — 21:35


Table 13.3 Cumulative wealth achieved by various strategies on the six datasets without and
with margin loans (MLs)

                NYSE (O)              NYSE (N)              TSE
Algorithms      No ML      With ML    No ML      With ML    No ML      With ML
Market          14.50      15.75      18.06      17.68      1.61       1.71
Best-stock      54.14      54.14      83.51      173.18     6.28       10.53
BCRP            250.6      3755.09    120.32     893.63     6.78       21.23
UP              27.41      62.99      31.49      57.03      1.60       1.69
EG              27.09      63.28      31.00      55.55      1.59       1.68
ONS             109.19     517.21     21.59      228.37     1.62       0.88
Anticor         2.41E+08   1.05E+15   6.21E+06   5.41E+09   39.36      18.69
BK              1.08E+09   6.29E+15   4.64E+03   3.72E+06   1.62       1.53
BNN             3.35E+11   3.17E+20   6.80E+04   5.58E+07   2.27       2.17
CORN            1.48E+13   6.59E+25   5.37E+05   7.31E+07   3.56       5.00
PAMR            5.14E+15   5.57E+25   1.25E+06   1.12E+09   264.86     720.42
CWMR            6.49E+15   6.59E+25   1.41E+06   7.31E+07   332.62     172.36
OLMAR           3.68E+16   5.67E+30   2.54E+08   1.73E+12   424.80     31.63

                SP500                 MSCI                  DJIA
Algorithms      No ML      With ML    No ML      With ML    No ML      With ML
Market          1.34       1.03       0.91       0.69       0.76       0.59
Best-stock      3.78       3.78       1.50       1.50       1.19       1.19
BCRP            4.07       6.48       1.51       1.54       1.24       1.24
UP              1.62       1.75       0.92       0.71       0.81       0.66
EG              1.63       1.70       0.93       0.72       0.81       0.65
ONS             3.34       7.76       0.86       0.33       1.53       2.21
Anticor         5.89       10.73      3.22       3.40       2.29       2.89
BK              2.24       1.88       2.64       6.56       0.68       0.56
BNN             3.07       3.29       14.47      150.49     0.88       0.67
CORN            6.35       14.59      26.10      835.08     0.84       0.55
PAMR            5.09       15.91      15.23      68.83      0.68       0.84
CWMR            5.90       23.50      17.28      76.29      0.68       0.88
OLMAR           5.83       5.60       16.39      57.79      2.12       1.46

From the results, we can see that CORN and the state-of-the-art algorithms
have high time costs, while in all cases the proposed PAMR, CWMR, and OLMAR take
significantly less computational time than the others. On NYSE (O), for example, OLMAR
needs 4 s, whereas Anticor needs 2.57E+03 s. Even though the computational time in
daily back-tests, especially per trading day, is small, it matters in certain
scenarios such as high-frequency trading (Aldridge 2010), where transactions may
occur in fractions of a second. The results thus demonstrate the computational
efficiency of the three proposed mean reversion strategies, which further
enhances their applicability to real-world large-scale problems.

Table 13.4 Computational time cost (in seconds) on the six real datasets

Algorithms  NYSE (O)   NYSE (N)   TSE        SP500      MSCI       DJIA
Anticor     2.57E+03   1.93E+03   2.15E+03   387        306        175
BK          7.89E+04   5.78E+04   6.35E+03   1.95E+03   2.60E+03   802
BNN         4.93E+04   3.39E+04   1.32E+03   2.91E+03   2.55E+03   1.28E+03
CORN        8.78E+03   1.03E+04   1.59E+03   563        444        172
PAMR        8          7          2          1.1        1.0        0.3
CWMR        12         11         3          1.4        1.3        0.5
OLMAR       4          3          0.7        0.6        0.5        0.3

13.6 Experiment 6: Descriptive Analysis of Assets and Portfolios


The preceding experiments (Experiments 1 to 5), like those in related studies, focus on
comparing different algorithms from various aspects. In this section, we
perform a preliminary analysis of the behaviors of asset returns and portfolios, which
may provide some insights for future study. While the analysis on different datasets is
similar, we focus on the standard benchmark dataset, NYSE (O) (Cover 1991).∗ We
also provide the data statistics and top five average allocations of our strategies and
the state-of-the-art algorithms on the other datasets† in Appendix C.
Before analyzing their portfolios, we list some descriptive statistics on NYSE (O) in
Table 13.5, including each asset’s cumulative return over the whole period, its
(arithmetic) return mean and standard deviation, and its autocorrelation with lag 1.
Then, we plot some representative approaches’ mean weights and standard deviations
in Figure 13.12, and list the top five average weights in Table 13.6. First, let
us analyze the BCRP strategy, whose portfolio has the same weights in every period
and thus has zero standard deviation. BCRP is essentially different from the best stock
strategy (asset #30), as its weight on that stock is zero. Interestingly, BCRP focuses
on the five most volatile stocks (refer to the highest Std values in Table 13.5), which
means that its portfolio is undiversified and confirms its “volatility pumping”
(Luenberger 1998) nature. Even though asset #23 does not perform as well as
most assets, its high volatility makes it the second most heavily weighted asset. This
shows that exploiting volatile stocks, even though some of them may perform poorly,
can give good performance.
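The volatility pumping effect mentioned above can be seen in a textbook toy market. The sketch below is a minimal illustration under assumed data (a cash asset and a stock that alternately doubles and halves); it is not taken from the NYSE (O) experiments, and `crp_wealth` is a hypothetical helper.

```python
def crp_wealth(price_relatives, weights):
    """Cumulative wealth of a constant rebalanced portfolio (no transaction costs)."""
    wealth = 1.0
    for x in price_relatives:
        wealth *= sum(w * xi for w, xi in zip(weights, x))
    return wealth

# Hypothetical two-asset market: asset 0 is cash, asset 1 alternately doubles and halves.
market = [(1.0, 2.0), (1.0, 0.5)] * 10    # 20 periods

print(crp_wealth(market, (0.0, 1.0)))     # buy and hold the volatile asset: 1.0
print(crp_wealth(market, (0.5, 0.5)))     # 50/50 CRP: 1.125**10, about 3.25
```

The volatile asset ends where it started, yet rebalancing against it every period compounds a gain of 12.5% per two periods. This is the sense in which a poorly performing but highly volatile asset can still earn a large weight.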
For both EG and ONS, their portfolios have much lower volatility than those of other
strategies. In particular, EG’s portfolios always drift slightly around the initial
uniform portfolio (for NYSE (O), (1/36)1, that is, a weight of 1/36 on each asset). Such
a phenomenon can be explained by its learning rate (λ > 0), which has to be small for
the algorithm to be universal. However, as the learning rate decreases (λ → 0), the
algorithm approaches uniform CRP.‡ Our observation on EG’s portfolios confirms the
previous analysis of its parameter in Section 4.2.

∗ Due to the table constraints, we use indices to represent individual assets, whose symbols are available
at [Link]
† We ignore TSE, which has too many assets (m = 88) to show.
‡ Uniform CRP will be constant at uniform portfolio. This is approachable but not achievable as λ > 0.
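To make the role of the learning rate concrete, the following sketch applies one exponential gradient step from the uniform portfolio on 36 assets. It assumes the standard multiplicative EG update; the price relatives and the two values of λ (`lam`) are hypothetical and chosen only to show the trend.

```python
import math

def eg_update(b, x, lam):
    """One exponential-gradient step: b_i <- b_i * exp(lam * x_i / (b . x)), renormalized."""
    portfolio_return = sum(bi * xi for bi, xi in zip(b, x))
    unnormalized = [bi * math.exp(lam * xi / portfolio_return) for bi, xi in zip(b, x)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

m = 36
uniform = [1.0 / m] * m
x = [1.0] * m
x[0] = 1.10                       # hypothetical: one asset gains 10% today

drifts = {}
for lam in (0.05, 1e-6):
    b_next = eg_update(uniform, x, lam)
    drifts[lam] = max(abs(bi - 1.0 / m) for bi in b_next)
    print(lam, drifts[lam])       # the drift from uniform shrinks as lam -> 0
```

Even with the larger rate, a 10% move shifts the weight of the winning asset only in the fourth decimal place, which is consistent with EG's top weights in Table 13.6 staying close to 1/36 ≈ 0.028.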

Table 13.5 Some descriptive statistics on the NYSE (O) dataset

Stat.  1        2        3        4        5        6        7        8        9
Cum    13.10    4.35     16.10    16.90    13.36    52.02    8.76     3.07     13.71
Mean   1.0005   1.0004   1.0006   1.0006   1.0006   1.0010   1.0005   1.0003   1.0011
Std    0.0135   0.0171   0.0128   0.0170   0.0140   0.0257   0.0156   0.0136   0.0371
Ac     0.1344   0.1206   0.0817   0.0952   0.0927   0.0378   0.0615   0.0479   −0.0217

Stat.  10       11       12       13       14       15       16       17       18
Cum    14.16    10.70    6.85     7.86     6.75     7.64     32.65    30.61    12.21
Mean   1.0005   1.0006   1.0005   1.0005   1.0004   1.0004   1.0009   1.0008   1.0005
Std    0.0115   0.0175   0.0159   0.0138   0.0137   0.0130   0.0224   0.0202   0.0134
Ac     0.1114   0.1312   0.0455   0.0766   0.0637   0.0744   0.0449   0.0626   0.0064

Stat.  19       20       21       22       23       24       25       26       27
Cum    4.81     8.92     17.22    10.36    4.13     6.21     4.31     22.92    14.43
Mean   1.0004   1.0010   1.0006   1.0005   1.0015   1.0004   1.0005   1.0010   1.0006
Std    0.0146   0.0346   0.0151   0.0148   0.0505   0.0149   0.0230   0.0313   0.0142
Ac     0.1301   0.0243   0.0956   0.1047   −0.2089  −0.0042  0.0226   −0.0915  0.1002

Stat.  28       29       30       31       32       33       34       35       36
Cum    5.98     15.21    54.14    6.98     16.20    43.13    4.25     6.54     5.39
Mean   1.0004   1.0006   1.0008   1.0004   1.0006   1.0008   1.0004   1.0005   1.0004
Std    0.0139   0.0161   0.0153   0.0117   0.0159   0.0174   0.0143   0.0178   0.0143
Ac     0.0858   0.0697   0.1004   0.0870   0.0880   0.1024   0.0873   0.0626   0.0257

Note: “Cum” denotes the cumulative return (product of all price relatives) of an asset. “Mean” refers to one asset’s
arithmetic mean, and “Std” denotes the asset’s standard deviation. “Ac” denotes the autocorrelation (with lag 1) of an
asset. Numbers in bold denote the top five in the corresponding rows.
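The four statistics reported in Table 13.5 can be computed from a sequence of price relatives along the following lines. This is a sketch rather than the authors' code: the use of the sample (n − 1) standard deviation and of this particular autocorrelation estimator are assumptions, and the input series is hypothetical.

```python
import math

def describe(price_relatives):
    """Cumulative return, arithmetic mean, standard deviation, and lag-1 autocorrelation."""
    n = len(price_relatives)
    cum = math.prod(price_relatives)                      # product of all price relatives
    mean = sum(price_relatives) / n
    centered = [x - mean for x in price_relatives]
    var = sum(c * c for c in centered) / (n - 1)          # sample variance (assumption)
    lag1 = sum(a * b for a, b in zip(centered, centered[1:]))
    ac = lag1 / sum(c * c for c in centered)
    return {"Cum": cum, "Mean": mean, "Std": math.sqrt(var), "Ac": ac}

# Hypothetical price relatives for a single asset.
stats = describe([1.02, 0.99, 1.03, 0.98, 1.01])
print(stats)
```

For this zigzagging toy series the lag-1 autocorrelation is negative, the same sign as for asset #23 in Table 13.5.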
The three pattern matching–based approaches (BK, BNN, and CORN) have similar
patterns in their allocation weights, while their top five allocations vary. In general,
their volatilities are much higher than those of EG and ONS. Their concentration on asset
#23, which has the highest weight, confirms the observation (Györfi et al. 2006;
Li et al. 2011a) that the asset is important in all these approaches. Moreover, the
top five weights increase from BK to BNN to CORN, which indicates more active
exploitation and may explain their increasing performance. However, their volatilities
also show that the subsets of assets change from day to day, which is inconvenient from
the point of view of transaction costs. Nevertheless, such observations confirm that
the pattern-matching process is improving and validate CORN’s motivation.

Figure 13.12 Distributions of portfolio weights. The x-axis denotes indices of assets, and the
y-axis is each asset’s average weight. For each asset, the center of an error bar denotes its
portfolio mean (over 5651 trading days), and vertical lines denote its standard deviations:
(a) BCRP; (b) EG; (c) ONS; (d) BK; (e) BNN; (f) CORN; (g) Anticor; (h) PAMR; and (i) OLMAR.

The three mean reversion algorithms (Anticor, PAMR, and OLMAR) generally
concentrate on the top five volatile stocks, as shown in Figure 13.12g through i and
Table 13.6, while their orders may vary. Since Anticor, PAMR/CWMR, and OLMAR,
in general, achieve the best performance on most other datasets, we also list their
average allocations in Table C.5,∗ in Appendix C. From the figure and tables, we can
make several observations. First, similar to the pattern matching–based approaches,
these algorithms have much higher volatilities than EG or ONS. However, unlike
the pattern matching–based algorithms, which only have higher volatilities on the
top five weighted assets, the three algorithms also have much higher volatilities on
other assets. Concerning their performance, it is possible that to achieve better
performance, a portfolio has to be frequently rebalanced, not only on certain assets as
the pattern matching–based algorithms do but on all assets.

∗ We omit their corresponding figures, which are similar to Figure 13.12.

Table 13.6 Top five (average) allocation weights of some strategies on NYSE (O)

Asset #    6      23     9      26     20
BCRP       0.28   0.25   0.20   0.18   0.09

Asset #    23     6      20     9      16
EG         0.032  0.030  0.029  0.029  0.029

Asset #    8      35     2      3      22
ONS        0.25   0.17   0.13   0.07   0.06

Asset #    23     20     26     33     9
BK         0.21   0.11   0.08   0.07   0.06

Asset #    23     20     9      6      26
BNN        0.21   0.15   0.08   0.08   0.08

Asset #    23     9      26     6      20
CORN       0.38   0.09   0.09   0.09   0.08

Asset #    20     23     9      26     6
Anticor    0.11   0.10   0.10   0.06   0.05

Asset #    23     20     9      26     6
PAMR       0.19   0.11   0.11   0.08   0.06

Second, most average weights of the state-of-the-art algorithms are assigned to
the assets with the highest volatilities (highest Std values). It is common knowl-
edge that high return is often associated with high risk,∗ while the reverse is not
always true. That is, although a portfolio has to be rebalanced among volatile assets,
such that the portfolio can gain profits from market volatility, high volatility can-
not guarantee high profit. For example, on the NYSE (O) dataset, although Anticor
and PAMR have the same top five average allocation pool, their performances are
drastically different.
Third, PAMR, which systematically exploits the mean reversion property, rebalances
more actively than Anticor, and OLMAR rebalances even more actively.
Connecting the rebalancing activity to their performance, we may conclude that even
though they are based on the same principle, more active rebalancing leads to better
performance, as it can better exploit market volatility. PAMR’s concentration on asset
#23, which has the most negative autocorrelation (−0.2089 in Table 13.5), sheds light on
the possible connection between mean reversion algorithms and the autocorrelation
among assets (Lo and MacKinlay 1990; Conrad and Kaul 1998; Lo 2008). Moreover, from
Table C.5, we can observe that most of the top average allocation weights of the mean
reversion algorithms are on assets with negative autocorrelations, except on DJIA.

∗ Such a statement is true in traditional finance. However, in recent years, some arbitrage strategies,
which can earn return without high risk, have emerged.

13.7 Summary
In this chapter, we empirically evaluated the four proposed algorithms. The empirical
results validate the effectiveness of the proposed algorithms. In terms of
cumulative wealth, which is the main performance metric, our proposed algorithms
successively beat the state-of-the-art algorithms. In terms of (volatility/drawdown)
risk-adjusted return, the proposed algorithms achieve high risk-adjusted returns,
although they also have higher risk. The evaluations of parameter sensitivity show that
the proposed algorithms are robust to their parameters and have a wide range
of satisfactory choices that give good performance. The proposed algorithms
can also be extended to handle two practical issues, that is, margin buying and
transaction costs. Finally, although correlation-driven nonparametric learning (CORN)
takes similar time to the state of the art, the three mean reversion algorithms cost
significantly less time and are thus suitable for practical large-scale applications,
such as high-frequency trading.
In the future, we plan to study the sources of profits in online portfolio
selection (OLPS). One way is to remove the possible “bid–ask bounce” by using a
different methodology for computing the closing prices, such as averaging the prices
of several transactions, which would reduce or even eliminate the bid–ask bounce.
Moreover, incorporating other sources of information, such as volume information,
may also improve the proposed algorithms.

Chapter 14

Threats to Validity

Profitable real trading systems are complex systems, involving varying market
scenarios. While the empirical results have demonstrated the effectiveness of the
proposed strategies, there is still a long way to go before the production stage. In
this chapter, we discuss various assumptions made in the trading model, the
back-tests, and so on.
This chapter is organized as follows: Section 14.1 discusses the assumptions on
the model, and Section 14.2 discusses the assumptions on the mean reversion
principles. Section 14.3 discusses the proposed algorithms from a theoretical perspective.
Section 14.4 examines the validity of the empirical studies. Finally, Section 14.5
summarizes this chapter and proposes some future directions.

14.1 On Model Assumptions


Any statement about the encouraging empirical results achieved by the proposed
algorithms would be incomplete without acknowledging the simplifying assumptions.
Recall that we made several assumptions regarding transaction costs, market
liquidity, and market impact that would affect the algorithms’ practical deployment.
The first assumption is that no transaction cost exists. In Section 13.4, we
examined the effects of varying transaction costs, and the results show that
the proposed algorithms can withstand moderate transaction costs in most cases.
Currently, with the widespread adoption of electronic communication networks and
multilateral trading facilities in financial markets, various online trading brokers
charge very small transaction cost rates, especially for large institutional investors.
They also offer flat rates,∗ which are based on the volume one reaches. Such measures
can help portfolio managers lower their transaction cost rates.
∗ For example, for US equities and options, E*Trade ([Link] accessed on
16 March 2011.) charges only $9.99 for $50,000+ or 30+ stocks per quarter.

The second assumption is that the market is liquid and one can buy and sell any
quantity at quoted prices. In practice, low market liquidity often means a large bid–ask
spread, that is, the gap between the prices quoted for an immediate bid and an
immediate ask. As a result, the execution of orders may incur a discrepancy between
the prices sent by algorithms and the prices actually executed. Moreover, stocks are
often traded in multiples of lots, a lot being the standard trading unit containing a
fixed number of shares. In this situation, the quantity of a stock may not be arbitrarily
divisible. In our numerical evaluations, we have tried to minimize the effect of market
liquidity by choosing stocks that have large market capitalizations, which usually have
small bid–ask spreads and discrepancies, and thus high market liquidity.∗
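The divisibility issue can be made concrete with a small sketch that converts target portfolio weights into orders of whole lots. The lot size of 100 shares, the account size, and the prices are all hypothetical, and rounding down is just one possible convention.

```python
def shares_in_whole_lots(wealth, weights, prices, lot_size=100):
    """Convert target weights into share counts that are whole multiples of a lot.

    Rounds down so the orders never exceed the available wealth; the
    leftover stays in cash.
    """
    shares = []
    for w, p in zip(weights, prices):
        lots = int((wealth * w) // (p * lot_size))
        shares.append(lots * lot_size)
    cash = wealth - sum(s * p for s, p in zip(shares, prices))
    return shares, cash

# Hypothetical account of $100,000 split over three stocks.
prices = [48.0, 151.0, 12.0]
shares, cash = shares_in_whole_lots(100_000, [0.5, 0.3, 0.2], prices)
realized = [s * p / 100_000 for s, p in zip(shares, prices)]
print(shares, cash)
print(realized)   # differs from the target weights because of lot rounding
```

The realized weights deviate from the targets, most visibly for the high-priced stock, so the portfolio actually held is only an approximation of the one an algorithm outputs.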
The third assumption is that a portfolio strategy would have no impact on the
market, that is, the stock market will not be affected by any trading algorithms. In
practice, the impact can be neglected if the market capitalization of a portfolio is not too
large. However, as the experimental results show, the portfolio wealth generated by the
proposed algorithms increases very fast, which would inevitably impact the markets.
One simple way to handle this issue is to scale down the portfolio, as is done by
many quantitative funds. Moreover, the development of sell-side algorithmic trading,
which slices a big order into multiple smaller orders and schedules these orders to
minimize their market impact, can significantly decrease the potential market impact
of the proposed algorithms.
Here, we emphasize again that our current study assumes a “perfect market,”
which is consistent with existing studies in the literature. It is important to note that
even in such a perfect financial market, no algorithm has ever claimed such high
performance, especially on the standard NYSE (O) dataset. Though past performance
may not be a reliable indicator of future performance, such encouraging results do
give us confidence that the proposed algorithms may work well in future unseen
markets.

∗ However, we cannot say that we have removed or eliminated the impact of the bid–ask spread.

14.2 On Mean Reversion Assumptions

Though the proposed mean reversion algorithms perform well on most datasets, we
do not claim that they perform well on arbitrary portfolio pools. Note that
passive–aggressive mean reversion (PAMR)/confidence-weighted mean reversion (CWMR)
relies on the assumption that (single-period) mean reversion exists in a portfolio
pool, that is, buying stocks that underperformed in previous periods is profitable. The
preceding experiments seem to show that, in most cases, such mean reversion does
exist. However, it is still possible that this assumption fails to hold in certain cases,
especially when portfolio components are poorly selected. PAMR/CWMR’s
performance on the DJIA dataset indicates that (single-period) mean reversion may not
exist in that dataset. Although both are based on mean reversion, PAMR and Anticor
are formulated with different time periods of mean reversion, which may explain why
Anticor achieves good performance on DJIA. This also
motivates the proposed online moving average reversion (OLMAR), which exploits
multiple-period instead of single-period mean reversion. Thus, before investing in
a real market, it is crucial to ensure that the motivating mean reversion, either single
period or multiple period, does exist in the portfolio pool. In
academia, the mean reversion property of a single stock has been extensively studied
(Poterba and Summers 1988; Hillebrand 2003; Exley et al. 2004); one natural way to test
it is to calculate the sign of its autocorrelation (Poterba and Summers 1988). In
contrast, the mean reversion property of a portfolio has received little academic
attention. Our motivation in CWMR (Table 10.1) provides a preliminary method to test
single-period mean reversion. Unlike mean reversion in a single stock, mean
reversion in a portfolio concerns not only the mean reversion of individual stocks but
also the interactions among different stocks.
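The single-stock test mentioned above, checking the sign of the autocorrelation, might be sketched as follows. The helper names and both return series are hypothetical; this is a crude diagnostic under assumed data, not a statistical test with significance levels.

```python
def lag1_autocorrelation(series):
    """Sample autocorrelation of a series at lag 1."""
    n = len(series)
    mean = sum(series) / n
    centered = [s - mean for s in series]
    num = sum(a * b for a, b in zip(centered, centered[1:]))
    den = sum(c * c for c in centered)
    return num / den

def looks_mean_reverting(returns):
    """Crude check: a negative lag-1 autocorrelation suggests single-period reversion."""
    return lag1_autocorrelation(returns) < 0

# Hypothetical return series.
zigzag = [0.01, -0.01] * 50                 # up one day, down the next
drifting = [0.01] * 50 + [-0.01] * 50       # long runs in one direction
print(looks_mean_reverting(zigzag), looks_mean_reverting(drifting))  # True False
```

Such a per-asset check says nothing about the interactions among stocks, which is why it cannot by itself establish mean reversion at the portfolio level.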

14.3 On Theoretical Analysis


In this book, our evaluations focus on empirical aspects of the strategies, which is
unfair to some theoretically guaranteed methods, such as UP, EG, and ONS. Although
the proposed four algorithms are not designed to asymptotically achieve the
exponential growth rate of a specific expert, such as BCRP, we should discuss the
aspect of theoretical analysis, which is missing in our study.
On the one hand, we give no theoretical guarantee, or universal property, for
the four proposed algorithms. In particular, we find it hard to prove the universal
property for CORN, as it utilizes a correlation coefficient to select a similarity set.
For the three mean reversion algorithms, since the mean reversion trading idea is
counterintuitive, it is difficult to provide a traditional regret bound.∗ Although we
cannot prove a traditional regret bound, the proposed algorithms do provide strong
empirical evidence, successively advancing the state of the art.
On the other hand, it is possible to utilize certain meta-algorithms (Li et al. 2012,
2013; Li and Hoi 2012) that combine the proposed algorithms with some universal
portfolio selection algorithms, such that the entire meta-system enjoys the universal
property (Das and Banerjee 2011, Corollary 1). Meanwhile, such a meta-system can
also benefit from the proposed algorithms and produce significantly high empirical
performance. Note that even with a worst-case guarantee, some existing universal
algorithms perform poorly on the datasets. Even though it is convenient to
propose a universal meta-system, the original algorithms’ theoretical aspects are still
an open question and deserve further exploration.

∗ Borodin et al. (2004) failed to provide a regret bound for the Anticor strategy, which passively exploits
the mean reversion idea.

14.4 On Back-Tests
Due to the unavailability of intraday data and order books, we have conducted
all the experiments on public daily data, even though this may suffer from certain
potential problems. One potential problem is that our algorithms may be earning
“dealer’s profits” in an uncontrolled and unfair way, or may simply be earning from
the “bid–ask bounce” (McInish and Wood 1992; Porter 1992), which results from
trades taking place at the market maker’s bid or ask quotes. This suspicion is
compatible with the algorithms being contrarian strategies, such as PAMR, CWMR, and
OLMAR. To rule out this possibility, it would be good to eliminate the bid–ask
bounce by replacing the market prices with the midpoints of the best bid and ask
prices (Gosnell et al. 1996). However, calculating the midpoints of the best bid and
ask prices requires access to the order book, which is usually private and not free,
rather than simply the log of transactions. Another possibility would be to take into
account only “sell-type” (or only “buy-type”) transactions, meaning the transactions
in response to market orders to sell, in which case the buying counterpart would be
the one issuing a limit order. However, this also requires one
to find out the order type (Keim and Madhavan 1995; Foucault et al. 2005) of each
trade, which is usually not available to the public.
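A toy example shows why the bid–ask bounce can masquerade as mean reversion and why midpoints remove it. The quotes and the alternating trade sequence below are hypothetical; real closing trades do not alternate this regularly.

```python
def midpoint(bid, ask):
    """Midpoint of the best bid and ask quotes."""
    return (bid + ask) / 2.0

def price_relatives(prices):
    """Ratio of each price to the previous one."""
    return [b / a for a, b in zip(prices, prices[1:])]

# Hypothetical stock whose quotes never move: bid 9.95, ask 10.05.
bid, ask = 9.95, 10.05
closing_trades = [bid, ask, bid, ask, bid]          # last trade alternates sides
midpoints = [midpoint(bid, ask)] * len(closing_trades)

print(price_relatives(closing_trades))   # oscillates around 1: spurious mean reversion
print(price_relatives(midpoints))        # [1.0, 1.0, 1.0, 1.0]: nothing to exploit
```

A contrarian strategy back-tested on the trade prices would appear to profit from a stock whose fair value never changed, a profit that could not be realized when actually trading at the quotes.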
Back-tests in historical markets may suffer from “data-snooping bias” issues, one
of which is the dataset selection issue. On the one hand, we selected four datasets,
the NYSE (O), TSE, SP500, and DJIA datasets, based on previous studies and without
regard to the proposed approaches. On the other hand, we developed the
proposed algorithms solely on the NYSE (O) dataset, while the other five
datasets (NYSE (N), TSE, SP500, MSCI, and DJIA) were obtained after the algorithms
were fully developed. However, even though we are cautious about the dataset
selection issue, it may still appear in the experiments, especially for the datasets with
a relatively long history, that is, NYSE (O) and NYSE (N). The NYSE (O) dataset,
pioneered by Cover (1991) and followed by other researchers, is a “standard” dataset
in the online portfolio selection community. Since it contains 36 large-cap NYSE
stocks that survived for 22 years, it suffers from extreme survivorship bias. Nevertheless,
it still has merit for comparing different algorithms, as done in all previous studies. The
NYSE (N) dataset, as a continuation of NYSE (O), contains 23 assets that survived
from the previous 36 stocks for another 25 years. Therefore, it is even worse
than its predecessor in terms of survivorship bias. In summary, even though the empirical
results on these datasets show the effectiveness of the proposed algorithms,
one cannot make claims without noting the deficiencies of these datasets.
Another common bias is the asset selection issue. Four of the six datasets (the
NYSE (O), TSE, SP500, and DJIA) were collected by others, and to the best of our
knowledge, their assets are mainly the largest blue chip stocks in their respective
markets. We collected NYSE (N) ourselves as a continuation of NYSE (O); it again
contains several of the largest surviving stocks in NYSE (O). The remaining dataset
(MSCI)∗ is chosen according to the world indices. In summary, we try to avoid
the asset selection bias by arbitrarily choosing some representative stocks in their
respective markets, which usually have large capitalization and high liquidity and
thus reduce the market impact caused by any proposed portfolio strategies.
∗ In fact, we collected this dataset following the review comments on Li et al. (2012), which means the dataset
did not exist before its third-round submission.

Moreover, there are some criticisms regarding the datasets’ liquidity, since we
assume that the assets are available in unbounded quantities for buying or selling
in any given trading period. In Table 13.1, we observe cumulative wealth of 10^13 or more,
while there are assets with capitalization less than 10^10; then, obviously, the liquidity
assumption is not fulfilled. In NYSE (O), there are many such assets, and even in
NYSE (N) there are four such assets: SHERW, KODAK, COMME, and KINAR. The
most “dangerous” asset is KINAR, identified as asset #23 in Table 13.5, for which there
are no data on its capitalization, but it is certainly a very small asset. One remedy is
to consider only the remaining 19 assets out of the 23 in the experiments, as done
by Györfi et al. (2012, Chapter 2).
Finally, following existing model assumptions and experimental setting, we do not
consider the low-quality assets, such as the bankrupt and penny stocks. The bankrupt
stock data are difficult to acquire; thus, we cannot observe their behaviors and predict
the behaviors of the proposed algorithms. In reality, the bankruptcy situation rarely
happens for blue chip stocks because typically a bankrupt stock would be removed
from the list of blue chip stocks before it actually goes into bankruptcy. The penny
stocks lack sufficient liquidity to support the trading frequency required for our current
research. Besides, one could also explore many practical strategies to exclude such
low-quality stocks from the asset pool at some early stage, such as technical and
fundamental analysis.

14.5 Summary
This chapter discussed some assumptions in our models and back-tests, which are
shared by much empirical research on trading strategies. When back-testing a strategy,
researchers should be aware of these assumptions so that they can take measures to
weaken their impact on the profits in real trading.

Part V

Conclusion



Chapter 15

Conclusions

If there’s one thing I learned in prison it’s that money is not the prime commodity
in our lives... time is.
– Wall Street 2: Money Never Sleeps

15.1 Conclusions
This book aims to advance the state of the art in online portfolio selection (OLPS).
Here, our objective is to achieve better performance on real markets. The main
principles we adopted are the principles of pattern matching and mean reversion.
For the principle of pattern matching, we try to locate similar patterns from the
historical market and construct optimal portfolios based on these patterns. Observing
that existing pattern matching–based approaches often adopt Euclidean distance to
measure the similarity between two patterns, we find that Euclidean distance ignores
their linear similarity and whole-market movements. Thus, we proposed to measure
the similarity via a correlation coefficient, which considers both ignored aspects,
and designed the CORrelation-driven Nonparametric learning (CORN) approach for
OLPS. The proposed CORN performs much better than existing pattern matching–
based strategies, which validates its motivations.
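The contrast between the two similarity measures can be illustrated with three small market windows. The sketch below is not CORN itself: the windows are hypothetical, already flattened into vectors, and only the two measures are compared.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient between two equal-length vectors."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su > 0 and sv > 0 else 0.0

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Flattened market windows (hypothetical price relatives).
current = [1.01, 0.99, 1.02, 0.98]
same_shape = [1.05, 0.95, 1.10, 0.90]   # same direction of moves, larger amplitude
opposite = [0.99, 1.01, 0.98, 1.02]     # mirror image of `current`

for name, window in (("same_shape", same_shape), ("opposite", opposite)):
    print(name, round(euclidean(current, window), 4), round(pearson(current, window), 4))
```

Euclidean distance ranks the mirror-image window as the closer match, whereas the correlation coefficient separates the two cleanly (close to +1 versus close to −1). A correlation-based selection rule would therefore keep only the first window for any positive threshold.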
For mean reversion, we directly output portfolios based on the principle, which
assumes that the price trends will revert to their previous trends. First, we proposed
to exploit the principle via passive–aggressive learning, resulting in the
passive–aggressive mean reversion (PAMR). In particular, PAMR tries to obtain portfolios
that perform worse than a threshold on the last price relatives and that are also close
to the last portfolio. PAMR’s formulation is easy to understand, and its closed-form
solutions reflect the mean reversion principle. At the time, PAMR achieved the best
empirical performance on most datasets.
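The description above can be turned into a small sketch of a single update. It follows the basic PAMR closed form as published (a step against the mean-centered price relatives, followed by projection onto the simplex); the threshold value `eps=0.5`, the helper names, and the two-asset data are assumptions made for illustration.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumulative += uj
        t = (cumulative - 1.0) / j
        if uj - t > 0:
            theta = t
    return [max(vi - theta, 0.0) for vi in v]

def pamr_update(b, x, eps=0.5):
    """One PAMR step: move away from recent winners if the last return exceeds eps."""
    x_bar = sum(x) / len(x)
    loss = max(0.0, sum(bi * xi for bi, xi in zip(b, x)) - eps)   # passive when loss is 0
    denom = sum((xi - x_bar) ** 2 for xi in x)
    tau = loss / denom if denom > 0 else 0.0
    return project_to_simplex([bi - tau * (xi - x_bar) for bi, xi in zip(b, x)])

b_next = pamr_update([0.5, 0.5], [1.10, 0.90])
print(b_next)   # weight shifts toward the asset that just fell
```

The step is "passive" when the last portfolio return is at or below the threshold and "aggressive" otherwise, which is where the name comes from.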
Observing that most existing algorithms only exploit the first-order information
of a portfolio, we proposed to exploit the second-order information and the mean
reversion property via confidence-weighted learning, resulting in a new family of
strategies called confidence-weighted mean reversion (CWMR). It models the portfolio
as a Gaussian distribution and sequentially updates the distribution similarly to
PAMR, but exploits both first-order and second-order information. CWMR’s closed-form
updates effectively trade off between the first- and second-order information of
a portfolio. Empirically, it generally outperforms other existing strategies, including
CORN and PAMR, on most datasets.
Analyzing the existing algorithms via Kelly’s framework, we find that the above
two mean reversion algorithms follow the assumption of single-period mean rever-
sion, which leads to performance degradation on certain datasets. To handle the
degradation and to further exploit the market, we proposed two forms of multiperiod
mean reversion, both of which are based on a moving average, and the online mov-
ing average reversion (OLMAR), which is more robust than PAMR. Empirically,
OLMAR is currently the best strategy, beating all existing algorithms, including
CORN, PAMR, and CWMR.
We conducted an extensive set of empirical evaluations, in which the results clearly
validate the effectiveness of the proposed algorithms. In particular, the proposed
algorithms sequentially advance the state of the art in terms of cumulative return,
which is the main performance metric of our studies. Besides, they are fairly robust
to their parameter settings, and most of the algorithms are generally computationally
efficient and are thus suitable for real-life large-scale environments.

15.2 Future Directions


15.2.1 On Existing Work
We have presented a family of algorithms for OLPS that represent the state of the art
of OLPS in academia. However, there is still room to further improve the existing
algorithms.
First of all, we calculate CORN’s correlation coefficient using a univariate correlation by concatenating each market window into a column vector, during which we may lose useful structural information. Thus, it is possible to improve its performance using a multivariate correlation coefficient on the matrices such that the information is retained. Since information is crucial for a portfolio selection task, retaining it may allow us to construct more effective portfolios. Another potential improvement is to redesign the CORN algorithm for settings where transaction costs exist, since the cumulative wealth achieved by CORN decreases exponentially with increasing transaction costs. One possible solution is to add a regularization term to the portfolio maximization step, such that we maximize expected return while constraining expected turnover.
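As a rough illustration of the univariate correlation described above, the following Python sketch flattens two market windows into vectors and computes their Pearson correlation coefficient. It is not the toolbox's MATLAB code; the function names are hypothetical, and returning zero for a constant window is an assumption made here. The flattening step is where the period-by-asset structure of a window is discarded.

```python
import math

def flatten_window(window):
    """Concatenate a market window (a list of price relative vectors) into one vector."""
    return [x for period in window for x in period]

def window_correlation(window_a, window_b):
    """Pearson correlation between two flattened market windows of equal shape."""
    a, b = flatten_window(window_a), flatten_window(window_b)
    if len(a) != len(b):
        raise ValueError("windows must have the same shape")
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: a constant window is treated as uncorrelated
    return cov / math.sqrt(var_a * var_b)
```

Two windows with identical movements score 1, and windows whose movements are mirrored around 1 score -1, regardless of which asset or period each entry came from.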
Moreover, currently we only consider the similarity between two market win-
dows with the same length and same interval; however, locating patterns with
varying timing is also attractive in the pattern matching–based approaches. Dynamic
time warping (DTW) (Rabiner and Levinson 1981; Sakoe and Chiba 1990; Keogh
2002) is a dynamic programming approach proposed to recognize humans’ spoken
words, which vary a lot in timing and pronunciation. Since similarity among time
series also varies in time, DTW has been successfully applied to find patterns in
time series (Berndt and Clifford 1994; Yi et al. 1998; Keogh and Pazzani 2000;
Rakthanmanon et al. 2012). In finding patterns in time series, DTW’s basic idea
is to stretch or compress the time axis, such that the distance between time series
and a template is minimized. In this way, one template can match various time
series with varying timing. Since the pattern matching–based approaches try to locate
similar patterns among the historical time series, DTW may be directly applied to
locate time-warping patterns. To the best of our knowledge, no existing algorithm
has ever considered the time compression or stretch of market windows. However,
one associated problem with DTW in practice is its expensive time cost (Wang et al.
2013).
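The basic idea of DTW can be sketched with the classic dynamic program below. This is a minimal Python illustration for one-dimensional series with an absolute-difference cost, not an implementation from the cited works; the function name is hypothetical. The nested loops make the quadratic time cost visible.

```python
def dtw_distance(series, template):
    """Dynamic time warping distance between two 1-D sequences (absolute-difference cost)."""
    n, m = len(series), len(template)
    inf = float("inf")
    # cost[i][j]: minimal cumulative cost of aligning series[:i] with template[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(series[i - 1] - template[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # template point reused (template stretched)
                cost[i][j - 1],      # series point reused (template compressed)
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]
```

For instance, a template that dwells one extra step on a value still matches the shorter series at zero cost, which a fixed-length comparison could not express.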
Second, although the three mean reversion algorithms outperform the previous algorithms on most datasets, they may fail in certain cases. For example, if the market contains one stock that drops continuously and significantly over some periods, then all three algorithms may fail. Such scenarios may simulate a financial crisis in which all stocks continuously drop for a certain period, or the case in which one stock is hit by a series of bad news and its price falls continuously over a certain period. In fact, for such cases, most current well-performing algorithms, including Anticor and all pattern matching–based algorithms, will underperform the naive market strategy. In our experiments, we choose blue chip stocks, assuming that they would not drop continuously for a long period. However, it is always possible that a “black swan” (Taleb 2008) may exist in the financial market and hit a portfolio. Such a hypothetical market sequence poses a serious challenge: how can one achieve reasonable performance in such a market?
Third, in Section 9.4, we briefly connect PAMR’s update and the general form of return-based contrarian strategies. Besides PAMR, we find that GP’s update in Section 4.2 also coincides with the general form of return-based momentum strategies. This finding is not surprising, as both adopt similar learning techniques but opposite trading ideas, that is, contrarian for PAMR and momentum for GP. Based on such a connection, we can anatomize a portfolio’s expected return and find its sources of profit. Although pure trading strategies have been anatomized for a long time, OLPS algorithms have yet to be anatomized. Such an analysis will help understand the behaviors of OLPS algorithms, which are largely unknown in current research. The resulting empirical observations can help validate the analysis, as is done in previous studies (Lo and MacKinlay 1990; Conrad and Kaul 1998; Lo 2008).
Fourth, as empirically analyzed in Section 13.6 and discussed in Section 14,
different asset pools are suitable for different algorithms. Also, as there exist thousands
of assets in the financial markets, it would be more computationally efficient to select
a subset of assets. This poses an open challenge, that is, how to prepare a proper asset
pool for specific OLPS algorithms. This challenge is similar to the task of feature
selection, and we plan to explore general methods to automatically select effective
subsets of assets.
Finally, although all the proposed algorithms significantly outperform the state of the art, one important aspect missing in the current study is their theoretical guarantees. As discussed in Section 14.3, there exist several explanations and measures to handle the issue. Nevertheless, theoretical guarantees, either the universal property or others, are still missing for the proposed algorithms. In the future, we plan to present some nontrivial theoretical guarantees for the proposed algorithms, such that we can be more confident regarding the algorithms’ practical applicability.

15.2.2 On Practical Issues
To improve the practicability of OLPS algorithms, one should also tackle the practical
issues in real markets, so as to relax the ideal assumptions of the market model and
provide possible applications.
A crucial practical issue is transaction cost (Kissell et al. 2003), which has been addressed in the literature (Blum and Kalai 1999; Iyengar 2005; Lobo et al. 2007; Györfi and Vajda 2008; Kozat and Singer 2008, 2010). Among these studies, one research direction is to investigate a simple linear model with a proportional transaction cost (Blum and Kalai 1999; Borodin et al. 2004; Györfi and Vajda 2008), which is easier to handle than nonlinear models (Takano and Gotoh 2011). Different from existing solutions, which mainly trade off between the expected return and expected transaction costs, one can apply regularization methods (Tibshirani 1996), which resolve ill-posed problems, to minimize portfolio turnover at a reasonable expected return.
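To make the proportional model concrete, the Python sketch below charges a cost rate on the turnover between the portfolio held after prices have moved and the new target portfolio. The specific form used here, a wealth factor of 1 − (γ/2) Σ|b_{t,i} − b̂_{t−1,i}|, is one common convention and is an assumption of this sketch, as are the function names, the parameter name gamma, and the choice to treat the first allocation as free.

```python
def drifted_portfolio(b_prev, x_prev):
    """Portfolio weights after prices move by x_prev, before any rebalancing."""
    growth = sum(b * x for b, x in zip(b_prev, x_prev))
    return [b * x / growth for b, x in zip(b_prev, x_prev)]

def net_wealth(portfolios, price_relatives, gamma):
    """Cumulative wealth (initial wealth 1) under a proportional cost rate gamma."""
    wealth = 1.0
    held = None  # weights held before trading; assumption: the first allocation is free
    for b, x in zip(portfolios, price_relatives):
        if held is not None:
            turnover = sum(abs(bi - hi) for bi, hi in zip(b, held))
            wealth *= 1.0 - gamma / 2.0 * turnover  # cost of rebalancing to b
        wealth *= sum(bi * xi for bi, xi in zip(b, x))  # period return
        held = drifted_portfolio(b, x)
    return wealth
```

With gamma set to zero the function reduces to the plain product of period returns, so the gap between the two values measures what a strategy pays for its turnover.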
In addition to addressing explicit transaction costs, another interesting future
direction is to ignore commission fees and taxes but consider them as part of the
bid–ask spread, which allows trading strategies to issue not only market but also limit
orders. This would allow for negative transaction costs,∗ which become profit of the
dealer/algorithm.
Another practical issue in the task is to incorporate useful side information, such as experts’ opinions, firms’ fundamental information, and technical indicators. As shown in existing studies (Cover and Ordentlich 1996; Fagiuoli et al. 2007), such side information may facilitate the portfolio selection task. Moreover, other state-of-the-art research (Zhang and Skiena 2008, 2010) focuses on learning from real-time news to facilitate single-stock trading, while its use in portfolio trading is still new. Most existing works manually encode the side information as integers, and one potential direction is to sequentially learn appropriate integers from the side information. The learned integers can help us to handle the side information and thus improve the performance of the proposed approaches.

15.2.3 Learning for Index Tracking


The literature (Meade and Salkin 1989, 1990) shows that most active management
funds focusing on absolute return usually do not outperform the corresponding mar-
ket index. Thus, index mutual funds, which try to track the market index, have
emerged in recent decades. Briefly speaking, index tracking is passive portfolio man-
agement aiming to construct a portfolio that tracks one real or virtual index using
stock shares of as few companies as possible. Currently, some computational intelli-
gence approaches (Gilli and Këllezi 2002; Beasley et al. 2003; Coleman et al. 2006;
Maringer 2008; Canakgoz and Beasley 2009) have been proposed to tackle this task.
We also notice the development of lasso techniques (Tibshirani 1996; Fu 1998; Zou
2006; Bach 2008; Yang et al. 2010; Zhou et al. 2010), which mainly control the spar-
sity of a decision variable. Thus, it is possible to handle the index-tracking problem
via the lasso techniques (McWilliams and Montana 2010). On the one hand, it can ensure that the required tracking error is minimized under the constraint that the number of chosen companies is less than a predefined number. On the other hand, we can stratify the companies into groups and achieve sparsity among groups or within a group. The main challenge in learning an index-tracking portfolio using lasso techniques is the contradiction between the simplex constraint and the lasso constraint, as the former deactivates the latter. Brodie et al. (2009) solved this challenge in the traditional mean variance model, while how to tackle this challenge in the context of online index tracking remains to be investigated.

∗ By issuing limit orders, the algorithm would play the traditional role of a dealer.

Appendix A

OLPS: A Toolbox for Online Portfolio Selection

This appendix presents an open-source software toolbox for online portfolio selection (OLPS), which implements a collection of classical and state-of-the-art OLPS strategies powered by machine learning techniques.
OLPS aims to sequentially allocate capital among a set of assets to maximize long-term return. In recent years, a variety of machine learning algorithms have been proposed to address this challenging problem, but no comprehensive open-source toolbox has been available. This may have significantly hindered the progress of research and development of new techniques in this field.
The OLPS toolbox was designed and developed to facilitate the investigation and development of new methods and to enable performance benchmarking and comparisons of different existing strategies. OLPS is an open-source project implemented in MATLAB and is compatible with Octave. The software toolbox has been released under the Apache License (version 2.0), and it is freely available at: [Link]

A.1 Introduction
A.1.1 Target Task
In this section, we briefly formulate the OLPS model used throughout the toolbox. Suppose we have a finite number of $m \ge 2$ investment assets, over which an investor can invest for a finite number of $n \ge 1$ periods.
At the $t$-th period, $t = 1, \ldots, n$, the asset (close) prices are represented by a vector $\mathbf{p}_t \in \mathbb{R}_+^m$, and each element $p_{t,i}$, $i = 1, \ldots, m$, represents the close price of asset $i$. Their price changes are represented by a price relative vector $\mathbf{x}_t \in \mathbb{R}_+^m$, each component of which denotes the ratio of the $t$-th close price to the last close price, that is, $x_{t,i} = p_{t,i} / p_{t-1,i}$. Thus, an investment in asset $i$ throughout period $t$ changes by a factor of $x_{t,i}$. Let us denote by $\mathbf{x}_1^n = \{\mathbf{x}_1, \ldots, \mathbf{x}_n\}$ a sequence of price relative vectors for $n$ periods, and $\mathbf{x}_s^e = \{\mathbf{x}_s, \ldots, \mathbf{x}_e\}$, $1 \le s < e \le n$, as a market window.
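The definition of the price relative vector can be illustrated with a short Python sketch. It is not part of the MATLAB toolbox; the function name is hypothetical, and the input is assumed to include an initial close-price vector before the first period so that every period has a predecessor.

```python
def price_relatives(prices):
    """Convert close-price vectors p_0, p_1, ..., p_n into price relatives x_1, ..., x_n."""
    relatives = []
    for prev, curr in zip(prices, prices[1:]):
        if any(p <= 0 for p in prev):
            raise ValueError("close prices must be positive")
        # x_{t,i} = p_{t,i} / p_{t-1,i}
        relatives.append([c / p for c, p in zip(curr, prev)])
    return relatives
```

A price relative above 1 means the asset gained over the period, and one below 1 means it lost; n + 1 price vectors yield n price relative vectors.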

An investment in the market for the $t$-th period is specified by a portfolio vector $\mathbf{b}_t = (b_{t,1}, \ldots, b_{t,m})$, where $b_{t,i}$, $i = 1, \ldots, m$, represents the proportion of wealth invested in asset $i$ at the beginning of the $t$-th period. Typically, a portfolio is self-financed and no margin/short sale is allowed; therefore, each entry of a portfolio is nonnegative and the entries add up to one, that is, $\mathbf{b}_t \in \Delta_m$, where $\Delta_m = \{\mathbf{b}_t : \mathbf{b}_t \succeq 0, \sum_{i=1}^{m} b_{t,i} = 1\}$.∗ The investment procedure is represented by a portfolio strategy, that is, $\mathbf{b}_1 = (\frac{1}{m}, \ldots, \frac{1}{m})$ and the following sequence of mappings:

$$\mathbf{b}_t : \mathbb{R}_+^{m(t-1)} \to \Delta_m, \quad t = 2, 3, \ldots,$$

where $\mathbf{b}_t = \mathbf{b}_t(\mathbf{x}_1^{t-1})$ is the portfolio determined at the beginning of the $t$-th period upon observing past market behaviors. We denote by $\mathbf{b}_1^n = \{\mathbf{b}_1, \ldots, \mathbf{b}_n\}$ the strategy for $n$ periods, which is the output of an OLPS strategy.
At the $t$-th period, a portfolio $\mathbf{b}_t$ produces a portfolio period return $s_t$, that is, the wealth changes by a factor of $s_t = \mathbf{b}_t^\top \mathbf{x}_t = \sum_{i=1}^{m} b_{t,i} x_{t,i}$. Since we reinvest and adopt relative prices, the wealth would change multiplicatively. Thus, after $n$ periods, a portfolio strategy $\mathbf{b}_1^n$ will produce a portfolio cumulative wealth of $S_n$, which changes the initial wealth by a factor of $\prod_{t=1}^{n} \mathbf{b}_t^\top \mathbf{x}_t$:

$$S_n(\mathbf{b}_1^n, \mathbf{x}_1^n) = S_0 \prod_{t=1}^{n} \mathbf{b}_t^\top \mathbf{x}_t,$$

where $S_0$ denotes the initial wealth and is set to \$1 for convenience.
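As a small worked example with purely hypothetical numbers, take $m = 2$ assets, $n = 2$ periods, and the uniform portfolio $\mathbf{b}_1 = \mathbf{b}_2 = (\frac{1}{2}, \frac{1}{2})$, with price relatives $\mathbf{x}_1 = (1.2, 0.9)$ and $\mathbf{x}_2 = (0.8, 1.1)$. The period returns and the cumulative wealth then follow directly from the definitions:

$$s_1 = \tfrac{1}{2}(1.2) + \tfrac{1}{2}(0.9) = 1.05, \qquad s_2 = \tfrac{1}{2}(0.8) + \tfrac{1}{2}(1.1) = 0.95,$$

$$S_2 = S_0 \, s_1 \, s_2 = 1 \times 1.05 \times 0.95 = 0.9975.$$

The wealth compounds multiplicatively: it is the product of the period returns, not their sum.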

Algorithm A.1: Online portfolio selection.

Input: $\mathbf{x}_1^n$: Historical market price relative sequence
Output: $S_n$: Final cumulative wealth
Initialize $S_0 = 1$, $\mathbf{b}_1 = (\frac{1}{m}, \ldots, \frac{1}{m})$;
for $t = 1, 2, \ldots, n$ do
    Portfolio manager learns a portfolio $\mathbf{b}_t$;
    Market reveals a price relative vector $\mathbf{x}_t$;
    Portfolio incurs period return $s_t = \mathbf{b}_t^\top \mathbf{x}_t$ and updates cumulative return $S_t = S_{t-1} \times (\mathbf{b}_t^\top \mathbf{x}_t)$;
    Portfolio manager updates his/her decision rules;
end
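The loop of Algorithm A.1 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the toolbox's MATLAB implementation: the function names are hypothetical, the strategy is passed in as a callable that sees only past price relatives, and when no strategy is supplied the portfolio stays uniform in every period.

```python
def run_olps(price_relatives, learn_portfolio=None):
    """Simulate Algorithm A.1; returns (final wealth, wealth after each period)."""
    m = len(price_relatives[0])
    wealth = 1.0        # S_0 = 1
    history = []        # x_1, ..., x_{t-1} observed so far
    wealth_path = []
    for x_t in price_relatives:
        if learn_portfolio is None or not history:
            b_t = [1.0 / m] * m              # b_1 is the uniform portfolio
        else:
            b_t = learn_portfolio(history)   # b_t depends only on past data
        if min(b_t) < 0 or abs(sum(b_t) - 1.0) > 1e-9:
            raise ValueError("portfolio must lie on the simplex")
        s_t = sum(b * x for b, x in zip(b_t, x_t))  # period return b_t' x_t
        wealth *= s_t                                # S_t = S_{t-1} * s_t
        wealth_path.append(wealth)
        history.append(x_t)   # x_t is revealed only after b_t has been fixed
    return wealth, wealth_path

def follow_last_winner(history):
    """Hypothetical toy strategy: put all wealth on the best asset of the last period."""
    last = history[-1]
    best = last.index(max(last))
    return [1.0 if i == best else 0.0 for i in range(len(last))]
```

Calling `run_olps(x)` with no strategy gives a uniform constant rebalanced portfolio, while `run_olps(x, follow_last_winner)` shows how a learning rule plugs into the same loop without ever seeing the current period's price relatives.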

We present the framework of the above task in Algorithm A.1. In this task, a portfolio manager’s goal is to produce a portfolio strategy ($\mathbf{b}_1^n$) upon the market price relatives ($\mathbf{x}_1^n$), aiming to achieve certain targets. He or she computes the portfolios in a sequential manner. At each period $t$, the manager has access to the sequence of past price relative vectors $\mathbf{x}_1^{t-1}$. He or she then computes a new portfolio $\mathbf{b}_t$ for the next price relative vector $\mathbf{x}_t$, where the decision criterion varies among different managers. Then the manager will rebalance to the new portfolio via buying and selling the underlying stocks. At the end of a trading period, the market will reveal $\mathbf{x}_t$. The resulting portfolio $\mathbf{b}_t$ is scored based on the portfolio period return $s_t$. This procedure is repeated until the end, and the portfolio strategy is finally scored by the portfolio cumulative wealth $S_n$.

∗ $\succeq 0$ denotes that each element of the vector is nonnegative.
Note that we have made several general and common nontrivial assumptions in
the above model:
1. Transaction cost: no explicit or implicit transaction costs∗ exist.
2. Market liquidity: one can buy and sell the required amount, even fractional, at the
last close price of any given trading period.
3. Market impact: any portfolio selection strategy shall not influence the market or
any other stocks’ prices.
All the implemented strategies follow the same architecture in Algorithm A.1, and
they are called at Line 3.

A.1.2 Installation
A.1.2.1 Supported Platforms
OLPS is based on MATLAB (both 32- and 64-bit) and Octave (except the Graphical User Interface [GUI] part); thus, it is supported on 32- and 64-bit versions of Linux, Mac OS, and Windows. The first version of OLPS was developed and tested on MATLAB 2009a, while the latest version of OLPS has been tested on MATLAB 2013a.

A.1.2.2 Installation Instructions


Installation of the toolbox consists of two steps:
1. Retrieve the latest version of OLPS from the project website. The package can be
downloaded as either a .zip file or a .[Link] file.
2. Unpack the package to any folder. The root directory is named “OLPS.”
The toolbox is then available in that folder. Note that the directory structure of the toolbox is predefined; it determines where datasets are loaded from and where logs are written.

A.1.2.3 Folders and Paths


The toolbox consists of five folders in relative path: “/Strategy,” “/Data,” “/GUI,” “/Log,” and “/Documentation.” The folder “/Strategy” contains the core strategies for online portfolio selection, which will be introduced in Section A.3. The folder also contains the commands used in the Command Line Interface (CLI), which will be introduced in Section A.2.2. The folder “/Data” includes some popular datasets in .mat format, which will be detailed in Section A.1.4. The folder “/GUI” includes the files to run the Graphical User Interface, which will be detailed in Section A.2.1. The folder “/Log” stores the experimental details of a strategy on a dataset, which will be generated after the simulation process. The folder “/Documentation” contains the documentation of the toolbox, including one summary paper and one comprehensive document.

∗ Explicit costs include commissions, taxes, stamp duties, and fees. Implicit costs include the bid–ask spread, opportunity costs, and slippage costs.

A.1.3 Implemented Strategies


Table A.1 illustrates all implemented strategies in the toolbox.

Table A.1 All implemented strategies in the toolbox

| Categories | Strategies | Sections | Strategy Names |
|---|---|---|---|
| Benchmarks | Uniform Buy and Hold | A.3.1.1 | ubah |
| | Best Stock | A.3.1.2 | best |
| | Uniform Constant Rebalanced Portfolios | A.3.1.3 | ucrp |
| | Best Constant Rebalanced Portfolios | A.3.1.4 | bcrp |
| Follow the Winner | Universal Portfolios | A.3.2.1 | up |
| | Exponential Gradient | A.3.2.2 | eg |
| | Online Newton Step | A.3.2.3 | ons |
| Follow the Loser | Anticorrelation | A.3.3.1 | anticor/anticor_anticor |
| | Passive–Aggressive Mean Reversion | A.3.3.2 | pamr/pamr_1/pamr_2 |
| | Confidence-Weighted Mean Reversion | A.3.3.3 | cwmr_var/cwmr_stdev |
| | Online Moving Average Reversion | A.3.3.4 | olmar1/olmar2 |
| Pattern Matching | Nonparametric Kernel-Based Log-Optimal | A.3.4.1 | bk |
| | Nonparametric Nearest Neighbor Log-Optimal | A.3.4.2 | bnn |
| | Correlation-Driven Nonparametric Learning | A.3.4.3 | corn/cornu/cornk |
| Others | M0 | | m0 |
| | T0 | | t0 |

Table A.2 All included datasets in the toolbox

| File Names (.mat) | Dataset | Region | Time Frame | # Periods | # Assets |
|---|---|---|---|---|---|
| nyse-o | NYSE (O) | US | 07/03/1962–12/31/1984 | 5651 | 36 |
| nyse-n | NYSE (N) | US | 01/01/1985–06/30/2010 | 6431 | 23 |
| tse | TSE | CA | 01/04/1994–12/31/1998 | 1259 | 88 |
| sp500 | SP500 | US | 01/02/1998–01/31/2003 | 1276 | 25 |
| msci | MSCI | Global | 04/01/2006–03/31/2010 | 1043 | 24 |
| djia | DJIA | US | 01/14/2001–01/14/2003 | 507 | 30 |

A.1.4 Included Datasets


As shown in Table A.2, six main datasets are widely used for the OLPS task. We do not include the high-frequency datasets (Li et al. 2013) as they are private. Other variants, such as the reversed datasets (Borodin et al. 2004) and margin datasets (Helmbold et al. 1998), which users may generate themselves, are not provided in the toolbox.

A.1.5 Quick Start


To get started quickly with the OLPS toolbox, we provide two fast-entry files. One is GUI_start.m, which starts the GUI. The other is CLI_demo.m, which executes all strategies one by one in the command line. All the parameters used in the file are set according to the strategies’ original studies.

A.2 Framework and Interfaces


In this toolbox, we provide two interfaces to call the implemented strategies, that is,
Graphical User Interface (GUI) and Command Line Interface (CLI). The framework
can be easily extended to include new algorithms and datasets.

A.2.1 Graphical User Interface


In the GUI, users will call the implemented algorithms via interaction with the GUI.
We provide a menu-driven interface for the user to select datasets and algorithms,
and input the desired arguments. After providing the inputs and hitting the start but-
ton, the algorithm(s) execute. Upon completion, the results and relevant graphs are
displayed.

A.2.1.1 Getting Started


To start the GUI, we type the following command in MATLAB:

>> OLPS_gui

After executing the above command, the Trading Manager starts. As shown in Figure A.1, the opening window has five buttons. The About and Exit buttons are self-explanatory. The other three are the main functional buttons. The Algorithm Analyser button will start a new window, in which the user can run a single algorithm and analyze its performance relative to the basic benchmarks. The Experimenter button is used for selecting multiple algorithms and comparing their performances. The Configuration button is used to add or delete algorithms and datasets that can be used by the toolbox.

Figure A.1 Starting the Trading Manager.

A.2.1.2 Algorithm Analyser


On pressing the Algorithm Analyser button, a new window opens that will be used
for running and analyzing an algorithm. Figure A.2 depicts the Algorithm Analyser
running the Online Moving Average Reversion on the S&P500 dataset. There are
drop-down menus for selecting the algorithm and the dataset. The input parame-
ter fields will dynamically change depending on the inputs the algorithm requires
(default parameters have been provided). When a particular dataset is selected, some
preliminary performance details of the algorithm are displayed. There are three types
of preliminary results displayed. Basic Benchmarks displays the cumulative returns
for four simple algorithms—Uniform Buy and Hold, Uniform Constant Rebalanced
Portfolio, Best Stock in hindsight, and Best Constant Rebalanced Portfolio (BCRP).
For more details on these algorithms, refer to Section A.3. The Returns Distribution
shows the annualized mean return and standard deviation of each asset in the dataset.
The All Assets option shows the performance graph of cumulative returns of all the
assets in the dataset.


Figure A.2 Various components of the Algorithm Analyser.

A.2.1.3 Experimenter
When devising trading strategies, we usually want to compare the performance of
these strategies relative to each other. For this purpose, we provide the Experimenter.
On pressing the Experimenter button, a new window opens that will offer us the plat-
form for comparing different strategies. First, the dataset is selected. From the list of
algorithms, a subset can be selected to be executed. Among the selected algorithms,
the input parameters have to be provided and saved (default values are already there).
Figure A.3 gives an example of comparing six strategies on the MSCI World Index
dataset. The six algorithms being compared are Uniform Buy & Hold, Uniform Con-
stant Rebalanced Portfolio, Best Constant Rebalanced Portfolio, Passive–Aggressive
Mean Reversion, Confidence Weighted Mean Reversion, and Online Moving Average
Reversion.

A.2.1.4 Results Manager


After hitting the Start button in the Algorithm Analyser or Experimenter, the execution starts. In the Algorithm Analyser, a progress bar indicates the execution of a single algorithm. In the case of the Experimenter, there are two progress bars. One indicates the number of algorithms executed (along with which algorithm is being executed currently) and the other shows the status of completion of that individual algorithm.


Figure A.3 Various components of the Experimenter.

When the execution is over, the Results Manager shows all the basic performance metrics of the algorithms. Since we have two different managers, one for analyzing a single algorithm and one for comparing multiple algorithms, we made two different Results Managers.
Results Manager 1. The first Results Manager, for the Algorithm Analyser, is shown in Figure A.4. The table in the window quantifies the results of the algorithm as compared to the basic benchmarks. The numbers from this table can directly be copied and pasted. There is a large graph space that displays the information on a particular attribute selected in the left column.
Returns. This section contains information about the daily performance of the algorithm. The user can choose to view the cumulative returns and the daily returns. The option of a log (base 10) plot is provided for easier visualization when the difference in performance between the algorithm and the benchmarks is large.
Risk Analysis. There are five metrics to evaluate the risk and risk-adjusted returns of the algorithm. They are the Sharpe Ratio, Calmar Ratio, Sortino Ratio, Value at Risk, and Maximum Drawdown. An input box called Window is provided next to each metric. The purpose of the window is to analyze the consistency of the algorithm, instead of just the final result. For example, entering 252 in the Sharpe Ratio Window will plot a graph of the Sharpe Ratio of the algorithm for time period t − 252 to t, for all t. When the window size is large such that t is less than the window size, then the computation starts from t = 1. The risk metrics are assumed to be zero for the first 50 time periods. This has been done to avoid extreme values due to lack of data in the initial periods.
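The windowed computation described for the risk metrics can be sketched in Python for the Sharpe Ratio. Several details are assumptions of this sketch rather than documented toolbox behavior: the ratio is taken as the mean excess return divided by the sample standard deviation, with a zero risk-free rate and no annualization; the inputs are net per-period returns; and the window covers the last `window` periods ending at t. All names are hypothetical.

```python
import statistics

def rolling_sharpe(returns, window, warmup=50, risk_free=0.0):
    """Sharpe ratio over the trailing `window` periods ending at each period t."""
    ratios = []
    for t in range(1, len(returns) + 1):
        if t <= warmup:
            ratios.append(0.0)       # metric is reported as zero in the initial periods
            continue
        start = max(0, t - window)   # fall back to the first period when t < window
        excess = [r - risk_free for r in returns[start:t]]
        spread = statistics.stdev(excess) if len(excess) > 1 else 0.0
        ratios.append(statistics.mean(excess) / spread if spread > 0 else 0.0)
    return ratios
```

Plotting the returned list against t gives a curve of the kind the Window box is meant to produce: a strategy whose final ratio is high but whose rolling ratio swings widely is less consistent than the single summary number suggests.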
Figure A.4 Results Manager for Algorithm Analyser.

Portfolio Analysis. The Portfolio Allocation shows the distribution of wealth allocated to each asset by the algorithm. The Step by Step option helps us look at the portfolio allocation for any particular given day. Lastly, we have a portfolio Animation that accepts an input called Window. Visualizing portfolio changes based on daily frequency can be overwhelming and difficult to interpret, especially when the daily portfolio changes are significant. Instead, we allow the user to choose a moving average portfolio of the last Window number of days. This results in a smoother change of the portfolio allocation.
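The smoothing behind the Animation view can be illustrated with a short Python sketch that averages the last `window` portfolios at each day. The function name is hypothetical, and using a shorter average for the first few days is an assumption of this sketch, not a documented detail of the GUI.

```python
def moving_average_portfolios(portfolios, window):
    """Average the most recent `window` portfolios at each period."""
    smoothed = []
    for t in range(len(portfolios)):
        recent = portfolios[max(0, t - window + 1): t + 1]  # fewer entries early on
        m = len(recent[0])
        smoothed.append([sum(b[i] for b in recent) / len(recent) for i in range(m)])
    return smoothed
```

Because an average of portfolios whose weights sum to one also sums to one, each smoothed vector is still a valid allocation and can be drawn in the same way as a daily portfolio.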
Results Manager 2. The second Results Manager is very similar to the first manager, except that it is designed for the Experimenter. The table in the window quantifies the performance of the algorithms relative to each other. Like the first manager, this manager also has three sections. A preview of this manager can be seen in Figure A.5.
Returns. The daily returns across the entire time period of the dataset for all the algorithms can be overwhelming to view. A time period can be selected, and the daily performance of the algorithms is displayed for only that time period.
Risk Analysis. This section is almost identical to that of the first Results Manager. The only difference is that here the metrics are evaluated for every algorithm and displayed together.
Portfolio Analysis. This shows the distribution of portfolio allocation for all the algorithms.

A.2.1.5 Configuration Manager


Here, we describe how to add or delete new algorithms and datasets via the
Configuration Manager, as shown in Figure A.6.


Figure A.5 Results Manager for Experimenter.

New Strategy. A template (“template.m”) has been provided in the Strategy folder that is based on the general framework for OLPS (as described in Algorithm A.1). The user should enter the code that learns the new portfolio within the specified region of the loop. Without any changes to the code, the template will behave as a Uniform Constant Rebalanced Portfolio strategy, owing to the fact that we start with a uniform portfolio and never update it. All new strategies must remain in the Strategy folder. Once the files are created in the folders, the configuration should be changed using the Configuration Manager GUI, which controls the loading of algorithms and datasets into the Trading Manager.
New Dataset. A dataset is in the form of price relative vectors of various assets. The t-th row represents the price relatives of all the assets at time t. The user just has to save the new price relative matrix in the Data folder. Data of different frequencies can be used as well. All the datasets provided in the toolbox are of daily frequency. Once the files are created in the folders, the configuration should be changed using the Configuration Manager GUI, which controls the loading of algorithms and datasets into the Trading Manager.
Configuration. The configuration determines the algorithms and datasets being used in the toolbox. Within the config folder, there is a file called config. This is the active configuration, which means the toolbox uses this file to determine which algorithms and datasets are preloaded. There is another file, config_default, which is the configuration provided by the toolbox. Initially, the contents of the default and the active configurations are the same. A new configuration can be created by clicking on the Configuration button in the start window. It automatically loads the active configuration, to which the user can add new algorithms or datasets, or from which existing ones can be deleted.

Figure A.6 Configuration Manager.

A.2.2 Command Line Interface


In the CLI, users can run algorithms by calling the commands. In particular, we provide
a meta-function named manager, which is responsible for preprocessing (such as ini-
tializing datasets and variables, etc.), calling specified strategies, and postprocessing
(such as analyzing and outputting the results, etc.).

A.2.2.1 Trading Manager

Algorithm A.2: Trading manager for online portfolio selection.

Input: strategy_name: A string of the specified strategy;
       dataset_name: A string of the specified dataset;
       varargins: A variable-length input argument list for the specified strategy;
       opts: A variable for options controlling the trading environment.
Output: cum_ret: Final cumulative wealth;
        cumprod_ret: Cumulative wealth at the end of each period;
        daily_ret: Daily return for each period;
        ra_ret: Analyzed results, including risk-adjusted returns;
        run_time: Time for the strategy (in seconds).
begin
    Initialize market data from dataset;
    Open the log file and mat file;
    Start the time variables;
    Call strategy with parameters in varargins;
    Terminate the time variables;
    Analyze the results;
    Close the log file and mat file;
end

The Trading Manager, as shown in Algorithm A.2, controls the whole simulation
of OLPS. At the start (Line 2), it loads market data from the specified dataset. Note
that this can be easily extended to load data from real brokers. Then, Lines 3 and 8
open and close two logging files, one for text and one for .MAT format. Lines 4 and
6 measure the computational time of the execution of a specified strategy. Measuring
the time in the trading manager ensures a fair comparison of the computational time
among different strategies. Line 5 is the core component, which calls the specified
strategy with specified parameters. Section A.3 will illustrate all included strategies
and their usages. Line 7 analyzes the executed results of the strategy, which will be
introduced later. The “manager.m” usage is shown as follows.

Usage
function [cum_ret, cumprod_ret, daily_ret, ra_ret, run_time] ...
    = manager(strategy_name, dataset_name, varargins, opts);
• cum_ret: cumulative return;
• cumprod_ret: a vector of cumulative returns at the end of every trading day;
• daily_ret: a vector of daily returns at the end of every trading day;
• ra_ret: analyzed result;
• run_time: computational time of the core strategy (excluding the manager routine);

T&F Cat #K23731 — K23731_A001 — page 154 — 9/28/2015 — 20:46


• strategy_name: the name of the strategy (all implemented strategies’ names are
listed in the fourth column of Table A.1);
• dataset_name: the name of the dataset;
• varargins: variable-length input argument list; and
• opts: options for behavioral control.

Example This example calls the ubah (Uniform Buy and Hold, or commonly
referred to as the market strategy) strategy on the “NYSE (O)” dataset.

[cum_ret, cumprod_ret, daily_ret, ra_ret, run_time] ...
    = manager('ubah', 'nyse-o', {0}, opts);

To facilitate the debugging of trading strategies, we also use controlling variables
to control the trading environment. In particular, the last parameter opts in the above
example contains the controlling variables. As shown in Table A.3, it consists of six
controlling variables.
The Results Manager analyzes the results and returns an array containing the basic
statistics, the Sharpe ratio and Calmar ratio, and their related statistics. Details about
the returned statistics are described in Table A.4.

Usage

function [ra_ret] ...
    = ra_result_analyze(fid, data, cum_ret, cumprod_ret, daily_ret, opts);

Adding Your Own Strategy or Data Adding new strategies and datasets in the CLI
mode is similar to that in the GUI mode. Adding the strategy involves replacing the
portfolio update component of the algorithms, and adding a dataset involves storing
the market matrix and placing the files in the data folder.

Table A.3 Controlling variables

Variables              Descriptions                 Possible Values          Explanation for Values
opts.quiet_mode        display debug info?          0 or 1                   No or Yes
opts.display_interval  display info time interval?  Any number (e.g., 500)   Display every 500 periods
opts.log_record        record the .log file?        0 or 1                   No or Yes
opts.mat_record        record the .mat file?        0 or 1                   No or Yes
opts.analyze_mode      analyze the algorithm?       0 or 1                   No or Yes
[Link]                 show the progress bar?       0 or 1                   No or Yes

Table A.4 Vector of the analyzed results
Index Descriptions
1 Number of periods
2 Strategy’s average period return
3 Market’s average period return
4 Strategy’s winning ratio over the market
5 Alpha (α)
6 Beta (β)
7 t-statistics
8 p-value
9 Annualized percentage yield
10 Annualized standard deviation
11 Sharpe ratio
12 Drawdown at the end
13 Maximum drawdown during the periods
14 Calmar ratio
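
Several entries of Table A.4 can be illustrated with a minimal Python sketch (the toolbox itself is written in MATLAB, and this is not its implementation). The function name `risk_metrics`, the default of 252 trading periods per year, and the 4% risk-free rate are assumptions made for illustration only.

```python
import math

def risk_metrics(daily_ret, periods_per_year=252, risk_free=0.04):
    """daily_ret: list of gross period returns (1.01 means +1%).

    All defaults are illustrative assumptions, not toolbox settings.
    """
    n = len(daily_ret)
    # Cumulative wealth after each period, starting from 1.
    wealth, curve = 1.0, []
    for r in daily_ret:
        wealth *= r
        curve.append(wealth)
    # Annualized percentage yield from the final wealth.
    apy = wealth ** (periods_per_year / n) - 1.0
    # Annualized standard deviation of the net period returns.
    mean = sum(r - 1.0 for r in daily_ret) / n
    var = sum((r - 1.0 - mean) ** 2 for r in daily_ret) / (n - 1)
    vol = math.sqrt(var) * math.sqrt(periods_per_year)
    sharpe = (apy - risk_free) / vol if vol > 0 else float("nan")
    # Maximum drawdown: largest relative drop from a running peak.
    peak, mdd = 1.0, 0.0
    for w in curve:
        peak = max(peak, w)
        mdd = max(mdd, (peak - w) / peak)
    # Calmar ratio: annualized yield divided by maximum drawdown.
    calmar = apy / mdd if mdd > 0 else float("nan")
    return {"apy": apy, "vol": vol, "sharpe": sharpe, "mdd": mdd, "calmar": calmar}
```

The sketch needs at least two periods, because the sample variance divides by n - 1.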

A.2.2.2 Examples

Example 1 Calling the BCRP strategy on the SP500 dataset with verbose output muted:

>> opts.quiet_mode = 1; opts.display_interval = 500;
opts.log_mode = 1; opts.mat_mode = 1;
opts.analyze_mode = 1; [Link] = 0;
>> manager('bcrp', 'sp500', {0}, opts);

Then the algorithm outputs are listed below:

>> manager(’bcrp’, ’sp500’, {0}, opts);


----Begin bcrp on sp500-----
-------------------------------------
BCRP(tc=0.0000), Final return: 4.07
-------------------------------------
----End bcrp on sp500-----
>>

Example 2 Calling the BCRP strategy on the SP500 dataset with verbose output displayed:

>> opts.quiet_mode = 0; opts.display_interval = 200;
opts.log_mode = 1; opts.mat_mode = 1;
opts.analyze_mode = 1; [Link] = 0;
>> manager('bcrp', 'sp500', {0}, opts);

Then the algorithm outputs are listed below:

>> manager(’bcrp’, ’sp500’, {0}, opts);


Running strategy bcrp on dataset sp500
Loading dataset sp500.
Finish loading dataset sp500
The size of the dataset is 1276x25.
Start Time: 2013-0721-13-22-05-664.
----Begin bcrp on sp500-----
-------------------------------------
Parameters [tc:0.000000]
day Daily Return Total return
500 1.055339 4.634783
1000 1.018404 4.560191
BCRP(tc=0.0000), Final return: 4.07
-------------------------------------
----End bcrp on sp500-----
Stop Time: 2013-0721-13-22-08-144.
Elapse time(s): 2.486262.
Result Analysis
-------------------------------------
Statistical Test
Size: 1276
MER(Strategy): 0.0015
MER(Market):0.0003
WinRatio:0.5063
Alpha:0.0010
Beta:1.3216
t-statistics:2.1408
p-Value:0.0162
-------------------------------------
Risk Adjusted Return
Volatility Risk analysis
APY: 0.3240
Volatility Risk: 0.4236
Sharpe Ratio: 0.6705
Drawdown analysis
APY: 0.3240
DD: 0.3103
MDD: 0.5066
CR: 0.6395
-------------------------------------
>>

A.3 Strategies
This section focuses on describing the implemented strategies in the toolbox. We
describe the four implemented categories of algorithms: benchmarks, follow the
winner, follow the loser, and pattern matching–based approaches.

A.3.1 Benchmarks
In the financial markets, there exist various benchmarks (such as indices, etc.). In this
section, we introduce four benchmarks: Uniform Buy and Hold, Best Stock, Uniform
Constant Rebalanced Portfolios, and Best Constant Rebalanced Portfolios.

A.3.1.1 Uniform Buy and Hold


Description The “buy and hold” (BAH) strategy buys the set of assets at the
beginning and holds the allocation of assets till the end of trading periods. BAH with
an initial uniform portfolio is termed “uniform buy and hold” (UBAH), which is often
referred to as the market strategy in the related literature. The final cumulative
wealth achieved by a BAH strategy is the initial portfolio weighted average of
individual stocks’ final wealth,

$$S_n(\mathrm{BAH}(\mathbf{b}_1)) = \mathbf{b}_1 \cdot \left( \bigodot_{t=1}^{n} \mathbf{x}_t \right),$$

where $\mathbf{b}_1$ denotes the initial portfolio. In the case of UBAH,
$\mathbf{b}_1 = \left(\frac{1}{m}, \ldots, \frac{1}{m}\right)$. To see its update
clearly, BAH’s explicit portfolio update can also be written as

$$\mathbf{b}_{t+1} = \frac{\mathbf{b}_t \odot \mathbf{x}_t}{\mathbf{b}_t^\top \mathbf{x}_t}, \qquad \text{(A.1)}$$

where $\odot$ denotes the operation of element-wise product.
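
The two BAH formulas above can be checked against each other with a short Python sketch. It is an illustration only, not the toolbox's MATLAB code, and the function names `bah_final_wealth` and `bah_update` are hypothetical.

```python
def bah_final_wealth(b1, price_relatives):
    """Final BAH wealth: b1 . (element-wise product of all x_t)."""
    growth = [1.0] * len(b1)
    for x in price_relatives:
        # Accumulate each asset's own growth factor.
        growth = [g * xi for g, xi in zip(growth, x)]
    return sum(b * g for b, g in zip(b1, growth))


def bah_update(b, x):
    """Equation A.1: b_{t+1} = (b_t (.) x_t) / (b_t . x_t)."""
    total = sum(bi * xi for bi, xi in zip(b, x))
    return [bi * xi / total for bi, xi in zip(b, x)]
```

Multiplying the per-period returns obtained while drifting the portfolio with `bah_update` should reproduce the value returned by `bah_final_wealth`.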

Usage
ubah(fid, data, {λ}, opts);
• fid: file handle for writing log file;
• data: market sequence matrix;
• λ ∈ [0, 1): proportional transaction cost rate; and
• opts: options for behavioral control.

Example Call market (uniform BAH) strategy on the “NYSE (O)” dataset with a
transaction cost rate of 0.

>> manager('market', 'nyse-o', {0}, opts);

A.3.1.2 Best Stock
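Description “Best Stock” (Best) is a special BAH strategy that buys the best
stock in hindsight. The final cumulative wealth achieved by the Best strategy can
be calculated as

$$S_n(\mathrm{Best}) = \max_{\mathbf{b} \in \Delta_m} \mathbf{b} \cdot \left( \bigodot_{t=1}^{n} \mathbf{x}_t \right) = S_n(\mathrm{BAH}(\mathbf{b}^{\circ})),$$

where the initial portfolio $\mathbf{b}^{\circ}$ can be calculated as

$$\mathbf{b}^{\circ} = \arg\max_{\mathbf{b} \in \Delta_m} \mathbf{b} \cdot \left( \bigodot_{t=1}^{n} \mathbf{x}_t \right).$$

Its portfolio update can also be explicitly written in the same way as Equation A.1,
except that the initial portfolio equals $\mathbf{b}^{\circ}$.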
Usage
best(fid, data, {λ}, opts);

• fid: file handle for writing log file;


• data: market sequence matrix;
• λ ∈ [0, 1): transaction costs rate; and
• opts: options for behavioral control.

Example Call Best Stock strategy on the “NYSE (O)” dataset with a transaction
cost rate of 0.

>> manager('best', 'nyse-o', {0}, opts);

A.3.1.3 Uniform Constant Rebalanced Portfolios


Description “Constant rebalanced portfolios” (CRP) is a fixed proportion strategy,
which rebalances to a preset portfolio at the beginning of every period. In particular,
the portfolio strategy can be represented as $\mathbf{b}_1^n = \{\mathbf{b}, \mathbf{b}, \ldots\}$.
The final cumulative portfolio wealth achieved by a CRP strategy after $n$ periods is
defined as

$$S_n(\mathrm{CRP}(\mathbf{b})) = \prod_{t=1}^{n} \mathbf{b}^\top \mathbf{x}_t.$$

In particular, UCRP chooses a uniform portfolio as the preset portfolio, that is,
$\mathbf{b} = \left(\frac{1}{m}, \ldots, \frac{1}{m}\right)$.
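
As a small illustration of the CRP wealth formula, the following Python sketch (not the toolbox's MATLAB code; `crp_final_wealth` is a hypothetical name) multiplies the per-period portfolio returns.

```python
def crp_final_wealth(b, price_relatives):
    """Final CRP wealth: product over t of b . x_t, starting from wealth 1."""
    wealth = 1.0
    for x in price_relatives:
        # The portfolio is rebalanced to b before every period.
        wealth *= sum(bi * xi for bi, xi in zip(b, x))
    return wealth
```

On a two-asset market in which one asset stays flat and the other alternately doubles and halves, the uniform CRP earns 1.5 x 0.75 = 1.125 over two periods, whereas uniform buy and hold ends at 1.0.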
Usage
ucrp(fid, data, {λ}, opts);
• fid: file handle for writing log file;
• data: market sequence matrix;

• λ ∈ [0, 1): transaction costs rate; and
• opts: options for behavioral control.

Example Call UCRP strategy on the “NYSE (O)” dataset with a transaction cost
rate of 0.

>> manager('ucrp', 'nyse-o', {0}, opts);

A.3.1.4 Best Constant Rebalanced Portfolios


Description “Best constant rebalanced portfolio” (BCRP) is a special CRP strategy
that sets the portfolio as the portfolio that maximizes the terminal wealth in hindsight.
BCRP achieves a final cumulative portfolio wealth as follows:

$$S_n(\mathrm{BCRP}) = \max_{\mathbf{b} \in \Delta_m} S_n(\mathrm{CRP}(\mathbf{b})) = S_n(\mathrm{CRP}(\mathbf{b}^{\star})),$$

and its portfolio is calculated in hindsight as

$$\mathbf{b}^{\star} = \arg\max_{\mathbf{b} \in \Delta_m} \log S_n(\mathrm{CRP}(\mathbf{b})) = \arg\max_{\mathbf{b} \in \Delta_m} \sum_{t=1}^{n} \log(\mathbf{b}^\top \mathbf{x}_t).$$
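
For two assets, the hindsight optimization can be illustrated by a brute-force search over the simplex. This Python sketch is an assumption-laden illustration, not how the toolbox solves the problem: the function name `bcrp_two_assets` and the grid resolution are hypothetical, and a real implementation for m assets would use a convex optimizer.

```python
import math

def bcrp_two_assets(price_relatives, grid=1000):
    """Grid search for the weight on asset 1 that maximizes sum_t log(b . x_t)."""
    best_b, best_log = None, -math.inf
    for k in range(grid + 1):
        w = k / grid  # candidate weight on the first asset
        log_wealth = 0.0
        for x1, x2 in price_relatives:
            log_wealth += math.log(w * x1 + (1.0 - w) * x2)
        if log_wealth > best_log:
            best_b, best_log = (w, 1.0 - w), log_wealth
    # Return the best portfolio found and its final wealth.
    return best_b, math.exp(best_log)
```

The sketch assumes strictly positive price relatives, so that every logarithm is defined.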

Usage
bcrp(fid, data, {λ}, opts);
• fid: file handle for writing log file;
• data: market sequence matrix;
• λ ∈ [0, 1): transaction costs rate; and
• opts: options for behavioral control.

Example Call BCRP strategy on the “NYSE (O)” dataset with a transaction cost
rate of 0.

>> manager('bcrp', 'nyse-o', {0}, opts);

A.3.2 Follow the Winner


The Follow the Winner approach is characterized by transferring portfolio weights
from the underperforming assets (experts) to the outperforming ones.

A.3.2.1 Universal Portfolios


Description Cover’s (1991) “Universal Portfolios” (UP) uniformly buys and holds
the whole set of CRP experts within the simplex domain. Its cumulative wealth is
calculated as

$$S_n(\mathrm{UP}) = \int_{\Delta_m} S_n(\mathbf{b}) \, d\mu(\mathbf{b}).$$

Moreover, we adopt an implementation (Kalai and Vempala 2002) that is based on
rapidly mixing nonuniform random walks and runs in polynomial time.
Usage
up(fid, data, {λ}, opts);

• fid: file handle for writing log file;


• data: market sequence matrix;
• λ ∈ [0, 1): transaction costs rate; and
• opts: options for behavioral control.

Example Call Cover’s Universal Portfolios on the “NYSE (O)” dataset with default
parameters and a transaction cost rate of 0.

>> manager('up', 'nyse-o', {0}, opts);

A.3.2.2 Exponential Gradient


Description “Exponential gradient” (EG) (Helmbold et al. 1996) tracks the best
stock and adopts a regularization term to constrain the deviation from the previous
portfolio, that is, EG’s formulation is

$$\mathbf{b}_{t+1} = \arg\max_{\mathbf{b} \in \Delta_m} \eta \log \mathbf{b} \cdot \mathbf{x}_t - R(\mathbf{b}, \mathbf{b}_t),$$

where $\eta$ refers to the learning rate and $R(\mathbf{b}, \mathbf{b}_t)$ denotes
relative entropy, or $R(\mathbf{b}, \mathbf{b}_t) = \sum_{i=1}^{m} b_i \log \frac{b_i}{b_{t,i}}$.
Solving the optimization, we can obtain EG’s explicit portfolio update:

$$b_{t+1,i} = b_{t,i} \exp\left( \eta \frac{x_{t,i}}{\mathbf{b}_t \cdot \mathbf{x}_t} \right) / Z, \quad i = 1, \ldots, m,$$

where $Z$ denotes the normalization term such that the portfolio elements sum to 1.
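
The explicit EG update translates almost line by line into code. The following is a minimal Python sketch of one update step, not the toolbox's MATLAB implementation; the name `eg_update` is hypothetical.

```python
import math

def eg_update(b, x, eta=0.05):
    """One EG step: b_{t+1,i} = b_{t,i} * exp(eta * x_{t,i} / (b_t . x_t)) / Z."""
    period_return = sum(bi * xi for bi, xi in zip(b, x))
    unnormalized = [bi * math.exp(eta * xi / period_return) for bi, xi in zip(b, x)]
    z = sum(unnormalized)  # normalization term Z
    return [u / z for u in unnormalized]
```

With a small learning rate the weights move only slightly toward the asset that performed better in the last period.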
Usage
eg(fid, data, {η, λ}, opts);
• fid: file handle for writing log file;
• data: market sequence matrix;
• η: learning rate;
• λ: transaction costs rate; and
• opts: options for behavioral control.

Example Call EG on the “NYSE (O)” dataset with a learning rate of 0.05 and a
transaction cost rate of 0.

>> manager('eg', 'nyse-o', {0.05, 0}, opts);

A.3.2.3 Online Newton Step
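Description “Online Newton Step” (ONS) (Agarwal et al. 2006) tracks the best CRP
to date and adopts an L2-norm regularization to constrain the portfolio’s variability.
In particular, its formulation is

$$\mathbf{b}_{t+1} = \arg\max_{\mathbf{b} \in \Delta_m} \sum_{\tau=1}^{t} \log(\mathbf{b} \cdot \mathbf{x}_\tau) - \frac{\beta}{2} \|\mathbf{b}\|.$$

Solving the optimization, we can obtain the explicit portfolio update of ONS:

$$\mathbf{b}_1 = \left(\frac{1}{m}, \ldots, \frac{1}{m}\right), \qquad \mathbf{b}_{t+1} = \Pi_{\Delta_m}^{\mathbf{A}_t}\left(\delta \mathbf{A}_t^{-1} \mathbf{p}_t\right),$$

with

$$\mathbf{A}_t = \sum_{\tau=1}^{t} \frac{\mathbf{x}_\tau \mathbf{x}_\tau^\top}{(\mathbf{b}_\tau \cdot \mathbf{x}_\tau)^2} + \mathbf{I}_m, \qquad \mathbf{p}_t = \left(1 + \frac{1}{\beta}\right) \sum_{\tau=1}^{t} \frac{\mathbf{x}_\tau}{\mathbf{b}_\tau \cdot \mathbf{x}_\tau},$$

where $\beta$ is the trade-off parameter, $\delta$ is a scaling term, and
$\Pi_{\Delta_m}^{\mathbf{A}_t}(\cdot)$ is an exact projection to the simplex domain.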
Usage
ons(fid, data, {η, β, δ, λ}, opts)
• fid: file handle for writing log file;
• data: market sequence matrix;
• η: mixture parameter;
• β: trade-off parameter;
• δ: heuristic tuning parameter;
• λ: transaction costs rate; and
• opts: options for behavioral control.

Example Call the ONS on the “NYSE (O)” dataset with a transaction cost rate of 0.

>> manager('ons', 'nyse-o', {0, 1, 1/8, 0}, opts);

A.3.3 Follow the Loser


The Follow the Loser approaches assume that the underperforming assets will revert
and outperform others in the subsequent periods. Thus, their common behavior is to
move portfolio weights from the outperforming assets to the underperforming assets.

A.3.3.1 Anticorrelation
Description “Anticorrelation” (Anticor) (Borodin et al. 2004) transfers the
wealth from the outperforming stocks to the underperforming stocks via their
cross-correlation and autocorrelation. Anticor adopts logarithmic price relatives in
two specific market windows, that is, $\mathbf{y}_1 = \log(\mathbf{x}_{t-2w+1}^{t-w})$ and
$\mathbf{y}_2 = \log(\mathbf{x}_{t-w+1}^{t})$. It then calculates the cross-correlation
matrix between $\mathbf{y}_1$ and $\mathbf{y}_2$:

$$M_{cov}(i, j) = \frac{1}{w-1} (\mathbf{y}_{1,i} - \bar{y}_1)^\top (\mathbf{y}_{2,j} - \bar{y}_2),$$

$$M_{cor}(i, j) = \begin{cases} \dfrac{M_{cov}(i, j)}{\sigma_1(i)\,\sigma_2(j)} & \sigma_1(i), \sigma_2(j) \neq 0 \\ 0 & \text{otherwise.} \end{cases}$$

Then, following the cross-correlation matrix, Anticor moves the proportions from the
stocks that increased more to the stocks that increased less, with the corresponding
amounts adjusted according to the cross-correlation matrix. In particular, if asset $i$
increases more than asset $j$ and their sequences in the window are positively
correlated, Anticor claims a transfer from asset $i$ to $j$ with the amount equaling the
cross-correlation value ($M_{cor}(i, j)$) minus their negative autocorrelation values
($\min\{0, M_{cor}(i, i)\}$ and $\min\{0, M_{cor}(j, j)\}$). These transfer claims are
finally normalized to keep the portfolio in the simplex domain.

Usage We implemented two Anticor algorithms, $\mathrm{BAH}_W(\mathrm{Anticor})$ and
$\mathrm{BAH}_W(\mathrm{Anticor}(\mathrm{Anticor}))$. Their usages are listed below.

anticor(fid, data, {W, λ}, opts);
anticor_anticor(fid, data, {W, λ}, opts);

• fid: file handle for writing log file;


• data: market sequence matrix;
• W: maximal window size;
• λ: transaction cost rates; and
• opts: options for behavioral control.

Example Call both Anticor algorithms on the “NYSE (O)” dataset with a maximal
window size of 30 and a transaction cost rate of 0.

>> manager('anticor', 'nyse-o', {30, 0}, opts);
>> manager('anticor_anticor', 'nyse-o', {30, 0}, opts);

A.3.3.2 Passive–Aggressive Mean Reversion


Description Rather than tracking the best stock, “passive–aggressive mean
reversion” (PAMR) (Li et al. 2012) explicitly tracks the worst stocks, while adopting
regularization techniques to constrain the deviation from the last portfolio. In
particular, PAMR’s formulation is

$$\mathbf{b}_{t+1} = \arg\min_{\mathbf{b} \in \Delta_m} \frac{1}{2} \|\mathbf{b} - \mathbf{b}_t\|^2 \quad \text{s.t.} \quad \ell_\epsilon(\mathbf{b}; \mathbf{x}_t) = 0,$$

where $\ell_\epsilon(\mathbf{b}; \mathbf{x}_t)$ denotes a predefined loss function to
capture the mean reversion property,

$$\ell_\epsilon(\mathbf{b}; \mathbf{x}_t) = \begin{cases} 0 & \mathbf{b} \cdot \mathbf{x}_t \leq \epsilon \\ \mathbf{b} \cdot \mathbf{x}_t - \epsilon & \text{otherwise.} \end{cases}$$

Solving the optimization, we can obtain PAMR’s portfolio update:

$$\mathbf{b}_{t+1} = \mathbf{b}_t - \tau_t (\mathbf{x}_t - \bar{x}_t \mathbf{1}), \qquad \tau_t = \max\left\{ 0, \frac{\mathbf{b}_t \cdot \mathbf{x}_t - \epsilon}{\|\mathbf{x}_t - \bar{x}_t \mathbf{1}\|^2} \right\}.$$
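
The closed-form PAMR step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the toolbox's MATLAB code: the name `pamr_update` is hypothetical, $\bar{x}_t$ is taken to be the plain average of the price relatives, and the projection back onto the simplex that the published algorithm applies afterwards is omitted.

```python
def pamr_update(b, x, eps=0.5):
    """One PAMR step: b_{t+1} = b_t - tau_t * (x_t - x_bar * 1)."""
    m = len(x)
    x_bar = sum(x) / m                      # assumed: plain mean of x_t
    dev = [xi - x_bar for xi in x]
    denom = sum(d * d for d in dev)         # ||x_t - x_bar * 1||^2
    loss = max(0.0, sum(bi * xi for bi, xi in zip(b, x)) - eps)
    tau = loss / denom if denom > 0 else 0.0
    # No simplex projection here, so entries may leave [0, 1].
    return [bi - tau * d for bi, d in zip(b, dev)]
```

When the loss is positive, the updated portfolio satisfies $\mathbf{b}_{t+1} \cdot \mathbf{x}_t = \epsilon$ exactly. When the return is already at most $\epsilon$, the portfolio is left unchanged (the passive case).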

Usage We implemented three PAMR algorithms (i.e., PAMR, PAMR-I, and
PAMR-II). Their usages are listed below.

pamr(fid, data, {ε, λ}, opts);
pamr_1(fid, data, {ε, C, λ}, opts);
pamr_2(fid, data, {ε, C, λ}, opts);

• fid: file handle for writing log file;
• data: market sequence matrix;
• ε: mean reversion threshold;
• C: aggressive parameter;
• λ: transaction cost rates; and
• opts: options for behavioral control.

Example Call the three PAMR algorithms on the “NYSE (O)” dataset with a mean
reversion threshold of 0.5, an aggressive parameter of 500, and a transaction cost rate
of 0.

>> manager('pamr', 'nyse-o', {0.5, 0}, opts);
>> manager('pamr_1', 'nyse-o', {0.5, 500, 0}, opts);
>> manager('pamr_2', 'nyse-o', {0.5, 500, 0}, opts);

A.3.3.3 Confidence-Weighted Mean Reversion


Description “Confidence-weighted mean reversion” (CWMR) (Li et al. 2013)
models the portfolio vector as a Gaussian distribution and explicitly updates the
distribution following the mean reversion principle. In particular, CWMR’s
formulation is

$$(\boldsymbol{\mu}_{t+1}, \boldsymbol{\Sigma}_{t+1}) = \arg\min_{\boldsymbol{\mu} \in \Delta_m, \boldsymbol{\Sigma}} D_{KL}\left( \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma}) \,\|\, \mathcal{N}(\boldsymbol{\mu}_t, \boldsymbol{\Sigma}_t) \right) \quad \text{s.t.} \quad \Pr[\boldsymbol{\mu} \cdot \mathbf{x}_t \leq \epsilon] \geq \theta.$$

Expanding the constraint, the resulting optimization problem is not convex. The
authors provided two methods to solve the optimization (i.e., CWMR-Var and
CWMR-Stdev). CWMR-Var involves linearizing the constraint and solving the
resulting optimization, and one can obtain the closed-form update scheme as

$$\boldsymbol{\mu}_{t+1} = \boldsymbol{\mu}_t - \lambda_{t+1} \boldsymbol{\Sigma}_t (\mathbf{x}_t - \bar{x}_t \mathbf{1}), \qquad \boldsymbol{\Sigma}_{t+1}^{-1} = \boldsymbol{\Sigma}_t^{-1} + 2 \lambda_{t+1} \phi \mathbf{x}_t \mathbf{x}_t^\top,$$

where $\lambda_{t+1}$ corresponds to the Lagrangian multiplier calculated by Eq. (11) in
Li et al. (2013), and
$\bar{x}_t = \frac{\mathbf{1}^\top \boldsymbol{\Sigma}_t \mathbf{x}_t}{\mathbf{1}^\top \boldsymbol{\Sigma}_t \mathbf{1}}$
denotes the confidence-weighted price relative average.
CWMR-Stdev involves the decomposition of the covariance matrix and yields
similar portfolio update formulas.
Usage We implemented two CWMR algorithms (i.e., CWMR-Var and CWMR-Stdev).
Their usages are listed below.

cwmr_var(fid, data, {φ, ε, λ}, opts);
cwmr_stdev(fid, data, {φ, ε, λ}, opts);

• fid: file handle for writing log file;
• data: market sequence matrix;
• φ: confidence parameter;
• ε: mean reversion threshold;
• λ: transaction cost rates; and
• opts: options for behavioral control.

Example Call the two CWMR algorithms on the “NYSE (O)” dataset with a
confidence parameter of 2, a mean reversion parameter of 0.5, and a transaction cost
rate of 0.

>> manager('cwmr_var', 'nyse-o', {2, 0.5, 0}, opts);
>> manager('cwmr_stdev', 'nyse-o', {2, 0.5, 0}, opts);

A.3.3.4 Online Moving Average Reversion


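Description “Online moving average reversion” (OLMAR) (Li and Hoi 2012)
explicitly predicts the next price relatives following the mean reversion idea. MAR-1
borrows the simple moving average:

$$\tilde{\mathbf{x}}_{t+1}(w) = \frac{1}{w} \left( \mathbf{1} + \frac{\mathbf{1}}{\mathbf{x}_t} + \cdots + \frac{\mathbf{1}}{\bigodot_{i=0}^{w-2} \mathbf{x}_{t-i}} \right),$$

where $w$ is the window size and $\bigodot$ denotes the element-wise product; and MAR-2
borrows the exponential moving average:

$$\tilde{\mathbf{x}}_{t+1}(\alpha) = \alpha \mathbf{1} + (1 - \alpha) \frac{\tilde{\mathbf{x}}_t}{\mathbf{x}_t},$$

where $\alpha \in (0, 1)$ denotes the decaying factor and the operations are all element-wise.
Then, OLMAR’s formulation is

$$\mathbf{b}_{t+1} = \arg\min_{\mathbf{b} \in \Delta_m} \frac{1}{2} \|\mathbf{b} - \mathbf{b}_t\|^2 \quad \text{s.t.} \quad \mathbf{b} \cdot \tilde{\mathbf{x}}_{t+1} \geq \epsilon.$$

Solving the optimization, we can obtain its portfolio update:

$$\mathbf{b}_{t+1} = \mathbf{b}_t + \lambda_{t+1} (\tilde{\mathbf{x}}_{t+1} - \bar{x}_{t+1} \mathbf{1}),$$

where $\bar{x}_{t+1} = \frac{1}{m} (\mathbf{1} \cdot \tilde{\mathbf{x}}_{t+1})$ denotes the
average predicted price relative and $\lambda_{t+1}$ is the Lagrangian multiplier
calculated as

$$\lambda_{t+1} = \max\left\{ 0, \frac{\epsilon - \mathbf{b}_t \cdot \tilde{\mathbf{x}}_{t+1}}{\|\tilde{\mathbf{x}}_{t+1} - \bar{x}_{t+1} \mathbf{1}\|^2} \right\}.$$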
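
The MAR-1 prediction and the OLMAR update can be sketched together in Python. This is an illustrative sketch rather than the toolbox's MATLAB code: the names `olmar1_predict` and `olmar_update` are hypothetical, and the projection onto the simplex that normally follows the update is left out.

```python
def olmar1_predict(price_relatives, w):
    """MAR-1: (1/w) * (1 + 1/x_t + ... + 1/(x_t (.) ... (.) x_{t-w+2})).

    price_relatives is ordered oldest first; its last row is x_t.
    """
    m = len(price_relatives[-1])
    recent = price_relatives[-(w - 1):] if w > 1 else []
    pred = [1.0] * m
    cum = [1.0] * m
    for x in reversed(recent):              # x_t, x_{t-1}, ...
        cum = [c * xi for c, xi in zip(cum, x)]
        pred = [p + 1.0 / c for p, c in zip(pred, cum)]
    return [p / w for p in pred]


def olmar_update(b, x_pred, eps=10.0):
    """b_{t+1} = b_t + lambda_{t+1} * (x_pred - mean(x_pred) * 1)."""
    x_bar = sum(x_pred) / len(x_pred)
    dev = [xi - x_bar for xi in x_pred]
    denom = sum(d * d for d in dev)
    gap = eps - sum(bi * xi for bi, xi in zip(b, x_pred))
    lam = max(0.0, gap / denom) if denom > 0 else 0.0
    # No simplex projection here, so entries may leave [0, 1].
    return [bi + lam * d for bi, d in zip(b, dev)]
```

For example, if an asset's price has just doubled and $w = 2$, its predicted price relative is $(1 + 1/2)/2 = 0.75$, which expresses the expected reversion toward the moving average.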

Usage We implemented two OLMAR algorithms (i.e., OLMAR-I and OLMAR-II).
Their usages are listed below.

olmar1(fid, data, {ε, W, λ}, opts);
olmar2(fid, data, {ε, α, λ}, opts);

• fid: file handle for writing log file;
• data: market sequence matrix;
• ε: mean reversion threshold;
• W: window size for simple moving average;
• α ∈ [0, 1]: decaying factor to calculate exponential moving average;
• λ ∈ [0, 1): transaction cost rates; and
• opts: options for behavioral control.

Example Call the two OLMAR algorithms on the “NYSE (O)” dataset with a mean
reversion threshold of 10, a window size of 5, a decaying factor of 0.5, and a trans-
action cost rate of 0.

>> manager('olmar1', 'nyse-o', {10, 5, 0}, opts);
>> manager('olmar2', 'nyse-o', {10, 0.5, 0}, opts);

A.3.4 Pattern Matching–Based Approaches
The pattern matching–based approaches are based on the assumption that market
sequences with similar preceding market appearances tend to reappear. Thus, the
common behavior of these approaches is to first identify historical market sequences
that are similar to the coming sequence, and then obtain a portfolio that maximizes
the expected return based on these similar sequences. Algorithm A.3 illustrates the first
step, or the sample selection procedure. The second step, or the portfolio optimization
procedure, often solves the following optimization:

$$\mathbf{b}_{t+1} = \arg\max_{\mathbf{b} \in \Delta_m} \prod_{i \in C(\mathbf{x}_1^t)} \mathbf{b} \cdot \mathbf{x}_i. \qquad \text{(A.2)}$$

A.3.4.1 Nonparametric Kernel-Based Log-Optimal Strategy


Description “Nonparametric kernel-based sample selection” ($B^K$) (Györfi et al.
2006) identifies the similarity set by comparing two market windows via Euclidean
distance:

$$C_K(\mathbf{x}_1^t, w) = \left\{ w < i < t + 1 : \left\| \mathbf{x}_{t-w+1}^{t} - \mathbf{x}_{i-w}^{i-1} \right\| \leq \frac{c}{\ell} \right\},$$

where $c$ and $\ell$ are the thresholds used to control the number of similar samples. Then,
it obtains an optimal portfolio via solving Equation A.2.

Algorithm A.3: Sample selection framework ($C(\mathbf{x}_1^t, w)$).

Input: $\mathbf{x}_1^t$: Historical market sequence; $w$: window size;
Output: $C$: Index set of similar price relatives.
Initialize $C = \emptyset$;
if $t \leq w + 1$ then
    return;
end
for $i = w + 1, w + 2, \ldots, t$ do
    if $\mathbf{x}_{i-w}^{i-1}$ is similar to $\mathbf{x}_{t-w+1}^{t}$ then
        $C = C \cup \{i\}$;
    end
end
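
Algorithm A.3 with the Euclidean-distance test can be sketched as follows. This Python code is an illustration only, not the toolbox's MATLAB code: the name `select_similar` is hypothetical, and the single argument `threshold` stands in for the ratio $c/\ell$.

```python
import math

def select_similar(X, w, threshold):
    """Return the 1-based indices i (w < i <= t) whose preceding window
    x_{i-w}^{i-1} lies within `threshold` of the latest window x_{t-w+1}^t.

    X is a list of price relative vectors, with X[0] holding x_1.
    """
    t = len(X)
    selected = []
    if t <= w + 1:
        return selected
    latest = [v for x in X[t - w:] for v in x]                # x_{t-w+1}^t, flattened
    for i in range(w + 1, t + 1):                             # i = w+1, ..., t
        window = [v for x in X[i - w - 1:i - 1] for v in x]   # x_{i-w}^{i-1}
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, latest)))
        if dist <= threshold:
            selected.append(i)
    return selected
```

Each returned index $i$ points at the price relative $\mathbf{x}_i$ (stored in `X[i - 1]`) that followed a similar window; these are the samples that enter Equation A.2.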
Usage
bk_run(fid, data, {K, L, c, λ}, opts);

• fid: file handle for writing log file;
• data: market sequence matrix;
• K: maximal window size;
• L: used to split the parameter space of each k;
• c: similarity threshold;
• λ ∈ [0, 1): transaction cost rates; and
• opts: options for behavioral control.

Example Call the BK algorithm on the “NYSE (O)” dataset with default parameters
and a transaction cost rate of 0.

>> manager('bk', 'nyse-o', {5, 10, 1, 0}, opts);

A.3.4.2 Nonparametric Nearest-Neighbor Log-Optimal Strategy


Description “Nonparametric nearest-neighbor-based sample selection” ($B^{NN}$)
(Györfi et al. 2008) searches for the price relatives whose preceding market windows
are among the $\ell$ nearest neighbors of the latest market window in terms of Euclidean
distance:

$$C_N(\mathbf{x}_1^t, w) = \left\{ w < i < t + 1 : \mathbf{x}_{i-w}^{i-1} \text{ is among the } \ell \text{ NNs of } \mathbf{x}_{t-w+1}^{t} \right\},$$

where $\ell$ is a threshold parameter. Then, the strategy obtains an optimal portfolio via
solving Equation A.2.
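
The nearest-neighbor selection differs from the kernel-based one only in how candidates are accepted. A minimal Python sketch follows; it is not the toolbox's MATLAB code, the name `select_nearest` is hypothetical, and ties are broken by index as an arbitrary choice.

```python
def select_nearest(X, w, ell):
    """Return the 1-based indices of the ell windows x_{i-w}^{i-1} that are
    closest (Euclidean distance) to the latest window x_{t-w+1}^t.

    X is a list of price relative vectors, with X[0] holding x_1.
    """
    t = len(X)
    if t <= w + 1:
        return []
    latest = [v for x in X[t - w:] for v in x]
    scored = []
    for i in range(w + 1, t + 1):
        window = [v for x in X[i - w - 1:i - 1] for v in x]
        # Squared distance is enough for ranking.
        d2 = sum((a - b) ** 2 for a, b in zip(window, latest))
        scored.append((d2, i))
    scored.sort()                      # ties are broken by the smaller index
    return sorted(i for _, i in scored[:ell])
```

Unlike the distance threshold, this rule always returns exactly $\ell$ indices whenever at least $\ell$ candidate windows exist.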
Usage
bnn(fid, data, {K, L, λ}, opts)

• fid: file handle for writing log file;


• data: market sequence matrix;
• K: maximal window size;
• L: parameter to split the parameter space of each k;
• λ ∈ [0, 1): transaction cost rates; and
• opts: options for behavioral control.

Example Call the BNN algorithm on the “NYSE (O)” dataset with default
parameters and a transaction cost rate of 0.

>> manager('bnn', 'nyse-o', {5, 10, 0}, opts);

A.3.4.3 Correlation-Driven Nonparametric Learning Strategy


Description “Correlation-driven nonparametric sample selection” (CORN) (Li et al.
2011a) identifies the similarity between two market windows via a correlation
coefficient:

$$C_C(\mathbf{x}_1^t, w) = \left\{ w < i < t + 1 : \frac{\mathrm{cov}(\mathbf{x}_{i-w}^{i-1}, \mathbf{x}_{t-w+1}^{t})}{\mathrm{std}(\mathbf{x}_{i-w}^{i-1}) \, \mathrm{std}(\mathbf{x}_{t-w+1}^{t})} \geq \rho \right\},$$

where $\rho$ is a predefined threshold. Then, it obtains an optimal portfolio via solving
Equation A.2.
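
The correlation-based selection can be sketched in the same style. This Python code is an illustration, not the toolbox's MATLAB code: the name `corn_select` is hypothetical, each window is flattened into one vector before correlating, and a window with zero standard deviation is given correlation 0, which is an assumption of this sketch.

```python
import math

def corn_select(X, w, rho):
    """Return the 1-based indices i (w < i <= t) whose preceding window
    x_{i-w}^{i-1} has correlation >= rho with the latest window x_{t-w+1}^t."""
    def flatten(rows):
        return [v for x in rows for v in x]

    def corr(a, b):
        n = len(a)
        mean_a, mean_b = sum(a) / n, sum(b) / n
        cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
        std_a = math.sqrt(sum((p - mean_a) ** 2 for p in a))
        std_b = math.sqrt(sum((q - mean_b) ** 2 for q in b))
        if std_a == 0 or std_b == 0:
            return 0.0                 # assumed convention for flat windows
        return cov / (std_a * std_b)

    t = len(X)
    if t <= w + 1:
        return []
    latest = flatten(X[t - w:])
    return [i for i in range(w + 1, t + 1)
            if corr(flatten(X[i - w - 1:i - 1]), latest) >= rho]
```

Because the correlation coefficient is scale-free, two windows can be selected as similar even when their Euclidean distance is large.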

Usage
corn(fid, data, {w, c, λ}, opts);
cornu(fid, data, {K, L, c, λ}, opts);
cornk_run(fid, data, {K, L, pc, λ}, opts)

• fid: file handle for writing log file;


• data: market sequence matrix;
• w: window size;
• K: maximal window size;
• L: used to split the parameter space of each k;
• c: correlation threshold;
• pc: percentage of experts to be selected;
• λ ∈ [0, 1): transaction cost rates; and
• opts: options for behavioral control.

Example Below we call three CORN algorithms with their default parameters.

>> manager('corn', 'nyse-o', {5, 0.1, 0}, opts);
>> manager('cornu', 'nyse-o', {5, 1, 0.1, 0}, opts);
>> manager('cornk', 'nyse-o', {5, 10, 0.1, 0}, opts);

A.4 Summary
In this manual, we describe the OLPS toolbox in detail. OLPS is the first toolbox for
the research of OLPS problems. It is easy to use and can be extended to include new
algorithms and datasets. We hope this toolbox can facilitate further research on this
topic.



Appendix B

Proofs and Derivations

B.1 Proof of CORN


B.1.1 Proof of Theorem 1∗
In this appendix, we give a detailed proof that the portfolio scheme correlation-driven
nonparametric learning (CORN) (Li et al. 2011a) is universal with respect to the
class of all ergodic processes. We first give a concise definition about “universal”
considered in this note.

Definition 1 An investment strategy $B$ is called universal with respect to a class of
stationary and ergodic processes $\{X_n\}_{-\infty}^{+\infty}$, if, for each process in the class,

$$\lim_{n \to \infty} \frac{1}{n} \log S_n(B) = W^* \quad \text{almost surely.}$$

Before we give the theorem and its proof, we introduce some necessary lemmas.
Lemma B.1 (Breiman 1957 [Correction version 1960]) Let $Z = \{Z_i\}_{-\infty}^{\infty}$ be a
stationary and ergodic process. For each positive integer $i$, let $T^i$ denote the operator
that shifts any sequence $\{\ldots, z_{-1}, z_0, z_1, \ldots\}$ by $i$ digits to the left. Let
$f_1, f_2, \ldots$ be a sequence of real-valued functions such that
$\lim_{n \to \infty} f_n(Z) = f(Z)$ almost surely for some function $f$. Assume that
$E \sup_n |f_n(Z)| < \infty$. Then,

$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} f_i(T^i Z) = E f(Z) \quad \text{almost surely.}$$

Lemma B.2 (Algoet and Cover 1988) Let $Q_{n \in \mathbb{N} \cup \{\infty\}}$ be a family of
regular probability distributions over the set $\mathbb{R}_+^d$ of all market vectors such
that $E\{|\log U_n^{(j)}|\} < \infty$ for any coordinate of a random market vector
$U_n = (U_n^{(1)}, \ldots, U_n^{(d)})$ distributed according to $Q_n$. In addition, let
$B^*(Q_n)$ be the set of all log-optimal portfolios with respect to $Q_n$, that is, the set
of all portfolios $b$ that attain $\max_{b \in \Delta_d} E\{\log \langle b, U_n \rangle\}$.
Consider an arbitrary sequence $b_n \in B^*(Q_n)$. If

$$Q_n \to Q_\infty \quad \text{weakly as } n \to \infty,$$

then, for $Q_\infty$-almost all $u$,

$$\lim_{n \to \infty} \langle b_n, u \rangle = \langle b^*, u \rangle,$$

where the right-hand side is constant as $b^*$ ranges over $B^*(Q_\infty)$.

∗ The proof idea is mainly provided by Vladimir Vovk, and the proof is then finished by
Dingjiang Huang and Bin Li.


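Lemma B.3 (Algoet and Cover 1988) Let $X$ be a random market vector defined on
a probability space $(\Omega, \mathcal{F}, P)$ satisfying $E\{|\log X^{(j)}|\} < \infty$. If
$\mathcal{F}_k$ is an increasing sequence of sub-σ-fields of $\mathcal{F}$ with

$$\mathcal{F}_k \nearrow \mathcal{F}_\infty \subseteq \mathcal{F},$$

then

$$E\left\{ \max_{b} E[\log \langle b, X \rangle \mid \mathcal{F}_k] \right\} \nearrow E\left\{ \max_{b} E[\log \langle b, X \rangle \mid \mathcal{F}_\infty] \right\}$$

as $k \to \infty$, where the maximum on the left-hand side is taken over all
$\mathcal{F}_k$-measurable functions $b$, and the maximum on the right-hand side is taken
over all $\mathcal{F}_\infty$-measurable functions $b$.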

Lemma B.4 Let $\mu$ be the Lebesgue measure on the Euclidean space $\mathbb{R}^n$ and $A$
be a Lebesgue measurable subset of $\mathbb{R}^n$. Define the approximate density of $A$ in
an ε-neighborhood of a point $x$ in $\mathbb{R}^n$ as

$$d_\varepsilon(x) = \frac{\mu(A \cap B_\varepsilon(x))}{\mu(B_\varepsilon(x))},$$

where $B_\varepsilon(x)$ denotes the closed ball of radius $\varepsilon$ centered at $x$. Then
for almost every point $x$ of $A$ the density

$$d(x) = \lim_{\varepsilon \to 0} d_\varepsilon(x)$$

exists and is equal to 1.

Lemma B.5 The inequality

$$\frac{\mathrm{cov}(X, X')}{\sqrt{\mathrm{Var}(X)} \sqrt{\mathrm{Var}(X')}} \geq \rho,$$

which describes the similarity of $X$ and $X'$ in the CORN strategy, is approximately
equivalent to

$$2 \mathrm{Var}(X)(1 - \rho) \geq E\{(X - X')^2\}.$$

Proof In general, from the covariance $\mathrm{cov}(X, X')$ it is impossible to derive a
topology, since $\mathrm{cov}(X, X') = 1$ does not imply that $E\{(X - X')^2\} = 0$. However,
because $X$ and $X'$ are relative prices, we have $E\{(X - X')^2\} \approx 0$. For the
Euclidean distance, we have that

$$E\{(X - X')^2\} = \mathrm{Var}(X - X') + (E\{X - X'\})^2 = \mathrm{Var}(X) - 2 \mathrm{cov}(X, X') + \mathrm{Var}(X') + (E\{X - X'\})^2.$$

Thus, the similarity means that

$$\frac{\mathrm{Var}(X) + \mathrm{Var}(X') + (E\{X - X'\})^2 - E\{(X - X')^2\}}{\sqrt{\mathrm{Var}(X)} \sqrt{\mathrm{Var}(X')}} \geq 2\rho$$

or, equivalently,

$$\mathrm{Var}(X) + \mathrm{Var}(X') + (E\{X - X'\})^2 - 2\rho \sqrt{\mathrm{Var}(X)} \sqrt{\mathrm{Var}(X')} \geq E\{(X - X')^2\}.$$

Since both $\mathrm{Var}(X)$ and $|E\{X - X'\}|$ have the same order of magnitude,∗ they are in
the range of $10^{-4}$ to $10^{-3}$. The squared term $(E\{X - X'\})^2$ is therefore negligible
relative to $\mathrm{Var}(X)$, and, taking $\mathrm{Var}(X') \approx \mathrm{Var}(X)$, the
previous inequality approximately means that

$$2 \mathrm{Var}(X)(1 - \rho) \geq E\{(X - X')^2\}.$$
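
The exact decomposition of $E\{(X - X')^2\}$ used in the proof can be checked numerically. The Python sketch below uses empirical (population-style) moments on paired samples; the function name `second_moment_identity` is hypothetical. It checks only the exact identity, not the final approximation.

```python
def second_moment_identity(xs, ys):
    """Return (lhs, rhs) of
    E{(X - X')^2} = Var(X) - 2 cov(X, X') + Var(X') + (E{X - X'})^2,
    with all moments computed empirically by dividing by n."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    lhs = sum((x - y) ** 2 for x, y in zip(xs, ys)) / n
    rhs = var_x - 2.0 * cov + var_y + (mean_x - mean_y) ** 2
    return lhs, rhs
```

Both sides agree up to floating-point rounding for any paired sample, because the identity is purely algebraic.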

Lemma B.6 Assume that $x_1, x_2, \ldots$ are the realizations of the random vectors
$X_1, X_2, \ldots$ drawn from the vector-valued stationary and ergodic process
$\{X_n\}_{-\infty}^{\infty}$. The fundamental limits (determined in Algoet 1992, 1994; Algoet
and Cover 1988) reveal that the so-called log-optimum portfolio $B^* = \{b^*(\cdot)\}$ is the
best possible choice. More precisely, in trading period $n$, let $b^*(\cdot)$ be such that

$$E\{\log \langle b^*(X_1^{n-1}), X_n \rangle \mid X_1^{n-1}\} = \max_{b(\cdot)} E\{\log \langle b(X_1^{n-1}), X_n \rangle \mid X_1^{n-1}\}.$$

If $S_n^* = S_n(B^*)$ denotes the capital achieved by a log-optimum portfolio strategy $B^*$
after $n$ trading periods, then for any other investment strategy $B$ with capital
$S_n = S_n(B)$ and for any stationary and ergodic process $\{X_n\}_{-\infty}^{\infty}$,

$$\limsup_{n \to \infty} \frac{1}{n} \log \frac{S_n}{S_n^*} \leq 0 \quad \text{almost surely}$$

and

$$\lim_{n \to \infty} \frac{1}{n} \log S_n^* = W^* \quad \text{almost surely,}$$

where

$$W^* = E\left\{ \max_{b(\cdot)} E\{\log \langle b(X_{-\infty}^{-1}), X_0 \rangle \mid X_{-\infty}^{-1}\} \right\}$$

is the maximal possible growth rate of any investment strategy.


∗ See Appendix C for more details.

Now, we give the universal theorem and its proof.

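Theorem B.1 The portfolio scheme CORN is universal with respect to the class of
all ergodic processes such that $E\{|\log X^{(j)}|\} < \infty$, for $j = 1, 2, \ldots, d$.

Proof To prove that the strategy CORN is universal with respect to the class of all
ergodic processes, we need to prove that, for each process in the class,

$$\lim_{n \to \infty} \frac{1}{n} \log S_n(B) = W^* \quad \text{almost surely,}$$

where $B$ denotes the strategy CORN and

$$W^* = \lim_{n \to \infty} \frac{1}{n} \log S_n^* = E\left\{ \max_{b(\cdot)} E\{\log \langle b(X_{-\infty}^{-1}), X_0 \rangle \mid X_{-\infty}^{-1}\} \right\}.$$

We divide the proof into three parts.

(i) According to Lemma B.6, we know that
$\limsup_{n \to \infty} \left( \frac{1}{n} \log S_n - \frac{1}{n} \log S_n^* \right) \leq 0$, and hence
$\limsup_{n \to \infty} \frac{1}{n} \log S_n \leq \lim_{n \to \infty} \frac{1}{n} \log S_n^* = W^*$.
So it suffices to prove that

$$\liminf_{n \to \infty} W_n(B) = \liminf_{n \to \infty} \frac{1}{n} \log S_n(B) \geq W^* \quad \text{almost surely.}$$

Without loss of generality, we may assume $S_0 = 1$, so that

$$\begin{aligned}
W_n(B) &= \frac{1}{n} \log S_n(B) \\
&= \frac{1}{n} \log \left( \sum_{\omega,\rho} q_{\omega,\rho} S_n(\epsilon^{(\omega,\rho)}) \right) \\
&\geq \frac{1}{n} \log \left( \sup_{\omega,\rho} q_{\omega,\rho} S_n(\epsilon^{(\omega,\rho)}) \right) \\
&= \frac{1}{n} \sup_{\omega,\rho} \left( \log q_{\omega,\rho} + \log S_n(\epsilon^{(\omega,\rho)}) \right) \\
&= \sup_{\omega,\rho} \left( W_n(\epsilon^{(\omega,\rho)}) + \frac{\log q_{\omega,\rho}}{n} \right).
\end{aligned}$$

Thus,

$$\begin{aligned}
\liminf_{n \to \infty} W_n(B) &\geq \liminf_{n \to \infty} \sup_{\omega,\rho} \left( W_n(\epsilon^{(\omega,\rho)}) + \frac{\log q_{\omega,\rho}}{n} \right) \\
&\geq \sup_{\omega,\rho} \liminf_{n \to \infty} \left( W_n(\epsilon^{(\omega,\rho)}) + \frac{\log q_{\omega,\rho}}{n} \right) \\
&= \sup_{\omega,\rho} \liminf_{n \to \infty} W_n(\epsilon^{(\omega,\rho)}). \qquad \text{(B.1)}
\end{aligned}$$

The simple argument above shows that the asymptotic rate of growth of the
strategy $B$ is at least as large as the supremum of the rates of growth of all elementary
strategies $\epsilon^{(\omega,\rho)}$. Thus, to estimate $\liminf_{n \to \infty} W_n(B)$, it
suffices to investigate the performance of expert $\epsilon^{(\omega,\rho)}$ on the stationary
and ergodic market sequence $X_0, X_{-1}, X_{-2}, \ldots$.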
(ii) First, let the integers $\omega$, $\rho$, and the vector
$s = s_{-\omega}^{-1} \in \mathbb{R}_+^{d\omega}$ be fixed. To shorten the notation, write
$D(Y, s)$ for the condition $E\{(Y - s)^2\} \leq 2 \mathrm{Var}(s)(1 - \rho)$. From Lemma B.5,
we can get that the set

$$\left\{ X_i : 1 - j + \omega \leq i \leq 0, \ \frac{\mathrm{cov}(X_{i-\omega}^{i-1}, s)}{\sqrt{\mathrm{Var}(X_{i-\omega}^{i-1})} \sqrt{\mathrm{Var}(s)}} \geq \rho \right\}$$

can be expressed as

$$\left\{ X_i : 1 - j + \omega \leq i \leq 0, \ E\{(X_{i-\omega}^{i-1} - s)^2\} \leq 2 \mathrm{Var}(s)(1 - \rho) \right\} = \left\{ X_i : 1 - j + \omega \leq i \leq 0, \ D(X_{i-\omega}^{i-1}, s) \right\}.$$

Let $P_{j,s}^{(\omega,\rho)}$ denote the (random) measure concentrated on this set, defined by

$$P_{j,s}^{(\omega,\rho)}(A) = \frac{\sum_{i : 1 - j + \omega \leq i \leq 0, \ D(X_{i-\omega}^{i-1}, s)} \mathbb{I}_A(X_i)}{\left| \{ i : 1 - j + \omega \leq i \leq 0, \ D(X_{i-\omega}^{i-1}, s) \} \right|}, \quad A \subset \mathbb{R}_+^d,$$

where $\mathbb{I}_A$ denotes the indicator function of the set $A$. If the above set of $X_i$s
is empty, then let $P_{j,s}^{(\omega,\rho)} = \delta_{(1,\ldots,1)}$ be the probability measure
concentrated on the vector $(1, \ldots, 1)$. In other words, $P_{j,s}^{(\omega,\rho)}(A)$ is the
relative frequency of the vectors among $X_{1-j+\omega}, \ldots, X_0$ that fall in the set $A$.

Observe that for all $s$, with probability 1,

$$P_{j,s}^{(\omega,\rho)} \to P_s^{*(\omega,\rho)} = \begin{cases} P_{X_0 \mid D(X_{-\omega}^{-1}, s)} & \text{if } P(D(X_{-\omega}^{-1}, s)) > 0 \\ \delta_{(1,\ldots,1)} & \text{if } P(D(X_{-\omega}^{-1}, s)) = 0 \end{cases} \qquad \text{(B.2)}$$

weakly as $j \to \infty$, where $P_s^{*(\omega,\rho)}$ denotes the limit distribution of
$P_{j,s}^{(\omega,\rho)}$, and $P_{X_0 \mid D(X_{-\omega}^{-1}, s)}$ denotes the distribution of the
vector $X_0$ conditioned on the event $E\{(X_{-\omega}^{-1} - s)^2\} \leq 2 \mathrm{Var}(s)(1 - \rho)$.
To see this, let $f$ be a bounded continuous function defined on $\mathbb{R}_+^d$. Then, the
ergodic theorem implies that if $P(D(X_{-\omega}^{-1}, s)) > 0$, then

$$\begin{aligned}
\int f(x) P_{j,s}^{(\omega,\rho)}(dx) &= \frac{\frac{1}{|1 - j + \omega|} \sum_{i : 1 - j + \omega \leq i \leq 0, \ D(X_{i-\omega}^{i-1}, s)} f(X_i)}{\frac{1}{|1 - j + \omega|} \left| \{ i : 1 - j + \omega \leq i \leq 0, \ D(X_{i-\omega}^{i-1}, s) \} \right|} \\
&\to \frac{E\{ f(X_0) \mathbb{I}_{\{D(X_{-\omega}^{-1}, s)\}} \}}{P\{ D(X_{-\omega}^{-1}, s) \}} \\
&= E\{ f(X_0) \mid D(X_{-\omega}^{-1}, s) \} \\
&= \int f(x) P_{X_0 \mid D(X_{-\omega}^{-1}, s)}(dx) \quad \text{almost surely, as } j \to \infty.
\end{aligned}$$

On the other hand, if $P(D(X_{-\omega}^{-1}, s)) = 0$, then with probability 1,
$P_{j,s}^{(\omega,\rho)}$ is concentrated on $(1, \ldots, 1)$ for all $j$, and
$\int f(x) P_{j,s}^{(\omega,\rho)}(dx) = f(1, \ldots, 1)$.

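Recall that by definition, $b^{(\omega,\rho)}(X_{1-j}^{-1}, s)$ is a log-optimal portfolio with respect
to the probability measure $P_{j,s}^{(\omega,\rho)}$. Let $b^*_{\omega,\rho}(s)$ denote a log-optimal portfolio with
respect to the limit distribution $P_s^{*(\omega,\rho)}$. Then, using Lemma B.2, we infer from
Equation B.2 that, as $j$ tends to infinity, we have the almost sure convergence

$$\lim_{j \to \infty} \big\langle b^{(\omega,\rho)}(X_{1-j}^{-1}, s),\, x_0 \big\rangle = \big\langle b^*_{\omega,\rho}(s),\, x_0 \big\rangle,$$

for $P_s^{*(\omega,\rho)}$-almost all $x_0$ and hence for $P_{X_0}$-almost all $x_0$. Since $s$ was arbitrary,
we obtain

$$\lim_{j \to \infty} \big\langle b^{(\omega,\rho)}(X_{1-j}^{-1}, X_{-\omega}^{-1}),\, x_0 \big\rangle = \big\langle b^*_{\omega,\rho}(X_{-\omega}^{-1}),\, x_0 \big\rangle \quad \text{almost surely.} \tag{B.3}$$

Next, we apply Lemma B.1 for the function

$$f_i(x_{-\infty}^{\infty}) = \log \big\langle h^{(\omega,\rho)}(x_{1-i}^{-1}),\, x_0 \big\rangle = \log \big\langle b^{(\omega,\rho)}(x_{1-i}^{-1}, x_{-\omega}^{-1}),\, x_0 \big\rangle$$

defined on $x_{-\infty}^{\infty} = (\ldots, x_{-1}, x_0, x_1, \ldots)$. Note that

$$|f_i(X_{-\infty}^{\infty})| = \big|\log \big\langle h^{(\omega,\rho)}(X_{1-i}^{-1}),\, X_0 \big\rangle\big| \le \sum_{j=1}^{d} \big|\log X_0^{(j)}\big|,$$

which has finite expectation, and

$$f_i(X_{-\infty}^{\infty}) \to \log \big\langle b^*_{\omega,\rho}(X_{-\omega}^{-1}),\, X_0 \big\rangle \quad \text{almost surely as } i \to \infty$$

by Equation B.3. As $n \to \infty$, Lemma B.1 yields

$$\begin{aligned} W_n\big((\omega,\rho)\big) &= \frac{1}{n} \sum_{i=1}^{n} \log \big\langle h^{(\omega,\rho)}(X_1^{i-1}),\, X_i \big\rangle \\ &= \frac{1}{n} \sum_{i=1}^{n} f_i(T^i X_{-\infty}^{\infty}) \\ &\to E\big\{\log \big\langle b^*_{\omega,\rho}(X_{-\omega}^{-1}),\, X_0 \big\rangle\big\} \stackrel{\text{def}}{=} \theta_{\omega,\rho} \quad \text{almost surely.} \end{aligned}$$

Therefore, by Equation B.1, we have

$$\liminf_{n \to \infty} W_n(B) \ge \sup_{\omega,\rho} \theta_{\omega,\rho} \ge \sup_{\omega} \liminf_{\rho} \theta_{\omega,\rho} \quad \text{almost surely,}$$

and it suffices to show that the right-hand side is at least $W^*$.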

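(iii) To this end, first, define, for Borel sets $A, B \subset \mathbb{R}_+^d$,

$$m_A(z) = P\{X_0 \in A \mid X_{-\omega}^{-1} = z\}$$

and

$$\mu_\omega(B) = P\{X_{-\omega}^{-1} \in B\}.$$

Then, for any $s \in \mathrm{support}(\mu_\omega)$, and for all $A$,

$$\begin{aligned} P_s^{*(\omega,\rho)}(A) &= P\big\{X_0 \in A \mid E\{(X_{-\omega}^{-1}-s)^2\} \le 2\mathrm{Var}(s)(1-\rho)\big\} \\ &= \frac{P\big\{X_0 \in A,\ E\{(X_{-\omega}^{-1}-s)^2\} \le 2\mathrm{Var}(s)(1-\rho)\big\}}{P\big\{E\{(X_{-\omega}^{-1}-s)^2\} \le 2\mathrm{Var}(s)(1-\rho)\big\}} \\ &= \frac{1}{\mu_\omega(S_{s,2\mathrm{Var}(s)(1-\rho)})} \int_{S_{s,2\mathrm{Var}(s)(1-\rho)}} m_A(z)\,\mu_\omega(dz) \\ &\to m_A(s) = P\{X_0 \in A \mid X_{-\omega}^{-1} = s\} \end{aligned}$$

as $\rho \to 1$ and for $\mu_\omega$-almost all $s$ by the Lebesgue density theorem (see Lemma B.4), and
therefore

$$P^{*(\omega,\rho)}_{X_{-\omega}^{-1}}(A) \to P\{X_0 \in A \mid X_{-\omega}^{-1}\}$$

as $\rho \to 1$ for all $A$. Thus, using Lemma B.2 again, we have

$$\begin{aligned} \liminf_{\rho} \theta_{\omega,\rho} &= \lim_{\rho} \theta_{\omega,\rho} \\ &= \lim_{\rho} E\big\{\log \big\langle b^*_{\omega,\rho}(X_{-\omega}^{-1}),\, X_0 \big\rangle\big\} \\ &= E\big\{\log \big\langle b^*_{\omega}(X_{-\omega}^{-1}),\, X_0 \big\rangle\big\} \\ &= E\Big\{\max_{b(\cdot)} E\big\{\log \big\langle b(X_{-\omega}^{-1}),\, X_0 \big\rangle \mid X_{-\omega}^{-1}\big\}\Big\} \\ &= E\Big\{E\big\{\log \big\langle b^*_{\omega}(X_{-\omega}^{-1}),\, X_0 \big\rangle \mid X_{-\omega}^{-1}\big\}\Big\} \stackrel{\text{def}}{=} \theta^*_{\omega}, \end{aligned}$$

where $b^*_{\omega}(\cdot)$ is the log-optimum portfolio with respect to the conditional probability
$P\{X_0 \in A \mid X_{-\omega}^{-1}\}$.

Next, to finish the proof, we appeal to the submartingale convergence theorem.
First, note that the sequence

$$Y_\omega \stackrel{\text{def}}{=} E\big\{\log \big\langle b^*_{\omega}(X_{-\omega}^{-1}),\, X_0 \big\rangle \mid X_{-\omega}^{-1}\big\} = \max_{b(\cdot)} E\big\{\log \big\langle b(X_{-\omega}^{-1}),\, X_0 \big\rangle \mid X_{-\omega}^{-1}\big\}$$

of random variables forms a submartingale, that is, $E\{Y_{\omega+1} \mid X_{-\omega}^{-1}\} \ge Y_\omega$. To see this,
note that

$$\begin{aligned} E\{Y_{\omega+1} \mid X_{-\omega}^{-1}\} &= E\big\{E\{\log \langle b^*_{\omega+1}(X_{-\omega-1}^{-1}),\, X_0 \rangle \mid X_{-\omega-1}^{-1}\} \mid X_{-\omega}^{-1}\big\} \\ &\ge E\big\{E\{\log \langle b^*_{\omega}(X_{-\omega}^{-1}),\, X_0 \rangle \mid X_{-\omega-1}^{-1}\} \mid X_{-\omega}^{-1}\big\} \\ &= E\big\{\log \langle b^*_{\omega}(X_{-\omega}^{-1}),\, X_0 \rangle \mid X_{-\omega}^{-1}\big\} \\ &= Y_\omega. \end{aligned}$$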

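This sequence is bounded by

$$\max_{b(\cdot)} E\big\{\log \big\langle b(X_{-\infty}^{-1}),\, X_0 \big\rangle \mid X_{-\infty}^{-1}\big\},$$

which has a finite expectation. The submartingale convergence theorem (see Stout
1974) implies that this submartingale converges almost surely, and $\sup_\omega \theta^*_\omega$ is finite.
In particular, by the submartingale property, $\theta^*_\omega$ is a bounded increasing sequence,
so that

$$\sup_{\omega} \theta^*_\omega = \lim_{\omega \to \infty} \theta^*_\omega.$$

Applying Lemma B.3 with the σ-algebras

$$\sigma(X_{-\omega}^{-1}) \nearrow \sigma(X_{-\infty}^{-1})$$

yields

$$\begin{aligned} \sup_{\omega} \theta^*_\omega &= \lim_{\omega \to \infty} E\Big\{\max_{b(\cdot)} E\big\{\log \big\langle b(X_{-\omega}^{-1}),\, X_0 \big\rangle \mid X_{-\omega}^{-1}\big\}\Big\} \\ &= E\Big\{\max_{b(\cdot)} E\big\{\log \big\langle b(X_{-\infty}^{-1}),\, X_0 \big\rangle \mid X_{-\infty}^{-1}\big\}\Big\} \\ &= W^*. \end{aligned}$$

Then

$$\liminf_{n \to \infty} W_n(B) \ge \sup_{\omega,\rho} \theta_{\omega,\rho} \ge \sup_{\omega} \liminf_{\rho} \theta_{\omega,\rho} = \sup_{\omega} \theta^*_\omega = W^* \quad \text{almost surely,}$$

and from the above three parts of the proof, we get that

$$\lim_{n \to \infty} \frac{1}{n} \log S_n(B) = W^* \quad \text{almost surely,}$$

and the proof of Theorem B.1 is finished.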

B.2 Derivations of PAMR


B.2.1 Proof of Proposition 9.1

Proof First, if $\ell_\epsilon^t = 0$, then $b_t$ satisfies the constraint and is clearly the optimal
solution.
To solve the problem in the case of $\ell_\epsilon^t \neq 0$, we define the Lagrangian for the
optimization problem (9.2) as

$$L(b, \tau, \lambda) = \frac{1}{2}\|b - b_t\|^2 + \tau(x_t \cdot b - \epsilon) + \lambda(b \cdot \mathbf{1} - 1), \tag{B.4}$$

where $\tau \ge 0$ is a Lagrange multiplier related to the loss function, $\lambda$ is a Lagrange
multiplier associated with the simplex constraint, and $\mathbf{1}$ denotes a column vector of
$m$ 1s. Note that the nonnegativity of portfolio $b$ is not considered, since introducing
this term causes too much complexity; instead, we project the final portfolio
onto the simplex to enforce the constraint.
Setting the partial derivatives of $L$ with respect to $b$ to zero gives

$$0 = \frac{\partial L}{\partial b} = (b - b_t) + \tau x_t + \lambda \mathbf{1}.$$

Multiplying both sides by $\mathbf{1}^\top$, we get $\lambda = -\tau \frac{x_t \cdot \mathbf{1}}{m}$. Moreover, since $\bar{x}_t = \frac{x_t \cdot \mathbf{1}}{m}$,
where $\bar{x}_t$ is the mean of the $t$-th price relatives, or the market return, we can rewrite $\lambda$ as

$$\lambda = -\tau \bar{x}_t. \tag{B.5}$$

And the solution for $L$ becomes

$$b = b_t - \tau(x_t - \bar{x}_t \mathbf{1}). \tag{B.6}$$

Plugging Equation B.5 and Equation B.6 into Equation B.4, we get

$$\begin{aligned} L(\tau) &= \frac{1}{2}\tau^2 \|x_t - \bar{x}_t \mathbf{1}\|^2 - \tau^2\, x_t \cdot (x_t - \bar{x}_t \mathbf{1}) + \tau(b_t \cdot x_t - \epsilon) \\ &= -\frac{1}{2}\tau^2 \|x_t - \bar{x}_t \mathbf{1}\|^2 + \tau(b_t \cdot x_t - \epsilon). \end{aligned}$$

Note that in the above formula, we used the following identity:

$$\|x_t - \bar{x}_t \mathbf{1}\|^2 = x_t \cdot x_t - 2\bar{x}_t (x_t \cdot \mathbf{1}) + \bar{x}_t^2 (\mathbf{1} \cdot \mathbf{1}) = x_t \cdot x_t - \bar{x}_t (x_t \cdot \mathbf{1}) = x_t \cdot (x_t - \bar{x}_t \mathbf{1}).$$

Setting the derivative of $L(\tau)$ with respect to $\tau$ to 0, we get

$$0 = \frac{\partial L}{\partial \tau} = -\tau \|x_t - \bar{x}_t \mathbf{1}\|^2 + b_t \cdot x_t - \epsilon.$$

Then $\tau$ can be set as

$$\tau = \frac{b_t \cdot x_t - \epsilon}{\|x_t - \bar{x}_t \mathbf{1}\|^2}.$$

Since $\tau \ge 0$, we project $\tau$ to $[0, \infty)$; thus,

$$\tau = \max\left\{0,\ \frac{b_t \cdot x_t - \epsilon}{\|x_t - \bar{x}_t \mathbf{1}\|^2}\right\} = \frac{\ell_\epsilon^t}{\|x_t - \bar{x}_t \mathbf{1}\|^2}.$$

Note that in case of zero market volatility, that is, $\|x_t - \bar{x}_t \mathbf{1}\|^2 = 0$, we just set $\tau = 0$.
We can summarize the update scheme for the case of $\ell_\epsilon^t = 0$ and the case of $\ell_\epsilon^t > 0$ by
setting $\tau$. Thus, we simplify the notation following Equation 9.1 and show the unified
update scheme.
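The closed-form step above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the default threshold value, and the sort-based Euclidean projection onto the simplex are assumptions made here to obtain a runnable example; the derivation only states that the final portfolio is projected onto the simplex.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection onto the probability simplex (sort-based method;
    the choice of projection routine is an assumption of this sketch)."""
    v = np.asarray(v, dtype=float)
    u = np.sort(v)[::-1]                      # entries in decreasing order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index kept positive
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def pamr_update(b_t, x_t, eps=0.5):
    """One PAMR step: tau = loss / ||x_t - mean(x_t) 1||^2, then
    b = b_t - tau (x_t - mean(x_t) 1), followed by simplex projection."""
    b_t = np.asarray(b_t, dtype=float)
    x_t = np.asarray(x_t, dtype=float)
    loss = max(0.0, float(b_t @ x_t) - eps)   # epsilon-insensitive loss
    dev = x_t - x_t.mean()                    # x_t minus the market return
    denom = float(dev @ dev)
    tau = loss / denom if denom > 0 else 0.0  # zero volatility: tau = 0
    return project_to_simplex(b_t - tau * dev), tau
```

For example, with `b_t = [0.5, 0.5]`, `x_t = [1.2, 0.8]`, and `eps = 0.9`, the loss is 0.1 and the squared deviation is 0.08, so the step size is 1.25 and the updated portfolio earns exactly the threshold return on `x_t`.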

B.2.2 Proof of Proposition 9.2

Proof We derive the solution of PAMR-1 following the same procedure as the
derivation of PAMR. If the loss is nonzero, we get a Lagrangian

$$L(b, \xi, \tau, \mu, \lambda) = \frac{1}{2}\|b - b_t\|^2 + \tau(x_t \cdot b - \epsilon) + \xi(C - \tau - \mu) + \lambda(\mathbf{1} \cdot b - 1).$$

Setting the partial derivatives of $L$ with respect to $b$ to zero gives

$$0 = \frac{\partial L}{\partial b} = (b - b_t) + \tau x_t + \lambda \mathbf{1}.$$

Multiplying both sides by $\mathbf{1}^\top$, we get $\lambda = -\tau \frac{x_t \cdot \mathbf{1}}{m} = -\tau \bar{x}_t$. And the solution is

$$b = b_t - \tau(x_t - \bar{x}_t \mathbf{1}).$$

Next, note that the minimum of the term $\xi(C - \tau - \mu)$ with respect to $\xi$ is zero whenever
$C - \tau - \mu = 0$. If $C - \tau - \mu \neq 0$, then the minimum can be made to approach
$-\infty$. Since we need to maximize the dual, we can rule out the latter case and pose
the following constraint on the dual variables: $C - \tau - \mu = 0$. The KKT conditions
confine $\mu$ to be nonnegative, so we conclude that $\tau \le C$. We can project $\tau$ to the
interval $[0, C]$ and get

$$\tau = \max\left\{0,\ \min\left\{C,\ \frac{b_t \cdot x_t - \epsilon}{\|x_t - \bar{x}_t \mathbf{1}\|^2}\right\}\right\} = \min\left\{C,\ \frac{\ell_\epsilon^t}{\|x_t - \bar{x}_t \mathbf{1}\|^2}\right\}.$$

Again, we simplify the notation according to Equation 9.1 and show a unified update
scheme.
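As a small illustration of the capped step size, the following Python sketch computes $\tau$ for PAMR-1. The function name and the handling of the zero-volatility case (returning 0, as in the PAMR derivation) are assumptions of this sketch.

```python
import numpy as np

def pamr1_step_size(b_t, x_t, eps, C):
    """PAMR-1 step size: min(C, loss / ||x_t - mean(x_t) 1||^2)."""
    b_t = np.asarray(b_t, dtype=float)
    x_t = np.asarray(x_t, dtype=float)
    loss = max(0.0, float(b_t @ x_t) - eps)        # epsilon-insensitive loss
    denom = float(np.sum((x_t - x_t.mean()) ** 2))
    if denom == 0.0:
        return 0.0                                 # assumed: no update without volatility
    return min(C, loss / denom)
```

With `b_t = [0.5, 0.5]`, `x_t = [1.2, 0.8]`, and `eps = 0.9`, the uncapped step is 1.25, so `C = 1.0` clips it to 1.0 while `C = 5.0` leaves it unchanged.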

B.2.3 Proof of Proposition 9.3

Proof We derive the solution similarly to the derivations of PAMR and PAMR-1. In
case the loss is not 0, we get the Lagrangian

$$L(b, \xi, \tau, \lambda) = \frac{1}{2}\|b - b_t\|^2 + \tau(b \cdot x_t - \epsilon) + C\xi^2 - \tau\xi + \lambda(\mathbf{1} \cdot b - 1).$$

Setting the partial derivatives of $L$ with respect to $b$ to zero gives

$$0 = \frac{\partial L}{\partial b} = (b - b_t) + \tau x_t + \lambda \mathbf{1}.$$

Multiplying both sides by $\mathbf{1}^\top$, we get $\lambda = -\tau \frac{x_t \cdot \mathbf{1}}{m} = -\tau \bar{x}_t$. And the solution is

$$b = b_t - \tau(x_t - \bar{x}_t \mathbf{1}).$$

Setting the partial derivatives of $L$ with respect to $\xi$ to zero gives

$$0 = \frac{\partial L}{\partial \xi} = 2C\xi - \tau \implies \xi = \frac{\tau}{2C}.$$

Expressing $\xi$ as above and replacing $b$, we rewrite the Lagrangian as

$$\tilde{L}(\tau) = -\frac{\tau^2}{2}\left(\|x_t - \bar{x}_t \mathbf{1}\|^2 + \frac{1}{2C}\right) + \tau(b_t \cdot x_t - \epsilon).$$

Taking the derivative with respect to $\tau$ and setting it to zero, we get

$$0 = \frac{\partial \tilde{L}}{\partial \tau} = -\tau\left(\|x_t - \bar{x}_t \mathbf{1}\|^2 + \frac{1}{2C}\right) + (b_t \cdot x_t - \epsilon).$$

Then we get the update scheme of $\tau$ and project it to $[0, \infty)$:

$$\tau = \max\left\{0,\ \frac{b_t \cdot x_t - \epsilon}{\|x_t - \bar{x}_t \mathbf{1}\|^2 + \frac{1}{2C}}\right\} = \frac{\ell_\epsilon^t}{\|x_t - \bar{x}_t \mathbf{1}\|^2 + \frac{1}{2C}}.$$
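The PAMR-2 step size differs from PAMR only by the extra $\frac{1}{2C}$ term in the denominator, which the following Python sketch makes explicit. The function name is hypothetical and the example values are chosen only for illustration.

```python
import numpy as np

def pamr2_step_size(b_t, x_t, eps, C):
    """PAMR-2 step size: loss / (||x_t - mean(x_t) 1||^2 + 1 / (2C))."""
    b_t = np.asarray(b_t, dtype=float)
    x_t = np.asarray(x_t, dtype=float)
    loss = max(0.0, float(b_t @ x_t) - eps)        # epsilon-insensitive loss
    denom = float(np.sum((x_t - x_t.mean()) ** 2)) + 1.0 / (2.0 * C)
    return loss / denom                            # denom > 0 whenever C > 0
```

Because $\frac{1}{2C} > 0$ for any positive $C$, the denominator never vanishes, and as $C$ grows the step size approaches the uncapped PAMR value.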

B.3 Derivations of CWMR


B.3.1 Proof of Proposition 10.1

Proof Since considering the nonnegativity constraint introduces too much complexity,
we first relax the optimization problem by dropping it, and later we project the solution
onto the simplex domain to obtain the required portfolio.
The Lagrangian for the optimization problem (10.3) is

$$\begin{aligned} L = {} & \frac{1}{2}\left(\log\left(\frac{\det \Sigma_t}{\det \Sigma}\right) + \mathrm{Tr}(\Sigma_t^{-1}\Sigma) + (\mu_t - \mu)^\top \Sigma_t^{-1} (\mu_t - \mu)\right) \\ & + \lambda(\phi\, x_t^\top \Sigma x_t + \mu^\top x_t - \epsilon) + \eta(\mu^\top \mathbf{1} - 1). \end{aligned}$$

Taking the derivative of the Lagrangian with respect to $\mu$ and setting it to zero, we
get the update of $\mu$:

$$0 = \frac{\partial L}{\partial \mu} = \Sigma_t^{-1}(\mu - \mu_t) + \lambda x_t + \eta \mathbf{1} \implies \mu_{t+1} = \mu_t - \Sigma_t(\lambda x_t + \eta \mathbf{1}), \tag{B.7}$$

where $\Sigma_t$ is assumed to be nonsingular. Multiplying both sides by $\mathbf{1}^\top$, we get $\eta$:

$$1 = 1 - \mathbf{1}^\top \Sigma_t(\lambda x_t + \eta \mathbf{1}) \implies \eta = -\lambda \bar{x}_t, \tag{B.8}$$

where $\bar{x}_t = \frac{\mathbf{1}^\top \Sigma_t x_t}{\mathbf{1}^\top \Sigma_t \mathbf{1}}$ denotes the confidence-weighted average of the $t$-th price relatives.
Plugging Equation B.8 into Equation B.7, we get

$$\mu_{t+1} = \mu_t - \lambda \Sigma_t (x_t - \bar{x}_t \mathbf{1}). \tag{B.9}$$

Moreover, taking the derivative of the Lagrangian with respect to $\Sigma$ and setting it to
zero, we have the update of $\Sigma$:

$$0 = \frac{\partial L}{\partial \Sigma} = -\frac{1}{2}\Sigma^{-1} + \frac{1}{2}\Sigma_t^{-1} + \lambda\phi\, x_t x_t^\top \implies \Sigma_{t+1}^{-1} = \Sigma_t^{-1} + 2\lambda\phi\, x_t x_t^\top. \tag{B.10}$$

Now let us solve for the Lagrange multiplier $\lambda$ using the KKT conditions. First, following
Dredze et al. (2008), we can compute the inverse using the Woodbury identity (Golub
and Van Loan 1996):

$$\Sigma_{t+1} = (\Sigma_t^{-1} + 2\lambda\phi\, x_t x_t^\top)^{-1} = \Sigma_t - \Sigma_t x_t \frac{2\lambda\phi}{1 + 2\lambda\phi\, x_t^\top \Sigma_t x_t}\, x_t^\top \Sigma_t. \tag{B.11}$$

The KKT conditions imply that either $\lambda = 0$, and no update is needed; or the constraint
in the optimization problem (10.3) is an equality after the update. Substituting Equation B.9
and Equation B.11 into the equality version of the first constraint, we get

$$\epsilon - (\mu_t - \lambda \Sigma_t (x_t - \bar{x}_t \mathbf{1})) \cdot x_t = \phi\left(x_t^\top \left(\Sigma_t - \Sigma_t x_t \frac{2\lambda\phi}{1 + 2\lambda\phi\, x_t^\top \Sigma_t x_t}\, x_t^\top \Sigma_t\right) x_t\right).$$

Let $M_t = \mu_t^\top x_t$ be the return mean, $V_t = x_t^\top \Sigma_t x_t$ be the return variance of the $t$-th
trading period before updating, and $W_t = x_t^\top \Sigma_t \mathbf{1}$ be the return variance of the $t$-th
price relative with cash. We can simplify the preceding equation to

$$\lambda^2 (2\phi V_t^2 - 2\phi \bar{x}_t V_t W_t) + \lambda(2\phi\epsilon V_t - 2\phi V_t M_t + V_t - \bar{x}_t W_t) + (\epsilon - M_t - \phi V_t) = 0. \tag{B.12}$$

Let us define $a = 2\phi V_t^2 - 2\phi \bar{x}_t V_t W_t$, $b = 2\phi\epsilon V_t - 2\phi V_t M_t + V_t - \bar{x}_t W_t$, and
$c = \epsilon - M_t - \phi V_t$. Note that the above quadratic equation may have two, one, or
zero real roots. We can calculate its real roots (two real roots case: $\gamma_{t1}$ and $\gamma_{t2}$; one
real root case: $\gamma_{t3}$) as follows:

$$\gamma_{t1} = \frac{-b + \sqrt{b^2 - 4ac}}{2a}, \quad \gamma_{t2} = \frac{-b - \sqrt{b^2 - 4ac}}{2a}, \quad \text{or} \quad \gamma_{t3} = -\frac{c}{b}.$$

To ensure the nonnegativity of the Lagrangian multiplier, we can project its value to
$[0, +\infty)$:

$$\lambda = \max\{\gamma_{t1}, \gamma_{t2}, 0\}, \quad \text{or} \quad \lambda = \max\{\gamma_{t3}, 0\}, \quad \text{or} \quad \lambda = 0.$$

Note that the above equations correspond, respectively, to the three cases of real roots
(two, one, or zero).
In practical computation, as we only adopt the diagonal elements of a covariance
matrix, it is equivalent to compute $\lambda$ from Equation B.12 but update the covariance
matrix with the following rule instead of Equation B.10:

$$\Sigma_{t+1}^{-1} = \Sigma_t^{-1} + 2\lambda\phi\, \mathrm{diag}^2(x_t),$$

where $\mathrm{diag}(x_t)$ denotes a diagonal matrix with the elements of $x_t$ on its main diagonal.
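The multiplier and the two updates can be checked numerically with a short Python sketch. It is illustrative only: the function name is hypothetical, the full-covariance update of Equation B.11 is used rather than the diagonal variant, the root-selection rule follows the three cases described in the text, and the final simplex projection is omitted.

```python
import numpy as np

def cwmr_var_step(mu, Sigma, x, phi, eps):
    """Solve the quadratic (B.12) for lambda, then apply (B.9) and (B.11)."""
    mu, Sigma, x = (np.asarray(v, dtype=float) for v in (mu, Sigma, x))
    ones = np.ones_like(x)
    M = mu @ x                        # return mean
    V = x @ Sigma @ x                 # return variance before the update
    W = x @ Sigma @ ones
    x_bar = W / (ones @ Sigma @ ones) # confidence-weighted average (Sigma symmetric)
    a = 2 * phi * V**2 - 2 * phi * x_bar * V * W
    b = 2 * phi * eps * V - 2 * phi * V * M + V - x_bar * W
    c = eps - M - phi * V
    disc = b * b - 4 * a * c
    if a != 0 and disc >= 0:          # two real roots
        r = np.sqrt(disc)
        lam = max((-b + r) / (2 * a), (-b - r) / (2 * a), 0.0)
    elif a == 0 and b != 0:           # one real root
        lam = max(-c / b, 0.0)
    else:                             # no real root
        lam = 0.0
    mu_new = mu - lam * (Sigma @ (x - x_bar * ones))
    Sx = Sigma @ x
    Sigma_new = Sigma - (2 * lam * phi / (1 + 2 * lam * phi * V)) * np.outer(Sx, Sx)
    return lam, mu_new, Sigma_new
```

When the returned multiplier is positive, the updated pair satisfies the constraint with equality, and the inverse of `Sigma_new` agrees with Equation B.10, which gives a quick numerical check of the Woodbury step.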

B.3.2 Proof of Proposition 10.2

Proof Similar to the proof of Proposition 10.1, we relax the optimization problem
by dropping the nonnegativity constraint and project the solution onto the simplex domain
to obtain the required portfolio.
The Lagrangian for the optimization problem (10.4) is

$$\begin{aligned} L = {} & \frac{1}{2}\left(\log\left(\frac{\det \Upsilon_t^2}{\det \Upsilon^2}\right) + \mathrm{Tr}(\Upsilon_t^{-2}\Upsilon^2) + (\mu_t - \mu)^\top \Upsilon_t^{-2} (\mu_t - \mu)\right) \\ & + \lambda(\phi \|\Upsilon x_t\| + \mu^\top x_t - \epsilon) + \eta(\mu^\top \mathbf{1} - 1). \end{aligned}$$

Taking the derivative of the Lagrangian with respect to $\mu$ and setting it to zero, we
get the update of $\mu$,

$$0 = \frac{\partial L}{\partial \mu} = \Upsilon_t^{-2}(\mu - \mu_t) + \lambda x_t + \eta \mathbf{1} \implies \mu_{t+1} = \mu_t - \Upsilon_t^2(\lambda x_t + \eta \mathbf{1}),$$

where $\Upsilon_t$ is nonsingular. Multiplying both sides by $\mathbf{1}^\top$, we get

$$1 = 1 - \mathbf{1}^\top \Upsilon_t^2(\lambda x_t + \eta \mathbf{1}) \implies \eta = -\lambda \bar{x}_t,$$

where $\bar{x}_t = \frac{\mathbf{1}^\top \Upsilon_t^2 x_t}{\mathbf{1}^\top \Upsilon_t^2 \mathbf{1}}$ is the confidence-weighted average of the $t$-th price relatives.
Plugging it into the update scheme of $\mu_{t+1}$, we get

$$\mu_{t+1} = \mu_t - \lambda \Upsilon_t^2 (x_t - \bar{x}_t \mathbf{1}).$$

Moreover, taking the derivative of the Lagrangian with respect to $\Upsilon$ and setting it to
zero, we have

$$0 = \frac{\partial L}{\partial \Upsilon} = -\Upsilon^{-1} + \frac{1}{2}\Upsilon_t^{-2}\Upsilon + \frac{1}{2}\Upsilon\Upsilon_t^{-2} + \lambda\phi \frac{x_t x_t^\top \Upsilon}{2\sqrt{x_t^\top \Upsilon^2 x_t}} + \lambda\phi \frac{\Upsilon x_t x_t^\top}{2\sqrt{x_t^\top \Upsilon^2 x_t}}.$$

We can solve the preceding equation to obtain $\Upsilon^{-2}$:

$$\Upsilon_{t+1}^{-2} = \Upsilon_t^{-2} + \lambda\phi \frac{x_t x_t^\top}{\sqrt{x_t^\top \Upsilon_{t+1}^2 x_t}}.$$

The preceding two updates can be expressed in terms of the covariance matrix,

$$\mu_{t+1} = \mu_t - \lambda \Sigma_t (x_t - \bar{x}_t \mathbf{1}), \qquad \Sigma_{t+1}^{-1} = \Sigma_t^{-1} + \lambda\phi \frac{x_t x_t^\top}{\sqrt{x_t^\top \Sigma_{t+1} x_t}}. \tag{B.13}$$

Here, $\Sigma_{t+1}$ is positive semidefinite (PSD) and nonsingular.
Now, let us solve for the Lagrangian multiplier using its KKT condition. Following
Crammer et al. (2008), we compute the inverse using the Woodbury identity (Golub
and Van Loan 1996):

$$\Sigma_{t+1} = \Sigma_t - \Sigma_t x_t \left(\frac{\lambda\phi}{\sqrt{x_t^\top \Sigma_{t+1} x_t} + \lambda\phi\, x_t^\top \Sigma_t x_t}\right) x_t^\top \Sigma_t. \tag{B.14}$$

Similar to the proof of Proposition 10.1, we set $M_t = \mu_t^\top x_t$, $V_t = x_t^\top \Sigma_t x_t$,
$W_t = x_t^\top \Sigma_t \mathbf{1}$, and $U_t = x_t^\top \Sigma_{t+1} x_t$. Multiplying the preceding equation by $x_t^\top$ (left) and $x_t$
(right), we get $U_t = V_t - V_t \frac{\lambda\phi}{\sqrt{U_t} + \lambda\phi V_t} V_t$, which can be solved for $U_t$:

$$\sqrt{U_t} = \frac{-\lambda\phi V_t + \sqrt{\lambda^2\phi^2 V_t^2 + 4V_t}}{2}. \tag{B.15}$$

The KKT condition implies that either $\lambda = 0$, and no update is needed; or the constraint
in the optimization problem (10.4) is an equality after the update. Substituting
Equations B.13 and B.15 into the equality version of the constraint, after rearranging
in terms of $\lambda$, we get

$$\lambda^2\left(\left(V_t - \bar{x}_t W_t + \frac{\phi^2 V_t}{2}\right)^2 - \frac{\phi^4 V_t^2}{4}\right) + 2\lambda(\epsilon - M_t)\left(V_t - \bar{x}_t W_t + \frac{\phi^2 V_t}{2}\right) + (\epsilon - M_t)^2 - \phi^2 V_t = 0. \tag{B.16}$$

Let $a = \left(V_t - \bar{x}_t W_t + \frac{\phi^2 V_t}{2}\right)^2 - \frac{\phi^4 V_t^2}{4}$, $b = 2(\epsilon - M_t)\left(V_t - \bar{x}_t W_t + \frac{\phi^2 V_t}{2}\right)$, and
$c = (\epsilon - M_t)^2 - \phi^2 V_t$. Note that we only consider real roots of the quadratic
equation. Thus, we can obtain $\gamma_t$ as its roots (two real roots case: $\gamma_{t1}$ and $\gamma_{t2}$; one real
root case: $\gamma_{t3}$):

$$\gamma_{t1} = \frac{-b + \sqrt{b^2 - 4ac}}{2a}, \quad \gamma_{t2} = \frac{-b - \sqrt{b^2 - 4ac}}{2a}, \quad \text{or} \quad \gamma_{t3} = -\frac{c}{b}.$$

To ensure the nonnegativity of the Lagrangian multiplier, we project the roots to
$[0, +\infty)$:

$$\lambda = \max\{\gamma_{t1}, \gamma_{t2}, 0\}, \quad \text{or} \quad \lambda = \max\{\gamma_{t3}, 0\}, \quad \text{or} \quad \lambda = 0,$$

which corresponds to the three cases (two, one, or zero real roots), respectively.
Following the proof of Proposition 10.1, we can update the diagonal covariance
matrix as

$$\Sigma_{t+1}^{-1} = \Sigma_t^{-1} + \lambda \frac{\phi}{\sqrt{U_t}}\, \mathrm{diag}^2(x_t),$$

where $\mathrm{diag}(x_t)$ denotes the diagonal matrix with the elements of $x_t$ on its main
diagonal.
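As a numerical sanity check of Equations B.15 and B.16, the following Python sketch computes $\lambda$ and $\sqrt{U_t}$ from the scalar summaries $M_t$, $V_t$, $W_t$, and $\bar{x}_t$. The function name and the example values are hypothetical, and the root selection follows the three cases stated in the text.

```python
import numpy as np

def cwmr_stdev_multiplier(M, V, W, x_bar, phi, eps):
    """Return (lambda, sqrt(U_t)) from the quadratic (B.16) and formula (B.15)."""
    k = V - x_bar * W + phi**2 * V / 2
    a = k**2 - phi**4 * V**2 / 4
    b = 2 * (eps - M) * k
    c = (eps - M) ** 2 - phi**2 * V
    disc = b * b - 4 * a * c
    if a != 0 and disc >= 0:          # two real roots
        r = np.sqrt(disc)
        lam = max((-b + r) / (2 * a), (-b - r) / (2 * a), 0.0)
    elif a == 0 and b != 0:           # one real root
        lam = max(-c / b, 0.0)
    else:                             # no real root
        lam = 0.0
    sqrt_u = (-lam * phi * V + np.sqrt(lam**2 * phi**2 * V**2 + 4 * V)) / 2
    return lam, sqrt_u
```

For the illustrative values $M_t = 1$, $V_t = 0.52$, $W_t = 0.5$, $\bar{x}_t = 1$, $\phi = 1$, and $\epsilon = 0.9$, the returned pair satisfies both the fixed-point relation for $U_t$ and the equality version of the constraint, $\epsilon - M_t + \lambda(V_t - \bar{x}_t W_t) = \phi\sqrt{U_t}$.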

B.4 Derivation of OLMAR


B.4.1 Proof of Proposition 11.1

Proof Since introducing the nonnegativity part of the simplex constraint causes
much difficulty (Helmbold et al. 1998), we first ignore it and finally project the solution
onto the simplex domain.
The Lagrangian of the optimization problem OLMAR is

$$L(b, \lambda, \eta) = \frac{1}{2}\|b - b_t\|^2 + \lambda(\epsilon - b \cdot \tilde{x}_{t+1}) + \eta(b \cdot \mathbf{1} - 1),$$

where $\lambda \ge 0$ and $\eta$ are the Lagrangian multipliers. Taking the gradient with respect
to $b$ and setting it to zero, we get

$$0 = \frac{\partial L}{\partial b} = (b - b_t) - \lambda \tilde{x}_{t+1} + \eta \mathbf{1} \implies b = b_t + \lambda \tilde{x}_{t+1} - \eta \mathbf{1}.$$

Multiplying both sides by $\mathbf{1}^\top$, we get

$$1 = 1 + \lambda\, \tilde{x}_{t+1} \cdot \mathbf{1} - \eta m \implies \eta = \lambda \bar{x}_{t+1},$$

where $\bar{x}_{t+1}$ denotes the average predicted price relative (market). Plugging the above
equation into the update of $b$, we get

$$b = b_t + \lambda(\tilde{x}_{t+1} - \bar{x}_{t+1} \mathbf{1}).$$

To solve for the Lagrangian multiplier, let us plug the above equation into the Lagrangian:

$$L(\lambda) = \lambda(\epsilon - b_t \cdot \tilde{x}_{t+1}) - \frac{1}{2}\lambda^2 \|\tilde{x}_{t+1} - \bar{x}_{t+1} \mathbf{1}\|^2.$$

Taking the derivative with respect to $\lambda$ and setting it to zero, we get

$$0 = \frac{\partial L}{\partial \lambda} = (\epsilon - b_t \cdot \tilde{x}_{t+1}) - \lambda \|\tilde{x}_{t+1} - \bar{x}_{t+1} \mathbf{1}\|^2 \implies \lambda = \frac{\epsilon - b_t \cdot \tilde{x}_{t+1}}{\|\tilde{x}_{t+1} - \bar{x}_{t+1} \mathbf{1}\|^2}.$$

Further projecting $\lambda$ to $[0, +\infty)$, we get $\lambda = \max\left\{0,\ \frac{\epsilon - b_t \cdot \tilde{x}_{t+1}}{\|\tilde{x}_{t+1} - \bar{x}_{t+1} \mathbf{1}\|^2}\right\}$.
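The resulting OLMAR step can be sketched in Python as follows. The function name is hypothetical, the predicted price relatives are passed in directly (how they are predicted is outside this derivation), the guard for a zero denominator is an assumption of this sketch, and the final simplex projection is omitted.

```python
import numpy as np

def olmar_update(b_t, x_pred, eps):
    """One OLMAR step before simplex projection:
    lam = max(0, (eps - b_t . x_pred) / ||x_pred - mean(x_pred) 1||^2),
    b = b_t + lam (x_pred - mean(x_pred) 1)."""
    b_t = np.asarray(b_t, dtype=float)
    x_pred = np.asarray(x_pred, dtype=float)
    dev = x_pred - x_pred.mean()          # deviation from the average prediction
    denom = float(dev @ dev)
    if denom == 0.0:
        lam = 0.0                         # assumed: no update if all predictions agree
    else:
        lam = max(0.0, (eps - float(b_t @ x_pred)) / denom)
    return b_t + lam * dev, lam
```

With `b_t = [0.5, 0.5]`, `x_pred = [1.2, 0.8]`, and `eps = 1.1`, the multiplier is 1.25 and the updated portfolio `[0.75, 0.25]` attains the predicted return of exactly 1.1 while its weights still sum to 1.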



Appendix C

Supplementary Data and Portfolio Statistics

This section provides some supplementary data and portfolio statistics, which mainly
complement the observations in Section 13.6.
Similar to Table 13.5, Tables C.1, C.2, C.3, and C.4 show some descriptive
statistics on the NYSE (N) dataset, SP500 dataset, MSCI dataset, and DJIA dataset,
respectively.
Table C.5 illustrates the top five average allocation weights of the proposed strategies
on the datasets except NYSE (O). Connecting these weights with the descriptive
statistics, we can make observations similar to those for NYSE (O). That is, the proposed
algorithms put more weight on the volatile assets so as to exploit the volatility
of the assets, and most of the top average allocation weights of the mean reversion
algorithms are on assets with negative autocorrelations, except on DJIA.
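Statistics of the kind reported in Tables C.1 through C.4 can be computed from a matrix of price relatives along the lines of the Python sketch below. The exact definitions used for the tables are given elsewhere in the book; here it is assumed, for illustration only, that Cum is the product of price relatives, Std is the population standard deviation, and Ac is the lag-1 autocorrelation.

```python
import numpy as np

def descriptive_stats(X):
    """Per-asset statistics for a (periods x assets) array of price relatives.
    The definitions of Std (ddof=0) and Ac (lag 1) are assumptions of this sketch."""
    X = np.asarray(X, dtype=float)
    ac = np.array([np.corrcoef(X[:-1, k], X[1:, k])[0, 1]
                   for k in range(X.shape[1])])
    return {
        "Cum": X.prod(axis=0),    # cumulative wealth of buying and holding the asset
        "Mean": X.mean(axis=0),   # average price relative
        "Std": X.std(axis=0),     # volatility of price relatives
        "Ac": ac,                 # lag-1 autocorrelation
    }
```

An asset that alternates between price relatives 1.1 and 0.9 has a strongly negative Ac under this definition, which is the pattern the mean reversion strategies are designed to exploit.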


Table C.1 Some descriptive statistics on the NYSE (N) dataset


Stat. 1 2 3 4 5 6 7 8
Cum 15.0504 3.7815 9.9188 32.7579 13.0692 9.1587 10.2382 9.3461
Mean 1.0006 1.0005 1.0005 1.0007 1.0007 1.0006 1.0005 1.0007
Std 0.0184 0.0237 0.0174 0.0163 0.0243 0.0203 0.0180 0.0251
Ac −0.0084 0.0321 0.0365 −0.0140 −0.0098 −0.0134 −0.0104 0.0060
9 10 11 12 13 14 15 16
Cum 12.3507 0.0407 16.8484 6.8311 19.7812 43.8699 19.6711 2.0467
Mean 1.0006 1.0002 1.0007 1.0005 1.0007 1.0007 1.0006 1.0010
Std 0.0185 0.0356 0.0246 0.0183 0.0221 0.0153 0.0157 0.0436
Ac −0.0193 0.0291 −0.0177 −0.0266 0.0145 0.0149 −0.0507 −0.0963
17 18 19 20 21 22 23
Cum 0.3934 28.7380 17.2551 83.5067 17.4584 10.4056 32.7829
Mean 1.0002 1.0007 1.0006 1.0009 1.0008 1.0006 1.0007
Std 0.0245 0.0179 0.0154 0.0181 0.0251 0.0226 0.0191
Ac 0.0206 0.0139 −0.0368 −0.0192 0.0055 −0.0339 −0.0478
Note: The top five of each statistic are highlighted. Each column denotes the index of an asset.

Table C.2 Some descriptive statistics on the SP500 dataset
Stat. 1 2 3 4 5 6 7 8
Cum 0.9381 1.4478 2.4279 1.1038 1.2034 1.2841 1.6480 1.3822
Mean 1.0002 1.0007 1.0010 1.0002 1.0004 1.0006 1.0006 1.0005
Std 0.0229 0.0280 0.0245 0.0180 0.0233 0.0273 0.0181 0.0223
Ac −0.0090 −0.0103 0.0032 −0.0363 0.0256 0.0033 0.0783 0.0510
9 10 11 12 13 14 15 16
Cum 1.4807 1.0293 1.0572 0.8625 1.1531 0.6044 1.3816 0.8419
Mean 1.0006 1.0002 1.0003 1.0005 1.0004 0.9998 1.0010 1.0001
Std 0.0255 0.0206 0.0215 0.0363 0.0247 0.0202 0.0392 0.0234
Ac −0.0389 0.0590 −0.0182 −0.0386 0.0063 0.0508 −0.0618 −0.0962
17 18 19 20 21 22 23 24
Cum 1.2405 3.7792 2.2260 1.1244 0.6523 1.1119 0.8263 1.8606
Mean 1.0004 1.0016 1.0013 1.0003 1.0000 1.0003 1.0000 1.0009
Std 0.0210 0.0323 0.0369 0.0213 0.0244 0.0212 0.0176 0.0296
Ac −0.0206 −0.0431 0.0094 −0.0755 −0.0150 0.0017 −0.0401 0.0229
25
Cum 0.8738
Mean 1.0002
Std 0.0242
Ac 0.0299
Note: The top five of each statistic are highlighted. Each column denotes the index of an asset.

Table C.3 Some descriptive statistics on the MSCI dataset


Stat. 1 2 3 4 5 6 7 8
Cum 0.9095 1.2541 0.8248 1.0638 1.0133 1.0242 1.3189 0.8063
Mean 1.0000 1.0005 1.0000 1.0002 1.0003 1.0002 1.0004 0.9999
Std 0.0164 0.0219 0.0190 0.0152 0.0242 0.0198 0.0180 0.0162
Ac −0.1305 −0.0099 −0.0426 −0.0240 −0.0105 −0.0072 0.0082 0.0323
9 10 11 12 13 14 15 16
Cum 0.8742 0.7265 0.9160 0.6586 1.5040 0.2894 1.2153 0.5191
Mean 1.0003 0.9999 1.0001 0.9998 1.0005 0.9991 1.0003 0.9996
Std 0.0281 0.0178 0.0187 0.0196 0.0125 0.0256 0.0177 0.0232
Ac −0.0322 0.0415 −0.0225 −0.0103 0.0237 0.0188 −0.0257 0.0624
17 18 19 20 21 22 23 24
Cum 0.9653 0.8714 0.8123 1.2018 1.1802 0.5670 0.5080 0.7285
Mean 1.0001 1.0001 1.0000 1.0004 1.0004 0.9996 0.9997 0.9998
Std 0.0192 0.0195 0.0220 0.0191 0.0202 0.0196 0.0250 0.0170
Ac −0.0228 −0.0549 −0.0324 0.0302 0.0275 0.0600 0.0410 −0.1259
Note: The top five of each statistic are highlighted. Each column denotes the index of an asset.

Table C.4 Some descriptive statistics on the DJIA dataset
Stat. 1 2 3 4 5 6 7 8
Cum 0.7085 0.5378 1.1414 1.1884 0.6870 0.7389 0.5449 1.1572
Mean 0.9997 0.9991 1.0004 1.0007 0.9996 0.9997 0.9993 1.0004
Std 0.0263 0.0260 0.0174 0.0269 0.0278 0.0256 0.0314 0.0161
Ac −0.0315 0.0094 0.0230 −0.0691 0.0270 −0.0961 −0.0479 −0.0895
9 10 11 12 13 14 15 16
Cum 0.5459 0.4287 0.7718 0.5806 0.6904 0.5323 0.5034 0.3608
Mean 0.9991 0.9988 0.9996 0.9992 0.9996 0.9992 0.9988 0.9990
Std 0.0256 0.0291 0.0166 0.0248 0.0268 0.0309 0.0198 0.0398
Ac 0.0597 0.0011 0.0280 −0.0360 −0.0066 −0.0033 0.0240 0.0041
17 18 19 20 21 22 23 24
Cum 1.0191 0.6048 1.0863 0.8560 0.9178 0.9361 0.9787 0.8797
Mean 1.0003 0.9996 1.0003 1.0001 1.0001 1.0002 1.0002 1.0000
Std 0.0223 0.0357 0.0180 0.0264 0.0212 0.0249 0.0208 0.0202
Ac −0.0506 0.0208 −0.0622 −0.0963 −0.0658 0.0257 −0.0330 −0.0578
25 26 27 28 29 30
Cum 0.5947 0.5197 0.6710 0.8309 0.9872 0.9311
Mean 0.9994 0.9995 0.9994 0.9998 1.0003 1.0001
Std 0.0281 0.0398 0.0197 0.0179 0.0238 0.0209
Ac 0.0426 −0.0154 0.0873 −0.0434 0.0335 0.0288
Note: The top five of each statistic are highlighted. Each column denotes the index of an asset.

Table C.5 The top five (average) allocation weights of the proposed strategies on five datasets

NYSE (N)
Asset # 16 10 5 17 11 Asset # 13 4 16 23 20
Anticor 0.15 0.08 0.06 0.06 0.06 CORN 0.15 0.15 0.15 0.07 0.05
Asset # 16 11 5 10 22 Asset # 16 11 10 22 5
PAMR 0.18 0.07 0.06 0.06 0.06 OLMAR 0.18 0.07 0.07 0.06 0.06
TSE
Asset # 18 24 71 79 74 Asset # 51 87 25 71 18
Anticor 0.09 0.08 0.07 0.06 0.06 CORN 0.14 0.13 0.12 0.10 0.05
Asset # 24 18 71 79 74 Asset # 24 18 71 79 32
PAMR 0.12 0.08 0.07 0.04 0.04 OLMAR 0.10 0.09 0.08 0.05 0.04
SP500
Asset # 15 12 19 25 21 Asset # 19 24 15 18 8
Anticor 0.09 0.08 0.07 0.06 0.05 CORN 0.20 0.14 0.14 0.13 0.08
Asset # 15 19 12 18 24 Asset # 15 12 19 24 18
PAMR 0.10 0.08 0.07 0.07 0.06 OLMAR 0.09 0.08 0.07 0.06 0.06
MSCI
Asset # 9 2 7 1 20 Asset # 2 24 1 7 6
Anticor 0.17 0.16 0.14 0.07 0.06 CORN 0.15 0.14 0.11 0.10 0.08
Asset # 24 14 9 16 10 Asset # 14 16 24 9 10
PAMR 0.11 0.10 0.09 0.09 0.07 OLMAR 0.12 0.10 0.09 0.08 0.08
DJIA
Asset # 16 18 26 14 7 Asset # 19 24 15 18 8
Anticor 0.08 0.07 0.07 0.06 0.05 CORN 0.20 0.14 0.14 0.13 0.08

Asset # 18 26 16 14 7 Asset # 26 10 16 18 7
PAMR 0.10 0.07 0.07 0.06 0.06 OLMAR 0.11 0.07 0.06 0.06 0.05
Note: “Asset #” denotes the indices of the allocated assets.


Bibliography

J. Abernethy, A. Agarwal, P. L. Bartlett, and A. Rakhlin. A stochastic view of optimal
regret through minimax duality. In Proceedings of Annual Conference on
Learning Theory, Montreal, Quebec, 2009.
A. Agarwal, E. Hazan, S. Kale, and R. E. Schapire. Algorithms for portfolio management
based on the Newton method. In Proceedings of International Conference
on Machine Learning, Pittsburgh, PA, 9–16, 2006.
A. Agarwal, P. Bartlett, and M. Dama. Optimal allocation strategies for the dark pool
problem. In Proceedings of International Conference on Artificial Intelligence
and Statistics, Chia Laguna Resort, Sardinia, 9–16, 2010.
R. Agrawal and R. Srikant. Mining sequential patterns. In Proceedings of the
Eleventh International Conference on Data Engineering, Taipei, Taiwan, 3–14,
1995.
D. W. Aha. Case-based learning algorithms. In Proceedings of the DARPA Case-Based
Reasoning Workshop, 147–158, 1991.
D. W. Aha, D. Kibler, and M. K. Albert. Instance-based learning algorithms. Machine
Learning, 6(1):37–66, 1991.
K. Akcoglu, P. Drineas, and M.-Y. Kao. Fast universalization of investment strategies.
SIAM Journal on Computing, 34(1):1–22, 2005.
I. Aldridge. High-Frequency Trading: A Practical Guide to Algorithmic Strategies
and Trading Systems. Hoboken, NJ: Wiley, 2010.
P. Algoet. Universal schemes for prediction, gambling and portfolio selection. The
Annals of Probability, 20(2):901–941, 1992.
P. Algoet and T. Cover. Asymptotic optimality and asymptotic equipartition properties of
log-optimum investments. Annals of Probability, 16:876–898, 1988.
P. H. Algoet. The strong law of large numbers for sequential decisions
under uncertainty. IEEE Transactions on Information Theory, 40:609–633,
1994.
R. F. Almgren and N. Chriss. Optimal execution of portfolio transactions. Journal of
Risk, 12:61–63, 2000.
F. R. Bach. Consistency of the group lasso and multiple kernel learning. Journal of
Machine Learning Research, 9:1179–1225, 2008.

L. Bachelier. Théorie de la spéculation. Annales Scientifiques de l’École Normale
Supérieure, 3(17):21–86, 1900.
P. Baldi and S. Brunak. Bioinformatics: The Machine Learning Approach, 2nd
Edition. Cambridge, MA: MIT Press, 2001.
N. Barberis and R. Thaler. A survey of behavioural finance. In Handbook of the
Economics of Finance, G. M. Constantinides, M. Harris, and R. Stulz (eds.),
Elsevier, North Holland, Amsterdam, 1053–1128, 2003.
E. Bayraktar. Optimal trade execution in illiquid markets. Mathematical Finance,
21(4):681–701, 2011.
J. E. Beasley, N. Meade, and T. J. Chang. An evolutionary heuristic for the index
tracking problem. European Journal of Operational Research, 148(3):621–643,
2003.
C. Y. Belentepe. A Statistical View of Universal Portfolios. PhD thesis, University of
Pennsylvania, 2005.
D. J. Berndt and J. Clifford. Using dynamic time warping to find patterns in time
series. In KDD Workshop, 359–370, 1994.
D. Bernoulli. Exposition of a new theory on the measurement of risk. Econometrica,
23:23–36, 1954.
D. Bertsimas and A. W. Lo. Optimal control of execution costs. Journal of Financial
Markets, 1(1):1–50, 1998.
J. R. Birge and F. Louveaux. Introduction to Stochastic Programming. New York:
Springer, 1997.
A. Blum and A. Kalai. Universal portfolios with and without transaction costs.
Machine Learning, 35(3):193–205, 1999.
A. Blum and Y. Mansour. From external to internal regret. Journal of Machine
Learning Research, 8:1307–1324, 2007.
W. F. M. D. Bondt and R. Thaler. Does the stock market overreact? The Journal of
Finance, 40(3):793–805, 1985.
A. Borodin, R. El-Yaniv, and V. Gogan. Can we learn to beat the best stock. Journal
of Artificial Intelligence Research, 21:579–594, 2004.
S. Boyd and L. Vandenberghe. Convex Optimization. New York: Cambridge
University Press, 2004.
L. Breiman. The individual ergodic theorem of information theory. The Annals of
Mathematical Statistics, 31:809–811, 1957 (Correction version 1960).
L. Breiman. Investment policies for expanding businesses optimal in a long-run sense.
Naval Research Logistics Quarterly, 7(4):647–651, 1960.
L. Breiman. Optimal gambling systems for favorable games. Proceedings of the
Berkeley Symposium on Mathematical Statistics and Probability, 1:65–78, 1961.
J. Brodie, I. Daubechies, C. De Mol, D. Giannone, and I. Loris. Sparse and stable
Markowitz portfolios. Proceedings of the National Academy of Sciences, 106
(30):12267–12272, 2009.
N. A. Canakgoz and J. E. Beasley. Mixed-integer programming approaches for index
tracking and enhanced indexation. European Journal of Operational Research,
196(1):384–399, 2009.

A. Cañete, J. Constanzo, and L. Salinas. Kernel price pattern trading. Applied
Intelligence, 29(2):152–156, 2008.
L. J. Cao and F. E. H. Tay. Support vector machine with adaptive parameters in
financial time series forecasting. IEEE Transactions on Neural Networks, 14(6):
1506–1518, 2003.
N. Cesa-Bianchi and G. Lugosi. Prediction, Learning, and Games. New York:
Cambridge University Press, 2006.
E. Chan. Quantitative Trading: How to Build Your Own Algorithmic Trading Business.
Hoboken, NJ: Wiley, 2008.
K. C. Chan. On the contrarian investment strategy. The Journal of Business, 61(2):
147–163, 1988.
L. K. C. Chan, N. Jegadeesh, and J. Lakonishok. Momentum strategies. The Journal
of Finance, 51(5):1681–1713, 1996.
K. Chaudhuri and Y. Wu. Mean reversion in stock prices: Evidence from emerging
markets. Managerial Finance, 29:22–37, 2003.
V. S. Cherkassky and F. Mulier. Learning from Data: Concepts, Theory, and Methods.
New York: Wiley, 1998.
V. K. Chopra and W. T. Ziemba. The effect of errors in means, variances, and covari-
ances on optimal portfolio choice. The Journal of Portfolio Management, 19:
6–11, 1993.
T. F. Coleman, Y. Li, and J. Henniger. Minimizing tracking error while restricting
the number of assets. Journal of Risk, 8:33–56, 2006.
J. Conrad and G. Kaul. An anatomy of trading strategies. Review of Financial Studies,
11(3):489–519, 1998.
R. Cont. Empirical properties of asset returns: stylized facts and statistical issues.
Quantitative Finance, 1(2):223–236, 2001.
P. Cootner. The Random Character of Stock Market Prices. Cambridge, MA:
MIT Press, 1964.
T. Cover and E. Ordentlich. Universal portfolios with short sales and margin. In
Proceedings of Annual IEEE International Symposium on Information Theory,
Cambridge, MA, 174, 1998.
T. M. Cover. Universal portfolios. Mathematical Finance, 1(1):1–29, 1991.
T. M. Cover. Universal data compression and portfolio selection. In Proceedings of
Annual IEEE Symposium on Foundations of Computer Science, Burlington, VT,
534–538, 1996.
T. M. Cover and D. H. Gluss. Empirical Bayes stock market portfolios. Advances in
Applied Mathematics, 7(2):170–181, 1986.
T. M. Cover and E. Ordentlich. Universal portfolios with side information. IEEE
Transactions on Information Theory, 42(2):348–363, 1996.
T. M. Cover and J. A. Thomas. Elements of Information Theory. New York: Wiley,
1991.
K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, and Y. Singer. Online passive–
aggressive algorithms. Journal of Machine Learning Research, 7:551–585,
2006.

K. Crammer, M. Dredze, and A. Kulesza. Multi-class confidence weighted algorithms.
In Proceedings of the Conference on Empirical Methods in Natural Language
Processing, Singapore, 496–504, 2009.
K. Crammer, M. Dredze, and F. Pereira. Exact convex confidence-weighted learn-
ing. In Proceedings of Annual Conference on Neural Information Processing
Systems, Vancouver, 345–352, 2008.
G. Creamer. Using Boosting for Automated Planning and Trading Systems. PhD thesis,
Columbia University, 2007.
G. Creamer. Model calibration and automated trading agent for euro futures.
Quantitative Finance, 12(4):531–545, 2012.
G. Creamer and S. Stolfo. A link mining algorithm for earnings forecast and trading.
Data Mining and Knowledge Discovery, 18(3):419–445, 2009.
G. G. Creamer and Y. Freund. A boosting approach for automated trading. Journal
of Trading, 2(3):84–96, 2007.
G. G. Creamer and Y. Freund. Automated trading with boosting and expert weighting.
Quantitative Finance, 10(4):401–420, 2010.
J. E. Cross and A. R. Barron. Efficient universal portfolios for past-dependent target
classes. Mathematical Finance, 13(2):245–276, 2003.
P. Das and A. Banerjee. Meta optimization and its application to portfolio selection.
In Proceedings of International Conference on Knowledge Discovery and Data
Mining, San Diego, 1163–1171, 2011.
V. DeMiguel, L. Garlappi, and R. Uppal. Optimal versus naive diversification: How
inefficient is the 1/N portfolio strategy? Review of Financial Studies, 22(5):
1915–1953, 2009.
M. A. H. Dempster, T. W. Payne, Y. Romahi, and G. W. P. Thompson. Computational
learning techniques for intraday FX trading using popular technical indicators.
IEEE Transactions on Neural Networks, 12(4):744–754, 2001.
E. Dimson. Stock Market Anomalies. Cambridge, MA: Cambridge University Press,
1988.
M. Dredze, K. Crammer, and F. Pereira. Confidence-weighted linear classification.
In Proceedings of International Conference on Machine Learning, Helsinki,
Finland, 264–271, 2008.
X. Du, R. Jin, L. Ding, V. E. Lee, and J. H. Thornton Jr. Migration motif: A spatial-
temporal pattern mining approach for financial markets. In Proceedings of the
ACM SIGKDD International Conference on Knowledge Discovery and Data
Mining, Paris, France, 1135–1144, 2009.
J. Duchi, S. Shalev-Shwartz, Y. Singer, and T. Chandra. Efficient projections onto
the l1-ball for learning in high dimensions. In Proceedings of International
Conference on Machine Learning, Helsinki, Finland, 272–279, 2008.
M. Durbin. All About High-Frequency Trading. New York: McGraw-Hill, 2010.
R. El-Yaniv. Competitive solutions for online financial problems. ACM Computing
Surveys, 30:28–69, 1998.
J. Exley, S. Mehta, and A. Smith. Mean Reversion. Technical report, Fac-
ulty & Institute of Actuaries, Finance and Investment Conference, Brussels,
2004.

E. Fagiuoli, F. Stella, and A. Ventura. Constant rebalanced portfolios and side-
information. Quantitative Finance, 7(2):161–173, 2007.
E. F. Fama and K. R. French. The cross-section of expected stock returns. The Journal
of Finance, 47(2):427–465, 1992.
M. Feder, N. Merhav, and M. Gutman. Universal prediction of individual sequences.
IEEE Transactions on Information Theory, 38(4):1258–1270, 1992.
M. Finkelstein and R. Whitley. Optimal strategies for repeated games. Advances in
Applied Probability, 13(2):415–428, 1981.
T. Foucault, O. Kadan, and E. Kandel. Limit order book as a market for liquidity.
Review of Financial Studies, 18(4):1171–1217, 2005.
W. J. Fu. Penalized regressions: The bridge versus the lasso. Journal of Computa-
tional and Graphical Statistics, 7(3):397–416, 1998.
A. A. Gaivoronski and F. Stella. Stochastic nonstationary optimization for finding
universal portfolios. Annals of Operations Research, 100:165–188, 2000.
A. A. Gaivoronski and F. Stella. On-line portfolio selection using stochastic program-
ming. Journal of Economic Dynamics and Control, 27(6):1013–1043, 2003.
K. Ganchev, Y. Nevmyvaka, M. Kearns, and J. W. Vaughan. Censored explo-
ration and the dark pool problem. Communications of the ACM, 53(5):99–107,
2010.
M. Gilli and E. Këllezi. The threshold accepting heuristic for index tracking. In
Financial Engineering, E-Commerce, and Supply Chain, P. M. Pardalos and
V. Tsitsiringos (eds.), Boston: Kluwer Academic, 1–18, 2002.
G. H. Golub and C. F. Van Loan. Matrix Computations. Baltimore, MD: Johns
Hopkins University Press, 1996.
T. F. Gosnell, A. J. Keown, and J. M. Pinkerton. The intraday speed of stock
price adjustment to major dividend changes: Bid-ask bounce and order flow
imbalances. Journal of Banking & Finance, 20(2):247–266, 1996.
R. Grinold and R. Kahn. Active Portfolio Management: A Quantitative Approach for
Producing Superior Returns and Controlling Risk. New York: McGraw-Hill,
1999.
L. Györfi, G. Lugosi, and F. Udina. Nonparametric kernel-based sequential invest-
ment strategies. Mathematical Finance, 16(2):337–357, 2006.
L. Györfi, G. Ottucsák, and H. Walk. Machine Learning for Financial Engineering.
Singapore: World Scientific, 2012.
L. Györfi, A. Urbán, and I. Vajda. Kernel-based semi-log-optimal empirical portfolio
selection strategies. International Journal of Theoretical and Applied Finance,
10(3):505–516, 2007.
L. Györfi, F. Udina, and H. Walk. Nonparametric nearest neighbor based empirical
portfolio selection strategies. Statistics and Decisions, 26(2):145–157, 2008.
L. Györfi and D. Schäfer. Nonparametric prediction. In Advances in Learning Theory:
Methods, Models and Applications, J. Suykens, G. Horvath, and S. Basu (eds.),
The Netherlands: IOS Press, Amsterdam, 339–354, 2003.
L. Györfi and I. Vajda. Growth optimal investment with transaction costs. In Proceed-
ings of the International Conference on Algorithmic Learning Theory, Budapest,
Hungary, 108–122, 2008.

L. Györfi and H. Walk. Empirical portfolio selection strategies with proportional
transaction costs. IEEE Transactions on Information Theory, 58(10):6320–6331,
2012.
N. H. Hakansson. Optimal investment and consumption strategies under risk for
a class of utility functions. Econometrica, 38(5):587–607, 1970.
N. H. Hakansson. Capital growth and the mean-variance approach to portfolio
selection. The Journal of Financial and Quantitative Analysis, 6(1):517–557,
1971.
J. D. Hamilton. Time Series Analysis. Princeton, NJ: Princeton University Press,
1994.
J. D. Hamilton. Regime-switching models. In New Palgrave Dictionary of Economics,
S. N. Durlauf and L. E. Blume (eds.), New York: Palgrave Macmillan, 53–57,
2008.
M. R. Hardy. A regime-switching model of long-term stock returns. North American
Actuarial Journal, Society of Actuaries, 5(2):41–53, 2001.
L. Harris. Trading and Exchanges: Market Microstructure for Practitioners. New
York: Oxford University Press, 2003.
C. R. Harvey, J. C. Liechty, M. W. Liechty, and P. Müller. Portfolio selection with
higher moments. Quantitative Finance, 10(5):469–485, 2010.
R. A. Haugen and J. Lakonishok. The Incredible January Effect: The Stock Market’s
Unsolved Mystery. Homewood, IL: Dow Jones-Irwin, 1987.
E. Hazan. Efficient Algorithms for Online Convex Optimization and Their Applica-
tions. PhD thesis, Princeton University, 2006.
E. Hazan, A. Agarwal, and S. Kale. Logarithmic regret algorithms for online convex
optimization. Machine Learning, 69(2–3):169–192, 2007.
E. Hazan, A. Kalai, S. Kale, and A. Agarwal. Logarithmic regret algorithms for
online convex optimization. In Proceedings of the Annual Conference on
Learning Theory, 2006.
E. Hazan and S. Kale. On stochastic and worst-case models for investing. In Pro-
ceedings of Annual Conference on Neural Information Processing Systems,
Vancouver, 709–717, 2009.
E. Hazan and S. Kale. An online portfolio selection algorithm with regret logarith-
mic in price variation. Mathematical Finance, 25(2):288–310, 2015.
E. Hazan and C. Seshadhri. Efficient learning algorithms for changing environments.
In Proceedings of the International Conference on Machine Learning, Montreal,
393–400, 2009.
D. P. Helmbold, R. E. Schapire, Y. Singer, and M. K. Warmuth. On-line portfo-
lio selection using multiplicative updates. In Proceedings of the International
Conference on Machine Learning, Bari, Italy, 243–251, 1996.
D. P. Helmbold, R. E. Schapire, Y. Singer, and M. K. Warmuth. A comparison of new
and old algorithms for a mixture estimation problem. Machine Learning, 27(1):
97–119, 1997.
D. P. Helmbold, R. E. Schapire, Y. Singer, and M. K. Warmuth. On-line portfolio
selection using multiplicative updates. Mathematical Finance, 8(4):325–347,
1998.

M. Herbster and M. K. Warmuth. Tracking the best expert. Machine Learning, 32(2):
151–178, 1998.
E. Hillebrand. Mean Reversion Models of Financial Markets. PhD thesis, University
of Bremen, 2003.
D. Huang, J. Zhou, B. Li, S. C. Hoi, and S. Zhou. Robust median reversion strategy
for on-line portfolio selection. In Proceedings of the Twenty-Third International
Joint Conference on Artificial Intelligence, 2006–2012, AAAI Press, Beijing,
China, 2013.
S.-H. Huang, S.-H. Lai, and S.-H. Tai. A learning-based contrarian trading strategy via
a dual-classifier model. ACM Transactions on Interactive Intelligent Systems,
[Link]–20:20, 2011.
J. C. Hull. Options, Futures, and Other Derivatives. Upper Saddle River, NJ: Prentice
Hall, 1997.
G. Iyengar. Universal investment in markets with transaction costs. Mathematical
Finance, 15(2):359–371, 2005.
F. Jamshidian. Asymptotically optimal portfolios. Mathematical Finance, 2(2):131–
150, 1992.
N. Jegadeesh. Evidence of predictable behavior of security returns. Journal of
Finance, 45(3):881–898, 1990.
A. Kalai and S. Vempala. Efficient algorithms for universal portfolios. Journal of
Machine Learning Research, 3:423–440, 2002.
J. O. Katz and D. L. McCormick. The Encyclopedia of Trading Strategies. New York:
McGraw-Hill, 2000.
M. Kearns, A. Kulesza, and Y. Nevmyvaka. Empirical limitations on high frequency
trading profitability. Journal of Trading, 5(4):50–62, 2010.
D. B. Keim and A. Madhavan. Anatomy of the trading process: Empirical evidence
on the behavior of institutional traders. Journal of Financial Economics, 37(3):
371–398, 1995.
J. Kelly. A new interpretation of information rate. Bell System Technical Journal,
35:917–926, 1956.
E. Keogh. Exact indexing of dynamic time warping. In Proceedings of the 28th
International Conference on Very Large Data Bases, 406–417, 2002.
E. J. Keogh and M. J. Pazzani. Scaling up dynamic time warping for datamining
applications. In Proceedings of the Sixth ACM SIGKDD International Con-
ference on Knowledge Discovery and Data Mining, Boston, MA, 285–289,
2000.
T. Kimoto, K. Asakawa, M. Yoda, and M. Takeoka. Stock market prediction system
with modular neural networks. Neural Networks in Finance and Investing, 343–
357, 1993.
R. Kissell, M. Glantz, and R. Malamut. Optimal Trading Strategies: Quantita-
tive Approaches for Managing Market Impact and Trading Risk. New York:
AMACOM, 2003.
W. M. Koolen and V. Vovk. Buy low, sell high. In Proceedings of Interna-
tional Conference on Algorithmic Learning Theory, Lyon, France, 335–349,
2012.

S. S. Kozat and A. C. Singer. Universal constant rebalanced portfolios with switch-
ing. In Proceedings of the International Conference on Acoustics, Speech, and
Signal Processing, Honolulu, 1129–1132, 2007.
S. S. Kozat and A. C. Singer. Universal switching portfolios under transaction costs.
In Proceedings of the International Conference on Acoustics, Speech, and Signal
Processing, Las Vegas, NV, 5404–5407, 2008.
S. S. Kozat and A. C. Singer. Switching strategies for sequential decision problems
with multiplicative loss with application to portfolios. IEEE Transactions on
Signal Processing, 57(6):2192–2208, 2009.
S. S. Kozat and A. C. Singer. Universal randomized switching. IEEE Transactions
on Signal Processing, 58:3, 2010.
S. S. Kozat and A. C. Singer. Universal semiconstant rebalanced portfolios.
Mathematical Finance, 21(2):293–311, 2011.
S. S. Kozat, A. C. Singer, and A. J. Bean. Universal portfolios via context trees. In
Proceedings of the International Conference on Acoustics, Speech, and Signal
Processing, Las Vegas, NV, 2093–2096, 2008.
S. S. Kozat, A. C. Singer, and A. J. Bean. A tree-weighting approach to sequential
decision problems with multiplicative loss. Signal Processing, 91(4):890–905,
2011.
S. Kullback and R. Leibler. On information and sufficiency. Annals of Mathematical
Statistics, 22:79–86, 1951.
H. A. Latané. Criteria for choice among risky ventures. The Journal of Political
Economy, 67(2):144–155, 1959.
T. Levina and G. Shafer. Portfolio selection and online learning. International Jour-
nal of Uncertainty, Fuzziness and Knowledge-Based Systems, 16(4):437–473,
2008.
B. Li and S. C. Hoi. Online portfolio selection: A survey. ACM Computing Surveys,
[Link]–35:36, 2014.
B. Li, S. C. Hoi, and V. Gopalkrishnan. CORN: Correlation-driven nonparamet-
ric learning approach for portfolio selection. ACM Transactions on Intelligent
Systems and Technology, 2(3):21:1–21:29, 2011a.
B. Li, S. C. Hoi, D. Sahoo, and Z. Liu. Moving average reversion strategy for on-line
portfolio selection. Artificial Intelligence, 222:104–123, 2015.
B. Li, S. C. Hoi, P. Zhao, and V. Gopalkrishnan. Confidence weighted mean reversion
strategy for on-line portfolio selection. In Proceedings of the International Con-
ference on Artificial Intelligence and Statistics, Fort Lauderdale, FL, 434–442,
2011b.
B. Li, S. C. Hoi, P. Zhao, and V. Gopalkrishnan. Confidence weighted mean
reversion strategy for on-line portfolio selection. ACM Transactions on
Knowledge Discovery from Data, 2013.
B. Li and S. C. H. Hoi. On-line portfolio selection with moving average reversion.
In Proceedings of the International Conference on Machine Learning,
Edinburgh, 273–280, 2012.
B. Li, P. Zhao, S. Hoi, and V. Gopalkrishnan. PAMR: Passive–aggressive mean
reversion strategy for portfolio selection. Machine Learning, 87(2):221–258,
2012.

A. W. Lo. Where do alphas come from? A measure of the value of active investment
management. Journal of Investment Management, 6:1–29, 2008.
A. W. Lo and A. C. MacKinlay. When are contrarian profits due to stock market
overreaction? Review of Financial Studies, 3(2):175–205, 1990.
M. S. Lobo, M. Fazel, and S. Boyd. Portfolio optimization with linear and fixed
transaction costs. Annals of Operations Research, 152(1):341–365, 2007.
J. Loveless, S. Stoikov, and R. Waeber. Online algorithms in high-frequency trading.
Communications of the ACM, 56(10):50–56, 2013.
C.-J. Lu, T.-S. Lee, and C.-C. Chiu. Financial time series forecasting using inde-
pendent component analysis and support vector regression. Decision Support
Systems, 47:115–125, 2009.
D. G. Luenberger. Investment Science. New York: Oxford University Press, 1998.
L. C. MacLean, E. O. Thorp, and W. T. Ziemba. The Kelly Capital Growth Invest-
ment Criterion: Theory and Practice. Volume 3. Singapore: World Scientific,
2011.
M. Magdon-Ismail and A. Atiya. Maximum drawdown. Risk Magazine, 10:99–102,
2004.
C. D. Manning and H. Schütze. Foundations of Statistical Natural Language
Processing. Cambridge, MA: MIT Press, 1999.
D. Maringer. Constrained index tracking under loss aversion using differential
evolution. In Natural Computing in Computational Finance, A. Brabazon and
M. O’Neill (eds.), 7–24. Berlin: Springer, 2008.
H. Markowitz. Portfolio selection. The Journal of Finance, 7(1):77–91, 1952.
H. Markowitz. Portfolio Selection: Efficient Diversification of Investments. New
York: Wiley, 1959.
T. H. McInish and R. A. Wood. An analysis of intraday patterns in bid/ask spreads for
NYSE stocks. The Journal of Finance, 47(2):753–764, 1992.
B. McWilliams and G. Montana. Sparse partial least squares regression for on-line
variable selection with multivariate data streams. Statistical Analysis and Data
Mining, 3(3):170–193, 2010.
N. Meade and G. R. Salkin. Index funds-construction and performance measurement.
The Journal of the Operational Research Society, 40(10):871–879, 1989.
N. Meade and G. R. Salkin. Developing and maintaining an equity index fund.
The Journal of the Operational Research Society, 41(7):599–607, 1990.
M. H. Miller. Financial innovation: The last twenty years and the next. The Journal
of Financial and Quantitative Analysis, 21(4):459–471, 1986.
T. Mitchell. Machine Learning. Burr Ridge, IL: McGraw-Hill, 1997.
J. Moody and M. Saffell. Learning to trade via direct reinforcement. IEEE
Transactions on Neural Networks, 12(4):875–889, 2001.
J. Moody, L. Wu, Y. Liao, and M. Saffell. Performance functions and reinforcement
learning for trading systems and portfolios. Journal of Forecasting, 17:441–471,
1998.
Y. Nevmyvaka, Y. Feng, and M. S. Kearns. Reinforcement learning for optimized
trade execution. In Proceedings of the International Conference on Machine
Learning, 673–680, 2006.

E. Ordentlich. Universal Investment and Universal Data Compression. PhD thesis,
Stanford University, 1996.
E. Ordentlich. Universal portfolios. In Encyclopedia of Quantitative Finance. Sussex:
Wiley, 2010.
E. Ordentlich and T. M. Cover. On-line portfolio selection. In Proceedings of the
Annual Conference on Learning Theory, Desenzano del Garda, Italy, 310–313,
1996.
E. Ordentlich and T. M. Cover. The cost of achieving the best portfolio in hindsight.
Mathematics of Operations Research, 23(4):960–982, 1998.
M. F. M. Osborne. Brownian motion in the stock market. Operations Research, 7(2):
145–173, 1959.
G. Ottucsák and I. Vajda. An asymptotic analysis of the mean-variance portfolio
selection. Statistics and Decisions, 25:63–88, 2007.
D. C. Porter. The probability of a trade at the ask: An examination of interday and
intraday behavior. The Journal of Financial and Quantitative Analysis, 27(2):
209–227, 1992.
J. M. Poterba and L. H. Summers. Mean reversion in stock prices: Evidence and
implications. Journal of Financial Economics, 22(1):27–59, 1988.
E. Qian, R. Hua, and E. Sorensen. Quantitative Equity Portfolio Management: Mod-
ern Techniques and Applications. Boca Raton: Chapman & Hall/CRC, 2007.
L. Rabiner and S. Levinson. Isolated and connected word recognition—theory and
selected applications. IEEE Transactions on Communications, 29(5):621–659,
1981.
T. Rakthanmanon, B. Campana, A. Mueen, G. Batista, B. Westover, Q. Zhu,
J. Zakaria, and E. Keogh. Searching and mining trillions of time series
subsequences under dynamic time warping. In Proceedings of the 18th ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining,
Beijing, China, 262–270, 2012.
M. R. Reinganum. The anomalous stock market behavior of small firms in January:
Empirical tests for tax-loss selling effects. Journal of Financial Economics,
12(1):89–104, 1983.
J. Rissanen. A universal data compression system. IEEE Transactions on
Information Theory, 29(5):656–663, 1983.
H. Sakoe and S. Chiba. Dynamic programming algorithm optimization for spoken
word recognition. In Readings in Speech Recognition, A. Waibel and K. Lee
(eds.), Morgan Kaufmann, San Mateo, 159–165, 1990.
S. Shalev-Shwartz. Online learning and online convex optimization. Foundations
and Trends in Machine Learning, 4(2):107–194, 2012.
W. F. Sharpe. A simplified model for portfolio analysis. Management Science, 9:
277–293, 1963.
W. F. Sharpe. Capital asset prices: A theory of market equilibrium under conditions
of risk. The Journal of Finance, 19(3):425–442, 1964.
W. F. Sharpe. Mutual fund performance. The Journal of Business, 39(1):119, 1966.
W. F. Sharpe. The Sharpe ratio. Journal of Portfolio Management, 21(1):49–58,
1994.

Y. Singer. Switching portfolios. International Journal of Neural Systems, 8(4):
488–495, 1997.
R. Srikant and R. Agrawal. Mining sequential patterns: Generalizations and perfor-
mance improvements. In Proceedings of the 5th International Conference of
Extending Database Technology, Avignon, France, 1–17, 1996.
G. Stoltz and G. Lugosi. Internal regret in on-line portfolio selection. Machine
Learning, 59(1–2):125–159, 2005.
Y. Takano and J.-y. Gotoh. Constant rebalanced portfolio optimization under
nonlinear transaction costs. Asia-Pacific Financial Markets, 18:191–211, 2011.
N. Taleb. Fooled by Randomness: The Hidden Role of Chance in Life and in the
Markets. New York: Random House, 2008.
F. E. H. Tay and L. Cao. Application of support vector machines in financial time
series forecasting. Omega, 29(4):309–317, 2001.
E. O. Thorp. Optimal gambling systems for favorable games. Review of the
International Statistical Institute, 37(3):273–293, 1969.
E. O. Thorp. Portfolio choice and the Kelly criterion. In Proceedings of the Business
and Economics Section of the American Statistical Association, Fort Collins,
Colorado, 599–619, 1971.
E. O. Thorp. The Kelly criterion in blackjack, sports betting, and the stock market.
In Proceedings of the International Conference on Gambling and Risk Taking,
Montreal, 1997.
R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal
Statistical Society (Series B), 58:267–288, 1996.
J. Ting, T. Fu, and F. Chung. Mining of stock data: Intra- and inter-stock pattern
associative classification. Threshold, 5(100):5–99, 2006.
E. Tsang, P. Yung, and J. Li. Eddie-automation, a decision support tool for financial
forecasting. Decision Support Systems, 37:559–565, 2004.
R. S. Tsay. Analysis of Financial Time Series. New York: Wiley, 2002.
I. Vajda. Analysis of semi-log-optimal investment strategies. In Proceedings of
Prague Stochastics, Prague, 2006.
V. Vovk. Derandomizing stochastic prediction strategies. In Proceedings of Annual
Conference on Computational Learning Theory, Nashville, Tennessee, 32–44,
1997.
V. Vovk. Derandomizing stochastic prediction strategies. Machine Learning, 35:
247–282, 1999.
V. Vovk. Competitive on-line statistics. International Statistical Review/Revue
Internationale de Statistique, 69(2):213–248, 2001.
V. G. Vovk. Aggregating strategies. In Proceedings of the Annual Conference on
Learning Theory, Rochester, NY, 371–383, 1990.
V. G. Vovk and C. Watkins. Universal portfolio selection. In Proceedings of the
Annual Conference on Learning Theory, Madison, WI, 12–23, 1998.
X. Wang, A. Mueen, H. Ding, G. Trajcevski, P. Scheuermann, and E. Keogh.
Experimental comparison of representation methods and distance measures
for time series data. Data Mining and Knowledge Discovery, 26(2):275–309,
2013.

R. J. Yan and C. X. Ling. Machine learning for stock selection. In Proceedings of
the ACM SIGKDD International Conference on Knowledge Discovery and
Data Mining, San Jose, CA, 1038–1042, 2007.
H. Yang, Z. Xu, I. King, and M. R. Lyu. Online learning for group lasso. In
Proceedings of the International Conference on Machine Learning, Haifa,
Israel, 1191–1198, 2010.
B.-K. Yi, H. Jagadish, and C. Faloutsos. Efficient retrieval of similar time sequences
under time warping. In Proceedings of the International Conference on Data
Engineering, Orlando, FL, 201–208, 1998.
W. Young. Calmar ratio: A smoother tool. Futures, 20(1):40, 1991.
W. Zhang and S. Skiena. Financial Analysis Using News Data. Technical report,
State University of New York at Stony Brook, 2008.
W. Zhang and S. Skiena. Trading strategies to exploit blog and news sentiment.
In Proceedings of the International AAAI Conference on Weblogs and Social
Media, Atlanta, 375–378, 2010.
Y. Zhou, R. Jin, and S. C. Hoi. Exclusive lasso for multi-task feature selection.
In Proceedings of the International Conference on Artificial Intelligence and
Statistics, Chia Laguna Resort, Sardinia, Italy, 988–995, 2010.
M. Zinkevich. Online convex programming and generalized infinitesimal gradient
ascent. In Proceedings of the International Conference on Machine Learning,
Washington, DC, 928–936, 2003.
H. Zou. The adaptive lasso and its oracle properties. Journal of the American
Statistical Association, 101:1418–1429, 2006.

