---
title: 'mlpack 3: a fast, flexible machine learning library'
tags:
- machine learning
- deep learning
- c++
- optimization
- template metaprogramming

authors:
- name: Ryan R. Curtin
  orcid: 0000-0002-9903-8214
  affiliation: 1

- name: Marcus Edel
  orcid: 0000-0001-5445-7303
  affiliation: 2

- name: Mikhail Lozhnikov
  orcid: 0000-0002-8727-0091
  affiliation: 3

- name: Yannis Mentekidis
  orcid: 0000-0003-3860-9885
  affiliation: 5

- name: Sumedh Ghaisas
  orcid: 0000-0003-3753-9029
  affiliation: 5

- name: Shangtong Zhang
  orcid: 0000-0003-4255-1364
  affiliation: 4

affiliations:
- name: Center for Advanced Machine Learning, Symantec Corporation
  index: 1
- name: Institute of Computer Science, Free University of Berlin
  index: 2
- name: Moscow State University, Faculty of Mechanics and Mathematics
  index: 3
- name: University of Alberta
  index: 4
- name: None
  index: 5

date: 5 April 2018
bibliography: paper.bib
---

# Summary

In the past several years, the field of machine learning has seen an explosion
of interest and excitement, with hundreds or thousands of algorithms developed
for different tasks every year.  But a primary challenge faced by the field is
scaling to larger and larger datasets, since training on larger datasets
typically produces better results [@halevy2009unreasonable].  The continued
growth of the field therefore depends largely on good tooling and libraries
that enable researchers and practitioners to quickly prototype and develop
solutions [@sonnenburg2007need].  At the same time, useful libraries must also
be efficient and well-implemented.  This has motivated our development of
mlpack.

mlpack is a flexible and fast machine learning library written in C++, with
bindings that allow use from the command line and from Python; support for
other languages is in active development.  mlpack has been developed actively
for over 10 years [@mlpack2011; @mlpack2013], with over 100 contributors from
around the world, and is a frequent mentoring organization in the Google Summer
of Code program (\url{https://summerofcode.withgoogle.com}).  When used from
C++, the library offers flexibility with no speed penalty through policy-based
design and template metaprogramming [@alexandrescu2001modern]; bindings to
other languages allow easy use of the fast mlpack codebase.
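
To illustrate how policy-based design can give flexibility without a runtime
cost, the following is a minimal, generic C++ sketch.  It is not taken from the
mlpack source: the names `SquaredEuclidean`, `Manhattan`, and `NearestIndex`
are hypothetical and chosen only for illustration.  The distance metric is
passed as a template parameter, so the call to `Evaluate()` is resolved at
compile time and no virtual dispatch is needed.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical metric policy: squared Euclidean distance.
struct SquaredEuclidean
{
  static double Evaluate(const std::vector<double>& a,
                         const std::vector<double>& b)
  {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
      sum += (a[i] - b[i]) * (a[i] - b[i]);
    return sum;
  }
};

// Hypothetical metric policy: Manhattan (L1) distance.
struct Manhattan
{
  static double Evaluate(const std::vector<double>& a,
                         const std::vector<double>& b)
  {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
      sum += std::abs(a[i] - b[i]);
    return sum;
  }
};

// Brute-force nearest neighbor search.  The metric is a compile-time policy:
// any type with a compatible static Evaluate() can be plugged in.
template<typename MetricType = SquaredEuclidean>
std::size_t NearestIndex(const std::vector<std::vector<double>>& points,
                         const std::vector<double>& query)
{
  assert(!points.empty());
  std::size_t best = 0;
  double bestDistance = MetricType::Evaluate(points[0], query);
  for (std::size_t i = 1; i < points.size(); ++i)
  {
    const double distance = MetricType::Evaluate(points[i], query);
    if (distance < bestDistance)
    {
      bestDistance = distance;
      best = i;
    }
  }
  return best;
}
```

Under these assumptions, a caller selects the metric by type, for example
`NearestIndex<Manhattan>(points, query)`, and a user-defined metric works the
same way as long as it provides a matching `Evaluate()`.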

For fast linear algebra, mlpack is built on the Armadillo C++ matrix library
[@sanderson2016armadillo], which in turn can use an optimized BLAS
implementation such as OpenBLAS [@xianyi2018openblas] or even NVBLAS
[@nvblas], which would allow mlpack algorithms to run on the GPU.  To provide
fast code, template metaprogramming is used throughout the library to reduce
runtime overhead by performing computations and optimizations at compile time
wherever possible.  An automatic benchmarking system has been developed and is
used to test the efficiency of mlpack's algorithms [@edel2014automatic].
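
As a small illustration of moving work to compile time, the sketch below uses
a recursive template to compute an integer power in the compiler.  It is a
generic example rather than code from mlpack; `Power` and `ChildIndex` are
hypothetical names, and the tree-node use case is assumed purely for
illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Recursive template: Power<Base, Exponent>::value is evaluated by the
// compiler, so no multiplication happens at runtime.
template<std::size_t Base, std::size_t Exponent>
struct Power
{
  static constexpr std::size_t value = Base * Power<Base, Exponent - 1>::value;
};

// Base case that terminates the recursion.
template<std::size_t Base>
struct Power<Base, 0>
{
  static constexpr std::size_t value = 1;
};

static_assert(Power<2, 10>::value == 1024, "computed at compile time");

// Hypothetical helper: for a tree node that splits Dim-dimensional space into
// 2^Dim children, return the index of the child containing `point`.
template<std::size_t Dim>
std::size_t ChildIndex(const std::array<double, Dim>& point,
                       const std::array<double, Dim>& center)
{
  constexpr std::size_t numChildren = Power<2, Dim>::value;
  std::size_t index = 0;
  for (std::size_t d = 0; d < Dim; ++d)
  {
    // Set bit d when the point lies on the upper side of the center.
    if (point[d] >= center[d])
      index |= (std::size_t(1) << d);
  }
  assert(index < numChildren);
  return index;
}
```

Because `Dim` is a template parameter, the number of children is a
compile-time constant and the loop bound is known to the compiler, which can
then unroll or simplify the loop.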

mlpack contains a number of standard machine learning algorithms, such as
logistic regression, random forests, and k-means clustering.  It also contains
cutting-edge techniques such as a compile-time optimized deep learning and
reinforcement learning framework, dual-tree algorithms for nearest neighbor
search and other tasks [@curtin2013tree], a generic optimization framework with
numerous optimizers [@curtin2017generic], a generic hyper-parameter tuner, and
other recently published machine learning algorithms.
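
The idea behind a generic optimization framework can be sketched as follows.
This is a simplified illustration and not mlpack's actual interface, which
operates on Armadillo matrices: the classes `ShiftedQuadratic` and
`GradientDescent`, and the use of `std::vector<double>` for parameters, are
assumptions made to keep the example self-contained.  The optimizer is
templated on the function type, so it works with any objective that provides
`Evaluate()` and `Gradient()`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical objective: f(x) = sum_i (x_i - target_i)^2.
class ShiftedQuadratic
{
 public:
  explicit ShiftedQuadratic(std::vector<double> t) : target(std::move(t)) { }

  double Evaluate(const std::vector<double>& x) const
  {
    assert(x.size() == target.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
      sum += (x[i] - target[i]) * (x[i] - target[i]);
    return sum;
  }

  void Gradient(const std::vector<double>& x,
                std::vector<double>& gradient) const
  {
    assert(x.size() == target.size());
    gradient.resize(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
      gradient[i] = 2.0 * (x[i] - target[i]);
  }

 private:
  std::vector<double> target;
};

// Hypothetical fixed-step gradient descent optimizer.
class GradientDescent
{
 public:
  GradientDescent(const double stepSize, const std::size_t maxIterations) :
      stepSize(stepSize), maxIterations(maxIterations) { }

  // Works with any FunctionType that has Evaluate() and Gradient().
  // Updates x in place and returns the final objective value.
  template<typename FunctionType>
  double Optimize(const FunctionType& function, std::vector<double>& x) const
  {
    std::vector<double> gradient(x.size());
    for (std::size_t it = 0; it < maxIterations; ++it)
    {
      function.Gradient(x, gradient);
      for (std::size_t i = 0; i < x.size(); ++i)
        x[i] -= stepSize * gradient[i];
    }
    return function.Evaluate(x);
  }

 private:
  double stepSize;
  std::size_t maxIterations;
};
```

With this separation, a new objective needs no changes to the optimizer, and a
new optimizer can be applied to every existing objective.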

For a more comprehensive introduction to mlpack, see the website at
\url{http://www.mlpack.org/} or a recent paper detailing the design and
structure of mlpack [@curtin2017designing].

# References
