<!DOCTYPE html>
<html class="writer-html5" lang="en" data-content_root="./">
<head>
<meta charset="utf-8" /><meta name="viewport" content="width=device-width, initial-scale=1" />
<title>About Multi-Armed Bandits — MABWiser 2.7.4 documentation</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css?v=80d5e7a1" />
<link rel="stylesheet" type="text/css" href="_static/css/theme.css?v=19f00094" />
<!--[if lt IE 9]>
<script src="_static/js/html5shiv.min.js"></script>
<![endif]-->
<script src="_static/jquery.js?v=5d32c60e"></script>
<script src="_static/_sphinx_javascript_frameworks_compat.js?v=2cd50e6c"></script>
<script src="_static/documentation_options.js?v=e8140b17"></script>
<script src="_static/doctools.js?v=888ff710"></script>
<script src="_static/sphinx_highlight.js?v=dc90522c"></script>
<script src="_static/js/theme.js"></script>
<link rel="author" title="About these documents" href="#" />
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="Installation" href="installation.html" />
<link rel="prev" title="MABWiser Contextual Multi-Armed Bandits" href="index.html" />
</head>
<body class="wy-body-for-nav">
<div class="wy-grid-for-nav">
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
<div class="wy-side-scroll">
<div class="wy-side-nav-search" >
<a href="index.html" class="icon icon-home">
MABWiser
</a>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="search.html" method="get">
<input type="text" name="q" placeholder="Search docs" aria-label="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
<ul class="current">
<li class="toctree-l1 current"><a class="current reference internal" href="#">About Multi-Armed Bandits</a></li>
<li class="toctree-l1"><a class="reference internal" href="installation.html">Installation</a></li>
<li class="toctree-l1"><a class="reference internal" href="quick.html">Quick Start</a></li>
<li class="toctree-l1"><a class="reference internal" href="examples.html">Usage Examples</a></li>
<li class="toctree-l1"><a class="reference internal" href="contributing.html">Contributing</a></li>
<li class="toctree-l1"><a class="reference internal" href="new_bandit.html">Adding a New Bandit</a></li>
<li class="toctree-l1"><a class="reference internal" href="api.html">MABWiser Public API</a></li>
</ul>
</div>
</div>
</nav>
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" >
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
<a href="index.html">MABWiser</a>
</nav>
<div class="wy-nav-content">
<div class="rst-content">
<div role="navigation" aria-label="Page navigation">
<ul class="wy-breadcrumbs">
<li><a href="index.html" class="icon icon-home" aria-label="Home"></a></li>
<li class="breadcrumb-item active">About Multi-Armed Bandits</li>
<li class="wy-breadcrumbs-aside">
<a href="_sources/about.rst.txt" rel="nofollow"> View page source</a>
</li>
</ul>
<hr/>
</div>
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<section id="about-multi-armed-bandits">
<span id="about"></span><h1>About Multi-Armed Bandits<a class="headerlink" href="#about-multi-armed-bandits" title="Link to this heading"></a></h1>
<p>There are many real-world situations in which we must decide between multiple options, yet can only learn the best course of action by testing each option sequentially.</p>
<p><strong>Multi-armed bandit (MAB)</strong> algorithms are suited to such sequential, online decision-making problems under uncertainty.
As such, they play an important role in many machine learning applications, including internet advertising, recommendation engines, and clinical trials.</p>
<div class="admonition-exploration-vs-exploitation admonition">
<p class="admonition-title">Exploration vs. Exploitation</p>
<p>In this setting, every renewed decision poses an underlying question: do we stick with what we know and receive an expected result (“<strong>exploit</strong>”), or choose an option we know little about and potentially learn something new (“<strong>explore</strong>”)?</p>
</div>
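<p>The trade-off above can be sketched with a minimal epsilon-greedy rule: with a small probability we explore a random arm, and otherwise we exploit the arm with the best estimated reward. The arm estimates, epsilon value, and seed below are illustrative assumptions, not part of this documentation.</p>

```python
import random

def epsilon_greedy_choice(mean_rewards, epsilon, rng):
    """Return an arm index: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(mean_rewards))  # explore: pick any arm uniformly
    # exploit: pick the arm with the highest estimated mean reward
    return max(range(len(mean_rewards)), key=mean_rewards.__getitem__)

rng = random.Random(42)
estimates = [0.2, 0.8, 0.5]  # illustrative estimated mean rewards per arm
choices = [epsilon_greedy_choice(estimates, 0.1, rng) for _ in range(1000)]
print(choices.count(1))  # the best-known arm (index 1) dominates
```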
<p><strong>Problem Definition:</strong> In a multi-armed bandit problem, the model of outcomes is unknown, and the outcomes can be deterministic
or stochastic. The agent must make a sequence of decisions over time <em>1, 2, …, T</em>.
At each time <em>t</em>, the agent is given a set of <em>K</em> arms and must decide which arm to pull.
After pulling an arm, it receives the reward of that arm; the rewards of the other arms remain unobserved.
In the stochastic setting, the reward of an arm is sampled from some unknown distribution. In some situations we also observe side information at each time <em>t</em>.
This side information is referred to as the <em>context</em>. The arm with the highest expected reward may differ across contexts.
This variant is called <strong>contextual multi-armed bandits</strong>. Overall, the objective is to maximize the cumulative expected reward in the long run.</p>
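<p>The definition above can be turned into a short simulation: <em>K</em> stochastic (Bernoulli) arms whose success probabilities are unknown to the agent, <em>T</em> rounds with one pull per round, and the cumulative reward tallied. The probabilities, horizon, and epsilon-greedy agent below are illustrative assumptions, not prescribed by the text.</p>

```python
import random

rng = random.Random(0)
true_probs = [0.3, 0.7, 0.5]          # unknown to the agent (illustrative values)
T = 2000                              # decision horizon
counts = [0] * len(true_probs)        # number of pulls per arm
values = [0.0] * len(true_probs)      # running mean reward per arm
cumulative_reward = 0.0

for t in range(T):
    # epsilon-greedy agent: explore with small probability, otherwise exploit
    if rng.random() < 0.1:
        arm = rng.randrange(len(true_probs))
    else:
        arm = max(range(len(true_probs)), key=values.__getitem__)
    # stochastic Bernoulli reward; only the pulled arm's reward is observed
    reward = 1.0 if rng.random() < true_probs[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean update
    cumulative_reward += reward

print(cumulative_reward / T)  # tends toward the best arm's rate minus an exploration cost
```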
<hr class="docutils" />
<p>For more information, we refer to these excellent resources:</p>
<ol class="arabic simple">
<li><p><a class="reference external" href="http://proceedings.mlr.press/v9/lu10a/lu10a.pdf">Contextual Multi-Armed Bandits</a>, Tyler Lu <em>et al.</em>, Proc. of Machine Learning Research, 2010</p></li>
<li><p><a class="reference external" href="https://arxiv.org/pdf/1508.03326.pdf">A Survey on Contextual Multi-armed Bandits</a>, Li Zhou, Carnegie Mellon University, arXiv, 2016</p></li>
</ol>
</section>
</div>
</div>
<footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
<a href="index.html" class="btn btn-neutral float-left" title="MABWiser Contextual Multi-Armed Bandits" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="installation.html" class="btn btn-neutral float-right" title="Installation" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
</div>
<hr/>
<div role="contentinfo">
<p>© Copyright, FMR LLC.</p>
</div>
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script>
jQuery(function () {
SphinxRtdTheme.Navigation.enable(true);
});
</script>
</body>
</html>