Metadata-Version: 1.1
Name: pyaixi
Version: 1.0.4.post1
Summary: A pure Python implementation of the Monte Carlo-AIXI-Context Tree Weighting (MC-AIXI-CTW) artificial intelligence algorithm.
Home-page: https://github.com/sgkasselau/pyaixi
Author: SG Kassel
Author-email: UNKNOWN
License: Creative Commons Attribution-ShareAlike 3.0 Unported License
Description: pyaixi
        ======
        
        Description
        -----------
        
        A pure Python implementation of the Monte Carlo-AIXI-Context Tree Weighting (MC-AIXI-CTW)
        artificial intelligence algorithm.
        
        This is an approximation of the AIXI universal artificial intelligence algorithm, which
        describes a model-based, reinforcement-learning agent capable of general learning.
        
        
        A more in-depth description of the MC-AIXI-CTW algorithm can be found here:
        
            J.Veness, K.S.Ng, M.Hutter, W.Uther, D.Silver,
            A Monte Carlo AIXI Approximation,
            Journal of Artificial Intelligence Research, 40 (2011) 95-142
            http://dx.doi.org/10.1613/jair.3125
            Free TechReport version: http://arxiv.org/abs/0909.0801
            BibTeX: http://www.hutter1.net/official/bib.htm#aixictwx
        
        
        Motivation
        ----------
        
        Providing a pure Python implementation of the MC-AIXI-CTW algorithm is intended to:
        
        - help make the implementation of AIXI-approximate algorithms more accessible to people
          without a C++ background
        
        - permit easier use of the MC-AIXI-CTW algorithm (and components) in other Python-based
          AI projects, and
        
        - permit faster prototyping of new AIXI-approximate algorithms via Python's comparative
          linguistic simplicity.
        
        
        Getting started
        ---------------
        
        To try the example `Rock Paper Scissors` environment, run the following in the
        base directory of this package.
        
        From the Linux/Unix/Mac console:
        
            python aixi.py -v conf/rock_paper_scissors_fast.conf
        
        
        On Windows:
        
            python aixi.py -v conf\rock_paper_scissors_fast.conf
        
        
        Or if you have PyPy (e.g. version 1.9) installed on Linux/Unix/Mac:
        
            pypy-c1.9 aixi.py -v conf/rock_paper_scissors_fast.conf
        
        
        NOTE: it is highly recommended to use the PyPy http://pypy.org Python interpreter to
        run code from this package, as this typically provides an order-of-magnitude run-time
        improvement over using the standard CPython interpreter.
        
        (This is unfortunately still an order of magnitude slower than the C++ version, though.)
        
        
        This example will perform 500 interactions of the agent with the environment, with the agent
        exploring the environment by trying permitted actions at random, and learning from
        the related observations and rewards.
        
        Then, the agent will use what it has learnt to maximise its reward in the following
        500 interactions. (Exploration is typically quite quick, while using that gained knowledge
        to choose the best action possible is typically much slower.)
        
        
        For this particular environment, an average reward greater than 1 means the agent is winning
        more than it is losing.
        
        (A score ranging from 1.02 to 1.04 is typical, depending on the random seed given.)
        
        
        Further example environments can be found in the `environments` directory:
        
         - coin_flip - A simulation of a biased coin flip
         - extended_tiger - An extended version of the Tiger-or-Gold door choice problem.
         - kuhn_poker - A simplified, zero-sum version of poker.
         - maze - A two-dimensional maze.
         - rock_paper_scissors - Rock Paper Scissors.
         - tic_tac_toe - Tic Tac Toe
         - tiger - A choice between two doors. One door hides gold; the other, a tiger.
        
        Similarly-named environment configuration files for these environments can be found in the
        `conf` directory, and run by replacing `rock_paper_scissors_fast.conf` in the commands
        listed above with the name of the appropriate configuration file.
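        
        As a rough illustration of swapping in another configuration file, the sketch below
        builds the same command line as the console examples above for any file in `conf`.
        It is not part of this package: the helper name `build_command` and its parameters
        are assumptions made for this example, and only `aixi.py`, `-v`, the `conf` directory
        and `rock_paper_scissors_fast.conf` come from the instructions above.

```python
import os
import sys


def build_command(conf_name, verbose=True, interpreter=None):
    """Return the argument list for running aixi.py with one configuration file.

    Hypothetical helper for illustration only; it is not provided by pyaixi.
    """
    # os.path.join uses the separator of the current platform,
    # giving conf/<name> on Linux/Unix/Mac and conf\<name> on Windows.
    conf_path = os.path.join("conf", conf_name)
    # Default to the interpreter running this script; pass another
    # executable name (for example a PyPy binary) to override it.
    command = [interpreter or sys.executable, "aixi.py"]
    if verbose:
        command.append("-v")
    command.append(conf_path)
    return command


# Show the command that would be run for the example configuration.
print(" ".join(build_command("rock_paper_scissors_fast.conf")))
```

        The returned list can be handed to `subprocess.call` from the base directory of this
        package, which avoids hand-editing the path separator for each platform.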
        
        
        Script usage
        ------------
        
            Usage: python aixi.py [-a | --agent <agent module name>]
                                  [-d | --explore-decay <exploration decay value, between 0 and 1>]
                                  [-e | --environment <environment module name>]
                                  [-h | --agent-horizon <search horizon>]
                                  [-l | --learning-period <cycle count>]
                                  [-m | --mc-simulations <number of simulations to run each step>]
                                  [-o | --option <extra option name>=<value>]
                                  [-p | --profile]
                                  [-r | --terminate-age <number of cycles before stopping the run>]
                                  [-t | --ct-depth <maximum depth of predicting context tree>]
                                  [-x | --exploration <exploration factor, greater than 0>]
                                  [-v | --verbose]
                                  [<environment configuration file name to load>]
        
        
        Adding new environments
        -----------------------
        
        The environments in the `environments` directory all inherit from
        a base class, `environment.Environment`, found in the base package directory.
        
        New environments will need to inherit this class, and provide the methods
        of this class (as well as any internal logic) to interact with the agent.
        
        You'll also need to construct a new configuration file for this environment,
        making sure to give the name of your new environment in the `environment` key.
        
        
        Adding new agents
        -----------------
        
        The only (for now) provided agent class can be found in the `agent` directory:
        
         - mc_aixi_ctw - an agent implementing the Monte Carlo-AIXI-Context Tree Weighting algorithm.
        
        
        The prediction algorithm used by this agent can be found in the `prediction` directory:
        
         - ctw_context_tree - an implementation of Context Tree Weighting context trees.
        
        
        The search algorithm used is found in the `search` directory:
        
         - monte_carlo_search_tree - an implementation of Monte Carlo search trees.
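        
        A Monte Carlo search has to balance actions that already look rewarding against
        actions that have rarely been sampled. The sketch below shows a generic UCB1-style
        selection rule of the kind such searches commonly use. It is not taken from
        `monte_carlo_search_tree`: the function name `ucb1_choose`, the layout of the
        `stats` argument and the default exploration constant are all assumptions made
        for this example.

```python
import math


def ucb1_choose(stats, exploration=1.4):
    """Pick an action from a dict of {action: (visit_count, total_reward)}.

    Illustrative only; the names and data layout are not pyaixi's.
    """
    # An action that has never been tried is chosen first,
    # so every visit count is positive in the scoring below.
    for action, (visits, _) in stats.items():
        if visits == 0:
            return action

    total_visits = sum(visits for visits, _ in stats.values())

    def score(action):
        visits, total_reward = stats[action]
        mean_reward = total_reward / float(visits)
        # The bonus shrinks as an action is sampled more often.
        bonus = exploration * math.sqrt(math.log(total_visits) / visits)
        return mean_reward + bonus

    return max(stats, key=score)
```

        With equal visit counts the rule simply prefers the higher mean reward; a larger
        exploration constant shifts the choice towards rarely-sampled actions.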
        
        
        New agents need to inherit from the base `agent.Agent` class, and provide the methods
        listed within to interact with the currently-configured environment.
        
        To use your own agent instead of the default `mc_aixi_ctw` agent in a configuration file,
        use the `agent` key to specify the Python module name of your agent.
        
        Alternatively, you can override the default/the configuration file value, by using
        the '-a'/'--agent' option on the command line.
        
        
        Similar projects
        ----------------
        
        This package is based on the C++ implementation of the MC-AIXI-CTW algorithm seen here:
        
            https://github.com/moridinamael/mc-aixi
        
        
        Another implementation of MC-AIXI-CTW can be found here:
        
            Joel Veness's personal page: http://jveness.info/software/default.html
        
        
        License
        -------
        
        Creative Commons Attribution ShareAlike 3.0 Unported. (CC BY-SA 3.0)
        
        Please see `LICENSE.txt` for details.
        
        If permitted in your legal domain (as this package is arguably a substantive
        derivative of another CC BY-SA 3.0 package, hence the licensing terms above,
        and the legal compatibility of CC BY-SA 3.0 with other open-source licences is currently
        unknown), the author of this package permits alternate licensing under your
        choice of either the LGPL 3.0 or the GPL 3.0.
        
        
        Contact the author
        ------------------
        
        For further assistance or to offer constructive feedback, please contact the author,
        SG Kassel, via:
        
            sg_dot_kassel_dot_au_at_gmail_dot_com
Keywords: AIXI,agent,artificial intelligence,general learning,machine learning,MC-AIXI-CTW,model based,reinforcement learning,prediction,search
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: License :: Freely Distributable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Science/Research
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.6
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.2
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Python Modules