000 00613nam a22002177a 4500
005 20240903161803.0
008 240903b |||||||| |||| 00| 0 eng d
020 _a9780262039246
082 _a006.3 SUT
_bSUT- R
100 _aSutton, Richard S.
245 _aReinforcement Learning
250 _a2nd ed
260 _aUSA
_bMIT Press
_c2018
300 _axxii, 526 p.
650 _vTabular Solution Method
_xMonte Carlo Method
651 _aMonte Carlo Method
700 _aBarto, Andrew G.
760 _b2nd ed
942 _cBK
942 _e2nd ed
_h006.3 SUT- R
_iSUT- R
_kSUT
_mR
999 _c12073
_d12073