Multi-armed bandit
reinforcement learning problem exemplifying the exploration–exploitation tradeoff
Pronunciation
/ˈmʌlti-ɑrmd ˈbændɪt/
Categories
mathematical problem
optimization problem