Vlad's Roam Garden

Powered by 🌱Roam Garden

Upper Confidence Bound algorithm

This is an "Orphan" page. Its core content has not been shared: what you see below is a loose collection of pages and page snippets that mention this page, as well as snippets of this page that were quoted elsewhere.

Referenced in

Algorithms to Live By: The Computer Science of Human Decisions

Second, knowing that you are using an optimal algorithm should be a relief even if you don’t get the results you were looking for. The 37% Rule fails 63% of the time. Maintaining your cache with LRU doesn’t guarantee that you will always find what you’re looking for; in fact, neither would clairvoyance. Using the Upper Confidence Bound algorithm approach to the explore/exploit tradeoff doesn’t mean that you will have no regrets, just that those regrets will accumulate ever more slowly as you go through life. Even the best strategy sometimes yields bad results — which is why computer scientists take care to distinguish between “process” and “outcome.” If you followed the best possible process, then you’ve done all you can, and you shouldn’t blame yourself if things didn’t go your way.
Outcomes make news headlines—indeed, they make the world we live in—so it’s easy to become fixated on them. But processes are what we have control over.