2025

prob
Greedy Selection under Independent Increments: A Toy Model Analysis
Huitao Yang
2025, arXiv

ml
In-Context Curiosity: Distilling Exploration for Decision-Pretrained Transformers on Bandit Tasks
Huitao Yang and Guanting Chen
2025, arXiv

2024

NeurIPS
Fast Best-of-N Decoding via Speculative Rejection
Hanshi Sun, Momin Haider, Ruiqi Zhang, and 6 more authors
2024, arXiv