Andrew Wagenmaker, Kevin Jamieson: Instance-Dependent Near-Optimal Policy Identification in Linear MDPs via Online Experiment Design. NeurIPS 2022