We present an anytime adaptive sampling technique that generates paths to efficiently measure, and then mathematically model, a scalar field by taking non-uniform measurements in a given region of interest. In particular, we are interested in scalar fields that describe a physical or virtual parameter varying with location, such as the depth of the sea floor or the probability of finding a lost object. As measurements are collected at each sampling location, we can compute an estimate of the large-scale variation of the phenomenon of interest. We compute a sampling path that minimizes the expected time to accurately model the phenomenon by visiting high-information regions, using non-myopic path generation based on reinforcement learning.