We present a method for learning a set of generative models that represent the appearance variation of selected image-domain features of a scene as a function of camera viewpoint. Such models are important for robotic tasks such as probabilistic position estimation (i.e., localization), and also for visualization. Our approach entails the selection of image-domain features and the synthesis of models of their visual behavior. The proposed model can generate maximum-likelihood views of the automatically selected features, as well as a measure of the likelihood of observing a particular view from a particular camera position. Training the models involves regularizing observations of the features collected from known camera locations, and the uncertainty of each model is evaluated using cross-validation. The features themselves are initially selected automatically as salient points by a measure of visual attention, and are then tracked across multiple views. While the motivation for this work is robot localization, the results have implications for image interpolation, virtual scene reconstruction, and object recognition. This paper presents a formulation of the problem and illustrative experimental results.
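As a rough illustration of the kind of model described above, the sketch below fits a regularized radial-basis regression from camera pose to the appearance of one tracked feature, estimates its noise level by leave-one-out cross-validation, and evaluates the likelihood of a candidate view from a query pose. The class and function names are hypothetical, and the specific basis, regularizer, and Gaussian noise model are assumptions for illustration rather than the paper's actual formulation.

```python
# Illustrative sketch only: regularized RBF regression from camera pose to
# feature appearance, with leave-one-out cross-validation for uncertainty.
# Names (rbf_features, FeatureModel, ...) are hypothetical, not from the paper.
import numpy as np

def rbf_features(poses, centers, width):
    """Gaussian radial-basis expansion of camera poses (N x d) -> (N x K)."""
    d2 = ((poses[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * width ** 2))

class FeatureModel:
    """Maps a camera pose to the expected appearance (pixel vector) of one
    tracked scene feature, learned from observations at known poses."""

    def __init__(self, width=0.5, ridge=1e-2):
        self.width, self.ridge = width, ridge

    def fit(self, poses, patches):
        # poses: (N, d) known camera positions; patches: (N, P) observed views.
        self.centers = poses
        Phi = rbf_features(poses, self.centers, self.width)
        A = Phi.T @ Phi + self.ridge * np.eye(Phi.shape[1])  # regularized normal equations
        self.W = np.linalg.solve(A, Phi.T @ patches)          # (K, P) weights
        # Leave-one-out cross-validation residuals give a crude noise estimate.
        H = Phi @ np.linalg.solve(A, Phi.T)                   # hat matrix
        resid = (patches - Phi @ self.W) / (1.0 - np.diag(H))[:, None]
        self.sigma2 = float(np.mean(resid ** 2))
        return self

    def predict(self, pose):
        """Maximum-likelihood (expected) view of the feature at a query pose."""
        phi = rbf_features(pose[None, :], self.centers, self.width)
        return (phi @ self.W)[0]

    def log_likelihood(self, pose, patch):
        """Gaussian log-likelihood of observing `patch` from `pose`."""
        mu = self.predict(pose)
        n = patch.size
        return -0.5 * (np.sum((patch - mu) ** 2) / self.sigma2
                       + n * np.log(2 * np.pi * self.sigma2))
```

In a localization setting, one such model per feature could be queried with candidate camera poses, and the resulting log-likelihoods combined to weight pose hypotheses; that usage pattern is an assumption consistent with, but not spelled out by, the abstract.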