When a network of vision-based sensors is emplaced in an environment for applications such as surveillance or monitoring, the spatial relationships between the sensing units must be inferred or computed for self-calibration purposes. In this paper we describe a technique to solve one aspect of this self-calibration problem: automatically determining the topology and connectivity of a network of cameras from a statistical analysis of observed motion in the environment. While the technique can use labels from reliable camera systems, the algorithm is powerful enough to function with ambiguous tracking data. The method requires no prior knowledge of the relative locations of the cameras and operates under very weak environmental assumptions. Our approach stochastically samples plausible agent trajectories based on a delay model that allows for transitions to and from sources and sinks in the environment. The technique demonstrates considerable robustness to both sensor error and non-trivial patterns of agent motion. The output of the method is a Markov model describing the behavior of agents in the system and the underlying traffic patterns. The concept is demonstrated with simulation data for systems containing up to ten agents and verified with experiments conducted on a six-camera sensor network.
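To make the form of the output concrete, the following is a minimal sketch of the simplest case only: the one in which reliable labels are available, so that each agent's sequence of camera observations is already known. It is not the stochastic trajectory sampling described above and it ignores the delay model. It only shows how first-order Markov transition probabilities, and from them a set of directed camera-to-camera edges, could be counted from labeled sequences. The names `estimate_transitions`, `topology`, `SOURCE_SINK`, and the `threshold` parameter are hypothetical and chosen for illustration.

```python
from collections import defaultdict

# Hypothetical label for the combined source/sink state (assumption:
# every agent enters from and leaves to a region no camera observes).
SOURCE_SINK = "source/sink"


def estimate_transitions(trajectories):
    """Estimate first-order Markov transition probabilities between cameras.

    trajectories: list of per-agent camera-ID sequences in time order.
    Each sequence is padded with SOURCE_SINK at both ends, so entering
    and leaving the monitored area are counted as transitions too.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for seq in trajectories:
        padded = [SOURCE_SINK, *seq, SOURCE_SINK]
        for a, b in zip(padded, padded[1:]):
            counts[a][b] += 1

    # Normalize each row of counts into a probability distribution.
    probs = {}
    for a, row in counts.items():
        total = sum(row.values())
        probs[a] = {b: n / total for b, n in row.items()}
    return probs


def topology(probs, threshold=0.1):
    """Return the directed edges whose probability is at least threshold."""
    return sorted(
        (a, b)
        for a, row in probs.items()
        for b, p in row.items()
        if p >= threshold
    )


# Toy labeled data: three agents observed by cameras A, B, and C.
trajectories = [["A", "B", "C"], ["A", "B"], ["B", "C"]]
probs = estimate_transitions(trajectories)
edges = topology(probs, threshold=0.5)
```

With ambiguous tracking data the per-agent sequences are not known in advance, which is the gap the sampling approach of the paper is meant to fill; the counting step here would then apply to sampled trajectories rather than observed ones.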