The server still uses only one thread, but it can handle multiple clients simultaneously by time-multiplexing among them. Within the network listening loop it repeatedly polls the network connections (in UNIX the select(2) call would be used for this purpose), and when it determines that one is ready to send or receive data it checks whether the event is a new connection request (on the listening port) or activity on an already established connection: in the former case it accepts the connection from the new client; in the latter it sends or receives data as appropriate.
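A minimal sketch of this loop in Python, where select.select plays the role of the select(2) call; the echo behavior, the function name, and the rounds cap are illustrative choices, not part of the original description:

```python
import select
import socket

def run_echo_server(listener, rounds):
    """Single-threaded multiplexing loop: one thread serves the
    listening socket and every accepted connection by asking
    select() which of them is ready."""
    connections = {listener}
    for _ in range(rounds):
        readable, _, _ = select.select(connections, [], [], 0.5)
        for sock in readable:
            if sock is listener:
                # A new connection request on the listening port:
                # accept it and add the new socket to the watched set.
                conn, _addr = sock.accept()
                connections.add(conn)
            else:
                # An already-connected client is ready: receive its
                # data and echo it back; an empty read means the
                # client closed the connection.
                data = sock.recv(4096)
                if data:
                    sock.sendall(data)
                else:
                    connections.remove(sock)
                    sock.close()
```

Because select() reports readiness rather than blocking on any single client, two clients connected at once are served interleaved by the same thread.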
This scheme can support a powerful service, because it takes advantage of the time delays in the clients and in the network: a client may need to keep a connection open for a long time (because of the overhead of closing it and opening a new one when needed), but use it only in short bursts. In between, the server can select and serve other connections, or create new ones. The selection itself does not come for free, however: it is acceptable when CPU use is not an issue, but becomes a problem when the service requires high performance.