This is part of the Semicolon&Sons Code Diary - consisting of lessons learned on the job. You're in the unix category.
Last Updated: 2025-01-18
There is a Unix system call, select(). What does select do? When do you need it?
Here's the motivating example, containing code I could not understand:
def start_wait_thread(pid, child)
  # Some operating systems retain the status of terminated child processes until
  # the parent collects that status (normally using some variant of wait()). If
  # the parent never collects this status, the child stays around as a zombie
  # process. Process::detach prevents this by setting up a separate Ruby thread
  # whose sole job is to reap the status of the process pid when it terminates.
  # Use detach only when you do not intend to explicitly wait for the child to
  # terminate.
  Process.detach(pid)

  # This was defined as being a thread which rescues any exceptions that occur
  # within so they don't bubble up
  Spring.failsafe_thread {
    # The `child.recv` call can raise an ECONNRESET, killing the thread, but that's ok
    # as if it does we're no longer interested in the child
    loop do
      # block until the socket is readable (data has arrived or the peer closed)
      IO.select([child])
      # peek at one byte on the socket without consuming it; an empty
      # result means the child closed its end
      break if child.recv(1, Socket::MSG_PEEK).empty?
      sleep 0.01
    end

    log "child #{pid} shutdown"

    synchronize {
      if @pid == pid
        @pid = nil
        restart
      end
    }
  }
end
From the man page on Linux:
int select(int nfds, fd_set *readfds, fd_set *writefds,
fd_set *exceptfds, struct timeval *timeout);
select() allows a program to monitor multiple file descriptors, waiting until
one or more of the file descriptors become "ready" for some class of I/O
operation (e.g., input possible). A file descriptor is considered ready if it
is possible to perform a corresponding I/O operation (e.g., read(2), or a
sufficiently small write(2)) without blocking.
The principal arguments of select() are three "sets" of file descriptors
(declared with the type fd_set), which allow the caller to wait for three
classes of events on the specified set of file descriptors. Each of the fd_set
arguments may be specified as NULL if no file descriptors are to be watched for
the corresponding class of events.
The nfds parameter (i.e. the first parameter) is the highest-numbered file
descriptor in any of the sets, plus 1.
- readfds. Watched for reading
- writefds. Watched for writing
- exceptfds. Watched for "exceptional conditions".
The timeout argument is a timeval structure that specifies the
interval that select() should block waiting for a file descriptor to become
ready. The call will block until either:
· a file descriptor becomes ready;
· the call is interrupted by a signal handler; or
· the timeout expires.
On return, select() replaces the given descriptor sets with subsets consisting
of those descriptors that are ready for the requested operation. select()
returns the total number of ready descriptors in all the sets.
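As a minimal sketch of those semantics, here is Python's wrapper, select.select(). It takes three lists instead of fd_set pointers and returns three lists holding the ready subsets. The helper name `wait_readable` and the use of socket.socketpair() as a stand-in for a real connection are my own choices for illustration.

```python
import select
import socket


def wait_readable(sock, timeout):
    """Return True if sock becomes readable within timeout seconds."""
    # Three lists in (read, write, exceptional), three ready subsets out.
    readable, _writable, _exceptional = select.select([sock], [], [], timeout)
    return sock in readable


# A connected pair of sockets stands in for a real client and server.
a, b = socket.socketpair()

before = wait_readable(b, 0.1)   # nothing sent yet, so the call times out
a.sendall(b"x")
after = wait_readable(b, 0.1)    # one byte is pending, so b is ready

print(before, after)

a.close()
b.close()
```

When the timeout expires with nothing ready, all three returned lists are empty, which is how the Python wrapper reports what the C call reports by returning 0.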
select() is O(highest file descriptor), whereas a newer alternative, poll, is O(number of file descriptors). However, note that poll is Unix only, whereas select() is more portable.
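For comparison, here is a rough sketch of the same kind of wait written with poll, using Python's select.poll(). It assumes a Unix-like platform where select.poll exists. Each descriptor of interest is registered individually, which is where the "number of file descriptors" cost comes from.

```python
import select
import socket

a, b = socket.socketpair()
b_fd = b.fileno()

# poll() is told exactly which descriptors to watch, one register() call each.
poller = select.poll()
poller.register(b, select.POLLIN)

a.sendall(b"x")

# The timeout is in milliseconds; the result is a list of (fd, event mask) pairs.
events = poller.poll(100)
ready_fds = [fd for fd, mask in events if mask & select.POLLIN]

print(ready_fds == [b_fd])

poller.unregister(b)
a.close()
b.close()
```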
select monitors sockets, open files, and pipes (anything with a fileno() method that returns a valid file descriptor).
A blocking call waits until data is available and then returns it. A non-blocking call always returns "immediately": with the data if there is some, otherwise with an error saying there is no data yet.
Whether you use one or the other depends on what you want to do. If you want that data and there's nothing else to do, you just make a blocking call. But sometimes you want to do something else if there's no data yet.
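To make the difference concrete, here is a small Python sketch. It uses socket.socketpair() as a stand-in for a real connection (my choice, not something from the code above) and flips one end between non-blocking and blocking mode.

```python
import socket

a, b = socket.socketpair()

# Non-blocking mode: recv() returns immediately whether or not data exists.
b.setblocking(False)
try:
    b.recv(1)
    got_error = False
except BlockingIOError:
    # No data yet: the OS reports EAGAIN/EWOULDBLOCK instead of waiting.
    got_error = True

# Blocking mode: recv() waits until there is something to return.
b.setblocking(True)
a.sendall(b"hi")
data = b.recv(2)

print(got_error, data)

a.close()
b.close()
```

In Python the "no data" error surfaces as a BlockingIOError exception. The program is then free to do other work and try again later.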
A file descriptor from which you cannot read, because a read(2) attempt would block, is not ready, so it could be called a "blocked file descriptor".
There are several cases of a file descriptor being blocked (that is, not ready). The usual one is some network socket(7) or some pipe(7) not having any more input. That is why poll (or select, epoll(7), etc...) is needed to code event loops since you want to avoid busy polling.
select, then, is the POSIX Swiss Army knife for "is there any data?" kinds of calls, featuring blocking calls on several file descriptors at once, which may be timed (so, if there's no input for five minutes, you can have it return anyway, reporting that nothing is ready).
Or, in other words, the select() API allows the process to wait for an event to occur and wakes the process up when the event occurs.
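With that in mind, the loop in the motivating example becomes readable: it sleeps in select until the child's socket is readable, then peeks at one byte to see whether "readable" means data or end-of-file. Here is a Python sketch of the same idea. The names `peer_has_closed`, `parent` and `child` are hypothetical, and the socket pair only simulates the connection to the child process.

```python
import select
import socket


def peer_has_closed(sock, timeout=None):
    """Wait until sock is readable, then report whether the peer hung up."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return False  # timed out with nothing to read
    # MSG_PEEK looks at the next byte without consuming it. An empty result
    # means end-of-file: the other side closed the connection.
    return sock.recv(1, socket.MSG_PEEK) == b""


parent, child = socket.socketpair()

child.sendall(b"x")
alive = not peer_has_closed(parent, 0.1)   # data is pending, peer still open
still_there = parent.recv(1)               # the peeked byte was not consumed

child.close()
closed = peer_has_closed(parent, 0.1)      # EOF also makes the socket readable

print(alive, still_there, closed)
parent.close()
```

The key point is that a closed connection counts as "ready for reading", because a read on it returns at once (with nothing) rather than blocking.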
The primary use-case is efficiency in servers.
The traditional way to write network servers is to have the main server block on accept(), waiting for a connection. Once a connection comes in, the server forks, the child process handles the connection, and the main server is able to service new incoming requests.
Now, with select(), instead of having a child process for each request, there is usually only one process that multiplexes all requests, servicing each request as much as it can. Thus one main advantage of using select() is that your server only requires a single process to handle all requests. As such, your server does not need shared memory or synchronization primitives for different 'tasks' to communicate.
select() makes it easier to monitor multiple connections at the same time, and is more efficient than writing a polling loop in Python using socket timeouts, because the monitoring happens in the operating system network layer instead of the interpreter.
Most select()-based servers look pretty much the same:
- Fill an fd_set structure with the file descriptors on which you want to know when data comes in.
- Fill an fd_set structure with the file descriptors on which you want to know when you can write.
# Here is a rough sketch of the idea
import select
import sys

# Placeholder names: these stand for sockets created elsewhere
inputs = [some_bound_socket, some_other_bound_socket]
outputs = [socket]

while inputs:
    # Wait for at least one of the sockets to be
    # ready for processing
    print('waiting for the next event', file=sys.stderr)
    readable, writable, exceptional = select.select(inputs,
                                                    outputs,
                                                    inputs)
One major disadvantage of using select() is that your server cannot act as if there were only one client, as it can with a fork()ing solution. For example, with a fork()ing solution, after the server fork()s, the child process works with the client as if there was only one client in the universe -- the child does not have to worry about new incoming connections or the existence of other sockets. With select(), the programming isn't as transparent.
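To show what that bookkeeping looks like, here is a minimal sketch of a single-threaded echo server built on select.select(). The names `echo_server`, `should_stop` and `pending` are my own, and error handling (for example a connection reset during recv) is left out to keep it short.

```python
import select
import socket


def echo_server(listener, should_stop):
    """Echo bytes back to every client, multiplexed in a single thread."""
    listener.setblocking(False)
    inputs = [listener]      # sockets we want to read from
    pending = {}             # client socket -> bytes waiting to be written

    while not should_stop():
        # Only ask about writability for clients that have queued data.
        outputs = [s for s in pending if pending[s]]
        readable, writable, exceptional = select.select(
            inputs, outputs, inputs, 0.05)

        for s in readable:
            if s is listener:
                # A readable listening socket means a connection is waiting.
                client, _addr = listener.accept()
                client.setblocking(False)
                inputs.append(client)
                pending[client] = b""
            else:
                data = s.recv(4096)
                if data:
                    pending[s] += data
                else:
                    # Empty read: the client closed its end.
                    inputs.remove(s)
                    del pending[s]
                    s.close()

        for s in writable:
            if s in pending:
                sent = s.send(pending[s])
                pending[s] = pending[s][sent:]

        for s in exceptional:
            if s in inputs:
                inputs.remove(s)
            pending.pop(s, None)
            s.close()

    for s in inputs:
        s.close()
```

To try it, bind and listen() on a socket, then call echo_server(listener, stop.is_set) in a thread with a threading.Event named stop. Unlike the fork()ing version, every branch has to ask "which socket is this?", and per-client state such as `pending` has to be tracked by hand.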