Sockets

This is part of the Semicolon&Sons Code Diary, consisting of lessons learned on the job.

Last Updated: 2024-11-21

The socket is the BSD method for accomplishing interprocess communication (IPC). What this means is a socket is used to allow one process to speak to another, very much like the telephone is used to allow one person to speak to another.

Step 1: Create a socket with socket system call

In order for a person to receive telephone calls, she must first have a telephone installed. Likewise, you must create a socket before you can listen for connections. This process involves several steps. First you must make a new socket, which is similar to having a telephone line installed. The socket() system call is used to do this.

Per the man page

socket() creates an endpoint for communication and returns a descriptor.
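A minimal Ruby sketch of this step, assuming the standard library's Socket class as a thin wrapper over the socket() call. The method name make_socket is hypothetical and exists only for illustration; the descriptor is visible through fileno.

```ruby
require "socket"

# Create an endpoint: address family INET, socket type STREAM.
# (make_socket is an illustrative helper, not part of any library.)
def make_socket
  Socket.new(:INET, :STREAM)
end

sock = make_socket
puts sock.fileno  # the descriptor returned by the underlying socket() call
sock.close
```

At this point the socket exists but has no address and is not connected to anything.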

Step 1b: Choose addressing scheme of socket

Since sockets come in several kinds, you must specify what kind you want when you create one. One option that you have is the addressing format of the socket. Just as the mail service uses a different scheme to deliver mail than the telephone company uses to complete calls, so can sockets differ. The two most common addressing schemes are AF_UNIX and AF_INET. AF_UNIX addressing uses UNIX pathnames to identify sockets; these sockets are very useful for IPC between processes on the same machine. AF_INET addressing uses Internet (IPv4) addresses, which are four-byte numbers usually written as four decimal numbers separated by periods (such as 192.9.200.10). In addition to the machine address, there is also a port number, which allows more than one AF_INET socket on each machine.
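To make the two addressing schemes concrete, here is a small sketch using Ruby's Addrinfo class. The port 80 and the path /tmp/example.sock are made-up values for illustration, as are the helper method names; nothing is opened or bound here.

```ruby
require "socket"

# An AF_INET address is a machine address plus a port number.
def inet_address
  Addrinfo.tcp("192.9.200.10", 80)
end

# An AF_UNIX address is a pathname on the local filesystem
# (hypothetical path, no file is created by building the address).
def unix_address
  Addrinfo.unix("/tmp/example.sock")
end

puts inet_address.ip_address  # the dotted-decimal machine address
puts inet_address.ip_port     # the port on that machine
puts unix_address.unix_path   # the pathname identifying the socket
```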

Step 1c: Choose type of socket

Another option which you must supply when creating a socket is the type of socket. The two most common types are SOCK_STREAM and SOCK_DGRAM. SOCK_STREAM indicates that data will come across the socket as a continuous stream of bytes, while SOCK_DGRAM indicates that data will come in discrete chunks (called datagrams).

As the manual page says, Unix domain (AF_UNIX) sockets are always reliable. For them, the difference between SOCK_STREAM and SOCK_DGRAM is in the semantics of consuming data out of the socket.

A stream socket lets the receiver read an arbitrary number of bytes while still preserving the byte sequence. In other words, a sender might write 4K of data to the socket, and the receiver can consume that data byte by byte. The other way around is true too: the sender can write several small messages to the socket, and the receiver can consume them all in one read. A stream socket does not preserve message boundaries.

A datagram socket, on the other hand, does preserve these boundaries: one write by the sender always corresponds to one read by the receiver. This holds even if the buffer the receiver gives to read(2) or recv(2) is smaller than the message, in which case the excess bytes of that message are discarded rather than left for the next read. With stream sockets, by contrast, the length passed to recv simply says how many bytes to read next.

So if your application protocol has small messages with a known upper bound on message size, you are better off with SOCK_DGRAM, since that's easier to manage.
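The boundary behaviour described above can be observed with a connected pair of Unix domain sockets. This is a sketch, assuming Ruby's UNIXSocket.pair on a Unix-like system; the method names are hypothetical helpers, and the outputs in the comments are what the semantics above predict for a local socket pair where both writes finish before the read.

```ruby
require "socket"

# Write two small messages, then do a single read on the other end.
def read_after_two_writes(type)
  a, b = UNIXSocket.pair(type)
  a.send("ab", 0)
  a.send("cd", 0)
  b.recv(100)
ensure
  a&.close
  b&.close
end

# Read a datagram into a buffer that is too small, then read again.
def truncated_datagram_read
  a, b = UNIXSocket.pair(:DGRAM)
  a.send("hello", 0)
  a.send("next", 0)
  [b.recv(2), b.recv(100)]
ensure
  a&.close
  b&.close
end

p read_after_two_writes(:STREAM)  # "abcd": message boundaries are gone
p read_after_two_writes(:DGRAM)   # "ab": one read returns one message
p truncated_datagram_read         # ["he", "next"]: the rest of "hello" is dropped
```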

Step 2: Bind the socket

After creating a socket, we must give the socket an address to listen to, just as you get a telephone number so that you can receive calls. The bind() function is used to do this (it binds a socket to an address, hence the name).

bind() assigns an address to a socket. When a socket is created using socket(), it is only given a protocol family, but not assigned an address. This association with an address must be performed with the bind() system call.
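A short sketch of binding, assuming Ruby's Socket#bind with an Addrinfo. The helper name bound_socket is hypothetical. Port 0 is used here so that the operating system picks any free port, which avoids clashing with a port already in use.

```ruby
require "socket"

# Create a socket and give it an address on the loopback interface.
def bound_socket
  sock = Socket.new(:INET, :STREAM)
  # Port 0 asks the OS to choose a free port for us.
  sock.bind(Addrinfo.tcp("127.0.0.1", 0))
  sock
end

sock = bound_socket
puts sock.local_address.inspect_sockaddr  # 127.0.0.1 plus the chosen port
sock.close
```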

Step 3: Accept calls to socket
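In telephone terms, accepting a call means waiting for the phone to ring and picking up. With sockets, listen() marks a bound socket as willing to take connections, and accept() blocks until a caller connects, returning a new socket for that one conversation. The following is a sketch under those assumptions using Ruby's Socket class; the method name, the loopback address, the backlog of 5, and the "hello" payload are all illustrative choices.

```ruby
require "socket"

# Bind, listen, accept a single caller, and send it a greeting.
# Returns what the caller received.
def accept_one_connection
  server = Socket.new(:INET, :STREAM)
  server.bind(Addrinfo.tcp("127.0.0.1", 0))  # port 0: let the OS pick
  server.listen(5)                           # queue up to 5 pending callers
  port = server.local_address.ip_port

  client = TCPSocket.new("127.0.0.1", port)  # the "caller"
  conn, _caller_addr = server.accept         # blocks until a caller arrives
  conn.write("hello")
  conn.close                                 # hang up this conversation
  client.read                                # read until the server closes
ensure
  client&.close
  server&.close
end

puts accept_one_connection
```

The listening socket stays open after accept() returns, so a real server would normally call accept() again in a loop.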

Example in action

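require "socket"

# Listen for TCP connections on localhost, port 3000.
dts = TCPServer.new('localhost', 3000)
loop do
  # Block until a client connects, then serve it on its own thread.
  Thread.start(dts.accept) do |s|
    print(s, " is accepted\n")
    s.write(Time.now)  # send the server's current date and time
    print(s, " is gone\n")
    s.close
  end
end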

You first load the socket library with the require command. The TCPServer class is a helper class for building TCP socket servers. The TCPServer.new('localhost', 3000) statement creates a new server socket bound to localhost on port 3000. The Kernel#loop iterator calls the associated block (do...end) forever, or at least until you break out of the loop. The dts.accept method waits for a connection on dts and returns a new TCPSocket object connected to the caller. Thread.start creates and runs a new thread to execute the instructions given in its block; any arguments passed to Thread.start are passed into the block, which is how the accepted socket becomes s. We use s.write(Time.now) to write the current date and time on the server to the socket. Finally, we call s.close to close the socket. This method is inherited from the IO class, so it's available for each of the socket types.
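To see the server respond, something has to call it. Below is a sketch of a matching client, assuming the server above is running; fetch_time is a hypothetical helper name. It connects, reads until the server closes the connection, and returns what it read.

```ruby
require "socket"

# Connect to a server, read everything it sends until it closes
# the connection, and return that text. The block form closes the
# client socket automatically.
def fetch_time(host, port)
  TCPSocket.open(host, port) { |s| s.read }
end

# With the example server running in another terminal:
#   puts fetch_time("localhost", 3000)
```

Because the server closes each connection right after writing the time, a plain read that waits for end-of-file is enough here.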

Connect vs bind

The one-liner: bind() to your own address, connect() to a remote address.
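A small sketch of that distinction using two UDP sockets on the loopback interface. The method name udp_round_trip and the 1024-byte read size are illustrative assumptions. The receiver binds to its own address; the sender connects to the receiver's address.

```ruby
require "socket"

# Send one datagram from a connected sender to a bound receiver
# and return what the receiver got.
def udp_round_trip(message)
  receiver = UDPSocket.new
  receiver.bind("127.0.0.1", 0)      # bind: claim our own address (OS picks the port)
  port = receiver.addr[1]

  sender = UDPSocket.new
  sender.connect("127.0.0.1", port)  # connect: name the remote address
  sender.send(message, 0)

  data, _from = receiver.recvfrom(1024)
  data
ensure
  sender&.close
  receiver&.close
end

puts udp_round_trip("ping")
```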

Resources