Sockets in .Net Core – Part 1

Fundamentals

A couple of decades ago, it was very normal to need to write all sorts of socket clients and servers. At some point, language abstractions made that skill less relevant, or to be more accurate, less used than ever, relegating sockets to college courses with the typical chat-server exercises. Those examples from classic university literature were enough to understand the foundations, but very far from real-world scenarios, where many different challenges exist.

History is cyclic, as we have learned; new applications such as the Internet of Things, GPS trackers and similar devices require writing sockets again, at least when you implement those solutions from scratch. In some cases, the abstractions are not enough to cope with the fine tuning needed to handle heavy workloads or implementation nuances. Microsoft itself provides abstractions such as TcpClient that are less complex to use but hide all sorts of details.

To make it worse, C# is not a language that was born with concurrency at its core. It is possible to deal with concurrency, but it is harder than in languages designed for that purpose. If you require extremely high concurrency, high performance and practically zero downtime, it may be more appropriate to choose a language such as Erlang, which was built specifically for that.

Principles 

Socket principles are well known and there is a lot of literature about them. However, it is not easy to find real-world coverage of blocking, heavy workloads, how to deal with concurrency and, the worst nightmare, framing. The reason might be that most socket applications do not need such complexity: connecting 10 or 20 clients to a small chat server does not require more than a simple implementation that can handle a few threads with minor locking techniques. A mutex will almost always do the job.

However, I want to point out that I have noticed a lack of understanding of how the underlying layers work, which leads to wrong implementations. This is the area I want to focus on first.

A world before new Socket()

Our applications sit on top of lower-level implementations that make communication possible. Even though it is not necessary to understand everything about those low-level implementations, it is necessary to know some of them.

On Windows machines, our sockets always use (as of the writing of this article) the Winsock implementation. Winsock is based on the original Berkeley sockets (BSD) implementation. Even though .Net Core is written to run on other operating systems, the good news is that Winsock was largely derived from the Unix implementation, so the two look a lot alike and behave in nearly the same way. So when we write sockets, we are indirectly using the old and reliable Winsock (BSD-based) design, updated and enhanced through Windows versions.

As you probably know, we can use UDP or TCP over IP (Internet Protocol). IP provides a datagram service, and it is important to note that it is a "best effort" protocol: it will try to deliver packets to the destination, but it might (and eventually will) fail. You might ask how communication can be reliable if packets can be lost, reordered or duplicated. The answer is that TCP manages that; it is designed to recover from losses, duplication, errors, etc. UDP does not, so packets might be received N times, get lost, or arrive in a different order than they were sent. Implementations built on top of UDP must take that into account.

Running under IP we have the actual transport layer, which can be all sorts of physical links: old modems, Ethernet networks, satellite routers, microwave links, etc. Why is that important? Mainly because of the following principle:

No matter what we push onto the socket stream at the top of the stack, everything will be buffered, pushed downstream, converted to a series of binary datagrams and eventually transmitted in chunks to the destination, which needs to reverse the entire process for the data to reach the upper layer.

Socket in .Net is the upper layer abstraction. Our messages will flow downstream following this path:

  1. Socket buffer
  2. WinSock implementation
  3. TCP / UDP
  4. IP
  5. Network

On reception, the data flows backwards along the same path.

Another important difference between TCP and UDP is the way packets are transmitted. Because UDP is a connectionless protocol, every packet must contain the destination address, and it is always sent in one piece that cannot exceed 64 KB (technically a bit less, but that is not relevant for this article).
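To make the datagram behaviour concrete, here is a minimal sketch of my own (the port number and the UdpClient usage are my choices, not from the article) that sends two datagrams over the loopback interface. Each Send is one self-contained packet, and each Receive returns exactly one datagram with its boundaries intact. Over loopback, delivery is practically reliable, but remember that over a real network UDP datagrams may be lost, duplicated or reordered:

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

class UdpDatagramDemo
{
    // Sends two datagrams over loopback and receives them back.
    // Each Send is one self-contained packet; each Receive returns
    // exactly one datagram, boundaries intact.
    public static (string First, string Second) RoundTrip(int port)
    {
        using var receiver = new UdpClient(port); // bound receiver

        using var sender = new UdpClient();
        sender.Connect(IPAddress.Loopback, port);
        sender.Send(Encoding.UTF8.GetBytes("Hello"), 5);
        sender.Send(Encoding.UTF8.GetBytes("World"), 5);

        var remote = new IPEndPoint(IPAddress.Any, 0);
        byte[] first = receiver.Receive(ref remote);  // one whole datagram
        byte[] second = receiver.Receive(ref remote); // the next whole datagram

        return (Encoding.UTF8.GetString(first), Encoding.UTF8.GetString(second));
    }

    static void Main()
    {
        // Port 9998 is an arbitrary choice for this sketch
        var (first, second) = RoundTrip(9998);
        Console.WriteLine(first);
        Console.WriteLine(second);
    }
}
```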

TCP is a connection-oriented protocol: the connection has to be established before sending data. Messages are transformed by buffers, queues and network protocols on their way to the destination. At the end, TCP rebuilds the original byte stream from the packets and pushes it upstream. The original message might be delivered to the application in smaller chunks.

This last principle can be summarized as message boundaries: TCP does not preserve them, UDP does.
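A small sketch of my own (the ReceiveExactly helper and the port are hypothetical, not from the article) makes the missing boundaries visible: the client sends "Hello" and "World" as two separate Sends, but the receiver only sees a stream of 10 bytes, so it must loop over Receive until it has all of them:

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

class TcpBoundariesDemo
{
    // TCP delivers a byte stream, so one logical message may arrive split
    // across several Receive calls. This hypothetical helper loops until
    // exactly 'count' bytes have been read.
    public static byte[] ReceiveExactly(Socket socket, int count)
    {
        byte[] buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = socket.Receive(buffer, offset, count - offset, SocketFlags.None);
            if (read == 0) // peer closed the connection before we got everything
                throw new SocketException((int)SocketError.ConnectionReset);
            offset += read;
        }
        return buffer;
    }

    static void Main()
    {
        var listener = new TcpListener(IPAddress.Loopback, 9997); // arbitrary port
        listener.Start();

        using var client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        client.Connect(IPAddress.Loopback, 9997);
        using Socket server = listener.AcceptSocket();

        // Two separate Sends; the receiver sees one stream of 10 bytes,
        // not two distinct messages.
        client.Send(Encoding.UTF8.GetBytes("Hello"));
        client.Send(Encoding.UTF8.GetBytes("World"));

        Console.WriteLine(Encoding.UTF8.GetString(ReceiveExactly(server, 10))); // prints "HelloWorld"
        listener.Stop();
    }
}
```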

.Net implementation

As we know, the Socket class itself is just an abstraction to send and receive information through the network.

The base implementation in .Net is the Socket class, which allows a lot of low-level access that the higher-level implementations do not. On Windows it is possible to see how much of the Winsock implementation is still visible: socket exceptions expose an ErrorCode property that carries the original Winsock error code. For compatibility across operating systems, .Net Core also exposes the platform-specific code in the NativeErrorCode property and a portable SocketErrorCode enumeration that is not tied to a particular operating system.

The Socket class is more powerful but a bit more complex; for that reason Microsoft created three higher-level implementations:

  • TcpClient
  • TcpListener
  • UdpClient

All of them are implemented on top of Socket, hiding some of its complexity. Again, let's keep in mind that the Socket class is still an abstraction that depends on Winsock (on Windows) and BSD sockets (on Linux).
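As a quick illustration of what those wrappers hide, here is a minimal sketch of my own (port and message are arbitrary choices) where TcpListener takes care of bind/listen/accept and TcpClient takes care of socket creation and connection, with both sides talking through a NetworkStream instead of raw Send/Receive calls:

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

class HigherLevelDemo
{
    // Connects a TcpClient to a TcpListener over loopback, writes a short
    // message and returns what the accepted side read back.
    public static string PingPong(int port)
    {
        // TcpListener wraps the bind/listen/accept sequence of the raw Socket
        var listener = new TcpListener(IPAddress.Loopback, port);
        listener.Start();

        // TcpClient wraps socket creation and connection
        using var client = new TcpClient();
        client.Connect(IPAddress.Loopback, port);
        using TcpClient accepted = listener.AcceptTcpClient();

        // Both sides expose a NetworkStream instead of raw Send/Receive
        byte[] payload = Encoding.UTF8.GetBytes("ping");
        client.GetStream().Write(payload, 0, payload.Length);

        byte[] buffer = new byte[16];
        int read = accepted.GetStream().Read(buffer, 0, buffer.Length);

        listener.Stop();
        return Encoding.UTF8.GetString(buffer, 0, read);
    }

    static void Main()
    {
        Console.WriteLine(PingPong(9996)); // port 9996 is arbitrary
    }
}
```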

Threads and blocking operations

Another aspect of sockets is that most of the time we have to deal with threads when implementing real-world solutions. A socket (at least in TCP) can only attend to one connected client at a time. The classic chat-room example will have anywhere from a few dozen to hundreds of clients connected from different computers to our virtual room. Every client is an incoming connection with a specific IP address and port number (the server port is known; client ports are assigned dynamically when the connection is created).

In theory we could create a list of connections and attend to them in a cycle. Let's imagine the following scenario:

  • Connection 1 is received on port 9999; we add it to the list of open sockets
  • Start cycling over 1 connection checking for incoming data
  • Connection 2 is received on port 9999; we add it to the list of open sockets
  • Connections 3, 4, 5… 20 are received on port 9999; we add them to the list of open sockets
  • Keep cycling over 20 connections checking for incoming data

This approach is possible but inefficient, and it will not scale. Loops in sockets have several problems; one of them is flow control, which I will briefly explain in the post about framing. Another important issue is that most basic socket operations block. What does that mean? The program will sit on that instruction waiting for something (a read operation, a socket connection, etc.). If we accept connections in blocking mode, the potential loop described above will never run until a new connection unblocks the program.

Example:

int yourPort = 9999;
Socket server = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
var ipEndpoint = new IPEndPoint(IPAddress.Parse("10.0.0.1"), yourPort);
server.Bind(ipEndpoint);
server.Listen(5); // Backlog
Socket client = server.Accept(); // This operation blocks

// The following lines do not run until a connection to the server is made
Console.WriteLine("I am unblocked!");

If you take a look at the code, it is a basic example of a socket server bound to 10.0.0.1:9999; the comment marks the line where it blocks. The Console.WriteLine instruction will not execute until a connection is accepted by the server on that address and port.

A common technique is polling, which means checking for a pending operation for a certain amount of time and then continuing to run code. In general, except for very particular scenarios, the accept part is implemented as a blocking code section. How can we execute more code if the server is almost always blocked waiting for a connection? The answer is threading: we spawn a thread for every accepted connection and keep blocking (not necessarily always) in the main thread. Updating the previous code to loop accepting connections and spawn a thread per client would look like the following example:

public void ServerStart()
{
  // The socket is created, bound and put into listening mode once;
  // binding it again on every iteration would throw an exception
  int yourPort = 9999;
  Socket server = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
  var ipEndpoint = new IPEndPoint(IPAddress.Parse("10.0.0.1"), yourPort);
  server.Bind(ipEndpoint);
  server.Listen(5); // Backlog

  while (true)
  {
    Socket client = server.Accept(); // This operation blocks

    var newThread = new Thread(SocketProcessor);
    newThread.Start(client);
  }
}

private void SocketProcessor(object socketObject)
{
  // Here we get the socket to use
  var mySocket = (Socket)socketObject;

  // Let's do something with the connection
}

The ServerStart method loops accepting connections. Because Accept blocks, the loop waits until a new connection is accepted and then hands the connection over, passing the socket object to a newly spawned thread that keeps running independently.

Reading data, blocking and unblocking methods

Most examples online show basic data-reading techniques, which might be useful in many contexts. Reading is not complex, but it is important to understand that there is only one kind of reading. Sockets manage streams of bytes; no matter what kind of high-level protocol we use or invent, we will always be reading bytes.

A later section about streams focuses on that.
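Since only bytes travel through the socket, any text we want to send has to be encoded first and decoded on the other side. This small sketch (my own; the helper names and the message are arbitrary) shows the conversion that surrounds every Send and Receive of textual data:

```csharp
using System;
using System.Text;

class BytesOnlyDemo
{
    // Sockets move bytes, not strings: text has to be encoded before
    // sending and decoded after receiving.
    public static byte[] ToWire(string message) => Encoding.UTF8.GetBytes(message);
    public static string FromWire(byte[] data) => Encoding.UTF8.GetString(data);

    static void Main()
    {
        byte[] outgoing = ToWire("Hola, sockets!"); // what Send actually transmits
        Console.WriteLine(outgoing.Length);          // 14: byte count, not char count
        Console.WriteLine(FromWire(outgoing));       // prints "Hola, sockets!"
    }
}
```

Note that for non-ASCII text the byte count will differ from the character count, which is one more reason to always think in bytes at the socket level.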

Different parameters can be passed to the Receive method when reading, but they basically do the same thing: they read from the stream into a buffer. The stream is, in fact, data ready to be processed that has flowed all the way to the client (or server; at this point reading is exactly the same at either endpoint) through the network, the IP implementation and the operating system's socket implementation, and sits buffered waiting to be consumed by our application.

Example:

public void Receive(Socket socket)
{
  int bufferSize = 128; // This size is a matter of an entire topic!
  byte[] rcvBuffer = new byte[bufferSize];
  int bytesReceived = 0;

  if ((bytesReceived = socket.Receive(rcvBuffer, 0, rcvBuffer.Length, SocketFlags.None)) > 0)
  {
    byte[] received = new byte[bytesReceived]; // This is what we actually receive
    Array.Copy(rcvBuffer, received, bytesReceived);
  }
}

As an example I pass a socket into the method; that is not required, it is just how this particular code is written.

When we receive data we need to pass a buffer and its length. That does not mean that:

  1. We will actually receive any data; that is why the instruction checks whether any bytes were read (> 0)
  2. We will receive exactly the buffer size; the buffer might be full, but we need to assign the result to a variable to determine how many bytes we actually read.

The Array.Copy instruction copies the portion of the buffer that was read into a new byte array of the exact size. There are potential optimizations, but the principle is the same.
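One such optimization, which is my own suggestion rather than something from the article, is to slice the buffer with Span<byte> (available since .NET Core 2.1) instead of calling Array.Copy, which reads more clearly and only materializes a new array when one is actually needed:

```csharp
using System;

class CopyAlternativeDemo
{
    // Mimics the receive pattern: only the first 'bytesReceived' bytes of
    // the buffer are valid and must be extracted into an exact-size array.
    public static byte[] SliceWithSpan(byte[] rcvBuffer, int bytesReceived)
        => rcvBuffer.AsSpan(0, bytesReceived).ToArray();

    static void Main()
    {
        byte[] rcvBuffer = new byte[128]; // simulated receive buffer
        byte[] data = { 1, 2, 3, 4, 5 };
        data.CopyTo(rcvBuffer, 0);

        // Original approach from the article
        byte[] copied = new byte[5];
        Array.Copy(rcvBuffer, copied, 5);

        // Span-based alternative: same result, less manual bookkeeping
        byte[] sliced = SliceWithSpan(rcvBuffer, 5);

        Console.WriteLine(copied[4] == sliced[4]); // True
    }
}
```

For truly hot paths, processing the data in place over the span without allocating at all would be even better, but that is beyond the scope of this article.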

As I mentioned before, this operation also blocks. Receive will block until data is read or the socket is disconnected (0 bytes means there is nothing left to read and the connection has ended). In general we will need to execute more code, such as processing that data, which might take longer, or any other kind of task. Spawning a new thread just to read the data is complex and unnecessary.

There is a simple solution to this: polling. Polling checks whether there is something buffered to be read; if not, we can continue execution. The mechanism is simple: it checks the underlying buffer for data to read for a period of time (passed as a parameter) and returns true if data is present. It avoids blocking; technically, it avoids blocking forever, since we still wait for, at most, the given period.

The new code would look like this:

public void MainSection(Socket socket)
{
  while(true) // Loop forever
  {
    Receive(socket); // This execution will last at most 2 seconds
    // Do something else
  }
}

public void Receive(Socket socket)
{
  int bufferSize = 128; // This size is a matter of an entire topic!
  byte[] rcvBuffer = new byte[bufferSize];
  int bytesReceived = 0;

  if (socket.Poll(2000000, SelectMode.SelectRead)) // Timeout in microseconds
  {
    if ((bytesReceived = socket.Receive(rcvBuffer, 0, rcvBuffer.Length, SocketFlags.None)) > 0)
    {
      byte[] received = new byte[bytesReceived]; // This is what we actually receive
      Array.Copy(rcvBuffer, received, bytesReceived);
    }
  }
}

In this way we avoid blocking indefinitely. Note that Poll takes its timeout in microseconds, so 2 seconds = 2,000,000 microseconds; use whatever time window suits your application.

In the next article we will explore the world of streams and framing, the pillars of data processing with sockets.

Published by maxriosflores

Solution Architect for a decade. I have designed, built and implemented software solutions for more than 25 years, and every single day I am more interested in technology. I learned to code on a Texas Instruments machine with 16 KB of memory at 8 years old. I shared this passion with friends coding on CZ Spectrums, MSXs and C64s. I have worked with computers since I was 17, with old tools like plain C and QuickBASIC. I love math and computers as much as the outdoors and family life.
