
On Fri, Apr 24, 2009 at 11:09:56AM -0400, epiphani wrote:
> I've been designing a lot of the system with this in mind for the future. I've decided that threading is important to support simply because the recent crop of CPU designers hate programmers enough to effectively force it on us.
You may have trouble with operating systems that emulate threads in userspace.
> Threading has always been a point of heated argument in the past - how do we do it? I've come up with the following design:
>
> 1. Poll thread - basically hosts the polling engine, whatever it happens to be.
> 2. Read/Write thread set(s). Watch queues and polling states, reads and writes from and to sockets, and interacts with queues.
> 3. Event trigger thread(s). Schedules and triggers events on timers, for anything that isn't edge-triggered from a socket read.
> 4. Parsing/worker thread pool. Executes actual work from event or read threads, and places data into write thread queues.
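Every hand-off between those threads implies a shared queue plus a wake-up on the consumer side. Roughly something like this would have to sit between each pair of stages; the names are made up, and initialization and error handling are left out:

/* A minimal sketch of one of the hand-off queues the design above needs
 * between every pair of stages (poll -> read, read -> worker,
 * worker -> write). */
#include <pthread.h>
#include <stddef.h>

struct work_item {
    struct work_item *next;
    int fd;        /* socket this piece of work belongs to */
    void *data;    /* whatever the producing stage wants processed */
};

struct work_queue {
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
    struct work_item *head, *tail;
};

void queue_push(struct work_queue *q, struct work_item *item)
{
    item->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail)
        q->tail->next = item;
    else
        q->head = item;
    q->tail = item;
    /* Waking the consumer is where the extra context switch gets paid
     * for, at every stage of the pipeline. */
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
}

struct work_item *queue_pop(struct work_queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->head == NULL)
        pthread_cond_wait(&q->nonempty, &q->lock);
    struct work_item *item = q->head;
    q->head = item->next;
    if (q->head == NULL)
        q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    return item;
}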
This design uses too many threads that need to talk to each other. For example, when there is new data to be read on a socket, this is what will happen:

1. The kernel will tell the poll thread that there is activity on a socket. The poll thread will then wake up and have to inform the read thread that there is activity to read (for example, by adding the socket to a queue, or by saving the new poll state somewhere).

2. The read thread will then wake up, see that there is new activity on the socket, and read it. Whatever is read from the socket will be queued up for the parsing/worker thread to process.

3. The parsing/worker thread will wake up and see that there is new incoming data to parse and process. It will do that work and will then presumably have a response to write out. The outgoing response is queued up for the write thread.

4. The write thread will then wake up, see that there is new activity to be written on the socket, and write it out.

Each step requires a context switch to transfer control to a different thread, which is especially bad if not all of the threads are on the same processor/core of an SMP machine. To finish processing the activity on one socket, we bounce back and forth between threads. On a non-SMP machine this just multiplies the work the processor has to do, since it can only do one thing at a time anyway. On an SMP machine it limits our ability to use multiple processors in parallel, because the extra context switches are spent on the threads talking to each other.

A better approach is to minimize the number of threads that need to be woken up to completely process some activity. We could have an accept thread that does nothing other than wait for activity on all the listening sockets and accept connections. When a new connection arrives, it assigns the connection to a worker thread. Each worker thread then waits for activity on only the connections it has been assigned and handles all of the work itself. We can then limit the number of worker threads to some configurable number that lets each processor share the load.
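Here is a rough sketch of what I mean, using poll() and round-robin assignment just for illustration; the names and numbers are made up, and most of the error handling and the socket setup are left out:

/* Rough sketch of the accept-thread / worker-thread split described above. */
#include <poll.h>
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

#define NUM_WORKERS    4
#define MAX_PER_WORKER 256

struct worker {
    pthread_t thread;
    pthread_mutex_t lock;
    struct pollfd fds[MAX_PER_WORKER];  /* connections owned by this worker */
    int nfds;
};

static struct worker workers[NUM_WORKERS];

/* Each worker waits only on its own connections and does the read, the
 * parsing and the write itself: one wake-up per event, no hand-offs. */
static void *worker_main(void *arg)
{
    struct worker *w = arg;

    for (;;) {
        pthread_mutex_lock(&w->lock);
        int n = w->nfds;
        pthread_mutex_unlock(&w->lock);

        /* Short timeout so connections assigned while we sleep get noticed. */
        if (poll(w->fds, n, 100) <= 0)
            continue;

        for (int i = 0; i < n; i++) {
            if (!(w->fds[i].revents & POLLIN))
                continue;
            char buf[4096];
            ssize_t len = read(w->fds[i].fd, buf, sizeof buf);
            if (len <= 0) {
                close(w->fds[i].fd);   /* removal from fds[] omitted */
                continue;
            }
            /* Parse and respond right here instead of queueing the data
             * for some other thread; echoing stands in for real work. */
            write(w->fds[i].fd, buf, (size_t)len);
        }
    }
    return NULL;
}

/* The accept thread does nothing but accept and assign; round-robin is
 * the simplest possible way to spread connections over the workers. */
static void *accept_main(void *arg)
{
    int listen_fd = *(int *)arg, next = 0;

    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);
        if (fd < 0)
            continue;
        struct worker *w = &workers[next];
        next = (next + 1) % NUM_WORKERS;

        pthread_mutex_lock(&w->lock);
        if (w->nfds < MAX_PER_WORKER) {
            w->fds[w->nfds].fd = fd;
            w->fds[w->nfds].events = POLLIN;
            w->nfds++;
        } else {
            close(fd);   /* worker full; a real server would do better */
        }
        pthread_mutex_unlock(&w->lock);
    }
    return NULL;
}

void start_server(int listen_fd)
{
    static int lfd;            /* static so the pointer stays valid */
    lfd = listen_fd;
    for (int i = 0; i < NUM_WORKERS; i++) {
        pthread_mutex_init(&workers[i].lock, NULL);
        pthread_create(&workers[i].thread, NULL, worker_main, &workers[i]);
    }
    pthread_t acceptor;
    pthread_create(&acceptor, NULL, accept_main, &lfd);
    pthread_join(acceptor, NULL);   /* never returns in this sketch */
}

A real server would want the accept thread to poke the worker it just handed a connection to (a pipe that the worker also polls works) instead of relying on the poll timeout, and each worker would keep per-connection buffers for partial reads and writes, but the shape is the point: one thread handles an event from start to finish.

--
Ned T. Crigler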