Writing a server with Python’s asyncore module

The Python asyncore and asynchat modules

The Python standard library provides two modules—asyncore and
asynchat—to help in writing concurrent network servers using
event-based designs. The documentation does not give good examples,
so I am making some notes.

Overview

The basic idea behind the asyncore module is this:

there is a function, asyncore.loop(), that calls select() on a bunch of ‘channels’. Channels are thin wrappers around sockets;
when select() returns, it reports which sockets have data waiting to be read, which ones are free to send more data, and which ones have exceptional conditions pending; loop() examines each event together with the socket’s state to derive a higher-level event;
it then calls the method on the channel that corresponds to that higher-level event.
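The three steps above can be sketched without asyncore at all. This is only an illustration of the idea, not asyncore's actual implementation: the names Channel and mini_loop are made up for this example, and a socketpair stands in for a real network connection.

```python
import select
import socket


class Channel(object):
    """Hypothetical thin wrapper around a socket (not the asyncore class)."""

    def __init__(self, sock):
        self.sock = sock
        self.received = []

    def fileno(self):
        return self.sock.fileno()

    def handle_read(self):
        # Higher-level event: data is waiting, so recv() will not block.
        self.received.append(self.sock.recv(1024))


def mini_loop(channels, timeout=1.0, count=1):
    """Run `count` rounds of select() over a {fd: channel} dictionary."""
    for _ in range(count):
        fds = list(channels)
        readable, _writable, _errors = select.select(fds, [], fds, timeout)
        for fd in readable:
            # Translate the low-level event into a method call on the channel.
            channels[fd].handle_read()


# A connected pair of sockets stands in for a client and a server.
left, right = socket.socketpair()
channel = Channel(right)
left.sendall(b"ping")
mini_loop({channel.fileno(): channel})
print(channel.received)
left.close()
right.close()
```

The real loop() does more than this (it also watches for writability and exceptional conditions, and it works out events such as connect, accept and close), but the shape is the same: one select() call, then one method call per ready channel.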

asyncore provides a low-level, but flexible API to build network
servers. asynchat builds upon asyncore and provides an API that is
more suitable for request/response type of protocols.

asyncore

The asyncore module’s API consists of:

the loop() function, to be called by a driver program written by you;
the dispatcher class, to be subclassed by you to do useful stuff. The dispatcher class is what is called ‘channel’ elsewhere.

+-------------+           +--------+
| driver code |---------> | loop() |
+-------------+           +--------+
      |                      |
      |                      | loop-dispatcher API (a)
      |                      |
      |                  +--------------+
      |                  | dispatcher   |
      +----------------->| subclass     |
                         +--------------+
                             |
                             | dispatcher-logic API (b)
                             |
                         +--------------+
                         | server logic |
                         +--------------+

This is all packaged nicely in an object oriented way. So, we have
the dispatcher class, that extends/wraps around the socket class (from
the socket module in the Python standard library). It provides all
the socket class’ methods, as well as methods to handle the higher
level events. You are supposed to subclass dispatcher and implement
the event handling methods to do something useful.

The loop-dispatcher API

The loop function looks like this:

loop([timeout[, use_poll[, map[, count]]]])

What is the map? It is a dictionary whose keys are the
file descriptors (fds) of the sockets (i.e., socket.fileno()), and
whose values are the dispatcher objects that should handle events on those fds.

When we create a new dispatcher object, it is automatically added to a
global map of sockets, which is managed behind the scenes.
By default, the loop() function does a select() on the sockets in this map.

We can override the map that loop looks at by providing an explicit map. But then we have to make sure that the dispatchers we create end up in this map rather than in the global one, for example by passing the same dictionary as the map argument of the dispatcher constructor. (Hmm… we might always want
to use explicit maps; then our loop calls will be thread safe and we
will be able to launch multiple threads, each calling loop on
different maps.)
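Here is a small sketch of working with an explicit map. It assumes an interpreter that still ships asyncore (the module is deprecated since Python 3.6 and was removed in Python 3.12), so the import is guarded. The names private_map and explicit_map_demo are invented for this example, and a socketpair again stands in for a real connection.

```python
import socket

try:
    import asyncore  # deprecated since Python 3.6 and removed in Python 3.12
except ImportError:
    asyncore = None


def explicit_map_demo():
    """Register one dispatcher in a private map, then close it again."""
    private_map = {}
    ours, theirs = socket.socketpair()   # stand-in for a real connection
    fd = ours.fileno()

    # Passing map= makes the dispatcher register itself in our dictionary
    # instead of the global one.
    channel = asyncore.dispatcher(ours, map=private_map)
    fds_while_open = sorted(private_map)

    channel.close()                      # also removes it from private_map
    fds_after_close = sorted(private_map)

    theirs.close()
    return fd, fds_while_open, fds_after_close


if asyncore is not None:
    fd, fds_while_open, fds_after_close = explicit_map_demo()
    print(fds_while_open == [fd], fds_after_close == [])
```

With such a map in hand, asyncore.loop(map=private_map) would service only the dispatchers registered in it.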

Methods a dispatcher subclass should implement

loop() needs the dispatcher to implement some methods:

readable(): should return True, if you want the fd to be observed for read events;
writable(): should return True, if you want the fd to be observed for write events;

If either readable or writable returns True, the corresponding fd will be examined
for errors also. Obviously, it makes no sense to have a dispatcher
which returns False for both readable and writable.
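To make the role of readable() and writable() concrete, here is a simplified sketch of how the three fd lists handed to select() could be built from a map. It is a rough approximation of what the loop does, not a copy of the asyncore source, and FakeChannel and build_select_lists are names made up for this example.

```python
class FakeChannel(object):
    """Stand-in for a dispatcher; only the two predicate methods matter here."""

    def __init__(self, wants_read, wants_write):
        self._wants_read = wants_read
        self._wants_write = wants_write

    def readable(self):
        return self._wants_read

    def writable(self):
        return self._wants_write


def build_select_lists(channel_map):
    """Turn a {fd: channel} map into the three fd lists passed to select()."""
    read_fds, write_fds, error_fds = [], [], []
    for fd in sorted(channel_map):
        channel = channel_map[fd]
        wants_read = channel.readable()
        wants_write = channel.writable()
        if wants_read:
            read_fds.append(fd)
        if wants_write:
            write_fds.append(fd)
        if wants_read or wants_write:
            # Any fd we watch at all is also watched for errors.
            error_fds.append(fd)
    return read_fds, write_fds, error_fds


channels = {
    3: FakeChannel(True, False),    # only wants to read
    4: FakeChannel(False, True),    # only wants to write
    5: FakeChannel(False, False),   # never looked at by select()
}
lists = build_select_lists(channels)
print(lists)   # ([3], [4], [3, 4])
```

The channel on fd 5 shows why returning False from both methods makes no sense: it never appears in any list, so no event can ever reach it.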

Some other methods that loop calls on dispatchers are:

handle_read: socket is readable; dispatcher.recv() can be used to actually get the data
handle_write: socket is writable; dispatcher.send(data) can be used to actually send the data
handle_error: an exception was raised while handling an event and was not otherwise handled
handle_expt: socket received OOB data (not really used in practice)
handle_close: socket was closed remotely or locally

For server dispatchers, loop delivers one more event:

handle_accept: a new incoming connection can be accept()ed. Call the accept() method to actually accept the connection. To create a server socket, call the bind() and listen() methods on the dispatcher first.

Client sockets get this event:

handle_connect: connection to remote endpoint has been made. To initiate the connection, first call the connect() method on it.

Client sockets are discussed in the asyncore documentation so I will not discuss them here.

Other socket methods are available in dispatcher: create_socket,
close, set_reuse_addr. They are not called by loop, but are available
so that your code can call them to create a new socket, close an existing socket, and tell the OS to set the SO_REUSEADDR flag on the server socket, respectively.

How to write a server using asyncore

The standard library documentation gives a client example, but not a
server example. Here are some notes on the latter.

Subclass dispatcher to create a listening socket.
In its handle_accept method, create new dispatchers, one per accepted connection. They’ll get added to the global socket map.

Note: the handlers must not block or take too much time… or the server won’t be concurrent. This is because when multiple sockets get an event, loop calls their dispatchers one-by-one, in the same thread.

The socket-like methods that dispatcher provides should not be bypassed in order to call the low-level socket functions directly. They do extra work to detect higher-level events. For example, how does asyncore figure out that the socket is closed? If I remember correctly, there are two ways to detect whether a non-blocking socket is closed:

select() returns a read event, but when you call recv()/read() you get zero bytes;
you call send()/write() and it fails with an error (sending zero bytes is not an error).

(I wish I had a copy of Unix Network Programming by Stevens handy
right now.) dispatcher detects both of the events above and, if either of them occurs, calls handle_close. This frees you from having to look at low-level events and lets you think in terms of higher-level events.
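The first of the two ways can be observed with plain sockets. In this small sketch a socketpair stands in for a client and a server, and the variable names are chosen just for this example: once the peer closes, select() reports the remaining socket as readable, and recv() returns zero bytes.

```python
import select
import socket

ours, theirs = socket.socketpair()
theirs.close()                        # the peer goes away

# select() reports our end as readable even though no data was sent...
readable, _, _ = select.select([ours], [], [], 1.0)

# ...and the read that follows returns zero bytes, which means "closed".
data = ours.recv(1024)
peer_closed = bool(readable) and data == b""
print(peer_closed)

ours.close()
```

This is the kind of low-level check that the dispatcher's recv() wrapper performs on your behalf before it calls handle_close.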

The code for a server based on asyncore is below:

asyncore_echo_server.py

import logging
import asyncore
import socket

logging.basicConfig(level=logging.DEBUG, format="%(created)-15s %(msecs)d %(levelname)8s %(thread)d %(name)s %(message)s")
log                     = logging.getLogger(__name__)

BACKLOG                 = 5
SIZE                    = 1024

class EchoHandler(asyncore.dispatcher):

    def __init__(self, conn_sock, client_address, server):
        self.server             = server
        self.client_address     = client_address
        self.buffer             = b""   # bytes, because recv() returns bytes on Python 3

        # We dont have anything to write, to start with
        self.is_writable        = False

        # Create ourselves, but with an already provided socket
        asyncore.dispatcher.__init__(self, conn_sock)
        log.debug("created handler; waiting for loop")

    def readable(self):
        return True     # We are always happy to read

    def writable(self):
        return self.is_writable # But we might not have
                                # anything to send all the time

    def handle_read(self):
        log.debug("handle_read")
        data = self.recv(SIZE)
        log.debug("after recv")
        if data:
            log.debug("got data")
            self.buffer += data
            self.is_writable = True  # sth to send back now
        else:
            log.debug("got null data")

    def handle_write(self):
        log.debug("handle_write")
        if self.buffer:
            sent = self.send(self.buffer)
            log.debug("sent data")
            self.buffer = self.buffer[sent:]
        else:
            log.debug("nothing to send")
        if len(self.buffer) == 0:
            self.is_writable = False

    # Will this ever get called?  Does loop() call
    # handle_close() if we called close, to start with?
    def handle_close(self):
        log.debug("handle_close")
        log.info("conn_closed: client_address=%s:%s" % \
                     (self.client_address[0],
                      self.client_address[1]))
        self.close()
        #pass

class EchoServer(asyncore.dispatcher):

    allow_reuse_address         = False
    request_queue_size          = 5
    address_family              = socket.AF_INET
    socket_type                 = socket.SOCK_STREAM

    def __init__(self, address, handlerClass=EchoHandler):
        self.address            = address
        self.handlerClass       = handlerClass

        asyncore.dispatcher.__init__(self)
        self.create_socket(self.address_family,
                               self.socket_type)

        if self.allow_reuse_address:
            self.set_reuse_addr()

        self.server_bind()
        self.server_activate()

    def server_bind(self):
        self.bind(self.address)
        log.debug("bind: address=%s:%s" % (self.address[0], self.address[1]))

    def server_activate(self):
        self.listen(self.request_queue_size)
        log.debug("listen: backlog=%d" % self.request_queue_size)

    def fileno(self):
        return self.socket.fileno()

    def serve_forever(self):
        asyncore.loop()

    # TODO: try to implement handle_request()

    # Internal use
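    def handle_accept(self):
        # accept() may return None when the connection did not
        # actually take place; just keep listening in that case.
        pair = self.accept()
        if pair is None:
            return
        (conn_sock, client_address) = pair
        if self.verify_request(conn_sock, client_address):
            self.process_request(conn_sock, client_address)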

    def verify_request(self, conn_sock, client_address):
        return True

    def process_request(self, conn_sock, client_address):
        log.info("conn_made: client_address=%s:%s" % \
                     (client_address[0],
                      client_address[1]))
        self.handlerClass(conn_sock, client_address, self)

    def handle_close(self):
        self.close()

and to use it:

    import asyncore_echo_server

    interface = "0.0.0.0"
    port = 8080
    server = asyncore_echo_server.EchoServer((interface, port))
    server.serve_forever()

By parijatmishra, on January 4, 2008 at 11:34 am, under Uncategorized. Tags: python. 19 Comments


Comments

Diodotus Reserva On January 13, 2008 at 4:06 am

Thank you.
xtealc On March 13, 2008 at 10:04 am

Thanks a lot.
Tom@sQo On April 19, 2008 at 1:40 am

Thanks a lot. This is an article I’ve been looking for for ages ;)
FrogFace On May 4, 2008 at 10:08 am

Nice post.

There is one typo I’ve noticed, self.set_resue_addr() should be self.set_reuse_addr()
lalakis On May 26, 2008 at 11:58 pm

great post, thanx
Someone On December 31, 2009 at 3:27 am

Thanks for posting, will make it easier to get going with this class.
G G Chandrasekaran On February 5, 2010 at 3:26 pm

Very useful one for beginners
Antonis On May 30, 2010 at 11:15 pm

Great article!!!
I have a question.
can i combine this server in an application implemented with cmd library? cmdloop() is blocking.
And how?
Thanks
parijatmishra On May 31, 2010 at 11:04 am

@Antonis: since cmdloop() is blocking and so is asyncore.loop(), you have to execute them in different threads. So, for example (here interpreter stands for an instance of your own cmd.Cmd subclass):

import threading
interpreter.prompt = "prompt> "
t1 = threading.Thread(target=interpreter.cmdloop)
t1.start()
server = asyncore_echo_server.EchoServer((interface, port))
t2 = threading.Thread(target=server.serve_forever)
t2.start()

The other thing to consider is that you’d probably want cmdloop and the echoserver to communicate with each other. For example, the server should print what it receives to the console, and send what the user types to the network.

One way to do this is to create two queues Q1 and Q2. Your command interpreter should read the user input, perhaps transform or process it, and put the result in Q1. The EchoHandler.handle_write should read Q1 (instead of the “buffer” above) and send its contents out.

The EchoHandler’s handle_read method should read the network data and put it in Q2. Your command interpreter should, perhaps just after the user has entered a command, read the contents of Q2 and print it to console.

This is the general idea. When to read/write to the queues is highly application dependent.
- Curious On July 20, 2010 at 8:00 pm
  
  If I wanted to create a simple proxy it seems that I’d have to create 2 threads and probably use Queue as well. One thread to handle receiving and sending of data to the proxy and back to the client, and one thread to handle sending/receiving data to/from a website. If I understand what you’ve said above, I would also need to create a Queue to handle passing information back and forth between the two threads.
  
  After reading your posts, my understanding is that asyncore would function normally (in my suggested example above) and the use of threads would prevent any client side processing hangs/slow responses from slowing down the asyncore function itself.
  
  Is that model correct? Or is my understanding flawed?
  
  Thanks!
  - parijatmishra On July 21, 2010 at 11:16 am
    
    @Curious: your understanding is correct. Your scenario is conceptually the same as the one posted earlier by @Antonis, except in that case there was only one “client” (the cmdloop) and one “website” (the echoserver). In your case, however, presumably there will be many clients connecting to your server, asking for various URLs?
    
    If so, then your server needs to be more sophisticated: the thread getting the data back from the website has to “remember” which client originally requested that data and ensure it is sent to only that client.
  - Curious On July 21, 2010 at 7:54 pm
    
    (Interesting, I can’t reply to your reply.)
    
    I thought it was similar to the one presented by @Antonis but I wasn’t positive. I’m not a programmer by trade.
    
    Oh my, I had planned on all the necessary functions requiring the request and then returning both the request and response to the requestor. At this point I see how that’s not going to fully solve the problem of who sent what data and needs what response. I’ll somehow need to tag each thread.
    
    Is there some pythonic book that describes these types of scenarios?
    
    Thanks for the useful posts here! And for responding to my question.
  - parijatmishra On July 22, 2010 at 1:35 pm
    
    Any good book on network programming should cover these topics, perhaps with other languages/frameworks.
    
    “Twisted Network Programming Essentials” (http://www.amazon.com/Twisted-Network-Programming-Essentials-Fettig/dp/0596100329/ref=sr_1_1?ie=UTF8&s=books&qid=1279776775&sr=1-1) is a good book on how to write non-blocking IO based network clients and servers. It will cover your scenario, but use the Twisted framework. Which is interesting, because Twisted takes the core idea behind the asyncore module and takes it to a new level.
alexanderquinn On July 8, 2010 at 5:29 am

Thank you! This saved me a lot of time.
Veron On May 2, 2012 at 5:33 pm

Thank you! This tutorial really shows how Python abstracts the C select() and epoll() functions.

But I’ve two questions:
– I don’t understand why your EchoHandler class has a server parameter, and why the constructor is called with that parameter.
– I would also like to know whether, when we use an explicit map, we can act on this map ourselves, like adding file descriptors and such things.
Or is it just in order to have a reference to the map we’re using, and it is for the main loop (asyncore.loop()) to act on the given map?

Thanks again.
- parijatmishra On May 2, 2012 at 9:30 pm
  
  @Veron:
  
  – I don’t understand why your EchoHandler class has a server parameter, and why the constructor is called with that parameter.
  
  Good question. In the sample code, the server parameter is useless and can be removed. My EchoHandler does not use the server parameter because it is rather simple. In a real server, handlers would typically be reading/storing some state that is independent of a connection or that needs to survive across connections. This state can be conveniently kept in the server object and accessed by the individual handlers via the server parameter passed to them. Of course, instead of passing server object to the handlers, you could pass in any object that is suitable for storing the state. So again, yes, in the sample code, you can get rid of the server parameter.
  
  – I would also like to know whether, when we use an explicit map, we can act on this map ourselves, like adding file descriptors and such things.
  Or is it just in order to have a reference to the map we’re using, and it is for the main loop (asyncore.loop()) to act on the given map?
  
  If you pass in a map to loop explicitly, then you must: (a) add new dispatchers to it when you create them (when a new socket is accepted); (b) remove dispatchers from it when their socket is closed.
  
  The dispatcher code does the above two things for you, using the global map. If you are not using the global map, then you have to do these things yourself.
Veron On May 7, 2012 at 5:24 pm

Thanks parijatmishra for your great answer.
Veron On May 7, 2012 at 7:56 pm

However, I’d like to know if one can access the global map, in order to get a specific file descriptor/dispatcher, and write data on this file descriptor?

Or is it better, or even required, to provide a specific map if we want to do such things?
- parijatmishra On May 10, 2012 at 9:46 pm
  
  The problem with accessing the global map is that you don’t have a reference to it — you don’t know its name and which module it is defined in — unless you are willing to dig into the source code of the asyncore module.
  
  Actually, you DON’T want to know the name of the global map variable — it is an implementation detail that may change from version to version of the asyncore module, potentially breaking your code.
  
  Hence, if you want to get hold of an fd and do operations yourself, you are better off using your own map.

Parijat’s Weblog