Blog entries

  • Serving Cubicweb via WSGI with Pyramid: comparing the options

    2015/04/21 by David Douard

    CubicWeb can now be powered by Pyramid (thank you so much Christophe) instead of Twisted.

    I aim at moving all our applications to CubicWeb/Pyramid, so I wonder what will be the best way to deliver them. For now, we have a setup made of Apache + Varnish + Cubicweb/Twisted. In some applications we have two CubicWeb instances with a naive load balacing managed by Varnish.

    When moving to cubicweb-pyramid, there are several options. By default, a cubicweb-pyramid instance started via the cubicweb-ctl pyramid command, is running a waitress wsgi http server. I read it is common to deliver wsgi applications with nginx + uwsgi, but I wanted to play with mongrel2 (that I already tested with Cubicweb a while ago), and give a try to the circus + chaussette stack.

    I ran my tests :

    • using ab the simple Apache benchmark tool (aka ApacheBench) ;
    • on a clone of our logilab.org forge ;
    • on my laptop (Intel Core i7, 2.67GHz, quad core, 8Go),
    • using a postgresql 9.1 database server.

    Setup

    In order to be able to start the application as a wsgi app, a small python script is required. I extracted a small part of the cubicweb-pyramid ccplugin.py file into a elo.py file for this:

    appid = 'elo2'
    
    cwconfig = cwcfg.config_for(appid)
    application = wsgi_application_from_cwconfig(cwconfig)
    repo = cwconfig.repository()
    repo.start_looping_tasks()
    

    I tested 5 configurations: twisted, pyramid, mongrel2+wsgid, uwsgi and circus+chaussette. When possible, they were tested with 1 worker and 4 workers.

    Legacy Twisted mode

    Using good old legacy twisted setup:

    cubicwebctl start -D -l info elo
    

    The config setting that worth noting are:

    webserver-threadpool-size=6
    connections-pool-size=6
    

    Basic Pyramid mode

    Using the pyramid command that uses waitress:

    cubicwebctl pyramid --no-daemon -l info elo
    

    Mongrel2 + wsgid

    I have not been able to use uwsgi-mongrel2 as wsgi backend for mongrel2, since this uwsgi plugin is not provided by the uwsgi debian packages. I've used wsgid instead (sadly, the project appears to be dead).

    The mongrel config is:

    main = Server(
       uuid="f400bf85-4538-4f7a-8908-67e313d515c2",
       access_log="/logs/access.log",
       error_log="/logs/error.log",
       chroot="./",
       default_host="localhost",
       name="test",
       pid_file="/pid/mongrel2.pid",
       bind_addr="0.0.0.0",
       port=8083,
       hosts = [
           Host(name="localhost",
                routes={'/': Handler(send_spec='tcp://127.0.0.1:5000',
                                     send_ident='2113523d-f5ff-4571-b8da-8bddd3587475',
                                     recv_spec='tcp://127.0.0.1:5001',
                                     recv_ident='')
                       })
               ]
       )
    
    servers = [main]
    

    and the wsgid server is started with:

    wsgid --recv tcp://127.0.0.1:5000 --send tcp://127.0.0.1:5001 --keep-alive \
    --workers <N> --wsgi-app elo.application --app-path .
    

    uwsgi

    The config file used to start uwsgi is:

    [uwsgi]
    stats = 127.0.0.1:9191
    processes = <N>
    wsgi-file = elo.py
    http = :8085
    plugin = http,python
    virtualenv = /home/david/hg/grshells/venv/jpl
    enable-threads = true
    lazy-apps = true
    

    The tricky config option there is lazy-apps which must be set, otherwise the worker processes are forked after loading the cubicweb application, which this later does not support. If you omit this, only one worker will get the requests.

    circus + chaussette

    For the circus setup, I have used this configuration file:

    [circus]
    check_delay = 5
    endpoint = tcp://127.0.0.1:5555
    pubsub_endpoint = tcp://127.0.0.1:5556
    stats_endpoint = tcp://127.0.0.1:5557
    statsd = True
    httpd = True
    httpd_host = localhost
    httpd_port = 8086
    
    [watcher:webworker]
    cmd = /home/david/hg/grshells/venv/jpl/bin/chaussette --fd $(circus.sockets.webapp) elo2.app
    use_sockets = True
    numprocesses = 4
    
    [env:webworker]
    PATH=/home/david/hg/grshells/venv/jpl/bin:/usr/local/bin:/usr/bin:/bin
    CW_INSTANCES_DIR=/home/david/hg/grshells/grshell-jpl/etc
    PYTHONPATH=/home/david/hg/grshells//grshell-jpl
    
    [socket:webapp]
    host = 127.0.0.1
    port = 8085
    

    Results

    The bench are very simple; 100 requests from 1 worker or 500 requests from 5 concurrent workers, getting the main index page for the application:

    One ab worker

    ab -n 100 -c 1 http://127.0.0.1:8085/
    

    We get:

    Synthesis (1 client)

    Response times are:

    Response time (1 client)

    Five ab workers

    ab -n 500 -c 5 http://127.0.0.1:8085/
    

    We get:

    Synthesis (5 clients)

    Response times are:

    Response time (5 clients)

    Conclusion

    As expected, the legacy (and still default) twisted-based server is the least efficient method to serve a cubicweb application.

    When comparing results with only one CubicWeb worker, the pyramid+waitress solution that comes with cubicweb-pyramid is the most efficient, but mongrel2 + wsgid and circus + chaussette solutions mostly have similar performances when only one worker is activated. Surprisingly, the uwsgi solution is significantly less efficient, and especially have some requests that take significantly longer than other solutions (even the legacy twisted-based server).

    The price for activating several workers is small (around 3%) but significant when only one client is requesting the application. It is still unclear why.

    When there are severel workers requesting the application, it's not a surpsise that solutions with 4 workers behave significanly better (we are still far from a linear response however, roughly a 2x better for 4x the horsepower; maybe the hardware is the main reason for this unexpected non-linear response).

    I am quite surprised that uwsgi behaved significantly worse than the 2 other scalable solutions.

    Mongrel2 is still very efficient, but sadly the wsgid server I've used for these tests has not been developed for 2 years, and the uwsgi plugin for mongrel2 is not yet available on Debian.

    On the other side, I am very pleasantly surprised by circus + chaussette. Circus also comes with some nice features like a nice web dashboard which allows to add or remove workers dynamically:

    //www.cubicweb.org/file/5272071/raw //www.cubicweb.org/file/5272077/raw

  • Towards building a JavaScript user interface to CubicWeb

    2016/01/08 by Denis Laxalde

    This post is an introduction of a series of articles dealing with an on-going experiment on building a JavaScript user interface to CubicWeb, to ultimately replace the web component of the framework. The idea of this series is to present the main topics of the experiment, with open questions in order to eventually engage the community as much as possible. The other side of this is to experiment a blog driven development process, so getting feedback is the very point of it!

    As of today, three main topics have been identified:

    • the Web API to let the client and server communicate,
    • the issue of representing the application schema client-side, and,
    • the construction of components of the web interface (client-side).

    As part of the first topic, we'll probably rely on another experimental work about REST-fulness undertaken recently in pyramid-cubicweb (see this head for source code). Then, it appears quite clearly that we'll need sooner or later a representation of data on the client-side and that, quite obviously, the underlying format would be JSON. Apart from exchanging of entities (database) information, we already anticipate on the need for the HATEOAS part of REST. We already took some time to look at the existing possibilities. At a first glance, it seems that hydra is the most promising in term of capabilities. It's also built using semantic web technologies which definitely grants bonus point for CubicWeb. On the other hand, it seems a bit isolated and very experimental, while JSON API follows a more pragmatic approach (describe itself as an anti-bikeshedding tool) and appears to have more traction from various people. For this reason, we choose it for our first draft, but this topic seems so central in a new UI, and hard to hide as an implementation detail; that it definitely deserves more discussion. Other candidates could be Siren, HAL or Uber.

    Concerning the schema, it seems that there is consensus around JSON-Schema so we'll certainly give it a try.

    Finally, while there is nothing certain as of today we'll probably start on building components of the web interface using React, which is also getting quite popular these days. Beyond that choice, the first practical task in this topic will concern the primary view system. This task being neither too simple nor too complicated will hopefully result in a clearer overview of what the project will imply. Then, the question of edition will come up at some point. In this respect, perhaps it'll be a good time to put the UX question at a central place, in order to avoid design issues that we had in the past.

    Feedback welcome!