blog entries created by David Douard
  • Serving Cubicweb via WSGI with Pyramid: comparing the options

    2015/04/21 by David Douard

    CubicWeb can now be powered by Pyramid (thank you so much Christophe) instead of Twisted.

    I aim to move all our applications to CubicWeb/Pyramid, so I wonder what the best way to deliver them will be. For now, we have a setup made of Apache + Varnish + CubicWeb/Twisted. In some applications we have two CubicWeb instances with naive load balancing managed by Varnish.

    When moving to cubicweb-pyramid, there are several options. By default, a cubicweb-pyramid instance started via the cubicweb-ctl pyramid command runs a waitress WSGI HTTP server. I read that it is common to deliver WSGI applications with nginx + uwsgi, but I wanted to play with mongrel2 (which I had already tested with CubicWeb a while ago) and to give the circus + chaussette stack a try.

    I ran my tests:

    • using ab, the simple Apache benchmark tool (aka ApacheBench);
    • on a clone of our logilab.org forge;
    • on my laptop (Intel Core i7, 2.67 GHz, quad core, 8 GB);
    • using a PostgreSQL 9.1 database server.

    Setup

    In order to start the application as a WSGI app, a small Python script is required. I extracted a small part of the cubicweb-pyramid ccplugin.py file into an elo.py file for this:

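    # Imports assumed from cubicweb-pyramid's ccplugin.py (2015 layout).
    from cubicweb.cwconfig import CubicWebConfiguration as cwcfg
    from pyramid_cubicweb import wsgi_application_from_cwconfig

    appid = 'elo2'

    # Build the WSGI callable exposed to the WSGI servers.
    cwconfig = cwcfg.config_for(appid)
    application = wsgi_application_from_cwconfig(cwconfig)
    # Start the repository's periodic background tasks.
    repo = cwconfig.repository()
    repo.start_looping_tasks()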
    

    I tested 5 configurations: twisted, pyramid, mongrel2+wsgid, uwsgi and circus+chaussette. When possible, they were tested with 1 worker and 4 workers.
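
    wsgid, uwsgi and chaussette all do the same thing with elo.py: they import a WSGI callable from it and call it once per request. As a reminder of what that contract looks like, here is a minimal sketch using only the standard library. Both demo_application (a stand-in for the real CubicWeb application) and call_wsgi are hypothetical names made up for this illustration; they are not part of CubicWeb or of any server tested here.

```python
from wsgiref.util import setup_testing_defaults


def demo_application(environ, start_response):
    """Hypothetical stand-in for the `application` object built in elo.py."""
    body = ("Hello from %s" % environ["PATH_INFO"]).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_wsgi(app, path="/"):
    """Call a WSGI app once, as a server would, and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal, valid WSGI environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    chunks = app(environ, start_response)
    try:
        body = b"".join(chunks)
    finally:
        # WSGI iterables may define close(); servers are expected to call it.
        if hasattr(chunks, "close"):
            chunks.close()
    return captured["status"], body
```

    Anything that honours this calling convention can serve the application, which is what makes it possible to swap the servers compared below without touching the application code.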

    Legacy Twisted mode

    Using the good old legacy Twisted setup:

    cubicweb-ctl start -D -l info elo
    

    The config settings worth noting are:

    webserver-threadpool-size=6
    connections-pool-size=6
    

    Basic Pyramid mode

    Using the pyramid command that uses waitress:

    cubicweb-ctl pyramid --no-daemon -l info elo
    

    Mongrel2 + wsgid

    I have not been able to use uwsgi-mongrel2 as the WSGI backend for mongrel2, since this uwsgi plugin is not provided by the uwsgi Debian packages. I used wsgid instead (sadly, the project appears to be dead).

    The mongrel config is:

    main = Server(
       uuid="f400bf85-4538-4f7a-8908-67e313d515c2",
       access_log="/logs/access.log",
       error_log="/logs/error.log",
       chroot="./",
       default_host="localhost",
       name="test",
       pid_file="/pid/mongrel2.pid",
       bind_addr="0.0.0.0",
       port=8083,
       hosts = [
           Host(name="localhost",
                routes={'/': Handler(send_spec='tcp://127.0.0.1:5000',
                                     send_ident='2113523d-f5ff-4571-b8da-8bddd3587475',
                                     recv_spec='tcp://127.0.0.1:5001',
                                     recv_ident='')
                       })
               ]
       )
    
    servers = [main]
    

    and the wsgid server is started with:

    wsgid --recv tcp://127.0.0.1:5000 --send tcp://127.0.0.1:5001 --keep-alive \
    --workers <N> --wsgi-app elo.application --app-path .
    

    uwsgi

    The config file used to start uwsgi is:

    [uwsgi]
    stats = 127.0.0.1:9191
    processes = <N>
    wsgi-file = elo.py
    http = :8085
    plugin = http,python
    virtualenv = /home/david/hg/grshells/venv/jpl
    enable-threads = true
    lazy-apps = true
    

    The tricky config option here is lazy-apps, which must be set; otherwise the worker processes are forked after loading the CubicWeb application, which the latter does not support. If you omit it, only one worker will get the requests.
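
    Why loading before forking fails is not investigated here. One plausible explanation, and it is only an assumption, is that elo.py calls repo.start_looping_tasks(), which presumably relies on background threads, and threads started before a fork() do not exist in the child processes. The POSIX-only sketch below illustrates that general behaviour with a hypothetical background thread; it involves neither uwsgi nor CubicWeb.

```python
import os
import threading
import warnings


def thread_counts_across_fork():
    """Return (threads in parent, threads in forked child). POSIX only."""
    stop = threading.Event()
    # Stand-in for a background task started while loading the application.
    background = threading.Thread(target=stop.wait, daemon=True)
    background.start()
    parent_count = threading.active_count()

    read_fd, write_fd = os.pipe()
    with warnings.catch_warnings():
        # Python 3.12+ emits a DeprecationWarning when forking a
        # multi-threaded process, which is the situation illustrated here.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()
    if pid == 0:
        # Child: only the thread that called fork() exists here.
        os.close(read_fd)
        os.write(write_fd, str(threading.active_count()).encode("ascii"))
        os._exit(0)

    # Parent: read the child's report, then clean up.
    os.close(write_fd)
    child_count = int(os.read(read_fd, 16).decode("ascii"))
    os.close(read_fd)
    os.waitpid(pid, 0)
    stop.set()
    background.join()
    return parent_count, child_count
```

    With lazy-apps, each uwsgi worker loads the application itself after the fork, so whatever the application starts at load time exists in every worker.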

    circus + chaussette

    For the circus setup, I have used this configuration file:

    [circus]
    check_delay = 5
    endpoint = tcp://127.0.0.1:5555
    pubsub_endpoint = tcp://127.0.0.1:5556
    stats_endpoint = tcp://127.0.0.1:5557
    statsd = True
    httpd = True
    httpd_host = localhost
    httpd_port = 8086
    
    [watcher:webworker]
    cmd = /home/david/hg/grshells/venv/jpl/bin/chaussette --fd $(circus.sockets.webapp) elo2.app
    use_sockets = True
    numprocesses = 4
    
    [env:webworker]
    PATH=/home/david/hg/grshells/venv/jpl/bin:/usr/local/bin:/usr/bin:/bin
    CW_INSTANCES_DIR=/home/david/hg/grshells/grshell-jpl/etc
    PYTHONPATH=/home/david/hg/grshells//grshell-jpl
    
    [socket:webapp]
    host = 127.0.0.1
    port = 8085
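
    In this setup circus binds the listening socket declared in [socket:webapp] itself and hands it to each chaussette process through the $(circus.sockets.webapp) placeholder, so the name used in the watcher command has to match a declared socket section. The following sketch checks that with the standard configparser module. undefined_sockets is a hypothetical helper written for this illustration, not a circus feature, and the embedded configuration is a trimmed copy of the one above with the chaussette path shortened.

```python
import configparser
import re

# Trimmed copy of the configuration above (chaussette path shortened).
CIRCUS_INI = """
[watcher:webworker]
cmd = chaussette --fd $(circus.sockets.webapp) elo2.app
use_sockets = True
numprocesses = 4

[socket:webapp]
host = 127.0.0.1
port = 8085
"""

# Matches placeholders such as $(circus.sockets.webapp).
SOCKET_REF = re.compile(r"\$\(circus\.sockets\.([A-Za-z0-9_]+)\)")


def undefined_sockets(ini_text):
    """Return socket names used in watcher commands but never declared."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    declared = {
        section.split(":", 1)[1]
        for section in parser.sections()
        if section.startswith("socket:")
    }
    missing = set()
    for section in parser.sections():
        if section.startswith("watcher:"):
            cmd = parser.get(section, "cmd", fallback="")
            missing.update(set(SOCKET_REF.findall(cmd)) - declared)
    return sorted(missing)
```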
    

    Results

    The benchmarks are very simple: 100 requests from 1 ab worker, or 500 requests from 5 concurrent ab workers, fetching the main index page of the application:

    One ab worker

    ab -n 100 -c 1 http://127.0.0.1:8085/
    

    We get:

    [Figure: Synthesis (1 client)]

    Response times are:

    [Figure: Response time (1 client)]

    Five ab workers

    ab -n 500 -c 5 http://127.0.0.1:8085/
    

    We get:

    [Figure: Synthesis (5 clients)]

    Response times are:

    [Figure: Response time (5 clients)]
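
    For readers without ab at hand, the same kind of measurement can be approximated with the Python standard library. This is only a rough sketch of what ab does (a fixed number of GET requests issued by a fixed number of concurrent clients), not the tool used for the figures above; timed_get and bench are hypothetical names.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Bypass any HTTP proxy from the environment: the target is a local server.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def timed_get(url, timeout=30.0):
    """Fetch url once and return the elapsed time in seconds."""
    start = time.perf_counter()
    with _OPENER.open(url, timeout=timeout) as response:
        response.read()
    return time.perf_counter() - start


def bench(url, requests, concurrency):
    """Issue `requests` GETs using `concurrency` client threads."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        timings = list(pool.map(timed_get, [url] * requests))
    elapsed = time.perf_counter() - started
    return {
        "requests": len(timings),
        "requests_per_second": len(timings) / elapsed,
        "mean_ms": 1000 * sum(timings) / len(timings),
        "max_ms": 1000 * max(timings),
    }
```

    Calling bench("http://127.0.0.1:8085/", 100, 1) and bench("http://127.0.0.1:8085/", 500, 5) would mirror the two ab invocations. Numbers obtained this way are not comparable with the ab results, since the Python client adds its own overhead.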

    Conclusion

    As expected, the legacy (and still default) twisted-based server is the least efficient method to serve a cubicweb application.

    When comparing results with only one CubicWeb worker, the pyramid+waitress solution that comes with cubicweb-pyramid is the most efficient, but the mongrel2 + wsgid and circus + chaussette solutions show mostly similar performance when only one worker is activated. Surprisingly, the uwsgi solution is significantly less efficient, and in particular has some requests that take significantly longer than with the other solutions (even the legacy Twisted-based server).

    The price for activating several workers is small (around 3%) but significant when only one client is requesting the application. It is still unclear why.

    When there are several ab workers requesting the application, it is no surprise that the solutions with 4 workers behave significantly better (we are still far from a linear response, however: roughly 2x better for 4x the horsepower; maybe the hardware is the main reason for this unexpected non-linear response).
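
    To put the "roughly 2x better for 4x the horsepower" figure in perspective, the short calculation below derives the parallel efficiency and, under Amdahl's law, the share of the work that would have to be serial to explain it. Reading the result through Amdahl's law is an assumption made for illustration; the measurements do not establish it, and both function names are hypothetical.

```python
def parallel_efficiency(speedup, workers):
    """Fraction of the ideal linear speedup actually obtained."""
    return speedup / workers


def amdahl_serial_fraction(speedup, workers):
    """Serial fraction s solving speedup = 1 / (s + (1 - s) / workers)."""
    return (workers / speedup - 1) / (workers - 1)


# Rough figures from the text: 2x speedup with 4 workers.
efficiency = parallel_efficiency(2.0, 4)   # 0.5, i.e. 50% of linear scaling
serial = amdahl_serial_fraction(2.0, 4)    # 1/3 of the work not parallelised
```

    In this reading, anything shared between the workers on the same laptop (the PostgreSQL server, ab itself, the CPU cores) would count towards that serial third.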

    I am quite surprised that uwsgi behaved significantly worse than the 2 other scalable solutions.

    Mongrel2 is still very efficient, but sadly the wsgid server I've used for these tests has not been developed for 2 years, and the uwsgi plugin for mongrel2 is not yet available on Debian.

    On the other hand, I am very pleasantly surprised by circus + chaussette. Circus also comes with some nice features, like a web dashboard that allows adding or removing workers dynamically:

    //www.cubicweb.org/file/5272071/raw //www.cubicweb.org/file/5272077/raw

  • CubicWeb Roadmap meeting on March 5th 2015

    2015/03/11 by David Douard

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. The previous roadmap meeting was in January 2015.

    Christophe de Vienne (Unlish) and Aurélien Campéas (self-employed) joined us.

    Christophe de Vienne asked for discussions on:

    • Security Context: settle on an approach, and make it happen.
    • Pyramid Cubicweb adoption: where are we? what authentication stack do we want by default?
    • Package layout (aka "develop mode" friendliness): let's get real
    • Documentation: is the restructuring attempt (https://www.cubicweb.org/ticket/4832808) a credible path for the documentation?

    Aurélien Campéas asked for discussions on:

    • status of integration in the 3.21 branch
    • a new API for cubicweb stores

    Sylvain Thénault asked for discussions on:

    • a new API for dataimport (including cubicweb stores, but not only),
    • new integrators on CW

    Versions

    Cubicweb

    Version 3.18

    This version is stable but old, and still maintained (current is 3.18.8).

    Version 3.19

    This version is stable and maintained (current is 3.19.9).

    Version 3.20

    This version is now stable and maintained (current is 3.20.4).

    Version 3.21

    See below

    Agenda

    The next roadmap meeting will be held at the beginning of May 2015 at Logilab. Interested parties are invited to get in touch.

    Open Discussions

    New integrators

    Rémi Cardona (rcardona) and Denis Laxalde (dlaxalde) now have the publish access level on CubicWeb repositories.

    Security context

    Christophe presented his proposal for a "security context" in CubicWeb, as described in https://lists.cubicweb.org/pipermail/cubicweb/2015-February/002278.html and https://lists.cubicweb.org/pipermail/cubicweb/2015-February/002297.html, along with a proposed implementation (see https://www.cubicweb.org/ticket/4919855).

    The idea has been validated, based on substitution variables whose names will start with "ctx:" (the RQL grammar will have to be modified to accept a ":").

    This will then make it possible to write RQL queries like (API still to be tuned):

    X owned_by U, U eid %(ctx:cwuser_eid)s
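
    Since the API is still to be tuned, the following is only a sketch of how such "ctx:" variables might be filled in before a query is executed. Everything in it is hypothetical (merge_args, the security_context dictionary, the rule that callers may not supply "ctx:" names themselves); none of it is CubicWeb API.

```python
import re

CTX_PREFIX = "ctx:"
# Matches %(name)s substitution variables, including names containing ":".
PLACEHOLDER = re.compile(r"%\(([^)]+)\)s")


def placeholders(rql):
    """Return the substitution variable names used in an RQL string."""
    return PLACEHOLDER.findall(rql)


def merge_args(rql, user_args, security_context):
    """Build the argument dict for `rql`.

    Variables starting with "ctx:" are taken from `security_context` only,
    so that a caller cannot override them through `user_args`.
    """
    for name in user_args:
        if name.startswith(CTX_PREFIX):
            raise ValueError("%r is reserved for the security context" % name)
    args = dict(user_args)
    for name in placeholders(rql):
        if name.startswith(CTX_PREFIX):
            args[name] = security_context[name[len(CTX_PREFIX):]]
    return args
```

    For the query above, merge_args(rql, {}, {"cwuser_eid": 42}) yields a dictionary with the single key "ctx:cwuser_eid", which is the form in which query arguments are usually passed alongside an RQL string.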
    

    Pyramid

    The pyramid-based web server proposed by Christophe and used for his unlish website is still under test and evaluation at Logilab. Some features (implemented in cubes) required to deploy pyramid-cubicweb for most of the applications used at Logilab are still missing, especially cubicweb-signedrequest.

    In order to make it possible to implement authentication cubes like cubicweb-signedrequest, pyramid-cubicweb requires some modifications. These have been developed and are about to be published, along with a new version of signedrequest that provides pyramid compatibility.

    There are still some dependencies that lack a proper Debian package, but that should be done in the next few weeks.

    In order to properly identify pyramid-related code in a cube, it has been proposed that this code should go in modules of the cube named pviews and pconfig (note that most cubes won't require any pyramid-specific code). The includeme function should however be in the cube's main package (in the __init__.py file).
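
    As an illustration of that layout, a cube's __init__.py could contain something like the sketch below. What includeme actually does with pconfig and pviews was not specified at the meeting, so the body is an assumption: it relies only on the standard Pyramid Configurator methods include() and scan(), and supposes that pconfig defines its own includeme function.

```python
def includeme(config):
    """Pyramid entry point, placed in the cube's __init__.py."""
    # Relative dotted names are resolved against the package being
    # included, i.e. the cube itself.
    config.include(".pconfig")  # assumes pconfig defines its own includeme()
    config.scan(".pviews")      # registers the views declared in pviews
```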

    There have been some discussions about the fact that, for now, a pyramid-cubicweb instance requires an anonymous user/access, which can also be a problem for some applications.

    Layout

    Christophe pointed out that the directory/file layout of CubicWeb and cubes does not follow Python's current de facto standards, which makes CubicWeb hard to use in the context of a virtualenv/pip-based installation. CWEP004 discusses some aspects of this problem.

    The decision has been taken to move toward a Cubicweb ecosystem that is more pip-friendly. This will be done step by step, starting with the dependencies (packages currently living in the logilab "namespace").

    Then we will investigate the feasibility of migrating the layout of Cubicweb itself.

    Documentation

    The new documentation structure has been approved.

    It has been proposed (and more or less accepted) to extract the documentation in a dedicated project. This is not a priority, however.

    Roadmap for 3.21

    No change since last meeting:

    • the complete removal of the dbapi and the merging of Connection and ClientConnection: remains
    • Integrate the pyramid cube to provide the pyramid command if the pyramid framework can be imported: removed (too soon, pyramid-cubicweb's APIs are not stable enough)
    • Integration of CWEP-003 (FROM clause for RQL): removed (will probably never be included unless someone needs it)
    • CWEP-004 (cubes as standard python packages) is being discussed: removed (not for 3.21, see above)

    dataimports and stores

    A heavy refactoring is under way that concerns data import in CubicWeb. The main goal is to design a single API to be used by the various cubes that accelerate the insertion of data (dataio, massiveimport, fastimport, etc) as well as the internal CWSource and its data feeds.

    For details, see the thread on the mailing-list and the patches arriving in the review pipeline.


  • Cubicweb sprints winter/spring 2014

    2014/01/24 by David Douard

    The Logilab team is pleased to announce two Cubicweb sprints to be held in its Paris offices in the upcoming months:

    February 13/14th at Logilab in Paris

    The agenda would be the FROM clause for which a CWEP is still awaited, and the RQL rewriter according to the CWEP02.

    April 28/30th at Logilab in Paris

    Agenda to be defined.

    Join the party

    All users and contributors of CubicWeb are invited to join the party. Just send an email to contact at Logilab.fr if you plan to come.

    http://farm1.static.flickr.com/183/419945378_4ead41a76d_m.jpg