
Blog entries

  • Monitor all the things! ... and early too!

    2016/09/16 by Arthur Lutz

    Following the "release often, release early" mantra, I thought it might be a good idea to apply it to monitoring on one of our client projects. So right from the demo stage, where we deliver a new version every few weeks (and sometimes every few days), we set up some monitoring.

    https://www.cubicweb.org/file/15338085/raw/66511658.jpg

    Monitoring performance

    The project is an application built with the CubicWeb platform, with some ElasticSearch for indexing and searching. As with any complex stack, there are a great number of places where one could monitor performance metrics.

    https://www.cubicweb.org/file/15338628/raw/Screenshot_2016-09-16_12-19-21.png

    Here are a few things we have decided to monitor, and with what tools.

    Monitoring CubicWeb

    To monitor our running Python code, we have decided to use statsd, since it is already built into CubicWeb's core. Out of the box, you can configure a statsd server address in your all-in-one.conf configuration. That will send out some timing statistics about some core functions.

    The statsd server (there are numerous implementations, we use a simple one: python-pystatsd) gets the raw metrics and outputs them to carbon, which stores the time series data in whisper files (which can be swapped out for a different technology if need be).

    https://www.cubicweb.org/file/15338392/raw/Screenshot_2016-09-16_11-56-44.png

    If we are curious about a particular function or view that might be taking too long to generate or slow down the user experience, we can just add the @statsd_timeit decorator there. Done. It's monitored.

    statsd monitoring is a fire-and-forget, UDP type of monitoring: the sender never waits for a reply, so it should not have any noticeable impact on the performance of what you are monitoring.
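
    To make the fire-and-forget idea concrete, here is a minimal sketch of a timing decorator that sends a statsd-style timer over UDP. It is not CubicWeb's @statsd_timeit implementation; the udp_timeit name and the default address are assumptions made for illustration.

```python
import functools
import socket
import time

# Hypothetical default address; 8125 is the port statsd servers usually listen on.
STATSD_ADDR = ("127.0.0.1", 8125)
_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)


def udp_timeit(name, addr=STATSD_ADDR):
    """Time the wrapped function and send '<name>:<ms>|ms' as one UDP datagram."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000.0
                payload = "%s:%.3f|ms" % (name, elapsed_ms)
                try:
                    # No reply is awaited: if nobody listens, the metric is simply lost.
                    _sock.sendto(payload.encode("ascii"), addr)
                except OSError:
                    pass
        return wrapper
    return decorator
```

    Decorating a function with @udp_timeit("myapp.some_view") is then enough to get one timer sample per call, and a missing or slow statsd server never blocks the decorated code.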

    Monitoring Apache

    Simply enough we re-use the statsd approach by plugging in an apache module to time the HTTP responses sent back by apache. With nginx and varnish, this is also really easy.

    https://www.cubicweb.org/file/15338407/raw/Screenshot_2016-09-16_11-56-54.png

    One of the nice things about this part is that we can then get graphs of errors since we will differentiate OK 200 type codes from 500 type codes (HTTP codes).

    Monitoring ElasticSearch

    ElasticSearch exposes some metrics through its GET /_stats endpoint; the same goes for individual nodes, individual indices and even the cluster level. Some popular tools can be installed through the ElasticSearch plugin system or with Kibana (plugin system there too).

    We decided on a different approach that fitted well with our other tools (and demonstrates their flexibility!) : pull stats out of ElasticSearch with SaltStack, push them to Carbon, pull them out with Graphite and display them in Grafana (next to our other metrics).

    https://www.cubicweb.org/file/15338399/raw/Screenshot_2016-09-16_11-56-34.png

    On the SaltStack side, we wrote a two line execution module (elasticsearch.py)

    import requests

    def stats():
        # Return the statistics of the local ElasticSearch node as a dictionary.
        return requests.get('http://localhost:9200/_stats').json()
    

    This gets shipped using the custom execution modules mechanism (_modules and saltutil.sync_modules), and is executed every minute (or less) in the salt scheduler. The resulting dictionary is fed to the carbon returner that is configured to talk to a carbon server somewhere nearby.

    # salt demohost elasticsearch.stats
    [snip]
      { "indextime_inmillis" : 30,
    [snip]
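
    To illustrate what happens to such a nested dictionary on its way to carbon, here is a minimal sketch that flattens it into lines of carbon's plaintext protocol ("path value timestamp"). This is not the code of the salt carbon returner; the function names, the metric prefix and the sample data are assumptions.

```python
import time


def flatten(stats, prefix):
    """Yield (dotted.path, value) pairs for every numeric leaf of a nested dict."""
    for key, value in stats.items():
        path = "%s.%s" % (prefix, key)
        if isinstance(value, dict):
            yield from flatten(value, path)
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            yield path, value


def carbon_lines(stats, prefix, timestamp=None):
    """Format the metrics as carbon plaintext lines: 'path value timestamp'."""
    ts = int(time.time() if timestamp is None else timestamp)
    return ["%s %s %d" % (path, value, ts) for path, value in flatten(stats, prefix)]


# Made-up sample shaped like a small part of a stats dictionary.
sample = {"indexing": {"index_total": 30, "is_throttled": False}}
lines = carbon_lines(sample, "demohost.es", timestamp=0)
# lines == ["demohost.es.indexing.index_total 30 0"]
```

    Non-numeric leaves (strings, booleans) are skipped since carbon only stores numbers.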
    

    Monitoring web metrics

    To evaluate parts of the performance of a web page we can look at some metrics such as the number of assets the browser will need to download, the size of the assets (js, css, images, etc) and even things such as the number of subdomains used to deliver assets. You can take a look at such metrics in most developer tools available in the browser, but we want to graph this over time. A nice tool for this is sitespeed.io (written in javascript with phantomjs). Out of the box, it has a graphite outputter so we just have to add --graphiteHost FQDN. sitespeed.io even recommends using grafana to visualize the results and publishes some example dashboards that can be adapted to your needs.

    https://www.cubicweb.org/file/15338109/raw/sitespeed-logo-2c.png

    The sitespeed.io command is configured and run by salt using pillars and its scheduler.

    We will have to take a look at using their jenkins plugin with our jenkins continuous integration instance.

    Monitoring crashes / errors / bugs

    Applications will have bugs (in particular when released often to get a client to validate some design choices early). Level 0 is having your client calling you up saying the application has crashed. The next level is watching some log somewhere to see those errors pop up. The next level is centralised logs on which you can monitor the numerous pieces of your application (rsyslog over UDP helps here, graylog might be a good solution for visualisation).

    https://www.cubicweb.org/file/15338139/raw/Screenshot_2016-09-16_11-30-53.png

    It starts getting useful and usable when your bugs get reported with some rich context. That's where sentry comes in. It's free software developed on github (although the website does not really show that) and it is written in python, so it was a good match for our culture. And it is pretty awesome too.

    We plug sentry into our WSGI pipeline (thanks to cubicweb-pyramid) by installing and configuring the sentry cube: cubicweb-sentry. This will catch bugs with rich context and provide us with vital information about what the user was doing when the crash occurred.

    This also helps sharing bug information within a team.

    The sentry cube reports on errors being raised when using the web application, but can also catch some errors when running some maintenance or import commands (ccplugins in CubicWeb). In this particular case, a lot of importing is being done and Sentry can detect and help us triage the import errors with context on which files are failing.

    Monitoring usage / client side

    This part is a bit neglected for the moment. Client side we can use Javascript to monitor usage. Some basic metrics can come from piwik, which is usually used for audience statistics. To get more precise statistics we've been told Boomerang has an interesting approach, enabling a closer look at how fast a page was displayed client side, how much time was spent on DNS, etc.

    On the client side, we're also looking at two features of the Sentry project : the raven-js client which reports Javascript errors directly from the browser to the Sentry server, and the user feedback form which captures some context when something goes wrong or a user/client wants to report that something should be changed on a given page.

    Load testing - coverage

    To wrap up, we also often generate traffic to catch some bugs and performance metrics automatically :

    • wget --mirror $URL
    • linkchecker $URL
    • for search_term in $(cat corpus); do wget $URL/$search_term ; done
    • wapiti $URL --scope page
    • nikto $URL

    Then watch the graphs and the errors in Sentry... Fix them. Restart.

    Graphing it in Grafana

    We've spent little time on the dashboard so far, since we're concentrating on collecting the metrics for now. But here is a glimpse of the "work in progress" dashboard, which combines various data sources and various metrics on the same screen and the same time scale.

    https://www.cubicweb.org/file/15338648/raw/Screenshot_2016-09-13_09-41-45.png

    Further plans

    • internal health checks: we're taking a look at python-hospital and healthz: Stop reverse engineering applications and start monitoring from the inside (Monitorama) (the idea is to distinguish between "the app is running" and "the app is serving its purpose"), and pyramid_health
    • graph the number of Sentry errors and the number of types of errors: the sentry API should be able to give us this information. Feed it to Salt and Carbon.
    • setup some alerting : next versions of Grafana will be doing that, or with elastalert
    • setup "release version X" events in Graphite that are displayed in Grafana, maybe with some manual command or a postcreate command when using docker-compose up ?
    • make it easier for devs to have this kind of setup. Using this suite of tools in development might sometimes be overkill, but can be useful.
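
    For the "release version X" events mentioned in the list above, a minimal sketch could build a POST request for Graphite's events endpoint (/events/), which accepts a JSON body with "what", "tags" and "data" members. The function name, the host name and the tag are assumptions, and the request is only built here, not sent.

```python
import json
import urllib.request


def release_event_request(graphite_url, version, tags="release"):
    """Build (without sending) a POST request announcing a release to Graphite."""
    body = json.dumps({
        "what": "release version %s" % version,
        "tags": tags,  # space-separated string of tags
        "data": "",
    }).encode("utf-8")
    return urllib.request.Request(
        graphite_url.rstrip("/") + "/events/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host name, for illustration only.
request = release_event_request("http://graphite.example", "1.2")
```

    Sending it would be a matter of calling urllib.request.urlopen(request) from a deployment hook; Grafana can then display such events as annotations on the graphs.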

  • Status of the CubicWeb python3 porting effort, February 2016

    2016/02/05 by Julien Cristau

    An effort to port CubicWeb to a dual python 2.6/2.7 and 3.3+ code base was started by Rémi Cardona in summer of 2014. The first task was to port all of CubicWeb's dependencies:

    • logilab-common 0.63
    • logilab-database 1.14
    • logilab-mtconverter 0.9
    • logilab-constraint 0.6
    • yams 0.40
    • rql 0.34

    Once that was out of the way, we could start looking at CubicWeb itself. We first set out to make sure we used python3-compatible syntax in all source files, then started to go and make as much of the test suite as possible pass under both python2.7 and python3.4. As of the 3.22 release, we are almost there. The remaining pain points are:

    • cubicweb's setup.py hadn't been converted. This is fixed in the 3.23 branch as of https://hg.logilab.org/master/cubicweb/rev/0b59724cb3f2 (don't follow that link, the commit is huge)
    • the CubicWebServerTC test class uses twisted to start an http server thread, and twisted itself is not available for python3
    • the current method to serialize schema constraints into CWConstraint objects gives different results on python2 and python3, so it needs to be fixed (https://www.logilab.org/ticket/296748)
    • various questions around packaging and deployment: what happens to e.g. the cubicweb-common package installing into python2's site-packages directory? What does the ${prefix}/share/cubicweb directory become? How do cubes express their dependencies? Do we need a flag day? What does that mean for applications?

  • Using JSONAPI as a Web API format for CubicWeb

    2016/01/26 by Denis Laxalde

    Following the introduction post about rethinking the web user interface of CubicWeb, this article will address the topic of the Web API to exchange data between the client and the server. As mentioned earlier, this question is somewhat central and deserves particular interest, and better early than late. Of the two candidate representations previously identified, Hydra and JSON API, this article will focus on the latter. Hopefully, this will give better insight into the capabilities and limits of this specification and help in making a decision, though a similar experiment with another candidate would be good to have. Still in the process of blog driven development, this post has several open questions from which a discussion would hopefully emerge...

    A glance at JSON API

    JSON API is a specification for building APIs that use JSON as a data exchange format between clients and a server. The media type is application/vnd.api+json. It has a 1.0 version available from mid-2015. The format has interesting features such as the ability to build compound documents (i.e. response made of several, usually related, resources) or to specify filtering, sorting and pagination.

    A document following the JSON API format basically represents resource objects, their attributes and relationships as well as some links also related to the data of primary concern.

    Taking the example of a Ticket resource modeled after the tracker cube, we could have a JSON API document formatted as:

    GET /ticket/987654
    Accept: application/vnd.api+json
    
    {
      "links": {
        "self": "https://www.cubicweb.org/ticket/987654"
      },
      "data": {
        "type": "ticket",
        "id": "987654",
        "attributes": {
          "title": "Let's use JSON API in CubicWeb",
          "description": "Well, let's try, at least..."
        },
        "relationships": {
          "concerns": {
            "links": {
              "self": "https://www.cubicweb.org/ticket/987654/relationships/concerns",
              "related": "https://www.cubicweb.org/ticket/987654/concerns"
            },
            "data": {"type": "project", "id": "1095"}
          },
          "done_in": {
            "links": {
              "self": "https://www.cubicweb.org/ticket/987654/relationships/done_in",
              "related": "https://www.cubicweb.org/ticket/987654/done_in"
            },
            "data": {"type": "version", "id": "998877"}
          }
        }
      },
      "included": [{
        "type": "project",
        "id": "1095",
        "attributes": {
            "name": "CubicWeb"
        },
        "links": {
          "self": "https://www.cubicweb.org/project/cubicweb"
        }
      }]
    }
    

    In this JSON API document, top-level members are links, data and included. The latter is used here to ship some resources (here a "project") related to the "primary data" (a "ticket") through the "concerns" relationship as denoted in the relationships object (more on this later).
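
    On the client side, making use of such a compound document boils down to matching the resource identifier found in a relationship against the included resources. Here is a minimal sketch in Python; the function names are assumptions and only to-one relationships are handled.

```python
def index_included(document):
    """Map (type, id) to each resource object shipped in 'included'."""
    return {(res["type"], res["id"]): res for res in document.get("included", [])}


def resolve_related(document, relation):
    """Return the full resource a to-one relationship points at, or None."""
    identifier = document["data"]["relationships"][relation]["data"]
    return index_included(document).get((identifier["type"], identifier["id"]))


# Trimmed-down version of the ticket document shown above.
ticket = {
    "data": {
        "type": "ticket",
        "id": "987654",
        "relationships": {
            "concerns": {"data": {"type": "project", "id": "1095"}},
            "done_in": {"data": {"type": "version", "id": "998877"}},
        },
    },
    "included": [
        {"type": "project", "id": "1095", "attributes": {"name": "CubicWeb"}},
    ],
}
project = resolve_related(ticket, "concerns")
version = resolve_related(ticket, "done_in")
```

    Here project is the full "CubicWeb" resource, while version is None because the version resource was not shipped in included.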

    While the decision of including or not these related resources along with the primary data is left to the API designer, JSON API also offers a specification to build queries for inclusion of related resources. For example:

    GET /ticket/987654?include=done_in
    Accept: application/vnd.api+json
    

    would lead to a response including the full version resource along with the above content.

    Enough for the JSON API overview. Next I'll present how various aspects of data fetching and modification can be achieved through the use of JSON API in the context of a CubicWeb application.

    CRUD

    CRUD of resources is handled in a fairly standard way in JSON API, relying on HTTP protocol semantics.

    For instance, creating a ticket could be done as:

    POST /ticket
    Content-Type: application/vnd.api+json
    Accept: application/vnd.api+json
    
    {
      "data": {
        "type": "ticket",
        "attributes": {
          "title": "Let's use JSON API in CubicWeb",
          "description": "Well, let's try, at least..."
        },
        "relationships": {
          "concerns": {
            "data": { "type": "project", "id": "1095" }
          }
        }
      }
    }
    

    Then updating it (assuming we got its id from a response to the above request):

    PATCH /ticket/987654
    Content-Type: application/vnd.api+json
    Accept: application/vnd.api+json
    
    {
      "data": {
        "type": "ticket",
        "id": "987654",
        "attributes": {
          "description": "We'll succeed, for sure!"
        }
      }
    }
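
    Note that a PATCH only needs to carry the attributes that change: per the specification, attributes missing from the request keep their current values. A minimal sketch of that merge on plain dictionaries (the apply_patch name is an assumption, and relationships are left out):

```python
import copy


def apply_patch(resource, patch):
    """Return a copy of 'resource' with the attributes of 'patch' merged in."""
    if (patch["type"], patch["id"]) != (resource["type"], resource["id"]):
        raise ValueError("patch does not target this resource")
    updated = copy.deepcopy(resource)
    # Attributes absent from the patch keep their current value.
    updated.setdefault("attributes", {}).update(patch.get("attributes", {}))
    return updated


stored = {
    "type": "ticket",
    "id": "987654",
    "attributes": {
        "title": "Let's use JSON API in CubicWeb",
        "description": "Well, let's try, at least...",
    },
}
patch = {
    "type": "ticket",
    "id": "987654",
    "attributes": {"description": "We'll succeed, for sure!"},
}
patched = apply_patch(stored, patch)
```

    After the call, patched has the new description and the original title, and stored is left untouched.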
    

    Relationships

    In JSON API, a relationship is in fact a first class resource as it is defined by a noun and a URI through a link object. In this respect, the client just receives a couple of links and can then operate on them using the proper HTTP verb. Fetching or updating relationships is done using the special <resource url>/relationships/<relation type> endpoint (self member of relationships items in the first example). Quite naturally, the specification relies on the GET verb for fetching targets, PATCH for (re)setting a relation (i.e. replacing its targets), POST for adding targets and DELETE to drop them.

    GET /ticket/987654/relationships/concerns
    Accept: application/vnd.api+json
    
    {
      "data": {
        "type": "project",
        "id": "1095"
      }
    }
    
    PATCH /ticket/987654/relationships/done_in
    Content-Type: application/vnd.api+json
    Accept: application/vnd.api+json
    
    {
      "data": {
        "type": "version",
        "id": "998877"
      }
    }
    

    The request and response bodies of this <resource url>/relationships/<relation type> endpoint consist of so-called resource identifier objects, which are lightweight representations of resources usually only containing information about their "type" and "id" (enough to uniquely identify them).
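
    The semantics of the three modifying verbs can be summed up with a minimal sketch operating on lists of resource identifier objects. It covers the to-many case (the examples above are to-one relationships), and the function names are assumptions.

```python
def _key(identifier):
    """A resource identifier object is uniquely identified by (type, id)."""
    return (identifier["type"], identifier["id"])


def update_relationship(current, method, identifiers):
    """Return the new targets of a to-many relationship after a request."""
    if method == "PATCH":  # replace all targets
        return list(identifiers)
    if method == "POST":  # add targets that are not already present
        result = list(current)
        seen = {_key(item) for item in result}
        for item in identifiers:
            if _key(item) not in seen:
                seen.add(_key(item))
                result.append(item)
        return result
    if method == "DELETE":  # drop the listed targets
        dropped = {_key(item) for item in identifiers}
        return [item for item in current if _key(item) not in dropped]
    raise ValueError("unsupported method: %r" % method)
```

    For example, POSTing an identifier that is already a target leaves the relationship unchanged, while PATCHing an empty list clears it.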

    Related resources

    Remember the related member appearing in relationships links in the first example?

      [ ... ]
      "done_in": {
        "links": {
          "self": "https://www.cubicweb.org/ticket/987654/relationships/done_in",
          "related": "https://www.cubicweb.org/ticket/987654/done_in"
        },
        "data": {"type": "version", "id": "998877"}
      }
      [ ... ]
    

    While this is not a mandatory part of the specification, it has an interesting usage for fetching relationship targets. In contrast with the .../relationships/... endpoint, this one is expected to return plain resource objects (with attributes and relationships information in particular).

    GET /ticket/987654/done_in
    Accept: application/vnd.api+json
    
    {
      "links": {
        "self": "https://www.cubicweb.org/998877"
      },
      "data": {
        "type": "version",
        "id": "998877",
        "attributes": {
            "number": 4.2
        },
        "relationships": {
          "version_of": {
            "links": {
              "self": "https://www.cubicweb.org/998877/relationships/version_of"
            },
            "data": { "type": "project", "id": "1095" }
          }
        }
      },
      "included": [{
        "type": "project",
        "id": "1095",
        "attributes": {
            "name": "CubicWeb"
        },
        "links": {
          "self": "https://www.cubicweb.org/project/cubicweb"
        }
      }]
    }
    

    Meta information

    The JSON API specification allows including non-standard information using a so-called meta object. This can be found in various places of the document (top-level, resource objects or relationships object). Usage of this field is completely free (and optional). For instance, we could use this field to store the workflow state of a ticket:

    {
      "data": {
        "type": "ticket",
        "id": "987654",
        "attributes": {
          "title": "Let's use JSON API in CubicWeb"
        },
        "meta": { "state": "open" }
      }
    }
    

    Permissions

    Permissions are part of metadata to be exchanged during request/response cycles. As such, the best place to convey this information is probably within the headers. According to JSON API's FAQ, this is also the recommended way for a resource to advertise on supported actions.

    So for instance, response to a GET request could include Allow headers, indicating which request methods are allowed on the primary resource requested:

    GET /ticket/987654
    Allow: GET, PATCH, DELETE
    

    A HEAD request could also be used for querying allowed actions on links (such as relationships):

    HEAD /ticket/987654/relationships/comments
    Allow: POST
    

    This approach has the advantage of being standard HTTP: no particular knowledge of the permissions model is required and the response body is not cluttered with this metadata.
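
    On the server side, producing such a header could be as simple as translating the permissions granted to the user on the resource into HTTP methods. A minimal sketch, where the permission names follow CubicWeb's read/add/update/delete actions but the mapping itself is an assumption:

```python
# Hypothetical mapping from permission names to HTTP methods.
METHOD_FOR_PERMISSION = {
    "read": "GET",
    "add": "POST",
    "update": "PATCH",
    "delete": "DELETE",
}
METHOD_ORDER = ["GET", "POST", "PATCH", "DELETE"]


def allow_header(permissions):
    """Build the value of an Allow header from a collection of permissions."""
    methods = {METHOD_FOR_PERMISSION[p] for p in permissions if p in METHOD_FOR_PERMISSION}
    return ", ".join(m for m in METHOD_ORDER if m in methods)


header = allow_header(["read", "update", "delete"])
# header == "GET, PATCH, DELETE"
```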

    Another possibility would be to use the meta member of JSON API data.

    {
      "data": {
        "type": "ticket",
        "id": "987654",
        "attributes": {
          "title": "Let's use JSON API in CubicWeb"
        },
        "meta": {
          "permissions": ["read", "update"]
        }
      }
    }
    

    Clearly, this would minimize the number of client/server requests.

    More Hypermedia controls

    With the example implementation described above, it appears already possible to manipulate several aspects of the entity-relationship database following a CubicWeb schema: resources fetching, CRUD operations on entities, set/delete operations on relationships. All these "standard" operations are discoverable by the client simply because they are baked into the JSON API format: for instance, adding a target to some relationship is possible by POSTing to the corresponding relationship resource something that conforms to the schema.

    So, implicitly, this already gives us a fairly good level of Hypermedia control so that we're not so far from having a mature REST architecture according to the Richardson Maturity Model. But beyond these "standard" discoverable actions, the JSON API specification does not yet address Hypermedia controls in a generic manner (see this interesting discussion about extending the specification for this purpose).

    So the question is: would we want more? Or, in other words, do we need to define "actions" which would not map directly to a concept in the application model?

    In the case of a CubicWeb application, the most obvious example (that I could think of) of where such an "action" would be needed is workflow state handling. Roughly, workflows in CubicWeb are modeled through two entity types State and TrInfo (for "transition information"), the former being handled through the latter, and a relationship in_state between the workflowable entity type at stake and its current State. It is not so clear how one would model this in terms of HTTP resources. (Arguably we wouldn't want to expose the complexity of the Workflow/TrInfo/State data model to the client, nor can we simply expose this in_state relationship, as a client would not be able to simply change the state of an entity by updating the relation.) So what would be a custom "action" to handle the state of a workflowable resource? Back in our tracker example, how would we advertise to the client the possibility to perform "open"/"close"/"reject" actions on a ticket resource? Open question...

    Request for comments

    In this post, I tried to give an overview of a possible usage of JSON API to build a Web API for CubicWeb. Several aspects were discussed from simple CRUD operations, to relationships handling or non-standard actions. In many cases, there are open questions for which I'd love to receive feedback from the community. Recalling that this topic is a central part of the experiment towards building a client-side user interface to CubicWeb, the more discussion it gets, the better!

    For those wanting to try the experiments themselves, have a look at the code. This is a work-in-progress/experimental implementation, relying on Pyramid for content negotiation and route traversals.

    What's next? Maybe an alternative experiment relying on Hydra? Or an orthogonal one playing with the schema client-side?


  • Happy New Year CubicWeb !

    2016/01/25 by Nicolas Chauvat

    This CubicWeb blog has been asleep for some months, even though development was active. Let me try to summarize the recent progress.

    https://upload.wikimedia.org/wikipedia/commons/thumb/f/f1/New_Year_Ornaments_%282%29.JPG/320px-New_Year_Ornaments_%282%29.JPG

    CubicWeb 3.21

    CubicWeb 3.21 was published in July 2015. The announcement was sent to the mailing list and changes were listed in the documentation.

    The main goal of this release was to reduce the technical debt. The code was improved, but the changes were not directly visible to users.

    CubicWeb 3.22

    CubicWeb 3.22 was published in January 2016. A mail was sent to the mailing list and the documentation was updated with the list of changes.

    The main achievements of this release were the inclusion of a new procedure to massively import data when using a Postgresql backend, improvements of migrations and customization of generic JSON exports.

    Roadmap and bi-monthly meetings

    After the last-minute cancellation of the May 2015 roadmap meeting, we failed to reschedule in June, the summer arrived, then the busy-busy end of the year... and voilà, we are in 2016.

    During that time, Logilab has been working on massive data import, full-js user interfaces exchanging JSON with the CubicWeb back-end, 3D in the browser, switching CubicWeb to Python3, moving its own apps to Bootstrap, using CubicWeb-Pyramid in production and improving management/supervision, etc. We will be more than happy to discuss this with the rest of the (small but strong) CubicWeb community.

    So let's wish a happy new year to everyone and meet again in March for a new roadmap session !


  • Towards building a JavaScript user interface to CubicWeb

    2016/01/08 by Denis Laxalde

    This post is an introduction to a series of articles dealing with an on-going experiment on building a JavaScript user interface to CubicWeb, to ultimately replace the web component of the framework. The idea of this series is to present the main topics of the experiment, with open questions in order to engage the community as much as possible. The other side of this is to experiment with a blog driven development process, so getting feedback is the very point of it!

    As of today, three main topics have been identified:

    • the Web API to let the client and server communicate,
    • the issue of representing the application schema client-side, and,
    • the construction of components of the web interface (client-side).

    As part of the first topic, we'll probably rely on another experimental work about REST-fulness undertaken recently in pyramid-cubicweb (see this head for source code). Then, it appears quite clearly that we'll need sooner or later a representation of data on the client-side and that, quite obviously, the underlying format would be JSON. Apart from exchanging entity (database) information, we already anticipate the need for the HATEOAS part of REST. We already took some time to look at the existing possibilities. At first glance, it seems that hydra is the most promising in terms of capabilities. It's also built using semantic web technologies, which definitely earns bonus points for CubicWeb. On the other hand, it seems a bit isolated and very experimental, while JSON API follows a more pragmatic approach (it describes itself as an anti-bikeshedding tool) and appears to have more traction from various people. For this reason, we chose it for our first draft, but this topic seems so central in a new UI, and so hard to hide as an implementation detail, that it definitely deserves more discussion. Other candidates could be Siren, HAL or Uber.

    Concerning the schema, it seems that there is consensus around JSON-Schema so we'll certainly give it a try.

    Finally, while nothing is certain as of today, we'll probably start building components of the web interface using React, which is also getting quite popular these days. Beyond that choice, the first practical task in this topic will concern the primary view system. This task, being neither too simple nor too complicated, will hopefully result in a clearer overview of what the project will imply. Then, the question of editing will come up at some point. In this respect, perhaps it'll be a good time to put the UX question at a central place, in order to avoid design issues that we had in the past.

    Feedback welcome!


  • Serving Cubicweb via WSGI with Pyramid: comparing the options

    2015/04/21 by David Douard

    CubicWeb can now be powered by Pyramid (thank you so much Christophe) instead of Twisted.

    I aim at moving all our applications to CubicWeb/Pyramid, so I wonder what will be the best way to deliver them. For now, we have a setup made of Apache + Varnish + Cubicweb/Twisted. In some applications we have two CubicWeb instances with naive load balancing managed by Varnish.

    When moving to cubicweb-pyramid, there are several options. By default, a cubicweb-pyramid instance started via the cubicweb-ctl pyramid command runs a waitress wsgi http server. I read it is common to deliver wsgi applications with nginx + uwsgi, but I wanted to play with mongrel2 (that I already tested with Cubicweb a while ago), and give a try to the circus + chaussette stack.

    I ran my tests :

    • using ab the simple Apache benchmark tool (aka ApacheBench) ;
    • on a clone of our logilab.org forge ;
    • on my laptop (Intel Core i7, 2.67GHz, quad core, 8 GB) ;
    • using a postgresql 9.1 database server.

    Setup

    In order to be able to start the application as a wsgi app, a small python script is required. I extracted a small part of the cubicweb-pyramid ccplugin.py file into an elo.py file for this:

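    # Import paths assumed from cubicweb and pyramid_cubicweb as of this
    # writing; adjust them to the versions you have installed.
    from cubicweb.cwconfig import CubicWebConfiguration as cwcfg
    from pyramid_cubicweb import wsgi_application_from_cwconfig

    appid = 'elo2'

    cwconfig = cwcfg.config_for(appid)
    application = wsgi_application_from_cwconfig(cwconfig)
    repo = cwconfig.repository()
    repo.start_looping_tasks()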
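
    What all the servers tested below expect from such a script is a module-level WSGI callable (here named application). As a reminder of that contract, here is a minimal sketch with a stand-in application and a helper that calls it the way a server would; both are illustrations and not part of cubicweb-pyramid.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Stand-in WSGI callable; the real one is built from the CubicWeb config."""
    body = b"hello from a stand-in app\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Call a WSGI app once, as a server would, and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fill in the mandatory WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_app(application)
```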
    

    I tested 5 configurations: twisted, pyramid, mongrel2+wsgid, uwsgi and circus+chaussette. When possible, they were tested with 1 worker and 4 workers.

    Legacy Twisted mode

    Using good old legacy twisted setup:

    cubicweb-ctl start -D -l info elo
    

    The config settings worth noting are:

    webserver-threadpool-size=6
    connections-pool-size=6
    

    Basic Pyramid mode

    Using the pyramid command that uses waitress:

    cubicweb-ctl pyramid --no-daemon -l info elo
    

    Mongrel2 + wsgid

    I have not been able to use uwsgi-mongrel2 as wsgi backend for mongrel2, since this uwsgi plugin is not provided by the uwsgi debian packages. I've used wsgid instead (sadly, the project appears to be dead).

    The mongrel config is:

    main = Server(
       uuid="f400bf85-4538-4f7a-8908-67e313d515c2",
       access_log="/logs/access.log",
       error_log="/logs/error.log",
       chroot="./",
       default_host="localhost",
       name="test",
       pid_file="/pid/mongrel2.pid",
       bind_addr="0.0.0.0",
       port=8083,
       hosts = [
           Host(name="localhost",
                routes={'/': Handler(send_spec='tcp://127.0.0.1:5000',
                                     send_ident='2113523d-f5ff-4571-b8da-8bddd3587475',
                                     recv_spec='tcp://127.0.0.1:5001',
                                     recv_ident='')
                       })
               ]
       )
    
    servers = [main]
    

    and the wsgid server is started with:

    wsgid --recv tcp://127.0.0.1:5000 --send tcp://127.0.0.1:5001 --keep-alive \
    --workers <N> --wsgi-app elo.application --app-path .
    

    uwsgi

    The config file used to start uwsgi is:

    [uwsgi]
    stats = 127.0.0.1:9191
    processes = <N>
    wsgi-file = elo.py
    http = :8085
    plugin = http,python
    virtualenv = /home/david/hg/grshells/venv/jpl
    enable-threads = true
    lazy-apps = true
    

    The tricky config option there is lazy-apps, which must be set; otherwise the worker processes are forked after loading the cubicweb application, which the latter does not support. If you omit this, only one worker will get the requests.

    circus + chaussette

    For the circus setup, I have used this configuration file:

    [circus]
    check_delay = 5
    endpoint = tcp://127.0.0.1:5555
    pubsub_endpoint = tcp://127.0.0.1:5556
    stats_endpoint = tcp://127.0.0.1:5557
    statsd = True
    httpd = True
    httpd_host = localhost
    httpd_port = 8086
    
    [watcher:webworker]
    cmd = /home/david/hg/grshells/venv/jpl/bin/chaussette --fd $(circus.sockets.webapp) elo2.app
    use_sockets = True
    numprocesses = 4
    
    [env:webworker]
    PATH=/home/david/hg/grshells/venv/jpl/bin:/usr/local/bin:/usr/bin:/bin
    CW_INSTANCES_DIR=/home/david/hg/grshells/grshell-jpl/etc
    PYTHONPATH=/home/david/hg/grshells//grshell-jpl
    
    [socket:webapp]
    host = 127.0.0.1
    port = 8085
    

    Results

    The benchmarks are very simple: 100 requests from 1 worker or 500 requests from 5 concurrent workers, getting the main index page of the application:

    One ab worker

    ab -n 100 -c 1 http://127.0.0.1:8085/
    

    We get:

    Synthesis (1 client)

    Response times are:

    Response time (1 client)

    Five ab workers

    ab -n 500 -c 5 http://127.0.0.1:8085/
    

    We get:

    Synthesis (5 clients)

    Response times are:

    Response time (5 clients)
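
    For readers without ab at hand, the two runs above can be approximated with a few lines of Python. This is only a rough sketch of the same idea (N requests spread over C concurrent clients), not what was used to produce the figures; the function names are assumptions.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def timed_get(url):
    """Fetch 'url' once and return the elapsed time in seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        response.read()
    return time.perf_counter() - start


def bench(url, requests, concurrency, fetch=timed_get):
    """Issue 'requests' GETs using 'concurrency' client threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(fetch, [url] * requests))


def summarize(latencies):
    """Return the mean and the maximum of a non-empty list of latencies."""
    return {"mean": sum(latencies) / len(latencies), "max": max(latencies)}
```

    Calling summarize(bench("http://127.0.0.1:8085/", 500, 5)) would then mimic the second ab command, although Python threads add some overhead of their own on the client side.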

    Conclusion

    As expected, the legacy (and still default) twisted-based server is the least efficient method to serve a cubicweb application.

    When comparing results with only one CubicWeb worker, the pyramid+waitress solution that comes with cubicweb-pyramid is the most efficient, but the mongrel2 + wsgid and circus + chaussette solutions have mostly similar performance when only one worker is activated. Surprisingly, the uwsgi solution is significantly less efficient, and in particular has some requests that take significantly longer than with the other solutions (even the legacy twisted-based server).

    The price for activating several workers is small (around 3%) but significant when only one client is requesting the application. It is still unclear why.

    When there are several ab workers requesting the application, it's not a surprise that solutions with 4 workers behave significantly better (we are still far from a linear response however, roughly 2x better for 4x the horsepower; maybe the hardware is the main reason for this unexpected non-linear response).
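
    To put a number on the scaling remark above, parallel efficiency can be computed as the speedup divided by the number of workers. The short sketch below uses only the rough figures quoted in the text (about 2x better for 4 workers); the function name is illustrative.

```python
def parallel_efficiency(speedup, workers):
    """Return the fraction of ideal linear scaling actually achieved."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    return speedup / float(workers)


# Rough figures from the text: about 2x better with 4 workers.
efficiency = parallel_efficiency(2.0, 4)
print("efficiency: {:.0%}".format(efficiency))  # prints "efficiency: 50%"
```

    In other words, each additional worker contributed about half of what perfect linear scaling would give.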

    I am quite surprised that uwsgi behaved significantly worse than the 2 other scalable solutions.

    Mongrel2 is still very efficient, but sadly the wsgid server I've used for these tests has not been developed for 2 years, and the uwsgi plugin for mongrel2 is not yet available on Debian.

    On the other hand, I am very pleasantly surprised by circus + chaussette. Circus also comes with some nice features, like a web dashboard which allows adding or removing workers dynamically:

    //www.cubicweb.org/file/5272071/raw //www.cubicweb.org/file/5272077/raw

  • CubicWeb Roadmap meeting on March 5th 2015

    2015/03/11 by David Douard

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. The previous roadmap meeting was in January 2015.

    Christophe de Vienne (Unlish) and Aurélien Campéas (self-employed) joined us.

    Christophe de Vienne asked for discussions on:

    • Security Context: settle on an approach, and make it happen.
    • Pyramid Cubicweb adoption: where are we? what authentication stack do we want by default?
    • Package layout (aka "develop mode" friendliness): let's get real
    • Documentation: is the restructuration attempt (https://www.cubicweb.org/ticket/4832808) a credible path for the documentation?

    Aurélien Campéas asked for discussions on:

    • status of integration in the 3.21 branch
    • a new API for cubicweb stores

    Sylvain Thénault asked for discussions on:

    • a new API for dataimport (including cubicweb stores, but not only),
    • new integrators on CW

    Versions

    Cubicweb

    Version 3.18

    This version is stable but old and maintained (current is 3.18.8).

    Version 3.19

    This version is stable and maintained (current is 3.19.9).

    Version 3.20

    This version is now stable and maintained (current is 3.20.4).

    Version 3.21

    See below

    Agenda

    Next roadmap meeting will be held at the beginning of May 2015 at Logilab. Interested parties are invited to get in touch.

    Open Discussions

    New integrators

    Rémi Cardona (rcardona) and Denis Laxalde (dlaxalde) now have the publish access level on Cubicweb repositories.

    Security context

    Christophe presented his proposal for a "security context" in Cubicweb, as described in https://lists.cubicweb.org/pipermail/cubicweb/2015-February/002278.html and https://lists.cubicweb.org/pipermail/cubicweb/2015-February/002297.html with a proposed implementation (see https://www.cubicweb.org/ticket/4919855).

    The idea has been validated, based on substitution variables whose names will start with "ctx:" (the RQL grammar will have to be modified to accept a ":").

    This will then make it possible to write RQL queries like (API still to be tuned):

    X owned_by U, U eid %(ctx:cwuser_eid)s
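
    Purely as an illustration of how "ctx:"-prefixed variables could sit next to ordinary query arguments, here is a minimal sketch. The helper name with_security_context and the idea of merging two dictionaries are assumptions, not the agreed API. Plain Python string formatting is used at the end only to show that a key containing ":" is usable in a %(...)s placeholder; RQL binds its arguments itself.

```python
def with_security_context(args, context):
    """Merge query arguments with security-context values.

    Hypothetical helper: context values are exposed under a "ctx:" prefix
    and are applied last, so regular arguments cannot override them.
    """
    merged = dict(args)
    for key, value in context.items():
        merged["ctx:" + key] = value
    return merged


rql = "Any X WHERE X owned_by U, U eid %(ctx:cwuser_eid)s"
kwargs = with_security_context({}, {"cwuser_eid": 42})
# Plain %-formatting, only to show that the placeholder name is valid.
print(rql % kwargs)
```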
    

    Pyramid

    The pyramid-based web server proposed by Christophe and used for his unlish website is still under test and evaluation at Logilab. There are missing features (implemented in cubes) required to be able to deploy pyramid-cubicweb for most of the applications used at Logilab, especially cubicweb-signedrequest.

    In order to make it possible to implement authentication cubes like cubicweb-signedrequest, pyramid-cubicweb requires some modifications. These have been developed and are about to be published, along with a new version of signedrequest that provides pyramid compatibility.

    There are still some dependencies that lack a proper Debian package, but that should be done in the next few weeks.

    In order to properly identify pyramid-related code in a cube, it has been proposed that this code should go in modules of the cube named pviews and pconfig (note that most cubes won't require any pyramid-specific code). The includeme function should however be in the cube's main package (in the __init__.py file).
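
    A minimal sketch of what this layout could look like from the cube's __init__.py, assuming the cube ships both a pviews and a pconfig module and that pconfig exposes its own includeme function (both are assumptions made for illustration). config.include and config.scan are the usual Pyramid Configurator methods; a small recording stand-in is used here so that the snippet runs without Pyramid installed.

```python
def includeme(config):
    """Pyramid entry point living in the cube's __init__.py."""
    # Pyramid-specific configuration of the cube (hypothetical module).
    config.include('.pconfig')
    # Register the views declared in the cube's pviews module.
    config.scan('.pviews')


class RecordingConfig(object):
    """Stand-in for pyramid.config.Configurator that only records calls."""

    def __init__(self):
        self.calls = []

    def include(self, name):
        self.calls.append(('include', name))

    def scan(self, name):
        self.calls.append(('scan', name))


config = RecordingConfig()
includeme(config)
print(config.calls)
```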

    There have been some discussions about the fact that, for now, a pyramid-cubicweb instance requires an anonymous user/access, which can also be a problem for some applications.

    Layout

    Christophe pointed out that the directory/files layout of cubicweb and cubes does not follow current Python de facto standards, which makes cubicweb hard to use in the context of a virtualenv/pip-based installation. CWEP004 discusses some aspects of this problem.

    The decision has been taken to move toward a Cubicweb ecosystem that is more pip-friendly. This will be done step by step, starting with the dependencies (packages currently living in the logilab "namespace").

    Then we will investigate the feasibility of migrating the layout of Cubicweb itself.

    Documentation

    The new documentation structure has been approved.

    It has been proposed (and more or less accepted) to extract the documentation in a dedicated project. This is not a priority, however.

    Roadmap for 3.21

    No change since last meeting:

    • the complete removal of the dbapi, the merging of Connection and ClientConnection: remains
    • Integrate the pyramid cube to provide the pyramid command if the pyramid framework can be imported: removed (too soon, pyramid-cubicweb's APIs are not stable enough)
    • Integration of CWEP-003 (FROM clause for RQL): removed (will probably never be included unless someone needs it)
    • CWEP-004 (cubes as standard python packages) is being discussed: removed (not for 3.21, see above)

    dataimports and stores

    A heavy refactoring is under way that concerns data import in CubicWeb. The main goal is to design a single API to be used by the various cubes that accelerate the insertion of data (dataio, massiveimport, fastimport, etc) as well as the internal CWSource and its data feeds.

    For details, see the thread on the mailing-list and the patches arriving in the review pipeline.


  • CubicWeb roadmap meeting on January 8th, 2015

    2015/01/05 by Nicolas Chauvat

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. The previous roadmap meeting was in November 2014.

    Here is the report about the January 8th, 2015 meeting.

    Christophe de Vienne (Unlish) and Aurélien Campéas (self-employed) joined us to express their concerns and discuss the future of CubicWeb.

    Versions

    Version 3.18

    This version is stable but old and maintained (current is 3.18.7).

    Version 3.19

    This version is stable and maintained (current is 3.19.8).

    Version 3.20

    This version has been released a few days ago. It has not been deployed on production systems yet.

    Its main features are:

    • virtual relations: a new ComputedRelation class can be used in schema.py; its rule attribute is an RQL snippet that defines the new relation.

    • computed attributes: an attribute can now be defined with a formula argument (also an RQL snippet); it will be read-only, and updated automatically.

      Both of these features are described in CWEP-002, and the updated "Data model" chapter of the CubicWeb book.

    • cubicweb-ctl plugins can use the cubicweb.utils.admincnx function to get a Connection object from an instance name.

    • new 'tornado' wsgi backend

    • session cookies have the HttpOnly flag, so they're no longer exposed to javascript

    • rich text fields can be formatted as markdown

    • the edit controller detects concurrent editions, and raises a ValidationError if an entity was modified between form generation and submission

    • cubicweb can use a postgresql "schema" (namespace) for its tables

    • cubicweb-ctl configure can be used to set values of the admin user credentials in the sources configuration file

    For details read list of tickets for CubicWeb 3.20.0.

    We would have loved to integrate the pyramid cube in this release, but the debian packaging effort needed by the pyramid stack is quite big and is only acceptable (at a decent price) if we target jessie only.

    Version 3.21

    For now, the roadmap for 3.21 is still the complete removal of the dbapi, the merging of Connection and ClientConnection.

    Integrate the pyramid cube to provide the pyramid command if the pyramid framework can be imported.

    Integration of CWEP-003 (FROM clause for RQL) and CWEP-004 (cubes as standard python packages) is being discussed.

    Version 4.0

    We expect to accelerate development of CubicWeb 4, whose exact roadmap is still to be discussed, but we may already want:

    • be pyramid-based (remove twisted, auth management, etc.),
    • do not have anything left of old dbapi and ClientConnection,
    • integrate squareui as main (and only) web-ui "template" or remove web generation (almost) completely from cubicweb-core and provide it only through the cube system.

    Agenda

    Next roadmap meeting will be held at the beginning of March 2015 at Logilab. Interested parties are invited to get in touch.

    Open Discussions

    Refactoring the documentation

    Christophe de Vienne suggested completely revamping the documentation and intends to lead this effort.

    Training material

    Aurélien Campéas asks if Logilab would be willing to share its training material under a free license to help interested parties organize and sell trainings.

    Towards making squareui the default rendering engine for cubicweb

    We are expecting to be able to use squareui/bootstrap as "rendering engine" for our forge applications (like http://www.cubicweb.org and http://www.logilab.org) as soon as possible. However, to achieve this goal, there are still too many "visual bugs", some of which may require a discussion.

    Among others:

    • put the ctxtoolbar component in the <nav> div
    • each box component should have an icon (what API for this?)
    • we cannot easily make the left column of the main template responsive-aware (requires to change the html flow), so it's probably best to take inspiration from things like http://wrapbootstrap.com/preview/WB0N89JMK
    • facet boxes are a mess, there is no simple solution to have a "smart layout"

    Migration

    • AppObjects should not be loaded by default
    • Have a look at Alembic the migration tool for SQLAlchemy and take inspiration from there.

  • CubicWeb roadmap meeting on November 6th, 2014

    2014/11/03 by Nicolas Chauvat

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. The previous roadmap meeting was in September 2014.

    Here is the report about the November 6th, 2014 meeting. Christophe de Vienne (Unlish) joined us to express their concerns and discuss the future of CubicWeb. Dimitri Papadopoulos (CEA) could not come.

    Versions

    Version 3.17

    This version is stable but old and maintenance will continue only as long as some customers are willing to pay for it (current is 3.17.17).

    If you're still using 3.17, you should go directly to 3.19.

    Version 3.18

    This version is stable but old and maintained (current is 3.18.6).

    Version 3.19

    This version is stable and maintained (current is 3.19.5).

    Version 3.20

    This version is still under development but should be released very soon now (expected next week). Its main feature is the inclusion of CWEP-002 (computed attributes and relations), along with many small improvement patches.

    For details read list of tickets for CubicWeb 3.20.0.

    We would have loved to integrate the pyramid cube in this release, but the debian packaging effort needed by the pyramid stack is quite big and is only acceptable (at a decent price) if we target jessie only.

    Version 3.21

    For now, the roadmap for 3.21 is still the complete removal of the dbapi, the merging of Connection and ClientConnection, and possibly including CWEP-003 (adding a FROM clause to RQL).

    Integrate the pyramid cube to provide the pyramid command if the pyramid framework can be imported.

    Integration of CWEP-004 is being discussed.

    Version 4.0

    We expect to accelerate development of CubicWeb 4, whose exact roadmap is still to be discussed, but we may already want:

    • be pyramid-based (remove twisted, auth management, etc.),
    • do not have anything left of old dbapi and ClientConnection,
    • integrate squareui as main (and only) web-ui "template" or remove web generation (almost) completely from cubicweb-core and provide it only through the cube system.

    CWEPs

    Here is the status of open CubicWeb Evolution Proposals:

    to be written

    Work in progress

    Some work is in progress around CKAN, DCAT and other Open Data and Semantic Web related technologies.

    Agenda

    Next roadmap meeting will be held at the beginning of January 2015 at Logilab, and Christophe and Dimitri (or Yann) are invited.

    Open Discussions

    Migration:

    • AppObjects should not be loaded by default
    • Have a look at Alembic the migration tool for SQLAlchemy and take inspiration from there

  • Exploring the datafeed API in CubicWeb

    2014/09/26 by Denis Laxalde

    The datafeed API is one of the nice features of the CubicWeb framework. It makes it possible to easily build such things as a news aggregator (or even a semantic news feed reader), a LDAP importer or an application importing data from another web platform. The underlying API is quite flexible and powerful. Yet, the documentation being quite thin, it may be hard to find one's way through. In this article, we'll describe the basics of the datafeed API and provide guiding examples.

    The datafeed API is essentially built around two things: a CWSource entity and a parser, which is a kind of AppObject.

    The CWSource entity defines a list of URLs from which to fetch data to be imported in the current CubicWeb instance. It is linked to a parser through the parser's __regid__. So something like the following should be enough to create a usable datafeed source [1].

    create_entity('CWSource', name=u'some name', type=u'datafeed', parser=u'myparser')
    

    The parser is usually a subclass of DataFeedParser (from cubicweb.server.sources.datafeed). It should at least implement the two methods process and before_entity_copy. To make it easier, there are specialized parsers such as DataFeedXMLParser that already define process so that subclasses only have to implement the process_item method.

    Overview of the datafeed API

    Before going into further details about the actual implementation of a DataFeedParser, it's worth having in mind a few details about the datafeed parsing and import process. This involves various players from the CubicWeb server, namely: a DataFeedSource (from cubicweb.server.sources.datafeed), the Repository and the DataFeedParser.

    • Everything starts from the Repository which loops over its sources and pulls data from each of these (this is done using a looping task which is set up upon repository startup). In the case of datafeed sources, Repository sources are instances of the aforementioned DataFeedSource class [2].
    • The DataFeedSource selects the appropriate parser from the registry and loops on each uri defined in the respective CWSource entity by calling the parser's process method with that uri as argument (methods pull_data and process_urls of DataFeedSource).
    • If the result of the parsing step is successful, the DataFeedSource will call the parser's handle_deletion method, with the URI of the previously imported entities.
    • Then, the import log is formatted and the transaction committed. The DataFeedSource and DataFeedParser are connected to an import_log which feeds the CubicWeb instance with a CWDataImport per data pull. This usually contains the number of created and updated entities along with any error/warning message logged by the parser. All this is visible in a table from the CWSource primary view.
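
    The steps above can be sketched with a toy model. This is not the CubicWeb implementation: the class and function below are stand-ins written for illustration, the return-value contract of process is an assumption, and the import log and transaction commit are left out.

```python
class ToyFeedParser(object):
    """Stand-in for a datafeed parser; not the CubicWeb class."""

    def __init__(self):
        self.processed = []
        self.deletion_handled = False

    def process(self, url):
        # Toy contract: return True when the URL was parsed without error.
        self.processed.append(url)
        return not url.startswith('bad:')

    def handle_deletion(self, known_uris):
        self.deletion_handled = True


def pull_data(parser, urls, known_uris):
    """Toy pull: process each URL, then handle deletions on success."""
    success = True
    for url in urls:
        if not parser.process(url):
            success = False
    if success:
        # Only look for vanished entities when every URL parsed cleanly.
        parser.handle_deletion(known_uris)
    return success


parser = ToyFeedParser()
ok = pull_data(parser, ['http://example.org/feed'], set())
print(ok, parser.deletion_handled)
```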

    So now, you might wonder what actually happens during the parser's process method call. This method takes a URL from which to fetch data and processes further each piece of data (using a process_item method for instance). For each data-item:

    1. the repository is queried to retrieve or create an entity in the system source: this is done using the extid2entity method;
    2. this extid2entity method essentially needs two pieces of information:
      • a so-called extid, which uniquely identifies an item in the distant source
      • any other information needed to create or update the corresponding entity in the system source (this will be later referred to as the sourceparams)
    3. then, given the (new or existing) entity returned by extid2entity, the parser can perform further postprocessing (for instance, updating any relation on this entity).

    In step 1 above, the parser method extid2entity in turn calls the repository method extid2eid given the current source and the extid value. If an entry in the entities table matches the specified extid, the corresponding eid (identifier in the system source) is returned. Otherwise, a new eid is created. It's worth noting that the created entity (in case the entity is to be created) is not complete with respect to the data model at this point. In order for the entity to be completed, the source method before_entity_insertion is called. This is where the aforementioned sourceparams are used. More specifically, on the parser side the before_entity_copy method is called: it usually just updates (using entity.cw_set() for instance) the fetched entity with any relevant information.
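
    The lookup-or-create logic described here can be modelled in a few lines of plain Python. The sketch below is a deliberately simplified stand-in (dictionaries instead of entities, no source, no database); only the method names extid2eid, extid2entity and before_entity_copy are borrowed from the text, everything else is hypothetical.

```python
import itertools


class ToyRepository(object):
    """Keeps the extid -> eid mapping, in the role of the entities table."""

    def __init__(self):
        self._eids = {}
        self._next_eid = itertools.count(1)

    def extid2eid(self, extid):
        """Return (eid, created) for extid, allocating an eid if needed."""
        if extid in self._eids:
            return self._eids[extid], False
        eid = self._eids[extid] = next(self._next_eid)
        return eid, True


class ToyEntityParser(object):
    def __init__(self, repo):
        self.repo = repo
        self.entities = {}

    def before_entity_copy(self, entity, sourceparams):
        # Complete the freshly created entity from the source data.
        entity.update(sourceparams)

    def extid2entity(self, extid, etype, **sourceparams):
        eid, created = self.repo.extid2eid(extid)
        if created:
            entity = {'eid': eid, 'etype': etype}
            self.before_entity_copy(entity, sourceparams)
            self.entities[eid] = entity
        return self.entities[eid]


parser = ToyEntityParser(ToyRepository())
first = parser.extid2entity('http://example.org/a', 'FeedArticle', title='A')
again = parser.extid2entity('http://example.org/a', 'FeedArticle', title='B')
print(first is again, first['title'])
```

    Calling extid2entity twice with the same extid yields the same entity: the second call finds the existing eid and does not create anything new.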

    Case study: a news feeds parser

    Now we'll go through a concrete example to illustrate all those fairly abstract concepts and implement a datafeed parser which can be used to import news feeds. Our parser will create entities of type FeedArticle, whose minimal data model would be:

    class FeedArticle(EntityType):
        title = String(fulltextindexed=True)
        uri = String(unique=True)
        author = String(fulltextindexed=True)
        content = RichString(fulltextindexed=True, default_format='text/html')
    

    Here we'll reuse the DataFeedXMLParser, not because we have XML data to parse, but because its interface fits well with our purpose, namely: it ships an item-based processing (a process_item method) and it relies on a parse method to fetch raw data. The underlying parsing of the news feed resources will be handled by feedparser.

    class FeedParser(DataFeedXMLParser):
        __regid__ = 'newsaggregator.feed-parser'
    

    The parse method is called by process; it should return a list of tuples with item information.

    def parse(self, url):
        """Delegate to feedparser to retrieve feed items"""
        data = feedparser.parse(url)
        return zip(data.entries)
    

    Then the process_item method takes an individual item (i.e. an entry of the result obtained from feedparser in our case). It essentially defines an extid, here the uri of the feed entry (a good candidate for uniqueness) and calls extid2entity with that extid, the entity type to be created / retrieved and any additional data useful for entity completion passed as keyword arguments. (The process_feed method call just transforms the results obtained from feedparser into a dict suitable for entity creation following the data model described above.)

    def process_item(self, entry):
        data = self.process_feed(entry)
        extid = data['uri']
        entity = self.extid2entity(extid, 'FeedArticle', feeddata=data)
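
    The process_feed helper is not shown in this post. As a rough sketch of what such a transformation might look like, here is a hypothetical standalone version (in the parser it would be a method). The mapping of feedparser's title, link, author and summary keys onto the FeedArticle attributes is an assumption made for illustration.

```python
def process_feed(entry):
    """Map a feedparser entry to FeedArticle attributes (hypothetical)."""
    return {
        'title': entry.get('title'),
        'uri': entry.get('link'),
        'author': entry.get('author'),
        'content': entry.get('summary'),
    }


# A plain dict stands in for a feedparser entry here.
entry = {'title': u'Hello', 'link': u'http://example.org/hello',
         'author': u'Jane', 'summary': u'<p>Hi</p>'}
print(process_feed(entry)['uri'])
```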
    

    The before_entity_copy method is called before the entity is actually created (or updated) in order to give the parser a chance to complete it with any other attribute that could be set from source data (namely feedparser data in our case).

    def before_entity_copy(self, entity, sourceparams):
        feeddata = sourceparams['feeddata']
        entity.cw_edited.update(feeddata)
    

    And this is essentially all that's needed for a simple parser. Further details can be found in the news aggregator cube. More sophisticated parsers may use other concepts not described here, such as source mappings.

    Testing datafeed parsers

    Testing a datafeed parser often involves pulling data from the corresponding datafeed source. Here is a minimal test snippet that illustrates how to retrieve the datafeed source from a CWSource entity and to pull data from it.

    with self.admin_access.repo_cnx() as cnx:
        # Assuming one knows the URI of a CWSource.
        rset = cnx.execute('CWSource X WHERE X uri %(uri)s', {'uri': uri})
        # Retrieve the datafeed source instance.
        dfsource = self.repo.sources_by_eid[rset[0][0]]
        # Make sure its parser matches the expected one.
        self.assertEqual(dfsource.parser_id, '<my-parser-id>')
        # Pull data using an internal connection.
        with self.repo.internal_cnx() as icnx:
            stats = dfsource.pull_data(icnx, force=True, raise_on_error=True)
            icnx.commit()
    

    The resulting stats is a dictionary containing eids of created and updated entities during the pull. In addition, all entities created should have the cw_source relation set to the corresponding CWSource entity.

    Notes

    [1]

    It is possible to add some configuration to the CWSource entity in the form of a string of configuration items (one per line). Noteworthy items are:

    • the synchronization-interval;
    • use-cwuri-as-url=no, which avoids using external URL inside the CubicWeb instance (leading to any link on an imported entity to point to the external source URI);
    • delete-entities=[yes,no] which controls if entities not found anymore in the distant source should be deleted from the CubicWeb instance.
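
    As a small illustration of the "one item per line" format, the sketch below builds such a string from a dictionary. The helper name build_source_config and the option values are only examples; the resulting text is what would be stored as the configuration of the CWSource entity.

```python
def build_source_config(options):
    """Serialize options as one "key=value" item per line."""
    return u'\n'.join(u'%s=%s' % (key, value)
                      for key, value in sorted(options.items()))


config = build_source_config({
    'synchronization-interval': '1h',
    'use-cwuri-as-url': 'no',
    'delete-entities': 'yes',
})
print(config)
```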

    [2]

    The mapping between CWSource entities' type (e.g. "datafeed") and DataFeedSource object is quite unusual as it does not rely on the vreg but uses a specific sources registry (defined in cubicweb.server.SOURCE_TYPES).

  • CubicWeb roadmap meeting on September 4th, 2014

    2014/09/01 by Nicolas Chauvat

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. The previous roadmap meeting was in July 2014.

    Here is the report about the September 4th, 2014 meeting. Christophe de Vienne (Unlish) and Dimitri Papadopoulos (CEA) joined us to express their concerns and discuss the future of CubicWeb.

    Versions

    Version 3.17

    This version is stable but old and maintenance will continue only as long as some customers are willing to pay for it (current is 3.17.16 with 3.17.17 in development).

    Version 3.18

    This version is stable and maintained (current is 3.18.5 with 3.18.6 in development).

    Version 3.19

    This version is stable and maintained (current is 3.19.3 with 3.19.4 in development).

    Version 3.20

    This version is under development. It will try to reduce as much as possible the stock of patches in the state "reviewed", "awaiting review" and "in progress". If you have had something in the works that has not been accepted yet, please ready it for 3.20 and get it merged.

    It should still include the work done for CWEP-002 (computed attributes and relations).

    For details read list of tickets for CubicWeb 3.20.0.

    Version 3.21

    Removal of the dbapi, merging of Connection and ClientConnection, CWEP-003 (adding a FROM clause to RQL).

    Version 4.0

    Once the work done for Pyramid has been tested, it will become the default runner and a lot of things will be dropped: twisted, dead code, ui and core code that would be better cast into cubes, etc.

    This version could happen early in 2015.

    Cubes

    New cubes and libraries

    CWEPs

    Here is the status of open CubicWeb Evolution Proposals:

    CWEP-0002 full-featured implementation, to be merged in 3.20

    CWEP-0003 patches sent to review. Champion will be adim.

    Work in progress

    PyConFR

    Christophe will try to present at PyConFR the work he did on getting CubicWeb to work with Pyramid.

    Pip-friendly source layout

    Logilab and Christophe will try to make CubicWeb more pip/virtualenv-friendly. This may involve changing the source layout to include a sub-directory, but the impact on existing devs is expected to be too much, so this could be delayed to CubicWeb 4.0.

    Pyramid

    Christophe has made good progress on getting CubicWeb to work with Pyramid and he intends to put it into production real soon now. There is a Pyramid extension named pyramid_cubicweb and a CubicWeb cube named cubicweb-pyramid. Both work with CubicWeb 3.19. Christophe demonstrated using the debug toolbar, authenticating users with Authomatic and starting multiple workers with uWSGI.

    Early adopters are now invited to jump in and help harden the code!

    Agenda

    Logilab's next roadmap meeting will be held at the beginning of November 2014 and Christophe and Dimitri are invited.


  • Handling dependencies between form fields in CubicWeb

    2014/07/11 by Denis Laxalde

    This post considers the issue of building an edition form of a CubicWeb entity with dependencies on its fields. It's a quite common issue that needs to be handled client-side, based on user interaction.

    Consider the following example schema:

    from yams.buildobjs import EntityType, RelationDefinition, String, SubjectRelation
    from cubicweb.schema import RQLConstraint
    
    _ = unicode
    
    class Country(EntityType):
        name = String(required=True)
    
    class City(EntityType):
        name = String(required=True)
    
    class in_country(RelationDefinition):
        subject = 'City'
        object = 'Country'
        cardinality = '1*'
    
    class Citizen(EntityType):
        name = String(required=True)
        country = SubjectRelation('Country', cardinality='1*',
                                  description=_('country the citizen lives in'))
        city = SubjectRelation('City', cardinality='1*',
                               constraints=[
                                   RQLConstraint('S country C, O in_country C')],
                               description=_('city the citizen lives in'))
    

    The main entity of interest is Citizen which has two relation definitions towards Country and City. Then, a City is bound to a Country through the in_country relation definition.

    In the automatic edition form of Citizen entities, we would like to restrict the choices of cities depending on the selected Country, to be determined from the value of the country field. (In other words, we'd like the constraint on city relation defined above to be fulfilled during form rendering, not just validation.) Typically, in the image below, cities not in Italy should not be available in the city select widget:

    Example of Citizen entity edition form.

    The issue will be solved by a little customization of the automatic entity form, some uicfg rules and a bit of Javascript. In the following, the country field will be referred to as the master field and the city field as the dependent field.

    So here is the code of the views.py module:

    from cubicweb.predicates import is_instance
    from cubicweb.web.views import autoform, uicfg
    from cubicweb.uilib import js
    
    _ = unicode
    
    
    class CitizenAutoForm(autoform.AutomaticEntityForm):
        """Citizen autoform handling dependencies between Country/City form fields
        """
        __select__ = is_instance('Citizen')
    
        needs_js = autoform.AutomaticEntityForm.needs_js + ('cubes.demo.js', )
    
        def render(self, *args, **kwargs):
            master_domid = self.field_by_name('country', 'subject').dom_id(self)
            dependent_domid = self.field_by_name('city', 'subject').dom_id(self)
            self._cw.add_onload(js.cw.cubes.demo.initDependentFormField(
                master_domid, dependent_domid))
            super(CitizenAutoForm, self).render(*args, **kwargs)
    
    
    def city_choice(form, field):
        """Vocabulary function grouping city choices by country."""
        req = form._cw
        vocab = [(req._('<unspecified>'), '')]
        for eid, name in req.execute('Any X,N WHERE X is Country, X name N'):
            rset = req.execute('Any N,E ORDERBY N WHERE'
                               ' X name N, X eid E, X in_country C, C eid %(c)s',
                               {'c': eid})
            if rset:
                # 'optgroup' tag.
                oattrs = {'id': 'country_%s' % eid}
                vocab.append((name, None, oattrs))
                for label, value in rset.rows:
                    # 'option' tag.
                    vocab.append((label, str(value)))
        return vocab
    
    
    uicfg.autoform_field_kwargs.tag_subject_of(('Citizen', 'city', '*'),
                                               {'choices': city_choice, 'sort': False})
    

    The first thing (reading from the bottom of the file) is that we've added a choices function on city relation of the Citizen automatic entity form via uicfg. This function city_choice essentially generates the HTML content of the field value by grouping available cities by respective country through the addition of some optgroup tags.
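
    To make the shape of the returned vocabulary easier to see, here is a standalone sketch of the same grouping logic working on plain Python data instead of RQL result sets. The function name group_city_choices and its arguments are illustrative only; the tuple layout mirrors what city_choice builds above.

```python
def group_city_choices(countries, cities_by_country):
    """Build a select vocabulary with one optgroup per country.

    countries: list of (eid, name) pairs.
    cities_by_country: dict mapping a country eid to (name, eid) pairs.
    """
    vocab = [('<unspecified>', '')]
    for country_eid, country_name in countries:
        cities = cities_by_country.get(country_eid)
        if not cities:
            continue  # no optgroup for a country without cities
        # A 3-tuple with a None value marks the start of an optgroup.
        vocab.append((country_name, None, {'id': 'country_%s' % country_eid}))
        for city_name, city_eid in sorted(cities):
            vocab.append((city_name, str(city_eid)))
    return vocab


vocab = group_city_choices(
    [(1, 'France'), (2, 'Italy'), (3, 'Spain')],
    {1: [('Paris', 10)], 2: [('Rome', 20), ('Milan', 21)]})
for item in vocab:
    print(item)
```

    Each optgroup carries a country_<eid> id, which is what the Javascript side relies on to show or hide a group.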

    Then, we've overridden the automatic entity form for Citizen entity type by essentially calling a piece of Javascript code fed with the DOM ids of the master and dependent fields. Fields are retrieved by their name (field_by_name method) and respective id using the dom_id method.

    Now the Javascript part of the picture:

    cw.cubes.demo = {
        // Initialize the dependent form field select and bind update event on
        // change on the master select.
        initDependentFormField: function(masterSelectId,
                                         dependentSelectId) {
            var masterSelect = cw.jqNode(masterSelectId);
            cw.cubes.demo.updateDependentFormField(masterSelect, dependentSelectId);
            masterSelect.change(function(){
                cw.cubes.demo.updateDependentFormField(this, dependentSelectId);
            });
        },
    
        // Update the dependent form field select.
        updateDependentFormField: function(masterSelect,
                                           dependentSelectId) {
            // Clear previously selected value.
            var dependentSelect = cw.jqNode(dependentSelectId);
            $(dependentSelect).val('');
            // Hide all optgroups.
            $(dependentSelect).find('optgroup').hide();
            // But the one corresponding to the master select.
            $('#country_' + $(masterSelect).val()).show();
        }
    }
    

    It consists of two functions. initDependentFormField is called during form rendering and essentially binds the second function, updateDependentFormField, to the change event of the master select field. The latter "update" function retrieves the dependent select field, hides all optgroup nodes (i.e. the whole content of the select widget) and then only shows the dependent options that match the selected master option, identified by a custom country_<eid> id set by the vocabulary function above.


  • CubicWeb roadmap meeting on July 3rd, 2014

    2014/06/26 by Nicolas Chauvat

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. The previous roadmap meeting was in May 2014.

    Here is the report about the July 3rd, 2014 meeting. Christophe de Vienne (Unlish) and Dimitri Papadopoulos (CEA) joined us to express their concerns and discuss the future of CubicWeb.

    Versions

    Version 3.17

    This version is stable but old and maintenance will continue only as long as some customers are willing to pay for it (current is 3.17.15 with 3.17.16 in development).

    Version 3.18

    This version is stable and maintained (current is 3.18.5 with 3.18.6 in development).

    Version 3.19

    This version was published at the end of April and has now been tested on our internal servers. It includes support for Cross Origin Resource Sharing (CORS) and a heavy refactoring that modifies sessions and sources to lay the path for CubicWeb 4.

    For details read the release notes or the list of tickets for CubicWeb 3.19.0. Current is 3.19.2.

    Version 3.20

    This version is under development. It will try to reduce as much as possible the stock of patches in the state "reviewed", "awaiting review" and "in progress". If you have had something in the works that has not been accepted yet, please ready it for 3.20 and get it merged.

    It should still include the work done for CWEP-002 (computed attributes and relations).

    For details read the list of tickets for CubicWeb 3.20.0.

    Version 3.21 (or maybe 4.0?)

    Removal of the dbapi, merging of Connection and ClientConnection, CWEP-003 (adding a FROM clause to RQL).

    Cubes

    Cubes published over the past two months

    New cubes

    • cubicweb-frbr: Cube providing a schema based on FRBR entities
    • cubicweb-clinipath
    • cubicweb-fastimport

    CWEPs

    Here is the status of open CubicWeb Evolution Proposals:

    CWEP-0002 is only missing a bit of migration support, which should be finished soon for inclusion in 3.20.

    CWEP-0003 has been reviewed and is waiting for a bit of reshaping that should occur soon. It is targeted for 3.21.

    New CWEPs are expected to be written for clarifying the API of the _cw object, supporting persistent sessions and improving the performance of massive imports.

    Work in progress

    Design

    The new logo is now published in the 3.19 line. David showed us his experimentation that modernizes a forge's UI with a bit of CSS. There is still a bit of pressure on the bootstrap side though, as it still relies on heavy monkey-patching in the cubicweb-bootstrap cube.

    Data import

    Also, Dimitri expressed his concerns about the lack of a proper data import API. We should soon have some feedback from Aurelien's cubicweb-fastimport experimentation, which may be an answer to Dimitri's need. We somewhat agreed that there are different needs (e.g. massive-no-consistency import vs not-so-big-but-still-safe), that cubicweb.dataimport was an attempt to answer them all, and that cubicweb-dataio and cubicweb-fastimport are more specific responses. In the end we may reasonably hope that an API will emerge.

    Removals

    On his way to persistent sessions, Aurélien made huge progress toward silencing warnings in the 3.19 tests. dbapi has been removed and ClientConnection / Connection have been merged. We decided to take some time to think about recurring task management, as it is related to other tricky topics (application / instance configuration) and is not directly related to persistent sessions.

    Rebasing on Pyramid

    Last but not least, Christophe demonstrated that CubicWeb could basically live with Pyramid. This experimentation will be pursued as it sounds very promising to get the good parts from the two frameworks.

    Agenda

    Logilab's next roadmap meeting will be held at the beginning of September 2014, and Christophe and Dimitri were invited.


  • Logilab's roadmap for CubicWeb on May 15th, 2014

    2014/05/21 by Nicolas Chauvat

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. Here is the report about the May 15th, 2014 meeting. The previous report posted to the blog was the March 2014 roadmap.

    Versions

    Version 3.17

    This version is stable but old and maintenance will continue only as long as some customers are willing to pay for it (current is 3.17.15).

    Version 3.18

    This version is stable and maintained (current is 3.18.4).

    Version 3.19

    This version was published at the end of April. It includes support for Cross Origin Resource Sharing (CORS) and a heavy refactoring that modifies sessions and sources to lay the path for CubicWeb 4.

    For details read the release notes or the list of tickets for CubicWeb 3.19.0.

    Version 3.20

    This version is under development. It will try to reduce as much as possible the stock of patches in the state "reviewed", "awaiting review" and "in progress". If you have had something in the works that has not been accepted yet, please ready it for 3.20 and get it merged.

    It should also include the work done for CWEP-002 (computed attributes and relations) and the merging of Connection and ClientConnection if it happens to be simple enough to get done quickly (in case the removal of dbapi would really help, this merging will wait for 3.21).

    For details read the list of tickets for CubicWeb 3.20.0.

    Version 3.21 (or maybe 4.0?)

    Removal of the dbapi and merging of CWEP-003 (adding a FROM clause to RQL).

    Cubes

    Here is a list of cubes that had versions published over the past two months: accidents, awstats, book, bootstrap, brainomics, cmt, collaboration, condor, container, dataio, expense, faq, file, forge, forum, genomics, geocoding, inlineedit, inventory, keyword, link, mailinglist, mediaplayer, medicalexp, nazcaui, ner, neuroimaging, newsaggregator, processing, questionnaire, rqlcontroller, semnews, signedrequest, squareui, task, testcard, timesheet, tracker, treeview, vcsfile, workorder.

    Here are the new cubes we are pleased to announce:

    rqlcontroller receives via a POST a list of RQL queries and executes them. This is a way to build web services.

    wsme is helping build a web service API on top of a CubicWeb database.

    signedrequest is a simple token based authentication system. This is a way for scripts or callback urls to access an instance without login/pwd information.

    relationwidget is a widget usable in forms to edit relationships between objects. It depends on CubicWeb 3.19.

    searchui is an experiment on adding blocks to the list of facets that allow building complex RQL queries step by step by clicking with the mouse instead of directly writing the RQL with the keyboard.

    ckan is using the REST API of a CKAN data portal to mirror its content.

    CWEPs

    Here is the status of open CubicWeb Evolution Proposals:

    CWEP-0002 is now in good shape and the goal is to have it merged into 3.20. It lacks some documentation and a migration script.

    CWEP-0003 has made good progress during the latest sprint, but will need a thorough review before being merged. It will probably not be ready for 3.20 and have to wait for 3.21.

    New CWEPs are expected to be written for clarifying the API of the _cw object, supporting persistent sessions and improving the performance of massive imports.

    Visual identity

    CubicWeb has a new logo that will appear before the end of May on its revamped homepage at http://www.cubicweb.org.

    Last but not least

    As already said on the mailing list, other developers and contributors are more than welcome to share their own goals in order to define a roadmap that best fits everyone's needs.

    Logilab's next roadmap meeting will be held at the beginning of July 2014.


  • What's new in CubicWeb 3.19

    2014/05/05 by Aurelien Campeas

    New functionalities

    • implement Cross Origin Resource Sharing (CORS) (see #2491768)
    • system_source.create_eid can return a range of IDs, to reduce overhead of batch entity creation
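
    To illustrate why returning a range of IDs reduces overhead, here is a minimal, self-contained sketch. It does not use CubicWeb: the EidAllocator class, its count parameter and the round_trips counter are hypothetical stand-ins for a sequence-backed identifier source.

```python
class EidAllocator:
    """Hypothetical stand-in for a sequence-backed identifier source."""

    def __init__(self):
        self._next = 1
        self.round_trips = 0  # how many times the "database" was asked

    def create_eid(self, count=1):
        """Reserve `count` consecutive ids and return them as a range."""
        self.round_trips += 1
        start = self._next
        self._next += count
        return range(start, start + count)


allocator = EidAllocator()

# One id per call: 1000 round trips for 1000 entities.
one_by_one = [allocator.create_eid()[0] for _ in range(1000)]
trips_one_by_one = allocator.round_trips

# One range per batch: a single round trip for the next 1000 entities.
allocator.round_trips = 0
batch = list(allocator.create_eid(1000))
trips_batched = allocator.round_trips
```

    In this sketch the batched variant asks for identifiers once instead of once per entity, which is the kind of saving the new behaviour is meant to allow.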

    Behaviour Changes

    • The anonymous property of Session and Connection is now computed from the related user login. If it matches the anonymous-user in the config the connection is anonymous. Beware that the anonymous-user config is web specific. Therefore, no session may be anonymous in a repository only setup.
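
    The rule above can be sketched as follows. This is an illustration only, not CubicWeb's code: the Config and Connection classes below are hypothetical, and anonymous_user stands for the anonymous-user option of the web configuration.

```python
class Config:
    """Hypothetical configuration holder; only a web config sets anonymous_user."""

    def __init__(self, anonymous_user=None):
        self.anonymous_user = anonymous_user


class Connection:
    """Hypothetical connection that only knows its user login and its config."""

    def __init__(self, login, config):
        self.login = login
        self.config = config

    @property
    def anonymous(self):
        # Anonymous only if an anonymous user is configured and the login matches it.
        anon = self.config.anonymous_user
        return anon is not None and self.login == anon


web_config = Config(anonymous_user="anon")
repo_only_config = Config()  # no anonymous-user option in a repository-only setup
```

    With these definitions, a connection for "anon" is anonymous under web_config but not under repo_only_config, which mirrors the caveat about repository-only setups.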

    New Repository Access API

    Connection replaces Session

    A new explicit Connection object replaces Session as the main repository entry point. A Connection holds all the necessary methods to be used server-side (execute, commit, rollback, call_service, entity_from_eid, etc...). One obtains a new Connection object using session.new_cnx(). Connection objects need to have an explicit begin and end. Use them as a context manager to never miss an end:

    with session.new_cnx() as cnx:
        cnx.execute('INSERT Elephant E, E name "Babar"')
        cnx.commit()
        cnx.execute('INSERT Elephant E, E name "Celeste"')
        cnx.commit()
    # Once you get out of the "with" clause, the connection is closed.
    

    Using the same Connection object in multiple threads will give you access to the same Transaction. However, Connection objects are not thread safe, so do this at your own risk.

    repository.internal_session is deprecated in favor of repository.internal_cnx. Note that internal connections are now safe by default, i.e. the integrity hooks are enabled.

    Backward compatibility is preserved on Session.

    dbapi vs repoapi

    A new API has been introduced to replace the dbapi. It is called repoapi.

    There are three relevant functions for now:

    • repoapi.get_repository returns a Repository object either from a URI when used as repoapi.get_repository(uri) or from a config when used as repoapi.get_repository(config=config).
    • repoapi.connect(repo, login, **credentials) returns a ClientConnection associated with the user identified by the credentials. The ClientConnection is associated with its own Session that is closed when the ClientConnection is closed. A ClientConnection is a Connection-like object to be used client side.
    • repoapi.anonymous_cnx(repo) returns a ClientConnection associated with the anonymous user if described in the config.

    repoapi.ClientConnection replaces dbapi.Connection and company

    On the client/web side, the Request is now using a repoapi.ClientConnection instead of a dbapi.Connection. The ClientConnection has multiple backward compatible methods to make it look like a dbapi.Cursor and dbapi.Connection.

    Sessions used on the Web side are now the same as the ones used Server side. Some backward compatibility methods have been installed on the server side Session to ease the transition.

    The authentication stack has been altered to use the repoapi instead of the dbapi. Cubes adding new elements to this stack are likely to break.

    New API in tests

    All current methods and attributes used to access the repo on CubicWebTC are deprecated. You may now use a RepoAccess object. A RepoAccess object is linked to a new Session for a specified user. It is able to create Connection, ClientConnection and web side requests linked to this session:

    access = self.new_access('babar') # create a new RepoAccess for user babar
    with access.repo_cnx() as cnx:
        # some work with server side cnx
        cnx.execute(...)
        cnx.commit()
        cnx.execute(...)
        cnx.commit()
    
    with access.client_cnx() as cnx:
        # some work with client side cnx
        cnx.execute(...)
        cnx.commit()
    
    with access.web_request(elephant='babar') as req:
        # some work with web request
        elephant_name = req.form['elephant']
        req.execute(...)
        req.cnx.commit()
    

    By default testcase.admin_access contains a RepoAccess object for the default admin session.

    API changes

    • RepositorySessionManager.postlogin is now called with two arguments, request and session. And this now happens before the session is linked to the request.
    • SessionManager and AuthenticationManager now take a repo object at initialization time instead of a vreg.
    • The async argument of _cw.call_service has been dropped. All calls are now synchronous. The zmq notification bus looks like a good replacement for most async use cases.
    • repo.stats() is now deprecated. The same information is available through a service (_cw.call_service('repo_stats')).
    • repo.gc_stats() is now deprecated. The same information is available through a service (_cw.call_service('repo_gc_stats')).
    • repo.register_user() is now deprecated. The functionality is now available through a service (_cw.call_service('register_user')).
    • request.set_session no longer takes an optional user argument.
    • CubicwebTC does not have repo and cnx as class attributes anymore. They are standard instance attributes. set_cnx and _init_repo class methods become instance methods.
    • set_cnxset and free_cnxset are deprecated. The database connection acquisition and release cycle is now more transparent.
    • The implementation of cascading deletion when deleting composite entities has changed. This brings a semantic change: merely deleting a composite relation no longer entails the deletion of the component side of the relation.
    • _cw.user_callback and _cw.user_rql_callback are deprecated. Users are encouraged to write an actual controller (e.g. using ajaxfunc) instead of storing a closure in the session data.
    • A new entity.cw_linkable_rql method provides the rql to fetch all entities that are already or may be related to the current entity using the given relation.

    Deprecated Code Drops

    • The session.hijack_user mechanism has been dropped.
    • EtypeRestrictionComponent has been removed; its functionality was replaced by facets a while ago.
    • the old multi-source support has been removed. Only copy-based sources remain, such as datafeed or ldapfeed.

  • Logilab's roadmap for CubicWeb on March 7th, 2014

    2014/03/10 by Nicolas Chauvat

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. Here is the report about the Mar 7th, 2014 meeting. The previous report posted to the blog was the January 2014 roadmap.

    Version 3.17

    This version is stable but old and maintenance will stop in a few weeks (current is 3.17.13 and 3.17.14 is upcoming).

    Version 3.18

    This version is stable and maintained (current is 3.18.3 and 3.18.4 is upcoming).

    Version 3.19

    This version is about to be published. It includes a heavy refactoring that modifies sessions and sources to lay the path for CubicWeb 4.

    For details read the list of tickets for CubicWeb 3.19.0.

    Version 3.20

    This version will try to reduce as much as possible the stock of patches in the state "reviewed", "awaiting review" and "in progress". If you have had something in the works that has not been accepted yet, please ready it for 3.20 and get it merged.

    It should also include the work done for CWEP-002 (computed attributes and relations) and CWEP-003 (adding a FROM clause to RQL).

    For details read the list of tickets for CubicWeb 3.20.0.

    Cubes

    Here is a list of cubes that had versions published over the past two months: addressbook, awstats, blog, bootstrap, brainomics, comment, container, dataio, genomics, invoice, mediaplayer, medicalexp, neuroimaging, person, preview, questionnaire, securityprofile, simplefacet, squareui, tag, tracker, varnish, vcwiki, vtimeline.

    Here are the new cubes we are pleased to announce:

    collaboration is a building block that reuses container and helps to define collaborative workflows where entities are cloned, modified and shared.

    Our priorities for the next two months are collaboration and container, then narval/apycot, then mercurial-server, then rqlcontroller and signedrequest, then imagesearch.

    Mid-term goals

    The work done for CWEP-0002 (computed attributes and relations) is expected to land in CubicWeb 3.20.

    The work done for CWEP-0003 (explicit data source federation using FROM in RQL) is expected to land in CubicWeb 3.20.

    Tools to diagnose performance issues would be very useful. Maybe in 3.21?

    Caching session data would help and some work was done on this topic during the sprint in February. Maybe in 3.22?

    WSGI has made progress lately, but still needs work. Maybe in 3.23?

    RESTfulness is a goal. Maybe in 3.24?

    Maybe 3.25 will be in fact 4.0?

    Events

    A spring sprint will take place in Logilab's offices in Paris from April 28th to 30th. We invite all the interested parties to join us there!

    Last but not least

    As already said on the mailing list, other developers and contributors are more than welcome to share their own goals in order to define a roadmap that best fits everyone's needs.

    Logilab's next roadmap meeting will be held at the beginning of May 2014.


  • CubicWeb sprint / winter 2014

    2014/02/12 by Nicolas Chauvat

    This sprint took place at Logilab's offices in Paris on Feb 13/14. People from CEA, Unlish, Crealibre and Logilab teamed up to push CubicWeb forward.

    We did not forget the priorities from the roadmap:

    • CubicWeb 3.17.13 and 3.18.3 were released, and CubicWeb 3.19 made progress
    • the branch about ComputedAttributes and ComputedRelations (CWEP-002) is ready to be merged,
    • the branch about the FROM clause (CWEP-003) made progress (the CWEP was reviewed and part of the resulting spec was implemented),
    • in order to reduce work in progress, the number of patches in state reviewed or pending-review was brought down to 243 (from 302, that is about 60 patches or 20%, which is not bad).
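
    For the record, the exact figures behind that last point can be checked with a couple of lines of Python (the variable names are ours):

```python
before, after = 302, 243

closed = before - after                # patches no longer waiting
reduction_pct = 100 * closed / before  # share of the initial stock

print(f"{closed} patches, {reduction_pct:.1f}% of the initial stock")
```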

  • CubicWeb using Postgresql at its best

    2014/02/08 by Nicolas Chauvat

    We had a chat today with a core contributor to Postgresql from whom we may buy consulting services in the future. We discussed how CubicWeb could get the best out of Postgresql:

    • making use of the LISTEN/NOTIFY mechanism built into PG could be useful (to warn the cache about modified items for example) and PgQ is its good friend;
    • views (materialized or not) are another way to implement computed attributes and relations (see CWEP number 002) and it could be that the Entities table is in fact a view of other tables;
    • implementing RQL as an in-database language could open the door to new things (there is PL/pgSQL, PL/Python, what if we had PL/RQL?);
    • Foreign Data Wrappers written with Multicorn would be another way to write data feeds (see LDAP integration for an example);
    • managing dates can be tricky when users reside in different timezones and UTC is important to keep in mind (unicode/str is a good analogy);
    • for transitive closures that are often needed when implementing access control policies with __permissions, Postgresql can go a long way with queries like "WITH ... (SELECT UNION ALL SELECT RETURNING *) UPDATE USING ...";
    • the fastest way to load tabular data that does not need too much pre-processing is to create a temporary table in memory, then COPY-FROM the data into that table, then index it, then write the transform and load step in SQL (maybe with PL/Python);
    • when executing more than 10 updates in a row, it is better to write into a temporary table in memory, then update the actual tables with UPDATE USING (let's check if the psycopg driver does that when executemany is called);
    • reaching 10e8 rows in a table is at the time of this writing the stage when you should start monitoring your db seriously and start considering replication, partition and sharding;
    • full-text search is much better in Postgresql than the general public thinks it is and recent developments made it orders of magnitude faster than tools like Lucene or Solr and ElasticSearch;
    • when dealing with complex queries (searching graphs maybe), an option to consider is to implement a specific data type, use it in a materialized view and use GIN or GIST indexes over it;
    • for large scientific data sets, it could be interesting to link the numpy library into Postgresql and turn numpy arrays into a new data type;
    • Oh, and one last thing: the object-oriented tables of Postgresql are not such a great idea, unless you have a use case that fits them perfectly and does not hit their limitations (CubicWeb's is_instance_of does not seem to be one of these).
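
    As an illustration of the transitive closure point in the list above, here is a minimal sketch of a recursive common table expression. It uses Python's built-in sqlite3 module as a stand-in for Postgresql (both support WITH RECURSIVE); the in_group table, its columns and the sample data are hypothetical and unrelated to CubicWeb's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE in_group (member TEXT, grp TEXT);
    INSERT INTO in_group VALUES
        ('babar', 'editors'),
        ('editors', 'staff'),
        ('staff', 'everyone'),
        ('celeste', 'staff');
""")


def groups_of(conn, member):
    """Return every group reachable from `member`, directly or indirectly."""
    rows = conn.execute(
        """
        WITH RECURSIVE closure(grp) AS (
            -- direct memberships
            SELECT grp FROM in_group WHERE member = ?
            UNION
            -- memberships inherited through groups already found
            SELECT g.grp
            FROM in_group AS g
            JOIN closure AS c ON g.member = c.grp
        )
        SELECT grp FROM closure ORDER BY grp
        """,
        (member,),
    )
    return [row[0] for row in rows]


babar_groups = groups_of(conn, "babar")
celeste_groups = groups_of(conn, "celeste")
```

    The sketch uses UNION rather than UNION ALL so that a cycle in the membership graph cannot make the recursion loop forever; on acyclic data UNION ALL avoids the deduplication cost.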

    Hopin' I got you thinkin' :)

    http://developer.postgresql.org/~josh/graphics/logos/elephant.png

  • Cubicweb sprints winter/spring 2014

    2014/01/24 by David Douard

    The Logilab team is pleased to announce two Cubicweb sprints to be held in its Paris offices in the upcoming months:

    February 13/14th at Logilab in Paris

    The agenda would be the FROM clause for which a CWEP is still awaited, and the RQL rewriter according to the CWEP02.

    April 28/30th at Logilab in Paris

    Agenda to be defined.

    Join the party

    All users and contributors of CubicWeb are invited to join the party. Just send an email to contact at Logilab.fr if you plan to come.

    http://farm1.static.flickr.com/183/419945378_4ead41a76d_m.jpg

  • Logilab's roadmap for CubicWeb on January 9th, 2014

    2014/01/14 by Nicolas Chauvat

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. Here is the report about the Jan 9th, 2014 meeting. The previous report posted to the blog was the November 2013 roadmap.

    Version 3.17

    This version is stable and maintained (current is 3.17.11 and 3.17.12 is upcoming).

    Version 3.18

    This version was released on Jan 10th. Read the release notes or the details of CubicWeb 3.18.0.

    Version 3.19

    This version includes a heavy refactoring that modifies sessions and sources to lay the path for CubicWeb 4. It is currently the default development head in the repository and is expected to be released before the end of January.

    For details read the list of tickets for CubicWeb 3.19.0.

    Version 3.20

    This version will try to reduce as much as possible the stock of patches in the state "reviewed", "awaiting review" and "in progress". If you have had something in the works that has not been accepted yet, please ready it for 3.20 and get it merged.

    For details read the list of tickets for CubicWeb 3.20.0.

    Cubes

    The current trend is to develop more and more new features in dedicated cubes rather than adding more code to the core of CubicWeb. If you thought CubicWeb development was slowing down, you made a mistake, because cubes are ramping up.

    Here is a list of versions that were published in the past two months: timesheet, postgis, leaflet, bootstrap, worker, container, embed, geocoding, vcreview, trackervcs, vcsfile, zone, dataio, mercurial-server, queueing, questionnaire, genomics, medicalexp, neuroimaging, brainomics, elections.

    Here are the new cubes we are pleased to announce:

    Bootstrap works and we do not create a new application without it.

    relationwidget provides a modal window to edit relations in forms (use uicfg to activate it).

    resourcepicker provides a modal window to insert links to images and files into structured text.

    rqlcontroller allows using the INSERT, DELETE and SET keywords when sending RQL queries over HTTP. It returns JSON. Get used to it and you may forget about asking for specific web services in your apps, for it is a generic web service.

    imagesearch is an image gallery with facets. You may use it as a demo of a visual search tool.

    Mid-term goals

    A new repository was created to have all the CubicWeb Evolution Proposals in one place.

    CWEP-0002 is a work in progress about computed relations and computed attributes, or maybe more. It will be a focus of the next sprint and is targeted at CubicWeb 3.20.

    A new CWEP is expected about adding the FROM keyword to RQL to implement explicit data source federation. It will be a focus of the next sprint and is targeted at CubicWeb 3.21.

    Tools to diagnose performance issues would be very useful. Maybe in 3.22?

    Caching session data would help. Maybe in 3.23?

    WSGI has made progress lately, but still needs work. Maybe in 3.24?

    RESTfulness is a goal. Maybe in 3.25?

    Maybe 3.26 will be in fact 4.0?

    Events

    A sprint will take place in Logilab's offices in Paris around mid-February or at the end of April. We invite all the interested parties to join us there!

    Last but not least

    As already said on the mailing list, other developers and contributors are more than welcome to share their own goals in order to define a roadmap that best fits everyone's needs.

    Logilab's next roadmap meeting will be held at the beginning of March 2014.


  • What's new in CubicWeb 3.18

    2014/01/10 by Aurelien Campeas

    The migration script handles neither sqlite nor mysql instances.

    New functionalities

    • add a security debugging tool (see #2920304)
    • introduce an add permission on attributes, to be interpreted at entity creation time only and allow the implementation of complex update rules that don't block entity creation (before that the update attribute permission was interpreted at entity creation and update time) (see #2965518)
    • the primary view display controller (uicfg) now has a set_fields_order method similar to the one available for forms
    • new method ResultSet.one(col=0) to retrieve a single entity and enforce the result has only one row (see #3352314)
    • new method RequestSessionBase.find to look for entities (see #3361290)
    • the embedded jQuery copy has been updated to version 1.10.2, and jQuery UI to version 1.10.3.
    • initial support for wsgi for the debug mode, available through the new wsgi cubicweb-ctl command, which can use either python's builtin wsgi server or the werkzeug module if present.
    • a rql-table directive is now available in ReST fields
    • cubicweb-ctl upgrade can now generate the static data resource directory directly, without a manual call to gen-static-datadir.
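
    The contract of the new ResultSet.one(col=0) method listed above can be sketched on plain Python rows. This is an illustration under assumptions, not CubicWeb's implementation: the one function and the two exception classes below are hypothetical, and rows are simple tuples rather than entities.

```python
class NoResultError(Exception):
    """Hypothetical error raised when the result has no row."""


class MultipleResultsError(Exception):
    """Hypothetical error raised when the result has more than one row."""


def one(rows, col=0):
    """Return the value in column `col` of the only row, or raise."""
    if len(rows) == 0:
        raise NoResultError("expected exactly one row, got none")
    if len(rows) > 1:
        raise MultipleResultsError(
            "expected exactly one row, got %d" % len(rows))
    return rows[0][col]


single = [("Babar", 42)]
name = one(single)        # value of the first column
age = one(single, col=1)  # value of the second column
```

    Failing loudly on zero or several rows saves callers from silently picking the first row of an ambiguous result.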

    API changes

    • not really an API change, but the entity write permission checks are now systematically deferred to an operation, instead of a) trying in a hook and b) if it failed, retrying later in an operation
    • The default value storage for attributes is no longer String, but Bytes. This opens the road to storing arbitrary python objects, e.g. numpy arrays, and fixes a bug where default values whose truth value was False were not properly migrated.
    • symmetric relations are no longer handled by an rql rewrite but are now handled with hooks (from the activeintegrity category); this may have some consequences for applications that do low-level database manipulations or at times disable (some) hooks.
    • unique together constraints (multi-columns unicity constraints) get a name attribute that maps the CubicWeb constraint entities to the corresponding backend index.
    • BreadCrumbEntityVComponent's open_breadcrumbs method now includes the first breadcrumbs separator
    • entities can be compared for equality and hashed
    • the on_fire_transition predicate accepts a sequence of possible transition names
    • the GROUP_CONCAT rql aggregate function no longer repeats duplicate values, on the sqlite and postgresql backends
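
    The GROUP_CONCAT change in the list above can be pictured at the SQL level. The following sketch uses Python's built-in sqlite3 module and plain SQL; the tagged table is hypothetical, and the use of DISTINCT here is only an assumption about how duplicates may be avoided, not a description of the SQL that CubicWeb actually generates from RQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tagged (entity INTEGER, tag TEXT);
    INSERT INTO tagged VALUES
        (1, 'python'), (1, 'web'), (1, 'python'), (2, 'sql');
""")

# Plain aggregate: the duplicated 'python' value appears twice.
with_dups = conn.execute(
    "SELECT GROUP_CONCAT(tag) FROM tagged WHERE entity = 1"
).fetchone()[0]

# DISTINCT aggregate: each value appears only once.
without_dups = conn.execute(
    "SELECT GROUP_CONCAT(DISTINCT tag) FROM tagged WHERE entity = 1"
).fetchone()[0]
```

    The order of the concatenated values is not guaranteed by SQLite, so code relying on the result should not assume one.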

    Deprecation

    • pyrorql sources have been deprecated. Multisource will be fully dropped in the next version. If you are still using pyrorql, switch to datafeed NOW!
    • the old multi-source system
    • find_one_entity and find_entities in favor of find (see #3361290)
    • the TmpFileViewMixin and TmpPngView classes (see #3400448)

    Deprecated Code Drops

    • ldapuser has been dropped; use ldapfeed now (see #2936496)
    • action GotRhythm was removed, make sure you do not import it in your cubes (even to unregister it) (see #3093362)
    • all 3.8 backward compat is gone
    • all 3.9 backward compat (including the javascript side) is gone
    • the twisted (web-only) instance type has been removed

    For a complete list of tickets, read CubicWeb 3.18.0.


  • Logilab's roadmap for CubicWeb on November 8th, 2013

    2013/11/11 by Nicolas Chauvat

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. Here is the report about the Nov 8th, 2013 meeting. The previous report posted to the blog was the September 2013 roadmap.

    Version 3.17

    This version is stable and maintained (cubicweb 3.17.11 is upcoming).

    Version 3.18

    This version was supposed to be released in September or October, but is stalled at the integration stage. All open tickets were moved to 3.19 and existing patches that are not ready to be merged will be more aggressively delayed to 3.19. The goal is to release 3.18 as soon as possible.

    For details read the list of tickets for CubicWeb 3.18.0.

    Version 3.19

    This version will probably be published early next year (read January or February 2014). It is planned to include a heavy refactoring that modifies sessions and sources to lay the path for CubicWeb 4.

    For details read the list of tickets for CubicWeb 3.19.0.

    Squareui

    Logilab is now developing all its new projects based on Squareui (and Bootstrap 3.0). Squareui can be considered as a usable beta, but not as feature-complete.

    Logilab is looking for a UX designer to work on the general ergonomics of CubicWeb. Read the job offer.

    Mid-term goals

    The mid-term goals include better REST support (Representational State Transfer), complete WSGI (Python's Web Server Gateway Interface) and the FROM clause for RQL queries (to reinvent db federation outside of the core).

    On the front-end side, it would be nice to be able to improve forms, maybe with client-side javascript and better support for a "json on server, js in browser" separation of concerns.

    Cubes

    A cube oauth was contributed in large part by Unlish, a startup that is using CubicWeb to implement its service.

    A cube vcwiki is being developed by Logilab, to manage the content of a wiki with a version control system (built with the cube vcsfile).

    Last but not least

    As already said on the mailing list, other developers and contributors are more than welcome to share their own goals in order to define a roadmap that best fits everyone's needs.

    Logilab's next roadmap meeting will be held at the beginning of January 2014.


  • Apache authentication

    2013/10/09 by Dimitri Papadopoulos

    An Apache front end might be useful, as Apache provides standard log files, monitoring or authentication. In our case, we have Apache authenticate users before they are cleared to access our CubicWeb application. Still, we would like user accounts to be managed within a CubicWeb instance, avoiding separate sets of identifiers, one for Apache and the other for CubicWeb.

    We have to address two issues:

    • have Apache authenticate users against accounts in the CubicWeb database,
    • have CubicWeb trust Apache authentication.

    Apache authentication against CubicWeb accounts

    A possible solution would be to access the identifiers associated with a CubicWeb account at the SQL level, directly from the SQL database underneath a CubicWeb instance. The login and password can be found in the cw_login and cw_upassword columns of the cw_cwuser table. The benefit is that we can use existing Apache modules for authentication against SQL databases, typically mod_authn_dbd. On the other hand this is highly dependent on the underlying SQL database.

    Instead we have chosen an alternate solution, directly accessing the CubicWeb repository. Since we need Python to access the repository, our sysadmins have deployed mod_python on our Apache server.

    We wrote a Python authentication module that accesses the repository using ZMQ. Thus ZMQ needs to be enabled. To enable ZMQ, uncomment and complete the following line in all-in-one.conf:

    zmq-repository-address=zmqpickle-tcp://localhost:8181
    

    The Python authentication module looks like:

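    from mod_python import apache
    from cubicweb import dbapi
    from cubicweb import AuthenticationError
    
    def authenhandler(req):
        """Check the HTTP Basic credentials against the CubicWeb repository."""
        # get_basic_auth_pw() must be called before reading req.user
        pw = req.get_basic_auth_pw()
        user = req.user
    
        # same address as zmq-repository-address in all-in-one.conf
        database = 'zmqpickle-tcp://localhost:8181'
        try:
            cnx = dbapi.connect(database, login=user, password=pw)
        except AuthenticationError:
            return apache.HTTP_UNAUTHORIZED
        else:
            # the connection was only opened to validate the credentials
            cnx.close()
            return apache.OK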
    

    CubicWeb trusts Apache

    Our sysadmins set up Apache to add x-remote-user to the HTTP headers forwarded to CubicWeb - more on the relevant Apache configuration in the next paragraph.

    We then add the cubicweb-trustedauth cube to the dependencies of our CubicWeb application. We simply had to add to the __pkginfo__.py file of our CubicWeb application:

    __depends__ =  {
        'cubicweb': '>= 3.16.1',
        'cubicweb-trustedauth': None,
    }
    

    This cube gets CubicWeb to trust the x-remote-user header sent by the Apache front end. CubicWeb bypasses its own authentication mechanism. Users are directly logged into CubicWeb as the user with a login identical to the Apache login.

    Apache configuration and deployment

    Our Apache configuration looks like:

    <Location /apppath >
      AuthType Basic
      AuthName "Restricted Area"
      AuthBasicAuthoritative Off
      AuthUserFile /dev/null
      require valid-user
    
      PythonAuthenHandler cubicwebhandler
    
      RewriteEngine On
      RewriteCond %{REMOTE_USER} (.*)
      RewriteRule . - [E=RU:%1]
    </Location>
    
    RequestHeader set X-REMOTE-USER %{RU}e
    
    ProxyPass          /apppath  http://127.0.0.1:8080
    ProxyPassReverse   /apppath  http://127.0.0.1:8080
    

    The CubicWeb application is accessed as http://ourserver/apppath/.

    The Python authentication module is deployed as /usr/lib/python2.7/dist-packages/cubicwebhandler/handler.py where cubicwebhandler is the attribute associated to PythonAuthenHandler in the Apache configuration.


  • Brainomics / CrEDIBLE conference report

    2013/10/09 by Vincent Michel

    CubicWeb and the Brainomics project were presented last week at the CrEDIBLE workshop (October 2-4, 2013, Sophia-Antipolis) on "Federating distributed and heterogeneous biomedical data and knowledge". We would like to thank the organizers for this nice opportunity to show the features of CubicWeb and Brainomics in the context of biomedical data.

    http://credible.i3s.unice.fr/lib/tpl/credible/images/credible.png

    Workshop highlights

    • A short presentation of SHI3LD, which defines data access based on conditions expressed as ASK requests. The other part was a state of the art of open data licenses, and of the (poor) availability of licenses expressed in RDF. Future work seems to be an interesting combination of both SHI3LD and RDF-based licenses for data access.
    • MIDAS, an open-source software for sharing medical data. This project could be an interesting source of inspiration for the file sharing part of CubicWeb, even if the (really complicated in my opinion) case of large file downloads is not addressed for now.
    • Federated queries based on FedX - the optimization techniques based on source selection & exclusive groups seem a good approach for avoiding large data transfers and finding some (sub-)optimal ways to join the different data sources. This should be taken into account in the future work on the "FROM" clause in CubicWeb.
    • WebPIE/QueryPIE: a map-reduce-based approach for large-scale reasoning.

    CubicWeb and Brainomics

    The slides of the presentation can be downloaded as a PDF or viewed on slideshare.

    Some people seem confused about the RQL to SQL translation. It relies on a simple translation logic that is implemented in the rql2sql file. This is only an implementation trick, not so different from the one used in RDBMS-based triplestores that have to convert SPARQL into SQL.

    RQL inference: there is no magic behind the RQL inference process. Triplestores store RDF triples that carry their own schema, so they cannot easily know the full data model without looking at all the triples. RQL, in contrast, relies on a relational database with a data model that is fixed at any given moment, which allows inference and simple checks. For example, the question "All the cities of Île-de-France with more than 100 000 inhabitants?" is expressed in RQL as:

    Any X WHERE X region Y, X population > 100000,
                Y uri "http://fr.dbpedia.org/resource/Île-de-France"
    

    and SPARQL:

    select ?ville where {
    ?ville db-owl:region <http://fr.dbpedia.org/resource/Île-de-France> .
    ?ville db-owl:populationTotal ?population .
    FILTER (?population > 100000)
    }
    

    Besides the fact that RQL is less verbose than SPARQL (syntax matters), the simplicity of RQL relies on the fact that it can automatically infer (similarly to SPARQL) that if X is related to Y by the region relation and has a population attribute, it should be a city. If city and district both have the region relation and a population attribute, RQL inference fetches them both transparently; otherwise one can be specific by using the is relation:

    Any X WHERE X is City, X region Y, X population > 100000,
                Y uri "http://fr.dbpedia.org/resource/Île-de-France"
    
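    The inference described above can be illustrated with a small sketch. It is not CubicWeb's actual implementation: the SCHEMA dictionary and the infer_types helper are hypothetical, and only show how a fixed data model lets one deduce the possible types of a variable from the relations used on it.

```python
# Hypothetical, simplified schema: entity type -> relations/attributes it can be the subject of.
SCHEMA = {
    "City": {"region", "population", "name"},
    "District": {"region", "population"},
    "Region": {"uri", "name"},
    "Person": {"name", "lives_in"},
}


def infer_types(used_relations, schema=SCHEMA):
    """Return the entity types that support every relation used on a variable."""
    needed = set(used_relations)
    return sorted(etype for etype, rels in schema.items() if needed <= rels)


# "X region Y, X population > 100000" uses region and population on X
print(infer_types(["region", "population"]))  # ['City', 'District']
# "Y uri ..." uses uri on Y
print(infer_types(["uri"]))  # ['Region']
```

    With this toy schema, X may be a City or a District, which is the case where adding "X is City" narrows the query.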

    RQL also allows subqueries, union, full-text search, stored procedures, ... (see the doc).

    These really interesting discussions convinced us that we should write a journal paper for detailing the theoretical and technical concepts behind RQL and the YAMS schema.


  • Logilab will be in Toulouse métropole Open Data Barcamp tomorrow

    2013/10/08 by Sylvain Thenault

    Meet us tomorrow at Toulouse's Cantine, where several people from Logilab will attend the open data barcamp organized by Toulouse Metropole.

    More info on barcamp.org. We'll probably talk about how CubicWeb manages to import large amounts of open data for reuse.


  • Logilab's roadmap for CubicWeb on September 6th, 2013

    2013/09/17 by Nicolas Chauvat

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. Here is the report about the Sept 6th, 2013 meeting. The previous report posted to the blog was the February 2013 roadmap.

    Version 3.17

    This version is now stable and maintained (release 3.17.7 is upcoming). It added a couple of features and focused on putting CW on a diet by extracting some functionalities provided by the core into external cubes: sioc, embed, massmailing, geocoding, etc.

    For details read what's new in CubicWeb 3.17.

    Version 3.18

    This version is now frozen and will be published as soon as all the patches are tested and merged. Since we have a lot of work for clients until the end of the year at Logilab, the community should feel free to help (as usual) if it wants this version to be released sooner rather than later.

    This version will remove the ldapuser source that is replaced by ldapfeed, implement Cross Origin Resource Sharing, drop some very old compatibility code, deprecate the old version of the multi-source system and provide various other features and bugfixes.

    For details read list of tickets for CubicWeb 3.18.0.

    Version 3.19

    This version will probably be published early next year (read January or February 2014) unless someone who is not working at Logilab takes responsibility for its release.

    It should include the heavy refactoring work done by Pierre-Yves and Sylvain over the past year, that modifies sessions and sources to lay the path for CubicWeb 4.

    For details read list of tickets for CubicWeb 3.19.0 or take a look at this head.

    Squareui

    Since Orbui changes the organization of the default user interface on screen, it was decided to share the low-level Bootstrap-related views that could be shared and to build a SquareUI cube that would conform to the design choices of the default UI.

    Logilab is now developing all its new projects based on Squareui 0.2. Read about it on the mailing list archives.

    Mid-term goals

    The mid-term goals include better REST support (Representational State Transfer), complete WSGI (Python's Web Server Gateway Interface) and the FROM clause for RQL queries (to reinvent db federation outside of the core).

    Cubes

    Our current plan is to extract as much as possible to cubes. We started CubicWeb many years ago with the Python motto "batteries included", but have since realized that having too much in the core contributes to making CubicWeb difficult to learn.

    Since we would very much like the community to grow, we are now aiming for something more balanced, like Mercurial does. The core is designed such that most features can be developed as an extension. Once they are stable, popular extensions can be moved to the main library that is distributed with the core, and be activated with a switch in the configuration file.

    Several cubes are under active development: oauth, signedrequest, dataio, etc.

    Last but not least

    As already said on the mailing list, other developers and contributors are more than welcome to share their own goals in order to define a roadmap that best fits everyone's needs.

    Logilab's next roadmap meeting will be held at the beginning of November 2013.


  • Brainomics - A management system for exploring and merging heterogeneous brain mapping data

    2013/09/12 by Arthur Lutz

    At OHBM 2013, the 19th Annual Meeting of the Organization for Human Brain Mapping, Logilab presented a poster which explains the work done using CubicWeb on brain imaging and genetics data in collaboration with INRIA, INSERM and the CEA during the Brainomics project co-financed by Agence nationale de la Recherche.

    http://www.cubicweb.org/file/3123353/raw/Screenshot%20from%202013-09-12%2010%3A27%3A27.png

    You can download this poster and try the demo online.


  • What's new in CubicWeb 3.17

    2013/06/21 by Aurelien Campeas

    What's new in CubicWeb 3.17?

    New functionalities

    • add a command to compare db schema and file system schema (see #464991)
    • Add CubicWebRequestBase.content with the content of the HTTP request (see #2742453)
    • Add directive bookmark to ReST rendering (see #2545595)
    • Allow user defined final type (see #124342)

    API changes

    • drop typed_eid() in favour of int() (see #2742462)
    • The SIOC views and adapters have been removed from CubicWeb and moved to the sioc cube.
    • The web page embedding views and adapters have been removed from CubicWeb and moved to the embed cube.
    • The email sending views and controllers have been removed from CubicWeb and moved to the massmailing cube.
    • RenderAndSendNotificationView is deprecated in favor of ActualNotificationOp. The new operation uses the more efficient data idiom.
    • Looping tasks can now have an interval <= 0. A negative interval disables the looping task entirely.
    • We now serve html instead of xhtml. (see #2065651)

    Deprecation

    • ldapuser has been deprecated. It will be removed in a future version. If you are still using ldapuser switch to ldapfeed NOW!
    • hijack_user has been deprecated. It will be dropped soon.

    Deprecated Code Drops

    • The progress views and adapters have been removed from CubicWeb. These classes were deprecated since 3.14.0. They are still available in the iprogress cube.
    • The part of the API deprecated since 3.7 was dropped.

  • We're going to PGDay France, the Postgresql Community conference

    2013/06/11 by Arthur Lutz

    A few members of the CubicWeb team are going to attend the French PostgreSQL community conference in Nantes (France) on the 13th of June.

    http://www.cubicweb.org/file/2932005/raw/hdr_left.png

    We're excited to learn more about the following topics that are relevant to CubicWeb's development and features :

    https://www.pgday.fr/_media/pgfr2.png

    Obviously we'll pay attention to all the talks during the day. If you're attending, we hope to see you there.


  • OpenData meets the Semantic Web at WOD2013

    2013/06/10 by Arthur Lutz

    With a few people from Logilab we went to the 2nd International Workshop on Open Data (WOD), on the 3rd of June.

    Although the main focus was an academic take on OpenData, a lot of talks were related to the Semantic Web technologies and especially LinkedData.

    http://www.logilab.org/file/144837/raw/banniere-wod2013.png

    The full program (and papers) is on the following website. Here is a quick review of the things we thought worth sharing.

    • privacy oriented ontologies : http://l2tap.org/
    • interesting automations done to suggest alignments when initial data is uploaded to an opendata website
    • some opendata platforms have built-in APIs to get files, one example is Socrata : http://dev.socrata.com/
    • some work is being done to scale processing of linked data in the cloud (did you know you could access readily available datasets in the Amazon cloud? DBPedia, for example)
    • the data stored in wikipedia can be a good source of vocabulary on certain machine learning tasks (and in the future, wikidata project)
    • there is an RDF extension to Google Refine (or OpenRefine), but we haven't managed to get it working out of the box,
    • WebSmatch uses morphological operators (erosion / dilation) to identify grids and zones in Excel Spreadsheets and then aligns column data on known reference values (e.g. country lists).

    We naturally enjoyed the presentation made by Romain Wenz about http://data.bnf.fr with the unavoidable mention of Victor Hugo (and CubicWeb).

    Thanks to the organizers of the conference and to the National French Library for hosting the event.


  • data.bnf.fr gets the Stanford Prize for Innovation in Research Libraries

    2013/03/01 by Nicolas Chauvat

    data.bnf.fr and Gallica have just been awarded the Stanford Prize for Innovation in Research Libraries 2013. The CubicWeb community is very pleased to see that data.bnf.fr, which is built with CubicWeb, is being recognized at the top international level as leading innovation in its domain! Read the comments of the judges for more details.


  • CubicWeb at Data Tuesday on Feb 26th 2013

    2013/02/14 by Nicolas Chauvat

    CubicWeb was showcased at Data Tuesday on Feb 26th 2013. The other presentations were interesting, especially shacache.org, the soon-to-be-launched OpenMeteoData and the very useful scikit.learn.


  • CubicWeb rewarded at Dataconnexion 2013

    2013/02/06 by Nicolas Chauvat

    CubicWeb got rewarded yesterday at the award ceremony of the Dataconnexions 2013 contest.

    http://www.cubicweb.org/2710848?vid=download

    Dataconnexions is a contest organized by Etalab, the part of the French State that is in charge of data.gouv.fr, which catalogs the open data published by the French administration.

    Congratulations to all the developers and users of CubicWeb and welcome to the people who will join the CW community thanks to the media coverage we are now experiencing.

    Read the press release and the slides.


  • Logilab's roadmap for CubicWeb as of February 2013

    2013/02/04 by Nicolas Chauvat

    The Logilab team now holds a roadmap meeting every two months to plan its CubicWeb development effort. Here are the decisions that were taken on Feb 1st, 2013.

    Version 3.17

    This version should be published before the end of March and will finish all the things that are work in progress. It will include:

    • the refactoring necessary to introduce persistent sessions,
    • the shrinking of web/views: everything that does not deserve its own cube (like sioc, embed, geocoding, etc) will go into a cube named legacyui (this will open the door to squareui),
    • stop serving pages with "content-type: application/xhtml",
    • handling postgresql schemas (will require a new version of logilab.database),
    • a new logo.

    Squareui

    Once the legacyui cube has been extracted (in version 3.17), it will be possible to move forward swiftly with squareui. Due to its other duties, the core CW team cannot be expected to develop squareui. Interested people will be in charge, and ideally the squareui cube could be released when CubicWeb 3.17 is published.

    Cleaning up the backlog

    The lead CW developers will spend about 20% of their time cleaning up the ticket backlog at the forge (900 open tickets and 50 in progress !)

    The first step will be to reduce the number of tickets "in progress", then to organize the open tickets and merge the duplicates.

    Version 3.18

    This version is due at the end of May 2013. It will include:

    • persisting sessions,
    • WSGI,
    • RESTfulness: support for HTTP verbs PUT / DELETE, enforcement of the semantics of GET / POST (may be difficult to maintain backward-compatibility)

    Mid-term goals

    The mid-term goals are:

    • possibility to add new base types (Array, HStore, Geometry, TSVector, etc.) that would use extensions from the SQL backend

    • FROM clause in rql queries

    • websockets

    • defining attribute on relations and defining "virtual" relations or rules:

      class Contribution(EntityType):
          author = SubjectRelation('Person', cardinality='1*', inlined=True)
          book = SubjectRelation('Book', cardinality='1*', inlined=True)
          role = SubjectRelation('Role', cardinality='1*', inlined=True)
      
      preface_writer = VirtualRelation('C is Contribution, C author S, C book O, '
                                       'C role R, R name "preface writer"')
      

      And:

      Any P WHERE B is Book, P preface_writer B
      

      Will this require a materialized view in the database, a standard relation maintained by hooks, or rewriting the RQL on the fly? Time will tell.

    • cards with logic (mustache js templates for example)

    • coffeescript ? brython ? javascript ? prototype something with CubicDB + WebService that outputs json + user interface in full javascript

    • package Cubic(Web)DB and CubicWeb separately?

    • think about the overall architecture (using WSGI, persistent sessions, etc.), and find solutions that fit a distributed architecture (look at paste.deploy, circus, etc.)

    • clean up the javascript in web/data/*.js

    • configurable metadata, managing the size of the entities table

    • more SPARQL

    • namespaces for the data models of the cubes
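
    The virtual relation item above asks, among other options, whether the RQL could be rewritten on the fly. The sketch below illustrates that option only. It is a toy string rewriter, not CubicWeb code: the RULES table, the expand_virtual_relations helper and the fresh variable naming scheme (C_VR1, R_VR1) are all assumptions made for the example, and it presumes that rule variables are single uppercase letters with S and O standing for subject and object.

```python
import itertools
import re

# Rule body taken from the roadmap example; S and O stand for subject and object.
RULES = {
    "preface_writer": 'C is Contribution, C author S, C book O, '
                      'C role R, R name "preface writer"',
}


def expand_virtual_relations(rql, rules=RULES):
    """Replace each '<subj> <virtual relation> <obj>' by the body of its rule."""
    fresh = itertools.count(1)
    for rtype, body in rules.items():
        pattern = re.compile(r"\b(\w+) %s (\w+)\b" % re.escape(rtype))

        def replace(match, body=body):
            n = next(fresh)
            mapping = {"S": match.group(1), "O": match.group(2)}

            def rename(var):
                name = var.group(0)
                # Other rule variables get a fresh (hypothetical) name to avoid clashes.
                return mapping.get(name, "%s_VR%d" % (name, n))

            # Single uppercase letters standing alone are treated as variables.
            return re.sub(r"\b[A-Z]\b", rename, body)

        rql = pattern.sub(replace, rql)
    return rql


print(expand_virtual_relations("Any P WHERE B is Book, P preface_writer B"))
```

    A real implementation would work on the parsed RQL syntax tree rather than on strings, so that quoted literals and existing variable names are handled safely.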

    As already said on the mailing list, other developers and contributors are more than welcome to share their own goals in order to define a roadmap that best fits everyone's needs.

    Logilab's next roadmap meeting will be held at the beginning of April 2013.


  • What's new in CubicWeb 3.16

    2013/01/23 by Aurelien Campeas

    What's new in CubicWeb 3.16?

    New functionalities

    • Add a new dataimport store (SQLGenObjectStore). This store enables a fast import of data (entity creation, link creation) in CubicWeb, by directly flushing information in SQL. This may only be used with PostgreSQL, as it requires the 'COPY FROM' command.

    API changes

    • Orm: set_attributes and set_relations are unified (and deprecated) in favor of cw_set that works in all cases.

    • db-api/configuration: all the external repository connection information is now in an URL (see #2521848), allowing to drop specific options of pyro nameserver host, group, etc and fix broken ZMQ source. Configuration related changes:

      • Dropped 'pyro-ns-host', 'pyro-instance-id', 'pyro-ns-group' from the client side configuration, in favor of 'repository-uri'. NO MIGRATION IS DONE, supposing there is no web-only configuration in the wild.
      • Stop discovering the connection method through repo_method class attribute of the configuration, varying according to the configuration class. This is a first step on the way to a simpler configuration handling.

      DB-API related changes:

      • Stop indicating the connection method using ConnectionProperties.
      • Drop _cnxtype attribute from Connection and cnxtype from Session. The former is replaced by an is_repo_in_memory property and the latter is totally useless.
      • Turn repo_connect into _repo_connect to mark it as a private function.
      • Deprecate in_memory_cnx which becomes useless, use _repo_connect instead if necessary.
    • the "tcp://" uri scheme used for ZMQ communications (in a way reminiscent of Pyro) is now named "zmqpickle-tcp://", so as to make room for future zmq-based lightweight communications (without python objects pickling).

    • Request.base_url gets a secure=True optional parameter that yields an https url if possible, allowing hook-generated content to send secure urls (e.g. when sending mail notifications)

    • Dataimport ucsvreader gets a new boolean ignore_errors parameter.

    Unintrusive API changes

    • Drop of cubicweb.web.uicfg.AutoformSectionRelationTags.bw_tag_map, deprecated since 3.6.

    User interface changes

    • The RQL search bar now has some auto-completion support: relation types or entity types can be suggested while typing. It is an awesome improvement over the current behaviour!
    • The action box associated with table views (from tableview.py) has been transformed into a nice-looking series of small tabs; it means that the possible actions are immediately visible and need not be discovered by clicking on an almost invisible icon on the upper right.
    • The uicfg module has moved to web/views/ and ui configuration objects are now selectable. This will reduce the amount of subclassing and whole methods replacement usually needed to customize the ui behaviour in many cases.
    • Remove changelog view, as neither cubicweb nor known cubes/applications were properly feeding related files.

    Other changes

    • 'pyrorql' sources will be automatically updated to use an URL to locate the source rather than configuration option. 'zmqrql' sources were broken before this change, so no upgrade is needed...
    • Debugging filters for Hooks and Operations have been added.
    • Some cubicweb-ctl commands used to show the output of msgcat and msgfmt; they don't anymore.

  • December 2012 CubicWeb Sprint Report

    2012/12/21 by Nicolas Chauvat

    For two days, on Dec 13th/14th 2012, ten hackers gathered at Logilab to improve the user interface of CubicWeb. This hackathon was initiated by Crealibre. About a year ago, they started the Orbui project, a new user interface for CubicWeb based on the Bootstrap HTML/CSS framework.

    http://www.orbui.com/images/itisa960.png

    Several projects at Logilab and Crealibre proved that Orbui was heading in the right direction, but that it had to fight with the default user interface of CubicWeb. Orbui makes different design/ergonomic choices and needs a different HTML/CSS structure and different Javascript components.

    Sylvain published a roadmap back in May with a section titled "on the road to Bootstrap". After more than half a day of heated debate on the first day, it was decided to follow the direction he pointed to. We started extracting the default user interface from CubicWeb and turning it into a set of cubes:

    • cubicweb-legacyui: css, views and templates extracted from CubicWeb 3.16, so as to provide full backward compatibility
    • cubicweb-bootstrap: empty cube with only bootstrap version 2.2.2 in data/
    • cubicweb-squareui: bootstrapified version of legacyui (slightly altered to benefit from the bootstrap css without breaking backward compatibility too hard)

    At the end of the sprint, one could add_cube('squareui') on an existing application and keep it usable... and get "some kind of responsiveness" for free, thus proving that we were on the right track.

    A lot of work is still ahead of us, but we have moved a few steps forward towards the goal of making it easier to implement different UIs on top of CubicWeb 3.17.

    For the curious, here is what the skeleton of legacyui.views.maintemplate (aka cw.web.views.maintemplate) looks like:

    <body> (MainTemplate.template_body_header)
      <table id="header"> (HTMLPageHeader.main_header)
        for header in self.headers:
           <td id="header-{left,center,right}">
               render selected components(ctxcomponents, header-{left,center,right})
           </td>
      </table>
      <div id="stateheader"> HTMLPageHeader.call
         <div class="stateMessage"> HTMLPageHeader.state_header
      </div>
      <div id="page"> MainTemplate.template_body_header
        <table id="mainLayout"> MainTemplate.template_body_header
          if boxes (selected components(ctxcomponents, left)): MainTemplate.nav_column
            <td id="navColumnLeft">
              <div class="navboxes">
                 render boxes
              </div>
            </td>
          <td id="contentColumn"> MainTemplate.template_body_header
             render selected components(rqlinput)
             render selected components(applmessages)
             if navtop (selected components(ctxcomponents, navtop)): HTMLContentHeader.call
               <div id="contentheader">
                 render components
               </div>
               <div class='clear'/>
             <div id="pageContent"> MainTemplate.call
               if vtitle:
                  <div class="vtitle" />
               if etypenavigation:
                  render etypenavigation
               view pagination
               <div id="contentmain">
                  render view
               </div>
               view pagination
             </div>
             if navbottom (selected components(ctxcomponents, navbottom)): HTMLContentFooter.call
               <div id="contentfooter">
                 render components
               </div>
          </td>
          if boxes (selected components(ctxcomponents, right)): MainTemplate.nav_column
            <div id="navColumnRight">
              <div class="navboxes">
                 render boxes
              </div>
        </table>
      </div>
      <div id="footer"> HTMLPageFooter.call
         render actions selected (actions, 'footer')
      </div>
    </body>
    

    and here is what the skeleton from squareui.views.maintemplate looks like:

    <body>
    <div class="container-fluid">
      <div id="header" class="row-fluid">
        <!-- .header -->
      </div>
      <div class="row-fluid">
        <div id="navColumnLeft" class="span3">
          <!-- .leftcolumn -->
        </div>
        <div id="contentColumn" class="span6">
          <!-- .contentcol -->
          <div class="row-fluid">
            <div id="contentheader" class="span12">
              <!-- .contentheader -->
            </div>
          </div>
          <div class="row-fluid">
            <div id="contentmain" class="span12">
              <!-- .contentmain -->
            </div>
          </div>
          <div class="row-fluid">
            <div id="contentfooter" class="span12">
              <!-- .contentfooter -->
            </div>
          </div>
        </div>
        <div id="navColumnRight" class="span3">
          <!-- .rightcolumn -->
        </div>
      </div>
      <div id="footer" class="row-fluid">
        <!-- .footer -->
      </div>
    </div>
    </body>
    

    Stay tuned for the updates on this (important) topic!


  • Géo − Geonames alignment

    2012/12/20 by Simon Chabot

    This blog post describes the main points of the alignment process between the French National Library's Géo repository of data, and the data extracted from Geonames.

    Alignment is the process of finding similar entities in different repositories. The Géo repository of data contains a lot of locations and the goal is to find those locations in the Geonames repository, and to be able to say that a given location in *Géo* is the same as this one in *Geonames*. For that purpose, Logilab developed a library, called Nazca, to build those links.

    To process the alignment between Géo and Geonames, we divided the Géo repository into two groups:

    • A group gathering the Géo data having information about longitude and latitude.
    • Another, gathering the data having no information about longitude and latitude.

    Group 1 - Data having geographical information

    The alignment process is made in five steps (see figure below):

    1. Data gathering

    We gather the information needed to align, that is to say, the unique identifier, the name, the longitude and the latitude. The same applies to the Geonames data.

    2. Standardization

    This step aims to make the data as standard as possible, i.e. set it to lower case, remove the stop words, remove the punctuation and so on.

    4. Alignment

    Thanks to the Kdtree, we can quickly find the geographically nearest neighbours. During this fourth step, we loop over the nearest neighbours and assign to each a grade according to the similarity between its name and the name of the location we're looking for, using the Levenshtein distance. The alignment is made with the best graded one.

    5. Saving the results

    Finally, we save all the results into a file.
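
    The standardization and alignment steps above can be sketched in plain Python. This is only an illustration, not the Nazca API: the normalize, levenshtein, nearest and align helpers, the stop-word list and the sample records are all assumptions. In particular, a brute-force sort on a crude latitude/longitude distance stands in for the Kdtree lookup.

```python
import math
import string

STOP_WORDS = {"the", "of", "le", "la", "les", "de", "du"}  # illustrative list only
PUNCTUATION_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def normalize(name):
    """Lower-case, replace punctuation by spaces and drop stop words."""
    words = name.lower().translate(PUNCTUATION_TO_SPACE).split()
    return " ".join(word for word in words if word not in STOP_WORDS)


def levenshtein(a, b):
    """Edit distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(previous[j] + 1,            # deletion
                               current[j - 1] + 1,         # insertion
                               previous[j - 1] + (char_a != char_b)))  # substitution
        previous = current
    return previous[-1]


def nearest(location, candidates, k=3):
    """Brute-force stand-in for the Kdtree: the k closest candidates in degrees."""
    def distance(candidate):
        return math.hypot(candidate["lat"] - location["lat"],
                          candidate["lon"] - location["lon"])
    return sorted(candidates, key=distance)[:k]


def align(location, candidates, k=3):
    """Among the k nearest candidates, keep the one with the closest name."""
    target = normalize(location["name"])
    return min(nearest(location, candidates, k),
               key=lambda c: levenshtein(target, normalize(c["name"])))


# Made-up sample records
geo_paris = {"name": "Paris", "lat": 48.85, "lon": 2.35}
geonames = [
    {"id": 1, "name": "Paris", "lat": 48.8534, "lon": 2.3488},
    {"id": 2, "name": "Montmartre", "lat": 48.8867, "lon": 2.3431},
    {"id": 3, "name": "Paris", "lat": 33.66, "lon": -95.55},
]
print(align(geo_paris, geonames, k=2)["id"])  # 1
```

    Restricting the name comparison to the geographic neighbours is what keeps the homonym far away (id 3) out of the running.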

    Group 2 - Data having no geographical information

    Let's have a look at the data having no information on the longitude and the latitude. The steps are more or less the same as before, except that we cannot find neighbours using a Kdtree. So we use another method to find locations whose names have a fairly high level of similarity. This method is called MinHashing, and it has been shown to be quite relevant for this purpose.

    To minimise the number of mistakes, we try to group locations by country, knowing that the country is often written in the location's preferred_label. This pre-treatment helps us to filter out the cities having the same name but located in different countries. For instance, there is Paris in France, there is Paris in the United States, and there is Paris in Canada. So the alignment is made country by country.

    The fourth and the fifth steps remain the same.
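
    For readers unfamiliar with MinHashing, here is a minimal sketch of the idea, assuming character trigrams as shingles and salted SHA-256 digests as the family of hash functions. The function names and parameters are hypothetical and do not reflect how Nazca implements it.

```python
import hashlib


def shingles(text, size=3):
    """Set of overlapping character n-grams of a lower-cased name."""
    text = text.lower()
    return {text[i:i + size] for i in range(max(len(text) - size + 1, 1))}


def minhash_signature(items, num_hashes=64):
    """For each salted hash function, keep the smallest hash over the set."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(
                hashlib.sha256(("%d:%s" % (seed, item)).encode("utf-8")).digest()[:8],
                "big")
            for item in items))
    return signature


def estimated_similarity(sig_a, sig_b):
    """Fraction of matching positions, an estimate of the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


paris = minhash_signature(shingles("Paris"))
paris_fr = minhash_signature(shingles("Paris France"))
london = minhash_signature(shingles("London"))
print(estimated_similarity(paris, paris_fr))
print(estimated_similarity(paris, london))
```

    Names sharing many trigrams get signatures that agree in many positions, so candidate pairs can be found by comparing short signatures instead of every pair of full names.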

    Results obtained

    The results we got are as follows:

             Amount of locations   Aligned   Non-aligned
    Group 1  97572                 (89.3%)   (10.7%)
    Group 2  150528                (72.9%)   (27.1%)
    Total    248100                (79.3%)   (20.7%)

    One problem we met is the language used to describe the location. Indeed, the similarity grade is given according to the distance between the names, and one can notice that Londres and London, for instance, do not have the same spelling although they represent the same location.

    Results improvement

    In order to improve the results a little, we had a closer look at the 10.7% of non-aligned locations in the first group. The language problem mentioned before was pretty clear. So we decided to use the following definition: two locations are identical if they are geographically very close. Using this definition, we set the name aside and focus on the longitude and the latitude only.

    To estimate the exactness of the results, we picked 50 locations at random and checked them manually. And the results are pretty good! 98% are correct (49/50). That's how, based on a purely geographical approach, we can increase the coverage rate of the results (from 89.3% to 99.6%).
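
    A purely geographical test of this kind could look like the sketch below. The haversine formula is a standard way to compute great-circle distances, but the same_place helper and its 1 km threshold are assumptions for the example: the post does not say which distance or threshold was used.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def same_place(a, b, threshold_km=1.0):
    """Names are ignored: two records match if they are closer than the threshold."""
    return haversine_km(a["lat"], a["lon"], b["lat"], b["lon"]) <= threshold_km


londres = {"name": "Londres", "lat": 51.5074, "lon": -0.1278}
london = {"name": "London", "lat": 51.5074, "lon": -0.1278}
paris = {"name": "Paris", "lat": 48.8566, "lon": 2.3522}
print(same_place(londres, london))  # True
print(same_place(londres, paris))   # False
```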

    In the end, we get these results:

                Amount of locations   Aligned    Non-aligned
    Group 1     97572                 (99.6%)    (0.4%)
    Group 2     150528                (72.9%)    (27.1%)
    Total       248100                (83.4%)    (16.6%)
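
    The aligned percentage in each Total row is the average of the two group rates, weighted by group size. The short check below recomputes it from the figures in the tables. The function name overall_rate is only for illustration.

```python
def overall_rate(groups):
    """Size-weighted aligned rate, in percent, from (size, rate) pairs."""
    total = sum(size for size, _ in groups)
    aligned = sum(size * rate for size, rate in groups)
    return 100.0 * aligned / total


# (number of locations, aligned fraction) for Group 1 and Group 2
before = overall_rate([(97572, 0.893), (150528, 0.729)])
after = overall_rate([(97572, 0.996), (150528, 0.729)])

print(round(before, 1), round(after, 1))
```

    The non-aligned share is simply the complement of the aligned one.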

  • Application to the dataconnexions#2 contest

    2012/12/20 by Nicolas Chauvat

    On behalf of the community of CubicWeb users and developers, I have just submitted the following application to the dataconnexions#2 contest.

    1. Project description questionnaire

    Project title

    CubicWeb - free software development platform for the semantic web

    Chosen contest category

    Choose among: General public / Professional / Public interest / Mobility and territories

    Public interest (?)

    What problem are you trying to solve?

    Describe the problem(s) your project is trying to solve, and its (their) importance: market size, potential frequency of use, population concerned, possible public-service benefits, etc. (1000 characters maximum).

    The advent of the semantic web and of Open Data calls for suitable tools to develop data-centric applications.

    These tools must make it easy to import data, to link data coming from disjoint sources, to republish it, and to query and visualise it.

    Ideally, these tools should use and comply with the open standards of the internet, in order to simplify communication and exchange, and also to ease development for multiple devices (computer, tablet, smartphone).

    How are you trying to solve it?

    Describe your product, service or visualisation, in its current form and, where applicable, after any future developments you are considering. Specify the public dataset(s) you use for this purpose (1000 characters maximum).

    CubicWeb is a free software development platform for the semantic web.

    CubicWeb lets developers focus on the specifics of their application rather than having to reinvent the essential building blocks for importing, merging, publishing, querying and visualising data.

    CubicWeb is free software developed in the open on the internet by a small but already international community. CubicWeb is available under the LGPL license, complies with W3C standards (RDF, SPARQL, HTML5, CSS3, Responsive Design) and natively handles several data models that serve as de facto standards (FOAF, SIOC, DOAP, etc).

    What is your business model?

    Describe the business model of your project, that is, the conditions for its sustainability and growth: business plan and sales projections for an entrepreneurial project; objectives, key donors and stakeholders for a civic project (1000 characters maximum).

    Several commercial companies rely on CubicWeb today to sell IT services. The goal of this community is to grow in order to reach a wider audience and to share more of the maintenance and development costs of the CubicWeb platform.

    CubicWeb users currently include the Bibliothèque nationale de France, EDF, GDF-Suez, the Commissariat à l'Energie Atomique, the Centre National d'Etudes Spatiales, the Institut Radioprotection et Sûreté Nucléaire, INRIA, medical research laboratories and companies in the IT sector.

    What is the current status of your project?

    Describe the milestones you have reached, the resources mobilised, the indicators and metrics already established, etc. (1000 characters maximum).

    The CubicWeb project stems from an R&D effort started in 2001 by the company Logilab, whose goal was to equip itself with a tool for developing data-centric applications that comply with the semantic web standards being drafted at the W3C.

    Since 2008, CubicWeb has been free software whose development is carried out in the open on the internet.

    Who is supporting you on this project?

    Describe the team supporting you in your project (if any), your skills, experience and achievements, as well as any partners backing you (1000 characters maximum).

    N/A.

    How can DataConnexions help you?

    Give any additional details you would like to provide about your project, and explain how DataConnexions can help sustain its development (1000 characters maximum).

    Several commercial companies rely on CubicWeb today to sell IT services. Industrial uses of CubicWeb are varied and involve important, even critical, applications.

    CubicWeb is a little-known (and little-recognised) tool and its community is small today, despite its solid references and the recent enthusiasm for Open Data.

    DataConnexions could be a platform and a showcase allowing CubicWeb to find new application developers who would rather benefit from the experience accumulated in this free tool than rediscover and work around, one by one, the pitfalls encountered over the ten years it took to build.

    The goal of this application is therefore to grow the community of CubicWeb users and contributors.

    2. Presentation video

    Link to download a video describing the Project and its features, with a maximum length of 3 minutes

    It is not the quality of the video that is judged, but the project itself. The video must give an account of the project's features. Applicants are encouraged to make a screen capture or a "screencast" (for example with tools such as CamStudio, Jing or Screenr).

    Demonstration of using CubicWeb to import and visualise the list of French train stations downloaded from data.gouv.fr. Selection of stations with the facet filter and display on an openstreetmap base map, then export to RDF, JSON and CSV.

    CubicWeb is a free software development platform for the semantic web, which lets developers focus on the specifics of their application rather than having to reinvent the essential building blocks for importing, merging, publishing, querying and visualising data.

    Link to the video on youtube. Mirror of the video on vimeo.com.

    3. Online access to the project

    Link to access the Project, or the compiled and interpretable computer code of the Project

    For example: a URL to view or, where applicable, download the application, together with instructions for doing so if necessary. The application must be easy to install and easy to demonstrate on its target platform.

    http://www.cubicweb.org

    4. Communication materials

    Non-confidential description

    Describe the Project in terms suitable for release to the general public: non-confidential, understandable by as many people as possible, and highlighting the value of the project (1000 characters maximum).

    See "How are you trying to solve it?"

    Visual description

    Link to a visual that describes and showcases the project and its features (screenshot, home page, descriptive diagram).

    /file/2544364?vid=download

    Project logo

    Link to the project logo.

    /file/2544362?vid=download

  • Links roundup from dotjs.eu

    2012/12/05 by Arthur Lutz

    A few people from Logilab attended the dotjs conference in Paris last week. The conference wasn't exactly what we expected: we were hoping for more technical talks. Nevertheless, some of the things we saw were quite interesting, and some of them could be relevant to CubicWeb.

    http://www.cubicweb.org/file/2532779?vid=download

    Here is a raw roundup of links collected last Friday:

    • Chrome developer tools
    • yeoman
    • grunt.js
    • backbone.js
    • Dart
    • TypeScriptLang
    • Express.js
    • Mocha
    • Testacular
    • SASS
    • Angular.js
    • Enyo.js
    • Socket.io
    • when.js
    • Coffeescript
    • Source Maps explained


  • CubicWeb sprint in Paris - 2012/12/13-14

    2012/11/11 by Nicolas Chauvat

    Topics

    To be decided. Some possible topics are:

    • Work on the CubicWeb front end: anything related to TheMainTemplate, primaryview, reledit, table handling, etc.
    • Share progress on the OrbUI project and integrate it further into CW
    • Things to do for HTML5 and bootstrap integration
    • Work on ideas from Thoughts on CubicWeb 4
    • ...

    Other ideas are welcome; please bring them up on cubicweb@lists.cubicweb.org

    Location

    This sprint will take place in December 2012, from Thursday the 13th to Friday the 14th. You are more than welcome to come along, help out and contribute. An introduction is planned for newcomers.

    Network resources will be available for those bringing laptops.

    Address : 104 Boulevard Auguste-Blanqui, Paris. Ring "Logilab" (googlemap)

    Metro : Glacière

    Contact : http://www.logilab.fr/contact

    Dates : 13/12/2012 to 14/12/2012

    Participants

    • Celso Flores (Crealibre - Mexico)
    • Carine Fourrier (Crealibre - Mexico)
    • ...
