CubicWeb Blog

News about the framework and its uses.

  • Monitor all the things! ... and early too!

    2016/09/16 by Arthur Lutz

    Following the "release often, release early" mantra, I thought it might be a good idea to apply it to monitoring on one of our client projects. So right from the demo stage, where we deliver a new version every few weeks (and sometimes every few days), we set up some monitoring.

    https://www.cubicweb.org/file/15338085/raw/66511658.jpg

    Monitoring performance

    The project is an application built with the CubicWeb platform, with some ElasticSearch for indexing and searching. As with any complex stack, there are a great number of places where one could monitor performance metrics.

    https://www.cubicweb.org/file/15338628/raw/Screenshot_2016-09-16_12-19-21.png

    Here are a few things we have decided to monitor, and with what tools.

    Monitoring CubicWeb

    To monitor our running Python code, we have decided to use statsd, since it is already built into CubicWeb's core. Out of the box, you can configure a statsd server address in your all-in-one.conf configuration. That will send out some timing statistics about some core functions.

    The statsd server (there are numerous implementations; we use a simple one: python-pystatsd) gets the raw metrics and outputs them to carbon, which stores the time series data in whisper files (which can be swapped out for a different technology if need be).

    https://www.cubicweb.org/file/15338392/raw/Screenshot_2016-09-16_11-56-44.png

    If we are curious about a particular function or view that might be taking too long to generate or slow down the user experience, we can just add the @statsd_timeit decorator there. Done. It's monitored.

    statsd monitoring is fire-and-forget: metrics are sent over UDP without waiting for a reply, so it should not have any noticeable impact on the performance of what you are monitoring.
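
    To make the fire-and-forget idea concrete, here is a minimal sketch of a timing decorator that sends a statsd timer over UDP. It is not CubicWeb's statsd_timeit implementation: the names timeit_udp and format_timing and the address 127.0.0.1:8125 are assumptions made for this illustration.

```python
import functools
import socket
import time

# Assumed address; 8125 is the port statsd servers conventionally listen on.
STATSD_ADDRESS = ("127.0.0.1", 8125)


def format_timing(name, elapsed_ms):
    """Build a statsd timer packet: '<name>:<milliseconds>|ms'."""
    return "{}:{:.0f}|ms".format(name, elapsed_ms).encode("ascii")


def timeit_udp(func, address=STATSD_ADDRESS):
    """Time each call to func and send the duration to statsd over UDP."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_ms = (time.time() - start) * 1000
            try:
                # UDP: no connection and no acknowledgement, so nothing to wait for.
                # (A real client would reuse one socket instead of opening one per call.)
                with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
                    sock.sendto(format_timing(func.__name__, elapsed_ms), address)
            except OSError:
                pass  # monitoring must never break the monitored code

    return wrapper


@timeit_udp
def render_view():
    """Stand-in for a view that takes a little while to generate."""
    time.sleep(0.01)
    return "<html>...</html>"
```

    If no statsd server is listening, the packet is simply lost and render_view() behaves exactly as before, which is the whole point of the approach.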

    Monitoring Apache

    Simply enough we re-use the statsd approach by plugging in an apache module to time the HTTP responses sent back by apache. With nginx and varnish, this is also really easy.

    https://www.cubicweb.org/file/15338407/raw/Screenshot_2016-09-16_11-56-54.png

    One of the nice things about this part is that we can then get graphs of errors since we will differentiate OK 200 type codes from 500 type codes (HTTP codes).

    Monitoring ElasticSearch

    ElasticSearch exposes some metrics through the GET /_stats endpoint; the same goes for individual nodes, individual indices and even the cluster level. Some popular tools can be installed through the ElasticSearch plugin system or with Kibana (which has a plugin system too).

    We decided on a different approach that fitted well with our other tools (and demonstrates their flexibility!) : pull stats out of ElasticSearch with SaltStack, push them to Carbon, pull them out with Graphite and display them in Grafana (next to our other metrics).

    https://www.cubicweb.org/file/15338399/raw/Screenshot_2016-09-16_11-56-34.png

    On the SaltStack side, we wrote a two line execution module (elasticsearch.py)

    import requests
    def stats():
        # Return the statistics of the local ElasticSearch node as a dictionary
        return requests.get('http://localhost:9200/_stats').json()
    

    This gets shipped using the custom execution modules mechanism (_modules and saltutil.sync_modules), and is executed every minute (or less) by the salt scheduler. The resulting dictionary is fed to the carbon returner, which is configured to talk to a carbon server somewhere nearby.

    # salt demohost elasticsearch.stats
    [snip]
      { "index_time_in_millis" : 30,
    [snip]
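
    The carbon returner takes care of turning that nested dictionary into metrics. As a rough illustration of what this involves (this is not the returner's actual code), the sketch below flattens a nested dictionary into lines of Carbon's plaintext protocol, "<metric path> <value> <timestamp>". The function names, the demohost.elasticsearch prefix and the sample dictionary are made up for the example.

```python
import time


def flatten(stats, prefix):
    """Yield (dotted.path, value) pairs for every numeric leaf of a nested dict."""
    for key, value in sorted(stats.items()):
        path = "{}.{}".format(prefix, key)
        if isinstance(value, dict):
            for item in flatten(value, path):
                yield item
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            yield path, value


def to_carbon_lines(stats, prefix, timestamp=None):
    """Format metrics as Carbon plaintext lines: 'path value timestamp'."""
    timestamp = int(time.time() if timestamp is None else timestamp)
    return ["{} {} {}".format(path, value, timestamp)
            for path, value in flatten(stats, prefix)]


# Made-up sample, shaped like a small fragment of a stats response.
sample = {"indexing": {"index_total": 1200, "index_time_in_millis": 30}}
lines = to_carbon_lines(sample, "demohost.elasticsearch", timestamp=1474020000)
```

    Non-numeric leaves (strings, booleans) are skipped, since Carbon only stores numbers.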
    

    Monitoring web metrics

    To evaluate parts of the performance of a web page we can look at some metrics such as the number of assets the browser will need to download, the size of the assets (js, css, images, etc) and even things such as the number of subdomains used to deliver assets. You can take a look at such metrics in most developer tools available in the browser, but we want to graph this over time. A nice tool for this is sitespeed.io (written in javascript with phantomjs). Out of the box, it has a graphite outputter so we just have to add --graphiteHost FQDN. sitespeed.io even recommends using grafana to visualize the results and publishes some example dashboards that can be adapted to your needs.

    https://www.cubicweb.org/file/15338109/raw/sitespeed-logo-2c.png

    The sitespeed.io command is configured and run by salt using pillars and its scheduler.

    We will have to take a look at using their jenkins plugin with our jenkins continuous integration instance.

    Monitoring crashes / errors / bugs

    Applications will have bugs (in particular when released often to get a client to validate some design choices early). Level 0 is having your client call you up to say the application has crashed. The next level is watching some log somewhere to see those errors pop up. The level after that is centralised logs on which you can monitor the numerous pieces of your application (rsyslog over UDP helps here, and graylog might be a good solution for visualisation).

    https://www.cubicweb.org/file/15338139/raw/Screenshot_2016-09-16_11-30-53.png

    It starts getting useful and usable when your bugs get reported with some rich context. That's where sentry comes in. It's free software developed on github (although the website does not really show that) and it is written in python, so it was a good match for our culture. And it is pretty awesome too.

    We plug sentry into our WSGI pipeline (thanks to cubicweb-pyramid) by installing and configuring the sentry cube: cubicweb-sentry. This will catch bugs with rich context and provide us with vital information about what the user was doing when the crash occurred.

    This also helps share bug information within a team.

    The sentry cube reports on errors being raised when using the web application, but can also catch some errors when running some maintenance or import commands (ccplugins in CubicWeb). In this particular case, a lot of importing is being done and Sentry can detect and help us triage the import errors with context on which files are failing.
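
    To give an idea of what plugging an error reporter into a WSGI pipeline means, here is a generic sketch of a middleware that captures unhandled exceptions together with some request context. It is neither the Sentry client nor the cubicweb-sentry code: the class name and the report callable are hypothetical.

```python
import traceback


class ErrorReportingMiddleware(object):
    """WSGI middleware reporting exceptions raised while the wrapped app is called."""

    def __init__(self, app, report):
        self.app = app
        self.report = report  # hypothetical callable receiving a dict describing the crash

    def __call__(self, environ, start_response):
        try:
            return self.app(environ, start_response)
        except Exception as exc:
            self.report({
                "error": repr(exc),
                "traceback": traceback.format_exc(),
                # Request context: what the user was doing when it crashed.
                "method": environ.get("REQUEST_METHOD"),
                "path": environ.get("PATH_INFO"),
                "user": environ.get("REMOTE_USER"),
            })
            raise  # let the server produce its usual error response
```

    Wrapping an application is then a one-liner, application = ErrorReportingMiddleware(application, send_to_tracker), where send_to_tracker stands for whatever function ships the report to the error tracker.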

    Monitoring usage / client side

    This part is a bit neglected for the moment. Client side, we can use Javascript to monitor usage. Some basic metrics can come from piwik, which is usually used for audience statistics. To get more precise statistics, we've been told Boomerang has an interesting approach, enabling a closer look at how fast a page was displayed client side, how much time was spent on DNS, etc.

    On the client side, we're also looking at two features of the Sentry project : the raven-js client which reports Javascript errors directly from the browser to the Sentry server, and the user feedback form which captures some context when something goes wrong or a user/client wants to report that something should be changed on a given page.

    Load testing - coverage

    To wrap up, we also often generate traffic to catch some bugs and performance metrics automatically :

    • wget --mirror $URL
    • linkchecker $URL
    • for search_term in $(cat corpus); do wget $URL/$search_term ; done
    • wapiti $URL --scope page
    • nikto $URL

    Then watch the graphs and the errors in Sentry... Fix them. Restart.

    Graphing it in Grafana

    We've spent little time on the dashboard yet since we're concentrating on collecting the metrics for now. But here is a glimpse of the "work in progress" dashboard, which combines various data sources and various metrics on the same screen and the same time scale.

    https://www.cubicweb.org/file/15338648/raw/Screenshot_2016-09-13_09-41-45.png

    Further plans

    • internal health checks: we're taking a look at python-hospital, at the talk "healthz: Stop reverse engineering applications and start monitoring from the inside" (Monitorama), and at pyramid_health. The idea is to distinguish between "the app is running" and "the app is serving its purpose".
    • graph the number of Sentry errors and the number of types of errors: the sentry API should be able to give us this information. Feed it to Salt and Carbon.
    • set up some alerting: next versions of Grafana will be doing that, or with elastalert
    • set up "release version X" events in Graphite that are displayed in Grafana, maybe with some manual command or a postcreate command when using docker-compose up?
    • make it easier for devs to have this kind of setup. Using this suite of tools in development might sometimes be overkill, but can be useful.
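
    For the "release version X" events, Graphite's web application accepts events as JSON posted to its /events/ endpoint. The following sketch only builds such a request without sending it. The host graphite.example.org, the function name and the wording of the event are placeholders, and depending on the Graphite version the tags may need to be a single string rather than a list.

```python
import json
import time
import urllib.request


def release_event_request(graphite_url, version, when=None):
    """Build (without sending) the POST request recording a release event."""
    payload = {
        "what": "release version {}".format(version),
        "tags": ["release"],
        "when": int(time.time() if when is None else when),
        "data": "deployed version {}".format(version),
    }
    return urllib.request.Request(
        graphite_url.rstrip("/") + "/events/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = release_event_request("http://graphite.example.org", "1.2.0", when=1474020000)
# urllib.request.urlopen(request) would actually send the event.
```

    Such a call could be run by hand or from a deployment hook, and the events can then be displayed in Grafana as annotations.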

  • Status of the CubicWeb python3 porting effort, February 2016

    2016/02/05 by Julien Cristau

    An effort to port CubicWeb to a dual python 2.6/2.7 and 3.3+ code base was started by Rémi Cardona in summer of 2014. The first task was to port all of CubicWeb's dependencies:

    • logilab-common 0.63
    • logilab-database 1.14
    • logilab-mtconverter 0.9
    • logilab-constraint 0.6
    • yams 0.40
    • rql 0.34

    Once that was out of the way, we could start looking at CubicWeb itself. We first set out to make sure we used python3-compatible syntax in all source files, then started to go and make as much of the test suite as possible pass under both python2.7 and python3.4. As of the 3.22 release, we are almost there. The remaining pain points are:

    • cubicweb's setup.py hadn't been converted. This is fixed in the 3.23 branch as of https://hg.logilab.org/master/cubicweb/rev/0b59724cb3f2 (don't follow that link, the commit is huge)
    • the CubicWebServerTC test class uses twisted to start an http server thread, and twisted itself is not available for python3
    • the current method to serialize schema constraints into CWConstraint objects gives different results on python2 and python3, so it needs to be fixed (https://www.logilab.org/ticket/296748)
    • various questions around packaging and deployment: what happens to e.g. the cubicweb-common package installing into python2's site-packages directory? What does the ${prefix}/share/cubicweb directory become? How do cubes express their dependencies? Do we need a flag day? What does that mean for applications?

  • Using JSONAPI as a Web API format for CubicWeb

    2016/01/26 by Denis Laxalde

    Following the introduction post about rethinking the web user interface of CubicWeb, this article addresses the topic of the Web API used to exchange data between the client and the server. As mentioned earlier, this question is rather central and deserves particular attention, and better early than late. Of the two candidate representations previously identified, Hydra and JSON API, this article focuses on the latter. Hopefully, this will give better insight into the capabilities and limits of this specification and help make a decision, though a similar experiment with another candidate would be good to have. Still in the process of blog driven development, this post has several open questions from which a discussion will hopefully emerge...

    A glance at JSON API

    JSON API is a specification for building APIs that use JSON as a data exchange format between clients and a server. The media type is application/vnd.api+json. Version 1.0 has been available since mid-2015. The format has interesting features such as the ability to build compound documents (i.e. responses made of several, usually related, resources) or to specify filtering, sorting and pagination.

    A document following the JSON API format basically represents resource objects, their attributes and relationships as well as some links also related to the data of primary concern.

    Taking the example of a Ticket resource modeled after the tracker cube, we could have a JSON API document formatted as:

    GET /ticket/987654
    Accept: application/vnd.api+json
    
    {
      "links": {
        "self": "https://www.cubicweb.org/ticket/987654"
      },
      "data": {
        "type": "ticket",
        "id": "987654",
        "attributes": {
          "title": "Let's use JSON API in CubicWeb",
          "description": "Well, let's try, at least..."
        },
        "relationships": {
          "concerns": {
            "links": {
              "self": "https://www.cubicweb.org/ticket/987654/relationships/concerns",
              "related": "https://www.cubicweb.org/ticket/987654/concerns"
            },
            "data": {"type": "project", "id": "1095"}
          },
          "done_in": {
            "links": {
              "self": "https://www.cubicweb.org/ticket/987654/relationships/done_in",
              "related": "https://www.cubicweb.org/ticket/987654/done_in"
            },
            "data": {"type": "version", "id": "998877"}
          }
        }
      },
      "included": [{
        "type": "project",
        "id": "1095",
        "attributes": {
            "name": "CubicWeb"
        },
        "links": {
          "self": "https://www.cubicweb.org/project/cubicweb"
        }
      }]
    }
    

    In this JSON API document, top-level members are links, data and included. The latter is used here to ship some resources (here a "project") related to the "primary data" (a "ticket") through the "concerns" relationship, as denoted in the relationships object (more on this later).

    While the decision of including or not these related resources along with the primary data is left to the API designer, JSON API also offers a specification to build queries for inclusion of related resources. For example:

    GET /ticket/987654?include=done_in
    Accept: application/vnd.api+json
    

    would lead to a response including the full version resource along with the above content.
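
    On the client side, consuming such a compound document mostly amounts to matching the resource identifiers found in relationships against the resources shipped in included. A minimal sketch, assuming the document has already been parsed into Python dictionaries (the helper names are made up for the example):

```python
def index_included(document):
    """Map (type, id) to each resource object shipped in 'included'."""
    return {(res["type"], res["id"]): res
            for res in document.get("included", [])}


def related_resource(document, relation):
    """Return the included resource targeted by a to-one relationship, or None."""
    identifier = document["data"]["relationships"][relation]["data"]
    return index_included(document).get((identifier["type"], identifier["id"]))


# Trimmed-down version of the ticket document shown earlier.
ticket = {
    "data": {
        "type": "ticket",
        "id": "987654",
        "relationships": {
            "concerns": {"data": {"type": "project", "id": "1095"}},
            "done_in": {"data": {"type": "version", "id": "998877"}},
        },
    },
    "included": [
        {"type": "project", "id": "1095", "attributes": {"name": "CubicWeb"}},
    ],
}

project = related_resource(ticket, "concerns")  # the full project resource
version = related_resource(ticket, "done_in")   # None: the version was not included
```

    With ?include=done_in, the version resource would also be present in included and the second lookup would succeed.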

    Enough for the JSON API overview. Next I'll present how various aspects of data fetching and modification can be achieved through the use of JSON API in the context of a CubicWeb application.

    CRUD

    CRUD of resources is handled in a fairly standard way in JSON API, relying on HTTP protocol semantics.

    For instance, creating a ticket could be done as:

    POST /ticket
    Content-Type: application/vnd.api+json
    Accept: application/vnd.api+json
    
    {
      "data": {
        "type": "ticket",
        "attributes": {
          "title": "Let's use JSON API in CubicWeb",
          "description": "Well, let's try, at least..."
        },
        "relationships": {
          "concerns": {
            "data": { "type": "project", "id": "1095" }
          }
        }
      }
    }
    

    Then updating it (assuming we got its id from a response to the above request):

    PATCH /ticket/987654
    Content-Type: application/vnd.api+json
    Accept: application/vnd.api+json
    
    {
      "data": {
        "type": "ticket",
        "id": "987654",
        "attributes": {
          "description": "We'll succeed, for sure!"
        }
      }
    }
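
    A client would typically build these documents programmatically rather than by hand. The sketch below describes the two requests as plain dictionaries. The function names and the shape of the returned dictionary are assumptions made for the example, not part of any CubicWeb or JSON API library.

```python
import json

MEDIA_TYPE = "application/vnd.api+json"
HEADERS = {"Content-Type": MEDIA_TYPE, "Accept": MEDIA_TYPE}


def create_request(rtype, attributes, relationships=None):
    """Describe a POST creating a resource (no id: the server assigns it)."""
    data = {"type": rtype, "attributes": attributes}
    if relationships:
        # relationships maps a relation name to a (target type, target id) pair
        data["relationships"] = {
            name: {"data": {"type": ttype, "id": tid}}
            for name, (ttype, tid) in relationships.items()
        }
    return {"method": "POST", "path": "/" + rtype,
            "headers": HEADERS, "body": json.dumps({"data": data})}


def update_request(rtype, rid, attributes):
    """Describe a PATCH carrying only the attributes that change."""
    data = {"type": rtype, "id": rid, "attributes": attributes}
    return {"method": "PATCH", "path": "/{}/{}".format(rtype, rid),
            "headers": HEADERS, "body": json.dumps({"data": data})}


creation = create_request(
    "ticket",
    {"title": "Let's use JSON API in CubicWeb"},
    relationships={"concerns": ("project", "1095")},
)
update = update_request("ticket", "987654", {"description": "We'll succeed, for sure!"})
```

    Serializing with json.dumps also guarantees that the request body is syntactically valid JSON.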
    

    Relationships

    In JSON API, a relationship is in fact a first class resource as it is defined by a noun and a URI through a link object. In this respect, the client just receives a couple of links and can then operate on them using the proper HTTP verb. Fetching or updating relationships is done using the special <resource url>/relationships/<relation type> endpoint (the self member of relationships items in the first example). Quite naturally, the specification relies on the GET verb for fetching targets, PATCH for (re)setting a relation (i.e. replacing its targets), POST for adding targets and DELETE for dropping them.

    GET /ticket/987654/relationships/concerns
    Accept: application/vnd.api+json
    
    {
      "data": {
        "type": "project",
        "id": "1095"
      }
    }
    
    PATCH /ticket/987654/relationships/done_in
    Content-Type: application/vnd.api+json
    Accept: application/vnd.api+json
    
    {
      "data": {
        "type": "version",
        "id": "998877"
      }
    }
    

    The request and response bodies of this <resource url>/relationships/<relation type> endpoint consist of so-called resource identifier objects, which are lightweight representations of resources usually containing only their "type" and "id" (enough to uniquely identify them).
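
    The mapping between operations and verbs is simple enough to be captured in a few lines. In the sketch below, the operation names (fetch, replace, add, remove) and the helper functions are invented for the illustration; only the verbs, the endpoint layout and the shape of resource identifier objects come from the specification as described above.

```python
# HTTP verb used for each operation on a relationship endpoint.
VERBS = {"fetch": "GET", "replace": "PATCH", "add": "POST", "remove": "DELETE"}


def identifier(rtype, rid):
    """Build a resource identifier object."""
    return {"type": rtype, "id": rid}


def relationship_request(resource_url, relation, operation, data=None):
    """Return (method, url, body) for an operation on a relationship.

    data is a single identifier for a to-one relationship, or a list of
    identifiers for a to-many relationship; it is ignored when fetching.
    """
    method = VERBS[operation]
    url = "{}/relationships/{}".format(resource_url.rstrip("/"), relation)
    body = None if method == "GET" else {"data": data}
    return method, url, body


fetch = relationship_request("/ticket/987654", "concerns", "fetch")
replace = relationship_request("/ticket/987654", "done_in", "replace",
                               identifier("version", "998877"))
```

    Adding or removing targets only makes sense for to-many relationships, in which case data would be a list of identifiers.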

    Related resources

    Remember the related member appearing in relationships links in the first example?

      [ ... ]
      "done_in": {
        "links": {
          "self": "https://www.cubicweb.org/ticket/987654/relationships/done_in",
          "related": "https://www.cubicweb.org/ticket/987654/done_in"
        },
        "data": {"type": "version", "id": "998877"}
      }
      [ ... ]
    

    While this is not a mandatory part of the specification, it has an interesting usage for fetching relationship targets. In contrast with the .../relationships/... endpoint, this one is expected to return plain resource objects (with their attributes and relationships information in particular).

    GET /ticket/987654/done_in
    Accept: application/vnd.api+json
    
    {
      "links": {
        "self": "https://www.cubicweb.org/998877"
      },
      "data": {
        "type": "version",
        "id": "998877",
        "attributes": {
            "number": 4.2
        },
        "relationships": {
          "version_of": {
            "links": {
              "self": "https://www.cubicweb.org/998877/relationships/version_of"
            },
            "data": { "type": "project", "id": "1095" }
          }
        }
      },
      "included": [{
        "type": "project",
        "id": "1095",
        "attributes": {
            "name": "CubicWeb"
        },
        "links": {
          "self": "https://www.cubicweb.org/project/cubicweb"
        }
      }]
    }
    

    Meta information

    The JSON API specification allows including non-standard information using a so-called meta object. This can be found in various places of the document (top-level, resource objects or relationships object). Usage of this field is completely free (and optional). For instance, we could use this field to store the workflow state of a ticket:

    {
      "data": {
        "type": "ticket",
        "id": "987654",
        "attributes": {
          "title": "Let's use JSON API in CubicWeb"
        },
        "meta": { "state": "open" }
      }
    }
    

    Permissions

    Permissions are part of the metadata to be exchanged during request/response cycles. As such, the best place to convey this information is probably within the headers. According to JSON API's FAQ, this is also the recommended way for a resource to advertise supported actions.

    So for instance, the response to a GET request could include an Allow header, indicating which request methods are allowed on the primary resource requested:

    GET /ticket/987654
    Allow: GET, PATCH, DELETE
    

    A HEAD request could also be used for querying allowed actions on links (such as relationships):

    HEAD /ticket/987654/relationships/comments
    Allow: POST
    

    This approach has the advantage of being standard HTTP: no particular knowledge of the permissions model is required, and the response body is not cluttered with this metadata.
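
    On the client, exploiting the Allow header could look like the following sketch. The mapping from user interface actions to HTTP verbs (ACTION_VERBS) is purely hypothetical, as are the function names.

```python
def allowed_methods(headers):
    """Return the set of HTTP methods listed in an Allow header, if any."""
    for name, value in headers.items():
        if name.lower() == "allow":  # header names are case-insensitive
            return {m.strip().upper() for m in value.split(",") if m.strip()}
    return set()


# Hypothetical mapping from user interface actions to the verb they need.
ACTION_VERBS = {"view": "GET", "edit": "PATCH", "delete": "DELETE", "comment": "POST"}


def available_actions(headers):
    """List the UI actions whose verb is allowed on the resource."""
    allowed = allowed_methods(headers)
    return sorted(action for action, verb in ACTION_VERBS.items() if verb in allowed)


actions = available_actions({"Allow": "GET, PATCH, DELETE"})
```

    Here actions contains delete, edit and view, so the client would for instance hide its "comment" button for this resource.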

    Another possibility would be to use the meta member of JSON API data.

    {
      "data": {
        "type": "ticket",
        "id": "987654",
        "attributes": {
          "title": "Let's use JSON API in CubicWeb"
        },
        "meta": {
          "permissions": ["read", "update"]
        }
      }
    }
    

    Clearly, this would minimize the number of client/server requests.

    More Hypermedia controls

    With the example implementation described above, it appears already possible to manipulate several aspects of the entity-relationship database following a CubicWeb schema: resources fetching, CRUD operations on entities, set/delete operations on relationships. All these "standard" operations are discoverable by the client simply because they are baked into the JSON API format: for instance, adding a target to some relationship is possible by POSTing to the corresponding relationship resource something that conforms to the schema.

    So, implicitly, this already gives us a fairly good level of Hypermedia control, so that we're not so far from having a mature REST architecture according to the Richardson Maturity Model. But beyond these "standard" discoverable actions, the JSON API specification does not yet address Hypermedia controls in a generic manner (see this interesting discussion about extending the specification for this purpose).

    So the question is: would we want more? Or, in other words, do we need to define "actions" which would not map directly to a concept in the application model?

    In the case of a CubicWeb application, the most obvious example (that I could think of) of where such an "action" would be needed is workflow state handling. Roughly, workflows in CubicWeb are modeled through two entity types, State and TrInfo (for "transition information"), the former being handled through the latter, and a relationship in_state between the workflowable entity type at stake and its current State. It is not clear how one would model this in terms of HTTP resources. (Arguably we wouldn't want to expose the complexity of the Workflow/TrInfo/State data model to the client, nor can we simply expose this in_state relationship, as a client would not be able to change the state of an entity just by updating the relation.) So what would be a custom "action" to handle the state of a workflowable resource? Back in our tracker example, how would we advertise to the client the possibility to perform "open"/"close"/"reject" actions on a ticket resource? Open question...

    Request for comments

    In this post, I tried to give an overview of a possible usage of JSON API to build a Web API for CubicWeb. Several aspects were discussed from simple CRUD operations, to relationships handling or non-standard actions. In many cases, there are open questions for which I'd love to receive feedback from the community. Recalling that this topic is a central part of the experiment towards building a client-side user interface to CubicWeb, the more discussion it gets, the better!

    For those wanting to try out the experiments themselves, have a look at the code. This is a work-in-progress/experimental implementation, relying on Pyramid for content negotiation and route traversals.

    What's next? Maybe an alternative experiment relying on Hydra? Or an orthogonal one playing with the schema client-side?


  • Happy New Year CubicWeb !

    2016/01/25 by Nicolas Chauvat

    This CubicWeb blog has been asleep for some months, while development remained active. Let me try to summarize the recent progress.

    https://upload.wikimedia.org/wikipedia/commons/thumb/f/f1/New_Year_Ornaments_%282%29.JPG/320px-New_Year_Ornaments_%282%29.JPG

    CubicWeb 3.21

    CubicWeb 3.21 was published in July 2015. The announcement was sent to the mailing list and changes were listed in the documentation.

    The main goal of this release was to reduce the technical debt. The code was improved, but the changes were not directly visible to users.

    CubicWeb 3.22

    CubicWeb 3.22 was published in January 2016. A mail was sent to the mailing list and the documentation was updated with the list of changes.

    The main achievements of this release were the inclusion of a new procedure to massively import data when using a Postgresql backend, improvements of migrations and customization of generic JSON exports.

    Roadmap and bi-monthly meetings

    After the last-minute cancellation of the May 2015 roadmap meeting, we failed to reschedule in June, the summer arrived, then the busy-busy end of the year... and voilà, we are in 2016.

    During that time, Logilab has been working on massive data import, full-js user interfaces exchanging JSON with the CubicWeb back-end, 3D in the browser, switching CubicWeb to Python3, moving its own apps to Bootstrap, using CubicWeb-Pyramid in production and improving management/supervision, etc. We will be more than happy to discuss this with the rest of the (small but strong) CubicWeb community.

    So let's wish a happy new year to everyone and meet again in March for a new roadmap session !


  • Towards building a JavaScript user interface to CubicWeb

    2016/01/08 by Denis Laxalde

    This post is an introduction to a series of articles dealing with an on-going experiment on building a JavaScript user interface to CubicWeb, to ultimately replace the web component of the framework. The idea of this series is to present the main topics of the experiment, with open questions, in order to engage the community as much as possible. The other side of this is to experiment with a blog driven development process, so getting feedback is the very point of it!

    As of today, three main topics have been identified:

    • the Web API to let the client and server communicate,
    • the issue of representing the application schema client-side, and,
    • the construction of components of the web interface (client-side).

    As part of the first topic, we'll probably rely on another experimental work about REST-fulness undertaken recently in pyramid-cubicweb (see this head for source code). Then, it appears quite clearly that we'll need sooner or later a representation of data on the client-side and that, quite obviously, the underlying format would be JSON. Apart from exchanging entity (database) information, we already anticipate the need for the HATEOAS part of REST. We already took some time to look at the existing possibilities. At first glance, it seems that hydra is the most promising in terms of capabilities. It's also built using semantic web technologies, which definitely earns bonus points for CubicWeb. On the other hand, it seems a bit isolated and very experimental, while JSON API follows a more pragmatic approach (it describes itself as an anti-bikeshedding tool) and appears to have more traction from various people. For this reason, we chose it for our first draft, but this topic seems so central in a new UI, and so hard to hide as an implementation detail, that it definitely deserves more discussion. Other candidates could be Siren, HAL or Uber.

    Concerning the schema, it seems that there is consensus around JSON-Schema so we'll certainly give it a try.

    Finally, while there is nothing certain as of today, we'll probably start building components of the web interface using React, which is also getting quite popular these days. Beyond that choice, the first practical task in this topic will concern the primary view system. This task, being neither too simple nor too complicated, will hopefully result in a clearer overview of what the project will imply. Then, the question of editing will come up at some point. In this respect, perhaps it'll be a good time to put the UX question at a central place, in order to avoid design issues that we had in the past.

    Feedback welcome!


  • Serving Cubicweb via WSGI with Pyramid: comparing the options

    2015/04/21 by David Douard

    CubicWeb can now be powered by Pyramid (thank you so much Christophe) instead of Twisted.

    I aim at moving all our applications to CubicWeb/Pyramid, so I wonder what will be the best way to deliver them. For now, we have a setup made of Apache + Varnish + Cubicweb/Twisted. In some applications we have two CubicWeb instances with naive load balancing managed by Varnish.

    When moving to cubicweb-pyramid, there are several options. By default, a cubicweb-pyramid instance started via the cubicweb-ctl pyramid command, is running a waitress wsgi http server. I read it is common to deliver wsgi applications with nginx + uwsgi, but I wanted to play with mongrel2 (that I already tested with Cubicweb a while ago), and give a try to the circus + chaussette stack.

    I ran my tests:

    • using ab, the simple Apache benchmark tool (aka ApacheBench);
    • on a clone of our logilab.org forge;
    • on my laptop (Intel Core i7, 2.67GHz, quad core, 8 GB);
    • using a postgresql 9.1 database server.

    Setup

    In order to be able to start the application as a wsgi app, a small python script is required. I extracted a small part of the cubicweb-pyramid ccplugin.py file into an elo.py file for this:

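    # Assumed import paths; the second one depends on the cubicweb-pyramid version
    from cubicweb.cwconfig import CubicWebConfiguration as cwcfg
    from pyramid_cubicweb import wsgi_application_from_cwconfig
    
    appid = 'elo2'
    
    cwconfig = cwcfg.config_for(appid)
    # WSGI callable loaded by the servers below (e.g. as elo.application)
    application = wsgi_application_from_cwconfig(cwconfig)
    repo = cwconfig.repository()
    repo.start_looping_tasks()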
    

    I tested 5 configurations: twisted, pyramid, mongrel2+wsgid, uwsgi and circus+chaussette. When possible, they were tested with 1 worker and 4 workers.

    Legacy Twisted mode

    Using good old legacy twisted setup:

    cubicweb-ctl start -D -l info elo
    

    The config settings worth noting are:

    webserver-threadpool-size=6
    connections-pool-size=6
    

    Basic Pyramid mode

    Using the pyramid command that uses waitress:

    cubicweb-ctl pyramid --no-daemon -l info elo
    

    Mongrel2 + wsgid

    I have not been able to use uwsgi-mongrel2 as wsgi backend for mongrel2, since this uwsgi plugin is not provided by the uwsgi debian packages. I've used wsgid instead (sadly, the project appears to be dead).

    The mongrel config is:

    main = Server(
       uuid="f400bf85-4538-4f7a-8908-67e313d515c2",
       access_log="/logs/access.log",
       error_log="/logs/error.log",
       chroot="./",
       default_host="localhost",
       name="test",
       pid_file="/pid/mongrel2.pid",
       bind_addr="0.0.0.0",
       port=8083,
       hosts = [
           Host(name="localhost",
                routes={'/': Handler(send_spec='tcp://127.0.0.1:5000',
                                     send_ident='2113523d-f5ff-4571-b8da-8bddd3587475',
                                     recv_spec='tcp://127.0.0.1:5001',
                                     recv_ident='')
                       })
               ]
       )
    
    servers = [main]
    

    and the wsgid server is started with:

    wsgid --recv tcp://127.0.0.1:5000 --send tcp://127.0.0.1:5001 --keep-alive \
    --workers <N> --wsgi-app elo.application --app-path .
    

    uwsgi

    The config file used to start uwsgi is:

    [uwsgi]
    stats = 127.0.0.1:9191
    processes = <N>
    wsgi-file = elo.py
    http = :8085
    plugin = http,python
    virtualenv = /home/david/hg/grshells/venv/jpl
    enable-threads = true
    lazy-apps = true
    

    The tricky config option there is lazy-apps, which must be set; otherwise the worker processes are forked after loading the cubicweb application, which the latter does not support. If you omit it, only one worker will get the requests.

    circus + chaussette

    For the circus setup, I have used this configuration file:

    [circus]
    check_delay = 5
    endpoint = tcp://127.0.0.1:5555
    pubsub_endpoint = tcp://127.0.0.1:5556
    stats_endpoint = tcp://127.0.0.1:5557
    statsd = True
    httpd = True
    httpd_host = localhost
    httpd_port = 8086
    
    [watcher:webworker]
    cmd = /home/david/hg/grshells/venv/jpl/bin/chaussette --fd $(circus.sockets.webapp) elo2.app
    use_sockets = True
    numprocesses = 4
    
    [env:webworker]
    PATH=/home/david/hg/grshells/venv/jpl/bin:/usr/local/bin:/usr/bin:/bin
    CW_INSTANCES_DIR=/home/david/hg/grshells/grshell-jpl/etc
    PYTHONPATH=/home/david/hg/grshells//grshell-jpl
    
    [socket:webapp]
    host = 127.0.0.1
    port = 8085
    

    Results

    The benchmarks are very simple: 100 requests from 1 ab worker or 500 requests from 5 concurrent ab workers, getting the main index page of the application:

    One ab worker

    ab -n 100 -c 1 http://127.0.0.1:8085/
    

    We get:

    Synthesis (1 client)

    Response times are:

    Response time (1 client)

    Five ab workers

    ab -n 500 -c 5 http://127.0.0.1:8085/
    

    We get:

    Synthesis (5 clients)

    Response times are:

    Response time (5 clients)
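
    For readers without ab at hand, the same kind of measurement can be approximated in Python. This is only a minimal sketch, not the tool used for the figures above: the names fetch, run_bench and summarize are invented for the illustration, and threads stand in for ab's concurrent workers.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url):
    """Fetch one page and return the elapsed time in seconds."""
    start = time.perf_counter()
    with urlopen(url) as response:
        response.read()
    return time.perf_counter() - start


def run_bench(url, total, concurrency, fetch=fetch):
    """Issue `total` requests using `concurrency` client threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(fetch, [url] * total))


def summarize(timings):
    """Return (mean, max) of the collected response times, in seconds."""
    return sum(timings) / len(timings), max(timings)
```

    Calling summarize(run_bench("http://127.0.0.1:8085/", 500, 5)) roughly corresponds to the second ab command above.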

    Conclusion

    As expected, the legacy (and still default) twisted-based server is the least efficient method to serve a cubicweb application.

    When comparing results with only one CubicWeb worker, the pyramid+waitress solution that comes with cubicweb-pyramid is the most efficient, but the mongrel2 + wsgid and circus + chaussette solutions perform similarly when only one worker is activated. Surprisingly, the uwsgi solution is significantly less efficient; in particular, some of its requests take significantly longer than with the other solutions (even the legacy twisted-based server).

    The price for activating several workers is small (around 3%) but significant when only one client is requesting the application. It is still unclear why.

    When there are several workers requesting the application, it's no surprise that solutions with 4 workers behave significantly better (we are still far from a linear response, however: roughly 2x better for 4x the horsepower; maybe the hardware is the main reason for this unexpected non-linear response).
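
    To put a number on "2x better for 4x the horsepower", here is a small worked calculation. The efficiency figure is plain arithmetic; the second function assumes Amdahl's law applies, which is an assumption of this sketch and not something the measurements establish.

```python
def parallel_efficiency(speedup, workers):
    """Fraction of the ideal linear speedup actually obtained."""
    return speedup / workers


def amdahl_parallel_fraction(speedup, workers):
    """Parallel fraction p implied by Amdahl's law.

    Solves S = 1 / ((1 - p) + p / N) for p, given speedup S and N workers.
    """
    return (1 - 1 / speedup) / (1 - 1 / workers)


efficiency = parallel_efficiency(2, 4)       # half of the ideal speedup
fraction = amdahl_parallel_fraction(2, 4)    # about two thirds
```

    Read this way, only about two thirds of the request handling would scale with the number of workers, if Amdahl's law were the right model here.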

    I am quite surprised that uwsgi behaved significantly worse than the 2 other scalable solutions.

    Mongrel2 is still very efficient, but sadly the wsgid server I've used for these tests has not been developed for 2 years, and the uwsgi plugin for mongrel2 is not yet available on Debian.

    On the other hand, I am very pleasantly surprised by circus + chaussette. Circus also comes with some nice features, like a web dashboard which allows adding or removing workers dynamically:

    //www.cubicweb.org/file/5272071/raw //www.cubicweb.org/file/5272077/raw

  • CubicWeb Roadmap meeting on March 5th 2015

    2015/03/11 by David Douard

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. The previous roadmap meeting was in January 2015.

    Christophe de Vienne (Unlish) and Aurélien Campéas (self-employed) joined us.

    Christophe de Vienne asked for discussions on:

    • Security Context: settle on an approach, and make it happen.
    • Pyramid Cubicweb adoption: where are we? what authentication stack do we want by default?
    • Package layout (aka "develop mode" friendliness): let's get real
    • Documentation: is the restructuring attempt (https://www.cubicweb.org/ticket/4832808) a credible path for the documentation?

    Aurélien Campéas asked for discussions on:

    • status of integration in the 3.21 branch
    • a new API for cubicweb stores

    Sylvain Thénault asked for discussions on:

    • a new API for dataimport (including cubicweb stores, but not only),
    • new integrators on CW

    Versions

    Cubicweb

    Version 3.18

    This version is stable but old and maintained (current is 3.18.8).

    Version 3.19

    This version is stable and maintained (current is 3.19.9).

    Version 3.20

    This version is now stable and maintained (current is 3.20.4).

    Version 3.21

    See below

    Agenda

    The next roadmap meeting will be held at the beginning of May 2015 at Logilab. Interested parties are invited to get in touch.

    Open Discussions

    New integrators

    Rémi Cardona (rcardona) and Denis Laxalde (dlaxalde) now have the publish access level on Cubicweb repositories.

    Security context

    Christophe presented his proposal for a "security context" in Cubicweb, as described in https://lists.cubicweb.org/pipermail/cubicweb/2015-February/002278.html and https://lists.cubicweb.org/pipermail/cubicweb/2015-February/002297.html, with a proposed implementation (see https://www.cubicweb.org/ticket/4919855).

    The idea has been validated, based on substitution variables whose names will start with "ctx:" (the RQL grammar will have to be modified to accept a ":").

    This will then make it possible to write RQL queries like (API still to be tuned):

    X owned_by U, U eid %(ctx:cwuser_eid)s
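
    Since the API is still to be tuned, the following is only an illustration of the idea of reserved "ctx:" names, not the agreed design. The helper merge_query_args and its arguments are hypothetical: it refuses caller-supplied "ctx:" keys and adds the ones provided by the security context.

```python
CTX_PREFIX = "ctx:"  # prefix validated at the meeting


def merge_query_args(user_args, security_context):
    """Combine caller-supplied substitutions with context-provided ones.

    Hypothetical helper: callers may not set "ctx:" keys themselves.
    """
    for key in user_args:
        if key.startswith(CTX_PREFIX):
            raise ValueError("%r is reserved for the security context" % key)
    merged = dict(user_args)
    for name, value in security_context.items():
        merged[CTX_PREFIX + name] = value
    return merged


args = merge_query_args({"title": "x"}, {"cwuser_eid": 42})
```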
    

    Pyramid

    The pyramid-based web server proposed by Christophe and used for his unlish website is still under test and evaluation at Logilab. There are missing features (implemented in cubes) required to be able to deploy pyramid-cubicweb for most of the applications used at Logilab, especially cubicweb-signedrequest.

    In order to make it possible to implement authentication cubes like cubicweb-signedrequest, pyramid-cubicweb requires some modifications. These have been developed and are about to be published, along with a new version of signedrequest that provides pyramid compatibility.

    There are still some dependencies that lack a proper Debian package, but that should be done in the next few weeks.

    In order to properly identify pyramid-related code in a cube, it has been proposed that this code should go in modules of the cube named pviews and pconfig (note that most cubes won't require any pyramid-specific code). The includeme function should however be in the cube's main package (in the __init__.py file).

    There have been some discussions about the fact that, for now, a pyramid-cubicweb instance requires an anonymous user/access, which can also be a problem for some applications.
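
    As a minimal sketch of that proposed layout, the __init__.py of a cube could look like the following. The cube name cubes.mycube is hypothetical, and the sketch assumes that pconfig defines its own includeme function and that pviews contains decorated view callables; only the include and scan methods of Pyramid's Configurator are relied upon.

```python
def includeme(config):
    """Entry point called by Pyramid through config.include('cubes.mycube')."""
    # Pyramid-specific configuration of the cube (hypothetical module,
    # expected to define its own includeme function).
    config.include('cubes.mycube.pconfig')
    # Register the view callables declared in the pviews module.
    config.scan('cubes.mycube.pviews')
```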

    Layout

    Christophe pointed out that the directory/file layout of cubicweb and cubes does not follow Python's current de facto standards, which makes cubicweb hard to use in the context of a virtualenv/pip-based installation. CWEP004 discusses some aspects of this problem.

    The decision has been taken to move toward a Cubicweb ecosystem that is more pip-friendly. This will be done step by step, starting with the dependencies (packages currently living in the logilab "namespace").

    Then we will investigate the feasibility of migrating the layout of Cubicweb itself.

    Documentation

    The new documentation structure has been approved.

    It has been proposed (and more or less accepted) to extract the documentation into a dedicated project. This is not a priority, however.

    Roadmap for 3.21

    No change since last meeting:

    • the complete removal of the dbapi, the merging of Connection and ClientConnection: remains
    • Integrate the pyramid cube to provide the pyramid command if the pyramid framework can be imported: removed (too soon, pyramid-cubicweb's APIs are not stable enough)
    • Integration of CWEP-003 (FROM clause for RQL): removed (will probably never be included unless someone needs it)
    • CWEP-004 (cubes as standard python packages) is being discussed: removed (not for 3.21, see above)

    dataimports and stores

    A heavy refactoring is under way that concerns data import in CubicWeb. The main goal is to design a single API to be used by the various cubes that accelerate the insertion of data (dataio, massiveimport, fastimport, etc) as well as the internal CWSource and its data feeds.

    For details, see the thread on the mailing-list and the patches arriving in the review pipeline.


  • CubicWeb roadmap meeting on January 8th, 2015

    2015/01/05 by Nicolas Chauvat

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. The previous roadmap meeting was in November 2014.

    Here is the report about the January 8th, 2015 meeting.

    Christophe de Vienne (Unlish) and Aurélien Campéas (self-employed) joined us to express their concerns and discuss the future of CubicWeb.

    Versions

    Version 3.18

    This version is stable but old and maintained (current is 3.18.7).

    Version 3.19

    This version is stable and maintained (current is 3.19.8).

    Version 3.20

    This version was released a few days ago. It has not been deployed on production systems yet.

    Its main features are:

    • virtual relations: a new ComputedRelation class can be used in schema.py; its rule attribute is an RQL snippet that defines the new relation.

    • computed attributes: an attribute can now be defined with a formula argument (also an RQL snippet); it will be read-only, and updated automatically.

      Both of these features are described in CWEP-002, and the updated "Data model" chapter of the CubicWeb book.

    • cubicweb-ctl plugins can use the cubicweb.utils.admincnx function to get a Connection object from an instance name.

    • new 'tornado' wsgi backend

    • session cookies have the HttpOnly flag, so they're no longer exposed to javascript

    • rich text fields can be formatted as markdown

    • the edit controller detects concurrent editions, and raises a ValidationError if an entity was modified between form generation and submission

    • cubicweb can use a postgresql "schema" (namespace) for its tables

    • cubicweb-ctl configure can be used to set values of the admin user credentials in the sources configuration file

    For details read list of tickets for CubicWeb 3.20.0.

    We would have loved to integrate the pyramid cube in this release, but the Debian packaging effort needed by the pyramid stack is quite big, and is only acceptable (at a decent price) if we target jessie only.

    Version 3.21

    For now, the roadmap for 3.21 is still the complete removal of the dbapi and the merging of Connection and ClientConnection.

    Integrate the pyramid cube to provide the pyramid command if the pyramid framework can be imported.

    Integration of CWEP-003 (FROM clause for RQL) and CWEP-004 (cubes as standard python packages) is being discussed.

    Version 4.0

    We expect to accelerate development of CubicWeb 4, whose exact roadmap is still to be discussed, but we may already want to:

    • be pyramid-based (remove twisted, auth management, etc.),
    • do not have anything left of old dbapi and ClientConnection,
    • integrate squareui as main (and only) web-ui "template" or remove web generation (almost) completely from cubicweb-core and provide it only through the cube system.

    Agenda

    The next roadmap meeting will be held at the beginning of March 2015 at Logilab. Interested parties are invited to get in touch.

    Open Discussions

    Refactoring the documentation

    Christophe de Vienne suggested completely revamping the documentation and intends to lead this effort.

    Training material

    Aurélien Campéas asks if Logilab would be willing to share its training material under a free license to help interested parties organize and sell trainings.

    Towards making squareui the default rendering engine for cubicweb

    We expect to be able to use squareui/bootstrap as the "rendering engine" for our forge applications (like http://www.cubicweb.org and http://www.logilab.org) as soon as possible. However, to achieve this goal, there are still too many "visual bugs", some of which may require a discussion.

    Among others:

    • put the ctxtoolbar component in the <nav> div
    • each box component should have an icon (what API for this?)
    • we cannot easily make the left column of the main template responsive-aware (it requires changing the html flow), so it's probably best to take inspiration from things like http://wrapbootstrap.com/preview/WB0N89JMK
    • facet boxes are a mess, there is no simple solution to have a "smart layout"

    Migration

    • AppObjects should not be loaded by default
    • Have a look at Alembic the migration tool for SQLAlchemy and take inspiration from there.

  • CubicWeb roadmap meeting on November 6th, 2014

    2014/11/03 by Nicolas Chauvat

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. The previous roadmap meeting was in September 2014.

    Here is the report about the November 6th, 2014 meeting. Christophe de Vienne (Unlish) joined us to express his concerns and discuss the future of CubicWeb. Dimitri Papadopoulos (CEA) could not come.

    Versions

    Version 3.17

    This version is stable but old, and maintenance will continue only as long as some customers are willing to pay for it (current is 3.17.17).

    If you're still using 3.17, you should go directly to 3.19.

    Version 3.18

    This version is stable but old and maintained (current is 3.18.6).

    Version 3.19

    This version is stable and maintained (current is 3.19.5).

    Version 3.20

    This version is still under development but should be released very soon now (expected next week). Its main feature is the inclusion of CWEP-002 (computed attributes and relations), along with many small improvement patches.

    For details read list of tickets for CubicWeb 3.20.0.

    We would have loved to integrate the pyramid cube in this release, but the Debian packaging effort needed by the pyramid stack is quite big, and is only acceptable (at a decent price) if we target jessie only.

    Version 3.21

    For now, the roadmap for 3.21 is still the complete removal of the dbapi, the merging of Connection and ClientConnection, and possibly including CWEP-003 (adding a FROM clause to RQL).

    Integrate the pyramid cube to provide the pyramid command if the pyramid framework can be imported.

    Integration of CWEP-004 is being discussed.

    Version 4.0

    We expect to accelerate development of CubicWeb 4, whose exact roadmap is still to be discussed, but we may already want to:

    • be pyramid-based (remove twisted, auth management, etc.),
    • do not have anything left of old dbapi and ClientConnection,
    • integrate squareui as main (and only) web-ui "template" or remove web generation (almost) completely from cubicweb-core and provide it only through the cube system.

    CWEPs

    Here is the status of open CubicWeb Evolution Proposals:

    to be written

    Work in progress

    Some work is in progress around CKAN, DCAT and other Open Data and Semantic Web related technologies.

    Agenda

    The next roadmap meeting will be held at the beginning of January 2015 at Logilab, and Christophe and Dimitri (or Yann) are invited.

    Open Discussions

    Migration:

    • AppObjects should not be loaded by default
    • Have a look at Alembic the migration tool for SQLAlchemy and take inspiration from there

  • Exploring the datafeed API in CubicWeb

    2014/09/26 by Denis Laxalde

    The datafeed API is one of the nice features of the CubicWeb framework. It makes it possible to easily build such things as a news aggregator (or even a semantic news feed reader), an LDAP importer or an application importing data from another web platform. The underlying API is quite flexible and powerful. Yet, the documentation being quite thin, it may be hard to find one's way through it. In this article, we'll describe the basics of the datafeed API and provide guiding examples.

    The datafeed API is essentially built around two things: a CWSource entity and a parser, which is a kind of AppObject.

    The CWSource entity defines a list of URLs from which to fetch data to be imported in the current CubicWeb instance. It is linked to a parser through the parser's __regid__. So something like the following should be enough to create a usable datafeed source [1].

    create_entity('CWSource', name=u'some name', type=u'datafeed', parser=u'myparser')
    

    The parser is usually a subclass of DataFeedParser (from cubicweb.server.sources.datafeed). It should at least implement the two methods process and before_entity_copy. To make it easier, there are specialized parsers such as DataFeedXMLParser that already define process so that subclasses only have to implement the process_item method.

    Overview of the datafeed API

    Before going into further details about the actual implementation of a DataFeedParser, it's worth having in mind a few details about the datafeed parsing and import process. This involves various players from the CubicWeb server, namely: a DataFeedSource (from cubicweb.server.sources.datafeed), the Repository and the DataFeedParser.

    • Everything starts from the Repository which loops over its sources and pulls data from each of these (this is done using a looping task which is setup upon repository startup). In the case of datafeed sources, Repository sources are instances of the aforementioned DataFeedSource class [2].
    • The DataFeedSource selects the appropriate parser from the registry and loops on each uri defined in the respective CWSource entity by calling the parser's process method with that uri as argument (methods pull_data and process_urls of DataFeedSource).
    • If the result of the parsing step is successful, the DataFeedSource will call the parser's handle_deletion method, with the URI of the previously imported entities.
    • Then, the import log is formatted and the transaction committed. The DataFeedSource and DataFeedParser are connected to an import_log which feeds the CubicWeb instance with a CWDataImport per data pull. This usually contains the number of created and updated entities along with any error/warning message logged by the parser. All this is visible in a table from the CWSource primary view.
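
    The pull sequence described above can be mimicked with a toy model. None of the code below is CubicWeb code: ToyParser and this pull_data function are stand-ins written for the illustration, and the convention that process returns a true value on error is an assumption of the sketch.

```python
class ToyParser(object):
    """Stand-in for a DataFeedParser; records what it is asked to do."""

    def __init__(self):
        self.processed = []
        self.deletion_handled = False

    def process(self, url):
        self.processed.append(url)
        return False  # no error while processing this URL

    def handle_deletion(self, previous_uris):
        self.deletion_handled = True


def pull_data(parser, urls, previous_uris):
    """Toy pull: process each URL, then handle deletions if all went well."""
    error = False
    for url in urls:
        if parser.process(url):
            error = True
    if not error:
        parser.handle_deletion(previous_uris)
    return not error


parser = ToyParser()
ok = pull_data(parser, ['http://example.org/a', 'http://example.org/b'], {})
```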

    So now, you might wonder what actually happens during the parser's process method call. This method takes a URL from which to fetch data and further processes each piece of data (using a process_item method for instance). For each data item:

    1. the repository is queried to retrieve or create an entity in the system source: this is done using the extid2entity method;
    2. this extid2entity method essentially needs two pieces of information:
      • a so-called extid, which uniquely identifies an item in the distant source
      • any other information needed to create or update the corresponding entity in the system source (this will be later referred to as the sourceparams)
    3. then, given the (new or existing) entity returned by extid2entity, the parser can perform further postprocessing (for instance, updating any relation on this entity).

    In step 1 above, the parser method extid2entity in turn calls the repository method extid2eid given the current source and the extid value. If an entry in the entities table matches the specified extid, the corresponding eid (identifier in the system source) is returned. Otherwise, a new eid is created. It's worth noting that the created entity (in case the entity is to be created) is not complete with respect to the data model at this point. In order for the entity to be completed, the source method before_entity_insertion is called. This is where the aforementioned sourceparams are used. More specifically, on the parser side the before_entity_copy method is called: it usually just updates (using entity.cw_set() for instance) the fetched entity with any relevant information.
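
    The lookup-or-create logic can be pictured with a small in-memory model. This is a deliberately simplified sketch: ToyRepository and the function signatures below are invented for the illustration and do not match the real CubicWeb methods, which also deal with sources, sessions and entity updates.

```python
class ToyRepository(object):
    """In-memory stand-in for the entities table (extid -> eid)."""

    def __init__(self):
        self.eids = {}
        self.entities = {}

    def extid2eid(self, extid):
        """Return (eid, created) for the given external identifier."""
        created = extid not in self.eids
        if created:
            self.eids[extid] = len(self.eids) + 1
        return self.eids[extid], created


def extid2entity(repo, extid, etype, before_entity_copy, **sourceparams):
    """Retrieve the entity matching extid, creating and completing it if new."""
    eid, created = repo.extid2eid(extid)
    if created:
        entity = {'eid': eid, 'type': etype}
        # Give the parser a chance to complete the incomplete entity.
        before_entity_copy(entity, sourceparams)
        repo.entities[eid] = entity
    return repo.entities[eid]
```

    Calling extid2entity twice with the same extid returns the same entity, which is what makes repeated pulls of the same feed idempotent in this toy model.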

    Case study: a news feeds parser

    Now we'll go through a concrete example to illustrate all those fairly abstract concepts and implement a datafeed parser which can be used to import news feeds. Our parser will create entities of type FeedArticle, whose minimal data model would be:

    from yams.buildobjs import EntityType, String, RichString
    
    class FeedArticle(EntityType):
        title = String(fulltextindexed=True)
        uri = String(unique=True)
        author = String(fulltextindexed=True)
        content = RichString(fulltextindexed=True, default_format='text/html')
    

    Here we'll reuse the DataFeedXMLParser, not because we have XML data to parse, but because its interface fits well with our purpose, namely: it ships an item-based processing (a process_item method) and it relies on a parse method to fetch raw data. The underlying parsing of the news feed resources will be handled by feedparser.

    class FeedParser(DataFeedXMLParser):
        __regid__ = 'newsaggregator.feed-parser'
    

    The parse method is called by process; it should return a list of tuples with item information.

    def parse(self, url):
        """Delegate to feedparser to retrieve feed items"""
        data = feedparser.parse(url)
        return zip(data.entries)
    

    Then the process_item method takes an individual item (i.e. an entry of the result obtained from feedparser in our case). It essentially defines an extid, here the uri of the feed entry (a good candidate for uniqueness), and calls extid2entity with that extid, the entity type to be created / retrieved and any additional data useful for entity completion passed as keyword arguments. (The process_feed method call just transforms the results obtained from feedparser into a dict suitable for entity creation following the data model described above.)

    def process_item(self, entry):
        data = self.process_feed(entry)
        extid = data['uri']
        entity = self.extid2entity(extid, 'FeedArticle', feeddata=data)
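
    The process_feed helper is not shown in this post. A minimal sketch could look like the following, written here as a plain function rather than a method. The choice of feedparser fields (link, title, author, summary) is an assumption made for this sketch; the actual implementation in the news aggregator cube may differ.

```python
def process_feed(entry):
    """Map a feedparser entry to a dict of FeedArticle attributes.

    `entry` only needs to behave like a mapping, which feedparser
    entries do.
    """
    return {
        'uri': entry['link'],                    # used as the extid
        'title': entry.get('title', u''),
        'author': entry.get('author', u''),
        'content': entry.get('summary', u''),
    }
```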
    

    The before_entity_copy method is called before the entity is actually created (or updated) in order to give the parser a chance to complete it with any other attribute that could be set from source data (namely feedparser data in our case).

    def before_entity_copy(self, entity, sourceparams):
        feeddata = sourceparams['feeddata']
        entity.cw_edited.update(feeddata)
    

    And this is essentially all that is needed for a simple parser. Further details can be found in the news aggregator cube. More sophisticated parsers may use other concepts not described here, such as source mappings.

    Testing datafeed parsers

    Testing a datafeed parser often involves pulling data from the corresponding datafeed source. Here is a minimal test snippet that illustrates how to retrieve the datafeed source from a CWSource entity and to pull data from it.

    with self.admin_access.repo_cnx() as cnx:
        # Assuming one knows the URI of a CWSource.
        rset = cnx.execute('CWSource X WHERE X uri %(uri)s', {'uri': uri})
        # Retrieve the datafeed source instance.
        dfsource = self.repo.sources_by_eid[rset[0][0]]
        # Make sure its parser matches the expected one.
        self.assertEqual(dfsource.parser_id, '<my-parser-id>')
        # Pull data using an internal connection.
        with self.repo.internal_cnx() as icnx:
            stats = dfsource.pull_data(icnx, force=True, raise_on_error=True)
            icnx.commit()
    

    The resulting stats is a dictionary containing the eids of the entities created and updated during the pull. In addition, all entities created should have the cw_source relation set to the corresponding CWSource entity.

    Notes

    [1]

    It is possible to add some configuration to the CWSource entity in the form of a string of configuration items (one per line). Noteworthy items are:

    • the synchronization-interval;
    • use-cwuri-as-url=no, which avoids using external URLs inside the CubicWeb instance (which would otherwise make every link to an imported entity point to the external source URI);
    • delete-entities=[yes,no] which controls if entities not found anymore in the distant source should be deleted from the CubicWeb instance.
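
    As an illustration of the "one item per line" format, here is a small helper that builds such a string. The helper source_config is hypothetical, and the interval value '1h' is only an example whose exact accepted syntax should be checked against the CubicWeb documentation.

```python
def source_config(**items):
    """Build a configuration string with one "key=value" item per line."""
    lines = []
    for key, value in sorted(items.items()):
        # Python identifiers cannot contain "-", so "_" is accepted instead.
        lines.append(u'%s=%s' % (key.replace('_', '-'), value))
    return u'\n'.join(lines)


config = source_config(synchronization_interval='1h',
                       use_cwuri_as_url='no',
                       delete_entities='yes')
```

    The resulting string would then be given as the configuration of the CWSource entity when creating it.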
    [2]

    The mapping between CWSource entities' type (e.g. "datafeed") and the DataFeedSource object is quite unusual, as it does not rely on the vreg but uses a specific sources registry (defined in cubicweb.server.SOURCE_TYPES).

  • CubicWeb roadmap meeting on September 4th, 2014

    2014/09/01 by Nicolas Chauvat

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. The previous roadmap meeting was in July 2014.

    Here is the report about the September 4th, 2014 meeting. Christophe de Vienne (Unlish) and Dimitri Papadopoulos (CEA) joined us to express their concerns and discuss the future of CubicWeb.

    Versions

    Version 3.17

    This version is stable but old, and maintenance will continue only as long as some customers are willing to pay for it (current is 3.17.16 with 3.17.17 in development).

    Version 3.18

    This version is stable and maintained (current is 3.18.5 with 3.18.6 in development).

    Version 3.19

    This version is stable and maintained (current is 3.19.3 with 3.19.4 in development).

    Version 3.20

    This version is under development. It will try to reduce as much as possible the stock of patches in the state "reviewed", "awaiting review" and "in progress". If you have had something in the works that has not been accepted yet, please ready it for 3.20 and get it merged.

    It should still include the work done for CWEP-002 (computed attributes and relations).

    For details read list of tickets for CubicWeb 3.20.0.

    Version 3.21

    Removal of the dbapi, merging of Connection and ClientConnection, CWEP-003 (adding a FROM clause to RQL).

    Version 4.0

    Once the work done for Pyramid has been tested, it will become the default runner and a lot of things will be dropped: twisted, dead code, ui and core code that would be better cast into cubes, etc.

    This version could happen early in 2015.

    Cubes

    New cubes and libraries

    CWEPs

    Here is the status of open CubicWeb Evolution Proposals:

    CWEP-0002 full-featured implementation, to be merged in 3.20

    CWEP-0003 patches sent to review. Champion will be adim.

    Work in progress

    PyConFR

    Christophe will try to present at PyConFR the work he did on getting CubicWeb to work with Pyramid.

    Pip-friendly source layout

    Logilab and Christophe will try to make CubicWeb more pip/virtualenv-friendly. This may involve changing the source layout to include a sub-directory, but the impact on existing developments is expected to be too great, so this change could be delayed to CubicWeb 4.0.

    Pyramid

    Christophe has made good progress on getting CubicWeb to work with Pyramid and he intends to put it into production real soon now. There is a Pyramid extension named pyramid_cubicweb and a CubicWeb cube named cubicweb-pyramid. Both work with CubicWeb 3.19. Christophe demonstrated using the debug toolbar, authenticating users with Authomatic and starting multiple workers with uWSGI.

    Early adopters are now invited to jump in and help harden the code!

    Agenda

    Logilab's next roadmap meeting will be held at the beginning of November 2014, and Christophe and Dimitri are invited.


  • Handling dependencies between form fields in CubicWeb

    2014/07/11 by Denis Laxalde

    This post considers the issue of building an edition form for a CubicWeb entity with dependencies between its fields. It's quite a common issue that needs to be handled client-side, based on user interaction.

    Consider the following example schema:

    from yams.buildobjs import EntityType, RelationDefinition, String, SubjectRelation
    from cubicweb.schema import RQLConstraint
    
    _ = unicode
    
    class Country(EntityType):
        name = String(required=True)
    
    class City(EntityType):
        name = String(required=True)
    
    class in_country(RelationDefinition):
        subject = 'City'
        object = 'Country'
        cardinality = '1*'
    
    class Citizen(EntityType):
        name = String(required=True)
        country = SubjectRelation('Country', cardinality='1*',
                                  description=_('country the citizen lives in'))
        city = SubjectRelation('City', cardinality='1*',
                               constraints=[
                                   RQLConstraint('S country C, O in_country C')],
                               description=_('city the citizen lives in'))
    

    The main entity of interest is Citizen which has two relation definitions towards Country and City. Then, a City is bound to a Country through the in_country relation definition.

    In the automatic edition form of Citizen entities, we would like to restrict the choices of cities depending on the selected Country, to be determined from the value of the country field. (In other words, we'd like the constraint on the city relation defined above to be fulfilled during form rendering, not just validation.) Typically, in the image below, cities not in Italy should not be available in the city select widget:

    Example of Citizen entity edition form.

    The issue will be solved by a little customization of the automatic entity form, some uicfg rules and a bit of Javascript. In the following, the country field will be referred to as the master field and the city field as the dependent field.

    So here is the code of the views.py module:

    from cubicweb.predicates import is_instance
    from cubicweb.web.views import autoform, uicfg
    from cubicweb.uilib import js
    
    _ = unicode
    
    
    class CitizenAutoForm(autoform.AutomaticEntityForm):
        """Citizen autoform handling dependencies between Country/City form fields
        """
        __select__ = is_instance('Citizen')
    
        needs_js = autoform.AutomaticEntityForm.needs_js + ('cubes.demo.js', )
    
        def render(self, *args, **kwargs):
            master_domid = self.field_by_name('country', 'subject').dom_id(self)
            dependent_domid = self.field_by_name('city', 'subject').dom_id(self)
            self._cw.add_onload(js.cw.cubes.demo.initDependentFormField(
                master_domid, dependent_domid))
            super(CitizenAutoForm, self).render(*args, **kwargs)
    
    
    def city_choice(form, field):
        """Vocabulary function grouping city choices by country."""
        req = form._cw
        vocab = [(req._('<unspecified>'), '')]
        for eid, name in req.execute('Any X,N WHERE X is Country, X name N'):
            rset = req.execute('Any N,E ORDERBY N WHERE'
                               ' X name N, X eid E, X in_country C, C eid %(c)s',
                               {'c': eid})
            if rset:
                # 'optgroup' tag.
                oattrs = {'id': 'country_%s' % eid}
                vocab.append((name, None, oattrs))
                for label, value in rset.rows:
                    # 'option' tag.
                    vocab.append((label, str(value)))
        return vocab
    
    
    uicfg.autoform_field_kwargs.tag_subject_of(('Citizen', 'city', '*'),
                                               {'choices': city_choice, 'sort': False})
    

    The first thing (reading from the bottom of the file) is that we've added a choices function on the city relation of the Citizen automatic entity form via uicfg. This function, city_choice, essentially generates the HTML content of the field value by grouping available cities by respective country through the addition of some optgroup tags.

    Then, we've overridden the automatic entity form for the Citizen entity type by essentially calling a piece of Javascript code fed with the DOM ids of the master and dependent fields. Fields are retrieved by their name (field_by_name method) and their respective ids are obtained using the dom_id method.
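
    To make the shape of that vocabulary concrete, here is a standalone sketch that reproduces the grouping on plain Python data instead of RQL result sets. The function grouped_vocabulary and the sample countries and cities are made up for the illustration.

```python
def grouped_vocabulary(countries, cities):
    """Group city choices by country.

    countries: list of (eid, name) tuples
    cities: dict mapping a country eid to a list of (name, eid) tuples
    """
    vocab = [('<unspecified>', '')]
    for country_eid, country_name in countries:
        rows = sorted(cities.get(country_eid, []))
        if rows:
            # 3-tuple with attributes: rendered as an 'optgroup' tag.
            vocab.append((country_name, None,
                          {'id': 'country_%s' % country_eid}))
            for label, value in rows:
                # 2-tuple: rendered as an 'option' tag.
                vocab.append((label, str(value)))
    return vocab


vocab = grouped_vocabulary(
    [(1, 'France'), (2, 'Italy'), (3, 'Spain')],
    {1: [('Paris', 10)], 2: [('Rome', 21), ('Milan', 20)]})
```

    Countries without any city (Spain in this sample) produce no optgroup at all, as in city_choice.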

    Now the Javascript part of the picture:

    cw.cubes.demo = {
        // Initialize the dependent form field select and bind update event on
        // change on the master select.
        initDependentFormField: function(masterSelectId,
                                         dependentSelectId) {
            var masterSelect = cw.jqNode(masterSelectId);
            cw.cubes.demo.updateDependentFormField(masterSelect, dependentSelectId);
            masterSelect.change(function(){
                cw.cubes.demo.updateDependentFormField(this, dependentSelectId);
            });
        },
    
        // Update the dependent form field select.
        updateDependentFormField: function(masterSelect,
                                           dependentSelectId) {
            // Clear previously selected value.
            var dependentSelect = cw.jqNode(dependentSelectId);
            $(dependentSelect).val('');
            // Hide all optgroups.
            $(dependentSelect).find('optgroup').hide();
            // But the one corresponding to the master select.
            $('#country_' + $(masterSelect).val()).show();
        }
    }
    

    It consists of two functions. The initDependentFormField function is called during form rendering and essentially binds the second function, updateDependentFormField, to the change event of the master select field. The latter "update" function retrieves the dependent select field, hides all optgroup nodes (i.e. the whole content of the select widget) and then only shows the dependent options that match the selected master option, identified by a custom country_<eid> id set by the vocabulary function above.


  • CubicWeb roadmap meeting on July 3rd, 2014

    2014/06/26 by Nicolas Chauvat

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. The previous roadmap meeting was in May 2014.

    Here is the report about the July 3rd, 2014 meeting. Christophe de Vienne (Unlish) and Dimitri Papadopoulos (CEA) joined us to express their concerns and discuss the future of CubicWeb.

    Versions

    Version 3.17

    This version is stable but old, and maintenance will continue only as long as some customers are willing to pay for it (current is 3.17.15 with 3.17.16 in development).

    Version 3.18

    This version is stable and maintained (current is 3.18.5 with 3.18.6 in development).

    Version 3.19

    This version was published at the end of April and has now been tested on our internal servers. It includes support for Cross Origin Resource Sharing (CORS) and a heavy refactoring that modifies sessions and sources to lay the path for CubicWeb 4.

    For details read the release notes or the list of tickets for CubicWeb 3.19.0. Current is 3.19.2.

    Version 3.20

    This version is under development. It will try to reduce as much as possible the stock of patches in the state "reviewed", "awaiting review" and "in progress". If you have had something in the works that has not been accepted yet, please ready it for 3.20 and get it merged.

    It should still include the work done for CWEP-002 (computed attributes and relations).

    For details read the list of tickets for CubicWeb 3.20.0.

    Version 3.21 (or maybe 4.0?)

    Removal of the dbapi, merging of Connection and ClientConnection, CWEP-003 (adding a FROM clause to RQL).

    Cubes

    Cubes published over the past two months

    New cubes

    • cubicweb-frbr: Cube providing a schema based on FRBR entities
    • cubicweb-clinipath
    • cubicweb-fastimport

    CWEPs

    Here is the status of open CubicWeb Evolution Proposals:

    CWEP-0002 is only missing a bit of migration support, to be finished soon for inclusion in 3.20.

    CWEP-0003 has been reviewed and is waiting for a bit of reshaping that should occur soon. It's targeted for 3.21.

    New CWEPs are expected to be written for clarifying the API of the _cw object, supporting persistent sessions and improving the performance of massive imports.

    Work in progress

    Design

    The new logo is now published in the 3.19 line. David showed us his experimentation that modernizes a forge's UI with a bit of CSS. There is still a bit of pressure on the bootstrap side though, as it still relies on heavy monkey-patching in the cubicweb-bootstrap cube.

    Data import

    Also, Dimitri expressed his concerns about the lack of a proper data import API. We should soon have some feedback from Aurelien's cubicweb-fastimport experimentation, which may be an answer to Dimitri's need. In the end, we somewhat agreed that there were different needs (e.g. massive-no-consistency import vs not-so-big-but-still-safe), that cubicweb.dataimport was an attempt to answer them all and that cubicweb-dataio and cubicweb-fastimport were more specific responses. We may reasonably hope that an API will emerge.

    Removals

    On his way to persistent sessions, Aurélien made huge progress toward silencing warnings in the 3.19 tests. dbapi has been removed, and ClientConnection and Connection have been merged. We decided to take some time to think about recurring task management, as it is related to other tricky topics (application / instance configuration) and is not directly related to persistent sessions.

    Rebasing on Pyramid

    Last but not least, Christophe demonstrated that CubicWeb could basically live with Pyramid. This experimentation will be pursued, as it sounds like a very promising way to get the good parts of the two frameworks.

    Agenda

    Logilab's next roadmap meeting will be held at the beginning of September 2014, and Christophe and Dimitri were invited.


  • Logilab's roadmap for CubicWeb on May 15th, 2014

    2014/05/21 by Nicolas Chauvat

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. Here is the report about the May 15th, 2014 meeting. The previous report posted to the blog was the March 2014 roadmap.

    Versions

    Version 3.17

    This version is stable but old, and maintenance will continue only as long as some customers are willing to pay for it (current is 3.17.15).

    Version 3.18

    This version is stable and maintained (current is 3.18.4).

    Version 3.19

    This version was published at the end of April. It includes support for Cross Origin Resource Sharing (CORS) and a heavy refactoring that modifies sessions and sources to lay the path for CubicWeb 4.

    For details read the release notes or the list of tickets for CubicWeb 3.19.0.

    Version 3.20

    This version is under development. It will try to reduce as much as possible the stock of patches in the state "reviewed", "awaiting review" and "in progress". If you have had something in the works that has not been accepted yet, please ready it for 3.20 and get it merged.

    It should also include the work done for CWEP-002 (computed attributes and relations) and the merging of Connection and ClientConnection if it happens to be simple enough to get done quickly (in case the removal of dbapi would really help, this merging will wait for 3.21).

    For details read the list of tickets for CubicWeb 3.20.0.

    Version 3.21 (or maybe 4.0?)

    Removal of the dbapi and merging of CWEP-003 (adding a FROM clause to RQL).

    Cubes

    Here is a list of cubes that had versions published over the past two months: accidents, awstats, book, bootstrap, brainomics, cmt, collaboration, condor, container, dataio, expense, faq, file, forge, forum, genomics, geocoding, inlineedit, inventory, keyword, link, mailinglist, mediaplayer, medicalexp, nazcaui, ner, neuroimaging, newsaggregator, processing, questionnaire, rqlcontroller, semnews, signedrequest, squareui, task, testcard, timesheet, tracker, treeview, vcsfile, workorder.

    Here are the new cubes we are pleased to announce:

    rqlcontroller receives via a POST a list of RQL queries and executes them. This is a way to build web services.

    wsme is helping build a web service API on top of a CubicWeb database.

    signedrequest is a simple token based authentication system. This is a way for scripts or callback urls to access an instance without login/pwd information.

    relationwidget is a widget usable in forms to edit relationships between objects. It depends on CubicWeb 3.19.

    searchui is an experiment on adding blocks to the list of facets that allow building complex RQL queries step by step by clicking with the mouse instead of directly writing the RQL with the keyboard.

    ckan is using the REST API of a CKAN data portal to mirror its content.

    CWEPs

    Here is the status of open CubicWeb Evolution Proposals:

    CWEP-0002 is now in good shape and the goal is to have it merged into 3.20. It lacks some documentation and a migration script.

    CWEP-0003 has made good progress during the latest sprint, but will need a thorough review before being merged. It will probably not be ready for 3.20 and have to wait for 3.21.

    New CWEPs are expected to be written for clarifying the API of the _cw object, supporting persistent sessions and improving the performance of massive imports.

    Visual identity

    CubicWeb has a new logo that will appear before the end of May on its revamped homepage at http://www.cubicweb.org

    Last but not least

    As already said on the mailing list, other developers and contributors are more than welcome to share their own goals in order to define a roadmap that best fits everyone's needs.

    Logilab's next roadmap meeting will be held at the beginning of July 2014.


  • What's new in CubicWeb 3.19

    2014/05/05 by Aurelien Campeas

    New functionalities

    • implement Cross Origin Resource Sharing (CORS) (see #2491768)
    • system_source.create_eid can return a range of IDs, to reduce overhead of batch entity creation
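
    To illustrate why handing out a range of IDs reduces the overhead of batch creation, here is a toy allocator. It is a standalone sketch: the EidAllocator class, its create_eids method and the round_trips counter are hypothetical names, and it does not reproduce the signature or behaviour of system_source.create_eid.

```python
class EidAllocator:
    """Toy ID allocator that reserves identifiers in blocks.

    Hypothetical illustration only, not CubicWeb's API.
    """

    def __init__(self, start=1):
        self._next = start
        # Counts simulated trips to the underlying ID sequence.
        self.round_trips = 0

    def create_eids(self, count=1):
        """Reserve `count` consecutive IDs in a single round trip."""
        if count < 1:
            raise ValueError("count must be >= 1")
        first = self._next
        self._next += count
        self.round_trips += 1
        return range(first, first + count)
```

    Creating 1000 entities with one call to create_eids(1000) costs a single round trip in this model, instead of 1000 when each ID is requested separately.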

    Behaviour Changes

    • The anonymous property of Session and Connection is now computed from the related user login. If it matches the anonymous-user in the config the connection is anonymous. Beware that the anonymous-user config is web specific. Therefore, no session may be anonymous in a repository only setup.

    New Repository Access API

    Connection replaces Session

    A new explicit Connection object replaces Session as the main repository entry point. A Connection holds all the necessary methods to be used server-side (execute, commit, rollback, call_service, entity_from_eid, etc...). One obtains a new Connection object using session.new_cnx(). Connection objects need to have an explicit begin and end. Use them as a context manager to never miss an end:

    with session.new_cnx() as cnx:
        cnx.execute('INSERT Elephant E, E name "Babar"')
        cnx.commit()
        cnx.execute('INSERT Elephant E, E name "Celeste"')
        cnx.commit()
    # Once you get out of the "with" clause, the connection is closed.
    

    Using the same Connection object in multiple threads will give you access to the same Transaction. However, Connection objects are not thread safe, so do this at your own risk.
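
    The "explicit begin and end" contract can be pictured with a toy object that implements the context manager protocol. ToyConnection below is a made-up stand-in, not the CubicWeb class; in particular, discarding uncommitted work on exit is a choice made for this sketch rather than a documented behaviour.

```python
class ToyConnection:
    """Stand-in object showing the begin/end contract of a connection."""

    def __init__(self):
        self.open = False
        self.committed = []
        self._pending = []

    def __enter__(self):
        # Explicit begin: the connection becomes usable.
        self.open = True
        return self

    def __exit__(self, exc_type, exc, tb):
        # Explicit end: drop uncommitted work and close.
        self._pending.clear()
        self.open = False
        return False  # never swallow exceptions

    def execute(self, query):
        if not self.open:
            raise RuntimeError("connection is closed")
        self._pending.append(query)

    def commit(self):
        self.committed.extend(self._pending)
        self._pending.clear()
```

    Because __exit__ always runs, the end of the connection cannot be forgotten, even when the body of the with block raises.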

    repository.internal_session is deprecated in favor of repository.internal_cnx. Note that internal connections are now safe by default, i.e. the integrity hooks are enabled.

    Backward compatibility is preserved on Session.

    dbapi vs repoapi

    A new API has been introduced to replace the dbapi. It is called repoapi.

    There are three relevant functions for now:

    • repoapi.get_repository returns a Repository object either from an URI when used as repoapi.get_repository(uri) or from a config when used as repoapi.get_repository(config=config).
    • repoapi.connect(repo, login, **credentials) returns a ClientConnection associated with the user identified by the credentials. The ClientConnection is associated with its own Session that is closed when the ClientConnection is closed. A ClientConnection is a Connection-like object to be used client side.
    • repoapi.anonymous_cnx(repo) returns a ClientConnection associated with the anonymous user if described in the config.

    repoapi.ClientConnection replaces dbapi.Connection and company

    On the client/web side, the Request is now using a repoapi.ClientConnection instead of a dbapi.Connection. The ClientConnection has multiple backward compatible methods to make it look like a dbapi.Cursor and dbapi.Connection.

    Sessions used on the Web side are now the same as the ones used Server side. Some backward compatibility methods have been installed on the server side Session to ease the transition.

    The authentication stack has been altered to use the repoapi instead of the dbapi. Cubes adding new elements to this stack are likely to break.

    New API in tests

    All current methods and attributes used to access the repo on CubicWebTC are deprecated. You may now use a RepoAccess object. A RepoAccess object is linked to a new Session for a specified user. It is able to create Connection, ClientConnection and web side requests linked to this session:

    access = self.new_access('babar') # create a new RepoAccess for user babar
    with access.repo_cnx() as cnx:
        # some work with server side cnx
        cnx.execute(...)
        cnx.commit()
        cnx.execute(...)
        cnx.commit()
    
    with access.client_cnx() as cnx:
        # some work with client side cnx
        cnx.execute(...)
        cnx.commit()
    
    with access.web_request(elephant='babar') as req:
        # some work with web request
        elephant_name = req.form['elephant']
        req.execute(...)
        req.cnx.commit()
    

    By default testcase.admin_access contains a RepoAccess object for the default admin session.

    API changes

    • RepositorySessionManager.postlogin is now called with two arguments, request and session. And this now happens before the session is linked to the request.
    • SessionManager and AuthenticationManager now take a repo object at initialization time instead of a vreg.
    • The async argument of _cw.call_service has been dropped. All calls are now synchronous. The zmq notification bus looks like a good replacement for most async use cases.
    • repo.stats() is now deprecated. The same information is available through a service (_cw.call_service('repo_stats')).
    • repo.gc_stats() is now deprecated. The same information is available through a service (_cw.call_service('repo_gc_stats')).
    • repo.register_user() is now deprecated. The functionality is now available through a service (_cw.call_service('register_user')).
    • request.set_session no longer takes an optional user argument.
    • CubicwebTC does not have repo and cnx as class attributes anymore. They are standard instance attributes. set_cnx and _init_repo class methods become instance methods.
    • set_cnxset and free_cnxset are deprecated. The database connection acquisition and release cycle is now more transparent.
    • The implementation of cascading deletion when deleting composite entities has changed. There comes a semantic change: merely deleting a composite relation does not entail any more the deletion of the component side of the relation.
    • _cw.user_callback and _cw.user_rql_callback are deprecated. Users are encouraged to write an actual controller (e.g. using ajaxfunc) instead of storing a closure in the session data.
    • A new entity.cw_linkable_rql method provides the rql to fetch all entities that are already or may be related to the current entity using the given relation.

    Deprecated Code Drops

    • The session.hijack_user mechanism has been dropped.
    • EtypeRestrictionComponent has been removed, its functionality has been replaced by facets a while ago.
    • the old multi-source support has been removed. Only copy-based sources remain, such as datafeed or ldapfeed.

  • Logilab's roadmap for CubicWeb on March 7th, 2014

    2014/03/10 by Nicolas Chauvat

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. Here is the report about the Mar 7th, 2014 meeting. The previous report posted to the blog was the January 2014 roadmap.

    Version 3.17

    This version is stable but old, and maintenance will stop in a few weeks (current is 3.17.13 and 3.17.14 is upcoming).

    Version 3.18

    This version is stable and maintained (current is 3.18.3 and 3.18.4 is upcoming).

    Version 3.19

    This version is about to be published. It includes a heavy refactoring that modifies sessions and sources to lay the path for CubicWeb 4.

    For details read the list of tickets for CubicWeb 3.19.0.

    Version 3.20

    This version will try to reduce as much as possible the stock of patches in the state "reviewed", "awaiting review" and "in progress". If you have had something in the works that has not been accepted yet, please ready it for 3.20 and get it merged.

    It should also include the work done for CWEP-002 (computed attributes and relations) and CWEP-003 (adding a FROM clause to RQL).

    For details read the list of tickets for CubicWeb 3.20.0.

    Cubes

    Here is a list of cubes that had versions published over the past two months: addressbook, awstats, blog, bootstrap, brainomics, comment, container, dataio, genomics, invoice, mediaplayer, medicalexp, neuroimaging, person, preview, questionnaire, securityprofile, simplefacet, squareui, tag, tracker, varnish, vcwiki, vtimeline.

    Here are the new cubes we are pleased to announce:

    collaboration is a building block that reuses container and helps to define collaborative workflows where entities are cloned, modified and shared.

    Our priorities for the next two months are collaboration and container, then narval/apycot, then mercurial-server, then rqlcontroller and signedrequest, then imagesearch.

    Mid-term goals

    The work done for CWEP-0002 (computed attributes and relations) is expected to land in CubicWeb 3.20.

    The work done for CWEP-0003 (explicit data source federation using FROM in RQL) is expected to land in CubicWeb 3.20.

    Tools to diagnose performance issues would be very useful. Maybe in 3.21?

    Caching session data would help and some work was done on this topic during the sprint in February. Maybe in 3.22?

    WSGI has made progress lately, but still needs work. Maybe in 3.23?

    RESTfulness is a goal. Maybe in 3.24?

    Maybe 3.25 will in fact be 4.0?

    Events

    A spring sprint will take place in Logilab's offices in Paris from April 28th to 30th. We invite all the interested parties to join us there!

    Last but not least

    As already said on the mailing list, other developers and contributors are more than welcome to share their own goals in order to define a roadmap that best fits everyone's needs.

    Logilab's next roadmap meeting will be held at the beginning of May 2014.


  • CubicWeb sprint / winter 2014

    2014/02/12 by Nicolas Chauvat

    This sprint took place at Logilab's offices in Paris on Feb 13/14. People from CEA, Unlish, Crealibre and Logilab teamed up to push CubicWeb forward.

    We did not forget the priorities from the roadmap:

    • CubicWeb 3.17.13 and 3.18.3 were released, and CubicWeb 3.19 made progress
    • the branch about ComputedAttributes and ComputedRelations (CWEP-002) is ready to be merged,
    • the branch about the FROM clause (CWEP-003) made progress (the CWEP was reviewed and part of the resulting spec was implemented),
    • in order to reduce work in progress, the number of patches in state reviewed or pending-review was brought down to 243 (from 302, that is about 60 fewer, or roughly 20%, which is not bad).
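
    The reduction quoted in the last item can be checked with a couple of lines of arithmetic; the variable names are only for illustration.

```python
before, after = 302, 243

removed = before - after              # patches cleared during the sprint
percent = 100.0 * removed / before    # share of the initial backlog

print(removed, round(percent, 1))     # 59 19.5
```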

  • CubicWeb using Postgresql at its best

    2014/02/08 by Nicolas Chauvat

    We had a chat today with a core contributor to Postgresql from whom we may buy consulting services in the future. We discussed how CubicWeb could get the best out of Postgresql:

    • making use of the LISTEN/NOTIFY mechanism built into PG could be useful (to warn the cache about modified items for example) and PgQ is its good friend;
    • views (materialized or not) are another way to implement computed attributes and relations (see CWEP number 002) and it could be that the Entities table is in fact a view of other tables;
    • implementing RQL as an in-database language could open the door to new things (there is PL/pgSQL, PL/Python, what if we had PL/RQL?);
    • Foreign Data Wrappers written with Multicorn would be another way to write data feeds (see LDAP integration for an example);
    • managing dates can be tricky when users reside in different timezones and UTC is important to keep in mind (unicode/str is a good analogy);
    • for transitive closures that are often needed when implementing access control policies with __permissions, Postgresql can go a long way with queries like "WITH ... (SELECT UNION ALL SELECT RETURNING *) UPDATE USING ...";
    • the fastest way to load tabular data that does not need too much pre-processing is to create a temporary table in memory, then COPY-FROM the data into that table, then index it, then write the transform and load step in SQL (maybe with PL/Python);
    • when executing more than 10 updates in a row, it is better to write into a temporary table in memory, then update the actual tables with UPDATE USING (let's check if the psycopg driver does that when executemany is called);
    • reaching 10e8 rows in a table is at the time of this writing the stage when you should start monitoring your db seriously and start considering replication, partition and sharding;
    • full-text search is much better in Postgresql than the general public thinks it is and recent developments made it orders of magnitude faster than tools like Lucene or Solr and ElasticSearch;
    • when dealing with complex queries (searching graphs maybe), an option to consider is to implement a specific data type, use it in a materialized view and use GIN or GIST indexes over it;
    • for large scientific data sets, it could be interesting to link the numpy library into Postgresql and turn numpy arrays into a new data type;
    • Oh, and one last thing: the object-oriented tables of Postgresql are not such a great idea, unless you have a use case that fits them perfectly and does not hit their limitations (CubicWeb's is_instance_of does not seem to be one of these).
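
    As an illustration of the transitive closure item above, here is a minimal sketch of a recursive common table expression. It runs against an in-memory SQLite database as a stand-in, since WITH RECURSIVE is available in both SQLite and Postgresql; the member_of table, its columns and the reachable_groups function are made up for the example and do not reflect CubicWeb's actual schema. It uses UNION rather than UNION ALL so that cycles in the graph do not loop forever.

```python
import sqlite3


def reachable_groups(edges, start):
    """Return every group reachable from `start` by following `edges`.

    `edges` is a list of (child, parent) pairs. Table and column names
    are hypothetical.
    """
    db = sqlite3.connect(":memory:")
    try:
        db.execute("CREATE TABLE member_of (child TEXT, parent TEXT)")
        db.executemany("INSERT INTO member_of VALUES (?, ?)", edges)
        rows = db.execute(
            """
            WITH RECURSIVE closure(name) AS (
                -- direct parents of the starting node
                SELECT parent FROM member_of WHERE child = ?
                UNION
                -- parents of anything already found
                SELECT m.parent
                FROM member_of AS m
                JOIN closure AS c ON m.child = c.name
            )
            SELECT name FROM closure ORDER BY name
            """,
            (start,),
        ).fetchall()
    finally:
        db.close()
    return [name for (name,) in rows]
```

    The whole traversal happens inside the database in a single query, which is the point of the advice: no round trip per level of the hierarchy.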

    Hopin' I got you thinkin' :)

    http://developer.postgresql.org/~josh/graphics/logos/elephant.png

  • Cubicweb sprints winter/spring 2014

    2014/01/24 by David Douard

    The Logilab team is pleased to announce two Cubicweb sprints to be held in its Paris offices in the upcoming months:

    February 13/14th at Logilab in Paris

    The agenda would be the FROM clause for which a CWEP is still awaited, and the RQL rewriter according to the CWEP02.

    April 28/30th at Logilab in Paris

    Agenda to be defined.

    Join the party

    All users and contributors of CubicWeb are invited to join the party. Just send an email to contact at Logilab.fr if you plan to come.

    http://farm1.static.flickr.com/183/419945378_4ead41a76d_m.jpg

  • Logilab's roadmap for CubicWeb on January 9th, 2014

    2014/01/14 by Nicolas Chauvat

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. Here is the report about the Jan 9th, 2014 meeting. The previous report posted to the blog was the November 2013 roadmap.

    Version 3.17

    This version is stable and maintained (current is 3.17.11 and 3.17.12 is upcoming).

    Version 3.18

    This version was released on Jan 10th. Read the release notes or the details of CubicWeb 3.18.0.

    Version 3.19

    This version includes a heavy refactoring that modifies sessions and sources to lay the path for CubicWeb 4. It is currently the default development head in the repository and is expected to be released before the end of January.

    For details read the list of tickets for CubicWeb 3.19.0.

    Version 3.20

    This version will try to reduce as much as possible the stock of patches in the state "reviewed", "awaiting review" and "in progress". If you have had something in the works that has not been accepted yet, please ready it for 3.20 and get it merged.

    For details read the list of tickets for CubicWeb 3.20.0.

    Cubes

    The current trend is to develop more and more new features in dedicated cubes rather than to add more code to the core of CubicWeb. If you thought CubicWeb development was slowing down, you made a mistake, because cubes are ramping up.

    Here is a list of versions that were published in the past two months: timesheet, postgis, leaflet, bootstrap, worker, container, embed, geocoding, vcreview, trackervcs, vcsfile, zone, dataio, mercurial-server, queueing, questionnaire, genomics, medicalexp, neuroimaging, brainomics, elections.

    Here are the new cubes we are pleased to announce:

    Bootstrap works and we do not create a new application without it.

    relationwidget provides a modal window to edit relations in forms (use uicfg to activate it).

    resourcepicker provides a modal window to insert links to images and files into structured text.

    rqlcontroller allows using the INSERT, DELETE and SET keywords when sending RQL queries over HTTP. It returns JSON. Get used to it and you may forget about asking for specific web services in your apps, for it is a generic web service.

    imagesearch is an image gallery with facets. You may use it as a demo of a visual search tool.

    Mid-term goals

    A new repository was created to have all the CubicWeb Evolution Proposals in one place.

    CWEP-0002 is a work in progress about computed relations and computed attributes, or maybe more. It will be a focus of the next sprint and is targeted at CubicWeb 3.20.

    A new CWEP is expected about adding the FROM keyword to RQL to implement explicit data source federation. It will be a focus of the next sprint and is targeted at CubicWeb 3.21.

    Tools to diagnose performance issues would be very useful. Maybe in 3.22?

    Caching session data would help. Maybe in 3.23?

    WSGI has made progress lately, but still needs work. Maybe in 3.24?

    RESTfulness is a goal. Maybe in 3.25?

    Maybe 3.26 will in fact be 4.0?

    Events

    A sprint will take place in Logilab's offices in Paris around mid-February or at the end of April. We invite all the interested parties to join us there!

    Last but not least

    As already said on the mailing list, other developers and contributors are more than welcome to share their own goals in order to define a roadmap that best fits everyone's needs.

    Logilab's next roadmap meeting will be held at the beginning of March 2014.


  • What's new in CubicWeb 3.18

    2014/01/10 by Aurelien Campeas

    The migration script does not handle sqlite nor mysql instances.

    New functionalities

    • add a security debugging tool (see #2920304)
    • introduce an add permission on attributes, to be interpreted at entity creation time only and allow the implementation of complex update rules that don't block entity creation (before that the update attribute permission was interpreted at entity creation and update time) (see #2965518)
    • the primary view display controller (uicfg) now has a set_fields_order method similar to the one available for forms
    • new method ResultSet.one(col=0) to retrieve a single entity and enforce the result has only one row (see #3352314)
    • new method RequestSessionBase.find to look for entities (see #3361290)
    • the embedded jQuery copy has been updated to version 1.10.2, and jQuery UI to version 1.10.3.
    • initial support for wsgi for the debug mode, available through the new wsgi cubicweb-ctl command, which can use either python's builtin wsgi server or the werkzeug module if present.
    • a rql-table directive is now available in ReST fields
    • cubicweb-ctl upgrade can now generate the static data resource directory directly, without a manual call to gen-static-datadir.
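
    The contract of ResultSet.one(col=0) listed above, returning a single item and refusing anything but exactly one row, can be sketched on plain lists of rows. The helper below and its two exception classes are local stand-ins written for this illustration; the real method belongs to ResultSet and returns an entity rather than a raw cell.

```python
class NoResultError(Exception):
    """Raised by the sketch when there is no row at all."""


class MultipleResultsError(Exception):
    """Raised by the sketch when there is more than one row."""


def one(rows, col=0):
    """Return the value in column `col` of the only row in `rows`.

    `rows` is a list of tuples standing in for a result set.
    """
    if len(rows) == 0:
        raise NoResultError("expected exactly one row, got none")
    if len(rows) > 1:
        raise MultipleResultsError(
            "expected exactly one row, got %d" % len(rows))
    return rows[0][col]
```

    Failing loudly in both error cases avoids the classic bug of silently picking the first row of a query that was assumed to be unique.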

    API changes

    • not really an API change, but the entity write permission checks are now systematically deferred to an operation, instead of a) trying in a hook and b) if it failed, retrying later in an operation
    • The default value storage for attributes is no longer String, but Bytes. This opens the road to storing arbitrary python objects, e.g. numpy arrays, and fixes a bug where default values whose truth value was False were not properly migrated.
    • symmetric relations are no longer handled by an rql rewrite but are now handled with hooks (from the activeintegrity category); this may have some consequences for applications that do low-level database manipulations or at times disable (some) hooks.
    • unique together constraints (multi-column uniqueness constraints) get a name attribute that maps the CubicWeb constraint entities to the corresponding backend index.
    • BreadCrumbEntityVComponent's open_breadcrumbs method now includes the first breadcrumbs separator
    • entities can be compared for equality and hashed
    • the on_fire_transition predicate accepts a sequence of possible transition names
    • the GROUP_CONCAT rql aggregate function no longer repeats duplicate values, on the sqlite and postgresql backends
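
    The effect of the GROUP_CONCAT change in the last item can be reproduced at the SQL level. The sketch below uses an in-memory SQLite database and SQL's DISTINCT keyword to contrast the two behaviours; how CubicWeb's backends actually implement the deduplication is not described here, and the tagged table and tags_per_owner function are invented for the example.

```python
import sqlite3


def tags_per_owner(rows):
    """Map each owner to (all tags, de-duplicated tags).

    `rows` is a list of (owner, tag) pairs; the schema is hypothetical.
    """
    db = sqlite3.connect(":memory:")
    try:
        db.execute("CREATE TABLE tagged (owner TEXT, tag TEXT)")
        db.executemany("INSERT INTO tagged VALUES (?, ?)", rows)
        result = db.execute(
            "SELECT owner, GROUP_CONCAT(tag), GROUP_CONCAT(DISTINCT tag) "
            "FROM tagged GROUP BY owner ORDER BY owner"
        ).fetchall()
    finally:
        db.close()
    # SQLite joins values with ',' by default.
    return {owner: (plain.split(","), distinct.split(","))
            for owner, plain, distinct in result}
```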

    Deprecation

    • pyrorql sources have been deprecated. Multisource will be fully dropped in the next version. If you are still using pyrorql, switch to datafeed NOW!
    • the old multi-source system
    • find_one_entity and find_entities in favor of find (see #3361290)
    • the TmpFileViewMixin and TmpPngView classes (see #3400448)

    Deprecated Code Drops

    • ldapuser has been dropped; use ldapfeed now (see #2936496)
    • action GotRhythm was removed, make sure you do not import it in your cubes (even to unregister it) (see #3093362)
    • all 3.8 backward compat is gone
    • all 3.9 backward compat (including the javascript side) is gone
    • the twisted (web-only) instance type has been removed

    For a complete list of tickets, read CubicWeb 3.18.0.


  • Logilab's roadmap for CubicWeb on November 8th, 2013

    2013/11/11 by Nicolas Chauvat

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. Here is the report about the Nov 8th, 2013 meeting. The previous report posted to the blog was the September 2013 roadmap.

    Version 3.17

    This version is stable and maintained (cubicweb 3.17.11 is upcoming).

    Version 3.18

    This version was supposed to be released in September or October, but is stalled at the integration stage. All open tickets were moved to 3.19 and existing patches that are not ready to be merged will be more aggressively delayed to 3.19. The goal is to release 3.18 as soon as possible.

    For details read the list of tickets for CubicWeb 3.18.0.

    Version 3.19

    This version will probably be published early next year (read January or February 2014). It is planned to include a heavy refactoring that modifies sessions and sources to lay the path for CubicWeb 4.

    For details read the list of tickets for CubicWeb 3.19.0.

    Squareui

    Logilab is now developing all its new projects based on Squareui (and Bootstrap 3.0). Squareui can be considered a usable beta, but not feature-complete.

    Logilab is looking for a UX designer to work on the general ergonomics of CubicWeb. Read the job offer.

    Mid-term goals

    The mid-term goals include better REST support (Representational State Transfer), complete WSGI (Python's Web Server Gateway Interface) and the FROM clause for RQL queries (to reinvent db federation outside of the core).

    On the front-end side, it would be nice to be able to improve forms, maybe with client-side javascript and better support for a "json on server, js in browser" separation of concerns.

    Cubes

    A cube oauth was contributed in large part by Unlish, a startup that is using CubicWeb to implement its service.

    A cube vcwiki is being developed by Logilab, to manage the content of a wiki with a version control system (built with the cube vcsfile).

    Last but not least

    As already said on the mailing list, other developers and contributors are more than welcome to share their own goals in order to define a roadmap that best fits everyone's needs.

    Logilab's next roadmap meeting will be held at the beginning of January 2014.


  • Apache authentication

    2013/10/09 by Dimitri Papadopoulos

    An Apache front end might be useful, as Apache provides standard log files, monitoring or authentication. In our case, we have Apache authenticate users before they are cleared to access our CubicWeb application. Still, we would like user accounts to be managed within a CubicWeb instance, avoiding separate sets of identifiers, one for Apache and the other for CubicWeb.

    We have to address two issues:

    • have Apache authenticate users against accounts in the CubicWeb database,
    • have CubicWeb trust Apache authentication.

    Apache authentication against CubicWeb accounts

    A possible solution would be to access the identifiers associated with a CubicWeb account at the SQL level, directly from the SQL database underneath a CubicWeb instance. The login and password can be found in the cw_login and cw_upassword columns of the cw_cwuser table. The benefit is that we can use existing Apache modules for authentication against SQL databases, typically mod_authn_dbd. On the other hand this is highly dependent on the underlying SQL database.

    Instead we have chosen an alternate solution, directly accessing the CubicWeb repository. Since we need Python to access the repository, our sysadmins have deployed mod_python on our Apache server.

    We wrote a Python authentication module that accesses the repository using ZMQ. Thus ZMQ needs to be enabled. To enable ZMQ, uncomment and complete the following line in all-in-one.conf:

    zmq-repository-address=zmqpickle-tcp://localhost:8181
    

    The Python authentication module looks like:

    from mod_python import apache
    from cubicweb import dbapi
    from cubicweb import AuthenticationError
    
    def authenhandler(req):
        # get_basic_auth_pw() must be called before reading req.user
        pw = req.get_basic_auth_pw()
        user = req.user
    
        # same address as zmq-repository-address in all-in-one.conf
        database = 'zmqpickle-tcp://localhost:8181'
        try:
            cnx = dbapi.connect(database, login=user, password=pw)
        except AuthenticationError:
            # the repository rejected the login/password pair
            return apache.HTTP_UNAUTHORIZED
        else:
            cnx.close()
            return apache.OK
    

    CubicWeb trusts Apache

    Our sysadmins set up Apache to add x-remote-user to the HTTP headers forwarded to CubicWeb - more on the relevant Apache configuration in the next paragraph.

    We then add the cubicweb-trustedauth cube to the dependencies of our CubicWeb application. We simply had to add to the __pkginfo__.py file of our CubicWeb application:

    __depends__ =  {
        'cubicweb': '>= 3.16.1',
        'cubicweb-trustedauth': None,
    }
    

    This cube gets CubicWeb to trust the x-remote-user header sent by the Apache front end. CubicWeb bypasses its own authentication mechanism. Users are directly logged into CubicWeb as the user with a login identical to the Apache login.

    Apache configuration and deployment

    Our Apache configuration looks like:

    <Location /apppath >
      AuthType Basic
      AuthName "Restricted Area"
      AuthBasicAuthoritative Off
      AuthUserFile /dev/null
      require valid-user
    
      PythonAuthenHandler cubicwebhandler
    
      RewriteEngine On
      RewriteCond %{REMOTE_USER} (.*)
      RewriteRule . - [E=RU:%1]
    </Location>
    
    RequestHeader set X-REMOTE-USER %{RU}e
    
    ProxyPass          /apppath  http://127.0.0.1:8080
    ProxyPassReverse   /apppath  http://127.0.0.1:8080
    

    The CubicWeb application is accessed as http://ourserver/apppath/.

    The Python authentication module is deployed as /usr/lib/python2.7/dist-packages/cubicwebhandler/handler.py where cubicwebhandler is the attribute associated to PythonAuthenHandler in the Apache configuration.


  • Brainomics / CrEDIBLE conference report

    2013/10/09 by Vincent Michel

    CubicWeb and the Brainomics project were presented last week at the CrEDIBLE workshop (October 2-4, 2013, Sophia-Antipolis) on "Federating distributed and heterogeneous biomedical data and knowledge". We would like to thank the organizers for this nice opportunity to show the features of CubicWeb and Brainomics in the context of biomedical data.

    http://credible.i3s.unice.fr/lib/tpl/credible/images/credible.png

    Workshop highlights

    • A short presentation of SHI3LD, which defines data access based on conditions expressed as ASK requests. The other part was a state of the art of Open Data licenses, and the (poor) existence of licenses expressed in RDF. Future work seems to be an interesting combination of both SHI3LD and RDF-based licenses for data access.
    • MIDAS, an open-source software for sharing medical data. This project could be an interesting source of inspiration for the file sharing part of CubicWeb, even if the (really complicated in my opinion) case of large files downloads is not addressed for now.
    • Federated queries based on FedX - the optimization techniques based on source selection & exclusive groups seem a good approach for avoiding large data transfers and finding some (sub-)optimal ways to join the different data sources. This should be taken into account in the future work on the "FROM" clause in CubicWeb.
    • WebPIE/QueryPIE: a map-reduce-based approach for large-scale reasoning.

    CubicWeb and Brainomics

    The slides of the presentation can be downloaded as a PDF or viewed on slideshare.

    Some people seem confused about the RQL to SQL translation. This relies on a simple translation logic that is implemented in the rql2sql file. This is only an implementation trick, not so different from the one used in RDBMS-based triplestores that have to convert SPARQL into SQL.

    RQL inference: there is no magic behind the RQL inference process. As opposed to triplestores that store RDF triples that contain their own schema, and thus cannot easily know the full data model in these triples without looking at all the triples, RQL relies on a relational database with a fixed (at a given moment) data model, thus allowing inference and simple checks. In this example, we want all the cities of `Île de France` with more than 100 000 inhabitants, which is expressed in RQL:

    Any X WHERE X region Y, X population > 100000,
                Y uri "http://fr.dbpedia.org/resource/Île-de-France"
    

    and SPARQL:

    select ?ville where {
    ?ville db-owl:region <http://fr.dbpedia.org/resource/Île-de-France> .
    ?ville db-owl:populationTotal ?population .
    FILTER (?population > 100000)
    }
    

    Besides the fact that RQL is less verbose than SPARQL (syntax matters), the simplicity of RQL relies on the fact that it can automatically infer (similarly to SPARQL) that if X is related to Y by the region relation and has a population attribute, it should be a city. If city and district both have the region relation and a population attribute, the RQL inference fetches them both transparently; otherwise one can be specific by using the is relation:

    Any X WHERE X is City, X region Y, X population > 100000,
                Y uri "http://fr.dbpedia.org/resource/Île-de-France"
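
    The inference described above can be illustrated with a small sketch. It is not CubicWeb's implementation: the SCHEMA dictionary and both helper functions are hypothetical, and only show how a fixed data model makes it possible to deduce the type of X from the relations used on it.

```python
# Hypothetical, simplified data model: entity type -> relations/attributes it supports.
SCHEMA = {
    "City": {"region", "population", "name"},
    "District": {"region", "population", "name"},
    "Region": {"uri", "name"},
    "Person": {"name"},
}


def infer_types(used_relations, schema=SCHEMA):
    """Return the entity types that support every relation used on a variable."""
    required = set(used_relations)
    return sorted(etype for etype, rels in schema.items() if required <= rels)


def restrict_types(candidates, is_type=None):
    """Apply an explicit `X is <Type>` restriction, if one is given."""
    if is_type is None:
        return list(candidates)
    return [etype for etype in candidates if etype == is_type]


# X is used with `region` and `population`, as in the first query.
x_types = infer_types(["region", "population"])
print(x_types)                          # ['City', 'District']
# Adding `X is City`, as in the second query, narrows the result.
print(restrict_types(x_types, "City"))  # ['City']
```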
    

    RQL also allows subqueries, union, full-text search, stored procedures, ... (see the doc).

    These really interesting discussions convinced us that we should write a journal paper for detailing the theoretical and technical concepts behind RQL and the YAMS schema.


  • Logilab will be in Toulouse métropole Open Data Barcamp tomorrow

    2013/10/08 by Sylvain Thenault

    Meet us tomorrow at Toulouse's Cantine, where several people from Logilab will attend the open data barcamp organized by Toulouse Metropole.

    More info on barcamp.org. We'll probably talk about how CubicWeb manages to import large amounts of open data for reuse.


  • Logilab's roadmap for CubicWeb on September 6th, 2013

    2013/09/17 by Nicolas Chauvat

    The Logilab team holds a roadmap meeting every two months to plan its CubicWeb development effort. Here is the report about the Sept 6th, 2013 meeting. The previous report posted to the blog was the February 2013 roadmap.

    Version 3.17

    This version is now stable and maintained (release 3.17.7 is upcoming). It added a couple of features and focused on putting CW on a diet by extracting some functionalities provided by the core into external cubes: sioc, embed, massmailing, geocoding, etc.

    For details read what's new in CubicWeb 3.17.

    Version 3.18

    This version is now frozen and will be published as soon as all the patches are tested and merged. Since we have a lot of work for clients until the end of the year at Logilab, the community should feel free to help (as usual) if it wants this version to be released sooner rather than later.

    This version will remove the ldapuser source that is replaced by ldapfeed, implement Cross Origin Resource Sharing, drop some very old compatibility code, deprecate the old version of the multi-source system and provide various other features and bugfixes.

    For details read list of tickets for CubicWeb 3.18.0.

    Version 3.19

    This version will probably be published early next year (read January or February 2014) unless someone who is not working at Logilab takes responsibility for its release.

    It should include the heavy refactoring work done by Pierre-Yves and Sylvain over the past year, that modifies sessions and sources to lay the path for CubicWeb 4.

    For details read list of tickets for CubicWeb 3.19.0 or take a look at this head.

    Squareui

    Since Orbui changes the organization of the default user interface on screen, it was decided to share the low-level bootstrap-related views that could be shared and to build a SquareUI cube that would conform to the design choices of the default UI.

    Logilab is now developing all its new projects based on Squareui 0.2. Read about it on the mailing list archives.

    Mid-term goals

    The mid-term goals include better REST support (Representational State Transfer), complete WSGI (Python's Web Server Gateway Interface) and the FROM clause for RQL queries (to reinvent db federation outside of the core).

    Cubes

    Our current plan is to extract as much as possible to cubes. We started CubicWeb many years ago with the Python motto "batteries included", but have since realized that having too much in the core contributes to making CubicWeb difficult to learn.

    Since we would very much like the community to grow, we are now aiming for something more balanced, like Mercurial does. The core is designed such that most features can be developed as an extension. Once they are stable, popular extensions can be moved to the main library that is distributed with the core, and be activated with a switch in the configuration file.

    Several cubes are under active development: oauth, signedrequest, dataio, etc.

    Last but not least

    As already said on the mailing list, other developers and contributors are more than welcome to share their own goals in order to define a roadmap that best fits everyone's needs.

    Logilab's next roadmap meeting will be held at the beginning of November 2013.


  • Brainomics - A management system for exploring and merging heterogeneous brain mapping data

    2013/09/12 by Arthur Lutz

    At OHBM 2013, the 19th Annual Meeting of the Organization for Human Brain Mapping, Logilab presented a poster explaining the work done using CubicWeb on brain imaging and genetics data in collaboration with INRIA, INSERM and the CEA during the Brainomics project, co-financed by the Agence nationale de la Recherche.

    http://www.cubicweb.org/file/3123353/raw/Screenshot%20from%202013-09-12%2010%3A27%3A27.png

    You can download this poster and try the demo online.


  • What's new in CubicWeb 3.17

    2013/06/21 by Aurelien Campeas

    What's new in CubicWeb 3.17?

    New functionalities

    • add a command to compare db schema and file system schema (see #464991)
    • Add CubicWebRequestBase.content with the content of the HTTP request (see #2742453)
    • Add directive bookmark to ReST rendering (see #2545595)
    • Allow user defined final type (see #124342)

    API changes

    • drop typed_eid() in favour of int() (see #2742462)
    • The SIOC views and adapters have been removed from CubicWeb and moved to the sioc cube.
    • The web page embedding views and adapters have been removed from CubicWeb and moved to the embed cube.
    • The email sending views and controllers have been removed from CubicWeb and moved to the massmailing cube.
    • RenderAndSendNotificationView is deprecated in favor of ActualNotificationOp; the new operation uses the more efficient data idiom.
    • Looping tasks can now have an interval <= 0. A negative interval disables the looping task entirely.
    • We now serve html instead of xhtml. (see #2065651)

    Deprecation

    • ldapuser has been deprecated. It will be removed in a future version. If you are still using ldapuser switch to ldapfeed NOW!
    • hijack_user has been deprecated. It will be dropped soon.

    Deprecated Code Drops

    • The progress views and adapters have been removed from CubicWeb. These classes were deprecated since 3.14.0. They are still available in the iprogress cube.
    • The part of the API deprecated since 3.7 was dropped.

  • We're going to PGDay France, the Postgresql Community conference

    2013/06/11 by Arthur Lutz

    A few people from the CubicWeb team are going to attend the French PostgreSQL community conference in Nantes (France) on the 13th of June.

    http://www.cubicweb.org/file/2932005/raw/hdr_left.png

    We're excited to learn more about the following topics that are relevant to CubicWeb's development and features :

    https://www.pgday.fr/_media/pgfr2.png

    Obviously we'll pay attention to all the talks during the day. If you're attending, we hope to see you there.


  • OpenData meets the Semantic Web at WOD2013

    2013/06/10 by Arthur Lutz

    With a few people from Logilab we went to the 2nd International Workshop on Open Data (WOD), on the 3rd of June.

    Although the main focus was an academic take on OpenData, a lot of talks were related to the Semantic Web technologies and especially LinkedData.

    http://www.logilab.org/file/144837/raw/banniere-wod2013.png

    The full program (and papers) is on the following website. Here is a quick review of the things we thought worth sharing.

    • privacy oriented ontologies : http://l2tap.org/
    • interesting automations done to suggest alignments when initial data is uploaded to an opendata website
    • some opendata platforms have built-in APIs to get files, one example is Socrata : http://dev.socrata.com/
    • some work is being done to scale processing of linked data in the cloud (did you know you could access readily available datasets in the Amazon cloud? DBPedia for example)
    • the data stored in wikipedia can be a good source of vocabulary on certain machine learning tasks (and in the future, wikidata project)
    • there is an RDF extension to Google Refine (or OpenRefine), but we haven't managed to get it working out of the box,
    • WebSmatch uses morphological operators (erosion / dilation) to identify grids and zones in Excel Spreadsheets and then aligns column data on known reference values (e.g. country lists).

    We naturally enjoyed the presentation made by Romain Wenz about http://data.bnf.fr with the unavoidable mention of Victor Hugo (and CubicWeb).

    Thanks to the organizers of the conference and to the National French Library for hosting the event.


  • data.bnf.fr gets the Stanford Prize for Innovation in Research Libraries

    2013/03/01 by Nicolas Chauvat

    data.bnf.fr and Gallica just got awarded the Stanford Prize for Innovation in Research Libraries 2013. The CubicWeb community is very pleased to see that data.bnf.fr, which is built with CubicWeb, is being recognized at the top international level as leading innovation in its domain! Read the comments of the judges for more details.


  • CubicWeb at Data Tuesday on Feb 26th 2013

    2013/02/14 by Nicolas Chauvat

    CubicWeb was showcased at Data Tuesday on Feb 26th 2013. The other presentations were interesting, especially shacache.org, the soon-to-be-launched OpenMeteoData and the very useful scikit.learn.


  • CubicWeb rewarded at Dataconnexion 2013

    2013/02/06 by Nicolas Chauvat

    CubicWeb got rewarded yesterday at the award ceremony of the Dataconnexions 2013 contest.

    http://www.cubicweb.org/2710848?vid=download

    Dataconnexions is a contest organized by Etalab, the part of the French State in charge of data.gouv.fr, which catalogs the open data published by the French administration.

    Congratulations to all the developers and users of CubicWeb and welcome to the people who will join the CW community thanks to the media coverage we are now experiencing.

    Read the press announcement and the slides.


  • Logilab's roadmap for CubicWeb as of February 2013

    2013/02/04 by Nicolas Chauvat

    The Logilab team now holds a roadmap meeting every two months to plan its CubicWeb development effort. Here are the decisions that were taken on Feb 1st, 2013.

    Version 3.17

    This version should be published before the end of March and will finish all the things that are work in progress. It will include:

    • the refactoring necessary to introduce persistent sessions,
    • the shrinking of web/views: everything that does not deserve its own cube (like sioc, embed, geocoding, etc) will go into a cube named legacyui (this will open the door to squareui),
    • stop serving pages with "content-type: application/xhtml",
    • handling postgresql schemas (will require a new version of logilab.database),
    • a new logo.

    Squareui

    Once the legacyui cube is extracted (in version 3.17), it will be possible to move forward swiftly with squareui. Due to its other duties, one cannot expect the core CW team to develop squareui. Interested people will be in charge, and ideally the squareui cube could be released when cubicweb 3.17 is published.

    Cleaning up the backlog

    The lead CW developers will spend about 20% of their time cleaning up the ticket backlog at the forge (900 open tickets and 50 in progress !)

    The first step will be to reduce the number of tickets "in progress", then to organize the open tickets and merge the duplicates.

    Version 3.18

    This version is due at the end of May 2013. It will include:

    • persisting sessions,
    • WSGI,
    • RESTfulness: support for HTTP verbs PUT / DELETE, enforcement of the semantics of GET / POST (may be difficult to maintain backward-compatibility)

    Mid-term goals

    The mid-term goals are:

    • possibility to add new base types (Array, HStore, Geometry, TSVector, etc.) that would use extensions from the SQL backend

    • FROM clause in rql queries

    • websockets

    • defining attribute on relations and defining "virtual" relations or rules:

      class Contribution(EntityType):
          author = SubjectRelation('Person', cardinality='1*', inlined=True)
          book = SubjectRelation('Book', cardinality='1*', inlined=True)
          role = SubjectRelation('Role', cardinality='1*', inlined=True)
      
      preface_writer = VirtualRelation('C is Contribution, C author S, C book O, '
                                       'C role R, R name "preface writer"')
      

      And:

      Any P WHERE B is Book, P preface_writer B
      

      Will we need a materialized view in the database, a standard relation maintained by hooks, or an on-the-fly rewrite of the RQL? Time will tell.

    • cards with logic (mustache js templates for example)

    • coffeescript ? brython ? javascript ? prototype something with CubicDB + WebService that outputs json + user interface in full javascript

    • package Cubic(Web)DB and CubicWeb separately?

    • think about the overall architecture (using WSGI, persistent sessions, etc.), and find solutions that fit a distributed architecture (look at paste.deploy, circus, etc.)

    • clean up the javascript in web/data/*.js

    • configurable metadata, managing the size of the entities table

    • more SPARQL

    • namespaces for the data models of the cubes
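
    Among the options raised in the virtual relations item above, the "rewrite the RQL on-the-fly" one could be prototyped roughly as follows. This is only a sketch under strong assumptions: the RULES table, the expand_virtual_relations function and the VC1/VR1 fresh variable names are hypothetical, and a real implementation would work on the RQL syntax tree rather than on strings.

```python
import re

# Rule copied from the VirtualRelation example: S is the subject, O the object.
RULES = {
    "preface_writer": 'C is Contribution, C author S, C book O, '
                      'C role R, R name "preface writer"',
}


def expand_virtual_relations(rql, rules=RULES):
    """Replace each `<subject> <virtual relation> <object>` by its rule body."""
    counter = 0

    def expand(match):
        nonlocal counter
        subject, rtype, obj = match.groups()
        counter += 1
        # Rule-local variables get fresh (hypothetical) names so that they
        # do not clash with the variables of the surrounding query.
        mapping = {"S": subject, "O": obj,
                   "C": "VC%d" % counter, "R": "VR%d" % counter}
        return re.sub(r"\b[SOCR]\b", lambda m: mapping[m.group(0)], rules[rtype])

    pattern = r"\b(\w+) (%s) (\w+)" % "|".join(map(re.escape, rules))
    return re.sub(pattern, expand, rql)


print(expand_virtual_relations("Any P WHERE B is Book, P preface_writer B"))
```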

    As already said on the mailing list, other developers and contributors are more than welcome to share their own goals in order to define a roadmap that best fits everyone's needs.

    Logilab's next roadmap meeting will be held at the beginning of April 2013.


  • What's new in CubicWeb 3.16

    2013/01/23 by Aurelien Campeas

    What's new in CubicWeb 3.16?

    New functionalities

    • Add a new dataimport store (SQLGenObjectStore). This store enables a fast import of data (entity creation, link creation) in CubicWeb, by directly flushing information in SQL. This may only be used with PostgreSQL, as it requires the 'COPY FROM' command.
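
    To give an idea of why 'COPY FROM' is fast, here is a minimal sketch of how rows can be serialized into the text format that PostgreSQL's COPY command reads. It is not the code of SQLGenObjectStore: copy_escape and rows_to_copy_buffer are hypothetical helper names.

```python
import io


def copy_escape(value):
    """Encode one value for PostgreSQL's COPY text format."""
    if value is None:
        return r"\N"  # NULL marker in the COPY text format
    text = str(value)
    # The backslash must be escaped first so later escapes are not doubled.
    return (text.replace("\\", "\\\\")
                .replace("\t", "\\t")
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def rows_to_copy_buffer(rows):
    """Serialize rows as tab-separated lines into an in-memory file."""
    buf = io.StringIO()
    for row in rows:
        buf.write("\t".join(copy_escape(v) for v in row) + "\n")
    buf.seek(0)
    return buf


buf = rows_to_copy_buffer([(1, "Paris", None), (2, "tab\there", "x")])
print(buf.read())
```

    With psycopg2, such a buffer could then be handed to cursor.copy_from(buf, table_name), which loads all rows in one command instead of issuing one INSERT per entity.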

    API changes

    • Orm: set_attributes and set_relations are unified (and deprecated) in favor of cw_set that works in all cases.

    • db-api/configuration: all the external repository connection information is now in an URL (see #2521848), allowing to drop specific options of pyro nameserver host, group, etc and fix broken ZMQ source. Configuration related changes:

      • Dropped 'pyro-ns-host', 'pyro-instance-id', 'pyro-ns-group' from the client side configuration, in favor of 'repository-uri'. NO MIGRATION IS DONE, supposing there is no web-only configuration in the wild.
      • Stop discovering the connection method through repo_method class attribute of the configuration, varying according to the configuration class. This is a first step on the way to a simpler configuration handling.

      DB-API related changes:

      • Stop indicating the connection method using ConnectionProperties.
      • Drop _cnxtype attribute from Connection and cnxtype from Session. The former is replaced by an is_repo_in_memory property and the latter is totally useless.
      • Turn repo_connect into _repo_connect to mark it as a private function.
      • Deprecate in_memory_cnx which becomes useless, use _repo_connect instead if necessary.
    • the "tcp://" uri scheme used for ZMQ communications (in a way reminiscent of Pyro) is now named "zmqpickle-tcp://", so as to make room for future zmq-based lightweight communications (without python objects pickling).

    • Request.base_url gets a secure=True optional parameter that yields an https url if possible, allowing hook-generated content to send secure urls (e.g. when sending mail notifications)

    • Dataimport ucsvreader gets a new boolean ignore_errors parameter.
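
    Putting all the connection information in a single URL, as described above, means a client can recover the protocol, host and port with the standard library alone. The sketch below assumes nothing about CubicWeb's own parsing code; parse_repository_uri is a hypothetical helper.

```python
from urllib.parse import urlsplit


def parse_repository_uri(uri):
    """Split a repository URL such as zmqpickle-tcp://localhost:8181."""
    parts = urlsplit(uri)
    if not parts.scheme:
        raise ValueError("missing scheme in %r" % uri)
    # "zmqpickle-tcp" -> protocol "zmqpickle", transport "tcp"
    protocol, _, transport = parts.scheme.partition("-")
    return {
        "protocol": protocol,
        "transport": transport or None,
        "host": parts.hostname,
        "port": parts.port,
    }


print(parse_repository_uri("zmqpickle-tcp://localhost:8181"))
```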

    Unintrusive API changes

    • Drop of cubicweb.web.uicfg.AutoformSectionRelationTags.bw_tag_map, deprecated since 3.6.

    User interface changes

    • The RQL search bar now has some auto-completion support. It means relation types or entity types can be suggested while typing. It is an awesome improvement over the current behaviour!
    • The action box associated with table views (from tableview.py) has been transformed into a nice-looking series of small tabs; it means that the possible actions are immediately visible and need not be discovered by clicking on an almost invisible icon on the upper right.
    • The uicfg module has moved to web/views/ and ui configuration objects are now selectable. This will reduce the amount of subclassing and whole methods replacement usually needed to customize the ui behaviour in many cases.
    • Remove changelog view, as neither cubicweb nor known cubes/applications were properly feeding related files.

    Other changes

    • 'pyrorql' sources will be automatically updated to use an URL to locate the source rather than configuration option. 'zmqrql' sources were broken before this change, so no upgrade is needed...
    • Debugging filters for Hooks and Operations have been added.
    • Some cubicweb-ctl commands used to show the output of msgcat and msgfmt; they don't anymore.

  • December 2012 CubicWeb Sprint Report

    2012/12/21 by Nicolas Chauvat

    For two days, on Dec 13th/14th 2012, ten hackers gathered at Logilab to improve the user interface of CubicWeb. This hackathon was initiated by Crealibre. About a year ago, they started the Orbui project, a new user interface for CubicWeb based on the Bootstrap HTML/CSS framework.

    http://www.orbui.com/images/itisa960.png

    Several projects at Logilab and Crealibre proved that Orbui was heading in the right direction, but that it had to fight with the default user interface of CubicWeb. Orbui makes different design/ergonomic choices and needs different HTML/CSS structure and Javascript components.

    Sylvain published a roadmap back in May with a section titled "on the road to Bootstrap". After more than half a day of heated debate on the first day, it was decided to follow the direction he pointed to. We started extracting from CubicWeb the default user interface and turning it into a set of cubes:

    • cubicweb-legacyui: css, views and templates extracted from CubicWeb 3.16, so as to provide full backward compatibility
    • cubicweb-bootstrap: empty cube with only bootstrap version 2.2.2 in data/
    • cubicweb-squareui: bootstrapified version of legacyui (slightly altered to benefit from the bootstrap css without breaking backward compatibility too hard)

    At the end of the sprint, one could add_cube('squareui') on an existing application and keep it usable... and get "some kind of responsiveness" for free, thus proving that we were on the right track.

    A lot of work is still ahead of us, but we have moved a few steps forward towards the goal of making it easier to implement different UIs on top of CubicWeb 3.17.

    For the curious, here is what the skeleton of legacyui.views.maintemplate (aka cw.web.views.maintemplate) looks like:

    <body> (MainTemplate.template_body_header)
      <table id="header"> (HTMLPageHeader.main_header)
        for header in self.headers:
           <td id="header-{left,center,right}">
               render selected components(ctxcomponents, header-{left,center,right})
           </td>
      </table>
      <div id="stateheader"> HTMLPageHeader.call
         <div class="stateMessage"> HTMLPageHeader.state_header
      </div>
      <div id="page"> MainTemplate.template_body_header
        <table id="mainLayout"> MainTemplate.template_body_header
          if boxes (selected components(ctxcomponents, left): MainTemplate.nav_column
            <td id="navColumnLeft">
              <div class="navboxes">
                 render boxes
              </div>
            </td>
          <td id="contentColumn"> MainTemplate.template_body_header
             render selected components(rqlinput)
             render selected components(applmessages)
             if navtop (selected components(ctxcomponents, navtop): HTMLContentHeader.call
               <div id="contentheader">
                 render components
               </div>
               <div class='clear'/>
             <div id="pageContent"> MainTemplate.call
               if vtitle:
                  <div class="vtitle" />
               if etypenavigation:
                  render etypenavigation
               view pagination
               <div id="contentmain">
                  render view
               </div>
               view pagination
             </div>
             if navbottom (selected components(ctxcomponents, navbottom): HTMLContentFooter.call
               <div id="contentfooter">
                 render components
               </div>
          </td>
          if boxes (selected components(ctxcomponents, right): MainTemplate.nav_column
            <div id="navColumnRight">
              <div class="navboxes">
                 render boxes
              </div>
        </table>
      </div>
      <div id="footer"> HTMLPageFooter.call
         render actions selected (actions, 'footer')
      </div>
    </body>
    

    and here is what the skeleton from squareui.views.maintemplate looks like:

    <body>
    <div class="container-fluid">
      <div id="header" class="row-fluid">
        <!-- .header -->
      </div>
      <div class="row-fluid">
        <div id="navColumnLeft" class="span3">
          <!-- .leftcolumn -->
        </div>
        <div id="contentColumn" class="span6">
          <!-- .contentcol -->
          <div class="row-fluid">
            <div id="contentheader" class="span12">
              <!-- .contentheader -->
            </div>
          </div>
          <div class="row-fluid">
            <div id="contentmain" class="span12">
              <!-- .contentmain -->
            </div>
          </div>
          <div class="row-fluid">
            <div id="contentfooter" class="span12">
              <!-- .contentfooter -->
            </div>
          </div>
        </div>
        <div id="navColumnRight" class="span3">
          <!-- .rightcolumn -->
        </div>
      </div>
      <div id="footer" class="row-fluid">
        <!-- .footer -->
      </div>
    </div>
    </body>
    

    Stay tuned for the updates on this (important) topic!


  • Géo − Geonames alignment

    2012/12/20 by Simon Chabot

    This blog post describes the main points of the alignment process between the French National Library's Géo repository of data, and the data extracted from Geonames.

    Alignment is the process of finding similar entities in different repositories. The Géo repository of data contains a lot of locations and the goal is to find those locations in the Geonames repository, and to be able to say that location in *Géo* is the same as this one in *Geonames*. For that purpose, Logilab developed a library, called Nazca, to build those links.

    To process the alignment between Géo and Geonames, we divided the Géo repository into two groups:

    • A group gathering the Géo data having information about longitude and latitude.
    • Another, gathering the data having no information about longitude and latitude.

    Group 1 - Data having geographical information

    The alignment process is made in five steps (see figure below):

    1. Data gathering

    We gather the information needed to align, that is to say, the unique identifier, the name, the longitude and the latitude. The same applies to the Geonames data.

    2. Standardization

    This step aims to make the data as standard as possible, i.e. set to lower case, remove the stop words, remove the punctuation and so on.

    4. Alignment

    Thanks to the Kdtree, we can quickly find the geographical nearest neighbours. During this fourth step, we loop over the nearest neighbours and assign to each a grade according to the similarity of its name and the name of the location we're looking for, using the Levenshtein distance. The alignment will be made with the best graded one.

    5. Saving the results

    Finally, we save all the results into a file.
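
    The standardization and alignment steps above can be sketched in a few lines. This is not Nazca's API: every name below is hypothetical, the stop word list is an assumption, and a brute-force search stands in for the Kdtree (plain Euclidean distance on latitude/longitude is only an approximation of geographical distance).

```python
import math
import string

STOP_WORDS = {"the", "of", "de", "la", "le"}  # assumed list, for illustration


def normalize(name):
    """Lower-case, replace punctuation by spaces and drop stop words."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    words = name.lower().translate(table).split()
    return " ".join(w for w in words if w not in STOP_WORDS)


def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,                # deletion
                               current[j - 1] + 1,             # insertion
                               previous[j - 1] + (ca != cb)))  # substitution
        previous = current
    return previous[-1]


def nearest_neighbours(point, candidates, k=3):
    """Brute-force stand-in for a Kdtree query: the k closest candidates."""
    return sorted(candidates,
                  key=lambda c: math.dist(point, (c["lat"], c["lon"])))[:k]


def align(location, candidates, k=3):
    """Among the k nearest candidates, keep the one with the closest name."""
    target = normalize(location["name"])
    neighbours = nearest_neighbours((location["lat"], location["lon"]),
                                    candidates, k)
    return min(neighbours,
               key=lambda c: levenshtein(target, normalize(c["name"])))


geo = {"id": "geo:1", "name": "PARIS.", "lat": 48.85, "lon": 2.35}
geonames = [
    {"id": "gn:a", "name": "Paris", "lat": 48.853, "lon": 2.349},
    {"id": "gn:b", "name": "Pantin", "lat": 48.894, "lon": 2.409},
    {"id": "gn:c", "name": "Paris", "lat": 33.66, "lon": -95.55},
]
print(align(geo, geonames, k=2)["id"])  # gn:a
```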

    Group 2 - Data having no geographical information

    Let's have a look at the data having no information on the longitude and the latitude. The steps are more or less the same as before, except that we cannot find neighbours using a Kdtree. So we use another method to find locations having a quite high level of similarity in their names. This method is called Minhashing, which has been shown to be quite relevant for this purpose.

    To minimise the number of mistakes, we try to group locations by country, knowing that the country is often written in the location's preferred_label. This pre-treatment helps us filter out the cities having the same name but located in different countries. For instance, there is Paris in France, there is Paris in the United States, and there is Paris in Canada. So the alignment is made country by country.

    The fourth and the fifth steps remain the same.
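
    For readers unfamiliar with Minhashing, here is a minimal sketch of the idea, assuming character trigrams as the compared units. It is not Nazca's implementation; the function names, the number of hash functions and the use of MD5 as a seeded hash are all illustrative choices.

```python
import hashlib


def shingles(name, size=3):
    """Set of character n-grams of a name (the whole name if it is short)."""
    text = name.lower()
    if len(text) <= size:
        return {text}
    return {text[i:i + size] for i in range(len(text) - size + 1)}


def _hash(seed, shingle):
    """Deterministic seeded hash built on MD5 (illustrative choice)."""
    digest = hashlib.md5(("%d:%s" % (seed, shingle)).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(name, num_hashes=64):
    """For each seeded hash function, keep the minimum over all shingles."""
    grams = shingles(name)
    return [min(_hash(seed, g) for g in grams) for seed in range(num_hashes)]


def estimated_similarity(sig_a, sig_b):
    """Fraction of equal positions, an estimate of the Jaccard similarity."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


reference = minhash_signature("saint-denis")
close = estimated_similarity(reference, minhash_signature("saint denis"))
far = estimated_similarity(reference, minhash_signature("marseille"))
print(close > far)
```

    Signatures are short and can be compared or bucketed cheaply, which is what makes the approach usable when no geographical index is available.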

    Results obtained

    The results we got are as follows:

                Amount of locations    Aligned    Non-aligned
    Group 1     97572                  (89.3%)    (10.7%)
    Group 2     150528                 (72.9%)    (27.1%)
    Total       248100                 (79.3%)    (20.7%)

    One problem we met is the language used to describe the location. Indeed, the similarity grade is given according to the distance between the names, and one can notice that Londres and London, for instance, do not have the same spelling although they represent the same location.

    Results improvement

    In order to improve the results a little, we had a closer look at the 10.7% non-aligned locations of the first group. The problem of the language mentioned before was pretty clear. So we decided to use the following definition: two locations are identical if they are geographically very close. Using this definition, we get rid of the name, and focus on the longitude and the latitude only.

    To estimate the exactness of the results, we picked 50 randomly chosen locations and checked them manually. The results are pretty good: 98% are correct (49/50). That's how, based on a purely geographical approach, we can increase the coverage rate of the results (from 89.3% to 99.6%).

    In the end, we get these results:

               Amount of locations   Aligned   Non-aligned
    Group 1    97572                 (99.6%)   (0.4%)
    Group 2    150528                (72.9%)   (27.1%)
    Total      248100                (83.4%)   (16.6%)
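
    The aligned share of the Total row is the average of the two groups weighted by their size. A quick check in Python, using only the group sizes and aligned percentages from the table above:

```python
# (number of locations, aligned percentage) per group, taken from the table
groups = {
    "Group 1": (97572, 99.6),
    "Group 2": (150528, 72.9),
}

total = sum(count for count, _ in groups.values())
aligned = sum(count * pct / 100 for count, pct in groups.values())
total_pct = round(100 * aligned / total, 1)

print(total, total_pct)
```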

  • Application to the dataconnexions#2 contest

    2012/12/20 by Nicolas Chauvat

    On behalf of the community of CubicWeb users and developers, I have just submitted the following application to the dataconnexions#2 contest.

    1. Project description questionnaire

    Project title

    CubicWeb - a free software development platform for the semantic web

    Chosen contest category

    Choose among: General public / Professional / Public utility / Mobility and territories

    Public utility (?)

    What problem are you trying to solve?

    Describe the problem(s) your project is trying to solve, and its (their) importance: market size, potential frequency of use, population concerned, possible public service benefits, etc. (1000 characters maximum).

    The advent of the semantic web and of Open Data calls for suitable tools to develop data-centric applications.

    These tools must make it easy to import data, to link data coming from disjoint sources, to republish it, and to query and visualize it.

    Ideally, these tools should use and comply with the open standards of the internet in order to simplify communication and exchange, and also to ease development for multiple devices (computer, tablet, smartphone).

    How are you trying to solve it?

    Describe your product, service or visualization, in its current form and, where applicable, after any future developments you are considering. Specify the public dataset(s) you use for this purpose (1000 characters maximum).

    CubicWeb is a free software development platform for the semantic web.

    CubicWeb lets developers focus on the specifics of their application rather than having to reinvent the essential building blocks for importing, merging, publishing, querying and visualizing data.

    CubicWeb is free software developed in the open on the internet by a small but already international community. CubicWeb is available under the LGPL license, complies with W3C standards (RDF, SPARQL, HTML5, CSS3, Responsive Design) and natively handles several data models that act as de facto standards (FOAF, SIOC, DOAP, etc).

    What is your business model?

    Describe the business model of your project, that is to say the conditions for its sustainability and growth: business plan and sales projections for an entrepreneurial project; objectives, key donors and stakeholders for a civic project (1000 characters maximum).

    Several commercial companies rely on CubicWeb today to sell IT services. The goal of this community is to grow in order to reach a wider audience and to share more of the maintenance and development costs of the CubicWeb platform.

    CubicWeb users currently include the Bibliothèque nationale de France, EDF, GDF-Suez, the Commissariat à l'Energie Atomique, the Centre National d'Etudes Spatiales, the Institut Radioprotection et Sûreté Nucléaire, INRIA, medical research laboratories and companies in the IT sector.

    How far along is your project?

    Describe the milestones you have reached, the resources mobilized, the indicators and metrics already established, etc. (1000 characters maximum).

    The CubicWeb project stems from an R&D effort started in 2001 by the company Logilab, which aimed to equip itself with a tool for developing data-centric applications that comply with the semantic web standards being drafted at the W3C.

    Since 2008, CubicWeb has been free software whose development is carried out in the open on the internet.

    Who is supporting you on this project?

    Describe the team supporting you in your project (if any), your skills, experience and achievements, as well as any partners backing you (1000 characters maximum).

    N/A.

    How can DataConnexions help you?

    Give any additional details you would like to provide about your project, and explain how DataConnexions can help sustain its development (1000 characters maximum).

    Several commercial companies rely on CubicWeb today to sell IT services. The industrial uses of CubicWeb are varied and involve important, even critical, applications.

    CubicWeb is a little-known (and little-recognized) tool and its community is small today, despite its solid references and the recent enthusiasm for Open Data.

    DataConnexions could be a platform and a showcase allowing CubicWeb to find new application developers who would rather benefit from the experience accumulated in this free tool than rediscover and work around, one by one, the pitfalls encountered during the ten years it took to build it.

    The goal of this application is therefore to grow the community of CubicWeb users and contributors.

    2. Presentation video

    Link to download a video describing the Project and its features, with a maximum length of 3 minutes

    It is not the quality of the video that is judged, but the project itself. The video must give an account of the project's features. Applicants are encouraged to make a screen capture or a "screencast" (for example with tools such as CamStudio, Jing or Screenr).

    Demonstration of using CubicWeb to import and visualize the list of French railway stations downloaded from data.gouv.fr. Selection of stations with the facet filter and display on an openstreetmap background map, then export to RDF, JSON and CSV.

    CubicWeb is a free software development platform for the semantic web, which lets developers focus on the specifics of their application rather than having to reinvent the essential building blocks for importing, merging, publishing, querying and visualizing data.

    Link to the video on youtube. Mirror of the video on vimeo.com.

    3. Online access to the project

    Link to access the Project, or the compiled and interpretable computer code of the Project

    For example: URL to view or, where applicable, download the application, along with instructions for doing so if necessary. The application must be easy to install and easy to demonstrate on its target platform.

    http://www.cubicweb.org

    4. Communication materials

    Non-confidential description

    Describe the Project in terms suitable for release to the general public: non-confidential, understandable by as many people as possible, and highlighting the value of the project (1000 characters maximum).

    See "How are you trying to solve it?"

    Visual description

    Link to a visual element describing and showcasing the project and its features (screenshot, home page, descriptive diagram).

    /file/2544364?vid=download

    Project logo

    Link to the project logo.

    /file/2544362?vid=download

  • Links roundup from dotjs.eu

    2012/12/05 by Arthur Lutz

    A few people from Logilab attended the dotjs conference in Paris last week. The conference wasn't exactly what we expected: we were hoping for more technical talks. Nevertheless, some of the things we saw were quite interesting, and some of them could be relevant to CubicWeb.

    http://www.cubicweb.org/file/2532779?vid=download

    Here is a raw roundup of the links collected last Friday:

    • Chrome developer tools
    • yeoman
    • grunt.js
    • backbone.js
    • Dart
    • TypeScriptLang
    • Express.js
    • Mocha
    • Testacular
    • SASS
    • Angular.js
    • Enyo.js
    • Socket.io
    • when.js
    • Coffeescript
    • Source Maps explained


  • CubicWeb sprint in Paris - 2012/12/13-14

    2012/11/11 by Nicolas Chauvat

    Topics

    To be decided. Some possible topics are :

    • Work on CubicWeb front end : Anything related to Themaintemplate, primaryview, reledit, tables handling etc.
    • Share the evolution of the OrbUI project and work on integrating it further into CW
    • Things to do for HTML5 and bootstrap integration
    • Work on ideas from Thoughts on CubicWeb 4
    • ...

    Other ideas are welcome; please bring them up on cubicweb@lists.cubicweb.org

    Location

    This sprint will take place in December 2012, from Thursday the 13th to Friday the 14th. You are more than welcome to come along, help out and contribute. An introduction is planned for newcomers.

    Network resources will be available for those bringing laptops.

    Address : 104 Boulevard Auguste-Blanqui, Paris. Ring "Logilab" (googlemap)

    Metro : Glacière

    Contact : http://www.logilab.fr/contact

    Dates : 13/12/2012 to 14/12/2012

    Participants

    • Celso Flores (Crealibre - Mexico)
    • Carine Fourrier (Crealibre - Mexico)
    • ...

  • Building your URLs in cubicweb

    2012/09/25 by Stéphane Bugat

    Aim

    In cubicweb, you often have to build URLs that redirect the current view to a specific entity view or trigger a given action. Moreover, you often also want to fall back to the previous view once the action or edition is done, or to redirect to another entity's specific view.

    To do so, cubicweb provides a set of powerful tools. However, as there is often more than one way to do it, this blog entry is here to help you choose the preferred way.

    Tools at your disposal

    The universal URL builder: build_url()

    build_url is accessible in any context. For instance, in the rendering of a given entity view you can call self._cw.build_url to build your URLs easily, which is the most common case. In class methods (for instance, when declaring the rendering methods of an EntityTableView), you can access it through the context of the instantiated appobjects that are usually given as arguments, e.g. entity._cw.build_url. For test purposes you can also call session.build_url in cubicweb shells.

    build_url basically takes an optional first argument, the path relative to the base URL of the site, and arbitrary named arguments that will be encoded as URL parameters. Unless you wish to direct to a custom controller, or to match a URL rewrite rule, you don't have to specify the path.
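
    To make this behaviour concrete, here is a standalone sketch of what such a URL builder does. It is not CubicWeb's implementation: sketch_build_url, the example base URL and the default 'view' path are assumptions made for the illustration, and only the standard library is used.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8080/"  # hypothetical instance base URL


def sketch_build_url(path="view", **params):
    """Join the base URL, a relative path and URL-encoded named arguments."""
    url = BASE_URL + quote(path)
    if params:
        # sorted() only makes the output deterministic for this example
        url += "?" + urlencode(sorted(params.items()))
    return url


# No path given: only named arguments, encoded as URL parameters.
print(sketch_build_url(vid="my_view_id", eid=42))
# Explicit path, e.g. to reach a custom controller.
print(sketch_build_url("my_controller_id", arg1="a b", arg2=2))
```

    The point of the sketch is the division of labour: the optional positional argument selects the path, and everything passed by name ends up in the query string.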

    Extra parameters given to build_url will vary according to your needs. The most common arguments understood by default cubicweb views are the following:

    • vid: the __regid__ of the view to build;
    • rql: the RQL query used to retrieve the data on which the view should be applied;
    • eid: the identifier of an entity, which you should use instead of rql when the view applies to a single entity (most often);
    • __message: an information message to display inside the view;
    • __linkto: in case of an entity creation URL, allows setting some specific relations between both entities;
    • __redirectpath: the path of the entity to redirect to;
    • __redirectvid: the view id of the redirection.

    __redirectvid and __redirectpath are used to control redirection after posting a form and are described in more detail in the chapter of the cubicweb documentation about edition control (http://docs.cubicweb.org/devweb/edition/editcontroller.html).

    Exploring entities associated URLs

    Generally, an entity has two important methods that retrieve its absolute or relative urls:

    • entity.rest_path() will return something like <type>/<eid> where <type> corresponds to the entity type and <eid> the entity eid;
    • entity.absolute_url() will return the full url of the entity http://<baseurl>/<type>/<eid>. In case you want to access a specific view of the entity, just pass the vid='myviewid' argument. You can give arbitrary arguments to this method that will be encoded as url parameters.

    Getting a proper RQL

    Passing the rql to the build_url method requires a proper RQL expression. To get one, there is a convenience method, printable_rql(), that is available on the rset resulting from an RQL query. This allows applying a view to the same result set as the one currently being processed, simply using rql = self.cw_rset.printable_rql().

    Getting URLs from the current view

    There are several ways to get the URL of the current view, the canonical one being self._cw.relative_path(includeparams=True), which returns the path of the current view relative to the base URL of the site (otherwise use self._cw.url(), with or without parameters according to the value given as includeparams).

    You can also retrieve the values given to individual parameters using self._cw.form, e.g.:

    • self._cw.form.get('vid', '') will return only the view id;
    • self._cw.form.get('rql', '') will return only the RQL;
    • self._cw.form.get('__redirectvid', '') will return the redirection view if defined;
    • self._cw.form.get('__redirectpath', '') will return the redirection path if defined.
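
    The lookups above are plain dictionary accesses with a default value. The standalone sketch below mimics them with the standard library only; sketch_form and the example URL are hypothetical stand-ins, not part of CubicWeb.

```python
from urllib.parse import parse_qsl, urlsplit


def sketch_form(url):
    """Return the query parameters of a URL as a plain dict (last value wins)."""
    return dict(parse_qsl(urlsplit(url).query))


form = sketch_form(
    "http://localhost:8080/view?vid=contacts_tableview&rql=Any+X+WHERE+X+is+Person")

print(form.get("vid", ""))
print(form.get("rql", ""))
# A parameter that was not given falls back to the default, here an empty string.
print(repr(form.get("__redirectvid", "")))
```

    Using get() with a default, rather than indexing, is what lets a view cope with URLs where the redirection parameters are absent.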

    How to redirect to a non-entity view?

    This case often appears when you want to create a link to a startup view or a controller. In the first case, you simply build your URL like this:

    self._cw.build_url('view', vid='my_view_id')
    

    The latter case appears when you want to call a controller directly without having to define a form in your view. This can happen for instance when you want to create a URL that will set a relation between 2 objects and do not need any confirmation for that. The URL construction is done like this:

    self._cw.build_url('my_controller_id', arg1=value1, arg2=value2, ...)
    

    Any extra arguments passed to the build_url method will be available in the controller as key/value pairs of the self._cw.form dictionary. This is especially useful when you want to define some kind of hidden attributes but there is no form to put them into.

    And, last but not least, a convenient way to get the root URL of the instance:

    self._cw.base_url()
    

    Some concrete cases

    Get the URL of the outofcontext view of an entity:

    link = entity.absolute_url(vid='outofcontext')
    

    Create a link to a given controller then fall back to the current view:

    • In your entity view:
    self.w(u'<a href="%s">Click me</a>' % xml_escape(
            self._cw.build_url('mycontrollerid',
                    arg1=value1, arg2=value2,
                    rql=self.cw_rset.printable_rql(),
                    __redirectvid=self._cw.form.get('vid',''))))
    
    • In your controller:
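    # assumes: from cubicweb.web import Redirect
    def publish(self, rset=None):
        value1, value2 = self._cw.form['arg1'], self._cw.form['arg2']
        # do some stuff with value1 and value2 here...
        # then go back to the view we came from, with an information message
        raise Redirect(self._cw.build_url(rql=self._cw.form['rql'],
            vid=self._cw.form['__redirectvid'],
            __message=self._cw._('your message')))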
    

    Create a link to add a given entity and relate this entity to the current one with a relation 'child_of', then go back to the current entity's view:

    entity = self.cw_rset.get_entity(0,0)
    self.w(u'<a href="%s">Click me</a>' % xml_escape(
            self._cw.build_url('add/Mychildentity',
                    __linkto='child_of:%s:object' % entity.eid,
                    __redirectpath=entity.rest_path(),
                    __redirectvid=self._cw.form.get('vid', ''))))
    

    Same example, but we suppose that we are in a multiple rset entity view, and we want to go back afterwards to this view:

    entity = self.cw_rset.get_entity(0,0)
    self.w(u'<a href="%s">Click me</a>' % xml_escape(
            self._cw.build_url('add/Mychildentity',
                    rql=self.cw_rset.printable_rql(),
                    __linkto='child_of:%s:object' % entity.eid,
                    __redirectvid=self._cw.form.get('vid', ''))))
    

    Create links to all 'menuactions' in a view:

    actions = self._cw.vreg['actions'].possible_actions(self._cw, rset=self.cw_rset)
    action_links = [unicode(self.action_link(x)) for x in actions.get('menuactions', ())]
    self.w( u'  |  '.join(action_links))
    

  • How to create your own forms and controllers?

    2012/09/05 by Stéphane Bugat

    Aim

    Sometimes you need to associate your own specific form and controller with a given view. In this blog entry we will see how this can be done in cubicweb, using a concrete case.

    The case

    Let's suppose you're working on a social network project where you have to develop friend-of-a-friend (foaf) relationships between persons. For that purpose, we use the cubicweb-person cube and create in our schema relations between persons like X in_contact_with Y:

    from yams.buildobjs import RelationDefinition

    class in_contact_with(RelationDefinition):
        subject = 'Person'
        object = 'Person'
        cardinality = '**'
        symmetric = True
    

    We will also assume that a given Person corresponds to a unique CWUser through the relation is_user.

    Although it may not be obvious, we would like any connected person to be able to disconnect from another person at any time. For that, we will create a table view that displays the list of connected users, with a custom column giving the ability to "disconnect" from the person.

    Before disconnecting from this particular person, we would also like to have a confirmation form.

    How to proceed

    The following steps were defined to address the above issue:

    1. Define a "contact view" that will display the list of known contacts of the connected user ;
    2. In this contact view, allow the user to click on a specific contact so as to remove him ;
    3. Create a deletion confirmation view, that will contain:
      • A form holding the buttons for deletion confirmation or cancel;
      • A controller responsible for the actual deletion or the cancelling.

    The contact view

    Rendering a table view of connected persons

    To display the list of persons connected to the current person, and also to add custom columns that do not refer to attributes of a given entity, the best choice is to use EntityTableView (see here for more information):

    class ContactView(EntityTableView):
        __regid__ = 'contacts_tableview'
        __select__ = is_instance('Person')
        columns = ['person', 'firstname', 'surname', 'email', 'phone', 'remove']
        layout_args = {'display_filter': 'top', 'add_view_actions': None}
    
        def cell_remove(w, entity):
            """link to the suppression of the relation between both contacts"""
            icon_url = entity._cw.data_url('img/user_delete.png')
            action_url = entity._cw.build_url(eid=entity.eid,
                    vid='suppress_contact_view',
                    __redirectpath=entity._cw.relative_path(),
                    __redirectvid=entity._cw.form.get('__redirectvid', ''))
            w(u'<a href="%(actionurl)s" title="%(title)s">'
                    u'<img alt="%(title)s" src="%(url)s" /></a>'
                    % {'actionurl': xml_escape(action_url),
                       'title': _('remove from contacts'),
                       'url':icon_url})
    
        column_renderers = {
                'person': MainEntityColRenderer(),
                'email': RelatedEntityColRenderer(
                    getrelated=lambda x:x.primary_email and x.primary_email[0] \
                            or None),
                'phone': RelatedEntityColRenderer(
                    getrelated=lambda x:x.phone and x.phone[0] or None),
                'remove': EntityTableColRenderer(
                    renderfunc=cell_remove,
                    header=''),}
    

    A few explanations about the above view:

    • By default, the columns attribute contains a list of displayable attributes of the entity. If one element of the list does not correspond to an attribute, which is the case for 'remove' here, it must have a rendering function defined in the column_renderers dictionary.
    • However, when the column header refers to an attribute of a related entity, we can simply use the RelatedEntityColRenderer rendering function, as is the case for the email and phone columns.
    • As for the 'remove' column, we render a clickable image in the cell_remove function. Here we have chosen an icon from famfamsilk that we put in our data/ directory, but feel free to choose a predefined icon from the cubicweb shared data directory.

    The URL associated with each image has to be a link to a specific action allowing the user to remove the selected person from their contacts. It is built using the build_url() convenience function, reached here through entity._cw. The redirection view, 'suppress_contact_view', will be defined later on. The eid argument refers to the id of the contact person the user wants to remove.

    Calling the contact view

    The above view has to be called with an rset that corresponds to the list of known contacts of the connected user. In our case, we have defined a StartupView for contact management, and added the following piece of code to its call method:

    person = self._cw.user.related('is_user', 'object').get_entity(0,0)
    rset = self._cw.execute(
            'Any X WHERE X is Person, X in_contact_with Y, '
            'Y eid %(eid)s', {'eid': person.eid})
    self.w(u'<h3>' + _('Number of contacts in my network:'))
    self.w(unicode(len(rset)) + u'</h3>')
    if len(rset) != 0:
        self.wview('contacts_tableview', rset)
    

    The Person corresponding to the connected user is retrieved thanks to the use of the related method and the is_user relation. The contact table view is displayed inside the parent StartupView.

    Creation of the deletion confirmation view

    Defining the confirmation view for contact deletion

    The corresponding view is a simple View class instance, that will display a confirmation message and the related buttons. It could be defined as follows:

    class SuppressContactView(View):
        __regid__ = 'suppress_contact_view'
    
        def cell_call(self, row, col):
            entity = self.cw_rset.get_entity(row, col)
            msg = self._cw._('Are you sure you want to remove %(name)s from your contacts?')
            self.w(u'<p>' + msg % {'name': entity.dc_long_title()} + u'</p>')
            form = self._cw.vreg['forms'].select('suppress_contact_form',
                    self._cw, rset=self.cw_rset)
            form.add_hidden(u'eidto', entity.eid)
            form.add_hidden(u'eidfrom', self._cw.user.related('is_user',
                'object').get_entity(0,0).eid)
            form.render(w=self.w)
    

    Inside the cell_call() method of this view, we render a form whose purpose is to display both buttons (confirm deletion or cancel deletion). This form will be described later on.

    The Person contact to remove is retrieved easily thanks to cw_rset. The Person corresponding to the connected user is here also retrieved thanks to the is_user relation. To make both of them available in the form, we add them when the form is instantiated, using the convenience function add_hidden(key, val).

    Defining the deletion form

    The deletion form, as mentioned previously, is only here to hold both buttons for confirming or cancelling the deletion. Both buttons are declared through the form_buttons attribute of the form, which derives from forms.FieldsForm:

    class SuppressContactForm(forms.FieldsForm):
        __regid__ = 'suppress_contact_form'
        domid = 'delete_contact_form'
        form_renderer_id = 'base'
    
        @property
        def action(self):
            return self._cw.build_url('suppress_contact_controller')
    
        form_buttons = [
                fw.Button(stdmsgs.BUTTON_DELETE, cwaction='delete'),
                fw.Button(stdmsgs.BUTTON_CANCEL, cwaction='cancel')]
    

    Specifying a given domid ensures that your form has a specific DOM identifier, so that the controller defined in the action method is called without any ambiguity. The form_renderer_id is specified here so as to avoid displaying additional information that does not make sense in this context.

    Defining the controller

    The custom controller derives from the Controller class in cubicweb.web.controller. The declaration of the controller should have the same domid as the calling form, as mentioned previously. The related actions are described in the publish() method of the controller:

    class SuppressContactController(Controller):
        __regid__ = 'suppress_contact_controller'
        domid = 'delete_contact_form'

        def publish(self, rset=None):
            if '__action_cancel' in self._cw.form:
                msg = self._cw._('Deletion canceled')
                raise Redirect(self._cw.build_url(
                    vid='contact_management_view',
                    __message=msg))
            elif '__action_delete' in self._cw.form:
                # eidfrom is the connected user's Person, eidto the contact
                # to remove; form values arrive as strings, hence the casts
                xid = int(self._cw.form['eidfrom'])
                yid = int(self._cw.form['eidto'])
                # the confirmation message must name the removed contact
                dead_contact = self._cw.entity_from_eid(yid)
                self._cw.execute(
                        'DELETE X in_contact_with Y'
                        '  WHERE X eid %(xid)s, Y eid %(yid)s',
                        {'xid': xid, 'yid': yid})
                msg = self._cw._('%s removed from your contacts') %\
                    dead_contact.dc_long_title()
                raise Redirect(self._cw.build_url(
                    vid='contact_management_view',
                    __message=msg))
    

    The user action is retrieved by testing whether the key '__action_<action>', where <action> refers to the cwaction in the button declaration, is present in the form keys. In the case of a cancellation, we simply redirect to the contact management view with a message specifying that the deletion has been cancelled. In the case of a deletion confirmation, the Person ids of both the connected user and the contact to remove are retrieved from the form's hidden arguments.
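
    The key test described above can be isolated in a few lines. The sketch below is a standalone illustration, not CubicWeb code: pressed_action and the sample form dictionaries are hypothetical, and only the '__action_<action>' key convention comes from the controller shown earlier.

```python
def pressed_action(form, known_actions=("delete", "cancel")):
    """Return the name of the button whose '__action_<name>' key is in the form."""
    for action in known_actions:
        if "__action_%s" % action in form:
            return action
    return None


# Hypothetical form contents after the user clicked the delete button.
print(pressed_action({"__action_delete": "delete", "eidfrom": "12", "eidto": "34"}))
# No button key at all: the caller has to decide what to do.
print(pressed_action({"eidfrom": "12", "eidto": "34"}))
```

    Returning None when no known key is present makes the "neither button" case explicit, a case the publish() method above silently ignores.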

    The deletion is performed using an RQL query on the in_contact_with relation. We then redirect to the contact management view, this time with another message confirming the deletion of the contact link.


  • Logilab at the LawFactory

    2012/07/16 by Vincent Michel

    We have been playing with political data for a while, using CubicWeb to store and query various sets of open data (e.g. NosDeputes, data.gouv.fr), and testing different visualization tools. In particular, we have extended our News Analysis prototype (see the presentation we gave last year at Euroscipy) in order to use these political datasets as a reference for the named entity extraction part. Last week's conference "The Law Factory" at Sciences Po was a really nice opportunity to meet people with similar interests in open data for political science, and to find out which questions we should be asking our data! Check out our presentation and a few screencasts (no sound):

    Comments are welcome !

    Interesting things seen at #OLPC

    Among the different things that we have seen, we want to highlight:

    • Law is Code (http://gitorious.org/law-is-code/) - This project by the Regards Citoyens team aims at analysing laws and amendments by extracting information from the French National Assembly website and pushing the contributions of the members of parliament to a given law into a git repository. If we can find the time, we'll turn that into a mercurial repository and integrate it into our above application using cubicweb-vcsfile.
    http://www.cubicweb.org/file/2423768?vid=download
    • Neither national website (Assemblée Nationale, Sénat) allows (yet...) getting data any other way than by parsing the site. However, it seems that the people involved are aware of the open data issues, and this may change in the coming months. In particular, the Sénat uses two databases (Basile and Ameli), and opening them to the public could be really interesting.
    • Different projects about African parliaments can be found on the following website: http://www.parliaments.info
    • Check out ITCparliament, which provides tools to analyse and share data from many different parliaments.

    On Saturday, at La Cantine Numérique, the discussions focused on the possibilities for sharing tools and on possible collaborations. I think that this is the crucial point: how can people share tools and use them efficiently without being IT experts?

    How does this inspire us for CubicWeb ?

    With this in mind, we are thinking about some evolutions of CubicWeb that could fulfill (part of) these requirements:

    • easier installation, especially on Windows, and easier Postgresql configuration. This could perhaps be achieved by providing a graphical interface for creating/managing the instances and the databases.
    • a graphical tool for schema construction. Even if building a data model in CubicWeb is quite simple and relies on straightforward Python syntax, it could be interesting to offer a graphical tool for adding/removing/modifying entities in the schema, as well as attributes or relations.
    • easier ways to import data. This point is not trivial, and we don't want to develop a specific language for defining import rules that could be used for 80% of the cases but would be painful to extend to the 20% of exotic cases. We would rather develop some helpers to ease the writing of import scripts in Python, and publish some CubicWeb instances already filled with open databases.

    Demo of CubicWeb as a follow up

    As a follow up to the conference, we are opening a demo site using CubicWeb to expose data from the past legislative and presidential elections (2002, 2007, 2012).

    https://www.cubicweb.org/file/2425136?&vid=download

    The data used is published under Licence Ouverte / Open Licence by http://data.gouv.fr.

    This demo site allows you to explore the data in depth, with different visualisations and complex queries. Again, comments are welcome, especially if you want to retrieve some information but you don't know how to! This demo site will probably evolve in the coming weeks, and we will use it to test different cubes that we have been building.

    PS: We are sorry we cannot open the news aggregator prototype for now, as there are still licensing issues concerning the reusability of the different news sources that we get articles from.


  • What's new in CubicWeb 3.15

    2012/05/14 by Sylvain Thenault

    CubicWeb 3.15 introduces a bunch of new functionalities. In short (more details below):

    • ability to use ZMQ instead of Pyro to connect to repositories
    • ZMQ inter-instances messages bus
    • new LDAP source using the datafeed approach, much more flexible than the legacy 'ldapuser' source
    • full undo support

    Plus some refactorings regarding Ajax function calls, WSGI, the registry, etc. Read on for the details.

    New functionalities

    • Add ZMQ server, based on the cutting edge ZMQ socket library. This allows accessing distant instances, in a similar way to Pyro.
    • Publish/subscribe mechanism using ZMQ for communication among cubicweb instances. The new zmq-address-sub and zmq-address-pub configuration variables define where this communication occurs. As of this release this mechanism is used for entity cache invalidation.
    • Improved WSGI support. While there are still some caveats, most of the code which was Twisted-only is now generic and allows related functionalities to work with a WSGI front-end.
    • Full undo/transaction support: undo of modifications has finally been implemented, and the configuration simplified (basically you activate it or not on an instance basis).
    • Controlling HTTP status code returns is now much easier:
      • WebRequest now has a status_out attribute to control the response status ;
      • most web-side exceptions take an optional status argument.

    API changes

    • The base registry implementation has been moved to a new logilab.common.registry module (see #1916014). This includes code from :

      • cubicweb.vreg (everything that was in there)
      • cw.appobject (base selectors and all).

      In the process, some renaming was done:

      • the top level registry is now RegistryStore (was VRegistry), but that should not impact CubicWeb client code;
      • former selector functions are now known as "predicates", though you still use predicates to build an object's selector;
      • for consistency, the objectify_selector decorator has hence been renamed to objectify_predicate;
      • on the CubicWeb side, the selectors module has been renamed to predicates.

      The debugging refactoring dropped the need for the lltrace decorator. There should be full backward compatibility with proper deprecation warnings. Note that the yes predicate and the objectify_predicate decorator, as well as the traced_selection function, should now be imported from the logilab.common.registry module.

    • All login forms are now submitted to <app_root>/login. Redirection to requested page is now handled by the login controller (it was previously handled by the session manager).

    • Publisher.publish has been renamed to Publisher.handle_request. This method now contains a generic version of the logic previously handled by Twisted. Controller.publish is not affected.
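
    To make the predicate vocabulary used above more concrete, here is a toy, self-contained sketch. It is not the logilab.common.registry code: the Predicate class, the as_predicate decorator and select_best are hypothetical names. It assumes that a predicate returns an integer score (0 meaning "does not apply"), that combining two predicates with & sums their scores when both match, and that the candidate with the highest score is selected.

```python
class Predicate:
    """Toy predicate: wraps a function returning an integer score (0 = no match)."""

    def __init__(self, func):
        self.func = func

    def __call__(self, context):
        return self.func(context)

    def __and__(self, other):
        # Both predicates must match; their scores are then summed.
        def both(context):
            left, right = self(context), other(context)
            return left + right if left and right else 0
        return Predicate(both)


def as_predicate(func):
    """Hypothetical decorator turning a plain function into a Predicate."""
    return Predicate(func)


@as_predicate
def always(context):
    return 1


@as_predicate
def is_admin(context):
    return 2 if context.get("user") == "admin" else 0


def select_best(candidates, context):
    """Return the name of the candidate with the highest non-zero score, or None."""
    scored = [(predicate(context), name) for name, predicate in candidates]
    scored = [item for item in scored if item[0] > 0]
    return max(scored)[1] if scored else None


candidates = [("default_view", always), ("admin_view", always & is_admin)]
print(select_best(candidates, {"user": "admin"}))
print(select_best(candidates, {"user": "guest"}))
```

    The more specific a selector is, the higher it scores, so a specialised object can take precedence over a generic one without the generic one knowing about it.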

    Unintrusive API changes

    • New 'ldapfeed' source type, designed to replace 'ldapuser' source with data-feed (i.e. copy based) source ideas.
    • New 'zmqrql' source type, similar to 'pyrorql' but using ømq instead of Pyro.
    • A new registry called 'services' has appeared, where you can register server-side cubicweb.server.Service child classes. Their call method can be invoked from a web-side AppObject instance using the new self._cw.call_service method or a server-side one using self.session.call_service. This is a new way to call server-side methods, much cleaner than monkey patching the Repository class, which becomes a deprecated way to perform similar tasks.
    • a new ajaxfunction registry now hosts all remote functions (i.e. functions callable through the asyncRemoteExec JS api). A convenience ajaxfunc decorator will let you expose your python functions easily without all the appobject standard boilerplate. Backwards compatibility is preserved.
    • the 'json' controller is now deprecated in favor of the 'ajax' one.
    • WebRequest.build_url can now take a __secure__ argument. When True, cubicweb tries to generate an https url.
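
    The idea shared by the new 'services' and ajaxfunction registries (callables registered under a name and invoked by that name) can be sketched in plain Python. The code below is an illustration only and does not use the CubicWeb API: Registry, its register decorator and the word_count service are hypothetical.

```python
class Registry:
    """Toy name -> callable registry (hypothetical, for illustration only)."""

    def __init__(self):
        self._entries = {}

    def register(self, name=None):
        # Used as a decorator; defaults to the function's own name.
        def decorator(func):
            self._entries[name or func.__name__] = func
            return func
        return decorator

    def call(self, name, **kwargs):
        if name not in self._entries:
            raise LookupError("no service registered under %r" % name)
        return self._entries[name](**kwargs)


services = Registry()


@services.register("word_count")
def count_words(text):
    return len(text.split())


print(services.call("word_count", text="full undo support"))
```

    Callers only depend on the registered name, which is what makes this approach cleaner than monkey patching a central class.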

    User interface changes

    A new 'undohistory' view exposes the undoable transactions and gives access to undo some of them.


  • Thoughts on CubicWeb 4.0

    2012/05/14 by Sylvain Thenault

    This is a fairly technical post talking about the structural changes I would like to see in CubicWeb's near future. Let's call that CubicWeb 4.0! It also drafts ideas on how to go from here to there. Draft, really. But that will eventually turn into a nice roadmap hopefully.

    The great simplification

    Some parts of cubicweb are sometimes too hairy for different reasons (some good, most bad). This makes it harder to get started quickly. The goal of CubicWeb 4.0 should be to make things simpler:

    • Fix some bad old design.
    • Stop reinventing the wheel and use widely used libraries in the Python Web World. This extends to benefitting from state of the art libraries to build nice and flexible UI such as Bootstrap, on top of the JQuery foundations (which could become as prominent as the Python standard library in CubicWeb, the development team should get ready for it).
    • If there is a best way to do something, just do it and refrain from providing configurability and options.

    On the road to Bootstrap

    First, a few simple things could be done to simplify the UI code:

    • drop xhtml support: always return text/html content type, stop bothering with this stillborn stuff and use html5
    • move away everything that should not be in the framework: calendar?, embedding, igeocodable, isioc, massmailing, owl?, rdf?, timeline, timetable?, treeview?, vcard, wdoc?, xbel, xmlrss?

    Then we should probably move the default UI into some cubes (i.e. the content of cw.web.views and cw.web.data). Besides making the move to Bootstrap easier, this should also have the benefit of making clearer that this is the default way to build an (automatic) UI in CubicWeb, but one may use other, more usual, strategies (such as using a template language).

    At a first glance, we should start with the following core cubes:

    • corelayout, the default interface layout and generic components. Modules to backport there: application (not an appobject yet), basetemplates, error, boxes, basecomponents, facets, ibreadcrumbs, navigation, undohistory.
    • coreviews, the default generic views and forms. Modules to backport there: actions, ajaxedit, baseviews, autoform, dotgraphview, editcontroller, editforms, editviews, forms, formrenderers, primary, json, pyviews, tableview, reledit, tabs.
    • corebackoffice, the concrete views for the default back-office that let you handle users, sources, debugging, etc. through the web. Modules to backport here: cwuser, debug, bookmark, cwproperties, cwsources, emailaddress, management, schema, startup, workflow.
    • coreservices, the various services, not directly related to display of something. Modules to backport here: ajaxcontroller, apacherewrite, authentication, basecontrollers, csvexport, idownloadable, magicsearch, sessions, sparql, staticcontrollers, urlpublishing, urlrewrite.

    This is a first draft that will need some adjustments. Some of the listed modules should be split (e.g. actions, boxes) and their content moved to different core cubes. Also some modules in cubicweb.web packages may be moved to the relevant cube.

    Each cube should provide an interface so that one could replace it with another one. For instance, move from the default coreviews and corelayout cube to bootstrap based ones. This should allow a nice migration path from the current UI to a Bootstrap based UI. Bootstrap should probably be introduced bottom-up: start using it for tables, lists, etc. then go up until the layout defined in the main template. The Orbui experience should greatly help us by pointing at hot spots that will have to be tackled, as well as by providing a nice code base from which we should start.

    Regarding current implementation, we should take care that Contextual components are a powerful way to build "pluggable" UI, but we should probably add an intermediate layer that would make more obvious / explicit:

    • what the available components are
    • what the available slots are
    • which component should go in which slot when possible

    Also at some point, we should take care to separate view's logic from HTML generation: our experience with client works shows that a common need is to use the logic but produce a different HTML. Though we should wait for more use of Bootstrap and related HTML simplification to see if the CSS power doesn't somewhat fulfill that need.

    On the road to proper tasks management

    The current looping task / repo thread mechanism is used for various sorts of things and has several problems:

    • tasks don't behave similarly in a multi-instance configuration (some should be executed in a single instance, some in a subset); the tasks system was originally written in a single instance context; as of today this is (sometimes) handled using configuration options (that will have to be properly set in each instance configuration file);
    • the tasks API is repository-only, but we also need web-side tasks;
    • there is probably some abuse of the system that may lead to unnecessary resources usage.

    Analyzing a sample http://www.logilab.org/ instance, below are the running looping tasks by category. Tasks that have to run on each web instance:

    • clean_sessions, automatically closes unused repository sessions. Notice cw.etwist.server also records a twisted task to clean web sessions. Some changes are imminent on this, they will be addressed in the upcoming refactoring session (that will become more and more necessary to move on several points listed here).
    • regular_preview_dir_cleanup (preview cube), cleanup files in the preview filesystem directory. Could be executed by a (some of the) web instance(s) provided that the preview directory is shared.

    Tasks that should run on a single instance:

    • update_feeds, update copy based sources (e.g. datafeed, ldapfeed). Controlled by 'synchronize' source configuration (persistent source attribute that may be overridden by instance using CWSourceHostConfig entities)
    • expire_dataimports, delete CWDataImport entities older than an amount of time specified in the 'logs-lifetime' configuration option. Not controlled yet.
    • cleanup_auth_cookies (rememberme cube), delete CWAuthCookie entities whose life-time is exhausted. Not controlled yet.
    • cleaning_revocation_key (forgotpwd cube), delete Fpasswd entities with past revocation_date. Not controlled yet.
    • cleanup_plans (narval cube), delete Plan entities older than an amount of time specified in the configuration. If 'plan-cleanup-delay' is set to an empty value, the task isn't started.
    • refresh_local_repo_caches (vcsfile cube), pull or clone vcs repositories cache if the Repository entity ask to import_revision_content (hence web instance should have up to date cache to display files content) or if 'repository-import' configuration option is set to 'yes'; import vcs repository content as entities if 'repository-import' configuration option and it is coming from the system source.
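
    The distinction between tasks that run on every instance and tasks that must run on a single one can be made explicit in code. The following is only a sketch of one possible design, not the existing looping task API: Task, TaskRunner, the two scope constants and the is_primary flag are hypothetical names, and the intervals are arbitrary.

```python
from dataclasses import dataclass
from typing import Callable, List

EVERY_INSTANCE = "every-instance"    # hypothetical scope names
SINGLE_INSTANCE = "single-instance"


@dataclass
class Task:
    name: str
    interval: float               # seconds between two runs
    scope: str
    func: Callable[[], None]
    last_run: float = float("-inf")


class TaskRunner:
    """Runs due tasks; single-instance tasks only run where is_primary is True."""

    def __init__(self, is_primary):
        self.is_primary = is_primary
        self.tasks: List[Task] = []

    def add(self, name, interval, scope, func):
        self.tasks.append(Task(name, interval, scope, func))

    def run_due(self, now):
        ran = []
        for task in self.tasks:
            if task.scope == SINGLE_INSTANCE and not self.is_primary:
                continue
            if now - task.last_run >= task.interval:
                task.func()
                task.last_run = now
                ran.append(task.name)
        return ran


def noop():
    pass


primary, secondary = TaskRunner(is_primary=True), TaskRunner(is_primary=False)
for runner in (primary, secondary):
    runner.add("clean_sessions", 60, EVERY_INSTANCE, noop)
    runner.add("update_feeds", 300, SINGLE_INSTANCE, noop)

print(primary.run_due(now=0))
print(secondary.run_due(now=0))
```

    Declaring the scope next to the task itself would avoid having to set matching options by hand in each instance configuration file.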

    Some deeper thinking is needed here so we can improve things. That includes thinking about:

    • the inter-instances messages bus based on zmq and introduced in 3.15,
    • the Celery project (http://celeryproject.org/), an asynchronous task queue, widely used and written in Python,

    Remember the more cw independent the tasks are, the better it is. Though we still want an 'all-integrated' approach, e.g. not relying on external configuration of Unix specific tools such as CRON. Also we should see if a hard-dependency on Celery or a similar tool could be avoided, and if not if it should be considered as a problem (for devops).

    On the road to an easier configuration

    First, we should drop the different behaviour according to presence of a '.hg' in cubicweb's directory. It currently changes the location where cubicweb external resources (js, css, images, gettext catalogs) are searched for. Speaking of implementation:

    • shared_dir returns the cubicweb.web package path instead of the path to the shared cube,
    • i18n_lib_dir returns the cubicweb/i18n directory path instead of the path to the shared/i18n cube,
    • migration_scripts_dir returns the cubicweb/misc/migration directory path instead of share/cubicweb/migration.

    Moving web related objects as proposed in the Bootstrap section would resolve the problem for the content web/data and most of i18n (though some messages will remain and additional efforts will be needed here). By going further this way, we may also clean up some schema code by moving cubicweb/schemas and cubicweb/misc/migration to a cube (though only a small benefit is to be expected here).

    We should also have fewer environment variables... Let's see what we have today:

    • CW_INSTANCES_DIR, where to look for instances configuration
    • CW_INSTANCES_DATA_DIR, where to look for instances persistent data files
    • CW_RUNTIME_DIR, where to look for instances run-time data files
    • CW_MODE, set to 'system' or 'user' will predefine above environment variables differently
    • CW_CUBES_PATH, additional directories where to look for cubes
    • CW_CUBES_DIR, location of the system 'cubes' directory
    • CW_INSTALL_PREFIX, installation prefix, from which we can compute path to 'etc', 'var', 'share', etc.

    I would propose the following changes:

    • CW_INSTANCES_DIR is turned into CW_INSTANCES_PATH, and defaults to ~/etc/cubicweb.d if it exists and /etc/cubicweb.d (on Unix platforms) otherwise;
    • CW_INSTANCES_DATA_DIR and CW_RUNTIME_DIR are replaced by configuration file options, with smart values generated at instance creation time;
    • the above change should make CW_MODE useless;
    • CW_CUBES_DIR is to be dropped, CW_CUBES_PATH should be enough;
    • regarding CW_INSTALL_PREFIX, I'm lacking experience with non-hg-or-debian installations and don't know if this can be avoided or not.
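
    The proposed default for CW_INSTANCES_PATH could be resolved along these lines. This is a sketch of the proposal, not existing CubicWeb code: the function name is hypothetical, and treating the variable as a list of directories separated by os.pathsep is an assumption suggested by the "_PATH" suffix.

```python
import os


def default_instances_path(environ, path_exists=os.path.isdir, home=None):
    """Return the list of directories where instances are looked up."""
    explicit = environ.get("CW_INSTANCES_PATH")
    if explicit:
        # Assumption: several directories may be given, as with PATH.
        return explicit.split(os.pathsep)
    home = home or os.path.expanduser("~")
    user_dir = os.path.join(home, "etc", "cubicweb.d")
    if path_exists(user_dir):
        return [user_dir]
    # System-wide fallback on Unix platforms.
    return ["/etc/cubicweb.d"]


print(default_instances_path(os.environ))
```

    Passing environ and path_exists as arguments keeps the function easy to test without touching the real environment or filesystem.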

    Last but not least (for the moment), the 'web' / 'repo' / 'all-in-one' configurations, and the fact that the associated configuration file changes stinks. Ideas to stop doing this:

    • one configuration file per instance, with all options provided by installed parts of the framework used by the application.
    • activate 'services' (or not): web server, repository, zmq server, pyro server. Default services to be started are stored in the configuration file.
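
    As an illustration of the "one configuration file per instance" idea, here is a sketch using Python's standard configparser module. The file layout, the [services] section and the option names are all hypothetical; nothing here reflects an existing CubicWeb configuration format.

```python
import configparser

# Hypothetical single configuration file for one instance.
SAMPLE = """
[main]
instance-name = demo

[services]
web = yes
repository = yes
zmq = no
pyro = no
"""


def enabled_services(text):
    """Return the sorted names of the services to start by default."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["services"]
    return sorted(name for name in section if section.getboolean(name))


print(enabled_services(SAMPLE))
```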

    There is probably more that can be done here (fewer configuration options?), but that would already be a great step forward.

    On the road to...

    The following projects should be investigated to see if we could benefit from them:

    Discussion

    Remember the following goals: migration of legacy code should go smoothly. In a perfect world every application should be able to run with CubicWeb 4.0 until the backwards compatibility code is removed (and CubicWeb 4.0 will probably be released as 4.0 at that time).

    Please provide feedback:

    • do you think choices proposed above are good/bad choices? Why?
    • do you know some additional libraries that should be investigated?
    • do you have other changes in mind that could/should be done in cw 4.0?

  • Follow up of IRI conference about Museums and the Web #museoweb

    2012/04/12 by Arthur Lutz

    I attended the conference organised by IRI in a series of conferences about "Muséologie, muséographie et nouvelles formes d’adresse au public" (hashtag #museoweb). This particular occurrence was about "Le Web devient audiovisuel" (the web is also audio and video content). Here are a few notes and links we gathered. The event was organised by Alexandre Monnin @aamonnz.

    http://polemictweet.com/2011-2012-museo-audiovisuel/images/slide4_museo_fr.png

    Yves Raimond from the BBC

    Yves Raimond @moustaki made a presentation about his work at the BBC around semantic web technologies and speech recognition over large quantities of digitized archives. Parts of the BBC web sites use semantic web data as the database and do mashups with external sources of data (musicbrainz, dbpedia, wikipedia). For example, Tom Waits has an HTML web page: http://www.bbc.co.uk/music/artists/c3aeb863-7b26-4388-94e8-5a240f2be21b ; add .rdf at the end of the URL to get the RDF version: http://www.bbc.co.uk/music/artists/c3aeb863-7b26-4388-94e8-5a240f2be21b.rdf

    He also gave an introduction to ABC-IP (the Automatic Broadcast Content Interlinking Project) and to the Kiwi-API project, which uses CMU Sphinx on Amazon Web Services to process large quantities of archives. A screenshot of Kiwi-API is shown on the BBC R&D blog. The code should be open sourced soon and should appear on the BBC R&D github page.

    Following his presentation, the question was asked whether using Wikipedia content on an institutional web site would be possible in France. I pointed to the use of Wikipedia on http://data.bnf.fr , for example at the bottom of the Victor Hugo page.

    Raphaël Troncy about Media Fragments

    Raphaël Troncy @rtroncy made a presentation about "Media Fragments" which will enable sharing parts of a video on the web. Two major features : the sharing of specific extracts and the optimization of bandwidth use when streaming the extract (useful for mobile devices for example). It is a W3C working draft : http://www.w3.org/TR/media-frags-reqs/. Here are a few links of demos and players :

    Part of the presentation was about the ACAV project done jointly with Dailymotion : http://www.capdigital.com/projet-acav/

    The slides of his presentation are available here : http://www.slideshare.net/troncy/addressing-and-annotating-multimedia-fragments

    IRI presentation

    Vincent Puig @vincentpuig and Raphaël Velt @raphv made a presentation of various projects led by IRI :

    http://www.iri.centrepompidou.fr/wp-content/themes/IRI-Theme/images/logo-iri-petit_fr_fr.png

    Final words

    The technologies seen during this conference are often related to semantic web technologies or at least web standards. Some of the visualizations are quite impressive and could mean new uses of the Web and an inspiration for CubicWeb projects.

    A few of the people present at the conference will be attending or presenting talks at SemWeb.Pro which will take place in Paris on the 2nd and 3rd of May 2012.


  • CubicWeb Sprint report for the "BugSquash" team

    2012/03/16 by Nicolas Chauvat

    Beginners fixed core bugs

    The first day of the CubicWeb sprint was dedicated to an introduction for a group of four beginners that included two people who do not work at Logilab. At the end of the day, this team knew about Entity, Views and Schema and was ready to dive into the core in order to squash some bugs.

    The first steps into the CubicWeb core were not so easy, but these brave beginners, assisted by a skilled developer, managed to fix some bugs and add a few useful features, including one from a windows user that made it into the stable branch.

    The gen-static-datadir command

    We had a look at cubicweb-ctl gen-static-datadir, a feature that copies into a directory all the files that could be cached by a "front" web server instead of being served by cubicweb.

    Testing the feature

    On the first run, we found that not all files were copied. Alas, we were unable to reproduce this, so we need to keep an eye on it. In the following tests, we tried several configurations. The files that were copied were always the ones contained in the "deepest" cube in the tree of cubes. So we can say that the command is working well.

    Approach used by the feature

    In the code, we browse all cubes used by the master cube to gather all filenames that we want to copy and afterwards we use "config.locate_resource(resource)" to find the best location for this file.

    Doing this, we sometimes copy a file from the cache. If we do not want to use the cache, we could sort the cubes, recursively copy the whole data folder of each one, and sometimes overwrite files with files located nearer to the master cube.
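
    The "gather all file names, then pick the best location for each" approach can be sketched as follows. This is a simplified model, not the code of the command: collect_static_files is a hypothetical name, the cubes and files are made up, and which cube should win is left to the caller through the order of the list.

```python
def collect_static_files(cubes_by_priority, files_of):
    """Map each file name to the first (highest priority) cube that ships it."""
    chosen = {}
    for cube in cubes_by_priority:
        for filename in files_of[cube]:
            # setdefault keeps the cube found first, i.e. the preferred one.
            chosen.setdefault(filename, cube)
    return chosen


# Hypothetical cubes and the static files found in their data directories.
files_of = {
    "myapp": {"logo.png", "myapp.css"},
    "blog": {"logo.png", "blog.css"},
    "file": {"file.css"},
}

result = collect_static_files(["myapp", "blog", "file"], files_of)
for filename in sorted(result):
    print(filename, "<-", result[filename])
```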

    New option

    We added a -r option that erases the target directory before launching the command.


  • Undoing changes in CubicWeb

    2012/02/29 by Anthony Truchet

    Many desktop applications offer the possibility for the user to undo the recent changes : a similar undo feature has now been integrated into the CubicWeb framework.

    Because a semantic web application and a common desktop application are not the same thing at all, especially as far as undoing is concerned, we will first describe what the undo feature currently is.

    What's undoing in a CubicWeb application

    A CubicWeb application acts upon an Entity-Relationship model, described by a schema. This ensures some data integrity properties. It also implies that changes are made in groups called transactions: to ensure data integrity, a transaction is either applied completely or not at all. What may appear as a simple atomic action to a user can actually consist of several actions for the framework. The end-user has no need to know the details of all actions in those transactions. Only the so-called public actions will appear in the description of an undoable transaction.

    Let's take a simple example: posting a "comment" for a blog entry will create the entity itself and the link to the blog entry.

    The undo feature for CubicWeb end-users

    For now there are two ways to access the undo feature when it has been activated in the instance configuration file with the option undo-support=yes. Immediately after having done something, the undo link appears in the "creation" message.

    Screenshot of the undo link in the message

    Otherwise, one can access at any time the undo-history view accessible from the start-up page.

    Screenshot of the undo link in the message

    This view shows the transactions, and each provides its own undo link. Only the transactions the user has permissions to see and undo will be shown.

    Screenshot of the undo link in the message

    If the user attempts to undo a transaction which can't be undone or whose undoing fails, then a message will explain the situation and no partial undoing will be left behind.
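
    The "all or nothing" behaviour described above can be modelled with a small sketch. It is a toy model and not the CubicWeb implementation: the store layout, UndoError and undo_transaction are hypothetical, and the transaction is the comment example (one entity plus one relation).

```python
import copy


class UndoError(Exception):
    pass


def undo_transaction(store, actions):
    """Undo `actions` in reverse order; leave `store` untouched on failure."""
    working = copy.deepcopy(store)      # work on a copy first
    for kind, key in reversed(actions):
        if key not in working[kind]:
            raise UndoError("cannot undo %s %r: already gone" % (kind, key))
        working[kind].remove(key)
    # Every action could be undone: only now is the result made visible.
    store.clear()
    store.update(working)


store = {
    "entities": {"blog-1", "comment-1"},
    "relations": {("comment-1", "comments", "blog-1")},
}
# Posting the comment created the entity, then its link to the blog entry.
transaction = [
    ("entities", "comment-1"),
    ("relations", ("comment-1", "comments", "blog-1")),
]

undo_transaction(store, transaction)
print(store)
```

    Because the changes are applied to a copy and swapped in only at the end, a failure halfway through leaves no partial undo behind.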

    What's next

    The undo feature is functional but the interface and configuration options are quite limited. One major planned improvement would be to enable the user to filter which transactions or actions he sees in the undo-history view. Another critical improvement would be to selectively enable the undo feature on part of the entity-relationship schema to avoid storing too much data and reduce the underlying overhead.

    Feedback on this undo feature for specific CubicWeb applications is welcome. More detailed information regarding the undo feature will be published in the CubicWeb book when the patches make it through the review process.


  • CubicWeb Sprint report for the "ZMQ" team

    2012/02/27 by Julien Cristau

    There has been a growing interest in ZMQ in the past months, due to its ability to efficiently deal with message passing, while being light and robust. We have worked on introducing ZMQ in the CubicWeb framework for various uses :

    • As a replacement/alternative to the Pyro source, that is used to connect to distant instances. ZMQ may be used as a lighter and more efficient alternative to Pyro. The main idea here is to use the send_pyobj/recv_pyobj API of PyZMQ (python wrapper of ZMQ) to execute methods on the distant Repository in a totally transparent way for CubicWeb.
    http://www.cubicweb.org/file/2219158?vid=download
    • As a JSONServer. Indeed, ZMQ could be used to share data between a server and any client sending requests through ZMQ. The request is just a string of RQL, and the response is the result set formatted in JSON.
    • As the building block for a simple notification (publish/subscribe) system between CubicWeb instances. A component can register its interest in a particular topic, and receive a callback whenever a corresponding message is received. At this point, this mechanism is used in CubicWeb to notify other instances that they should invalidate their caches when an entity is deleted.
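
    The notification mechanism in the last item can be illustrated without any network code. In the sketch below an in-process MessageBus stands in for the ZMQ publish/subscribe sockets; MessageBus, Instance and the "delete" topic are hypothetical names chosen for the example, not the CubicWeb implementation.

```python
from collections import defaultdict


class MessageBus:
    """In-process stand-in for a publish/subscribe transport such as ZMQ."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        for callback in self._subscribers[topic]:
            callback(payload)


class Instance:
    """Toy application instance holding a local entity cache."""

    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.cache = {}
        bus.subscribe("delete", self._on_delete)

    def _on_delete(self, eid):
        # Drop the stale entry, if this instance had cached it.
        self.cache.pop(eid, None)

    def delete_entity(self, eid):
        # Tell every instance (including this one) to invalidate its cache.
        self.bus.publish("delete", eid)


bus = MessageBus()
first, second = Instance("first", bus), Instance("second", bus)
first.cache[42] = "cached entity"
second.cache[42] = "cached entity"

first.delete_entity(42)
print(first.cache, second.cache)
```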

  • CubicWeb Sprint report for the "WSGI" team

    2012/02/20 by Pierre-Yves David

    Cubicweb has had WSGI support for several years, but this support was incomplete.

    The WSGI team was in charge of turning WSGI support into a full-featured backend that could replace Twisted in real production scenarios.

    Because we only had first class support for Twisted, some of the CubicWeb logic related to HTTP handling was implemented on the twisted side with twisted concepts. Our first task was to move this logic into CubicWeb itself. The handling of HTTP status in our response was improved in the process.

    Our second task was to focus on the "non-HTTP" part of CubicWeb (because the repository also manages background tasks). The development mode for WSGI is now able to handle and run such tasks. For this purpose we have begun a process that aims to remove server related code from the repository object.

    We also tested several WSGI middlewares. One of the most promising is Firepython, which integrates Python logging and debugging features with Firebug. The werkzeug debugger seems neat too.
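
    For readers unfamiliar with the term, a WSGI middleware is simply a callable that wraps another WSGI application. The sketch below uses only the standard library and is unrelated to the CubicWeb code: application, StatusRecorder and call_app are hypothetical names.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


class StatusRecorder:
    """Middleware recording the status line of every response."""

    def __init__(self, app):
        self.app = app
        self.statuses = []

    def __call__(self, environ, start_response):
        def recording_start_response(status, headers, exc_info=None):
            self.statuses.append(status)
            return start_response(status, headers, exc_info)
        return self.app(environ, recording_start_response)


def call_app(app, path="/"):
    """Call a WSGI app once with a fake request; return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


wrapped = StatusRecorder(application)
print(call_app(wrapped))
print(wrapped.statuses)
```

    Since the wrapped application is unaware of the middleware, the same technique works with any WSGI-compliant framework or server.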

    http://www.cubicweb.org/file/2194267?vid=download

    All these improvements open the road to a simple and efficient multi-process architecture in CubicWeb.


  • CubicWeb Sprint report for the "Benchmarks" team

    2012/02/17 by Arthur Lutz

    One team during the CubicWeb sprint looked at issues around monitoring benchmark values for CubicWeb development. This is a huge task, so we tried to stay focused on a few aspects:

    • production response times (using tools such as smokeping and munin)
    • response times of test executions in continuous integration tests
    • response times of test instances running in continuous integration

    We looked at using cpu.clock() instead of cpu.time() in the xunit files that report test results so as to be a bit more independent of the load of the machine (but subprocesses won't be accounted for).
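
    The difference between wall-clock time and CPU time that motivates this change can be observed with a short sketch. It uses Python 3's time.perf_counter() and time.process_time(), which are my choice for the illustration and not the functions used in the xunit reporting code; measure is a hypothetical helper.

```python
import time


def measure(func):
    """Return (wall-clock seconds, CPU seconds of this process) spent in func."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


# Sleeping takes wall-clock time but almost no CPU time, much like a test
# that is waiting because the machine is busy with something else.
wall, cpu = measure(lambda: time.sleep(0.2))
print("wall: %.3fs, cpu: %.3fs" % (wall, cpu))
```

    As noted above, CPU time measured this way only covers the current process, so work done in subprocesses is not included.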

    Graphing test times in hudson/jenkins already exists (/job/PROJECT/BUILDID/testReport/history/?) and can also be graphed by TestClass and by individual test. What is missing so far is a specific dashboard where one could select the significant graphs to look at.

    By the end of the first day we had a "lorem ipsum" test instance that is created on the fly on each hudson/jenkins build and a jmeter bench running on it, with its results processed by the performance plugin.

    http://www.cubicweb.org/file/2184036?vid=download

    By the end of the second day we had some visualisation of existing data collected by apycot using jqplot javascript visualisation (cubicweb-jqplot):

    http://www.cubicweb.org/file/2184035?vid=download

    By the end of the sprint, we got patches submitted for the following cubes :

    • apycot
    • cubicweb-jqplot
    • the original jqplot library (update : patch accepted a few days later)

    On the last hour of the sprint, since we had a "lorem ipsum" test application running each time the tests went through the continuous integration, we hacked up a proof of concept to get automatic screenshots of this temporary test application. So far, we get screenshots for firefox only, but it opens up possibilities for other browsers. Inspiration could be drawn from https://browsershots.org/


  • "Data Fast-food": quick interactive exploratory processing and visualization of complex datasets with CubicWeb

    2012/01/19 by Vincent Michel

    With the emergence of the semantic web in the past few years, and the increasing number of high quality open data sets (cf the lod diagram), there is a growing interest in frameworks that make it possible to store/query/process/mine/visualize large data sets.

    We have seen in previous blog posts how CubicWeb may be used as an efficient knowledge management system for various types of data, and how it may be used to perform complex queries. In this post, we will see, using Geonames data, how CubicWeb may perform simple or complex data mining and machine learning procedures on data, using the datamining cube. This cube adds powerful tools to CubicWeb that make it easy to interactively process and visualize datasets.

    At this point, it is not meant to be used on massive datasets, for it is not fully optimized yet. If you try to perform a TF-IDF (term frequency–inverse document frequency) with a hierarchical clustering on the full dbpedia abstracts dataset, be prepared to wait. But it is a promising way to enrich the user experience while playing with different datasets, for quick interactive exploratory datamining processing (what I've called the "Data fast-food"). This cube is based on the scikit-learn toolbox that has recently gained a huge popularity in the machine learning and Python community. The release of this cube drastically increases the interest of CubicWeb for data management.

    The Datamining cube

    For a given query, similarly to SQL, CubicWeb returns a result set. This result set may be presented by a view to display a table, a map, a graph, etc (see documentation and previous blog posts).

    The datamining cube introduces the possibility to process the result set before presenting it, for example to apply machine learning algorithms to cluster the data.

    The datamining cube is based on two concepts:

    • the concept of processor: basically, a processor transforms a result set into a numpy array, given some criteria defining the mathematical processing, and the columns/rows of the result set to be taken into account. The numpy array is a versatile structure that is widely used for numerical computation. This array can thus be used efficiently with any kind of datamining algorithm. Note that, in our context of knowledge management, it is more convenient to return a numpy array with additional meta-information, such as indices or labels, the result being stored in what we call a cw-array. Meta-information may be useful for display, but is not compulsory.
    • the concept of array-view: "views" are basic components of CubicWeb, and distinguishing between querying and displaying the data is key in this framework. So, on a given result set, many different views can be applied. In the datamining cube, we simply overload the basic view of CubicWeb, so that it works with a cw-array instead of a result set. These array-views are associated with some machine learning or datamining processes. For example, one can apply the k-means (clustering process) view on a given cw-array.

    A very important feature is that the processor and the array-view are called directly through the URL using the two related parameters arid (for ARray ID) and vid (for View ID, standard in CubicWeb).
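
    As an illustration, such a URL can be built with the standard library. This is a sketch under assumptions: the base URL is made up, the helper name is hypothetical, and it assumes the usual /view?rql=... entry point of a CubicWeb instance; only the rql, vid and arid parameter names come from the text above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def datamining_url(base_url, rql, vid, arid):
    """Build a URL selecting a query (rql), a view (vid) and a processor (arid)."""
    query = urlencode({"rql": rql, "vid": vid, "arid": arid})
    return "%s/view?%s" % (base_url.rstrip("/"), query)


url = datamining_url(
    "http://localhost:8080",  # hypothetical instance address
    "Any LO, LA WHERE X latitude LA, X longitude LO",
    vid="protovis-scatterplot",
    arid="attr-asfloat",
)
print(url)

# The parameters can be read back from the query string.
print(parse_qs(urlsplit(url).query)["arid"])
```

    Changing only the vid or arid value in the URL is enough to display the same query differently or to process it with another processor.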

    http://www.cubicweb.org/file/2154793?vid=download

    Processors

    We give some examples of basic processors that may be found in the datamining cube:

    • AttributesAsFloatArrayProcessor (arid='attr-asfloat'): This processor turns all Int, BigInt and Float attributes in the result set to floats, and returns the corresponding array. The number of rows is equal to the number of rows in the result set, and the number of columns is equal to the number of convertible attributes in the result set.
    • EntityAsFloatArrayProcessor (arid='entity-asfloat'): This processor performs similarly to the AttributesAsFloatArrayProcessor, but keeps the reference to the entities used to create the numpy-array. Thus, this information could be used for display (map, label, ...).
    • AttributesAsTokenArrayProcessor (arid='attr-astoken'): This processor turns all String attributes in the result set into a numpy array, based on a word n-gram analysis. This may be used to tokenize a set of strings.
    • PivotTableCountArrayProcessor (arid='pivot-table-count'): This processor is used to create a pivot table, with a count function. Other functions, such as sum or product also exist. This may be used to create some spreadsheet-like views.
    • UndirectedRelationArrayProcessor (arid='undirected-rel'): This processor creates a binary numpy array of dimension (nb_entities, nb_entities), that represents the relations (or correlations) between entities. This may be used for graph-based visualization.
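
    To give a feel for what two of these processors compute, here is a dependency-free sketch. It is not the datamining cube code: plain Python lists stand in for numpy arrays, the function names are hypothetical, and the sample rows are made up.

```python
def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def attributes_as_float_rows(rows):
    """Keep the columns whose values are all numeric, converted to float."""
    if not rows:
        return []
    numeric_columns = [
        index for index in range(len(rows[0]))
        if all(_is_number(row[index]) for row in rows)
    ]
    return [[float(row[index]) for index in numeric_columns] for row in rows]


def undirected_relation_matrix(entities, relations):
    """Binary (nb_entities x nb_entities) matrix of undirected relations."""
    position = {entity: index for index, entity in enumerate(entities)}
    matrix = [[0] * len(entities) for _ in entities]
    for left, right in relations:
        i, j = position[left], position[right]
        matrix[i][j] = matrix[j][i] = 1
    return matrix


# Made-up result set: (name, longitude, latitude, elevation).
rows = [("Paris", 2.35, 48.85, 35), ("Lyon", 4.85, 45.75, 173)]
print(attributes_as_float_rows(rows))
print(undirected_relation_matrix(["a", "b", "c"], [("a", "b"), ("b", "c")]))
```

    The first result has as many rows as the result set and one column per convertible attribute, as described for AttributesAsFloatArrayProcessor.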

    We are also planning to extend the concept of processor to sparse matrix (scipy.sparse), in order to deal with very high dimensional data.

    Array Views

    Most of the array views found in the datamining cube are used for simple visualization. We used HTML-based templates and the Protovis Javascript Library.

    We will not detail all the views, but rather show some examples. Read the reference documentation for a complete and detailed description.

    Examples on numerical data

    Histogram

    The request:

    Any LO, LA WHERE X latitude LA, NOT X latitude NULL, X longitude LO,  NOT X longitude NULL,
    X country C, NOT X elevation NULL, C name "France"
    

    that may be translated as:

    All couples (latitude, longitude) of the locations in France, with an elevation not null
    

    and, using vid=protovis-hist and arid=attr-asfloat

    http://www.cubicweb.org/file/2154795?vid=download

    Scatter plot

    Using the notion of view, we can display the same result set differently, for example using a scatter plot (vid=protovis-scatterplot).

    http://www.cubicweb.org/file/2156233?vid=download

    Another example with the request:

    Any P, E WHERE X is Location, X elevation E, X elevation >1, X population P,
    X population >10, X country CO, CO name "France"
    

    that may be translated as:

    All couples (population, elevation) of locations in France,
    with a population higher than 10 (inhabitants), and an elevation higher than 1 (meter)
    

    and, using the same vid (vid=protovis-scatterplot) and the same arid (arid=attr-asfloat)

    http://www.cubicweb.org/file/2154802?vid=download

    If a third column is given in the result set (and thus in the numpy array), it will be encoded in the size/color of each dot of the scatter plot. For example with the request:

    Any LO, LA, E WHERE X latitude LA, NOT X latitude NULL, X longitude LO,  NOT X longitude NULL,
    X country C, NOT X elevation NULL, X elevation E, C name "France"
    

    that may be translated as:

    All tuples (latitude, longitude, elevation) of the locations in France, with an elevation not null
    

    and, using the same vid (vid=protovis-scatterplot) and the same arid (arid=attr-asfloat), we can visualize the elevation on a map, encoded in size/color

    http://www.cubicweb.org/file/2154805?vid=download

    Another example with the request:

    Any LO, LA LIMIT 50000 WHERE X is Location, X population  >1000, X latitude LA, X longitude LO,
    X country CO, CO name "France"
    

    that may be translated as:

    All couples (latitude, longitude) of 50000 locations in France, with a population higher than 1000 (inhabitants)
    
    http://www.cubicweb.org/file/2156095?vid=download

    Other views also exist, such as AreaChart and LineArray.

    Examples on relational data

    Relational Matrix (undirected graph)

    The request:

    Any X,Y WHERE X continent CO, CO name "North America", X neighbour_of Y
    

    that may be translated as:

    All neighbour countries in North America
    

    and using the vid='protovis-binarymap' and arid='undirected-rel'

    http://www.cubicweb.org/file/2154796?vid=download

    Relational Matrix (directed graph)

    If we do not want a symmetric matrix, i.e. if we want to keep the direction of a link (X,Y is not the same relation as Y,X), we can use the directed-rel array processor. For example, with the following request:

    Any X,Y LIMIT 20 WHERE X continent Y
    

    that may be translated as:

    20 countries and their continent
    

    and using the vid='protovis-binarymap' and arid='directed-rel'

    http://www.cubicweb.org/file/2154797?vid=download
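
    The difference between the two relational processors can be sketched as follows, in plain Python 3 with nested lists instead of numpy arrays. The function relation_matrix is a hypothetical stand-in written for this example, not the code of the datamining cube.

```python
def relation_matrix(pairs, directed=False):
    """Build a binary (nb_entities x nb_entities) matrix from (X, Y) pairs."""
    # Index entities in order of first appearance.
    index = {}
    for x, y in pairs:
        for entity in (x, y):
            index.setdefault(entity, len(index))
    size = len(index)
    matrix = [[0] * size for _ in range(size)]
    for x, y in pairs:
        matrix[index[x]][index[y]] = 1
        if not directed:
            # Undirected: the relation holds both ways, so mirror it.
            matrix[index[y]][index[x]] = 1
    return list(index), matrix


pairs = [("Canada", "United States"), ("United States", "Mexico")]
labels, undirected = relation_matrix(pairs)
_, directed = relation_matrix(pairs, directed=True)
```

    With directed=False the matrix is symmetric; with directed=True only the (X, Y) cell is set, which keeps the direction of each link.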

    Force directed graph

    For a dynamic representation of relations, we can use a force directed graph. The request:

    Any X,Y WHERE X neighbour_of Y
    

    that may be translated as:

    All neighbour countries in the World.
    

    and using the vid='protovis-forcedirected' and arid='undirected-rel', we can see the full graph, with small independent components (e.g. UK and Ireland)

    http://www.cubicweb.org/file/2154800?vid=download

    Again, a third column in the result set could be used to encode some labeling information, for example the continent.

    The request:

    Any X,Y,CO WHERE X neighbour_of Y, X continent CO
    

    that may be translated as:

    All neighbour countries in the World, and their corresponding continent.
    

    and again, using the vid='protovis-forcedirected' and arid='undirected-rel', we can see the full graph with the continents encoded in color (Americas in green, Africa in dark blue, ...)

    http://www.cubicweb.org/file/2154801?vid=download

    Dendrogram

    For hierarchical information, one can use the Dendrogram view. For example, with the request:

    Any X,Y WHERE X continent Y
    

    that may be translated as:

    All couples (country, continent) in the World
    

    and using vid='protovis-dendrogram' and arid='directed-rel', we have the following dendrogram (we only show a part due to lack of space)

    http://www.cubicweb.org/file/2154806?vid=download

    Unsupervised Learning

    We have also developed some machine learning views for unsupervised learning. This is more a proof of concept than a fully optimized development, but we can already do some cool stuff. Each machine learning process is referenced by an mlid. For example, with the request:

    Any LO, LA WHERE X is Location, X elevation E, X elevation >1, X latitude LA, X longitude LO,
    X country CO, CO name "France"
    

    that may be translated as:

    All couples (latitude, longitude) of the locations in France, with an elevation higher than 1
    

    and using vid='protovis-scatterplot', arid='attr-asfloat' and mlid='kmeans', we can construct a scatter plot of all couples of latitude and longitude in France, and create 10 clusters using k-means clustering. The labeling information is thus encoded in color/size:

    http://www.cubicweb.org/file/2154804?vid=download
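
    For readers unfamiliar with k-means, here is a minimal sketch of the standard algorithm (Lloyd's iterations) in plain Python 3. It is only meant to show what kind of labeling the mlid='kmeans' step produces; the function names, the fixed number of iterations and the sample points are assumptions of this example, and the cube's own implementation is not shown in this post.

```python
import random


def squared_distance(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm on 2D points; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        labels = [min(range(k), key=lambda c: squared_distance(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old center if a cluster became empty
                centers[c] = (sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members))
    return labels


# Made-up (longitude, latitude) pairs forming two obvious groups.
points = [(2.3, 48.8), (2.4, 48.9), (2.2, 48.7),
          (5.4, 43.3), (5.3, 43.2), (5.5, 43.4)]
labels = kmeans(points, k=2)
```

    The returned labels play the role of the third column described earlier: one value per point, which a view can then encode in color or size.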

    Download

    Finally, we have also implemented a download view, based on a pickle of the numpy array. It is thus possible to access any data remotely from a Python shell and to process it as you want. Changing the request is as easy as changing the rql parameter in the URL. For example:

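    import pickle, urllib
    # Python 2: the RQL request must be URL-encoded before being sent
    url = 'http://mydomain?rql=%s&vid=array-numpy&arid=attr-asfloat' % urllib.quote('my request')
    data = pickle.loads(urllib.urlopen(url).read())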
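
    For Python 3, an equivalent could look like the sketch below. The helper names download_url and fetch_array are made up for this example, and http://mydomain is the same placeholder as above; only the rql, vid and arid parameters come from the text.

```python
import pickle
from urllib.parse import urlencode
from urllib.request import urlopen


def download_url(base_url, rql, arid="attr-asfloat"):
    """Build the URL of the download view, URL-encoding the RQL request."""
    query = urlencode({"rql": rql, "vid": "array-numpy", "arid": arid})
    return f"{base_url}?{query}"


def fetch_array(url):
    """Fetch and unpickle the array.

    Only use this with a server you trust: unpickling data can execute
    arbitrary code.
    """
    with urlopen(url) as response:
        return pickle.loads(response.read())


url = download_url("http://mydomain",
                   "Any LO, LA WHERE X latitude LA, X longitude LO")
```

    Calling fetch_array(url) would then return the array. Depending on the Python and numpy versions used on the server, a pickle written by Python 2 may need pickle.loads(..., encoding="latin1") to load under Python 3.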
    

  • CubicWeb sprint in Paris - 2012/02/07-10

    2011/12/21 by Nicolas Chauvat

    Topics

    To be decided. Some possible topics are :

    • optimization (still)
    • porting cubicweb to python3
    • porting cubicweb to pypy
    • persistent sessions
    • finish twisted / wsgi refactoring
    • inter-instance communication bus
    • use subprocesses to handle datafeeds
    • developing more debug-tools (debug console, view profiling, etc.)
    • pluggable / unpluggable external sources (as needed for the cubipedia and semantic family)
    • client-side only applications (javascript + http)
    • mercurial storage backend: see this thread of the mailing list
    • mercurial-server integration: see this email to the mailing list

    Other ideas are welcome, please bring them up on cubicweb@lists.cubicweb.org

    Location

    This sprint will take place in February 2012, from Tuesday the 7th to Friday the 10th. You are more than welcome to come along, help out and contribute. An introduction is planned for newcomers.

    Network resources will be available for those bringing laptops.

    Address : 104 Boulevard Auguste-Blanqui, Paris. Ring "Logilab" (googlemap)

    Metro : Glacière

    Contact : http://www.logilab.fr/contact

    Dates : 07/02/2012 to 10/02/2012


  • Geonames in CubicWeb !

    2011/12/14 by Vincent Michel

    CubicWeb is a semantic web framework written in Python that has been successfully used in large-scale projects, such as data.bnf.fr (French National Library's opendata) or Collections des musées de Haute-Normandie (museums of Haute-Normandie).

    CubicWeb provides a high-level query language, called RQL, operating over a relational database (PostgreSQL in our case), and makes it possible to quickly instantiate an entity-relationship data model. By separating the query and the display of data into two distinct steps, it provides powerful means for data retrieval and processing.

    In this blog post, we will demonstrate some of these capabilities on the Geonames data.

    Geonames

    Geonames is an open-source compilation of geographical data from various sources:

    "...The GeoNames geographical database covers all countries and contains over eight million placenames that are available for download free of charge..." (http://www.geonames.org)

    The data is available as a dump containing different CSV files:

    • allCountries: main file containing information about 8,000,000 places in the world. We won't detail the various attributes of each location, but we will focus on some important properties, such as population and elevation. Moreover, admin_code_1 and admin_code_2 will be used to link the different locations to the corresponding AdministrativeRegion, and feature_code will be used to link the data to the corresponding type.
    • admin1CodesASCII.txt and admin2Codes.txt detail the different administrative regions, that are parts of the world such as region (Ile-de-France), department (Department of Yvelines), US counties...
    • featureCodes.txt details the different types of location that may be found in the data, such as forest(s), first-order administrative division, aqueduct, research institute, ...
    • timeZones.txt, countryInfo.txt, iso-languagecodes.txt are additional files providing information about timezones, countries and languages. They will be included in our CubicWeb database but won't be explained in more detail here.

    The Geonames website also provides some ways to browse the data: by Countries, by Largest Cities, by Highest mountains, by postal codes, etc. We will see that CubicWeb could be used to automatically create such ways of browsing data while allowing far deeper queries. There are two main challenges when dealing with such data:

    • the number of entries: with 8,000,000 placenames, we have to use efficient tools for storing and querying them.
    • the structure of the data: the different types of entries are separated in different files, but should be merged for efficient queries (i.e. we have to rebuild the different links between entities, e.g. Location to Country or Location to AdministrativeRegion).

    Data model

    With CubicWeb, the data model of the application is written in Python. It defines different entity classes with their attributes, as well as the relationships between the different entity classes. Here is a sample of the schema.py that we have used for Geonames data:

    class Location(EntityType):
        name = String(maxsize=1024, indexed=True)
        uri = String(unique=True, indexed=True)
        geonameid = Int(indexed=True)
        latitude = Float(indexed=True)
        longitude = Float(indexed=True)
        feature_code = SubjectRelation('FeatureCode', cardinality='?*', inlined=True)
        country = SubjectRelation('Country', cardinality='?*', inlined=True)
        main_administrative_region = SubjectRelation('AdministrativeRegion',
                                  cardinality='?*', inlined=True)
        timezone = SubjectRelation('TimeZone', cardinality='?*', inlined=True)
        ...
    

    This indicates that the main Location class has a name attribute (string), a uri (string), a geonameid (integer), a latitude and a longitude (both floats), and some relations to other entity classes such as FeatureCode (the relation is named feature_code), Country (the relation is named country), or AdministrativeRegion (the relation is named main_administrative_region).

    The cardinality of each relation is defined much as in an RDBMS, where * means any number, ? means zero or one and 1 means one and only one.
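
    To make the two-character notation concrete, here is a small Python 3 sketch that spells out a cardinality string. It is not part of CubicWeb: the describe helper is made up for this example, and it assumes that the first character applies to the subject side of the relation and the second to the object side.

```python
# Meaning of each cardinality character. '*', '?' and '1' are described in
# the text above; '+' (one or more) is added here as an assumption.
MEANING = {"1": "exactly one", "?": "zero or one",
           "*": "any number", "+": "one or more"}


def describe(cardinality):
    """Spell out a two-character cardinality string.

    Assumed convention: the first character says how many objects a subject
    may be linked to, the second how many subjects may point to an object.
    """
    subject_side, object_side = cardinality
    return {"objects_per_subject": MEANING[subject_side],
            "subjects_per_object": MEANING[object_side]}


# Location --country--> Country, declared with cardinality='?*'
country = describe("?*")
```

    Read this way, a Location is linked to at most one Country, while a Country can be the target of any number of Locations.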

    We give below a visualisation of the schema (obtained using the /schema relative url)

    http://www.cubicweb.org/file/2124618?vid=download

    Import

    The data contained in the CSV files could be pushed and stored without any processing, but it is interesting to reconstruct the relations that may exist between different entities and entity classes, so that queries will be easier and faster.

    Executing the import procedure took us 80 minutes on regular hardware, which seems very reasonable given the amount of data (~7,000,000 entities, 920MB for the allCountries.txt file), and the fact that we are also constructing many indexes (on attributes or on relations) to improve the queries. This import procedure uses some low-level SQL commands to load the data into the underlying relational database.

    Queries and views

    As stated before, queries are performed in CubicWeb using RQL (Relational Query Language), which is similar to SPARQL, but with a syntax that is closer to SQL. This language may be used to query the concepts directly while abstracting the physical structure of the underlying database. For example, one can use the following request:

    Any X LIMIT 10 WHERE X is Location, X population > 1000000,
        X country C, C name "France"
    

    that means:

    Give me 10 locations that have a population greater than 1000000, and that are in a country named "France"

    The corresponding SQL query is:

    SELECT _X.cw_eid FROM cw_Country AS _C, cw_Location AS _X
    WHERE _X.cw_population>1000000
          AND _X.cw_country=_C.cw_eid AND _C.cw_name='France'
    LIMIT 10
    

    We can see that RQL is higher-level than SQL and abstracts the details of the tables and the joins.
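
    To see what the generated SQL does, the sketch below runs a query of the same shape against an in-memory SQLite database using Python 3's sqlite3 module. The table layout is reduced to the columns that appear in the query and the rows are made up; the real CubicWeb tables contain more columns, and the example uses SQLite only so that it runs without a PostgreSQL server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cw_Country (cw_eid INTEGER PRIMARY KEY, cw_name TEXT);
    CREATE TABLE cw_Location (cw_eid INTEGER PRIMARY KEY, cw_name TEXT,
                              cw_population INTEGER, cw_country INTEGER);
""")
# Made-up rows, for illustration only.
conn.executemany("INSERT INTO cw_Country VALUES (?, ?)",
                 [(1, "France"), (2, "Spain")])
conn.executemany("INSERT INTO cw_Location VALUES (?, ?, ?, ?)",
                 [(10, "Big city", 2000000, 1),
                  (11, "Small town", 5000, 1),
                  (12, "Other big city", 3000000, 2)])

# Same shape as the SQL above: the join on cw_country is explicit here,
# whereas the RQL only says "X country C".
rows = conn.execute("""
    SELECT _X.cw_eid FROM cw_Country AS _C, cw_Location AS _X
    WHERE _X.cw_population > 1000000
          AND _X.cw_country = _C.cw_eid AND _C.cw_name = ?
    LIMIT 10
""", ("France",)).fetchall()
```

    Only the location that is both large enough and attached to the country named "France" is returned; the RQL version expresses the same thing without naming any table or join column.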

    A query returns a result set (a list of results), that can be displayed using views. A main feature of CubicWeb is to separate the two steps of querying the data and displaying the results. One can query some data and visualize the results in the standard web framework, download them in different formats (JSON, RDF, CSV,...), or display them in some specific view developed in Python.

    In particular, we will use the mapstraction.map view, which is based on the Mapstraction and the OpenLayers libraries to display information on maps using data from OpenStreetMap. This mapstraction.map view uses a feature of CubicWeb called adapters. An adapter adapts a class of entity to some interface, hence views can rely on interfaces instead of types and are able to display entities with different attributes and relations. In our case, the IGeocodableAdapter returns a latitude and a longitude for a given class of entity (here, the mapping is trivial, but there are more complex cases... :) ):

    class IGeocodableAdapter(EntityAdapter):
          __regid__ = 'IGeocodable'
          __select__ = is_instance('Location')
          @property
          def latitude(self):
              return self.entity.latitude
          @property
          def longitude(self):
              return self.entity.longitude
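
    The idea behind adapters can be illustrated outside of CubicWeb with a few plain Python 3 classes. Everything below is a made-up stand-in (the Shop entity, the ADAPTERS dictionary that replaces the framework's selection mechanism, the map_markers "view"); it only shows why a view written against an interface can display entities of different shapes.

```python
class Location:
    def __init__(self, latitude, longitude):
        self.latitude = latitude
        self.longitude = longitude


class Shop:
    """Hypothetical entity whose position is stored on a related address."""
    def __init__(self, address):
        self.address = address


class LocationGeocodable:
    """Trivial mapping, as in the adapter above."""
    def __init__(self, entity):
        self.entity = entity

    @property
    def latitude(self):
        return self.entity.latitude

    @property
    def longitude(self):
        return self.entity.longitude


class ShopGeocodable:
    """Less trivial mapping: coordinates come from the related address."""
    def __init__(self, entity):
        self.entity = entity

    @property
    def latitude(self):
        return self.entity.address.latitude

    @property
    def longitude(self):
        return self.entity.address.longitude


# Stand-in for the selection done by the framework's registry.
ADAPTERS = {Location: LocationGeocodable, Shop: ShopGeocodable}


def map_markers(entities):
    """A 'view' that only relies on the latitude/longitude interface."""
    adapted = [ADAPTERS[type(entity)](entity) for entity in entities]
    return [(a.latitude, a.longitude) for a in adapted]


markers = map_markers([Location(48.85, 2.35), Shop(Location(45.76, 4.84))])
```

    The view code never needs to know where each entity type keeps its coordinates; supporting a new type only requires a new adapter.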
    

    We will give some results of queries and views later. It is important to notice that the following screenshots are taken without any modification of the standard web interface of CubicWeb. It is possible to write specific views and to define a specific CSS, but we only wanted to show how CubicWeb could handle such data. The default web template of CubicWeb is sufficient for what we want to do, as it dynamically creates web pages showing attributes and relations, as well as some specific forms and JavaScript applets adapted directly to the data (e.g. map-based tools). Last but not least, the query and the view can be defined within the URL, which opens a world of new possibilities to the user:

    http://baseurl:port/?rql=The query that I want&vid=Identifier-of-the-view
    

    Facets

    We will not get into too much detail about facets, but let's just say that this feature may be used to define filtering axes on the data, and thus to post-filter a result set. In this example, we have defined four different facets: on the population, on the elevation, on the feature_code and on the main_administrative_region. We will see illustrations of these facets below.

    We give here an example of the definition of a Facet:

    class LocationPopulationFacet(facet.RangeFacet):
        __regid__ = 'population-facet'
        __select__ = is_instance('Location')
        order = 2
        rtype = 'population'
    

    where __select__ defines which class(es) of entities are targeted by this facet, order defines the order of display of the different facets, and rtype defines the target attribute/relation that will be used for filtering.

    Geonames in CubicWeb

    The main page of the Geonames application is illustrated in the screenshot below. It provides general information on the database, in particular the number of entities in the different classes:

    • 7,984,330 locations.
    • 59,201 administrative regions (e.g. regions, counties, departments...)
    • 7,766 languages.
    • 656 features (e.g. types of location).
    • 410 time zones.
    • 252 countries.
    • 7 continents.
    http://www.cubicweb.org/file/2124617?vid=download

    Simple query

    We will first illustrate the possibilities of CubicWeb with the simple query that we have detailed before (that could be directly pasted in the url...):

    Any X LIMIT 10 WHERE X is Location, X population > 1000000,
        X country C, C name "France"
    

    We obtain the following page:

    http://www.cubicweb.org/file/2124615?vid=download

    This is the standard view of CubicWeb for displaying results. We can see (right box) that we obtain 10 locations that are indeed located in France, with a population of more than 1,000,000 inhabitants. The left box shows the search panel that could be used to launch queries, and the facet filters that may be used for filtering results, e.g. we may ask to keep only results with a population greater than 4,767,709 inhabitants within the previous results:

    http://www.cubicweb.org/file/2124616?vid=download

    and we now obtain only 4 results. We can also notice that the facets are linked: restricting the result set with the population facet also restricts the values offered by the other facets.

    Simple query (but with more information !)

    Let's say that we now want more information about the results that we have obtained previously (for example the exact population, the elevation and the name). This is really simple ! We just have to ask within the RQL query what we want (of course, the names N, P, E of the variables could be almost anything...):

    Any N, P, E LIMIT 10 WHERE X is Location,
        X population P, X population > 1000000,
        X elevation E, X name N, X country C, C name "France"
    
    http://www.cubicweb.org/file/2124619?vid=download

    The empty column for the elevation simply means that we don't have any information about elevation.

    Anyway, we can see that fetching particular information could not be simpler! Indeed, with more complex queries, we can access a wealth of information from the Geonames database:

    Any N,E,LA,LO ORDERBY E DESC LIMIT 10  WHERE X is Location,
          X latitude LA, X longitude LO,
          X elevation E, NOT X elevation NULL, X name N,
          X country C, C name "France"
    

    which means:

    Give me the 10 highest locations (the 10 first when sorting by decreasing elevation) with their name, elevation, latitude and longitude that are in a country named "France"
    http://www.cubicweb.org/file/2124626?vid=download

    We can now use another view on the same request, e.g. on a map (view mapstraction.map):

    Any X ORDERBY E DESC LIMIT 10  WHERE X is Location,
           X latitude LA, X longitude LO, X elevation E,
           NOT X elevation NULL, X country C, C name "France"
    
    http://www.cubicweb.org/file/2124631?vid=download

    And now, we can add the fact that we want more results (20), and that the location should have a non-null population:

    Any N, E, P, LA, LO ORDERBY E DESC LIMIT 20  WHERE X is Location,
           X latitude LA, X longitude LO,
           X elevation E, NOT X elevation NULL, X population P,
           X population > 0, X name N, X country C, C name "France"
    
    http://www.cubicweb.org/file/2124632?vid=download

    ... and on a map ...

    http://www.cubicweb.org/file/2124633?vid=download

    Conclusion

    In this blog post, we have seen how CubicWeb could be used to store and query complex data, while providing (among other things) Web-based views for data visualization. It allows the user to directly query data within the URL and may be used to interact with and explore the data in depth. In a future post, we will give more complex queries to show the full possibilities of the system.


  • Importing thousands of entities into CubicWeb within a few seconds with dataimport

    2011/12/09 by Adrien Di Mascio

    In most cubicweb projects I've been developing on, there always comes a time when I need to import legacy data in the new application. CubicWeb provides Store and Controller objects in the dataimport module. I won't talk here about the recommended general procedure described in the module's docstring (I find it a bit convoluted for simple cases) but I will focus on Store objects. Store objects in this module are more or less a thin layer around session objects: they provide high-level helpers such as create_entity() and relate(), and keep track of what was inserted, which errors occurred, etc.

    In a recent project, I had to create a fair number (a few million) of simple entities (strings, integers, floats and dates) and relations. The default object store (i.e. cubicweb.dataimport.RQLObjectStore) is painfully slow, the reason being all the integrity / security / metadata hooks that are constantly selected and executed. For large imports, dataimport also provides the cubicweb.dataimport.NoHookRQLObjectStore. This store bypasses all hooks and uses the underlying system source primitives directly, making it around twice as fast as the standard store. The problem is that we're still doing each sql query sequentially and we're talking here of millions of INSERT / UPDATE queries.

    My idea was to create my own ObjectStore class inheriting from NoHookRQLObjectStore that would try to use executemany or even copy_from when possible [1]. It is actually not hard to make groups of similar SQL queries since create_entity() generates the same query for a given set of parameters. For instance:

    create_entity('Person', firstname='John', surname='Doe')
    create_entity('Person', firstname='Tim', surname='BL')
    

    will generate the following sql queries:

    INSERT INTO cw_Person ( cw_cwuri, cw_eid, cw_modification_date,
                            cw_creation_date, cw_firstname, cw_surname )
           VALUES ( %(cw_cwuri)s, %(cw_eid)s, %(cw_modification_date)s,
                    %(cw_creation_date)s, %(cw_firstname)s, %(cw_surname)s )
    INSERT INTO cw_Person ( cw_cwuri, cw_eid, cw_modification_date,
                            cw_creation_date, cw_firstname, cw_surname )
           VALUES ( %(cw_cwuri)s, %(cw_eid)s, %(cw_modification_date)s,
                    %(cw_creation_date)s, %(cw_firstname)s, %(cw_surname)s )
    

    The only thing that will differ is the actual data inserted. Well ... ahem ... CubicWeb actually also generates a "few" extra sql queries to insert metadata for each entity:

    INSERT INTO is_instance_of_relation(eid_from,eid_to) VALUES (%s,%s)
    INSERT INTO is_relation(eid_from,eid_to) VALUES (%s,%s)
    INSERT INTO cw_source_relation(eid_from,eid_to) VALUES (%s,%s)
    INSERT INTO owned_by_relation ( eid_to, eid_from ) VALUES ( %(eid_to)s, %(eid_from)s )
    INSERT INTO created_by_relation ( eid_to, eid_from ) VALUES ( %(eid_to)s, %(eid_from)s )
    

    Those extra queries are exactly the same for each entity inserted, whatever the entity type is, hence begging for executemany or copy_from. Grouping SQL queries together is not that hard [2] but has a drawback: as you don't have an intermediate state (the data is actually inserted only at the very end of the process), you lose the ability to query your database to fetch the entities you've just created during the import.
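
    The grouping idea, and its drawback, can be shown with a small self-contained sketch. This is not the store described in this post (whose code is not shown here): DeferredSQLStore is a made-up class, it uses Python 3 and the standard sqlite3 module instead of PostgreSQL so that it runs anywhere, and the cw_Person table is reduced to three columns.

```python
import sqlite3
from collections import defaultdict


class DeferredSQLStore:
    """Record SQL statements and run them grouped by query text."""

    def __init__(self, connection):
        self.connection = connection
        self.pending = defaultdict(list)  # query text -> list of parameters

    def execute(self, query, params):
        # Nothing touches the database yet: the statement is only recorded.
        self.pending[query].append(params)

    def flush(self):
        # One executemany() call per distinct query instead of one
        # execute() call per row.
        cursor = self.connection.cursor()
        for query, all_params in self.pending.items():
            cursor.executemany(query, all_params)
        self.connection.commit()
        self.pending.clear()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cw_Person "
             "(cw_eid INTEGER, cw_firstname TEXT, cw_surname TEXT)")

store = DeferredSQLStore(conn)
insert = ("INSERT INTO cw_Person (cw_eid, cw_firstname, cw_surname) "
          "VALUES (:cw_eid, :cw_firstname, :cw_surname)")
store.execute(insert, {"cw_eid": 1, "cw_firstname": "John", "cw_surname": "Doe"})
store.execute(insert, {"cw_eid": 2, "cw_firstname": "Tim", "cw_surname": "BL"})

before_flush = conn.execute("SELECT COUNT(*) FROM cw_Person").fetchone()[0]
store.flush()
after_flush = conn.execute("SELECT COUNT(*) FROM cw_Person").fetchone()[0]
```

    Both inserts share the same query text, so they end up in a single executemany() call. The count taken before flush() is zero, which is exactly the drawback mentioned above: entities cannot be fetched from the database until the recorded statements have been executed.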

    Now, a few benchmarks ...

    To create those benchmarks, I decided to use the workorder cube, which is simple yet complete enough: it provides only two entity types (WorkOrder and Order), a relation between them (Order split_into WorkOrder) and uses different kinds of attributes (String, Date, Float).

    Once the cube was instantiated, I ran the following script to populate the database with my 3 different stores:

    import sys
    from datetime import date
    from random import choice
    from itertools import count
    
    from logilab.common.decorators import timed
    
    from cubicweb import cwconfig
    from cubicweb.dbapi import in_memory_repo_cnx
    
    def workorders_data(n, seq=count()):
        for i in xrange(n):
            yield {'title': u'wo-title%s' % seq.next(), 'description': u'foo',
                   'begin_date': date.today(), 'end_date': date.today()}
    
    def orders_data(n, seq=count()):
        for i in xrange(n):
            yield {'title': u'o-title%s' % seq.next(), 'date': date.today(), 'budget': 0.8}
    
    def split_into(orders, workorders):
        for workorder in workorders:
            yield choice(orders), workorder
    
    def initial_state(session, etype):
        return session.execute('Any S WHERE S is State, WF initial_state S, '
                               'WF workflow_of ET, ET name %(etn)s', {'etn': etype})[0][0]
    
    
    @timed
    def populate(store, nb_workorders, nb_orders, set_state=False):
        orders = [store.create_entity('Order', **attrs)
                  for attrs in orders_data(nb_orders)]
        workorders = [store.create_entity('WorkOrder', **attrs)
                      for attrs in workorders_data(nb_workorders)]
        ## in_state is set by a hook, so NoHookRQLObjectStore will need
        ## to set the relation manually
        if set_state:
            order_state = initial_state(store.session, 'Order')
            workorder_state = initial_state(store.session, 'WorkOrder')
            for order in orders:
                store.relate(order.eid, 'in_state', order_state)
            for workorder in workorders:
                store.relate(workorder.eid, 'in_state', workorder_state)
        for order, workorder in split_into(orders, workorders):
            store.relate(order.eid, 'split_into', workorder.eid)
        store.commit()
    
    
    if __name__ == '__main__':
        config = cwconfig.instance_configuration(sys.argv[1])
        nb_orders = int(sys.argv[2])
        nb_workorders = int(sys.argv[3])
        repo, cnx = in_memory_repo_cnx(config, login='admin', password='admin')
        session = repo._get_session(cnx.sessionid)
        from cubicweb.dataimport import RQLObjectStore, NoHookRQLObjectStore
        from cubes.mycube.dataimport.store import CopyFromRQLObjectStore
        print 'testing RQLObjectStore'
        store = RQLObjectStore(session)
        populate(store, nb_workorders, nb_orders)
        print 'testing NoHookRQLObjectStore'
        store = NoHookRQLObjectStore(session)
        populate(store, nb_workorders, nb_orders, set_state=True)
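        print 'testing CopyFromRQLObjectStore'
        store = CopyFromRQLObjectStore(session)
        # this store also bypasses hooks, so in_state must be set manually
        populate(store, nb_workorders, nb_orders, set_state=True)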
    

    I ran the script and asked it to create 100 Order entities and 1000 WorkOrder entities, and to link each created WorkOrder to a parent Order:

    adim@esope:~/tmp/bench_cwdi$ python bench_cwdi.py bench_cwdi 100 1000
    testing RQLObjectStore
    populate clock: 24.590000000 / time: 46.169721127
    testing NoHookRQLObjectStore
    populate clock: 8.100000000 / time: 25.712352991
    testing CopyFromRQLObjectStore
    populate clock: 0.830000000 / time: 1.180006981
    

    My interpretation of the above times is :

    • The clock time indicates the time spent on the CubicWeb server side (i.e. hooks and data pre/postprocessing around SQL queries). The time value should be the sum of the clock time and the time spent in postgresql.
    • RQLObjectStore is slow ;-). Nothing new here, but the clock/time ratio means that we're spending a lot of time on the python side (i.e. hooks, as I said earlier) and a fair amount of time in postgresql.
    • NoHookRQLObjectStore really cuts down the time spent on the python side, while the time in postgresql remains about the same as for RQLObjectStore. This is not surprising, since the queries performed are the same in both cases.
    • CopyFromRQLObjectStore seems blazingly fast in comparison (inserting a few thousand elements in postgresql with a COPY FROM statement is not a problem). And ... yes, I checked the data was actually inserted, and I even ran a cubicweb-ctl db-check on the instance afterwards.
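
    Taking the interpretation above at face value (time = clock + time spent in postgresql), the figures from the benchmark output can be broken down with a few lines of Python 3. The numbers are copied from the run above; the variable names are just for this example.

```python
# (clock, time) pairs copied from the benchmark output above, in seconds.
results = {
    "RQLObjectStore": (24.59, 46.169721127),
    "NoHookRQLObjectStore": (8.10, 25.712352991),
    "CopyFromRQLObjectStore": (0.83, 1.180006981),
}

baseline = results["RQLObjectStore"][1]
summary = {}
for name, (clock, total) in results.items():
    summary[name] = {
        # Under the interpretation above, time = clock + time in postgresql.
        "postgresql_seconds": round(total - clock, 2),
        "speedup_vs_rql_store": round(baseline / total, 1),
    }

for name, values in summary.items():
    print(name, values)
```

    Under that reading, the time attributed to postgresql is about 21.6 s for RQLObjectStore, 17.6 s for NoHookRQLObjectStore and 0.35 s for CopyFromRQLObjectStore, and the overall wall-clock speedups relative to RQLObjectStore are roughly 1.8x and 39x.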

    This probably opens new perspectives for massive data imports since the client API remains the same as before for the programmer. It's still a bit experimental and can only be used for "dummy", brute-force import scenarios where you can preprocess your data in Python before updating the database, but it's probably worth having such a store in the dataimport module.
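
    As an illustration of the COPY FROM approach, the Python 3 sketch below prepares rows in the tab-separated text format that psycopg2's copy_from() reads by default. The helper names are made up for this example and this is not the code of CopyFromRQLObjectStore; copy_rows is defined but not called, since it needs a live psycopg2 cursor.

```python
import io


def escape_copy_value(value):
    """Format one value for PostgreSQL's text COPY format."""
    if value is None:
        return r"\N"  # default NULL marker
    text = str(value)
    # Backslash first, so the escapes added below are not escaped again.
    for char, escaped in (("\\", "\\\\"), ("\t", "\\t"),
                          ("\n", "\\n"), ("\r", "\\r")):
        text = text.replace(char, escaped)
    return text


def rows_to_copy_buffer(rows):
    """Turn a list of tuples into a tab-separated, file-like buffer."""
    lines = ("\t".join(escape_copy_value(v) for v in row) for row in rows)
    return io.StringIO("".join(line + "\n" for line in lines))


def copy_rows(cursor, table, columns, rows):
    """Load `rows` with a single COPY FROM (cursor is a psycopg2 cursor)."""
    cursor.copy_from(rows_to_copy_buffer(rows), table, columns=columns)


buffer = rows_to_copy_buffer([(1, "John", "Doe"), (2, "Tim\tB", None)])
content = buffer.getvalue()
```

    This only works for values that are easy to serialize as text, which matches the "simple data types" restriction mentioned in footnote [1].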

    [1] The idea is to promote an executemany('INSERT INTO ...', data) statement into a COPY FROM whenever possible (i.e. simple data types, easy enough to escape). In that case, the underlying database and python modules have to provide support for this functionality. For the record, the psycopg2 module exposes a copy_from() method and soon logilab-database will provide an additional high-level helper for this functionality (see this ticket).
    [2] The code will be posted later or even integrated into CubicWeb at some point. For now, it requires a bit of monkey patching around one or two methods in the source so that SQL is not executed but just recorded for later execution.

  • Reusing OpenData from Data.gouv.fr with CubicWeb in 2 hours

    2011/12/07 by Vincent Michel

    Data.gouv.fr is great news for the OpenData movement!

    Two days ago, the French government released thousands of data sets on http://data.gouv.fr/ under an open licensing scheme that allows people to access and play with them. Thanks to the CubicWeb semantic web framework, it took us only a couple of hours to put some of that open data to good use. Here is how we mapped the French railway system.

    http://www.cubicweb.org/file/2110281?vid=download

    Train stations in French Brittany

    Source Datasets

    We used two of the datasets available on data.gouv.fr:

    • Train stations : description of the 6442 train stations in France, including their name, type and geographic coordinates. Here is a sample of the file

      441000;St-Germain-sur-Ille;Desserte Voyageur;48,23955;-1,65358
      441000;Montreuil-sur-Ille;Desserte Voyageur-Infrastructure;48,3072;-1,6741
      
    • LevelCrossings : description of the 18159 level crossings on French railways, including their type and location. Here is a sample of the file

      558000;PN privé pour voitures avec barrières sans passage piétons accolé;48,05865;1,60697
      395000;PN privé pour voitures avec barrières avec passage piétons accolé public;;48,82544;1,65795
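
    Reading the train stations file mostly comes down to splitting on semicolons and converting decimal commas. Here is a minimal Python 3 sketch using the two sample lines quoted above. The column names, and in particular the meaning of the first column, are assumptions of this example; the actual import command is not shown in this post.

```python
import csv
import io

# The two sample lines quoted above for the train stations file.
SAMPLE = (
    "441000;St-Germain-sur-Ille;Desserte Voyageur;48,23955;-1,65358\n"
    "441000;Montreuil-sur-Ille;Desserte Voyageur-Infrastructure;48,3072;-1,6741\n"
)


def parse_stations(stream):
    """Yield one dict per train station line."""
    reader = csv.reader(stream, delimiter=";")
    for code, name, feature_type, latitude, longitude in reader:
        yield {
            "code": code,  # meaning of this first column is an assumption
            "name": name,
            "feature_type": feature_type,
            # The file uses a decimal comma: "48,23955" -> 48.23955
            "latitude": float(latitude.replace(",", ".")),
            "longitude": float(longitude.replace(",", ".")),
        }


stations = list(parse_stations(io.StringIO(SAMPLE)))
```

    The level crossings file has a different column layout, so it would need its own parsing function.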
      

    Data Model

    Given the above datasets, we wrote the following data model to store the data in CubicWeb:

    class Location(EntityType):
        name = String(indexed=True)
        latitude = Float(indexed=True)
        longitude = Float(indexed=True)
        feature_type = SubjectRelation('FeatureType', cardinality='?*')
        data_source = SubjectRelation('DataGovSource', cardinality='1*', inlined=True)
    
    class FeatureType(EntityType):
        name = String(indexed=True)
    
    class DataGovSource(EntityType):
        name = String(indexed=True)
        description = String()
        uri = String(indexed=True)
        icon = String()
    

    The Location object is used for both train stations and level crossings. It has a name (text information), a latitude and a longitude (numeric information); given the cardinalities declared above, it can be linked to at most one FeatureType object and is linked to one DataGovSource. The FeatureType object is used to store the type of train station or level crossing and is defined by a name (text information). The DataGovSource object is defined by a name, a description and a uri used to link back to the source data on data.gouv.fr.

    http://www.cubicweb.org/file/2110311?vid=download

    Schema of the data model

    Data Import

    We had to write a few lines of code to benefit from the massive data import feature of CubicWeb before we could load the content of the CSV files with a single command:

    $ cubicweb-ctl import-datagov-location datagov_geo gare.csv-fr.CSV  --source-type=gare
    $ cubicweb-ctl import-datagov-location datagov_geo passage_a_niveau.csv-fr.CSV  --source-type=passage
    

    In less than a minute, the import was completed and we had:

    • 2 DataGovSource objects, corresponding to the two data sets,
    • 24 FeatureType objects, corresponding to the different types of locations that exist (e.g. Non exploitée, Desserte Voyageur, PN public isolé pour piétons avec portillons or PN public pour voitures avec barrières gardé avec passage piétons accolé manoeuvré à distance),
    • 24601 Locations, corresponding to the different train stations and level crossings.

    Data visualization

    CubicWeb makes it possible to build complex applications by assembling existing components (called cubes). Here we used a cube that wraps the Mapstraction and the OpenLayers libraries to display information on maps using data from OpenStreetMap.

    In order for the Location type defined in the data model to be displayable on a map, it is sufficient to write the following adapter:

    from cubicweb.selectors import is_instance
    from cubicweb.view import EntityAdapter
    
    class IGeocodableAdapter(EntityAdapter):
        __regid__ = 'IGeocodable'
        __select__ = is_instance('Location')
    
        @property
        def latitude(self):
            return self.entity.latitude
    
        @property
        def longitude(self):
            return self.entity.longitude
    

    That was it for the development part! The next step was to use the application to browse the structure of the French train network on the map.

    Train stations in use:

    http://www.cubicweb.org/file/2110279?vid=download

    Train stations not in use:

    http://www.cubicweb.org/file/2110280?vid=download

    Zooming in on some parts of the map, for example Brittany, we get to see more details, and clicking on the train icons gives more information on the corresponding Location.

    Train stations in use:

    http://www.cubicweb.org/file/2110281?vid=download

    Train stations not in use:

    http://www.cubicweb.org/file/2110282?vid=download

    Since CubicWeb separates querying the data and displaying the result of a query, we can switch the view to display the same data in tables or to export it back to a CSV file.

    http://www.cubicweb.org/file/2110313?vid=download

    Querying Data

    CubicWeb implements a query language (RQL) very similar to SPARQL, which makes the data available without the need to learn a specific API.

    • Example 1: http://some.url.demo/?rql=Any X WHERE X is Location, X name LIKE "%miny"

      This request gives all the Location entities whose name ends with "miny". It returns only one element, the Firminy train station.

    http://www.cubicweb.org/file/2110286?vid=download
    • Example 2: http://some.url.demo/?rql=Any X WHERE X is Location, X name LIKE "%ny"

      This request gives all the Location entities whose name ends with "ny", and returns 112 train stations.

    http://www.cubicweb.org/file/2110287?vid=download
    • Example 3: http://some.url.demo/?rql=Any X WHERE X latitude < 47.8, X latitude > 47.6, X longitude > -1.9, X longitude < -1.8

      This request gives all the Location entities that have a latitude between 47.6 and 47.8, and a longitude between -1.9 and -1.8.

      We obtain 11 Location entities (9 level crossings and 2 train stations). We can map them using the mapstraction.map view described previously.

      http://www.cubicweb.org/file/2110288?vid=download
    • Example 4: http://domainname:8080/?rql=Any X WHERE X latitude < 47.8, X latitude > 47.6, X longitude > -1.9, X longitude < -1.8, X feature_type F, F name "Desserte Voyageur"

      This limits the previous result set to train stations that are used for passenger service:

      http://www.cubicweb.org/file/2110289?vid=download
    • Example 5: http://domainname:8080/?rql=Any X WHERE X feature_type F, F name "PN public pour voitures sans barrières sans SAL"&vid=mapstraction.map

      Finally, one can map all the level crossings for vehicles without barriers (there are 3704):

      http://www.cubicweb.org/file/2110290?vid=download http://www.cubicweb.org/file/2110291?vid=download

    As you can see in the last URL, the map view was chosen directly with the vid parameter, meaning that the URL is shareable and can easily be included in a blog post, with an iframe for example.
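
    Since a query and its view are both plain URL parameters, such links are easy to generate from a script. Here is a minimal Python 3 sketch, assuming the placeholder base URL used in the examples. The bbox_rql and query_url helpers are hypothetical names introduced for illustration and are not part of CubicWeb.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def bbox_rql(lat_min, lat_max, lon_min, lon_max):
    """Return an RQL query selecting entities inside a bounding box."""
    return (
        "Any X WHERE X latitude > %s, X latitude < %s, "
        "X longitude > %s, X longitude < %s"
        % (lat_min, lat_max, lon_min, lon_max)
    )


def query_url(base_url, rql, vid=None):
    """Build a shareable URL from an RQL query and an optional view id."""
    params = {"rql": rql}
    if vid is not None:
        params["vid"] = vid
    # urlencode takes care of escaping spaces, commas and comparison signs
    return base_url + "?" + urlencode(params)


url = query_url("http://some.url.demo/",
                bbox_rql(47.6, 47.8, -1.9, -1.8),
                vid="mapstraction.map")
print(url)
```

    The bounding box here is the one of Example 3. Interpolating values into the query text is acceptable for numbers generated by the script itself, but anything typed by a user should be validated first.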

    Data sharing

    The result of a query can also be "displayed" in RDF, thus allowing users to download a semantic version of the information, without having to do the preprocessing themselves:

    <rdf:Description rdf:about="cwuri24684b3a955d4bb8830b50b4e7521450">
      <rdf:type rdf:resource="http://ns.cubicweb.org/cubicweb/0.0/Location"/>
      <cw:cw_source rdf:resource="http://some.url.demo/"/>
      <cw:longitude rdf:datatype="http://www.w3.org/2001/XMLSchema#float">-1.89599</cw:longitude>
      <cw:latitude rdf:datatype="http://www.w3.org/2001/XMLSchema#float">47.67778</cw:latitude>
      <cw:feature_type rdf:resource="http://some.url.demo/7222"/>
      <cw:data_source rdf:resource="http://some.url.demo/7206"/>
    </rdf:Description>
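
    A consumer of this RDF export can read the coordinates back with a few lines of code. The sketch below uses only the Python 3 standard library. The fragment above omits the enclosing rdf:RDF element, so the sketch wraps a shortened copy of it in one, and it assumes that the cw prefix is bound to the namespace visible in the rdf:type URI. The locations function is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: the cw prefix is bound to the namespace seen in the rdf:type URI.
CW_NS = "http://ns.cubicweb.org/cubicweb/0.0/"

document = """\
<rdf:RDF xmlns:rdf="%s" xmlns:cw="%s">
  <rdf:Description rdf:about="cwuri24684b3a955d4bb8830b50b4e7521450">
    <rdf:type rdf:resource="http://ns.cubicweb.org/cubicweb/0.0/Location"/>
    <cw:longitude rdf:datatype="http://www.w3.org/2001/XMLSchema#float">-1.89599</cw:longitude>
    <cw:latitude rdf:datatype="http://www.w3.org/2001/XMLSchema#float">47.67778</cw:latitude>
  </rdf:Description>
</rdf:RDF>
""" % (RDF_NS, CW_NS)


def locations(rdf_xml):
    """Yield (uri, latitude, longitude) for each geolocated description."""
    root = ET.fromstring(rdf_xml)
    for desc in root.iter("{%s}Description" % RDF_NS):
        lat = desc.findtext("{%s}latitude" % CW_NS)
        lon = desc.findtext("{%s}longitude" % CW_NS)
        if lat is None or lon is None:
            continue  # not a geolocated resource
        yield desc.get("{%s}about" % RDF_NS), float(lat), float(lon)


result = list(locations(document))
print(result)
```

    For anything beyond a quick script, a dedicated RDF library would be a more robust choice than parsing the XML serialization directly.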
    

    Conclusion

    For someone who knows the CubicWeb framework, a couple of hours are enough to create a CubicWeb application that stores, displays, queries and shares data downloaded from http://www.data.gouv.fr/.

    The full source code for the above will be released before the end of the week.

    If you want to see more of CubicWeb in action, browse http://data.bnf.fr or learn how to develop your own application at http://docs.cubicweb.org/


  • Roundup of "Powered by Cubicweb" websites

    2011/11/15 by Arthur Lutz

    Here is an (incomplete) list of public websites powered by CubicWeb. A lot of CubicWeb technology is used for private web applications in large companies that we cannot list here.

    Demos are listed here : http://www.cubicweb.org/card/demo

    You can also find a list of the companies providing services for CubicWeb (with a few extra examples): https://www.cubicweb.org/card/CubicWebServiceProviders


  • What's new in CubicWeb 3.14?

    2011/11/10 by Sylvain Thenault

    The development of CubicWeb 3.14 was rather long and included a lot of API changes, detailed here. As usual, backward compatibility is provided for public APIs.

    Please note this release depends on yams 0.34 (which is incompatible with prior cubicweb releases regarding instance re-creation).

    API changes

    • Entity.fetch_rql: the restriction argument has been deprecated. Instead, call the new Entity.fetch_rqlst method, take the returned value (an RQL Select node) and use the RQL syntax tree API to add the restrictions you need.

      Backward compat is kept with proper warning.

    • Entity.fetch_order and Entity.fetch_unrelated_order class methods have been replaced by Entity.cw_fetch_order and Entity.cw_fetch_unrelated_order with a different prototype:

      • instead of taking (attr, var) as two string arguments, they now take (select, attr, var) where select is the RQL syntax tree being constructed and var the variable node.
      • instead of returning a string to be inserted in the 'ORDERBY' clause, they have to modify the syntax tree.

      Backward compat is kept with proper warning, except if:

      • a custom order method returns something other than a variable name with or without the sorting order (e.g. cases where you sort on the value of a registered procedure, as was done in the tracker for instance). In such a case, an error is logged telling that this sorting is ignored until API upgrade.
      • client code uses direct access to one of those methods on an entity (no code known to do that).
    • Entity._rest_attr_info class method has been renamed to Entity.cw_rest_attr_info

      No backward compat since this is a protected method and no code is known to use it outside cubicweb itself.

    • AnyEntity.linked_to has been removed as part of a refactoring of this functionality (linking an entity to another one at creation step). It was replaced by an EntityFieldsForm.linked_to property.

      In the same refactoring, cubicweb.web.formfield.relvoc_linkedto, cubicweb.web.formfield.relvoc_init and cubicweb.web.formfield.relvoc_unrelated were removed and replaced by RelationField methods with the same names, that take a form as a parameter.

      No backward compatibility yet. It's still time to cry for it. Cubes known to be affected: tracker, vcsfile, vcreview.

    • CWPermission entity type and its associated require_permission relation type (abstract) and require_group relation definitions have been moved to a new localperms cube. Some functions from the cubicweb.schemas package as well as some views were moved too. This makes cubicweb itself smaller while you get all the local permissions stuff in a single and documented place.

      Backward compat is kept for existing instances, though you should have installed the localperms cube. A proper error should be displayed when trying to migrate to 3.14 an instance that uses CWPermission without the new cube installed. For new instances / tests, you should add a dependency on the new cube in cubes using this feature, along with a dependency on cubicweb >= 3.14.

    • jQuery has been updated to 1.6.4 and jquery-tablesorter to 2.0.5. No backward compat issue known.

    • Table views refactoring: new RsetTableView and EntityTableView, as well as a rewritten and enhanced version of PyValTableView on the same bases, with logic moved to some column renderers and a layout. Those should be well documented and deprecate the former TableView, EntityAttributesTableView and CellView, which are however kept for backward compat, with some warnings that may unfortunately not be very clear (you may see your own table view subclass name here, which doesn't make the problem that clear). Notice that _cw.view('table', rset, **kwargs) will be routed to the new RsetTableView or to the old TableView depending on the given extra arguments. See #1986413.

    • display_name doesn't call .lower() anymore. This may lead to changes in your user interface. There are now different msgids for the upper and lower case versions of entity type names, as this is the only proper way to handle this with some languages.

    • IEditControlAdapter has been deprecated in favor of EditController overloading, which was made easier by adding dedicated selectors called match_edited_type and match_form_id.

    • Pre 3.6 API backward compat has been dropped, though data migration compatibility has been kept. You may have to fix errors due to old API usage in your instance before being able to run the migration, but then you should be able to upgrade even a pre 3.6 database.

    • Deprecated cubicweb.web.views.iprogress in favor of new iprogress cube.

    • Deprecated cubicweb.web.views.flot in favor of new jqplot cube.

    Unintrusive API changes

    • Refactored properties forms (e.g. user preferences and site-wide properties) as well as pagination components to ease overriding.

    • New cubicweb.web.uihelper module with high-level helpers for uicfg.

    • New anonymized_request decorator to temporarily run code as an anonymous user, whatever the currently logged-in user.

    • New 'verbatimattr' attribute view.

    • New facet and form widget for Integer used to store binary mask.

    • New js_href function to generate proper javascript hrefs.

    • match_kwargs and match_form_params selectors both accept a new once_is_enough argument.

    • printable_value is now a method of request, and may be given dict of formatters to use.

    • [Rset]TableView allows setting None in 'headers', meaning the label should be fetched from the result set as done by default.

    • Field vocabulary computation on entity creation now takes __linkto information into account.

    • Started a cubicweb.pylintext pylint plugin to help pylint analyze cubes: you should now use

      pylint --load-plugins=cubicweb.pylintext
      

      to analyse your cubicweb code.

    RQL

    User interface changes

    • Datafeed sources now present a history of the latest imports' logs, including global status and debug/info/warning/error messages issued during imports. Import logs older than a configurable amount of time are automatically deleted.
    • The breadcrumbs component is properly kept when creating an entity with '__linkto'.
    • The users and groups management link now really leads to that (i.e. it includes groups management).
    • New 'jsonp' controller with 'jsonexport' and 'ejsonexport' views.

    Configuration

    • Added option 'resources-concat' to make javascript/css files concatenation optional, making JS debugging a lot easier when needed.

    As usual, 3.14 also includes a bunch of other minor changes and bug fixes, though this time an effort has been made so that every API change / new API is listed here. Please download and install CubicWeb 3.14 and report any problem on the tracker and/or the mailing-list!

    Enjoy!


  • ensure that 2 boolean attributes of an entity never have the same value

    2011/09/08

    I want to implement an entity with 2 boolean attributes, and a requirement is that these two attributes never have the same boolean value (think of some kind of radio buttons).

    Let's start with a simple schema example:

    # in schema.py
    class MyEntity(EntityType):
       use_option1 = Boolean(required=True, default=True)
       use_option2 = Boolean(required=True, default=False)
    

    So new entities will conform to the spec.

    To keep it that way when entities are edited, you need two things:

    • a constraint in the entity schema which will be violated if both attributes have the same value
    • a hook which will toggle the other attribute when one attribute is changed.

    RQL constraints are generally meant to be used on relations, but you can use them on attributes too. Simply use 'S' to denote the entity, and write the constraint normally. You need to have the same constraint on both attributes, because the constraint evaluation is triggered by the modification of the attribute.

    # in schema.py
    from yams.buildobjs import EntityType, Boolean
    from cubicweb.schema import RQLConstraint
    
    class MyEntity(EntityType):
       use_option1 = Boolean(required=True, default=True,
                             constraints = [
                                  RQLConstraint('S use_option1 O1, S use_option2 != O1')
                                           ])
       use_option2 = Boolean(required=True, default=False,
                             constraints = [
                                  RQLConstraint('S use_option1 O1, S use_option2 != O1')
                                           ])
    

    With this update, it is no longer possible to have both options set to True or False (you will get a ValidationError). What would be nice is to have the other option updated automatically when one of the two attributes is changed, so that you don't have to take care of this when editing the entity in the web interface (which you cannot do anyway if you are using reledit for instance).

    A nice way of writing the hook is to use Python's sets to avoid tedious logic code:

    from cubicweb.selectors import is_instance
    from cubicweb.server.hook import Hook
    
    class RadioButtonUpdateHook(Hook):
       '''ensure use_option1 = not use_option2 (and conversely)'''
       __regid__ = 'mycube.radiobuttonhook'
       events = ('before_update_entity', 'before_add_entity')
       __select__ = Hook.__select__ & is_instance('MyEntity')
       # we prebuild the set of boolean attribute names
       _flag_attributes = set(('use_option1', 'use_option2'))
       def __call__(self):
           entity = self.entity
           edited = set(entity.cw_edited)
           attributes = self._flag_attributes
           if attributes.issubset(edited):
               # both were changed, let the integrity hooks do their job
               return
           if not attributes & edited:
               # none of our attributes were changed, do nothing
               return
           # find which attribute was modified
           modified_set = attributes & edited
           # find the name of the other attribute
           to_change = (attributes - modified_set).pop()
           modified_name = modified_set.pop()
           # set the value of that attribute
           entity.cw_edited[to_change] = not entity.cw_edited[modified_name]
    

    That's it!
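
    If you want to experiment with the set logic of the hook without a CubicWeb instance at hand, it can be reproduced in Python 3 on a plain dictionary. This is only a sketch: the sync_flags function is a hypothetical name and the dict merely stands in for entity.cw_edited.

```python
FLAGS = frozenset(("use_option1", "use_option2"))


def sync_flags(edited, flags=FLAGS):
    """Set the untouched flag to the negation of the edited one.

    `edited` is a plain dict standing in for entity.cw_edited. Nothing is
    changed when none or both of the flags were edited.
    """
    modified = flags & set(edited)
    if len(modified) != 1:
        return edited
    (modified_name,) = modified          # the single flag that was edited
    (to_change,) = flags - modified      # the other flag
    edited[to_change] = not edited[modified_name]
    return edited


print(sync_flags({"use_option1": False}))
```

    When both flags are edited at once the function does nothing, just like the hook, and it is then up to the RQL constraints to reject inconsistent values.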


  • What's new in CubicWeb 3.13?

    2011/07/21 by Sylvain Thenault

    CubicWeb 3.13 has been developed for a while and includes some cool stuff:

    • generate and handle Apache's modconcat compatible URLs, to minimize the number of HTTP requests necessary to retrieve JS and CSS files, along with a new cubicweb-ctl command to generate a static 'data' directory that can be served by a front-end instead of CubicWeb
    • major facet enhancements:
      • nicer layout and visual feedback when filtering is in-progress
      • new RQLPathFacet to easily express new filters that are more than one hop away from the filtered entities
      • a more flexible API, usable in cases where it wasn't previously possible
    • some form handling refactorings and cleanups, notably introduction of a new method to process posted content, and updated documentation
    • support for new base types : BigInt, TZDateTime and TZTime (in 3.12 actually for those two)
    • write queries optimization, and several RQL fixes on complex queries (e.g. using HAVING, sub-queries...), as well as new support for CAST() function and REGEXP operator
    • datafeed source and default CubicWeb xml parsers:
      • refactored into smaller and overridable chunks
      • easier to configure
      • made to actually work

    As usual, the 3.13 also includes a bunch of other minor enhancements, refactorings and bug fixes. Please download and install CubicWeb 3.13 and report any problem on the tracker and/or the mailing-list!

    Enjoy!


  • CubicWeb sprint in Paris / Need for Speed

    2011/03/22 by Adrien Di Mascio

    Logilab is hosting a CubicWeb sprint - 3 days in our Paris offices.

    The general focus will be on speed :

    • on cubicweb-server side : improve performance of massive insertions / deletions
    • on cubicweb-client side : cache implementation, HTTP server, massive parallel usage, etc.

    This sprint will take place in April 2011, from Tuesday the 26th to Thursday the 28th. You are more than welcome to come along and help out or contribute but, unlike previous sprints, at least basic knowledge of CubicWeb will be required from participants since no introduction is planned.

    Network resources will be available for those bringing laptops.

    Address : 104 Boulevard Auguste-Blanqui, Paris. Ring "Logilab" (googlemap)

    Metro : Glacière

    Contact : http://www.logilab.fr/contact

    Dates : 26/04/2011 to 28/04/2011


  • What's new in CubicWeb 3.11?

    2011/02/18 by Sylvain Thenault

    Unlike recent major versions of CubicWeb, the 3.11 doesn't come with many API changes or refactorings and introduces a fairly small set of new features. But those are important features!

    • 'pyrorql' sources mapping is now stored in the database instead of a python file in the instance's home. This eases the deployment and maintenance of distributed applications.

    • A new 'datafeed' source was introduced, inspired by the soon to be deprecated datafeed cube. It needs polishing but sets the foundation for advanced semantic web applications that import content from other sites using simple HTTP requests.

      A 'datafeed' source is associated with a parser that analyses the imported data and then creates/updates entities accordingly. There is currently a single parser in the core that imports CubicWeb-generated xml and needs to be configured with mapping information that defines how relations are to be followed. It provides a viable alternative to 'pyrorql' sources. Other parsers to import RDF, RSS, etc. should come soon.

      A new facet to filter entities based on the source they came from is now available.

    • The management interface for users, groups, sources and site preferences was simplified so it should be more intuitive to newbies (and others). Most items have been dropped from the user drop-down menu and the simpler views were made available through the '/manage' url.

    • The default 'index' / 'manage' view has been simplified to deprecate features that rely on external folder and card cubes. That's almost the only deprecation warning you'll get in upgrading to 3.11. Just this one won't hurt!

    • The old_calendar module has been dropped in favor of jQuery's fullcalendar powered views. That's great news for applications using calendar features. Since it was added to the existing calendar module, you shouldn't have to change anything to get it working, unless you were using old_calendar, in which case you may have to update a few things. This work was initiated by our Mexican friends from Crealibre.

    As usual, the 3.11 also includes a bunch of other minor enhancements, refactorings and bug fixes. Please download and install CubicWeb 3.11 and report any problem to the mailing-list!

    Enjoy!


  • A simple scalable web server HA architecture suitable for medium sized projects

    2011/02/14 by Florent Cayré

    Having deployed and maintained several public medium sized web sites running CubicWeb when I worked at SecondWeb, I was asked by my friends from Logilab to write a blog post describing how we managed our deployment while working with the customer and the hosting company.

    Non technical (albeit important) considerations

    Customers that want to run such a medium traffic web site either tell you which hosting company they partner with, or ask you to find one, so you have no other choice but to deal with an external hosting structure to manage the servers. I prefer this by the way because:

    1. High Availability (HA) hosting really requires skills and hardware that are neither common nor cheap;
    2. HA hosting requires 24/7/365 availability that SecondWeb could not (and did not even want to) offer.

    It is clearly difficult for all parties (try to put yourself in the shoes of the customer...) to manage a website with 3 partners involved, each with their own goals. From the development leader's point of view, you will notice that the technical people of the hosting company continuously change and you keep seeing the same operational errors even if you provide and keep improving high quality documentation. The software upgrade documentation has to be particularly clear as it greatly influences the overall web site availability. You also have to keep a history of the interventions on the servers yourself and maintain an up-to-date copy of the configuration files.

    The overall architecture proposed here partly benefits from this experience with managed hosting companies, in that we tried to keep it simple.

    Which traffic size ? Why not bigger ?

    The architecture proposed here has been successfully tested with sites delivering web pages to up to 2 million unique visitors per month. It should scale further up depending on your site's database access needs: if you need very fresh data and have a lot of write operations to the database, you will need to distribute database access amongst several servers, which is beyond the scope of this post.

    This is the main limitation of the proposed architecture and the reason why it is not well-suited for bigger traffic.

    Design choices

    Load balancing - Preserve user sessions

    To achieve very high availability for your web site, you must have no single point of failure in the whole architecture, which can be far from reasonable from the costs point of view. However, hosting companies can share costs between their customers and have them benefit from a double network infrastructure all along the way from the Internet to your web servers, themselves hosted on two distant locations. You may then choose an even number of web servers, half of them hosted on each network infrastructure.

    The important thing is that you must preserve user sessions. As of CubicWeb 3.10, DB persistent sessions have not been implemented yet (they will be soon: there is a ticket planned for this functionality), thus you must preserve session cookies by always directing a given user to the same web server. This is usually achieved by configuring the load balancer(s) in IP hash mode, which is faster than balancing on the session cookie because the latter implies reaching the HTTP stack rather than staying at the TCP/IP level.
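
    The principle behind IP hash balancing fits in a few lines: hash the client address and use the result to pick a server, so that a given address always lands on the same machine. The Python 3 sketch below only illustrates the idea. Actual load balancers use their own hash functions, and pick_server as well as the host names are hypothetical.

```python
import hashlib
import ipaddress


def pick_server(client_ip, servers):
    """Map a client IP address to one server, always the same for that IP."""
    packed = ipaddress.ip_address(client_ip).packed
    digest = hashlib.sha256(packed).digest()
    # use the first 8 bytes of the digest as an integer, then reduce it
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["web1", "web2", "web3", "web4"]  # hypothetical host names
first = pick_server("203.0.113.7", servers)
second = pick_server("203.0.113.7", servers)
print(first, first == second)
```

    Note that with such a plain modulo scheme, adding or removing a server remaps most clients, who then lose their session. This is one more reason to keep the number of web servers stable.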

    Squid caching, processor load balancing

    Now if you have multi-processor web servers (which is very likely these days) you will need to use one CubicWeb application instance per processor, or the Python GIL will limit the CPU of your application to a fraction of the available power. This is pretty easy: you just have to duplicate configuration directories from /etc/cubicweb.d, changing instance names and ports. You can use a simple sed-based script to generate these copies automatically and keep them in sync.

    Now that we have one instance per processor, the problem of preserving sessions is back. It can be elegantly solved using Squid, which can of course deliver cached objects (in particular images, more on this later), but also listen on several ports and distribute incoming requests evenly among the CubicWeb instances based on their port of origin. Note that the load balancer must be set up to balance between ports of the web servers, one port for each processor. The Squid configuration file to achieve this looks like:

    http_port 81 defaultsite=www.example.org vhost
    acl portA myport 81
    
    http_port 82 defaultsite=www.example.org vhost
    acl portB myport 82
    
    acl site1 dstdomain www.example.org
    
    cache_peer 127.0.0.1 parent 8081 0 no-query originserver default name=server_1
    cache_peer_access server_1 allow portA site1
    cache_peer_access server_1 deny all
    
    cache_peer 127.0.0.1 parent 8082 0 no-query originserver default name=server_2
    cache_peer_access server_2 allow portB site1
    cache_peer_access server_2 deny all
    

    This is a way to set up Squid to listen on ports 81 and 82 and distribute requests for www.example.org to ports 8081 and 8082 respectively. This way, requests should be evenly balanced between the processors of a bi-processor web server.
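
    With more than two processors, writing these stanzas by hand becomes tedious, and they can be generated instead. Here is a minimal Python 3 sketch that follows the port numbering used above. The squid_config function and its parameters are hypothetical, and the ACLs are named port1, port2, etc. instead of portA and portB to keep the loop simple.

```python
def squid_config(site, instances, first_public_port=81, first_backend_port=8081):
    """Return Squid directives mapping one public port to one local instance."""
    lines = []
    for i in range(instances):
        public_port = first_public_port + i
        lines.append("http_port %d defaultsite=%s vhost" % (public_port, site))
        lines.append("acl port%d myport %d" % (i + 1, public_port))
        lines.append("")
    lines.append("acl site1 dstdomain %s" % site)
    for i in range(instances):
        name = "server_%d" % (i + 1)
        lines.append("")
        lines.append(
            "cache_peer 127.0.0.1 parent %d 0 no-query originserver default name=%s"
            % (first_backend_port + i, name))
        lines.append("cache_peer_access %s allow port%d site1" % (name, i + 1))
        lines.append("cache_peer_access %s deny all" % name)
    return "\n".join(lines) + "\n"


config = squid_config("www.example.org", 2)
print(config)
```

    The output is meant to be reviewed and pasted into the Squid configuration, not to be trusted blindly: check it against your Squid version before deploying it.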

    You can now set up Squid more classically to achieve what it was originally designed for: caching. See the Squid docs for this, particularly the refresh_pattern directive. Note that you do not need to force any HTTP cache standard feature in Squid, as CubicWeb enables you to fine tune caching using simple HTTPCacheManager classes found in cubicweb/web/httpcache.py (at the end of this file, you will also find the default cache manager configuration for the entity and startup views).

    CubicWeb with Apache frontend

    This is controversial but it did not hurt for me: I like to put an Apache frontend between Squid and the Twisted-based CubicWeb application, because the hosting companies are usually pretty good at setting it up, like to use server status for monitoring, mod_deflate for textual content compression, mod_rewrite and other modules to customize, monitor or fine tune the web servers.

    It can however be argued that Apache is a huge piece of software for such a restrictive usage, and its memory footprint would be better used for caching.

    No shared disk

    This is an interesting part that simplifies the overall setup: if you want to save data on disk, it is likely that you also want to keep it in sync between the web servers, or use a highly secure network storage solution.

    As we already have a data store accessible from the web servers, namely the database itself, I often choose to use it even for images. This looks like the nightmare of every sysadmin, but if you make sure the images are not fetched every second from the database, by using fine tuned cache settings, it will not hurt. And this way you still benefit from the flexibility of a database and the easier maintenance of a single data store. We can use CubicWeb cache settings to allow Squid to cache images for 1 hour for example. If you have a very dynamic web site however, you will then need to force a URL change when an image is edited. This can easily be achieved in CubicWeb using a custom edit controller that creates a new image when the data attribute of an Image instance is edited, as illustrated here:

    from cubicweb import typed_eid
    from cubicweb.selectors import yes
    from cubicweb.web.views.editcontroller import EditController
    
    
    class CustomEditController(EditController):
        __select__ = EditController.__select__ & yes()
    
        def handle_updated_image(self, old_eid):
            'modify submitted form to change old_eid into a new entity eid in all key/ values'
            old_eid = unicode(old_eid)
            form = self._cw.form
            new_eid = self._cw.varmaker.next()
            # handle image eid
            del form['__type:%s' % old_eid]
            form['__type:%s' % new_eid] = u'Image'
            # handle eid list
            index = form['eid'].index(old_eid)
            form['eid'] = form['eid'][:index] + [new_eid] + form['eid'][index+1:]
            # handle attribute and relations; iterate over a copy since the
            # form is modified inside the loop
            for (k, v) in list(form.items()):
                if v == old_eid:
                    form[k] = new_eid
                if k.endswith(u':%s' % old_eid):
                    form[k[:-len(old_eid)] + new_eid] = v
                    del form[k]
    
        def _default_publish(self):
            # implement image creation when data image was updated, so that we can use
            # a far expiry date cache on download view
            images = []
            # iterate over a copy: handle_updated_image() modifies the form
            for (k, v) in list(self._cw.form.items()):
                if v != 'Image' or not k.startswith('__type') or k == self._cw.form['__maineid']:
                    continue
                try:
                    eid = typed_eid(k[7:])
                except ValueError:
                    continue
                if self._cw.form.get('data-subject:%s' % eid, None):
                    self.handle_updated_image(eid)
                    images.append(eid)
            super(CustomEditController, self)._default_publish()
            for eid in images:
                self._cw.execute('DELETE Image I WHERE I eid %(eid)s', {'eid': eid})
    

    To add the 1 hour expiry date for image download view, you can use:

    from cubicweb.selectors import yes
    from cubicweb.web import httpcache
    from cubicweb.web.views.idownloadable import DownloadView
    
    class CustomDownloadView(DownloadView):
        __select__ = DownloadView.__select__ & yes()
        http_cache_manager = httpcache.MaxAgeHTTPCacheManager
        cache_max_age = 3600
    

    Database server

    Hosting companies now often have a pretty good knowledge of PostgreSQL, the favorite DB back end for CubicWeb. They usually propose to replicate the database for data safety at a low cost, using PostgreSQL's log shipping feature. Note that the new PostgreSQL 9 versions should make it easier to set up replication modes that could be useful to improve performance and scalability, but there is still a lack of production level experience for the moment. Please share yours if you have some, because this is the main issue to deal with to scale up further.

    Pre-production

    It is worth mentioning that you need a pre-production server hosted by the same company on the same hardware (or virtual machine), because:

    • software upgrades will run smoother if the technical staff of the hosting company has already performed the same upgrade operation once: check that the same person does both within a short timeframe if possible;
    • you will feel better if your migration scripts have successfully run on a fresh copy of the production data: ask for a db copy before a pre-production upgrade; this is much easier to do if you do not have to copy the database dumps remotely;
    • the pre-production server can host its own database server and the replica of the production one.

    Monitoring

    When you experience a web site downtime, it is much too late to take a look at the available monitoring. It is important to prepare the tools you need to diagnose a problem, get used to reading the graphs and have the orders of magnitude of the values and their variations in mind.

    Even the simplest graphs, like CPU usage, need to be correctly interpreted. In a recent setup, I did not realize that only one CPU was used on a dual-processor server, delivering half the power it should... When you cannot access the machine and use top, you only see the information of the monitoring graphs, so you must know how to read them!

    Apart from the classical CPU, CPU load, (detailed) memory usage, and network traffic, ask for PostgreSQL, Squid, and Apache specific graphs (plug-ins for them are easy to find and install for classic monitoring solutions).

    For CubicWeb web sites, it is also worth setting up the following views and using them for automatic alerts:

    • a software / db version consistency monitoring
    • a db pool size monitoring
    • a simple db connection check view
    • a view writing the server host name: it is of no use for automatic alerts, but it shows which server your IP is directed to, which is needed when you cannot reproduce the behaviour the customer is complaining about...

    Here are some classes I use for these tasks. Feel free to reuse and adapt them to your needs:

    from socket import gethostname
    
    from cubicweb.selectors import yes
    from cubicweb.view import View
    
    
    class _MonitoringView(View):
        __abstract__ = True
        __select__ = yes()
        content_type = 'text/plain'
        templatable = False
    
    
    class PoolMonitoringView(_MonitoringView):
        __regid__ = 'monitor_pool'
    
        def call(self):
            repo = self._cw.cnx._repo
            max_pool = self._cw.vreg.config['connections-pool-size']
            percent = ((max_pool - repo._available_pools.qsize()) * 100.0) / max_pool
            self.w(u'%s%%' % percent)
    
    
    class DBMonitoringView(_MonitoringView):
        __regid__ = 'monitor_db'
    
        def call(self):
            try:
                count = self._cw.execute('Any COUNT(X) WHERE X is CWUser')[0][0]
                self.w(u'ServiceOK : %s users in DB' % count)
            except Exception:  # any failure means the DB check did not pass
                self.w(u'ServiceKO')
    
    
    class VersionMonitoringView(_MonitoringView):
        __regid__ = 'monitor_version'
    
        def versions_text(self, versions):
            return u' | '.join(cube + u': ' + u'.'.join(unicode(x) for x in version)
                               for (cube, version) in versions)
    
        def call(self):
            config = self._cw.vreg.config
            vc_config = config.vc_config()
            db_config = [('cubicweb', vc_config.get('cubicweb', '?'))]
            fs_config = [('cubicweb', config.cubicweb_version())]
            for cube in sorted(config.cubes()):
                db_config.append((cube, vc_config.get(cube, '?')))
                try:
                    fs_version = config.cube_version(cube)
                except Exception:  # version of this cube cannot be read
                    fs_version = '?'
                fs_config.append((cube, fs_version))
            db_config = self.versions_text(db_config)
            fs_config = self.versions_text(fs_config)
            if db_config == fs_config:
                self.w(u'ServiceOK : FS config %s == DB config %s' % (fs_config, db_config))
            else:
                self.w(u'ServiceKO : FS config %s != DB config %s' % (fs_config, db_config))
    
    
    class HostnameMonitoringView(_MonitoringView):
        __regid__ = 'monitor_hostname'
    
        def call(self):
            self.w(unicode(gethostname()))
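
    On the alerting side, the text written by these views is easy to classify. The following standalone sketch (Python 3) shows one possible way to do it; check_response and its 80% threshold are hypothetical choices made for this example, and fetching the view over HTTP is left to your monitoring tool. The monitor_hostname view is not meant to be fed to it.

```python
def pool_usage_percent(max_pool, available):
    """Same arithmetic as PoolMonitoringView: share of connections in use."""
    return ((max_pool - available) * 100.0) / max_pool


def check_response(body, pool_threshold=80.0):
    """Classify the text written by a monitoring view as 'ok' or 'alert'.

    The threshold is an arbitrary example value, to be tuned per site.
    """
    body = body.strip()
    if body.endswith('%'):
        # monitor_pool output, e.g. '75.0%'
        return 'alert' if float(body[:-1]) >= pool_threshold else 'ok'
    # monitor_db and monitor_version start with ServiceOK or ServiceKO
    return 'ok' if body.startswith('ServiceOK') else 'alert'


print(pool_usage_percent(4, 1))
print(check_response('ServiceOK : 12 users in DB'))
print(check_response('ServiceKO'))
```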
    

    Sketch of the architecture and conclusion

    Here is a sketch of the proposed architecture. Please comment on it and share your experience on the topic, I would be happy to learn your tips and tricks.

    I would conclude with an important remark regarding performance: a good scalable architecture is of great help to run a busy web site smoothly, however the performance boost you get by optimizing your software is usually worth it and must be seriously considered before any hardware upgrade, even if it seems costly at first glance.

    /file/1521968?vid=download

  • Building my photos web site with CubicWeb part V: let's make it even more user friendly

    2011/01/24 by Sylvain Thenault

    We'll now see how to benefit from features introduced in 3.9 and 3.10 releases of CubicWeb

    Step 1: tired of the default look?

    OK... Now our site has its most desired features. But... I would like to make it look somewhat like my website. It is not www.cubicweb.org after all. Let's tackle this first!

    The first thing we can do is to change the logo. There are various ways to achieve this. The easiest way is to put a logo.png file into the cube's data directory. As data files are looked up according to cubes order (CubicWeb resources coming last), that file will be selected instead of CubicWeb's one.

    Note

    As the location of static resources is cached, you'll have to restart your instance for this to be taken into account.
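
    The lookup order and the caching effect can be pictured with a small standalone sketch (Python 3). The find_resource function and the temporary directories are hypothetical stand-ins written for this illustration; CubicWeb's actual lookup code is not shown here.

```python
import os
import tempfile


def find_resource(name, search_path, cache):
    """Return the first directory of `search_path` containing `name`.

    The answer is remembered in `cache`, which is why a newly added file
    is only noticed once the cache is rebuilt (here: a restart).
    """
    if name not in cache:
        cache[name] = None
        for directory in search_path:
            if os.path.exists(os.path.join(directory, name)):
                cache[name] = directory
                break
    return cache[name]


# two throwaway directories standing for a cube's and CubicWeb's data dirs
cube_data = tempfile.mkdtemp()
cubicweb_data = tempfile.mkdtemp()
open(os.path.join(cubicweb_data, 'logo.png'), 'w').close()

search_path = [cube_data, cubicweb_data]  # cubes first, CubicWeb last
cache = {}
before = find_resource('logo.png', search_path, cache)

# the cube now ships its own logo
open(os.path.join(cube_data, 'logo.png'), 'w').close()
stale = find_resource('logo.png', search_path, cache)  # still the cached answer
fresh = find_resource('logo.png', search_path, {})     # empty cache, as after a restart
```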

    Though there are some cases where you don't want to use a logo.png file. For instance if it's a JPEG file. You can still change the logo by defining in the cube's uiprops.py file:

    LOGO = data('logo.jpg')
    

    The uiprops machinery has been introduced in CubicWeb 3.9. It is used to define some static file resources, such as the logo, default Javascript / CSS files, as well as CSS properties (we'll see that later).

    Note

    This file is imported specifically by CubicWeb, with a predefined name space, containing for instance the data function, telling the file is somewhere in a cube or CubicWeb's data directory.

    One side effect of this is that it can't be imported as a regular python module.

    The nice thing is that in debug mode, changes to a uiprops.py file are detected and the file is then automatically reloaded.

    Now, as it's a photos web-site, I would like to have a photo of mine as background... After some trials I won't detail here, I've found a working recipe explained here. All I've to do is to override some stuff of the default CubicWeb user interface to apply it as explained.

    The first thing to do is to get the <img/> tag as first element after the <body> tag. If you know a way to avoid this by simply specifying the image in the CSS, tell me! The easiest way to do so is to override the HTMLPageHeader view, since that's the one that is directly called once the <body> has been written. How did I find this? By looking in the cubicweb.web.views.basetemplates module, since I know that global page layouts sit there. I could also have grepped for the "body" tag in cubicweb.web.views... Finding this was the hardest part. Now all I need is to customize it to write that img tag, as below:

    from cubicweb.web.views import basetemplates
    
    
    class HTMLPageHeader(basetemplates.HTMLPageHeader):
        # override this since it's the easier way to have our bg image
        # as the first element following <body>
        def call(self, **kwargs):
            self.w(u'<img id="bg-image" src="%sbackground.jpg" alt="background image"/>'
                   % self._cw.datadir_url)
            super(HTMLPageHeader, self).call(**kwargs)
    
    
    def registration_callback(vreg):
        vreg.register_all(globals().values(), __name__, (HTMLPageHeader,))
        vreg.register_and_replace(HTMLPageHeader, basetemplates.HTMLPageHeader)
    

    As you may have guessed, my background image is in a background.jpg file in the cube's data directory, but there are still some things to explain to newcomers here:

    • The call method is the main access point of the view. It's called by the view's render method. It is not the only access point for a view, but this will be detailed later.
    • Calling self.w writes something to the output stream. Except for binary views (which do not generate text), it must be passed a Unicode string.
    • The proper way to get a file in the data directory is to use the datadir_url attribute of the incoming request (i.e. self._cw).
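
    The relationship between render, call and self.w can be pictured with a toy model (Python 3). The View class below is a simplified stand-in written for this illustration, not CubicWeb's View, and the header markup is invented.

```python
import io


class View(object):
    """Toy stand-in for a view: render() sets up the stream, then calls call()."""

    def render(self, **kwargs):
        stream = io.StringIO()
        self.w = stream.write  # self.w appends text to the output stream
        self.call(**kwargs)
        return stream.getvalue()

    def call(self, **kwargs):
        raise NotImplementedError


class PageHeader(View):
    def call(self, **kwargs):
        self.w(u'<table id="header"></table>')


class BackgroundPageHeader(PageHeader):
    def call(self, **kwargs):
        # write our element first, then let the parent write its own output
        self.w(u'<img id="bg-image" src="background.jpg"/>')
        super(BackgroundPageHeader, self).call(**kwargs)


html = BackgroundPageHeader().render()
print(html)
```

    The image tag comes out before the regular header because the override writes to the stream before delegating to its parent class.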

    I won't explain again the registration_callback stuff, you should understand it now! If not, go back to previous posts in the series :)

    Fine. Now all I've to do is to add a bit of CSS to get it to behave nicely (which is not the case at all for now). I'll put all this in a cubes.sytweb.css file, stored as usual in our data directory:

    /* fixed full screen background image
     * as explained on http://webdesign.about.com/od/css3/f/blfaqbgsize.htm
     *
     * syt update: set z-index=0 on the img instead of z-index=1 on div#page & co to
     * avoid pb with the user actions menu
     */
    img#bg-image {
        position: fixed;
        top: 0;
        left: 0;
        width: 100%;
        height: 100%;
        z-index: 0;
    }
    
    div#page, table#header, div#footer {
        background: transparent;
        position: relative;
    }
    
    /* add some space around the logo
     */
    img#logo {
        padding: 5px 15px 0px 15px;
    }
    
    /* more dark font for metadata to have a chance to see them with the background
     *  image
     */
    div.metadata {
        color: black;
    }
    

    You can see here stuff explained in the cited page, with only a slight modification explained in the comments, plus some additional rules to make things somewhat cleaner:

    • a bit of padding around the logo
    • darker metadata which appears by default below the content (the white frame in the page)

    To get this CSS file used everywhere in the site, I have to modify the uiprops.py file introduced above:

    STYLESHEETS = sheet['STYLESHEETS'] + [data('cubes.sytweb.css')]
    

    Note

    sheet is another predefined variable containing values defined by already processed uiprops.py files, notably CubicWeb's one.

    Here we simply want our CSS in addition to CubicWeb's base CSS files, so we redefine the STYLESHEETS variable as the existing CSS list (accessed through the sheet variable) with ours added. I could also have done:

    sheet['STYLESHEETS'].append(data('cubes.sytweb.css'))
    

    But this is less interesting since we don't see the overriding mechanism...
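
    To picture how a file can be executed with a predefined name space containing data and sheet, and how a later file overrides values from an earlier one, here is a rough standalone sketch (Python 3). The process_uiprops function, the path returned by data and the content attributed to CubicWeb's own file are all assumptions made for this illustration; the framework's real machinery is more involved.

```python
def data(fname):
    # hypothetical: the real function locates the file in a data directory
    return 'data/' + fname


def process_uiprops(source, sheet):
    """Execute one uiprops source and merge what it defines into `sheet`."""
    namespace = {'data': data, 'sheet': sheet}
    exec(source, namespace)
    for name, value in namespace.items():
        # keep only what the file defined, not the injected names
        if name not in ('data', 'sheet') and not name.startswith('_'):
            sheet[name] = value
    return sheet


# invented content standing for CubicWeb's own uiprops file
CUBICWEB_UIPROPS = (
    "STYLESHEETS = [data('cubicweb.css')]\n"
    "LOGO = data('logo.png')\n"
)
CUBE_UIPROPS = (
    "STYLESHEETS = sheet['STYLESHEETS'] + [data('cubes.sytweb.css')]\n"
    "LOGO = data('logo.jpg')\n"
)

sheet = {}
process_uiprops(CUBICWEB_UIPROPS, sheet)  # processed first
process_uiprops(CUBE_UIPROPS, sheet)      # the cube's file then overrides
print(sheet['STYLESHEETS'])
print(sheet['LOGO'])
```

    Since data and sheet only exist in the injected name space, importing such a file as a regular module would fail with a NameError.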

    At this point, the site should start looking good, the background image being resized to fit the screen.

    http://www.cubicweb.org/file/1440508?vid=download

    The final touch: let's customize CubicWeb's CSS to get less orange... By simply adding

    contextualBoxTitleBg = incontextBoxTitleBg = '#AAAAAA'
    

    and reloading the page we've just seen, we now have a nice greyed box instead of the orange one:

    http://www.cubicweb.org/file/1440510?vid=download

    This is because CubicWeb's CSS includes some variables which are expanded with values defined in the uiprops file. In our case we controlled the CSS background property of boxes through the contextualBoxTitleBg and incontextBoxTitleBg variables.
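
    The expansion step can be sketched in a few lines of standalone Python 3. The %(name)s placeholder syntax, the CSS selectors and the default orange value below are assumptions made for this illustration; only the two variable names come from the example above.

```python
# hypothetical CSS source with placeholders for uiprops variables
CSS_TEMPLATE = """\
div.contextualBoxTitle { background: %(contextualBoxTitleBg)s; }
div.incontextBoxTitle { background: %(incontextBoxTitleBg)s; }
"""

# invented defaults, standing for the framework's orange
defaults = {'contextualBoxTitleBg': '#FF9900', 'incontextBoxTitleBg': '#FF9900'}
# values set in the cube's uiprops.py
overrides = {'contextualBoxTitleBg': '#AAAAAA', 'incontextBoxTitleBg': '#AAAAAA'}

props = dict(defaults)
props.update(overrides)  # the cube's values win over the defaults
css = CSS_TEMPLATE % props
print(css)
```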

    Step 2: configuring boxes

    Boxes present to the user some ways to use the application. Let's first do a few user interface tweaks in our views.py file:

    from cubicweb.selectors import none_rset
    from cubicweb.web.views import bookmark
    from cubes.zone import views as zone
    from cubes.tag import views as tag
    
    # change bookmarks box selector so it's only displayed on startup views
    bookmark.BookmarksBox.__select__ = bookmark.BookmarksBox.__select__ & none_rset()
    # move zone box to the left instead of in the context frame and tweak its order
    zone.ZoneBox.context = 'left'
    zone.ZoneBox.order = 100
    # move tags box to the left instead of in the context frame and tweak its order
    tag.TagsBox.context = 'left'
    tag.TagsBox.order = 102
    # hide similarity box, not interested
    tag.SimilarityBox.visible = False
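
    The effect of the context, order and visible attributes can be pictured with a toy model (Python 3). The Box class, the boxes_for function and the 'incontext' context name are hypothetical, written for this illustration only; the sketch assumes boxes of a given context are displayed by increasing order.

```python
class Box(object):
    """Toy box carrying the three attributes tweaked above."""
    visible = True

    def __init__(self, regid, context, order):
        self.regid, self.context, self.order = regid, context, order


def boxes_for(context, boxes):
    """Visible boxes of one context, lowest `order` first."""
    selected = [b for b in boxes if b.visible and b.context == context]
    return sorted(selected, key=lambda b: b.order)


boxes = [
    Box('tags', 'left', 102),
    Box('zone', 'left', 100),
    Box('similarity', 'incontext', 20),
]
boxes[2].visible = False  # like tag.SimilarityBox.visible = False

left = [b.regid for b in boxes_for('left', boxes)]
print(left)
```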
    

    The idea is to move all boxes to the left column, so we get more space for the photos. Now, serious things: I want a box similar to the tags box but to handle the Person displayed_on File relation. We can do this simply by adding an AjaxEditRelationCtxComponent subclass to our views, as below:

    from logilab.common.decorators import monkeypatch
    from cubicweb import ValidationError
    from cubicweb.web import uicfg, component
    from cubicweb.web.views import basecontrollers
    
    # hide displayed_on relation using uicfg since it will be displayed by the box below
    uicfg.primaryview_section.tag_object_of(('*', 'displayed_on', '*'), 'hidden')
    
    class PersonBox(component.AjaxEditRelationCtxComponent):
        __regid__ = 'sytweb.displayed-on-box'
        # box position
        order = 101
        context = 'left'
        # define relation to be handled
        rtype = 'displayed_on'
        role = 'object'
        target_etype = 'Person'
        # messages
        added_msg = _('person has been added')
        removed_msg = _('person has been removed')
        # bind to js_* methods of the json controller
        fname_vocabulary = 'unrelated_persons'
        fname_validate = 'link_to_person'
        fname_remove = 'unlink_person'
    
    
    @monkeypatch(basecontrollers.JSonController)
    @basecontrollers.jsonize
    def js_unrelated_persons(self, eid):
        """return the names of persons not yet displayed on an entity"""
        rql = "Any F + ' ' + S WHERE P surname S, P firstname F, X eid %(x)s, NOT P displayed_on X"
        return [name for (name,) in self._cw.execute(rql, {'x' : eid})]
    
    
    @monkeypatch(basecontrollers.JSonController)
    def js_link_to_person(self, eid, people):
        req = self._cw
        for name in people:
            name = name.strip().title()
            if not name:
                continue
            try:
                firstname, surname = name.split(None, 1)
            except ValueError:
                raise ValidationError(eid, {('displayed_on', 'object'): 'provide <first name> <surname>'})
            rset = req.execute('Person P WHERE '
                               'P firstname %(firstname)s, P surname %(surname)s',
                               locals())
            if rset:
                person = rset.get_entity(0, 0)
            else:
                person = req.create_entity('Person', firstname=firstname,
                                                surname=surname)
            req.execute('SET P displayed_on X WHERE '
                        'P eid %(p)s, X eid %(x)s, NOT P displayed_on X',
                        {'p': person.eid, 'x' : eid})
    
    @monkeypatch(basecontrollers.JSonController)
    def js_unlink_person(self, eid, personeid):
        self._cw.execute('DELETE P displayed_on X WHERE P eid %(p)s, X eid %(x)s',
                         {'p': personeid, 'x': eid})
    

    You basically subclass to configure with some class attributes. The fname_* attributes give the names of methods that should be defined on the json controller to make the AJAX part of the widget work: one to get the vocabulary, one to add a relation and another to delete a relation. These methods must start with a js_ prefix and are added to the controller using the @monkeypatch decorator. In my case, the most complicated method is the one which adds a relation, since it tries to see if the person already exists, and otherwise automatically creates it, assuming the user entered "firstname surname".
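
    For readers wondering what happens behind the decorator and the js_ prefix, here is a toy model (Python 3). The monkeypatch function below is a minimal re-implementation written for this illustration, not the one from logilab.common, and the controller with its publish method is a hypothetical stand-in for the real dispatching code.

```python
def monkeypatch(cls):
    """Minimal decorator adding the decorated function to `cls` as a method."""
    def decorator(func):
        setattr(cls, func.__name__, func)
        return func
    return decorator


class JSonController(object):
    """Toy controller: dispatches a function name to its js_ method."""

    def publish(self, fname, *args):
        return getattr(self, 'js_' + fname)(*args)


@monkeypatch(JSonController)
def js_split_name(self, name):
    # same parsing as js_link_to_person: "firstname surname"
    firstname, surname = name.strip().title().split(None, 1)
    return firstname, surname


result = JSonController().publish('split_name', ' ada lovelace ')
print(result)
```

    A single word such as 'ada' makes the unpacking fail with a ValueError, which is the case js_link_to_person turns into a ValidationError.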

    Let's see what it looks like on a file primary view:

    http://www.cubicweb.org/file/1440509?vid=download

    Great, it's now as easy for me to link my pictures to people as to tag them. Also, visitors get a consistent display of these two pieces of information.

    Note

    The ui component system has been refactored in CubicWeb 3.10, which also introduced the AjaxEditRelationCtxComponent class.

    Step 3: configuring facets

    The last feature we'll add today is facet configuration. If you access the '/file' url, you'll see a set of 'facets' appearing in the left column. Facets provide an intuitive way to build a query incrementally, by proposing to the user various ways to restrict the result set. For instance CubicWeb proposes a facet to restrict based on who created an entity; the tag cube proposes a facet to restrict based on tags; the zone cube a facet to restrict based on geographical location, and so on. In the same spirit, I want to propose a facet to restrict based on the people displayed on the picture. To do so, there are various classes in the cubicweb.web.facet module which simply have to be configured using class attributes as we've done for the box. In our case, we'll define a subclass of RelationFacet.

    Note

    Since that's ui stuff, we'll continue to add code below to our views.py file. Though we begin to have a lot of various code there, so it may be a good time to split our views module into submodules of a views package. In our case of a simple application (glue) cube, we could start using for instance the layout below:

    views/__init__.py   # uicfg configuration, facets
    views/layout.py     # header/footer/background stuff
    views/components.py # boxes, adapters
    views/pages.py      # index view, 404 view
    

    from cubicweb.web import facet
    
    class DisplayedOnFacet(facet.RelationFacet):
        __regid__ = 'displayed_on-facet'
        # relation to be displayed
        rtype = 'displayed_on'
        role = 'object'
        # view to use to display persons
        label_vid = 'combobox'
    

    Let's say we also want to filter according to the visibility attribute. This is even simpler as we just have to derive from the AttributeFacet class:

    class VisibilityFacet(facet.AttributeFacet):
        __regid__ = 'visibility-facet'
        rtype = 'visibility'
    

    Now if I search for some pictures on my site, I get the following facets available:

    http://www.cubicweb.org/file/1440517?vid=download

    Note

    By default a facet must be applicable to every entity in the result set and provide at least two elements of vocabulary to be displayed (for instance you won't see the created_by facet if the same user has created all entities). This may explain why you don't see yours...
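
    That display rule can be expressed as a small standalone sketch (Python 3). Entities are modelled as plain dictionaries, and the facet_vocabulary and is_displayed functions as well as the sample values are hypothetical, written for this illustration rather than taken from cubicweb.web.facet.

```python
def facet_vocabulary(entities, attribute):
    """Distinct values of `attribute`, or None if some entity lacks it."""
    values = set()
    for entity in entities:
        if attribute not in entity:
            return None  # facet not applicable to every entity
        values.add(entity[attribute])
    return sorted(values)


def is_displayed(entities, attribute):
    """A facet is shown only if it offers at least two choices."""
    vocabulary = facet_vocabulary(entities, attribute)
    return vocabulary is not None and len(vocabulary) >= 2


photos = [
    {'created_by': 'syt', 'visibility': 'public'},
    {'created_by': 'syt', 'visibility': 'restricted'},
]
print(is_displayed(photos, 'visibility'))
print(is_displayed(photos, 'created_by'))
```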

    Conclusion

    We started to see the power behind the infrastructure provided by the framework, both on the pure ui (CSS, Javascript) side and on the Python side (high level generic classes for components, including boxes and facets). We now have, with a few lines of code, a full-featured web site with a personalized look.

    Of course we'll probably want more as time goes on, but we can now concentrate on making good pictures, publishing albums and sharing them with friends...


  • CubicWeb sprint in Paris on January 19/20/21 2011

    2010/12/03 by Sylvain Thenault
    http://farm1.static.flickr.com/183/419945378_4ead41a76d_m.jpg

    Almost everything is in the title: we'll hold a CubicWeb sprint in our Paris office after the first French Semantic Web conference, on the 19th, 20th and 21st of January 2011.

    The main topic will be to enhance the newcomer's experience of installing and using CubicWeb.

    If you wish to come, you're welcome, that's a great way to meet us, learn the framework and share thoughts about it. Simply contact us so we can check there is still some room available.

    photo by Sebastian Mary under creative commons licence.


  • HTML5 features presented at Paris Web 2010 by Paul Rouget

    2010/10/19 by Arthur Lutz

    While at Paris Web 2010 we were all impressed by the presentation and demos by Paul Rouget on HTML5 (tech evangelist must be a hard job!). Here is my take and a few URLs on the things that were presented.

    http://hacks.mozilla.org/wp-content/themes/Hacks2010/img/mozilla.png
    • Websockets with persistent connections between the server and the browser. That way you can avoid polling for information every 5 seconds: the server can tell the web page that new information is available. The immediate uses we have for this are:
      • realtime feed display
      • jabber web chat rooms
      • in cubicweb's forge : new comment indication on a ticket
      • in cubicweb in general : notification that the edited element has been opened by another user (instead of a lock mechanism)
      • real time collaborative editing (etherpad style functionality)
    • File upload demo : http://demos.hacks.mozilla.org/openweb/uploadingFiles/
    • File EXIF extraction, client side resize or geolocation http://demos.hacks.mozilla.org/openweb/FileAPI/ . That could be very cool for things such as resizing an image before it is sent to the server (you know, for your mother who doesn't know how to resize that 2 MB photo before sending it to the site). Reference : https://developer.mozilla.org/en/Using_files_from_web_applications
    • Using File IO, you can do some heavy Drag'n'drop from your computer to your browser directly in the browser (yes, you can get rid of that nasty java applet). Apparently Google implemented in Chromium a non-standard drag'n'drop the other way around : from the web app to your desktop, which could be cool as well.
    http://farm5.static.flickr.com/4147/5085028912_173337f0ba.jpg
    • XHR - XMLHttpRequest. Usually this type of request is not possible cross-domain. Now it will be (with an authorization mechanism). That way, you will be able to post to and control websites from the page in your browser.
    • Audio Data API : you can now access & modify audio files directly in your browser (before uploading them server side). This makes me think of the first time I realized people were implementing traditionally "heavy" applications (photo editing, music editing, even movie editing) as web applications. I was (and still am) very surprised and skeptical, but this kind of evolution makes me believe that there can be a day when you don't even need to send massive files to the server to edit them.
    http://farm1.static.flickr.com/191/513636061_98d07f7966_t.jpg

    Admittedly, you probably need to see the thrilling presentation and demos to be tempted to go and dip into these technologies. Reading the documentation will probably not encourage you to go and code some cool new features.

    One of the things that the audience commented about at the end of the presentation is that there was still a huge lack of "authoring tools" for HTML5. For some coders that never leave vim or emacs, this is heresy, but we have to admit that the adoption of flash and silverlight (apparently) is very much driven by simple click'n'program tools.

    http://www.mozilla.org/images/minefield_168.png

    During the presentation, I used a Chrome 6 that I had lying around on my Ubuntu, but by the end of the presentation I had installed Firefox4 using the mozilla PPA:

    sudo add-apt-repository ppa:ubuntu-mozilla-daily/ppa
    sudo apt-get update
    sudo apt-get -uVf install firefox-4.0
    

    The PPA version keeps config files separate so you can easily switch between your "standard" Firefox3 profile and the cutting edge Firefox4 (obviously the big downside is not having all your cool extensions).

    The only thing missing from the presentation was the code... a request I hope Paul will grant to the community (a bunch of tweets about that followed the presentation).


  • What's new in CubicWeb 3.10?

    2010/10/18 by Sylvain Thenault

    The 3.10 development started during August, with two important patches: one on the repository / entity API, another one on the boxes / content navigation components unification (more on this later). Then it somewhat came to a halt, as more work was done on other projects and to stabilize the 3.9 branch. We finally got back on it during September, adding several other major changes or enhancements.

    • Cleanup of the repository side entity API, i.e. the API you may use when writing hooks. Besides simple namespace cleanup (a few renamings), the API has been modified to move out attributes being edited from the read cache. So now:

      • entities do not inherit from dict anymore; access to the dict protocol on an entity will raise deprecation warnings
      • the attributes cache is now a cw_attr_cache dictionary on the entity
      • edited attributes are held by a special object stored in the cw_edited attribute, which is only available in hooks for a modified entity (i.e. '[before|after]_[add|update]_entity'); you should use the dict protocol on that object to get modified attributes or to modify what is edited (in 'before' hooks only, and this is now enforced). This deprecates the former edited_attributes attribute.
    • Unification of 'boxes' / 'contentnavigation' registries and base classes, into "contextual components" stored in the 'ctxcomponents' registry. This implied the introduction of "layout" objects which are appobjects responsible for displaying the components according to the context they are displayed in.

      This separation of content / layout and some css cleanups allow us to move former boxes and content components into each other's place in the user interface: for instance, go to your preferences pages and try to move the search box. You now have many more different locations available. Not every component can go everywhere, though, so forthcoming releases should tweak this to avoid proposing dumb choices. But the hot stuff is there!

      Also, a cache has been set on the registry to avoid recomputing possible components for each context (place in the ui).

    • Upgraded jQuery and jQuery UI respectively to version 1.4.2 and 1.8. Removed jquery.autocomplete.js since jQuery UI provides its own autocomplete plugin. A cwautocomplete plugin was added in order to keep widgets as backward compatible as possible. If you used a custom autocomplete feature, you should take a look at this guide.

    • The RelationFacet base class now automatically proposes to search for entities without the relation if this is allowed by the schema and if there are some in current results. Example: search for tickets which are not planned in a version.

    • Data sources have been modeled as the CubicWeb entity type CWSource. The 'sources' file is still there but will now only contain the definition of the system source, as well as the default manager account login and password. This implied changes in instance initialization commands, the introduction of a new 'add-source' command to cubicweb-ctl, as well as a change in the repository startup. Also, on a multi-sources instance, we can now search using a facet on the cw_source relation (a new mandatory metadata relation on every entity) to filter according to the data source entities are coming from.

    • Although introduced during 3.9 releases, it's worth mentioning the new support for multi-columns unicity constraints through yams's __unique_together__ entity type attribute, allowing for unicity constraints enforced by the underlying database instead of CubicWeb hooks. This is limited and doesn't work in every configuration, but is a must have when running several distributed CubicWeb instances of the same application (hence database).

    Also as usual, the 3.10 includes a bunch of other minor enhancements, refactorings and bug fixes. Every introduced change should be backward compatible, except probably some minor ui details due to the css box simplification. That's it.

    So please download and install CubicWeb 3.10 and report any problem to us on the mailing-list!

    Enjoy!


  • CubicWeb presentation at the JDLL (Lyon)

    2010/10/07 by Arthur Lutz

    For the "Journées Du Logiciel Libre (JDLL)" in Lyon, which will take place on the 14th, 15th and 16th of October 2010, we will be presenting the semantic side of CubicWeb on Friday 15th. There will be a talk and a tutorial. Details can be found here and there.

    If you're around, come and see us!

    http://www.jdll.org/sites/default/files/banniere.png

  • Debugging a memory leak in a cube

    2010/09/24

    We recently discovered that the cubicweb.org site (the one you are probably visiting right now) was suffering from a memory leak. The munin graphs showed a memory consumption steadily increasing soon after the instance was started, and this would only stop when all the memory on the host was exhausted. This was clearly caused by a memory leak somewhere, either in CubicWeb itself or in a cube used by the instance.

    Fig. 1: Munin graphs showing the memory leak in cubicweb.org

    Notice the associated service downtimes, and the stabilized memory consumption on Sept 23, after the leak was fixed.

    Since Python has a garbage collector, either the leak was occurring in a C extension, or it was caused by some objects which were not garbage collectable. A common cause for the latter, as explained in the gc module documentation, is objects with a __del__ method which are part of a cycle.

    We used the "gc" view, which is an administrative view in CubicWeb, reachable by appending "?vid=gc" at the end of the url of the root of your instance, if you are a member of the managers group. This view uses the gc module from the python standard library to see which objects are not garbage collected.

    This view showed thousands of instances of mercurial.url.httphandler. This class indeed has a __del__ method and instances have a cycle with urllib2.OpenerDirector. Mercurial is used by the vcsfile cube which regularly polls remote repositories over HTTP, which causes httphandler to be instantiated (and a reference to be leaked). This problem had gone undetected in mercurial because most of the time, processes using mercurial over http are short-lived and the leaked memory is quickly reclaimed by the operating system. Discussion ensued on the #mercurial IRC channel with the developers and a patch was submitted which fixes the leak. In order to avoid the problem with versions of mercurial up to the current one, a new version of vcsfile including a monkey patch for mercurial was released and deployed on cubicweb.org!
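
    The kind of counting that makes such a leak visible can be reproduced with the standard gc module alone. The sketch below (Python 3) is not the code of the "gc" view; LeakyHandler is a hypothetical class keeping a reference cycle, and the instances are kept alive on purpose in a list. Note that since Python 3.4, cycles containing objects with a __del__ method are collected, so this only demonstrates the counting technique, not the original leak.

```python
import gc
from collections import Counter


def count_instances():
    """Count the objects tracked by the garbage collector, by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


class LeakyHandler(object):
    """Stand-in for a leaked class; `peer` creates a reference cycle."""

    def __init__(self):
        self.peer = self


leaked = [LeakyHandler() for _ in range(1000)]  # kept alive on purpose
gc.collect()
counts = count_instances()
print(counts['LeakyHandler'])
```

    Sorting such a counter with most_common() and comparing two snapshots taken a few minutes apart is usually enough to spot a type whose population only grows.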


  • We'll be attending the Paris Web 2010 conference

    2010/09/09 by Arthur Lutz

    A few of us from Logilab will be attending the Paris Web 2010 conference. This is the fifth edition of the conference and we're looking forward to some of the talks. It's a bit of a shame that the web site of the conference does not offer attendees the possibility of building their own schedule by virtually "registering" for certain talks. This is one of the cool things about cubicweb-conference, a cubicweb application that helps build websites for organizing and hosting conferences.

    http://www.cubicweb.org/file/1251255?vid=download

    We're glad there will be talks about the semantic web, accessibility, html5, css, javascript, etc...

    I recently discovered Lanyrd, which claims to be a social conference directory: with your Twitter account you can track a conference or show that you will be attending it. Twitter and identi.ca can be a good way of following a conference that you cannot physically attend. Check out Paris Web 2010 on Lanyrd : http://lanyrd.com/2010/parisweb/


  • Summer CubicWeb/Narval Sprint - Final report

    2010/08/18 by Sylvain Thenault

    For that last sprint day, each team made some nice achievements:

    • Steph & Alain worked on the mv/cp actions implementation to make them work properly and support globs. Last but not least, with a full set of tests.
    • Alex & Charles got back what we call apycot 'full' tests, i.e. running tests with coverage enabled, checking that code coverage is greater than a given threshold, but also running pylint and checking that its global evaluation is at least 7 (configurable, of course).
    • Katia & Aurélien provided a sharp implementation of recipe checking, so that we know we don't launch a badly constructed recipe, and so that the user is nicely informed of the errors his recipe suffers from.
    • Julien managed to set up a recipe handling everything from Debian package construction to Debian repository publication, going through lintian on the way.
    • Pierre-Yves helped other teams to solve the narval related bugs they encountered, and finished by writing a thread-safe implementation of apycot's writer so we can run several checkers simultaneously.
    • Celso continued working on a proof of concept blue-theme cube, wondering how to make CubicWeb look nicer and be easily customisable in future versions.
    • Sylvain helped here and there and integrated patches...

    So in the end we didn't get as far as the demo. But we now have everything needed to set it up, so I have good hope that we will have a beta version of our brand new production chain up and running before the end of August!

    Thanks to everyone for all this good work, and for this time spent all together!


  • CubicWeb gets press coverage at SemanticWeb.com

    2010/08/15 by Nicolas Chauvat

    Following the presentation of CubicWeb at OSCON 2010 in July, the editor of SemanticWeb.com wrote an article describing the CubicWeb framework. Read the article and ask your questions on the mailing list!


  • Summer CubicWeb/Narval Sprint - Day 4

    2010/08/13 by Pierre-Yves David

    On this fourth day of our Summer Sprint, important progress has been made.

    • Stéphanie and Alain cleaned up the Apycot bot sources, removing deprecated code, and rewrote part of the test suite to follow the new way of launching apycot. They cleaned up the handling of VCS sources for tested projects, taking advantage of the new mercurial cache for vcsfile implemented by Katia and Aurélien last Tuesday. This feature keeps a local clone of the remote repository and allows much faster checkouts during test runs.
    • Julien made significant progress in the writing of the Debian recipe. A recipe can now successfully build Debian packages of a project and validate them with lintian and lgp. He later paired with Pierre-Yves and they improved the annotation of Apycot's Narval variables to enhance input validation in Apycot's Narval recipes. For example, the action building a Debian package will explicitly refuse to run on a project not yet checked out.
    • Aurelien first paired with Pierre-Yves to improve some views and the consistency of the database schema, then he worked on a dashboard displaying various indicators useful to the version publishing process.
    • Pierre-Yves spent some time improving the ability of Narval to recover on errors and to display meaningful logs about them.
    • Alexandre and Charles finished the re-implementation of the full python recipe. They used options at the Narval level to run the test suite with coverage enabled and re-enabled the coverage checker to process the result, discovering some problems in Narval's engine on the way...
    • Celso finished the Spanish translation of CubicWeb's core and started to work on a new CSS theme.
    • Sylvain helped several groups along the day and reviewed patches from them.

  • Summer CubicWeb/Narval Sprint - Day 3

    2010/08/13

    CubicWeb/Narval Sprint is going on !

    The third day of our sprint focused on the following points:

    • Pierre-Yves worked to prevent duplicate test executions (e.g. running the same test several times with the same version configuration),
    • Celso has finished the Spanish translation of CubicWeb. He's now working on the translation of various cubes,
    • Stéphanie and Alain spent some time on the narval bot view. They also modified ProjectEnvironement's attributes in order to use similar information available on the vcsfile repository, hence simplifying the configuration (more to do on this!),
    • Julien worked on the debian package recipe,
    • Katia and Aurélien worked on recipe security (using CWPermission),
    • Alexandre and Charles produced a first template of a full test recipe using pyunit and pycoverage,
    • Finally, our captain, Sylvain, is at the helm !

    We'll hopefully be able to present a functional demo at the end of the week.

    Narval/Cubicweb left off !


  • Summer CubicWeb/Narval Sprint - Day 2

    2010/08/11 by Katia Saurfelt

    During the second day of our Summer CubicWeb/Narval Sprint, several tasks started on the first day were completed and new tasks started:

    • Charles, Alexandre and Julien finished writing the "copy" and "move" Narval actions, and then started transforming existing apycot checkers into Narval actions.
    • Pierre-Yves managed to improve Narval reports with more explicit and relevant content.
    • Stéphanie and Alain finished the bot status view as well as the recipe graph view.
    • Katia and Aurélien finished writing the new mercurial cache solution for vcsfile and started improving the security of Narval recipes (i.e. who can start which recipe).
    • Celso kept on his life-long work of translating CubicWeb to Spanish.
    • Sylvain wrote some Narval views, improved Narval execution logs handling and kept on reviewing patches and helping various people...

  • Summer CubicWeb/Narval Sprint - Day 1

    2010/08/10 by Sylvain Thenault

    We started this first day by several presentations by Sylvain about Logilab's current development process workflow, and compared it to what it should be after the sprint. Sylvain also introduced Narval.

    We then set up a dev environment on everyone's computer: a working forge with a local Narval agent that can be used for tests during the week.

    Regarding more concrete tasks:

    • Charles and Alexandre started writing some basic Narval actions such as move, to move a file from a place to another, and had to grasp narval's concepts on the way.
    • Pierre-Yves dug into the code to understand how exceptions are propagated in the Narval engine, his goal was to get better reports.
    • Stéphanie and Alain worked on a nice bot status view.
    • Katia and Aurélien studied the new mercurial cache solution for vcsfile.
    • Julien started writing some documentation.
    • Celso, our Mexican friend, discovered some new features of recent cubicweb releases and setup his environment to later work on Spanish translation, CSS, etc.
    • Sylvain came with a basically working narval implementation on top of cubicweb, and spent the day helping various people...

  • Summer CubicWeb/Narval Sprint

    2010/08/10 by Sylvain Thenault

    Although this week is normally the regular annual holidays here at Logilab, some of us will sprint in Paris exceptionally.

    Focus

    We're starting this week with an exciting goal: integrating all our release process into our continuous integration suite (through the apycot cube). Including Debian repository management, pypi registration, etc...

    The hot stuff to achieve this is the third resurrection of Narval, the project Logilab was originally based on, but this time it is built on top of the CubicWeb framework. Narval will be used to rewrite some parts of apycot, in order to make it more flexible and powerful.

    It is not just a refactoring or a simple upgrade! We hope to automate common tasks, simplify maintenance, and thus enhance release quality, but also gain a lot of functionality in near future.

    Sprint roadmap

    • merge Apycotbot process manager into a new Narval incarnation, and rewrite it as Narval actions and recipes
    • improve vcsfile cube with a new cache system for mercurial
    • define Logilab's release process as new Narval recipes, triggered by actions such as adding release tag into the source repository

    More detailed stuff will come with the sprint reports that we'll try to issue each day.

    Information

    This sprint is taking place in Logilab's offices in Paris from Monday the 9th to the 13th of August 2010.


  • HOWTO change the value of a variable in all-in-one.conf with a migration script

    2010/08/03

    Here is a sample migration script (see also the cubicweb documentation on that topic) which changes the variable 'sender-addr'. There is an additional twist in that the variable is only updated if the instance is configured with a known value for that variable.

    wrong_addr = 'cubicweb@loiglab.fr' # known wrong address
    fixed_addr = 'cubicweb@logilab.fr'
    configured_addr = config.get('sender-addr')
    # check that the address has not been hand fixed by a sysadmin
    if configured_addr == wrong_addr:
        config['sender-addr'] = fixed_addr
        config.save()
    

    This is very useful in cases such as:

    • automatically changing the value of a variable which used a default value set by cubicweb-ctl create
    • changing the configuration of an instance with limited intervention from the local sysadmin (because asking him to hand edit the config file is error prone): he just has to deploy the new release and run cubicweb-ctl upgrade
    • fixing issues caused by settings in the all-in-one.conf file (e.g. changing the value of max-post-length)
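
    The "only update when the value is the known wrong one" twist can be tried outside of a migration. The sketch below is only an illustration: a plain dict stands in for the instance configuration, and the fix_setting helper is a hypothetical name, not part of CubicWeb.

```python
def fix_setting(config, key, known_wrong, fixed):
    """Replace config[key] only if it still holds the known wrong value.

    `config` is a plain dict standing in for the instance configuration;
    this helper is hypothetical and not part of CubicWeb.
    """
    if config.get(key) == known_wrong:
        config[key] = fixed
        return True   # value changed, the caller should persist the config
    return False      # untouched: already correct or hand-edited


WRONG = 'cubicweb@loiglab.fr'
FIXED = 'cubicweb@logilab.fr'

# Default value left by instance creation: it gets fixed.
untouched = {'sender-addr': WRONG}
# Value customised by a sysadmin: it must be preserved.
customised = {'sender-addr': 'admin@example.org'}

changed_untouched = fix_setting(untouched, 'sender-addr', WRONG, FIXED)
changed_customised = fix_setting(customised, 'sender-addr', WRONG, FIXED)
```

    Returning a boolean lets the caller save the configuration only when something actually changed, which mirrors the config.save() call placed inside the if block of the script above.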

  • OSCON 2010 - Data freedom and the semantic web

    2010/07/29 by Sandrine Ribeau

    I presented CubicWeb at OSCON 2010. I could only stay for a day and I did not get a chance to see a lot of talks, but judging from the conference schedule it seems only a few of them were related to making data available on the web. I will focus on these talks, for they are very relevant to us who are building the semantic web.

    http://assets.en.oreilly.com/1/event/45/oscon2010_125x125.jpg

    I highly encourage you to watch this video of Stormy Peters, "Is Your Data Free?". It addresses the issue of the privacy of data that you think belongs to you but actually doesn't. This is exactly what is behind the CubicWeb design: build your own web of data in a permission based environment in order to preserve your privacy.

    http://wiki.freebase.com/skins/freebaseUpdate/freebaselogo.png

    Open source, Open data presented by the Freebase folk, makes a very interesting parallel between open source and open data raising the problematic of versioning open data and providing quality data. There are methodologies and tools for open source software to ensure well designed and reliable code. There is absolutely nothing so far that could handle properly data versioning and data quality assurance. That is the biggest concern freebase has and through this talk they asked for help from the open source community so that more people would get involved in finding solutions to serve open data.

    An attendee raised an interesting question about the format that everybody would agree to use to represent the data. I was surprised by the answer. It seems that so far they do not believe that this is a concern, not to say they don't care, but almost. For freebase, the main concern and most challenging part of the data representation is to have a unique identifier. I am not quite sure I agree on that part. Yes, this is important, even mandatory, but there is also the need to define or use a known format to represent this data, (RDF for example) so that we can source this data. To be semantic data, it needs to be both identifiable and readable. And I do not see the point of publishing data on the web if it is not ready to use.

    Just for fun, look at Rewrite or Refactor: When to Declare Technical Bankruptcy, it might sounds familiar to you...

    CubicWeb presentation went well, an interested audience which was very happy to see that we could aggregate multiple types of sources in a CubicWeb application. Of course, it would be even better if we would support an RDF source such as dbpedia: don't worry that's going to happen. Also what raised an interest is the semantic views already integrated in the framework such as SIOC, OWL, FOAF, DOAP that you can find in blog entries (sioc), schema (owl), user (foaf), project (doap).

     

    By providing a platform for using data from multiple sources and publishing semantic data, CubicWeb is already a piece of the web of open data!


  • Building my photos web site with CubicWeb part IV: let's make it more user friendly

    2010/07/13 by Sylvain Thenault

    Step 0: updating code to CubicWeb 3.9 / cubicweb-file 1.9

    CubicWeb 3.9 brings several improvements that we'll want to use, and the 1.9 version of the file cube has a major change: the Image type has been dropped in favor of an IImage adapter that makes code globally much cleaner (although this is not directly visible here). So the first thing to do is to upgrade our cube to the 3.9 API. As CubicWeb releases are mostly backward compatible, this is not mandatory but it's easier to follow changes as they come than having a huge upgrade to do at some point. Also, this removes deprecation warnings which are a bit tedious...

    Since we only have very few lines of code, this step is pretty simple. Actually the main thing we have to do is to upgrade our schema, to remove occurrences of the Image type or replace them by the File type. Here is the (stripped) diff:

     class comments(RelationDefinition):
         subject = 'Comment'
    -    object = ('File', 'Image')
    +    object = 'File'
         cardinality = '1*'
         composite = 'object'
    
     class tags(RelationDefinition):
         subject = 'Tag'
    -    object = ('File', 'Image')
    +    object = 'File'
    
     class displayed_on(RelationDefinition):
         subject = 'Person'
    -    object = 'Image'
    +    object = 'File'
    
     class situated_in(RelationDefinition):
    -    subject = 'Image'
    +    subject = 'File'
         object = 'Zone'
    
     class filed_under(RelationDefinition):
    -    subject = ('File', 'Image')
    +    subject = 'File'
         object = 'Folder'
    
     class visibility(RelationDefinition):
    -    subject = ('Folder', 'File', 'Image', 'Comment')
    +    subject = ('Folder', 'File', 'Comment')
         object = 'String'
         constraints = [StaticVocabularyConstraint(('public', 'authenticated',
                                                    'restricted', 'parent'))]
    
     class may_be_readen_by(RelationDefinition):
    -    subject = ('Folder', 'File', 'Image', 'Comment',)
    +    subject = ('Folder', 'File', 'Comment',)
         object = 'CWUser'
    
    
    -from cubes.file.schema import File, Image
    +from cubes.file.schema import File
    
     File.__permissions__ = VISIBILITY_PERMISSIONS
    -Image.__permissions__ = VISIBILITY_PERMISSIONS
    

    Now, let's set the dependency in the __pkginfo__ file. As 3.8 simplifies this file, we can merge __depends_cubes__ (as introduced in the first blog of this series) with __depends__ to get the following result:

    __depends__ = {'cubicweb': '>= 3.9.0',
                   'cubicweb-file': '>= 1.9.0',
                   'cubicweb-folder': None,
                   'cubicweb-person': None,
                   'cubicweb-zone': None,
                   'cubicweb-comment': None,
                   'cubicweb-tag': None,
                   }
    

    If your cube is packaged for debian, it's a good idea to update the debian/control file at the same time, so you won't forget it.

    That's it for the API update, CubicWeb and cubicweb-file will handle other stuff for us. Easy, no?

    We can now start some more fun stuff...

    Step 1: let's improve site's usability for our visitors

    The first thing I've noticed is that people to whom I send links to photos with some login/password authentication get lost, because they don't grasp they have to login by clicking on the 'authenticate' link. That's probably because they only get a 404 when trying to access an unauthorized folder, and the site doesn't make clear that 1. you're not authenticated, 2. you could get more content by authenticating yourself.

    So, to improve this situation, I decided that I should:

    • make a login box appear for anonymous users, so they see at first glance a place to enter the login / password information I provided
    • customize the 404 page, suggesting that anonymous users log in.

    Here is the code, samples from my cube's views.py file:

    from cubicweb.selectors import is_instance, anonymous_user
    from cubicweb.web import box
    from cubicweb.web.views import basetemplates, error
    
    class FourOhFour(error.FourOhFour):
        __select__ = error.FourOhFour.__select__ & anonymous_user()
    
        def call(self):
            self.w(u"<h1>%s</h1>" % self._cw._('this resource does not exist'))
            self.w(u"<p>%s</p>" % self._cw._('have you tried to login?'))
    
    class LoginBox(box.BoxTemplate, basetemplates.LogFormView):
        """display a box containing the login form for anonymous users"""
        __regid__ = 'sytweb.loginbox'
        __select__ = box.BoxTemplate.__select__ & anonymous_user()
    
        title = _('Authenticate yourself')
        order = 70
    
        def call(self, **kwargs):
            self.w(u'<div class="sideBoxTitle"><span>%s</span></div>' % self.title)
            self.w(u'<div class="sideBox"><div class="sideBoxBody">')
            self.login_form('loginBox')
            self.w(u'</div></div>')
    

    The first class provides a new specific implementation of the default page you get on a 404 error, to display an explicit message for anonymous users.

    Note

    Thanks to the selection mechanism, it will be selected for anonymous users, since the additional anonymous_user() selector gives it a higher score than the default, and not for authenticated since this selector will return 0 otherwise (hence the object won't be selectable).
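
    To make the scoring argument concrete, here is a toy model of selector combination in plain Python. It is not CubicWeb's implementation: the Selector class, the yes and anonymous selectors and the select function are simplified stand-ins written for this illustration, assuming that an "and" combination scores 0 as soon as one part scores 0 and otherwise adds the scores up.

```python
class Selector:
    """Toy selector wrapping a function that returns a non-negative score."""

    def __init__(self, func):
        self.func = func

    def __call__(self, ctx):
        return self.func(ctx)

    def __and__(self, other):
        def combined(ctx):
            left = self(ctx)
            if not left:
                return 0          # one failing part makes the whole fail
            right = other(ctx)
            if not right:
                return 0
            return left + right   # otherwise scores add up
        return Selector(combined)


# Always selectable, with the lowest positive score.
yes = Selector(lambda ctx: 1)
# Scores 1 for anonymous visitors, 0 for authenticated ones.
anonymous = Selector(lambda ctx: 1 if ctx['anonymous'] else 0)


def select(candidates, ctx):
    """Return the name of the best scoring candidate, or None if all score 0."""
    score, name = max((selector(ctx), name) for name, selector in candidates)
    return name if score > 0 else None


candidates = [('default 404', yes), ('custom 404', yes & anonymous)]
for_anonymous = select(candidates, {'anonymous': True})
for_authenticated = select(candidates, {'anonymous': False})
```

    With these stand-ins, the custom page wins for an anonymous visitor because it scores 2 against 1, and drops out for an authenticated user because its combined score is 0.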

    The second class defines a simple box, that will be displayed by default with boxes in the left column, thanks to the default box.BoxTemplate's selector. The HTML is written to match the default CubicWeb boxes style. To get the actual login form, we inherit from the LogFormView view which provides a login_form method (handling some stuff under the cover for us, hence the multiple inheritance), that we simply have to call to get the form's HTML.

    login box / 404 screenshot

    The login box and the custom 404 page for an anonymous visitor (translated in french)

    Step 2: providing a custom index page

    Another thing we can easily do to improve the site is... a nicer index page (i.e. the first page you get when accessing the web site)! The default one is quite intimidating (that should change in the near future). I will provide a much simpler index page that simply lists available folders (i.e. photo albums in that site).

    from cubicweb.web.views import startup
    
    class IndexView(startup.IndexView):
        def call(self, **kwargs):
            self.w(u'<div>\n')
            if self._cw.cnx.anonymous_connection:
                self.w(u'<h4>%s</h4>\n' % self._cw._('Public Albums'))
            else:
                self.w(u'<h4>%s</h4>\n' % self._cw._('Albums for %s') % self._cw.user.login)
            self._cw.vreg['views'].select('tree', self._cw).render(w=self.w)
            self.w(u'</div>\n')
    
    def registration_callback(vreg):
        vreg.register_all(globals().values(), __name__, (IndexView,))
        vreg.register_and_replace(IndexView, startup.IndexView)
    

    As you can see, we override the default index view found in cubicweb.web.views.startup, getting back nothing but its identifier and selector since we override the top level view's call method.

    Note

    In that case, we want our index view to replace the existing one. We implement the registration_callback function, in which we register everything in the module but our IndexView, then we register it in place of the former index view.
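
    The effect of replacing a registered object can be pictured with a toy registry. This is only a sketch under simplifying assumptions: ToyRegistry and its methods are hypothetical stand-ins for this illustration, and the real registry does a lot more (selection, module loading and so on).

```python
class ToyRegistry:
    """Minimal stand-in for a view registry, keyed by identifier."""

    def __init__(self):
        self._objects = {}  # identifier -> list of registered classes

    def register(self, cls):
        self._objects.setdefault(cls.__regid__, []).append(cls)

    def register_and_replace(self, cls, replaced):
        # Drop the former class, then register the new one in its place.
        self._objects[replaced.__regid__].remove(replaced)
        self.register(cls)

    def candidates(self, regid):
        return list(self._objects.get(regid, []))


class DefaultIndexView:
    __regid__ = 'index'


class MyIndexView(DefaultIndexView):
    """Inherits the identifier, so it competes for the same slot."""


registry = ToyRegistry()
registry.register(DefaultIndexView)
registry.register_and_replace(MyIndexView, DefaultIndexView)
index_candidates = registry.candidates('index')
```

    After the replacement only the custom class remains under the 'index' identifier, so there is no ambiguity left between the two views.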

    Also, we added a title that tries to make it more evident that the visitor is authenticated, or not. Hopefully people will get it now!

    default index page screenshot

    The default index page

    new index page screenshot

    Our simpler, less intimidating, index page (still translated in french)

    Step 3: more navigation improvements

    There are still a few problems I want to solve...

    • Images in a folder are displayed in a somewhat random order. I would like to have them ordered by file name (which will usually, inside a given folder, also result in ordering photos by their date and time).
    • When clicking a photo from an album view, you have to get back to the gallery view to go to the next photo. This is pretty annoying...
    • Also, when viewing an image, there is no clue about the folder to which this image belongs.

    I will first try to explain the ordering problem. By default, when accessing related entities by using the ORM's API, you should get them ordered according to the target's class fetch_order. If we take a look at the file cube's entity class, we can see:

    class File(AnyEntity):
        """customized class for File entities"""
        __regid__ = 'File'
        fetch_attrs, fetch_order = fetch_config(['data_name', 'title'])
    

    By default, fetch_config will return a fetch_order method that will order on the first attribute in the list. We could expect to get files ordered by their name. But we don't. What's up doc ?

    The problem is that files are related to folders using the filed_under relation. And that relation is ambiguous, i.e. it can lead to File entities, but also to Folder entities. In such a case, since both entity types don't share the attribute on which we want to sort, we'll get linked entities sorted on a common attribute (usually modification_date).

    To fix this, we have to help the ORM. We'll do this in the method from the ITree folder's adapter, used in the folder's primary view to display the folder's content. Here's the code that I've put in our cube's entities.py file, since it's more logical stuff than view stuff:

    from cubes.folder import entities as folder
    
    class FolderITreeAdapter(folder.FolderITreeAdapter):
    
        def different_type_children(self, entities=True):
            rql = self.entity.cw_related_rql(self.tree_relation,
                                             self.parent_role, ('File',))
            rset = self._cw.execute(rql, {'x': self.entity.eid})
            if entities:
                return list(rset.entities())
            return rset
    
    def registration_callback(vreg):
        vreg.register_and_replace(FolderITreeAdapter, folder.FolderITreeAdapter)
    

    As you can see, we simply inherit from the adapter defined in the folder cube, then we override the different_type_children method to give a clue to the ORM's cw_related_rql method, that will generate the rql to get entities related to the folder by the filed_under relation (the value of the tree_relation attribute). The clue is that we only want to consider the File target entity type. By doing this, we remove the ambiguity and get back a RQL query that correctly orders files by their data_name attribute.
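
    The ordering behaviour described above can be mimicked with a few lines of plain Python. This is a toy model, not what the ORM does internally: the Entity tuple and the related function are made up for this illustration, and the ascending sort direction is an arbitrary choice.

```python
from collections import namedtuple

# Toy entities: only the attributes needed for the illustration.
Entity = namedtuple('Entity', 'etype name modification_date')

children = [
    Entity('File', 'img_002.jpg', 3),
    Entity('Folder', 'holidays', 1),
    Entity('File', 'img_001.jpg', 5),
]


def related(children, target_types=None):
    """Return children, sorted the way the text above describes.

    Without a target type the relation is ambiguous, so we can only sort
    on an attribute shared by every type. With a single target type we
    can sort on that type's own attribute.
    """
    if target_types is None:
        return sorted(children, key=lambda e: e.modification_date)
    picked = [e for e in children if e.etype in target_types]
    return sorted(picked, key=lambda e: e.name)


ambiguous = [e.name for e in related(children)]
files_only = [e.name for e in related(children, ('File',))]
```

    Restricting the target type is what makes the name-based ordering possible: once only files are considered, every result has the attribute we want to sort on.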

    Note

    • Adapters have been introduced in CubicWeb 3.9 / cubicweb-folder 1.8.
    • As seen earlier, we want to replace the folder's ITree adapter by our implementation, hence the custom registration_callback method.

    Ouf. That one was tricky...

    Now the easier parts. Let's start by adding some links on the file's primary view to see the previous / next image in the same folder. CubicWeb provides a component that does exactly that. To make it appear, the entity has to be adaptable to the IPrevNext interface. Here is the related code sample, extracted from our cube's views.py file:

    from cubicweb.selectors import is_instance
    from cubicweb.web.views import navigation
    
    
    class FileIPrevNextAdapter(navigation.IPrevNextAdapter):
        __select__ = is_instance('File')
    
        def previous_entity(self):
            rset = self._cw.execute('File F ORDERBY FDN DESC LIMIT 1 WHERE '
                                    'X filed_under FOLDER, F filed_under FOLDER, '
                                    'F data_name FDN, X data_name > FDN, X eid %(x)s',
                                    {'x': self.entity.eid})
            if rset:
                return rset.get_entity(0, 0)
    
        def next_entity(self):
            rset = self._cw.execute('File F ORDERBY FDN ASC LIMIT 1 WHERE '
                                    'X filed_under FOLDER, F filed_under FOLDER, '
                                    'F data_name FDN, X data_name < FDN, X eid %(x)s',
                                    {'x': self.entity.eid})
            if rset:
                return rset.get_entity(0, 0)
    

    The IPrevNext interface implemented by the adapter simply consists of the previous_entity / next_entity methods, which should respectively return the previous / next entity or None. We make an RQL query to get files in the same folder, ordered similarly (i.e. by their data_name attribute). We set ascending/descending ordering and a strict comparison with the current file's name (the "X" variable representing the current file).

    Note

    • The former implements selector should be replaced by the is_instance or adaptable selector with CubicWeb >= 3.9. In our case, is_instance is used to tell our adapter to apply to File entities.

    Notice that this query supposes we won't have two files with the same name in the same folder. Fixing this is out of the scope of this blog. And as I would like to have at some point a smarter, context sensitive previous/next entity, I'll probably never fix this query (though if I had to, I would probably choose to add a constraint in the schema so that we can't add two files with the same name in a folder).
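
    The logic of the two queries, and the limitation with duplicate names, can be reproduced on a plain list of file names. The helpers below are hypothetical and only mirror the strict comparison plus ordering used in the RQL above.

```python
def previous_name(names, current):
    """Greatest name strictly smaller than the current one, or None."""
    smaller = [name for name in names if name < current]
    return max(smaller) if smaller else None


def next_name(names, current):
    """Smallest name strictly greater than the current one, or None."""
    larger = [name for name in names if name > current]
    return min(larger) if larger else None


folder = ['img_003.jpg', 'img_001.jpg', 'img_002.jpg']
before = previous_name(folder, 'img_002.jpg')
after = next_name(folder, 'img_002.jpg')
no_previous = previous_name(folder, 'img_001.jpg')

# Two files sharing a name: the strict comparison jumps over the twin,
# so it can never be reached with previous / next links.
with_twins = ['img_001.jpg', 'img_002.jpg', 'img_002.jpg', 'img_003.jpg']
after_twin = next_name(with_twins, 'img_002.jpg')
```

    The last case shows why the query assumes unique names: from either 'img_002.jpg', the next link goes straight to 'img_003.jpg'.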

    One more thing: by default, the component will be displayed below the content zone (the one with the white background). You can change this in the site's properties through the ui, but you can also change the default value in the code by modifying the context attribute of the component:

    navigation.NextPrevNavigationComponent.context = 'navcontentbottom'
    

    Note

    context may be one of 'navtop', 'navbottom', 'navcontenttop' or 'navcontentbottom'; the first two being outside the main content zone, the two others inside it.

    screenshot of the previous/next entity component

    The previous/next entity component, at the bottom of the main content zone.

    Now, the only remaining item on my todo list is to see the file's folder. I'll use the standard breadcrumb component to do so. Similarly to what we've seen before, this component is controlled by the IBreadCrumbs interface, so we'll have to provide a custom adapter for the File entity, telling it that a file's parent entity is its folder:

    from cubicweb.web.views import ibreadcrumbs
    
    class FileIBreadCrumbsAdapter(ibreadcrumbs.IBreadCrumbsAdapter):
        __select__ = is_instance('File')
    
        def parent_entity(self):
            if self.entity.filed_under:
                return self.entity.filed_under[0]
    

    In this case, we simply use the attribute notation provided by the ORM to get the folder in which the current file (i.e. self.entity) is located.

    Note

    The IBreadCrumbs interface consists of a breadcrumbs method, but the default IBreadCrumbsAdapter provides a default implementation for it that will look at the value returned by its parent_entity method. It also provides a default implementation for this method for entities adapting to the ITree interface, but as our File doesn't, we have to provide a custom adapter.
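
    The idea of deriving the breadcrumbs from a parent lookup can be sketched as follows. This is a toy model, not the adapter's actual code: Node, parent_entity and breadcrumbs are simplified stand-ins, and here the result is a list of titles whereas the real component deals with entities.

```python
class Node:
    """Toy entity with a title and an optional parent."""

    def __init__(self, title, parent=None):
        self.title = title
        self.parent = parent


def parent_entity(node):
    """The only piece a custom adapter has to provide in this model."""
    return node.parent


def breadcrumbs(node):
    """Walk up the parents and return titles from the root down to node."""
    path = []
    while node is not None:
        path.append(node.title)
        node = parent_entity(node)
    path.reverse()
    return path


albums = Node('albums')
holidays = Node('holidays', parent=albums)
photo = Node('img_001.jpg', parent=holidays)
trail = breadcrumbs(photo)
```

    In this model, telling the file who its parent is suffices for the whole trail to be computed, which is why overriding parent_entity alone is enough in the adapter above.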

    screenshot of the breadcrumb component

    The breadcrumb component when on a file entity, now displaying parent folder.

    Step 4: preparing the release and migrating the instance

    Now that we've greatly enhanced our cube, it's time to release it and to upgrade the production site. I'll probably detail that process later, but for now I simply transfer the new code to the server running the web site.

    However, there are some commands to run to get things done properly... First, as I've added some translatable strings, I have to run:

    $ cubicweb-ctl i18ncube sytweb
    

    This updates the cube's gettext catalogs (the '.po' files under the cube's i18n directory). Once the above command is executed, I then update the translations.

    To see if everything is ok on my test instance, I do:

    $ cubicweb-ctl i18ninstance sytweb
    $ cubicweb-ctl start -D sytweb
    

    The first command compiles i18n catalogs (i.e. generates '.mo' files) for my test instance. The second command starts it in debug mode, so I can open my browser and navigate through the web site to see if everything is ok...

    Note

    In the 'cubicweb-ctl i18ncube' command, sytweb refers to the cube, while in the other two, it refers to the instance (if you can't see the difference, reread CubicWeb's concept chapter!).

    Once I've checked it's ok, I simply have to bump the version number in the __pkginfo__ module to trigger a migration once I have updated the code on the production site. I can check that the migration also goes fine by first restoring a dump from the production site, then upgrading my test instance.

    To generate a dump from the production site:

    $ cubicweb-ctl db-dump sytweb
    pg_dump -Fc --username=syt --no-owner --file /home/syt/etc/cubicweb.d/sytweb/backup/tmpYIN0YI/system sytweb
    -> backup file /home/syt/etc/cubicweb.d/sytweb/backup/sytweb-2010-07-13_10-22-40.tar.gz
    

    I can now get back the dump file ('sytweb-2010-07-13_10-22-40.tar.gz') to my test machine (using scp for instance) to restore it and start migration:

    $ cubicweb-ctl db-restore sytweb sytweb-2010-07-13_10-22-40.tar.gz
    $ cubicweb-ctl upgrade sytweb
    

    You'll have to answer some questions, as we've seen in an earlier post.

    Now that everything is tested, I can transfer the new code to the production server, apt-get upgrade cubicweb 3.9 and its dependencies, and eventually upgrade the production instance.

    Conclusion

    This is a somewhat long post that starts showing you the way CubicWeb provides a highly configurable user interface, as well as powerful and reusable components. And there are a lot of others like those!

    So see you next time for part V, where we'll probably want to do more ui stuff!


  • CubicWeb 3.9 released

    2010/07/12 by Sylvain Thenault

    CubicWeb 3.9.0 went out last week. We now have tested it in production and fixed the remaining bugs, which means it is now show time!

    http://www.cubicweb.org/file/1179905?vid=download

    What's new in CubicWeb 3.9?

    The 3.9 release development was started by a one week long sprint at the beginning of May. The two goals were first to make it easier to customize the look and feel of a CubicWeb application, and second to do a big cleanup of the javascript library. This led to the following major changes.

    • We introduced property sheets, which replace the former external_resources file, as well as define some constants that will be used to 'compile' cubicweb and cubes' stylesheets.
    • We started a new, clean cubicweb.css stylesheet, that tries to keep up with the rhythm. This is still a work in progress, and by default the old css is still used, unless specified otherwise in the configuration file.
    • We set the bases for web functional testing using windmill. See test cases in cubicweb/web/test/windmill/ and python wrapper in cubicweb/web/test_windmill/ if you want to use this in your own cube.
    • We set the bases for javascript unit-testing using qunit. See test cases in cubicweb/web/test/jstests/ and python wrapper in cubicweb/web/test_jscript/ if you want to use this in your own cube.
    • We cleaned the javascript code: the generic stuff moved into the cw namespace, the ajax api is now much simpler thanks to more generic and powerful functions. As usual backward compatibility was kept, which means that your existing code will still run, but you will see tons of deprecation warnings in the firebug console.
    • We implemented a simple documentation extraction system for javascript. Just put ReST in javascript comments, and get all the power of sphinx for documenting your javascript code.

    But that's not all! There are also two major changes in 3.9.

    http://www.cubicweb.org/file/1179904?vid=download

    Architectural change: adapters

    The first major change is the introduction of adapters, also found in the Zope Component Architecture and documented in the GoF book. This will allow for better application design and easier code reuse. You can see several uses of it in the framework, for instance the "ITree" adapter in cubicweb.entities.adapters, the "IBreadCrumbs" adapter in cubicweb.web.views.ibreadcrumbs, or the "ICalendarable" adapter in cubicweb.web.views.calendar.
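
    To make the idea concrete, here is a minimal, framework-independent sketch of the adapter pattern in plain Python. All the names in it (Folder, ITreeAdapter, ADAPTERS, adapt) are hypothetical and only illustrate the principle; this is not CubicWeb's actual API or implementation.

```python
# Illustrative sketch only: a tiny adapter registry, not CubicWeb's code.

class Folder(object):
    """A plain domain object that knows nothing about tree navigation."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


class ITreeAdapter(object):
    """Adapts any object with a `parent` attribute to a tree-like interface."""
    def __init__(self, obj):
        self.obj = obj

    def path(self):
        """Return the names from the root down to the adapted object."""
        names = []
        node = self.obj
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


# Hypothetical registry mapping (interface name, class) to an adapter class.
ADAPTERS = {('ITree', Folder): ITreeAdapter}


def adapt(obj, interface):
    """Return an adapter for `obj`, or None if none is registered."""
    adapter_cls = ADAPTERS.get((interface, type(obj)))
    return adapter_cls(obj) if adapter_cls is not None else None


root = Folder('photos')
child = Folder('2010', parent=root)
tree_path = adapt(child, 'ITree').path()
```

    The point of the pattern is that Folder itself stays free of tree-specific code: the behaviour lives in the adapter and can be replaced or reused for other classes.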

    Important full-text search improvement

    The second major change will directly benefit end users: we worked with our friends from SecondWeb to expose the ranking feature found in postgres full-text search. This clearly improves the user experience when doing full-text searches. Ranking may be finely tuned by setting different weights on entity types or on entity type attributes, or even by computing them dynamically per entity instance. Of course, all this is done in an adapter, see "IFTIndexableAdapter" in cubicweb/entities/adapters.py.

    Minor changes

    Other minor changes include:

    • support for wildcard text search for applications using postgres >= 8.4 as backend. Try searching for 'cub*' on cubicweb.org for instance.
    • inline edition of composite relation
    • nicer, clickable, schema image of the data model
    • enhanced support for the SQLserver database

    Enjoy!


  • Using RQL's HAVING clause to by-pass limitation of the WHERE clause

    2010/06/09 by Sylvain Thenault

    The HAVING clause, as in SQL, was originally introduced to restrict a query according to the value returned by an aggregate function, e.g.:

    Any X GROUPBY X WHERE X relation Y HAVING COUNT(Y) > 10
    

    It may however be used for something else...

    For instance, let's say you want to get people whose uppercased first name equals another person's uppercased first name. Since in the WHERE clause we are limited to 3-part expressions (<subject> <relation> <object>), such a thing can't be expressed (believe me or try it out). But it can be expressed using a HAVING comparison expression:

    Person X WHERE X firstname XFN, Y firstname YFN HAVING X > Y, UPPER(XFN) = UPPER(YFN)
    

    Nice, no? This opens up some new possibilities. Another example:

    Person X WHERE X birthday XB HAVING YEAR(XB) = 2000
    

    Get it? That lets you use transformation functions not only in selection but for restriction as well, which was the major flaw in the RQL language.

    Notice that while we would like this to work without the HAVING clause, this can't currently be done because it introduces an ambiguity in RQL's grammar that can't be handled by yapps, the parser generator we're using.
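
    For readers more at ease with Python than with RQL, the first HAVING-based query above can be read as the following filter over in-memory data. This is only an analogy to show what the query expresses, not how RQL is evaluated, and the people list is made up for the example.

```python
from itertools import combinations

# Made-up sample data: (eid, first name) pairs standing in for Person entities,
# listed in increasing eid order.
people = [(1, 'alice'), (2, 'Bob'), (3, 'ALICE'), (4, 'bob'), (5, 'Carol')]

# Keep each unordered pair only once (the role played by `X > Y` in the query)
# when the uppercased first names match (`UPPER(XFN) = UPPER(YFN)`).
same_name_pairs = [
    (x_eid, y_eid)
    for (y_eid, yfn), (x_eid, xfn) in combinations(people, 2)
    if xfn.upper() == yfn.upper()
]

# The query selects X only, so collect the distinct X eids.
selected = sorted(set(x_eid for x_eid, _ in same_name_pairs))
```

    The transformation (upper-casing) is applied while restricting, which is exactly what the WHERE clause alone cannot express.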


  • Deactivating the 'reledit' feature

    2010/06/09 by Sylvain Thenault

    The 'reledit' feature is the one that makes attributes/relations editable in an entity's primary view for authorized users (you know, the pen that appears when your mouse is over a field's value: clicking on it makes a form to edit this field appear).

    This is a nice feature, but you may not want it. It can be easily deactivated everywhere it's used automatically in the site by using the code snippet below:

    from cubicweb.web.views import editforms
    
    class DeactivatedAutoClickAndEditFormView(editforms.AutoClickAndEditFormView):
        def should_edit_attribute(self, entity, rschema, form):
            return False
    
        def should_edit_relation(self, entity, rschema, role, rvid):
            return False
    
    def registration_callback(vreg):
        vreg.register_and_replace(DeactivatedAutoClickAndEditFormView,
                                  editforms.AutoClickAndEditFormView)
    

  • Django, lessons learned in the world of startup companies

    2010/06/02 by Sandrine Ribeau

    I went to the BayPIGgies meeting last thursday. The talk of this session was led by the chief software architect of RubberCan, Barnaby Bienkowski. The idea was to explain why Django turns out to be the choice a lot of startups make when building their web applications.

    Government 2.0

    http://assets.sunlightfoundation.com/site/3.0/images/sf_logo_trans.png

    The fact that Django is recommended by the Sunlight Foundation is important. This foundation is a non-partisan, non-profit organization based in Washington, DC that focuses on the digitization of government data and the creation of tools and Web sites to make that data easily accessible to all citizens. This is part of what is called Government 2.0, a neologism for attempts to apply the social networking and integration advantages of Web 2.0 to the practice of government (see E-Government).

    It looks like the Sunlight Foundation recommends Django because it comes from the publishing industry. I am not sure what is so special about this, but I wish I could get more details on it, so please add your comments below.

    Since the CubicWeb community is still small, we are not yet recommended by such a large foundation, but we'll make more effort to talk about it and try to expand our community.

    Geo-localization

    http://geodjango.org/images/globe.png

    These days, geo-localization is a big deal in most applications. On that matter, what Django has to offer is GeoDjango, that recently became part of the Django core. It is integrated with the ORM and has pre-generated SQL queries, but it is not optimized. It uses PostGIS, which adds support for geographic objects to the PostgreSQL object-relational database. GeoDjango strives to make it as simple as possible to create geographic web applications, like location-based services. Some of the features it provides are:

    • Extensions to Django’s ORM for the querying and manipulation of spatial data
    • Editing of geometry fields inside the administration panels
    • Loosely-coupled, high-level Python interfaces for GIS geometry operations and data formats.
    http://openstreetmap.org/images/osm_logo.png?1271689861

    OpenStreetMap is used for the backend. It provides geographic data for any part of the world. This is a nice feature and we should consider it for CubicWeb. What we provide so far is an interface IGeocodable with related views gmap-view, gmap-bubble, geocoding-json and gmap-legend. We do not query this data yet, we simply render them nicely in a Google Map. You can find the details on how to use it here.

    Online stores

    Numerous web applications are not only service or data providers, they sell something. Satchmo is the Django tool to easily build online stores. It provides a shopping cart framework with checkout using different payment modules such as Authorize.net, TrustCommerce, CyberSource, PayPal, Google Checkout or Protx.

    CubicWeb does not provide a component for building an online store; it is not a domain we have worked on yet. But I'd like to talk a bit about the cubicweb-shoppingcart cube. This cube defines shopping items and a shopping cart, and lets you add items to the cart. It defines the types of shopping items, and only those can be added to the shopping cart. Whereas Satchmo requires you to define categories and add items within a category, cubicweb-shoppingcart does not force you to define categories. Creating shopping items is the only thing you need to do. That makes this component usable for more than online stores. For example, we used this cube to manage Euroscipy registration fees, reusing the generic schema of a "virtual" shopping cart and its related resources (web widgets, validation hook, ...).

    Re-usable components

    http://pinaxproject.com/site_media/img/pinax_logo.png

    Overall, people are satisfied with Pinax, as it provides basic components for blogging, tagging, registration, notification and so on. But one point that was raised is the difficulty of customizing Pinax components. It seems easy to write your own version of a Pinax component, but integrating it is a pain. All the components are tightly coupled, and customizing one is very likely to affect the others.

    This last point is a big disadvantage. Why? Well, as a developer there is always something that you need to adjust to fit your needs. So customizing components is something you will not avoid while developing your web application. And something I'd like to point out about CubicWeb is how simple it is to re-use existing components, which are independent from each other. This is as easy as Python inheritance. And with its VRegistry, selectors and application objects (see The VRegistry, selectors and application objects for more details), customization is well integrated into the framework.

    Assembling cubes and functionalities is very easy as well. Let's think of an example. We have three cubes: cubicweb-book, cubicweb-tag and cubicweb-comment. Cubicweb-book defines the Book entity type. Cubicweb-tag defines Tag entities and the ability to tag other entity types. Cubicweb-comment defines the Comment entity type and the ability to comment on other entity types. What if we want to create an application in which we can tag and comment on a Book? Well, this is done with the following schema definition, where we explicitly define the relations between the Book, Tag and Comment entity types:

    from yams.buildobjs import RelationDefinition
    class comments(RelationDefinition):
        subject = 'Comment'
        object = 'Book'
        cardinality = '1*'
        composite = 'subject'
    
    class tag(RelationDefinition):
        subject = 'Tag'
        object = 'Book'
        cardinality = '**'
    

    Forms

    Despite the fact that forms are easy in Django, there is no way to add inline entities, at least for now (see this proposition), as easily as in CubicWeb (see HTML form construction for more details). That is very neat when you create/edit related entities. Plus, since CubicWeb 3.6, forms are much easier to handle, and we still put a lot of effort into making them simpler.

    So, yes, overall Django is selected as the best compromise, but for the reasons I listed, CubicWeb should be considered.

    Watch out Django, we are getting on your way ;)


  • OSCON 2010 discount!!

    2010/05/21 by Sandrine Ribeau
    http://assets.en.oreilly.com/1/event/45/oscon2010_12year.png

    Since Logilab will be presenting CubicWeb at OSCON, we get to have a discount code giving 20% rebate on OSCON registration. Please feel free to use this discount code while registering: os10fos.

    See you there!


  • Building my photos web site with CubicWeb part III: storing images on the file-system

    2010/05/20 by Sylvain Thenault

    Step 1: configuring the BytesFileSystem storage

    To avoid cluttering my database, and to ease file manipulation, I don't want files to be stored in the database. I want to be able to create File/Image entities for some files on the server file system, where those files will be accessed to get the entities' data. To do so, I have to set a custom BytesFileSystemStorage storage for the File/Image 'data' attribute, which holds the actual file's content.

    Since the function to register a custom storage needs a repository instance as its first argument, we have to call it in a server startup hook. So I added it in cubes/sytweb/hooks.py:

    from os import makedirs
    from os.path import join, exists
    
    from cubicweb.server import hook
    from cubicweb.server.sources import storages
    
    class ServerStartupHook(hook.Hook):
        __regid__ = 'sytweb.serverstartup'
        events = ('server_startup', 'server_maintenance')
    
        def __call__(self):
            # directory where the files' content will be written
            bfssdir = join(self.repo.config.appdatahome, 'bfss')
            if not exists(bfssdir):
                makedirs(bfssdir)
                print 'created', bfssdir
            # use the same file-system storage for both entity types
            storage = storages.BytesFileSystemStorage(bfssdir)
            storages.set_attribute_storage(self.repo, 'File', 'data', storage)
            storages.set_attribute_storage(self.repo, 'Image', 'data', storage)
    

    Note

    • how we built the hook's registry identifier (__regid__): you can introduce 'namespaces' by using identifiers named like python modules (dotted names). This is especially important for hooks, where you usually want a new custom hook rather than overriding / specializing an existing one, but the concept may be used for any application object
    • we catch two events here: "server_startup" and "server_maintenance". The first is called on regular repository startup (e.g. as a server), the other for maintenance tasks such as shell or upgrade. In both cases, we need to have the storage set, else we'll be in trouble...
    • the path given to the storage is the place where a file added through the ui (or in the database before migration) will be located
    • be aware that by doing this, you can't write queries that restrict on the data attribute of File and Image anymore. Thankfully we don't usually do that on a file's content, or more generally on attributes of the Bytes type
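
    To illustrate the principle behind such a storage, here is a small self-contained sketch: the binary content goes to a file, and the "database" only keeps the path. FileSystemBytesStore, its methods and the database dict are all invented for this example; this is not the BytesFileSystemStorage implementation.

```python
import os
import tempfile


class FileSystemBytesStore(object):
    """Hypothetical stand-in for a file-system storage of Bytes attributes."""

    def __init__(self, directory):
        self.directory = directory
        if not os.path.exists(directory):
            os.makedirs(directory)

    def save(self, eid, attr, content):
        """Write `content` to a file and return the path to keep in the database."""
        path = os.path.join(self.directory, '%s_%s' % (eid, attr))
        with open(path, 'wb') as stream:
            stream.write(content)
        return path

    def load(self, path):
        """Read back the content stored at `path`."""
        with open(path, 'rb') as stream:
            return stream.read()


database = {}  # stands in for the SQL table: it only ever holds paths
store = FileSystemBytesStore(os.path.join(tempfile.mkdtemp(), 'bfss'))
database[(42, 'data')] = store.save(42, 'data', b'fake image bytes')
restored = store.load(database[(42, 'data')])
```

    Since the database side only sees a path, it also becomes clear why a query can no longer restrict on the content itself.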

    Now, if you've already added some photos through the web ui, you'll have to migrate existing data so that the file's content will be stored on the file-system instead of the database. There is a migration command to do so, let's run it in the cubicweb shell (in actual life, you'd have to put it in a migration script as we saw last time):

    $ cubicweb-ctl shell sytweb
     entering the migration python shell
     just type migration commands or arbitrary python code and type ENTER to execute it
     type "exit" or Ctrl-D to quit the shell and resume operation
     >>> storage_changed('File', 'data')
     [........................]
     >>> storage_changed('Image', 'data')
     [........................]
    

    That's it. Now, the files added through the web ui will have their content stored on the file-system, and you'll also be able to import files from the file-system as explained in the next part.

    Step 2: importing some data into the instance

    Hey, we're starting to have some nice features, let's give this new web site a try. For instance if I have a 'photos/201005WePyrenees' containing pictures for a particular event, I can import it to my web site by typing

    $ cubicweb-ctl fsimport -F sytweb photos/201005WePyrenees/
    ** importing directory /home/syt/photos/201005WePyrenees
      importing IMG_8314.JPG
      importing IMG_8274.JPG
      importing IMG_8286.JPG
      importing IMG_8308.JPG
      importing IMG_8304.JPG
    

    Note

    The -F option tells the command that folders should be mapped, hence my photos will all be under a Folder entity corresponding to the file-system folder.

    Let's take a look at the web ui:

    http://www.cubicweb.org/file/972765?vid=download

    Nothing different, I can't see the new folder... But remember our security model! By default, files are only accessible to authenticated users, and I'm looking at the site as anonymous, i.e. not authenticated. If I log in, I can now see:

    http://www.cubicweb.org/file/972766?vid=download

    Yeah, it's there! You can also notice that I can see some entities as well as folders and images the anonymous users can't. It just works everywhere in the ui since it's handled at the repository level, thanks to our security model.

    Now if I click on the newly inserted folder, I can see

    http://www.cubicweb.org/file/972767?vid=download

    Great! I get my pictures in the folder. I can now give a nicer name to this folder (provided I don't intend to import from it anymore, else already imported photos will be reimported), change permissions, title for some pictures, etc... Having good content is much more difficult than having a good web site ;)

    Conclusion

    We started to see here an advanced feature of our repository: the ability to store some parts of our data-model into a custom storage, outside the database. There is currently only the BytesFileSystemStorage available, but you can expect to see more coming in the near future.

    Also, we can now start to feed our web-site with some nice pictures! The site isn't perfect (far from it actually) but it's usable, and we can start using it and improve it on the way. The Incremental Cubic Way :)

    So see you next time to start tweaking the user interface!


  • CSS+JS sprint report - Day 1 and 2 (April 2010)

    2010/04/30 by Adrien Di Mascio

    These first two days essentially consisted in exploring the javascript world.

    Documenting javascript

    Sandrine and Alain worked on the javascript documentation tools and how they could be integrated into our sphinx generated documentation.

    http://www.percious.com/static/images/blog/sphinx.png

    They first studied pyjsdoc which unfortunately only generates HTML. After a somewhat successful attempt to generate sphinx ReST, we decided to use a consistent documentation format between python modules and js modules and therefore switched to a home-made, very simple javascript comment parser. Here's an example of what the parser understands:

    /**
     * .. cfunction:: myFunction(a, b, /*...*/, c, d)
     *
     *    This function is very **well** documented and does quite
     *    a lot of stuff :
     *    - task 1
     *    - task 2
     *
     *    :param a: this is the first parameter
     *    ...
     *    :return: 42
     */
    function myFunction(a, b, /*...*/, c, d) {
    }
    

    The extracted ReST snippets are then concatenated and inserted in the general documentation.
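
    To give an idea of how little is needed for such an extraction, here is a rough sketch in Python. It is not the parser written during the sprint: the regular expression, the extract_rest function and the sample javascript source are assumptions made for illustration.

```python
import re
import textwrap

# Matches /** ... */ documentation comments (non-greedy, across lines).
DOC_COMMENT_RE = re.compile(r'/\*\*(.*?)\*/', re.DOTALL)


def extract_rest(js_source):
    """Return the list of ReST snippets found in /** ... */ comments."""
    snippets = []
    for match in DOC_COMMENT_RE.finditer(js_source):
        lines = []
        for line in match.group(1).splitlines():
            # Drop the leading " * " decoration but keep the ReST indentation.
            lines.append(re.sub(r'^\s*\*\s?', '', line).rstrip())
        snippets.append('\n'.join(lines).strip('\n'))
    return snippets


JS_SOURCE = textwrap.dedent('''\
    /**
     * .. cfunction:: add(a, b)
     *
     *    Return the sum of a and b.
     *
     *    :return: a + b
     */
    function add(a, b) {
        return a + b;
    }
''')

snippets = extract_rest(JS_SOURCE)
```

    Note that such a naive pattern stops at the first closing comment marker, so a nested comment inside the documentation block would cut the snippet short.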

    Unit testing javascript

    Katia, Julien and Adrien looked at the different testing tools for javascript, with the two following goals in mind:

    • low-level unit testing, as cubicweb agnostic as possible
    • high-level / functional testing, we want to write navigation scenarios and replay them

    And the two winners of the exploration are:

    http://www.t0asted.com/getwindmill/wm_logo_round.png
    • QUnit for pure javascript / DOM testing. Julien and Adrien successfully managed to test a few cubicweb js functions, most notably the loadxhtml jquery plugin.
    • Windmill for higher level testing. Katia and Sylvain were able to integrate Windmill within the CubicWeb unit testing framework.

    Of course, there is still a lot of work that needs to be done. For instance, we would like to have a test runner facility to run QUnit-based tests on multiple platforms / browsers automatically.

    Parametrized stylesheets and vertical rhythm

    Sylvain worked on property sheets and managed to implement compiled CSS based on simple string interpolation. Of course, compiled CSS are still HTTP cached, automatically recompiled on debug mode, etc. On his way, he also got rid of the external_resources file. Backward compatibility will of course be guaranteed for a while.
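
    The principle of compiling CSS through string interpolation can be sketched in a few lines of Python. The property names, the template and the compile_css function below are made up for the example and do not reflect the actual property sheet format.

```python
# Hypothetical property sheet: plain Python constants.
PROPERTY_SHEET = {
    'fontcolor': '#000',
    'linkcolor': '#ff4500',
    'lineheight': '1.5em',
}

# Hypothetical stylesheet source using %(name)s placeholders.
CSS_TEMPLATE = """\
body { color: %(fontcolor)s; line-height: %(lineheight)s; }
a { color: %(linkcolor)s; }
"""


def compile_css(template, properties):
    """'Compile' a stylesheet by interpolating property values into it."""
    return template % properties


compiled = compile_css(CSS_TEMPLATE, PROPERTY_SHEET)
```

    With this kind of interpolation, a literal percent sign in the stylesheet (as in "width: 100%") has to be written "%%" in the template.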

    Nicolas worked on CSS and vertical rhythm and prepared a patch that introduces a basic rhythm. The tedious work will be to get every stylesheet to dance to the beat.


  • CubicWeb sprint in Paris about js and css

    2010/04/29 by Arthur Lutz

    Logilab is once again hosting a sprint around CubicWeb - 5 days in our Paris offices.

    The general focus will be around javascript & css :

    http://www.iconarchive.com/icons/enhancedlabs/lha-objects/128/Filetype-CSS-icon.png http://codesnip.net/wp-content/uploads/javascript.png
    • easily change the style of an application
    • handling of bundles merging javascript and css
    • have a clean javascript API, documented and tested
    • have documentation about the css & javascript parts in the cubicweb book

    This sprint is taking place from Thursday the 29th of April 2010 to the 5th of May 2010 (the weekend is off limits: the offices will be closed). You are more than welcome to come along and help out, contribute, or just pair program with someone. Coming only for a day or an afternoon is fine too... Network resources will be available for those bringing laptops.

    Address : 104 Boulevard Auguste-Blanqui, Paris. Ring "Logilab".

    Metro : St Jacques or Corvisart (Glacière is closest, but will be closed from Monday onwards)

    Contact : http://www.logilab.fr/contact

    Dates : 29/04/2010 to 30/04/2010 and 03/05/2010 to 05/05/2010


  • CubicWeb 3.8 released

    2010/04/28 by Sylvain Thenault

    CubicWeb 3.8.0 went out last week. Now that we have tested it and produced a 3.8.1, it's show time!

    What's new in CubicWeb 3.8?

    One of the most important changes is the http server update, which moves from the dead-end twisted.web2 to twisted.web. With this change comes the possibility to configure the maximum size of POST requests in the configuration file (it was hard-coded to 100 MB before).

    Other changes include:

    • CubicWeb should now be installable through pip or easy_install. This is still experimental, and we don't use it that much so please, give us some feedback! Some cubes are now also "pipable" (comment, blog...), but more will come with new releases.
    • the .execute() function lost its cache key argument. This is great news since it was a pain to explain and most cubicweb users didn't know how to handle it well (and I'm the greatest beneficiary since I won't have to explain it over and over again)
    • nicer schema and workflow views
    • refactored web session handling, which should now be cleaner, clearer, hence less buggy...
    • nicer skeleton generation for new cubes, cleaner __pkginfo__ (you don't have to define both __depends__ / __depends_cubes__ or __recommends__ / __recommends_cubes__ in the general case, and other cleanups)

    Enjoy!


  • Migrating cubicweb instances - benefits from a distributed architecture

    2010/04/22 by Arthur Lutz

    Aim : migrate N cubicweb instances from one server to another with no downtime.

    Prerequisites : have an explicit definition of the database host (not default or localhost). In our case, the database is hosted on another host. You are not migrating your pyro server. You are not using multisource (more documentation on that soon).

    Steps :

    1. on new machine : install your environment (pseudocode)

      apt-get install cubicweb cubicweb-applications apache2
      
    2. on old machine : copy your cubicweb and apache configuration to the new machine

      scp -r /etc/cubicweb.d newmachine:/etc/
      scp -r /etc/apache2/sites-available newmachine:/etc/apache2/
      
    3. on new machine : give new ids to pyro registration so the new instances can register

      cd /etc/cubicweb.d/ ; sed -i.bck 's/^pyro-instance-id=.*$/\02/' */all-in-one.conf
      
    4. on new machine : start your instances

      cubicweb-ctl start
      
    5. on new machine : enable sites and modules for apache and start it, then test it by modifying your /etc/hosts file.

    6. change dns entry from your oldmachine to newmachine

    7. shutdown your old machine (if it doesn't host other services or your database)

    8. That's it.
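
    The sed expression in step 3 may look cryptic: as far as I understand it, with GNU sed "\0" stands for the whole matched line, so the command simply appends "2" to each pyro-instance-id line. Here is the same transformation sketched in Python; the bump_pyro_ids function and the sample configuration content are made up for the example.

```python
import re

# Same pattern as in the sed expression: the whole pyro-instance-id line.
PYRO_ID_RE = re.compile(r'^pyro-instance-id=.*$', re.MULTILINE)


def bump_pyro_ids(conf_text, suffix='2'):
    """Append `suffix` to every pyro-instance-id line, leaving the rest untouched."""
    return PYRO_ID_RE.sub(lambda match: match.group(0) + suffix, conf_text)


# Made-up excerpt of an all-in-one.conf file.
SAMPLE_CONF = """\
[PYRO]
pyro-instance-id=myapp
pyro-ns-host=nameserver.example.org
"""

new_conf = bump_pyro_ids(SAMPLE_CONF)
```

    Giving the new instances a distinct identifier is what lets them register next to the old ones while both machines are running.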

    Possible enhancements : right from the start, use a pound server behind your apache. That way you can add backends and migrate smoothly by shutting down backends, which pound will take into account.

    http://www.cubicweb.org/file/893561?vid=download

  • Documentation progress

    2010/04/20 by Aurelien Campeas

    As part of an effort to improve the documentation (see the cw_course version) a lot of chapters have been completed (and filled with real-world examples). Many more were updated and reorganized.

    I won't list everything but here are the most important improvements:

    Picture under Creative Commons, courtesy of digitalnoise.

    • The publishing process
    • Templates & the architecture of views
    • Primary views customizations (including use of the uicfg module)
    • Controllers
    • Hooks & Operations
    • Proper usage of the ORM
    • Unit tests
    • Breadcrumbs
    • URL rewrite
    • Using the CW javascript library

    Last but not least, a whole new tutorial based on Sylvain's great series Building my photos Web site has been included. It covers some advanced topics such as Operations and sophisticated security settings.

    The visual style has been enhanced a bit to have better readability.

    As always, patches are welcome !


  • Building my photos web site with CubicWeb part II: security, testing and migration

    2010/04/13 by Sylvain Thenault

    This post will cover various topics:

    • configuring security
    • migrating an existing instance
    • writing some unit tests

    Goal

    Here are the read permissions I want:

    • folders, files, images and comments should have one of the following visibility rules:
      • 'public', everyone can see it
      • 'authenticated', only authenticated users can see it
      • 'restricted', only a subset of authenticated users can see it
    • managers (e.g. me) can see everything
    • only authenticated users can see people
    • everyone can see classifier entities (tag and zone)

    Also, unless explicitly specified, the visibility of an image should be the same as the visibility of its parent folder and the visibility of a comment should be the same as the one of the commented entity. If there is no parent entity, the default visibility is 'authenticated'.

    Regarding write permissions, that's much easier:

    • the anonymous user can't write
    • authenticated users can only add comment
    • managers will add the remaining stuff

    Now, let's implement that!

    Proper security in CubicWeb is done at the schema level, so you don't have to bother with it in the views, for the users will only see what they have access to.

    Step 1: adding permissions to the schema

    In the schema, you can grant access according to groups or RQL expressions (users get access if the expression returns some results). To implement the read security defined above, groups are not enough, we'll need to use RQL expressions. Here is the idea:

    • add a visibility attribute on folder, image and comment, with a vocabulary ('public', 'authenticated', 'restricted', 'parent')
    • add a may_be_read_by relation that links folder, image or comment to users,
    • add hooks to propagate permission changes.

    So the first thing to do is to modify the schema.py of my cube to define these relations:

    from yams.buildobjs import RelationDefinition
    from yams.constraints import StaticVocabularyConstraint
    
    class visibility(RelationDefinition):
        subject = ('Folder', 'File', 'Image', 'Comment')
        object = 'String'
        constraints = [StaticVocabularyConstraint(('public', 'authenticated',
                                                   'restricted', 'parent'))]
        default = 'parent'
        cardinality = '11' # required
    
    class may_be_read_by(RelationDefinition):
        subject = ('Folder', 'File', 'Image', 'Comment',)
        object = 'CWUser'
    

    We can note the following points:

    • we've added a new visibility attribute to folder, file, image and comment using a RelationDefinition
    • cardinality = '11' means this attribute is required. This is usually hidden under the required argument given to the String constructor, but since we declare the attribute through a RelationDefinition here, we set the cardinality explicitly (same thing for StaticVocabularyConstraint, which is usually hidden by the vocabulary argument)
    • the 'parent' possible value will be used for visibility propagation

    Now, we should be able to define security rules in the schema, based on this new attribute and relation. Here is the code to add to schema.py:

    from cubicweb.schema import ERQLExpression
    
    VISIBILITY_PERMISSIONS = {
        'read':   ('managers',
                   ERQLExpression('X visibility "public"'),
                   ERQLExpression('X visibility "authenticated", U in_group G, G name "users"'),
                   ERQLExpression('X may_be_read_by U')),
        'add':    ('managers',),
        'update': ('managers', 'owners',),
        'delete': ('managers', 'owners'),
        }
    AUTH_ONLY_PERMISSIONS = {
            'read':   ('managers', 'users'),
            'add':    ('managers',),
            'update': ('managers', 'owners',),
            'delete': ('managers', 'owners'),
            }
    CLASSIFIERS_PERMISSIONS = {
            'read':   ('managers', 'users', 'guests'),
            'add':    ('managers',),
            'update': ('managers', 'owners',),
            'delete': ('managers', 'owners'),
            }
    
    from cubes.folder.schema import Folder
    from cubes.file.schema import File, Image
    from cubes.comment.schema import Comment
    from cubes.person.schema import Person
    from cubes.zone.schema import Zone
    from cubes.tag.schema import Tag
    
    Folder.__permissions__ = VISIBILITY_PERMISSIONS
    File.__permissions__ = VISIBILITY_PERMISSIONS
    Image.__permissions__ = VISIBILITY_PERMISSIONS
    Comment.__permissions__ = VISIBILITY_PERMISSIONS.copy()
    Comment.__permissions__['add'] = ('managers', 'users',)
    Person.__permissions__ = AUTH_ONLY_PERMISSIONS
    Zone.__permissions__ = CLASSIFIERS_PERMISSIONS
    Tag.__permissions__ = CLASSIFIERS_PERMISSIONS
    

    What's important in there:

    • VISIBILITY_PERMISSIONS provides read access to an entity:
      • if user is in the 'managers' group,
      • or if visibility attribute's value is 'public',
      • or if visibility attribute's value is 'authenticated' and user (designated by the 'U' variable in the expression) is in the 'users' group (all authenticated users are expected to be in this group)
      • or if user is linked to the entity (the 'X' variable) through the may_be_read_by relation
    • we modify permissions of the entity types we use by importing them and modifying their __permissions__ attribute
    • notice the .copy(): we only want to modify 'add' permission for Comment, not for all entity types using VISIBILITY_PERMISSIONS!
    • the remaining parts of the security model are done using regular groups:
      • 'users' is the group to which all authenticated users will belong
      • 'guests' is the group of anonymous users

    Step 2: security propagation in hooks

    To fulfill our requirements, we have to implement:

    Also, unless explicitly specified, the visibility of an image should be the same as
    the visibility of its parent folder and the visibility of a comment should be the same as the
    one of the commented entity. If there is no parent entity, the default visibility is
    'authenticated'.
    

    This kind of 'active' rule will be implemented using CubicWeb's hook system. Hooks are triggered on database events such as the addition of a new entity or relation.

    The tricky part of the requirement is in unless explicitly specified, notably because when the entity addition hook is executed, we don't know its 'parent' entity yet (e.g. the folder of an image, the image commented by a comment). To handle such things, CubicWeb provides Operation, which allows you to schedule things to do at commit time.

    In our case we will:

    • on entity creation, schedule an operation that will set default visibility
    • when a "parent" relation is added, propagate parent's visibility unless the child already has a visibility set

    Here is the code in cube's hooks.py:

    from cubicweb.selectors import implements
    from cubicweb.server import hook
    
    class SetVisibilityOp(hook.Operation):
        def precommit_event(self):
            for eid in self.session.transaction_data.pop('pending_visibility'):
                entity = self.session.entity_from_eid(eid)
                if entity.visibility == 'parent':
                    entity.set_attributes(visibility=u'authenticated')
    
    class SetVisibilityHook(hook.Hook):
        __regid__ = 'sytweb.setvisibility'
        __select__ = hook.Hook.__select__ & implements('Folder', 'File', 'Image', 'Comment')
        events = ('after_add_entity',)
        def __call__(self):
            hook.set_operation(self._cw, 'pending_visibility', self.entity.eid,
                               SetVisibilityOp)
    
    class SetParentVisibilityHook(hook.Hook):
        __regid__ = 'sytweb.setparentvisibility'
        __select__ = hook.Hook.__select__ & hook.match_rtype('filed_under', 'comments')
        events = ('after_add_relation',)
    
        def __call__(self):
            parent = self._cw.entity_from_eid(self.eidto)
            child = self._cw.entity_from_eid(self.eidfrom)
            if child.visibility == 'parent':
                child.set_attributes(visibility=parent.visibility)
    

    Remarks:

    • hooks are application objects, hence have selectors that should match entity or relation type to which the hook applies. To match relation type, we use the hook specific match_rtype selector.
    • usage of set_operation: instead of adding an operation for each added entity, set_operation creates a single one and stores the eids of the entities to be processed in the session's transaction data. This is good practice to avoid the cost of manipulating many operations when creating a lot of entities in the same transaction.
    • the precommit_event method of the operation will be called at transaction's commit time.
    • in a hook, self._cw is the repository session, not a web request as is usual in views
    • depending on the hook's event, you have access to different attributes on the hook instance. Here:
      • self.entity is the newly added entity on 'after_add_entity' events
      • self.eidfrom / self.eidto are the eids of the subject / object entities on 'after_add_relation' events (you may also get the relation type using self.rtype)

    The 'parent' visibility value means "propagate using the parent's security". We want that attribute to be required, so we can't use None for this purpose: we would get an error before having any chance to propagate...
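The interplay between the 'parent' placeholder, the relation hook and the commit-time operation can be mimicked in plain Python. This is only a toy model to make the order of events visible: the Entity and Transaction classes and their methods are hypothetical and do not use CubicWeb's API.

```python
# Toy model of the visibility logic; none of these names come from CubicWeb.

class Entity:
    def __init__(self, eid, visibility='parent'):
        self.eid = eid
        # 'parent' is a placeholder meaning "not decided yet"
        self.visibility = visibility


class Transaction:
    def __init__(self):
        self.entities = {}
        # eids collected during the transaction, processed once at commit
        self.pending_visibility = set()

    def add_entity(self, eid, visibility='parent'):
        entity = Entity(eid, visibility)
        self.entities[eid] = entity
        # counterpart of SetVisibilityHook: only record the eid
        self.pending_visibility.add(eid)
        return entity

    def add_parent_relation(self, child, parent):
        # counterpart of SetParentVisibilityHook
        if child.visibility == 'parent':
            child.visibility = parent.visibility

    def commit(self):
        # counterpart of SetVisibilityOp.precommit_event
        for eid in sorted(self.pending_visibility):
            entity = self.entities[eid]
            if entity.visibility == 'parent':
                entity.visibility = 'authenticated'
        self.pending_visibility.clear()


tx = Transaction()
folder = tx.add_entity(1, visibility='restricted')
photo = tx.add_entity(2)   # no explicit visibility, gets a parent below
orphan = tx.add_entity(3)  # no explicit visibility, no parent
tx.add_parent_relation(photo, folder)
tx.commit()
print(photo.visibility, orphan.visibility)
```

The photo inherits its folder's visibility as soon as the relation is added, while the orphan only gets the default when the transaction commits.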

    Now, we also want to propagate the may_be_read_by relation. Fortunately, CubicWeb provides some base hook classes for such things, so we only have to add the following code to hooks.py:

    # relations where the "parent" entity is the subject
    S_RELS = set()
    # relations where the "parent" entity is the object
    O_RELS = set(('filed_under', 'comments',))
    
    class AddEntitySecurityPropagationHook(hook.PropagateSubjectRelationHook):
        """propagate permissions when new entity are added"""
        __regid__ = 'sytweb.addentity_security_propagation'
        __select__ = (hook.PropagateSubjectRelationHook.__select__
                      & hook.match_rtype_sets(S_RELS, O_RELS))
        main_rtype = 'may_be_read_by'
        subject_relations = S_RELS
        object_relations = O_RELS
    
    class AddPermissionSecurityPropagationHook(hook.PropagateSubjectRelationAddHook):
        __regid__ = 'sytweb.addperm_security_propagation'
        __select__ = (hook.PropagateSubjectRelationAddHook.__select__
                      & hook.match_rtype('may_be_read_by',))
        subject_relations = S_RELS
        object_relations = O_RELS
    
    class DelPermissionSecurityPropagationHook(hook.PropagateSubjectRelationDelHook):
        __regid__ = 'sytweb.delperm_security_propagation'
        __select__ = (hook.PropagateSubjectRelationDelHook.__select__
                      & hook.match_rtype('may_be_read_by',))
        subject_relations = S_RELS
        object_relations = O_RELS
    
    • the AddEntitySecurityPropagationHook will propagate the relation when filed_under or comments relations are added
      • the S_RELS and O_RELS sets as well as the match_rtype_sets selector are used here so that if my cube is used by another one, that cube will be able to configure security propagation by simply adding relations to one of the two sets.
    • the two others will propagate permissions changes on parent entities to children entities
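To make the three propagation cases more concrete, here is a rough plain-Python model of what happens to may_be_read_by. It is a sketch under my own assumptions, not how the CubicWeb base hooks are implemented: the dictionaries and the add_child, grant and revoke functions are invented for the illustration.

```python
# Toy model of may_be_read_by propagation; not CubicWeb's implementation.
from collections import defaultdict

# child -> parent links, as created through 'filed_under' or 'comments'
parent_of = {}
# entity name -> set of users allowed to read it
may_be_read_by = defaultdict(set)


def children_of(parent):
    return [child for child, p in parent_of.items() if p == parent]


def add_child(child, parent):
    """New child: copy the parent's current readers (add-entity case)."""
    parent_of[child] = parent
    may_be_read_by[child] |= may_be_read_by[parent]


def grant(entity, user):
    """Permission added on a parent: push it down to every descendant."""
    may_be_read_by[entity].add(user)
    for child in children_of(entity):
        grant(child, user)


def revoke(entity, user):
    """Permission removed from a parent: remove it from descendants too."""
    may_be_read_by[entity].discard(user)
    for child in children_of(entity):
        revoke(child, user)


add_child('photo1', 'folder')
grant('folder', 'toto')
add_child('photo2', 'folder')
readers_after_grant = (sorted(may_be_read_by['photo1']),
                       sorted(may_be_read_by['photo2']))
revoke('folder', 'toto')
readers_after_revoke = sorted(may_be_read_by['photo1'])
print(readers_after_grant, readers_after_revoke)
```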

    Step 3: testing our security

    Security is tricky. Writing some tests for it is a very good idea. You should even write them first, as Test Driven Development recommends!

    Here is a small test case that'll check the basis of our security model, in test/unittest_sytweb.py:

    from cubicweb.devtools.testlib import CubicWebTC
    from cubicweb import Binary
    
    class SecurityTC(CubicWebTC):
    
        def test_visibility_propagation(self):
            # create a user for later security checks
            toto = self.create_user('toto')
            # init some data using the default manager connection
            req = self.request()
            folder = req.create_entity('Folder',
                                       name=u'restricted',
                                       visibility=u'restricted')
            photo1 = req.create_entity('Image',
                                       data_name=u'photo1.jpg',
                                       data=Binary('xxx'),
                                       filed_under=folder)
            self.commit()
            photo1.clear_all_caches() # good practice, avoid request cache effects
            # visibility propagation
            self.assertEquals(photo1.visibility, 'restricted')
            # unless explicitly specified
            photo2 = req.create_entity('Image',
                                       data_name=u'photo2.jpg',
                                       data=Binary('xxx'),
                                       visibility=u'public',
                                       filed_under=folder)
            self.commit()
            self.assertEquals(photo2.visibility, 'public')
            # test security
            self.login('toto')
            req = self.request()
            self.assertEquals(len(req.execute('Image X')), 1) # only the public one
            self.assertEquals(len(req.execute('Folder X')), 0) # restricted...
            # may_be_read_by propagation
            self.restore_connection()
            folder.set_relations(may_be_read_by=toto)
            self.commit()
            photo1.clear_all_caches()
            self.failUnless(photo1.may_be_read_by)
            # test security with permissions
            self.login('toto')
            req = self.request()
            self.assertEquals(len(req.execute('Image X')), 2) # now toto has access to photo1
            self.assertEquals(len(req.execute('Folder X')), 1) # and to restricted folder
    
    if __name__ == '__main__':
        from logilab.common.testlib import unittest_main
        unittest_main()
    

    It is not complete, but it shows most of the things you will want to do in tests: adding some content, creating users and connecting as them in the test, etc...

    To run it type:

    [syt@scorpius test]$ pytest unittest_sytweb.py
    ========================  unittest_sytweb.py  ========================
    -> creating tables [....................]
    -> inserting default user and default groups.
    -> storing the schema in the database [....................]
    -> database for instance data initialized.
    .
    ----------------------------------------------------------------------
    Ran 1 test in 22.547s
    
    OK
    

    The first execution takes time, since it creates a sqlite database for the test instance. The second one will be much quicker:

    [syt@scorpius test]$ pytest unittest_sytweb.py
    ========================  unittest_sytweb.py  ========================
    .
    ----------------------------------------------------------------------
    Ran 1 test in 2.662s
    
    OK
    

    If you change your schema, you'll have to force the regeneration of that database. You do that by removing the tmpdb* files before running the test:

    [syt@scorpius test]$ rm tmpdb*
    

    BTW, pytest is a very convenient utility to control test execution, from the logilab-common package.
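If you prefer to script the tmpdb* cleanup (for instance on a machine without a Unix shell), a few lines of Python do the same job. This is only a sketch: remove_test_databases is a name I made up, and it assumes the files live directly in the test directory as in the shell command above.

```python
import glob
import os


def remove_test_databases(test_dir):
    """Delete tmpdb* files in test_dir and return the removed paths."""
    removed = []
    for path in sorted(glob.glob(os.path.join(test_dir, 'tmpdb*'))):
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed
```

Calling remove_test_databases('test') from the cube's directory should then be equivalent to the rm command.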

    Step 4: writing the migration script and migrating the instance

    Prior to those changes, I've created an instance and fed it with some data, so I don't want to create a new one, but to migrate the existing one. Let's see how to do that.

    Migration commands should be put in the cube's migration directory, in a file named <X.Y.Z>_Any.py ('Any' being there mostly for historical reasons).
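To illustrate the naming convention, here is a small sketch that picks, among a list of file names, the migration scripts lying between two versions. It is not CubicWeb's own upgrade logic: pending_migrations and MIGRATION_RE are hypothetical names, and I assume plain three-number versions. The point is that versions must be compared as numbers, not as strings (0.10.0 comes after 0.2.0).

```python
import re

# Hypothetical helper, for illustration only: CubicWeb has its own logic.
MIGRATION_RE = re.compile(r'^(\d+)\.(\d+)\.(\d+)_Any\.py$')


def pending_migrations(filenames, from_version, to_version):
    """Return scripts with from_version < version <= to_version, oldest first."""
    found = []
    for name in filenames:
        match = MIGRATION_RE.match(name)
        if match is None:
            continue  # not a "<X.Y.Z>_Any.py" file
        version = tuple(int(part) for part in match.groups())
        if from_version < version <= to_version:
            found.append((version, name))
    return [name for version, name in sorted(found)]


scripts = pending_migrations(
    ['0.10.0_Any.py', '0.2.0_Any.py', '0.1.0_Any.py', 'README'],
    from_version=(0, 1, 0), to_version=(0, 10, 0))
print(scripts)
```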

    Here I'll create a migration/0.2.0_Any.py file containing the following instructions:

    add_relation_type('may_be_read_by')
    add_relation_type('visibility')
    sync_schema_props_perms()
    

    Then I update the version number in cube's __pkginfo__.py to 0.2.0. And that's it! Those instructions will:

    • update the instance's schema by adding our two new relations and update the underlying database tables accordingly (the first two instructions)
    • update the schema's permission definitions (the last instruction)

    To migrate my instance I simply type:

    [syt@scorpius ~]$ cubicweb-ctl upgrade sytweb
    

    I will then be asked some questions to do the migration step by step. You should say YES when it asks if a backup of your database should be done, so you can get back to the initial state if anything goes wrong...

    Conclusion

    This is a somewhat long post that I bet you will have to read at least twice ;) There is a hell of a lot of information hidden in there... But that should start to give you an idea of CubicWeb's power...

    See you next time for part III !


  • Building my photos web site with CubicWeb (Part I)

    2010/04/01 by Sylvain Thenault

    Desired features

    • photo gallery;
    • photos stored on the file system and displayed dynamically through a web interface;
    • navigation through folder (album), tags, geographical zone, people on the picture... using facets;
    • advanced security (eg not everyone can see everything). More on this later.

    Let's go then

    Step 1: creating a new cube for my web site

    One note about my development environment: I wanted to use the packaged versions of CubicWeb and cubes while keeping my cube in my user directory, let's say ~/src/cubes. It can be done by setting the following environment variables:

    CW_CUBES_PATH=~/src/cubes
    CW_MODE=user
    

    The new cube, holding custom code for this web site, can now be created using:

    cubicweb-ctl newcube --directory=~/src/cubes sytweb
    

    Step 2: pick building blocks into existing cubes

    Almost everything I want to represent in my web site is already more or less modeled in existing cubes that I'll extend for my needs:

    • folder, containing Folder entity type, which will be used as both 'album' and a way to map file system folders. Entities are added to a given folder using the filed_under relation.
    • file, containing File and Image entity type, gallery view, and a file system import utility.
    • zone, containing the Zone entity type for hierarchical geographical zones. Entities (including sub-zones) are added to a given zone using the situated_in relation.
    • person, containing the Person entity type plus some basic views.
    • comment, providing a full commenting system allowing one to comment entity types supporting the comments relation by adding a Comment entity.
    • tag, providing a full tagging system as an easy and powerful way to classify entities supporting the tags relation by linking them to Tag entities. This will allow navigation into a large number of pictures.

    Ok, now I'll declare that my cube requires all of these by editing cubes/sytweb/__pkginfo__.py:

    __depends_cubes__ = {'file': '>= 1.2.0',
                         'folder': '>= 1.1.0',
                         'person': '>= 1.2.0',
                         'comment': '>= 1.2.0',
                         'tag': '>= 1.2.0',
                         'zone': None,
                         }
    __depends__ = {'cubicweb': '>= 3.5.10',
                   }
    for key,value in __depends_cubes__.items():
        __depends__['cubicweb-'+key] = value
    __use__ = tuple(__depends_cubes__)
    

    Notice that you can express the minimal version of each cube that should be used, None meaning whatever version is available.

    Step 3: glue everything together in my cube's schema

    from yams.buildobjs import RelationDefinition
    
    class comments(RelationDefinition):
        subject = 'Comment'
        object = ('File', 'Image')
        cardinality = '1*'
        composite = 'object'
    
    class tags(RelationDefinition):
        subject = 'Tag'
        object = ('File', 'Image')
    
    class filed_under(RelationDefinition):
        subject = ('File', 'Image')
        object = 'Folder'
    
    class situated_in(RelationDefinition):
        subject = 'Image'
        object = 'Zone'
    
    class displayed_on(RelationDefinition):
        subject = 'Person'
        object = 'Image'
    

    This schema:

    • allows commenting on and tagging File and Image entities by adding the comments and tags relations. This should be all we have to do for this feature, since the related cubes provide 'pluggable sections' which are automatically displayed in the primary view of entity types supporting the relation.
    • adds a situated_in relation definition so that image entities can be geolocalized.
    • adds a new displayed_on relation telling who can be seen on a picture.

    This schema will probably have to evolve as time goes by (for security handling at least), but since the ability to change and update the schema is one of CubicWeb's features (and goals), we won't worry about it now and will deal with it later when needed.

    Step 4: creating the instance

    Now that I have a schema, I want to create an instance of that new 'sytweb' cube, so I run:

    cubicweb-ctl create sytweb sytweb_instance
    

    hint: if you get an error while the database is initialized, you can avoid having to answer the questions again by running

    cubicweb-ctl db-create sytweb_instance
    

    This will use your already configured instance and start directly from the database creation step, thus skipping questions asked by the 'create' command.

    Once the instance and database are fully initialized, run

    cubicweb-ctl start sytweb_instance
    

    to start the instance, check you can connect on it, etc...

    Next times

    We will customize the index page, see security configuration, use the Bytes FileSystem Storage... Lots of cool stuff remaining :)

    Next post : security, testing and migration


  • Fun with graphs in apycot

    2010/03/24 by Arthur Lutz

    Yesterday I had a bit of quick fun with apycot on the train: using the existing plots infrastructure, I managed to quickly add a few graphs to the application. I only had an old dump of our apycot for mercurial (http://apycot.hg-scm.org/) so the timespan is not huge, but I like it anyway! Here are some dev screenshots while you wait for this feature in your application... The pylint grades were pretty constant so I'm not including that graph.

    http://www.cubicweb.org/file/779761?vid=download http://www.cubicweb.org/file/779768?vid=download

    Now, I have to make solid code and integrate it properly.


  • CubicWeb 3.7 released

    2010/03/19

    Hi there !

    I'm pleased to announce the 3.7 release of CubicWeb, after a much shorter development cycle than for the 3.6...

    But it still has some interesting changes:

    • NOW DEPENDS ON PYTHON 2.5
    • use the newly created logilab.database package (you'll have to install it as well as upgrade logilab.common and rql)
    • proper behaviour on the repository side of cubicweb:
      • dropped unsafe_execute, execute is now unsafe by default in hooks and operations. You can still explicitly control security using the security_enabled context manager
      • proper transaction hooks control using the hooks_control context manager
    • started some transaction undo support (only undo of deletion supported right now)
    • various other bug fixes and improvements

    Notice the 3.6 branch will still be maintained for some time.

    Enjoy!


  • Continuous Integration platform for Mercurial with apycot

    2010/03/15 by Arthur Lutz

    Since the mercurial 1.5 sprint Pierre-Yves has been working on improving Continuous Integration for Mercurial. All developers are encouraged to run the test suites and code quality checkers, but it's not always feasible to test every case: different OSes, different python versions, strange test dependencies, slow coverage runs, etc. Moreover it's generally useful to keep track of the results of previous tests, especially for benchmarks.

    At http://apycot.hg-scm.org/ you will find a production setup that now runs several variants of the test suite for all official repos and checks code style and documentation. Notification by email or RSS is available. For more details check out the FAQ.

    apycot is open source and uses the cubicweb platform. If you want to set one up for your project, check out the step by step documentation.

    http://www.cubicweb.org/file/749160?vid=download

  • CubicWeb 3.6 is (almost) out!

    2010/02/10 by Sylvain Thenault

    And that's great news: after several months of development (things started moving in the beginning of August 2009...), it should be available on our Debian repositories and ftp site in the next few hours.

    So, we can say this release contains a (too) large set of improvements and refactorings. I'll talk about the most important ones here.

    Appobject/Entity classes namespace cleanup

    First of all, the namespace cleanup... 3.6 is a step towards cleaning the entity classes (and more generally appobjects), which are used for a lot of things, making it impossible to tell for sure what could be used or not as an attribute or relation name. We decided to declare identifiers starting with _cw or cw_ reserved for the core classes. A lot of methods have been deprecated to clean up the base appobject class namespace. The remaining methods on entity classes will be removed in future versions, by the introduction of an ORM for database related methods, and by the (most probable) introduction of ZCA adapters for other aspects. The most notable renamings are:

    • .req -> ._cw
    • .rset -> .cw_rset
    • .row -> .cw_row
    • .col -> .cw_col

    This is probably what you'll see first when upgrading to 3.6: a huge stack of deprecation warnings on your screen :)
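For those wondering how an old name can keep working while still complaining, here is the general Python technique in a few lines. It is a generic sketch, not the actual CubicWeb code: the AppObject class below is a toy, and only the .req -> ._cw renaming is taken from the list above.

```python
import warnings


class AppObject:
    """Toy class; CubicWeb's real implementation differs."""

    def __init__(self, req):
        self._cw = req  # new, reserved-prefix name

    @property
    def req(self):
        # old name kept as a deprecated alias for the new one
        warnings.warn('.req is deprecated, use ._cw',
                      DeprecationWarning, stacklevel=2)
        return self._cw


obj = AppObject(req='fake request')
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    value = obj.req

print(value, caught[0].category.__name__)
```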

    Another step towards a nice and powerful form system

    • cleaner separation of responsibilities between form, field and widget

    • fields and widgets are now responsible for handling POSTed values (the editcontroller was handling this, making things really inflexible). The editcontroller has been rewritten and now properly gets values from fields. Another benefit is that you can now easily have a widget handling multiple inputs (see the new datetime picker for instance, or the custom widget for Bookmark.path)

    • refactored automatic forms:

      • rewrite 'generic relations' as a field
      • inlined forms are now encapsulated into a field

      so you get much more control over these parts of automatic forms by using the mechanisms generally provided by fields

      • clearer form relations tags: removed autoform_is_inlined, more understandable autoform_field_section

    Hooks refactoring

    Hooks are now regular appobjects, with selectors (don't forget to reuse Hook.__select__, remember that !). They should simply implement __call__ with no argument (well, only self) and will get info previously passed as argument as instance attributes, according to the matching event.

    Test API cleanup

    EnvBasedTC, ControllerTC, WebTest, RepoBasedTC are all gone. Simply use CubicWebTC, with a unified API similar to what you use in cubicweb-ctl shell and in usual development.

    The Bytes File System Storage

    You can now specify a custom storage for attributes of entities stored in the system source. This mechanism is used to provide a way to store Bytes attributes (such as File.data for instance) as files on the file-system instead of BLOBs in the database. You can configure which attributes should use this storage for your instance and then everything is transparent.

    Schema definition changes (yams 0.27)

    In your schema definition file:

    • "symetric" should be correctly spelled "symmetric" :)
    • "permissions" was renamed to "__permissions__"

    Also, permissions for relations are now supported per definition, not per type, at the cost of a visible impact when writing/reading the schema.

    Note about backward compatibility

    We worked hard to keep backward compatibility, but you shouldn't upgrade to 3.6 without checking that everything is fine... Check notably:

    • forms, if you're using custom forms by overriding internal methods
    • import for date functions from cubicweb.utils (they moved to logilab.common.date)

    And also

    CubicWeb 3.6 comes with a set of 37 "3.6"-ready cubes to avoid too many warnings!

    Enjoy!


  • CubicWeb documentation mini-sprint report

    2010/02/10 by Sylvain Thenault

    We held a one day sprint last week in our Paris office, trying to improve CubicWeb's documentation.

    There is a huge amount of work to do on this, much more than we can do in a one day sprint, even with many people. But you have to begin with something :)

    So, after a quick meeting to define priorities:

    • Stéphanie, Charles and later Sandrine (from her US home-office), began to add some documentation and screenshots to cubes. They started with the following cubes: addressbook, person, basket, tag, folder, forgotpwd, forge, tracker, vcsfile, keyword, blog and comment.
    • Julien explored sphinx abilities to build the index and extract docstrings. He applied this to improve the documentation of selectors.
    • Adrien (ach) and Celso, our friend from Mexico, tackled the task to improve the tutorial from a beginner's point of view.
    • Arthur added some pieces of documentation found in our intranet, mailing-list...
    • Pyves worked on a cubicweb-ctl command to generate schema images (png) for cubes, to include them in the cube's documentation.
    • Adrien (adim) and I helped the various teams.

    Huum, I think I did not forget anyone...

    Although there is still a lot to do (we need more doc sprints, stay tuned), this is really a nice start! This site should soon be updated to include more valuable cube descriptions and online documentation extracted from the contributed doc.


  • CubicWeb documentation sprint in feb. 2010

    2010/01/22 by Nicolas Chauvat
    http://farm4.static.flickr.com/3042/2871708248_950831962c_s.jpg

    On February 2nd, 2010 Logilab will host in its head offices a one-day sprint dedicated to the improvement of the CubicWeb documentation.

    Get in touch with Logilab if you want to participate in person or via the net: contact at logilab dot fr.

    Photo by Adam Hyde from the FLOSS blog


  • MS SQL Server backuping gotcha

    2010/01/19

    While working on the port of CubicWeb to the Windows platform, including supporting MS Sql Server as the database backend, I got bitten by a weird behavior of that database engine. When working with cubicweb, most administration commands are wrapped by the cubicweb-ctl utility and database backups are performed by running cubicweb-ctl db-dump <instancename>. If the instance uses PostgreSQL as the backend, this will call the pg_dump utility.

    When porting to Sql Server, I could not find such a utility, but I found that Transact SQL has a BACKUP DATABASE command, so I was able to call it using Python's pyodbc module. I tested it interactively, and was satisfied with the result:

    >>> from logilab.common.db import get_connection
    >>> cnx = get_connection(driver='sqlserver2005', database='mydb', host='localhost', extra_args='autocommit;trusted_connection')
    >>> cursor = cnx.cursor()
    >>> cursor.execute('BACKUP DATABASE ? TO DISK = ?', ('mydb', 'C:\\Data\\mydb.dump'))
    >>> cnx.close()
    

    However, testing that very same code through cubicweb-ctl produced no file in C:\Data\. To make a (quite) long story short, the thing is that the BACKUP DATABASE command is asynchronous (or maybe the odbc driver is) and the call to cursor.execute(...) will return immediately, before the backup actually starts. When running interactively, by the time I got to type cnx.close() the backup was finished, but when running in a function, the connection was closed before the backup started (which effectively killed the backup operation).

    I worked around this by monitoring the size of the backup file in a loop and waiting until that size gets stable before closing the connection:

    import os
    import time
    from logilab.common.db import get_connection
    
    filename = 'c:\\data\\toto.dump'
    dbname = 'mydb'
    cnx = get_connection(driver='sqlserver2005',
                         host='localhost',
                         database=dbname,
                         extra_args='autocommit;trusted_connection')
    cursor = cnx.cursor()
    cursor.execute("BACKUP DATABASE ? TO DISK= ? ", (dbname, filename,))
    prev_size = -1
    err_count = 0
    same_size_count = 0
    # poll once per second until the file size is stable for 10 checks,
    # or until the file could not be read 10 times
    while err_count < 10 and same_size_count < 10:
        time.sleep(1)
        try:
            size = os.path.getsize(filename)
            print 'file size', size
        except OSError, exc:
            # the file may not exist yet: count the error and retry
            err_count += 1
            print exc
            continue
        if size > prev_size:
            same_size_count = 0
            prev_size = size
        else:
            same_size_count += 1
    cnx.close()
    

    I hope sharing this will save some people time...
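For readers who want to reuse the polling part on its own, here is the same idea wrapped in a function, written for Python 3 and without any database dependency. Take it as a sketch: wait_for_stable_size and its parameters are names I chose for the example, and watching the file size remains a heuristic rather than a real completion signal from the server.

```python
import os
import time


def wait_for_stable_size(filename, stable_checks=10, max_errors=10,
                         interval=1.0):
    """Poll filename until its size stops growing.

    Returns the last observed size, or None if the file never appeared.
    """
    prev_size = None
    errors = 0
    unchanged = 0
    while errors < max_errors and unchanged < stable_checks:
        time.sleep(interval)
        try:
            size = os.path.getsize(filename)
        except OSError:
            errors += 1      # file not there (yet)
            continue
        if prev_size is None or size > prev_size:
            unchanged = 0    # still growing: restart the count
            prev_size = size
        else:
            unchanged += 1
    return prev_size
```

You would call it between cursor.execute(...) and cnx.close(), so that the connection stays open while the backup is being written.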

    Note: get_connection() comes from logilab.common.db, a wrapper module which tries to simplify writing code for different database backends by handling their various idiosyncrasies once and for all. If you want pure pyodbc code, you can replace it with:

    from pyodbc import connect
    cnx = connect(driver='SQL Server Native Client 10.0',
                  host='localhost',
                  database=dbname,
                  trusted_connection='yes',
                  autocommit=True)
    

    The autocommit=True part is especially important, because BACKUP DATABASE will fail if run from within a transaction.


  • Distributed scalable architecture using CubicWeb

    2010/01/14 by Arthur Lutz

    Here is a small example of one of the things you can do with cubicweb's scalable architecture when serving a large number of users.

    http://www.cubicweb.org/file/619085?vid=download

    Obviously you can easily add machines hosting CubicWeb to the middle bit to scale up. Adding multiple postgres servers is possible but more tricky. In a later blog post I will also show a way to split CubicWeb across multiple servers (separating the web engine from the data repository part). Debian is one of the possible host systems. You can use something else, it's just easier with debian...

    If you want a more detailed explanation of how we setup such an environment, please comment and we'll try to find the time to document it.

    As a systems administrator, I can then enjoy the use of the following tools :

    • clusterssh - to access all machines at once and do common tasks by only typing them once (a must!)
    • htop - to monitor resources in a nicer way than the simple top
    • iotop - to monitor input/output load
    • varnishhist - to check varnish is properly caching some content
    • apachetop - to watch in real time what is being accessed on the apache server
    • jnettop - to watch network flows
    • apt-get (on debian) to install all this in a few simple commands...

  • CubicWeb 3.6 sprint report

    2009/12/14 by Sylvain Thenault

    Last week we held a cubicweb sprint in our new Paris office !

    We were a nice number of people: 7 from Logilab's crew, including Sandrine, our US representative, Celso and Carlos from Mexico, plus some other guests and colleagues working on (cubicweb based of course) customer projects.

    The objective of the sprint was to kick out the 3.6 version of cubicweb, a big refactoring release started by Adrien and me a few months ago. Unfortunately we had been preempted by some other projects and the cubicweb development branch was simply painfully following changes done in the stable branch.

    Also, we decided to start using mq as a basis for code review. The sprint was a nice opportunity to test and see if it was actually usable for both developer and code reviewer. But more on this later :)

    The tasks to achieve to get this release out were:

    1. resurrect the default branch after 3 months of nasty bugs introduced by simply merging from the stable branch without any time to test
    2. update main cubes to the new test / uicfg / hooks / members api
    3. finish the refactoring of the editcontroller (which handles the POST of most web forms)
    4. finish the relation permissions change, including migration
    5. update the documentation
    6. test real applications

    Of course this was ambitious :) Among those, points 1., 2. and 4. took us much more time than I expected. The editcontroller work (3.) has not been finished yet, and we didn't find any time for the documentation (5.).

    Besides this, everyone (well, me at least ;) enjoyed their time while working hard all together in our new meeting room! The 3.6 version still needs a little work before being released, but the development branch is definitely back, with a great bunch of cubes ready. Among them : comment, tag, blog, keyword, tracker, forge, card, nosylist, etc...

    So many thanks to everyone, and particularly to our Mexican friends Carlos and Celso... Tequila! ;)

    By the way the good news is that we plan to do more sprints like this now that we've some room for it!


  • Customizing search box with magicsearch

    2009/12/13 by Adrien Di Mascio

    During last cubicweb sprint, I was asked if it was possible to customize the search box CubicWeb comes with. By default, you can use it to either type RQL queries, plain text queries or standard shortcuts such as <EntityType> or <EntityType> <attrname> <value>.

    Ultimately, all queries are translated to RQL since it's the only language understood on the server (data) side. To transform the user query into RQL, CubicWeb uses the so-called magicsearch component, which in turn delegates to a number of query preprocessors that are responsible for interpreting the user query and generating the corresponding RQL.

    The code of the main processor loop is easy to understand:

    for proc in self.processors:
        try:
            return proc.process_query(uquery, req)
        except (RQLSyntaxError, BadRQLQuery):
            pass
    

    The idea is simple: for each query processor, try to translate the query. If it fails, try with the next processor, if it succeeds, we're done and the RQL query will be executed.
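The same "try each processor in turn" pattern can be played with outside of CubicWeb. The sketch below is self-contained and purely illustrative: CannotProcess stands in for RQLSyntaxError / BadRQLQuery, and the two processor functions are made up (the real processors are classes and do real RQL parsing).

```python
class CannotProcess(Exception):
    """Stand-in for RQLSyntaxError / BadRQLQuery in this sketch."""


def rql_processor(uquery):
    # pretend that anything starting with "Any " is already valid RQL
    if not uquery.startswith('Any '):
        raise CannotProcess(uquery)
    return uquery, None


def fulltext_processor(uquery):
    # fallback: treat the input as a plain text search
    return 'Any X WHERE X has_text %(text)s', {'text': uquery}


def process(uquery, processors):
    for proc in processors:
        try:
            return proc(uquery)
        except CannotProcess:
            pass  # try the next processor
    raise CannotProcess(uquery)


processors = [rql_processor, fulltext_processor]
print(process('Any X WHERE X is Ticket', processors))
print(process('pylint', processors))
```

The order of the list matters: the first processor that does not raise wins, which is what the priority attribute controls in the real component.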

    Now that the general mechanism is understood, here's an example of code that could be used in a forge-based cube to add a new search shortcut to find tickets. We'd like to use the project_name:text syntax to search for tickets of project_name containing text (e.g. pylint:warning).

    Here's the corresponding preprocessor code:

    from cubicweb.web.views.magicsearch import BaseQueryProcessor
    
    class MyCustomQueryProcessor(BaseQueryProcessor):
        priority = 0 # controls order in which processors are tried
    
        def preprocess_query(self, uquery, req):
            """
            :param uquery: the query as sent by the browser
            :param req: the standard, omnipresent, cubicweb's req object
            """
            try:
                project_name, text = uquery.split(':')
            except ValueError:
                return None # the shortcut doesn't apply
            return (u'Any T WHERE T is Ticket, T concerns P, P name %(p)s, '
                    u'T has_text %(t)s', {'p': project_name, 't': text})
    

    The code is rather self-explanatory, but here are a few additional comments:

    • the class is registered with the standard vregistry mechanism and should be defined along the views
    • the priority attribute is used to sort and define the order in which processors will be tried in the main processor loop
    • the preprocess_query method returns None or raises an exception if the query can't be processed
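
    The translation step itself can be tried outside of CubicWeb. The sketch below is a hypothetical standalone variant of the shortcut (the function name project_shortcut is made up); unlike the class above, it splits on the first colon only, so the searched text may itself contain colons:

```python
def project_shortcut(uquery):
    """Translate 'project_name:text' into an (rql, args) pair, or None."""
    project_name, sep, text = uquery.partition(':')
    if not sep or not project_name.strip() or not text.strip():
        return None  # the shortcut doesn't apply
    rql = ('Any T WHERE T is Ticket, T concerns P, P name %(p)s, '
           'T has_text %(t)s')
    return rql, {'p': project_name.strip(), 't': text.strip()}


print(project_shortcut('pylint:warning'))
print(project_shortcut('just some text'))
```

    Returning None for anything that does not look like project_name:text is what lets the next processor have a go at the query.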

    To summarize, if you want to customize the search box, you have to:

    1. define a new query preprocessor component
    2. define its priority with respect to the other standard processors
    3. implement the preprocess_query method

    and CubicWeb will do the rest !


  • Using gettext on windows

    2009/12/01
    http://www.gnu.org/graphics/gnu-head-sm.jpg

    CubicWeb relies on GNU gettext for its translation management. However, the binary installers easily found for gettext (such as the one in python(x,y)) are for older versions, and compiling it is not that easy (especially in the Python world, where people do not necessarily have a C compiler at hand).

    We did the job and a binary installer for GNU gettext 0.17 is available on our ftp server.


  • Browsing the Semantic Web

    2009/10/31 by Nicolas Chauvat
    http://www.cubicweb.org/file/502157?vid=download

    Now that the Web of Data has become a reality, innovative applications are springing up everywhere. Here is a selection of web apps that help you browse the semantic web.

    • Parallax is a faceted browser that is demonstrated by displaying the content of Freebase.
    • Neofonie demonstrates its faceted browser by displaying the content of DBpedia at dbpedia.neofonie.de
    • VisiNav is a search engine that lets you refine searches in a way that is reminiscent of facets.
    • Falcons is a search engine that indexes RDF data.
    • Sindice is a search engine that indexes RDF data as well as data extracted from Microformats. It offers a public Sindice API that can be used to retrieve the search results as RDF, json or Atom.
    • SameAs is a service that returns all the equivalent URIs for a search term or a given URI.
    • When you enter search terms, Sig.ma collates the data from the resources included in the results of a search on Sindice.
    • When you publish your product data according to the GoodRelations ontology, information like the price shows up in Yahoo's search results.

    More and more services will appear in the coming months that make use of these new resources. Just for tagging, you may look at CommonTag, Zemanta and OpenCalais and imagine new ways to automate and facilitate the process of publishing information on the web.


  • Comparing CubicWeb with Drupal plus CCK extension

    2009/10/29 by Nicolas Chauvat
    http://www.cubicweb.org/file/502151?vid=download

    Drupal is a CMS written in PHP that is getting more and more visibility in the Semantic Web crowd. Several researchers from DERI have been using it as a test bed for their research projects and developed extensions to showcase their ideas. It is for example used to build the Semantic Web Dog Food site that archives the semantic web conferences and publishes them as Linked Open Data. The URL for this year's ISWC is http://data.semanticweb.org/conference/iswc/2009

    This led me to read more about Drupal than I previously had the incentive to. I have not had time to give it a try, but I skimmed the documentation and will try to compare it with CubicWeb from a software architecture point of view.

    Drupal defines a Node as an information item. The CCK (aka Content Construction Kit) can be used to define new types of Nodes through a web interface. Nodes and the bits and pieces used to display them as HTML are not packed together in components. The Features extension is planning on getting these bits packaged.

    If you are a Drupal user/developer and think I am not being fair to Drupal, please comment below.

    On the other hand, CubicWeb implemented the concept of reusable components very early. What is called a Node in Drupal is an Entity in CubicWeb. By design, CubicWeb does not have a web interface to define entities. The data model is part of the code. To efficiently maintain applications in production, changes to the data model must be tracked with changes to the code. Data model changes imply migration procedures. In CubicWeb, all of this is versioned and made part of the components. Where Drupal needs to grow extensions like CCK and Features, CubicWeb has more advanced possibilities by design, for example the ability to develop full-featured applications by assembling components.

    This was a very short comparison. I'm looking forward to getting a chance to discuss it with knowledgeable Drupal hackers.


  • Release early, release often

    2009/10/05 by Arthur Lutz

    Looking at the releases of the CubicWeb projects for the month of September alone, I think we can conclude that we are applying the Agile Software Development principle quite closely.

    http://farm4.static.flickr.com/3025/2732378117_cdd948fd1d_m.jpg
    • 11 releases of the cubicweb framework (now in stable and unstable flavors): 3.5.2, 3.5.1, 3.5.0, 3.4.11, 3.4.9, 3.4.8, 3.4.7, 3.4.6, 3.4.5, 3.4.4, 3.4.3
    • 3 releases of cubicweb-vcsfile
    • 4 releases of cubicweb-forge
    • 2 releases of cubicweb-drh
    • 2 releases of cubicweb-workorder
    • 1 release of cubicweb-conference, cubicweb-tracker, cubicweb-registration, cubicweb-timesheet, cubicweb-workcase, cubicweb-task, cubicweb-expense, cubicweb-calendar, cubicweb-invoice, cubicweb-nosylist, etc.

    Hope you can keep up or use the stable versions...

    photo by kennymatic under creative commons


  • Running CubicWeb on Windows

    2009/09/08

    This was not supported (and still isn't officially). But rumors have been circulating about a port of CubicWeb to Windows.

    http://pc-astuces.seebz.net/images/logo-windows-small.jpg

    I can confirm that there is some truth in this. A few changesets have been circulating, to be merged in the official repository any time now, enabling one to run CubicWeb on Windows. Support for running CubicWeb as a Windows service is not available yet, but should become available in the next few weeks.

    Update: check out the source of the 3.5 branch, and you will be able to create and start instances on Windows. Use cubicweb-ctl start -D for a non-daemon start.


  • Sparkles everywhere, CubicWeb gets fizzy

    2009/07/28 by Adrien Di Mascio
    http://www.logilab.org/file/9845/raw/sparkling.jpg

    Last week, we finally took a few days to dive into SPARQL in order to transform any CubicWeb application into a potential SPARQL endpoint.

    The first step was to get a parser. Fortunately the W3C provides a grammar definition and around 200 test cases. There were a few interesting options out there: we tried to reuse rdflib, rasqal, the sparql.g version designed for antlr3 and SimpleParse, but after two days of work, we had nothing that worked well enough. We decided it was not worth it and switched to yapps, since we knew yapps and rql already had a dependency on it.

    Maybe we'll consider changing the parser at some point later but the priority was to get something working as soon as we could and we finally came up with a version of fyzz passing 90% of the W3C test suite (of course, there might be some false positives).

    Fyzz parses the SPARQL query and generates something we decided to call an AST although it's still a bit rough for now. Fyzz understands simple triples, distincts, limits, offsets and other basic functionalities.

    Please note that fyzz is totally independent of cubicweb and it can be reused by any project.

    Here's an example of how to use fyzz:

    >>> from fyzz.yappsparser import parse
    >>> ast = parse("""PREFIX doap: <http://usefulinc.com/ns/doap#>
    ... SELECT ?project ?name WHERE {
    ...    ?project a doap:Project;
    ...         doap:name ?name.
    ... }
    ... ORDER BY ?name LIMIT 5 OFFSET 10
    ... """)
    >>> print ast.selected
    [SparqlVar('project'), SparqlVar('name')]
    >>> print ast.prefixes
    {'doap': 'http://usefulinc.com/ns/doap#'}
    >>> print ast.orderby
    [(SparqlVar('name'), 'asc')]
    >>> print ast.limit, ast.offset
    5 10
    >>> print ast.where
    [(SparqlVar('project'), ('', 'a'), ('http://usefulinc.com/ns/doap#', 'Project')),
     (SparqlVar('project'), ('http://usefulinc.com/ns/doap#', 'name'), SparqlVar('name'))]
    

    This AST is then processed and transformed into a RQL query which can finally be processed by CubicWeb directly.

    Here's what can be done in a cubicweb-ctl shell session (of course, this can also be done in the web application) of our forge cube:

    >>> from cubicweb.spa2rql import Sparql2rqlTranslator
    >>> query = """PREFIX doap: <http://usefulinc.com/ns/doap#>
    ... SELECT ?project ?name WHERE {
    ...    ?project a doap:Project;
    ...         doap:name ?name.
    ... }
    ... ORDER BY ?name LIMIT 5 OFFSET 10
    ... """
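    >>> # assumption: `schema` is the instance schema made available by the shell
    >>> translator = Sparql2rqlTranslator(schema)
    >>> qinfo = translator.translate(query)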
    >>> rql, args = qinfo.finalize()
    >>> print rql, args
    Any PROJECT, NAME ORDERBY NAME ASC LIMIT 5 OFFSET 10 WHERE PROJECT name NAME, PROJECT is Project {}
    

    From the above example, we can notice two things. First, for cubicweb to understand the doap namespace, we have to declare the correspondence between the standard doap vocabulary and our internal schema. This is done with yams.xy:

    >>> from yams import xy
    >>> xy.register_prefix('http://usefulinc.com/ns/doap#', 'doap')
    >>> xy.add_equivalence('Project', 'doap:Project')
    >>> xy.add_equivalence('Project name', 'doap:Project doap:name')
    

    Secondly, for now, we notice that the case is not preserved during the transformation : ?project becomes PROJECT in the rql query. This is probably something that we'll need to tackle quickly.
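
    To give a feel for the transformation, here is a toy sketch that turns triples shaped like the ones in ast.where into an RQL string. It is only an illustration under stated assumptions: the two lookup tables are hypothetical stand-ins for the yams.xy equivalences, variables are plain strings instead of SparqlVar objects, and none of this is the actual spa2rql code:

```python
DOAP = 'http://usefulinc.com/ns/doap#'

# Hypothetical equivalence tables, in the spirit of the yams.xy declarations.
ENTITY_TYPES = {(DOAP, 'Project'): 'Project'}
ATTRIBUTES = {(DOAP, 'name'): 'name'}


def rql_var(name):
    # RQL variables are upper case, which is why ?project becomes PROJECT.
    return name.upper()


def triples_to_rql(selected, triples):
    """Build an RQL string from (subject, predicate, object) triples.

    Variables are plain strings; URIs are (namespace, local name) pairs.
    """
    restrictions = []
    for subj, pred, obj in triples:
        if pred == ('', 'a'):
            # "?x a Type" becomes "X is Type"
            restrictions.append(
                '%s is %s' % (rql_var(subj), ENTITY_TYPES[obj]))
        else:
            # "?x prefix:attr ?y" becomes "X attr Y"
            restrictions.append(
                '%s %s %s' % (rql_var(subj), ATTRIBUTES[pred], rql_var(obj)))
    return 'Any %s WHERE %s' % (
        ', '.join(rql_var(var) for var in selected), ', '.join(restrictions))


rql = triples_to_rql(
    ['project', 'name'],
    [('project', ('', 'a'), (DOAP, 'Project')),
     ('project', (DOAP, 'name'), 'name')])
print(rql)
```

    The sketch ignores ORDER BY, LIMIT and OFFSET, and it upper-cases variable names just like the current translator does, which is where the loss of case comes from.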

    We've also added a few views in CubicWeb to wrap that. They will be available in the upcoming version 3.4.0 and are already available through our public mercurial repository.

    The door is now open, the path is still long, stay tuned !

    image under creative commons by beger (original)


  • CubicWeb at BayPiggies/OSCON in July 2009

    2009/07/14 by Nicolas Chauvat
    http://www.logilab.org/file/9631?vid=download

    I am pleased to announce that CubicWeb will be presented during a BayPIGgies meeting that will exceptionally take place in the OSCON conference building as the closing event of the Birds of a Feather on July 23rd at 8pm.

    Join us to get to know more about CubicWeb.

    Read the report.


  • INSEE, XML and RDF

    2009/07/06 by Nicolas Chauvat
    http://insee.fr/fr/css/images/logo_insee.gif

    I discovered that the French Institute for Statistics and Economic Studies (INSEE) has published part of its data as XML and RDF:

    We will try to put that data to good use.


  • Graphing version progress

    2009/07/06 by Arthur Lutz

    As you might have noticed we've upgraded http://www.logilab.org and http://www.cubicweb.org to CubicWeb 3.3 and a bunch of cubes were upgraded too. We can now benefit from a few cool bugfixes and features on those two forges.

    One of them I like and wish to mention is the graphing of a project's progress as a Burn Down Chart; you can see an example below. We're using some jQuery magic here, so you can roll the mouse over the graph to get more info (not on the screenshot below). This type of graph is generated on all the version views. This is particularly useful on some of our extranets to see the progress of a version (and whether tickets were added along the way).

    http://www.cubicweb.org/file/344424?vid=download

    For the coders out there you can check out cubicweb/web/views/plots.py and the example in the forge cube.


  • News from Europython 2009

    2009/07/02
    http://www.europython.eu/images/europython_logo.png

    Nicolas gave a talk at Europython2009 about CubicWeb. Reinout Van Rees posted his notes about the talk on his blog. Thanks Reinout. You may also read Nicolas' slides and watch his lightning talk.


  • What's new in cubicweb 3.3

    2009/06/24 by Arthur Lutz

    After the CubicWeb 3.2 blackout, the release early, release often mantra strikes back and CubicWeb 3.3 is out! A few bugs were fixed, mainly in migration scripts, and some new functionalities were added, among which the long-awaited standard plotting feature. We've added piechart support (with gchartwrapper) and standard plots with flot.

    under creative commons by jared

    Features

    • jquery has been updated to the latest 1.3.x version
    • plotting facilities using Flot and Google Chart have been added (replacing sometimes similar facilities using matplotlib)
    • the i18n command names have been changed
    • also a non-negligible amount of internal refactorings occurred, but this should be quite transparent

    Bugs fixed

    • problems with migrations using SQL have been fixed
    • bugs with the multi-source planner have been fixed
    • problems with synchronize-schema and not-null constraints have been fixed

    photo licenced under CreativeCommons by jared


  • CubicWeb for DBPedia and OpenLibrary at PyConFr'09

    2009/06/05 by Nicolas Chauvat
    http://www.cubicweb.org/file/343602?vid=download

    I presented CubicWeb at the French Python Conference held in Paris last week-end. Check out the slides and the video. See also my recent post Fetching book descriptions and covers on logilab.org.

    The code used during the demo uses the brand new RangeFacet, DateRangeFacet and HasRelationFacet brought by CubicWeb 3.3 and is available in the cubes dbpedia and book. We will put the demos online in a couple of weeks, once we get a new server with more horsepower. Help would be welcome to set them up as Amazon EC2 or Eucalyptus instances.


  • Cubicweb 3.2 : what's new

    2009/06/03 by Aurelien Campeas
    http://farm4.static.flickr.com/3045/2585844966_05f617cd92_m.jpg

    Cubicweb has experienced a rather large shakeup. Some things needed major restructuring, and that is why you have been left with few releases in the past few weeks. All the cubes available at http://www.cubicweb.org/project have been updated accordingly.

    Version 3.2 brings us considerable improvements for:

    Form construction

    Cubicweb has long had a nice system of forms smart enough to build themselves out of one cube's schema and some programmer-provided hints (or 'relation tags') to fine-tune things.

    It was not easy however to customize these forms nor to build new ones from scratch.

    So the new form system draws on django-forms' flexibility and style, keeps all the automatic goodness, and also makes it quite easy to build or customize forms at will.

    This is the area where backwards compatibility is mostly gone. Custom forms will have to be rewritten. Don't be angry about that: the forms overhaul was long overdue, and from now on it will only move in small, evolutionary, well-mannered steps.

    Relation tags

    Along with the form subsystem, the __rtags__ mechanism has been substantially updated and made more extensible. The __rtags__ were quite incorrectly attached to entity classes at the ORM level instead of being related to views and forms. The cubicweb.web.uicfg module now provides a comprehensive catalog of relation tag instances allowing automatic form and view customisation in a nicely declarative manner.

    Cubicweb 3.2 still remains compatible with the old __rtags__.

    View selection/filtering

    Cubicweb has also long had a nice mechanism to filter the views applicable to a given result set: the selector system. Various base classes were provided to hide selectors from the programmer and it had grown a little messy.

    Selectors now have a nicer declarative feeling and the framework does not try to hide them. Quite the opposite: writing, maintaining and using selectors is now a breeze, and the base classes are gone. More is less !

    However Cubicweb 3.2 remains backward compatible with the old selectors. Runtime warnings will help you track these and adapt as you see fit.
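
    To illustrate what "declarative" means here, the following toy sketch shows how predicates can be combined with the & operator, as in __select__ = A & B. The Selector class, the scoring rule and the helpers below are simplified assumptions written for this post; the toy implements only borrows the name of the real selector and is not the framework's implementation:

```python
class Selector:
    """Minimal composable predicate scoring a result set (toy version)."""

    def __init__(self, func):
        self.func = func

    def __call__(self, rset):
        return self.func(rset)

    def __and__(self, other):
        # Both selectors must match; scores add up so that more specific
        # combinations win over generic ones.
        def both(rset):
            left, right = self(rset), other(rset)
            return left + right if left and right else 0
        return Selector(both)


def implements(etype):
    # Toy counterpart of the selector of the same name: every entity in the
    # result set (here, a list of dicts) must be of the given type.
    return Selector(lambda rset: int(all(e['type'] == etype for e in rset)))


nonempty = Selector(lambda rset: int(len(rset) > 0))

track_only = nonempty & implements('Track')
print(track_only([{'type': 'Track'}]))
print(track_only([{'type': 'Album'}]))
print(track_only([]))
```

    A score of 0 means "not applicable"; any positive score means the view is a candidate, and the highest score wins.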

    Other features

    On the smaller features side, worth mentioning are:

    • new RichString attribute type in schema definitions, that simplifies format and encoding management,
    • inline relation edition is now possible (it was formerly limited to attributes) with 'reledit' view,
    • workflow definition has been simplified,
    • web/views has been somewhat cleaned up and reorganized,
    • automatic registration of app objects can now be switched to manual mode (no more hairy hard-to-debug registerer mechanism),
    • a generic SIOC view,
    • a view summarizing permissions across a whole app.

    We hope you enjoy this release! The cubicweb development team.

    photo by jared under creative commons


  • Some new standard facets on the way

    2009/05/29 by Adrien Di Mascio

    CubicWeb has this really nice builtin facet system to define restriction filters as easily as possible.

    We've just added two new kinds of facets in CubicWeb:

    • The RangeFacet which displays a slider using jquery to choose a lower bound and an upper bound. The RangeWidget works with either numerical values or date values
    • The HasRelationFacet which displays a simple checkbox and lets you refine your selection in order to get only entities that actually use this relation.
    http://www.cubicweb.org/file/343498?vid=download

    Here's an example of code that defines a facet to filter musical works according to their composition date:

    from cubicweb.selectors import implements
    from cubicweb.web.facet import DateRangeFacet
    
    class CompositionDateFacet(DateRangeFacet):
        # 1. make sure this facet is displayed only on Track selection
        __select__ = DateRangeFacet.__select__ & implements('Track')
        # 2. give the facet an id (required by CubicWeb)
        id = 'compdate-facet'
        # 3. specify the attribute name that actually stores the date in the DB
        rtype = 'composition_date'
    

    And that's it: on each page displaying tracks, you'll be able to filter them according to their composition date with a jquery slider.

    All this, brought by CubicWeb (in the next 3.3 version)


  • Cubicweb News 09.04

    2009/04/28 by Arthur Lutz

    In April a bunch of bugs were fixed on the stable branch of cubicweb (3.1 series) and we've been working on the next generation series: 3.2. Here's a quick summary of what's been going on:

    • cubicweb (the framework) was released twice with 3.1.3 and 3.1.4 which fixed a few bugs in the querier and the management screens
    • cubicweb-blog 1.5.0 was released with some improvements to the graphical rendering
    • cubicweb-tag 1.4.5 was released with notable improvements to tag clouds (added colors and better scaling of tags).
    • cubicweb-file got a bugfix in 1.4.4
    • cubicweb-mailinglist got a bugfix in 1.3.1.

    Next up, we are working on the 3.2.0 version of cubicweb with some particular focus on :

    • form generation
    • more explicit view registration (less magic)
    • simpler workflow definitions
    • js, css and ajax improvements

    Do not hesitate to try the development branch (named tls-sprint at the moment) or read the changes at http://www.logilab.org/hg/cubicweb


  • Cubicweb to be presented at French Linux Conference

    2009/03/31 by Arthur Lutz
    http://www.solutionslinux.fr/images/index_07.jpg

    The CubicWeb platform will be on display at the French Linux conference "Solution Linux", hosted in Paris over the next 3 days. You can meet us at the System@tic stand or see us present it during a talk about Web2 this afternoon.

    More info in French on the Logilab.org Blog.


  • Profiling your CubicWeb instance

    2009/03/27 by Adrien Di Mascio

    If you feel that one of your pages takes more time than it should to be generated, chances are that you're making too many RQL queries. Obviously, there are other possible reasons, but my personal experience tends to show this is the first thing to track down. Luckily for us, CubicWeb provides a configuration option to log RQL queries. In your all-in-one.conf file, set the query-log-file option:

    # web application query log file
    query-log-file=~/myapp-rql.log
    

    Then restart your application, reload your page and stop your application. The file myapp-rql.log now contains the list of RQL queries that were executed during your test. It's a simple text file containing lines such as:

    Any A WHERE X eid %(x)s, X lastname A {'x': 448} -- (0.002 sec, 0.010 CPU sec)
    Any A WHERE X eid %(x)s, X firstname A {'x': 447} -- (0.002 sec, 0.000 CPU sec)
    

    The structure of each line is:

    <RQL QUERY> <QUERY ARGS IF ANY> -- <TIME SPENT>
    

    Use the cubicweb-ctl exlog command to examine and summarize data found in such a file:

    adim@crater:~$ cubicweb-ctl exlog < ~/myapp-rql.log
    0.07 50 Any A WHERE X eid %(x)s, X firstname A {}
    0.05 50 Any A WHERE X eid %(x)s, X lastname A {}
    0.01 1 Any X,AA ORDERBY AA DESC WHERE E eid %(x)s, E employees X, X modification_date AA {}
    0.01 1 Any X WHERE X eid %(x)s, X owned_by U, U eid %(u)s {, }
    0.01 1 Any B,T,P ORDERBY lower(T) WHERE B is Bookmark,B title T, B path P, B bookmarked_by U, U eid %(x)s {}
    0.01 1 Any A,B,C,D WHERE A eid %(x)s,A name B,A creation_date C,A modification_date D {}
    

    This command sorts and uniquifies queries so that it's easy to spot the hot spots that need optimization.
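
    If you want to post-process the log yourself, the line format above is easy to parse. Here is a rough sketch of the kind of aggregation involved; it is not the code of the exlog command, the function names are made up, and the third sample line was added by hand to get a repeated query:

```python
from collections import defaultdict


def parse_line(line):
    """Split a log line into (rql, elapsed seconds), dropping the arguments."""
    query, _, timing = line.rstrip().rpartition(' -- ')
    elapsed = float(timing.lstrip('(').split(' sec')[0])
    rql = query.partition(' {')[0].strip()
    return rql, elapsed


def summarize(lines):
    """Return (total time, count, rql) tuples, slowest query first."""
    stats = defaultdict(lambda: [0.0, 0])  # rql -> [total time, count]
    for line in lines:
        if ' -- ' not in line:
            continue  # not a query log line
        rql, elapsed = parse_line(line)
        stats[rql][0] += elapsed
        stats[rql][1] += 1
    return sorted(((total, count, rql)
                   for rql, (total, count) in stats.items()), reverse=True)


SAMPLE_LOG = """\
Any A WHERE X eid %(x)s, X lastname A {'x': 448} -- (0.002 sec, 0.010 CPU sec)
Any A WHERE X eid %(x)s, X firstname A {'x': 447} -- (0.002 sec, 0.000 CPU sec)
Any A WHERE X eid %(x)s, X firstname A {'x': 448} -- (0.003 sec, 0.000 CPU sec)
"""

for total, count, rql in summarize(SAMPLE_LOG.splitlines()):
    print('%.3f %d %s' % (total, count, rql))
```

    Grouping on the query text without its arguments is what makes the 50 nearly identical firstname queries of the real output collapse into a single line.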

    Having said all this, it would probably be worth talking about the fetch_attrs attribute you can define in your entity classes because it can greatly reduce the number of queries executed but I'll make a specific blog entry for this.

    I should finally mention the existence of the profile option in all-in-one.conf. If set, this option will make your application run in a hotshot session and store the results in the specified file.


  • Presenting results with different views

    2009/03/22 by Nicolas Chauvat

    This article is part of the endless "you are never the only one experimenting with what sounds like a good idea". Just compare the following links:

    The MIT Simile project produced the Exhibit mega-js-widget:

    Google ran an experiment with alternate views for search results:

    • Location of PGA Tour tournaments
    • Evolution of nanotechnologies over time
    • Images in search results (click on Images on the right)

    CubicWeb has built-in support for applying views to a selection of objects:

    • Impressionism paintings in the museums of Normandy (click on the tabs)

  • Using email with CubicWeb

    2009/03/18 by Arthur Lutz

    You might have noticed here and there the mysterious user "mailbot" on cubicweb.org or logilab.org (both running the CubicWeb web app). Who is this user ?

    http://farm4.static.flickr.com/3244/2959912279_8446aa1abd_m.jpg

    Well, one of the cool features of CubicWeb is that you can interact with it simply by using email. Once you are registered on a site, you can subscribe to a software project, for example; from then on, you receive notifications of new tickets and comments on the project. When you receive such a notification, you can simply reply by email to the new ticket or comment, and CubicWeb on the receiving end will import the content of your email into the website. When content is imported that way, it's the mailbot doing the job.

    This is not rocket science, but it sure is useful. Follow the activity of the site by email and interact directly with comments and tickets from your mail client!

    image by husin.sani under creative commons


  • Migration in Python Web Frameworks ORMs

    2009/03/13 by Nicolas Chauvat

    Today, I felt like doing a quick tour of the migration features provided by the ORMs used by the Python web frameworks. I started with Django. South looks better than Django-evolution which looks much better than dmigrations which is very low level. I also had a look at SQLAlchemy.migrate, but again, that's too low level for me since I am looking to define migrations with the same vocabulary that is used for the data model, independently of the underlying database schema.

    http://south.aeracode.org/raw-attachment/wiki/Logo/logo-trac.png

    The features listed in the South documentation have all been in CubicWeb for some time, except dependencies and autodetection. In my opinion, the dependency feature is not needed when you already have a list of scripts ordered by number, which is the case in South and in CubicWeb. The autodetection feature is more interesting, but it is tricky to get right. CubicWeb's migration mechanism has had some kind of autodetection for a long time, but it is limited to the part that is easy to get right, yet quite common and useful:

    • synchronizing properties of attributes and relationships (i.e. a Person.name becomes fulltextindexed or a has_portfolio relationship changes from 1-1 to 1-n)
    • synchronizing permissions
    http://farm2.static.flickr.com/1007/666142945_1d675bc2a7_m.jpg

    For other common tasks like adding or removing entities and attributes, high-level directives are provided like add_entity_type or remove_attribute.

    Up to now, not pushing autodetection of changes in the data model has been a deliberate choice, for diff'ing two models is complex and creating a migration path is even more difficult. Moreover, letting the ORM automatically overwrite local changes in the database schema can be harmful in some cases.

    In CubicWeb, the idea is that the developer knows better than the framework, so let him decide what's best and provide him with a concise vocabulary to write the migration scripts.
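
    As for the "list of scripts ordered by number", the selection logic is simple enough to sketch in a few lines. This is a hypothetical illustration, not the actual migration code; the function names and the script file names are made up:

```python
def parse_version(text):
    """Turn '1.10.0' into (1, 10, 0) so that versions compare numerically."""
    return tuple(int(part) for part in text.split('.'))


def pending_migrations(scripts, installed, target):
    """Return the script names to run, in order, to go from installed to target.

    `scripts` maps a version string to a script name; a script is run when
    its version is greater than `installed` and at most `target`.
    """
    low, high = parse_version(installed), parse_version(target)
    selected = [(parse_version(version), name)
                for version, name in scripts.items()
                if low < parse_version(version) <= high]
    return [name for _, name in sorted(selected)]


SCRIPTS = {
    '1.2.0': '1.2.0_Any.py',
    '1.10.0': '1.10.0_Any.py',
    '1.3.1': '1.3.1_Any.py',
}
print(pending_migrations(SCRIPTS, '1.2.0', '1.10.0'))
```

    Comparing versions as tuples of integers rather than as strings matters: as strings, '1.10.0' would sort before '1.3.1'.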

    photo by Tim in Sydney under creative commons.


  • Google Maps and CubicWeb

    2009/03/09 by Adrien Di Mascio
    http://maps.google.com/intl/fr_ALL/images/maps_logo_small_blue.png

    There is this so-called 'gmap-view' in CubicWeb, the question is: how to use it ?

    Well, first, no surprise, you have to generate an API key to be able to use Google Maps on your server (make sure your usage conforms to the terms as defined by Google).

    Now, let's say you have defined the following schema:

    from yams.buildobjs import EntityType, String, Float, SubjectRelation
    
    class Company(EntityType):
        name = String(required=True, maxsize=64)
        # ... some other attributes ...
        latitude = Float(required=True)
        longitude = Float(required=True)
    
    class Employee(EntityType):
        # ... some attributes ...
        works_for = SubjectRelation('Company', cardinality='1*')
    

    And you'd like to be able to display companies on a map; you've also got these nice icons that you'd wish to use as markers on the map. First thing, define those three icons as external resources. You can do that by editing your CUBE/data/external_resources file:

    SMALL_MARKER_ICON=DATADIR/small_company.png
    MEDIUM_MARKER_ICON=DATADIR/medium_company.png
    BIG_MARKER_ICON=DATADIR/big_company.png
    

    We're nearly done, now. We just have to make our entity class implement the cubicweb.interfaces.IGeocodable interface. Here's an example:

    from cubicweb.entities import AnyEntity
    from cubicweb.interfaces import IGeocodable
    
    class Company(AnyEntity):
        id = 'Company' # this must match the type as defined in your schema
        __implements__ = AnyEntity.__implements__ + (IGeocodable,)
    
        def size(self):
            # execute() returns a result set: the count is in its first cell
            rset = self.req.execute('Any COUNT(E) WHERE E works_for C, C eid %(c)s',
                                    {'c': self.eid})
            return rset[0][0]
    
        # this is a method of IGeocodable
        def marker_icon(self):
            size = self.size()
            if size < 20:
                return self.req.external_resource('SMALL_MARKER_ICON')
            elif size < 500:
                return self.req.external_resource('MEDIUM_MARKER_ICON')
            else:
                return self.req.external_resource('BIG_MARKER_ICON')
    

    That's it, you can now call the gmap-view on a resultset containing companies:

    rset = self.req.execute('Any C WHERE C is Company')
    self.wview(rset, 'gmap-view', gmap_key=YOUR_API_KEY)
    

    Further configuration is possible, especially to control the size of the map or the default zoom level.

    To be fair, I must say that in a real-life cube, chances are you won't be able to specify latitude and longitude directly and that you'll only have an address. This is slightly more complex to do since you'll need to query a geocoding service (the Google one for instance) to transform your address into latitude/longitude. This will typically be done in a hook.
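
    The logic such a hook would run can be sketched independently of CubicWeb and of any particular geocoding service. Everything below is an assumption made for the illustration: the company is a plain dict, fill_coordinates is a made-up name, and fake_geocode returns invented coordinates in place of a real service call:

```python
def fill_coordinates(company, geocode):
    """Set latitude/longitude from the address when they are missing.

    `geocode` is any callable mapping an address string to a
    (latitude, longitude) pair, or None when the address is unknown.
    """
    if (company.get('latitude') is not None
            and company.get('longitude') is not None):
        return company  # coordinates were given explicitly, keep them
    position = geocode(company['address'])
    if position is None:
        raise ValueError('cannot geocode %r' % company['address'])
    company['latitude'], company['longitude'] = position
    return company


def fake_geocode(address):
    # Stand-in for a real geocoding service, with made-up coordinates.
    known = {'1 example street': (10.0, 20.0)}
    return known.get(address.lower())


company = fill_coordinates(
    {'name': 'ACME', 'address': '1 Example Street'}, fake_geocode)
print(company['latitude'], company['longitude'])
```

    Passing the geocoder in as a parameter keeps the service swappable and makes the logic testable without network access.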

    Here is a screenshot of Google Maps on a production site, the museums in Normandy:

    http://www.cubicweb.org/file/229641?vid=download

  • What's new in CubicWeb 3.1.0

    2009/03/04 by Arthur Lutz
    http://www.cubicweb.org/file/212907?vid=download

    Here is a brief summary of what you get with the new CubicWeb 3.1.0 release. You could obviously go through the tickets on the version page, but here is the short version.

    What new features ?

    • a few OWL and Linked_Data functionalities
    • navigation is now more complete on search results
    • when installing a new cube that requires anonymous access (public site) the installer enables that access

    What bugs are fixed ?

    • a few things didn't work with opera and IE6
    • json controller conflicts solved
    • the newcube command is working again
    • facets don't get in the way of the association process anymore
    • and more...

    Hope you enjoy this version... to see what's coming next, you can check out the planned versions of CubicWeb : 3.1.1 and 3.2.0.


  • Using Facets in Cubicweb

    2009/02/25 by Adrien Di Mascio

    Recently, for internal purposes, we made a little cubicweb application to help us organize visits to find new office locations. Here's an excerpt of the schema:

    class Office(WorkflowableEntityType):
        price = Int(description='euros / m2 / HC / HT')
        surface = Int(description='m2')
        description = RichString(fulltextindexed=True)
        has_address = SubjectRelation('PostalAddress', cardinality='1?', composite='subject')
        proposed_by = SubjectRelation('Agency')
        comments = ObjectRelation('Comment', cardinality='1*', composite='object')
        screenshots = SubjectRelation(('File', 'Image'), cardinality='*1',
                                      composite='subject')
    

    The two other entity types defined in the schema are Visit and Agency but we can also guess from the above that this application uses the two cubes comment and addressbook (remember, cubicweb is only a game where you assemble cubes !).

    While we know that just defining the schema is enough to have a full, usable, (testable!) application, we also know that every application needs to be customized to fulfill the needs it was built for. So in this case, what we needed most was some custom filters that would let us restrict searches according to surfaces, prices or zipcodes. Fortunately for us, Cubicweb provides the facets mechanism and a few base classes that make the task quite easy:

    from cubicweb.selectors import implements
    from cubicweb.web.facet import RelationFacet
    
    class PostalCodeFacet(RelationFacet):
        id = 'postalcode-facet'             # every registered class must have an id
        __select__ = implements('Office')   # this facet should only be selected when
                                            # visualizing offices
        rtype = 'has_address'               # this facet is a filter on the entity linked to
                                            # the office through the relation has_address
        target_attr = 'postalcode'          # the filter's key is the attribute "postalcode"
                                            # of the target PostalAddress entity
    

    This is a typical RelationFacet: we want to be able to filter offices according to the attribute postalcode of their associated PostalAddress. Each line in the class is explained by the comment on its right.

    Now, here is the code to define a filter based on the surface attribute of the Office:

    class SurfaceFacet(AttributeFacet):
        id = 'surface-facet'              # every registered class must have an id
        __select__ = implements('Office') # this facet should only be selected when
                                          # visualizing offices
        rtype = 'surface'                 # the filter's key is the attribute "surface"
        comparator = '>='                 # override the default comparator since
                                          # we want to filter according to a minimal
                                          # value, not an exact one
    
        def rset_vocabulary(self, ___):
            """override the default vocabulary method since we want to hard-code
            our threshold values.
            Not overriding would generate a filter box with all existing surfaces
            defined in the database.
            """
            return [('> 200', '200'), ('> 250', '250'),
                    ('> 275', '275'), ('> 300', '300')]
    

    And that's it: we have two filter boxes automatically displayed on each page presenting more than one office. The price facet is basically the same as the surface one but with a different vocabulary and with rtype = 'price'.
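
    As a rough sketch of what such a price facet might look like: the post only says that the vocabulary and rtype differ, so the id, the threshold values and the unchanged comparator below are assumptions. AttributeFacet and implements are replaced by small stand-ins so that the snippet runs without CubicWeb installed; in the application they come from the framework, as in the code above.

```python
# Stand-ins so this sketch runs on its own. They are NOT the CubicWeb
# classes; in the real application both names come from the framework.
class AttributeFacet(object):
    comparator = '='


def implements(*etypes):
    return ('implements', etypes)


class PriceFacet(AttributeFacet):
    id = 'price-facet'                # hypothetical id, following the naming above
    __select__ = implements('Office') # same selector as SurfaceFacet
    rtype = 'price'                   # the filter's key is the attribute "price"
    comparator = '>='                 # kept from SurfaceFacet; the post does not
                                      # say whether it was changed for prices

    def rset_vocabulary(self, ___):
        """Hard-coded thresholds (made-up values, in euros / m2)."""
        return [('> 100', '100'), ('> 150', '150'), ('> 200', '200')]
```

    Each vocabulary entry is a (label, value) pair, exactly as in SurfaceFacet: the label is what the filter box displays and the value is what gets compared with the price attribute.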

    (The cube also benefits from the built-in Google Maps views defined by CubicWeb, but that's for another blog post.)


  • Unittesting with CubicWeb

    2009/02/17 by Arthur Lutz

    In test-driven development (TDD), you write the test before you write the code. In a web application, a number of levels can be tested. Here are a few hints at how we manage some of the testing with CubicWeb.

    We use pytest (which is an extension of Python's unittest framework available in logilab-common) to execute all tests across the cubes. Even in the core of CubicWeb the tests are spread out across the server, web part, repository, common tools... so a simple pytest command crawls through all these tests and runs them.

    http://www.sqlite.org/images/SQLite.gif

    The problem: one of the tricky things with testing CubicWeb is that the structure of the data is imported into the database (which enables us to easily modify the schema on running data), and that test data can take a long time to generate and fake for a web application that is used to talking to a proper database server (postgres). So we thought of inserting test data into an sqlite database. After a bit of work on compatibility, it was up and running. But setting up that database was (and still is) quite slow: testing was becoming way too long, and TDD (with frequent testing) was becoming impossible.

    The solution: we ended up storing the sqlite database in a temporary file which is reused if it's not too old, and TDD was back in the loop. So if you're developing for CubicWeb, don't worry about those test/tmpdb files; on the contrary, they mean you're running tests. For writing tests, check out the content about it in the book.
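
    The idea of reusing a database file while it is recent enough can be sketched with the standard library alone. This is not CubicWeb's actual test code: the function names, the one-hour threshold and the toy table are all hypothetical, and build_test_db merely stands in for the slow setup step.

```python
import os
import sqlite3
import tempfile
import time

MAX_AGE = 3600  # seconds; hypothetical threshold for "not too old"


def build_test_db(path):
    """Stand-in for the slow step that creates and fills the test database."""
    conn = sqlite3.connect(path)
    try:
        conn.execute('CREATE TABLE office (surface INTEGER, price INTEGER)')
        conn.execute('INSERT INTO office VALUES (250, 120)')
        conn.commit()
    finally:
        conn.close()


def get_test_db(path, max_age=MAX_AGE):
    """Make sure a usable database exists at `path`.

    Returns True if the file had to be (re)built, False if a recent
    enough file was reused.
    """
    fresh = (os.path.exists(path)
             and time.time() - os.path.getmtime(path) < max_age)
    if not fresh:
        if os.path.exists(path):
            os.remove(path)  # drop the stale file before rebuilding
        build_test_db(path)
    return not fresh


if __name__ == '__main__':
    db_path = os.path.join(tempfile.mkdtemp(), 'tmpdb')
    print(get_test_db(db_path))  # first run: the database is built
    print(get_test_db(db_path))  # second run: the recent file is reused
```

    Using the file's modification time keeps the check cheap, at the cost of occasionally running against a database that predates a schema change; deleting the file forces a rebuild.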


  • RSS for latest releases

    2009/01/28 by Arthur Lutz
    http://upload.wikimedia.org/wikipedia/commons/thumb/4/43/Feed-icon.svg/128px-Feed-icon.svg.png

    The CubicWeb framework can give you an RSS feed of any selection that you make. When you master RQL (Relational Query Language - more on that coming soon) you can build yourself some cool RSS feeds to follow the site's activity.

    Here is one that we cooked up for you: the latest releases of packages on cubicweb, all the cubes and the framework releases, straight to your RSS reader: subscribe here.
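
    As an illustration of how such a feed address could be put together, here is a small sketch. It assumes that the site exposes a generic view URL taking an rql parameter and a vid parameter, and that the RSS view is selected with vid=rss; the post does not spell this out, and the RQL query below is a made-up example rather than the one behind the link above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def rss_feed_url(base_url, rql):
    """Build a feed URL for an RQL selection.

    Assumption: the selection is rendered by a generic `view` URL and
    the RSS rendering is requested with `vid=rss`.
    """
    query = urlencode({'rql': rql, 'vid': 'rss'})
    return '%s/view?%s' % (base_url.rstrip('/'), query)


# Hypothetical query: the ten most recently created Version entities.
LATEST_VERSIONS = ('Any V ORDERBY D DESC LIMIT 10 '
                   'WHERE V is Version, V creation_date D')

url = rss_feed_url('http://www.cubicweb.org', LATEST_VERSIONS)

if __name__ == '__main__':
    print(url)
```

    urlencode takes care of escaping the spaces and commas of the query, so the RQL can be written in its readable form.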


  • More CubicWeb releases last week

    2009/01/19 by Arthur Lutz

    We're still busy with the CubicWeb 3.0 releases. We did two releases of CubicWeb last week: 3.0.2 and 3.0.3.

    These were mainly for bugfixes, particularly about how the multisource functionality was working.


  • CubicWeb 3.0.1 bugfix release

    2009/01/14 by Arthur Lutz

    Shortly after the release of CubicWeb under the GPL licence, we've released a quickfix version to correct a few bugs:

    • XHTML validity wasn't always there because of a bug in cutting parts of texts
    • cubicweb-ctl had a few things corrected
    • permissions on certain actions were properly placed
    • a few bugfixes in the generation of the configuration

    The new version is 3.0.1, you can see the corrected tickets here.


  • Collections des musées de Haute-Normandie

    2008/12/22 by Nicolas Chauvat

    Logilab announced that its most recent application went on-line on December 19th, 2008. It publishes the artwork collections of 41 museums of the Normandy region. It features a very simple user interface with facets for selecting items and tabs for choosing how to display the selected items. It uses the timeline widget from the Simile project as well as Google Maps to place the selected items in time and space.

    Visit Collections des Musées de Haute-Normandie.

    http://www.cubicweb.org/file/1241?vid=download
