CubicWeb Blog

News about the framework and its uses.

We're going to PGDay France, the PostgreSQL Community conference

2013/06/11 by Arthur Lutz

A few members of the CubicWeb team are going to attend the French PostgreSQL community conference in Nantes (France) on the 13th of June.

We're excited to learn more about the following topics that are relevant to CubicWeb's development and features:

Obviously we'll pay attention to all the talks during the day. If you're attending, we hope to see you there.

OpenData meets the Semantic Web at WOD2013

2013/06/10 by Arthur Lutz

A few of us from Logilab went to the 2nd International Workshop on Open Data (WOD) on the 3rd of June.

Although the main focus was an academic take on OpenData, many talks were related to Semantic Web technologies, especially LinkedData.

The full program (and papers) is available on the workshop website. Here is a quick review of the things we thought worth sharing.

  • privacy-oriented ontologies,
  • interesting automations that suggest alignments when initial data is uploaded to an opendata website,
  • some opendata platforms have built-in APIs to get files; one example is Socrata,
  • some work is being done to scale the processing of linked data in the cloud (did you know you could access readily available datasets in the Amazon cloud? DBPedia for example),
  • the data stored in Wikipedia can be a good source of vocabulary for certain machine learning tasks (and, in the future, so can the Wikidata project),
  • there is an RDF extension to Google Refine (or OpenRefine), but we haven't managed to get it working out of the box,
  • WebSmatch uses morphological operators (erosion / dilation) to identify grids and zones in Excel spreadsheets and then aligns column data on known reference values (e.g. country lists).

We naturally enjoyed the presentation made by Romain Wenz, with the unavoidable mention of Victor Hugo (and CubicWeb).

Thanks to the organizers of the conference and to the French National Library for hosting the event.

Stanford Prize for Innovation in Research Libraries

2013/03/01 by Nicolas Chauvat

Together with Gallica, a site built with CubicWeb has just been awarded the Stanford Prize for Innovation in Research Libraries 2013. The CubicWeb community is very pleased to see this site recognized at the top international level as leading innovation in its domain! Read the comments of the judges for more details.

CubicWeb at Data Tuesday on Feb 26th 2013

2013/02/15 by Nicolas Chauvat

CubicWeb was showcased at Data Tuesday on Feb 26th 2013. The other presentations were interesting, especially the soon-to-be-launched OpenMeteoData and the very useful scikit.learn.

CubicWeb rewarded at Dataconnexions 2013

2013/02/06 by Nicolas Chauvat

CubicWeb received an award yesterday at the ceremony of the Dataconnexions 2013 contest.

Dataconnexions is a contest organized by Etalab, the organization within the French State that is in charge of the catalog of open data published by the French administration.

Congratulations to all the developers and users of CubicWeb and welcome to the people who will join the CW community thanks to the media coverage we are now experiencing.

Read the press release and the slides.

Logilab's roadmap for CubicWeb as of February 2013

2013/02/04 by Nicolas Chauvat

The Logilab team now holds a roadmap meeting every two months to plan its CubicWeb development effort. Here are the decisions that were taken on Feb 1st, 2013.

Version 3.17

This version should be published before the end of March and will finish all the work currently in progress. It will include:

  • the refactoring necessary to introduce persistent sessions,
  • the shrinking of web/views: everything that does not deserve its own cube (like sioc, embed, geocoding, etc) will go into a cube named legacyui (this will open the door to squareui),
  • stop serving pages with "content-type: application/xhtml",
  • handling postgresql schemas (will require a new version of logilab.database),
  • a new logo.


Once the legacyui cube has been extracted (in version 3.17), it will be possible to move forward swiftly with squareui. Because of its other duties, the core CW team cannot be expected to develop squareui. Interested people will be in charge, and ideally the squareui cube could be released when CubicWeb 3.17 is published.

Cleaning up the backlog

The lead CW developers will spend about 20% of their time cleaning up the ticket backlog at the forge (900 open tickets and 50 in progress!).

The first step will be to reduce the number of tickets "in progress", then to organize the open tickets and merge the duplicates.

Version 3.18

This version is due at the end of May 2013. It will include:

  • persisting sessions,
  • WSGI,
  • RESTfulness: support for HTTP verbs PUT / DELETE, enforcement of the semantics of GET / POST (may be difficult to maintain backward-compatibility)

Mid-term goals

The mid-term goals are:

  • possibility to add new base types (Array, HStore, Geometry, TSVector, etc.) that would use extensions from the SQL backend

  • FROM clause in rql queries

  • websockets

  • defining attributes on relations and defining "virtual" relations or rules:

    class Contribution(EntityType):
        author = SubjectRelation('Person', cardinality='1*', inlined=True)
        book = SubjectRelation('Book', cardinality='1*', inlined=True)
        role = SubjectRelation('Role', cardinality='1*', inlined=True)
    preface_writer = VirtualRelation('C is Contribution, C author S, C book O, '
                                     'C role R, R name "preface writer"')


    Any P WHERE B is Book, P preface_writer B

    Will we need a materialized view in the database, a standard relation maintained by hooks, or on-the-fly rewriting of the RQL? Time will tell.

  • cards with logic (mustache js templates for example)

  • coffeescript ? brython ? javascript ? prototype something with CubicDB + WebService that outputs json + user interface in full javascript

  • package Cubic(Web)DB and CubicWeb separately?

  • think about the overall architecture (using WSGI, persistent sessions, etc.), and find solutions that fit a distributed architecture (look at paste.deploy, circus, etc.)

  • clean up the javascript in web/data/*.js

  • configurable metadata, managing the size of the entities table

  • more SPARQL

  • namespaces for the data models of the cubes

As already said on the mailing list, other developers and contributors are more than welcome to share their own goals in order to define a roadmap that best fits everyone's needs.

Logilab's next roadmap meeting will be held at the beginning of April 2013.

What's new in CubicWeb 3.16

2013/01/23 by Aurelien Campeas

What's new in CubicWeb 3.16?

New functionalities

  • Add a new dataimport store (SQLGenObjectStore). This store enables a fast import of data (entity creation, link creation) in CubicWeb, by directly flushing information in SQL. This may only be used with PostgreSQL, as it requires the 'COPY FROM' command.
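
To give an idea of why flushing through 'COPY FROM' is fast, here is a minimal sketch, independent of CubicWeb, of how rows can be serialized into the tab-separated text format that PostgreSQL's COPY command reads. The helper names (copy_escape, rows_to_copy_buffer) are hypothetical and are not part of the SQLGenObjectStore API; they only illustrate the kind of buffer that a driver call such as psycopg2's cursor.copy_from could then load in a single round trip.

```python
import io


def copy_escape(value):
    """Render one Python value as a field of PostgreSQL's COPY text format."""
    if value is None:
        return r"\N"  # COPY's default marker for NULL
    text = str(value)
    # Escape the backslash first so the escapes added below are not doubled.
    for char, escaped in (("\\", "\\\\"), ("\t", "\\t"),
                          ("\n", "\\n"), ("\r", "\\r")):
        text = text.replace(char, escaped)
    return text


def rows_to_copy_buffer(rows):
    """Serialize an iterable of tuples into a file-like COPY buffer."""
    buffer = io.StringIO()
    for row in rows:
        buffer.write("\t".join(copy_escape(value) for value in row) + "\n")
    buffer.seek(0)  # rewind so that a consumer can read from the start
    return buffer


# Made-up rows, for illustration only.
rows = [(1, "Victor Hugo", None), (2, "Les Misérables", "novel")]
print(rows_to_copy_buffer(rows).getvalue())
```

All rows travel in one buffer instead of one INSERT statement per entity, which is where most of the time is saved.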

API changes

  • Orm: set_attributes and set_relations are unified (and deprecated) in favor of cw_set that works in all cases.

  • db-api/configuration: all the external repository connection information is now in a URL (see #2521848), which makes it possible to drop the specific options for the pyro nameserver host, group, etc. and to fix the broken ZMQ source. Configuration related changes:

    • Dropped 'pyro-ns-host', 'pyro-instance-id', 'pyro-ns-group' from the client side configuration, in favor of 'repository-uri'. NO MIGRATION IS DONE, supposing there is no web-only configuration in the wild.
    • Stop discovering the connection method through repo_method class attribute of the configuration, varying according to the configuration class. This is a first step on the way to a simpler configuration handling.

    DB-API related changes:

    • Stop indicating the connection method using ConnectionProperties.
    • Drop _cnxtype attribute from Connection and cnxtype from Session. The former is replaced by an is_repo_in_memory property and the latter is totally useless.
    • Turn repo_connect into _repo_connect to mark it as a private function.
    • Deprecate in_memory_cnx, which becomes useless; use _repo_connect instead if necessary.
  • the "tcp://" uri scheme used for ZMQ communications (in a way reminiscent of Pyro) is now named "zmqpickle-tcp://", so as to make room for future zmq-based lightweight communications (without python objects pickling).

  • Request.base_url gets a secure=True optional parameter that yields an https url if possible, allowing hook-generated content to send secure urls (e.g. when sending mail notifications)

  • Dataimport ucsvreader gets a new boolean ignore_errors parameter.
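
Since the connection information now lives in a single 'repository-uri' and the ZMQ scheme is named "zmqpickle-tcp://", the connection method can be derived from the URI scheme alone. The sketch below only illustrates that idea with the standard library; the function name and the "inmemory" and "pyro" scheme names are assumptions made for the example, not CubicWeb's actual dispatch code.

```python
from urllib.parse import urlparse

# Hypothetical table of accepted schemes. Only "zmqpickle-tcp" is named in
# the notes above; "inmemory" and "pyro" are assumptions for illustration.
KNOWN_SCHEMES = {
    "inmemory": "repository living in the same process",
    "pyro": "remote repository reached through Pyro",
    "zmqpickle-tcp": "remote repository reached through ZMQ (pickled objects)",
}


def describe_repository_uri(uri):
    """Split a repository URI into (scheme, host, port), rejecting old names."""
    parsed = urlparse(uri)
    if parsed.scheme == "tcp":
        # The bare "tcp://" scheme was renamed in 3.16.
        raise ValueError("'tcp://' is now named 'zmqpickle-tcp://'")
    if parsed.scheme not in KNOWN_SCHEMES:
        raise ValueError("unknown repository scheme: %r" % parsed.scheme)
    return parsed.scheme, parsed.hostname, parsed.port


print(describe_repository_uri("zmqpickle-tcp://localhost:8181"))
```

Keeping everything in one URL means a future lightweight ZMQ transport only needs a new scheme name rather than a new set of configuration options.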

Unintrusive API changes

  • Drop of cubicweb.web.uicfg.AutoformSectionRelationTags.bw_tag_map, deprecated since 3.6.

User interface changes

  • The RQL search bar now has some auto-completion support: relation types or entity types can be suggested while typing. It is an awesome improvement over the previous behaviour!
  • The action box associated with table views has been transformed into a nice-looking series of small tabs; the possible actions are now immediately visible and need not be discovered by clicking on an almost invisible icon on the upper right.
  • The uicfg module has moved to web/views/ and ui configuration objects are now selectable. In many cases this will reduce the amount of subclassing and whole-method replacement usually needed to customize the ui behaviour.
  • Remove changelog view, as neither cubicweb nor known cubes/applications were properly feeding related files.

Other changes

  • 'pyrorql' sources will be automatically updated to use a URL to locate the source rather than configuration options. 'zmqrql' sources were broken before this change, so no upgrade is needed.
  • Debugging filters for Hooks and Operations have been added.
  • Some cubicweb-ctl commands used to show the output of msgcat and msgfmt; they don't anymore.

December 2012 CubicWeb Sprint Report

2012/12/21 by Nicolas Chauvat

For two days, on December 13th and 14th, 2012, ten hackers gathered at Logilab to improve the user interface of CubicWeb. This hackathon was initiated by Crealibre. About a year ago, they started the Orbui project, a new user interface for CubicWeb based on the Bootstrap HTML/CSS framework.

Several projects at Logilab and Crealibre proved that Orbui was heading in the right direction, but that it had to fight with the default user interface of CubicWeb. Orbui makes different design/ergonomic choices and needs a different HTML/CSS structure and different Javascript components.

Sylvain published a roadmap back in May with a section titled "on the road to Bootstrap". After more than half a day of heated debate on the first day, we decided to follow the direction he pointed to. We started extracting the default user interface from CubicWeb and turning it into a set of cubes:

  • cubicweb-legacyui: css, views and templates extracted from CubicWeb 3.16, so as to provide full backward compatibility
  • cubicweb-bootstrap: empty cube with only bootstrap version 2.2.2 in data/
  • cubicweb-squareui: bootstrapified version of legacyui (slightly altered to benefit from the bootstrap css without breaking backward compatibility too hard)

At the end of the sprint, one could add_cube('squareui') on an existing application and keep it usable... and get "some kind of responsiveness" for free, thus proving that we were on the right track.

A lot of work is still ahead of us, but we have moved a few steps forward towards the goal of making it easier to implement different UIs on top of CubicWeb 3.17.

For the curious, here is what the skeleton of legacyui.views.maintemplate (aka cw.web.views.maintemplate) looks like:

<body> (MainTemplate.template_body_header)
  <table id="header"> (HTMLPageHeader.main_header)
    for header in self.headers:
       <td id="header-{left,center,right}">
           render selected components(ctxcomponents, header-{left,center,right})
  <div id="stateheader">
     <div class="stateMessage"> HTMLPageHeader.state_header
  <div id="page"> MainTemplate.template_body_header
    <table id="mainLayout"> MainTemplate.template_body_header
      if boxes (selected components(ctxcomponents, left)): MainTemplate.nav_column
        <td id="navColumnLeft">
          <div class="navboxes">
             render boxes
      <td id="contentColumn"> MainTemplate.template_body_header
         render selected components(rqlinput)
         render selected components(applmessages)
         if navtop (selected components(ctxcomponents, navtop)):
           <div id="contentheader">
             render components
           <div class='clear'/>
         <div id="pageContent">
           if vtitle:
              <div class="vtitle" />
           if etypenavigation:
              render etypenavigation
           view pagination
           <div id="contentmain">
              render view
           view pagination
         if navbottom (selected components(ctxcomponents, navbottom)):
           <div id="contentfooter">
             render components
      if boxes (selected components(ctxcomponents, right)): MainTemplate.nav_column
        <div id="navColumnRight">
          <div class="navboxes">
             render boxes
  <div id="footer">
     render actions selected (actions, 'footer')

and here is what the skeleton from squareui.views.maintemplate looks like:

<div class="container-fluid">
  <div id="header" class="row-fluid">
    <!-- .header -->
  <div class="row-fluid">
    <div id="navColumnLeft" class="span3">
      <!-- .leftcolumn -->
    <div id="contentColumn" class="span6">
      <!-- .contentcol -->
      <div class="row-fluid">
        <div id="contentheader" class="span12">
          <!-- .contentheader -->
      <div class="row-fluid">
        <div id="contentmain" class="span12">
          <!-- .contentmain -->
      <div class="row-fluid">
        <div id="contentfooter" class="span12">
          <!-- .contentfooter -->
    <div id="navColumnRight" class="span3">
      <!-- .rightcolumn -->
  <div id="footer" class="row-fluid">
    <!-- .footer -->

Stay tuned for the updates on this (important) topic!

Géo − Geonames alignment

2012/12/20 by Simon Chabot

This blog post describes the main points of the alignment process between the French National Library's Géo repository of data, and the data extracted from Geonames.

Alignment is the process of finding similar entities in different repositories. The Géo repository contains a lot of locations, and the goal is to find those locations in the Geonames repository so as to be able to say that a given location in *Géo* is the same as a given one in *Geonames*. For that purpose, Logilab developed a library, called Nazca, to build those links.

To process the alignment between Géo and Geonames, we divided the Géo repository into two groups:

  • A group gathering the Géo data having information about longitude and latitude.
  • Another group gathering the data having no information about longitude and latitude.

Group 1 - Data having geographical information

The alignment process consists of five steps (see figure below):

1. Data gathering

We gather the information needed to align, that is to say, the unique identifier, the name, the longitude and the latitude. The same applies to the Geonames data.

2. Standardization

This step aims to make the data as standard as possible, i.e. set it to lower case, remove the stop words, remove the punctuation and so on.
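
As an illustration of this standardization step, here is a minimal sketch in plain Python. It is not Nazca's implementation: the function name and the short stop word list are assumptions made for the example.

```python
import string

# Illustrative stop word list; the list actually used is not given here.
STOP_WORDS = frozenset(["the", "of", "le", "la", "les", "de", "du", "des"])

# Translation table that turns every ASCII punctuation character into a space.
PUNCTUATION_TO_SPACE = str.maketrans(string.punctuation,
                                     " " * len(string.punctuation))


def standardize(name, stop_words=STOP_WORDS):
    """Lower-case a name, strip its punctuation and drop its stop words."""
    words = name.lower().translate(PUNCTUATION_TO_SPACE).split()
    return " ".join(word for word in words if word not in stop_words)


print(standardize("Saint-Denis (La Réunion)"))  # -> saint denis réunion
```

Applying the same function to both repositories ensures that the names compared in the later steps differ only in ways that matter.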

4. Alignment

Thanks to the Kdtree, we can quickly find the geographically nearest neighbours. During this fourth step, we loop over the nearest neighbours and assign each of them a grade according to the similarity between its name and the name of the location we're looking for, using the Levenshtein distance. The location is then aligned with the best-graded neighbour.
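
The grading described above can be sketched as follows. This is a plain Python illustration rather than Nazca's code: the neighbours are assumed to have already been returned by the Kdtree query, and the identifiers in the example are made up.

```python
def levenshtein(a, b):
    """Number of single-character edits needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]


def best_neighbour(name, neighbours):
    """Return the (identifier, name) pair whose name is closest to name."""
    return min(neighbours, key=lambda item: levenshtein(name, item[1]))


# Hypothetical neighbours, as a Kdtree query might return them.
neighbours = [("geonames:1", "lyon"),
              ("geonames:2", "villeurbanne"),
              ("geonames:3", "bron")]
print(best_neighbour("lyons", neighbours))
```

Because the Kdtree already restricts the candidates to a handful of nearby points, the quadratic cost of the Levenshtein computation stays negligible.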

5. Saving the results

Finally, we save all the results into a file.

Group 2 - Data having no geographical information

Let's have a look at the data having no information on longitude and latitude. The steps are more or less the same as before, except that we cannot find neighbours using a Kdtree. So we use another method to find locations whose names have a fairly high level of similarity. This method is called Minhashing, which has been shown to be quite relevant for this purpose.
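
For readers unfamiliar with Minhashing, here is a minimal sketch of the idea in plain Python; it is not the implementation used for this alignment. Each name is cut into character trigrams, a family of seeded hash functions (built here on md5, an arbitrary choice) keeps the smallest hash per function, and the fraction of matching minima estimates the Jaccard similarity of the two trigram sets. All function names and parameters are assumptions made for the example.

```python
import hashlib


def shingles(name, size=3):
    """Set of overlapping character n-grams of a name."""
    if len(name) <= size:
        return {name}
    return {name[i:i + size] for i in range(len(name) - size + 1)}


def seeded_hash(seed, text):
    """Deterministic 64-bit hash of text; each seed acts as one hash function."""
    digest = hashlib.md5(("%d:%s" % (seed, text)).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(name, n_hashes=64):
    """Keep, for each hash function, the smallest hash over the shingles."""
    grams = shingles(name)
    return [min(seeded_hash(seed, gram) for gram in grams)
            for seed in range(n_hashes)]


def estimated_similarity(sig_a, sig_b):
    """Fraction of positions where two signatures agree (Jaccard estimate)."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


close = estimated_similarity(minhash_signature("saint-etienne"),
                             minhash_signature("saint etienne"))
far = estimated_similarity(minhash_signature("paris"),
                           minhash_signature("tokyo"))
print(close, far)
```

Signatures have a fixed length whatever the name, so they can be bucketed to find candidate pairs without comparing every name with every other one.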

To minimise the number of mistakes, we try to group locations by country, knowing that the country is often written in the location's preferred_label. This pre-processing helps us separate cities that have the same name but are located in different countries. For instance, there is a Paris in France, a Paris in the United States, and a Paris in Canada. So the alignment is made country by country.
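
The country-by-country grouping could look like the sketch below. The exact shape of preferred_label is not described in this post, so the example assumes, purely for illustration, labels such as "Paris (France)" with the country in trailing parentheses; labels that do not match are kept in a separate group.

```python
import re
from collections import defaultdict

# Assumed label shape: "Paris (France)". The real Géo labels may differ.
COUNTRY_PATTERN = re.compile(r"^(?P<place>.+?)\s*\((?P<country>[^()]+)\)\s*$")


def group_by_country(labels):
    """Map each country to the list of place names found for it."""
    groups = defaultdict(list)
    for label in labels:
        match = COUNTRY_PATTERN.match(label)
        if match:
            groups[match.group("country").strip()].append(match.group("place"))
        else:
            groups[None].append(label)  # no country could be extracted
    return dict(groups)


labels = ["Paris (France)", "Paris (Canada)", "Lyon (France)", "Atlantis"]
print(group_by_country(labels))
```

Each group can then be aligned on its own, so that a Paris in Canada is never compared with a Paris in France.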

The fourth and the fifth steps remain the same.

Results obtained

The results we obtained are as follows:

          Amount of locations   Aligned   Non-aligned
Group 1                 97572     89.3%         10.7%
Group 2                150528     72.9%         27.1%
Total                  248100     79.3%         20.7%

One problem we met is the language used to describe the location. Indeed, the similarity grade is based on the distance between the names, and one can notice that Londres and London, for instance, are not spelled the same way although they represent the same location.

Results improvement

In order to improve the results a little, we had a closer look at the 10.7% of non-aligned locations in the first group. The language problem mentioned above was pretty clear. So we decided to use the following definition: two locations are identical if they are geographically very close. With this definition, we ignore the name and focus on the longitude and the latitude only.
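
A purely geographical test of this kind can be sketched with the haversine formula. The 1 km threshold below is an assumption chosen for the example; the post does not say how close "very close" is, nor which distance was actually used.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def same_location(point_a, point_b, threshold_km=1.0):
    """Treat two (latitude, longitude) points as identical if close enough."""
    return haversine_km(*point_a, *point_b) <= threshold_km


paris = (48.8566, 2.3522)
london = (51.5074, -0.1278)
print(same_location(paris, (48.8570, 2.3530)), same_location(paris, london))
```

With such a test, Londres and London are matched as soon as their coordinates agree, whatever the spelling.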

To estimate the accuracy of the results, we picked 50 locations at random and checked them manually. And the results are pretty good! 98% are correct (49/50). This purely geographical approach thus raises the coverage rate of the first group from 89.3% to 99.6%.

In the end, we get these results:

          Amount of locations   Aligned   Non-aligned
Group 1                 97572     99.6%          0.4%
Group 2                150528     72.9%         27.1%
Total                  248100     83.4%         16.6%
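
The total row is the average of the two group rates weighted by group size. The short check below recomputes it from the per-group figures given above; the function name is only illustrative.

```python
# (number of locations, aligned percentage) for each group, from the table.
groups = {
    "Group 1": (97572, 99.6),
    "Group 2": (150528, 72.9),
}


def overall_rate(groups):
    """Aligned percentage over all groups, weighted by group size."""
    total = sum(size for size, _ in groups.values())
    aligned = sum(size * pct / 100.0 for size, pct in groups.values())
    return 100.0 * aligned / total


rate = overall_rate(groups)
print("aligned: %.1f%%, non-aligned: %.1f%%" % (rate, 100 - rate))
```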

Application to the dataconnexions#2 contest

2012/12/20 by Nicolas Chauvat

On behalf of the community of CubicWeb users and developers, I have just submitted the following application to the dataconnexions#2 contest.

1. Project description questionnaire

Project title

CubicWeb - free software development platform for the semantic web

Contest category chosen

Choose among: General public / Professional / Public utility / Mobility and territories

Public utility (?)

What problem are you trying to solve?

Describe the problem(s) that your project tries to solve, as well as its (their) importance: market size, potential frequency of use, population concerned, possible public service benefits, etc. (1000 characters maximum).

The advent of the semantic web and of Open Data calls for suitable tools to develop data-centric applications.

These tools must make it easy to import data, to link data coming from separate sources, to republish it, and to make it easier to query and visualize.

Ideally, these tools should use and respect the open standards of the internet in order to simplify communication and exchange, but also to ease development for multiple devices (computer, tablet, smartphone).

How are you trying to solve it?

Describe your product, service or visualization, in its current form and, where applicable, after any future developments you plan. Specify the public dataset(s) you use for this purpose (1000 characters maximum).

CubicWeb is a free software development platform for the semantic web.

CubicWeb lets developers concentrate on the specifics of their application rather than reinvent the essential building blocks of data import, merging, publication, querying and visualization.

CubicWeb is free software developed openly on the internet by a small but already international community. CubicWeb is available under the LGPL license, complies with W3C standards (RDF, SPARQL, HTML5, CSS3, Responsive Design) and natively handles several data models that serve as de facto standards (FOAF, SIOC, DOAP, etc).

What is your business model?

Describe the business model of your project, that is, the conditions for its sustainability and growth: business plan and sales projections for an entrepreneurial project; objectives, key donors and stakeholders for a civic project (1000 characters maximum).

Several companies rely on CubicWeb today to sell IT services. The goal of this community is to grow in order to reach a wider audience and to share more of the maintenance and development costs of the CubicWeb platform.

CubicWeb users currently include the Bibliothèque nationale de France, EDF, GDF-Suez, the Commissariat à l'Energie Atomique, the Centre National d'Etudes Spatiales, the Institut Radioprotection et Sûreté Nucléaire, INRIA, medical research laboratories and companies in the IT sector.

How far along is your project?

Describe the milestones you have reached, the resources mobilized, the indicators and metrics already established, etc. (1000 characters maximum).

The CubicWeb project stems from an R&D effort started in 2001 by the company Logilab, whose goal was to equip itself with a tool for developing data-centric applications that comply with the semantic web standards then being drafted at the W3C.

Since 2008, CubicWeb has been free software whose development is carried out openly on the internet.

Who is supporting you on this project?

Describe the team that supports you in your project (if any), your skills, experience and achievements, as well as any partners who back you (1000 characters maximum).


How can DataConnexions help you?

Give any additional details you would like to provide about your project, and explain how DataConnexions can help sustain its development (1000 characters maximum).

Several companies rely on CubicWeb today to sell IT services. The industrial uses of CubicWeb are varied and involve important, even critical, applications.

CubicWeb is a little-known (and little-recognized) tool and its community is small today, despite its solid references and the recent enthusiasm for Open Data.

DataConnexions could be a platform and a showcase that lets CubicWeb find new application developers who would rather benefit from the experience accumulated in this free tool than rediscover and work around, one by one, the pitfalls encountered during the ten years it took to build.

The goal of this application is therefore to grow the community of CubicWeb users and contributors.

2. Presentation video

Link to download a video describing the Project and its features, with a maximum duration of 3 minutes

It is not the quality of the video that is judged, but the project itself. The video must show the features of the project. Applicants are encouraged to record a screen capture or a "screencast" (for example with tools such as CamStudio, Jing or Screenr).

Demonstration of CubicWeb being used to import and visualize a downloaded list of French railway stations: selection of the stations with the facet filter and display on an openstreetmap background, then export to RDF, JSON and CSV.

CubicWeb is a free software development platform for the semantic web, which lets developers concentrate on the specifics of their application rather than reinvent the essential building blocks of data import, merging, publication, querying and visualization.

Link to the video on youtube. A mirror of the video is also available.

3. Online access to the project

Link giving access to the Project, or to the compiled and interpretable computer code of the Project

For example: a URL for viewing or, where applicable, downloading the application, together with instructions for doing so if necessary. The application must be easy to install and easy to demonstrate on its target platform.

4. Communication materials

Non-confidential description

Describe the Project in terms suitable for release to the general public: non-confidential, understandable by as many people as possible, and highlighting the value of the project (1000 characters maximum).

cf "How are you trying to solve it?"

Visual description

Link to a visual element describing and showcasing the project and its features (screenshot, home page, descriptive diagram).


Logo of the project

Link to the project logo.