show 133 results

Blog entries

  • Links roundup from dotjs.eu

    2012/12/05 by Arthur Lutz

    A few people from Logilab attended the dotjs conference in Paris last week. The conference wasn't exactly what we expected, we were hoping for more technical talks. Nevertheless, some of the things we saw were quite interesting. Some of them could be relevant to CubicWeb.

    http://www.cubicweb.org/file/2532779?vid=download

    Here is a raw roundup of links collected last friday :

    Chrome developer toolsyeomangrunt.jsbackbone.jsDartTypeScriptLangExpress.jsMochaTestacularSASSAngular.jsEnyo.jsSocket.iowhen.jsCoffeescriptSource Maps explained


  • CubicWeb sprint in Paris - 2012/12/13-14

    2012/11/11 by Nicolas Chauvat

    Topics

    To be decided. Some possible topics are :

    • Work on CubicWeb front end : Anything related to Themaintemplate, primaryview, reledit, tables handling etc.
    • Share the Evolution and more integration of the OrbUI project for CW
    • Things to do for HTML5 and bootstrap integration
    • Work on ideas from Thoughts on CubicWeb 4
    • ...

    other ideas are welcome, please bring them up on cubicweb@lists.cubicweb.org

    Location

    This sprint will take place in decembre 2012 from thursday the 13th to friday the 14th. You are more than welcome to come along, help out and contribute. An introduction is planned for newcomers.

    Network resources will be available for those bringing laptops.

    Address : 104 Boulevard Auguste-Blanqui, Paris. Ring "Logilab" (googlemap)

    Metro : Glacière

    Contact : http://www.logilab.fr/contact

    Dates : 13/12/2012 to 14/12/2012

    Participants

    • Celso Flores (Crealibre - Mexico)
    • Carine Fourrier (Crealibre - Mexico)
    • ...

  • Building your URLs in cubicweb

    2012/09/25 by Stéphane Bugat

    Building your URLs in cubicweb

    Aim

    In cubicweb, you often have to build url's that redirect the current view to a specific entity view or allow the execution of a given action. Moreover, you often want also to fallback to the previous view once the specific action or edition is done, or redirect also to another entity's specific view.

    To do so, cubicweb provides you with a set of powerful tools, however as there is often more than one way to do it, this blog entry is here to help you in choosing the preferred way.

    Tools at your disposal

    The universal URL builder: build_url()

    build_url is accessible in any context, so for instance in the rendering of a given entity view you can call self._cw.build_url to build you URLs easily, which is the most common case. In class methods (for instance, when declaring the rendering methods of an EntityTableView), you can access it through the context of instantiated appobject which are usually given as argument, e.g. entity._cw.build_url. For test purposes you can also call session.build_url in cubicweb shells.

    build_url basically take a first optional, the path, relative to the base url of the site, and arbitrary named arguments that will be encoded as url parameters. Unless you wish to direct to a custom controller, or to match an URL rewrite url, you don't have to specify the path.

    Extra parameters given to build_url will vary according to your needs, however most common arguments understood by default cubicweb views are the followings:

    • vid: the built view __regid__;
    • rql: the RQL query used to retreive data on which the view should be applied;
    • eid: the identifier of an entity, which you should use instead of rql when the view apply to a single entity (most often);
    • __message: an information message to display inside the view;
    • __linkto: in case of an entity creation url, will allow to set some specific relations between both entities;
    • __redirectpath: the URL of the entity of the redirection;
    • __redirectvid: the view id of the redirection.

    __redirectvid and __redirectpath are used to control redirection after posting a form and are more detailed in the cubicweb documentation, chapter related to the edition control (http://docs.cubicweb.org/devweb/edition/editcontroller.html).

    Exploring entities associated URLs

    Generally, an entity has two important methods that retrieve its absolute or relative urls:

    • entity.rest_path() will return something like <type>/<eid> where <type> corresponds to the entity type and <eid> the entity eid;
    • entity.absolute_url() will return the full url of the entity http://<baseurl>/<type>/<eid>. In case you want to access a specific view of the entity, just pass the vid='myviewid' argument. You can give arbitrary arguments to this method that will be encoded as url parameters.

    Getting a proper RQL

    Passing the rql to the build_url method requires to have a proper RQL expression. To do so, there is a convenience method, printable_rql(), that is accessible in rset resulting from RQL queries. This allows to apply a view to the same result set as the one currently process, simply using rql = self.cw_rset.printable_rql().

    Getting URLs from the current view

    There are several ways to get URL of the current view, the canonical one being to use self._cw.relative_path(includeparams=True) which will return the path of the current view relative to the base url of the site (otherwise use self._cw.url(), including parameters or not according to value given as includeparams).

    You can also retrieve values given to individual parameters using self._cw.form, eg:

    • self._cw.form.get('vid', '') will return only the view id;
    • self._cw.form.get('rql', '') will return only the RQL;
    • self._cw.form.get('__redirectvid', '') will return the redirection view if defined;
    • self._cw.form.get('__redirectpath', '') will return the redirection path if defined.

    How to redirect to non-entity view?

    This case often appears when you want to create a link to a startup view or a controller. It the first case, you simply build you URL like this:

    self._cw.build_url('view', vid='my_view_id')
    

    The latter case appears when you want to call a controller directly without having to define a form in your view. This can happen for instance when you want to create a URL that will set a relation between 2 objects and do not need any confirmation for that. The URL construction is done like this:

    self._cw.build_url('my_controller_id', arg1=value1, arg2=value2, ...)
    

    Any extra arguments passed to the build_url method will be available in the controller as key, values pairs of the self._cw.forms dictionary. This is especially useful when you want to define some kind of hidden attributes but there is not form to put them into.

    And, last but not least, a convenient way to get the root URL of the instance:

    self._cw.base_url()
    

    Some concrete cases

    Get the URL of the outofcontext view of an entity:

    link = entity.absolute_url(vid='outofcontext')
    

    Create a link to a given controller then fall back to the current view:

    • In your entity view:
    self.w(u'<a href="%s">Click me</a>' % xml_escape(
            self._cw.build_url('mycontrollerid',
                    arg1=value1, arg2=value2,
                    rql=self.cw_rset.printable_rql(),
                    __redirectvid=self._cw.form.get('vid',''))))
    
    • In your controller:
    def publish(self, rset):
         value1, value2 = self._cw.form['arg1'], self._cw.form['arg2']
         # do some stuff with value1 and value2 here...
         raise Redirect(self._cw.build_url(rql=self._cw.form['rql'],
             vid=self._cw.form['__redirectvid'],
             __message=_('you message')))
    

    Create a link to add a given entity and relate this entity to the current one with a relation 'child_of', then go back to the current entity's view:

    entity = self.cw_rset.get_entity(0,0)
    self.w(u'<a href="%s">Click me</a>' % xml_escape(
            self._cw.build_url('add/Mychildentity',
                    __linkto='child_of:%s:object' % entity.eid,
                    __redirectpath=entity.rest_path(),
                    __redirectvid=self._cw.form.get('vid', ''))))
    

    Same example, but we suppose that we are in a multiple rset entity view, and we want to go back afterwards to this view:

    entity = self.cw_rset.get_entity(0,0)
    self.w(u'<a href="%s">Click me</a>' % xml_escape(
            self._cw.build_url('add/Mychildentity',
                    rql=self.cw_rset.printable_rql(),
                    __linkto='child_of:%s:object' % entity.eid,
                    __redirectvid=self._cw.form.get('vid', ''))))
    

    Create links to all 'menuactions' in a view:

    actions = self._cw.vreg['actions'].possible_actions(self._cw, rset=self.cw_rset)
    action_links = [unicode(self.action_link(x)) for x in actions.get('menuactions', ())]
    self.w( u'  |  '.join(action_links))
    

  • How to create your own forms and controllers?

    2012/09/05 by Stéphane Bugat

    Aim

    Sometimes you need to associate to a given view your own specific form and the associated controller. We will see in this blog entry how it can be done in cubicweb on a concrete case.

    The case

    Let's suppose you're working on a social network project where you have to develop friend-of-a-frient (foaf) relationships between persons. For that purpose, we use the cubicweb-person cube and create in our scheme relations between persons like X in_contact_with Y:

    class in_contact_with(RelationDefinition):
          subject = 'Person'
          object = 'Person'
          cardinality = '**'
          symmetric = True
    

    We will also assume that a given Person corresponds to a unique CWUser through the relation is_user.

    Although it is not evident, we would like that any connected person can chose to disconnect himself from another person at any time. For that, we will create a table view that will display the list of connected users, with a custom column giving the ability to "disconnect" with the person.

    Before disconnecting with this particular person, we would like also to have a confirmation form.

    How to proceed

    The following steps were defined to address the above issue:

    1. Define a "contact view" that will display the list of known contacts of the connected user ;
    2. In this contact view, allow the user to click on a specific contact so as to remove him ;
    3. Create a deletion confirmation view, that will contain:
      • A form holding the buttons for deletion confirmation or cancel;
      • A controller responsible for the actual deletion or the cancelling.

    The contact view

    Rendering a table view of connected persons

    To display the list of connected persons to the current person, but also to add custom columns that do not refer specifically to attributes of a given entity, the best choice is to use EntityTableView (see here for more information):

    class ContactView(EntityTableView):
        __regid__ = 'contacts_tableview'
        __select__ = is_instance('Person')
        columns = ['person', 'firstname', 'surname', 'email', 'phone', 'remove']
        layout_args = {'display_filter': 'top', 'add_view_actions': None}
    
        def cell_remove(w, entity):
            """link to the suppression of the relation between both contacts"""
            icon_url = entity._cw.data_url('img/user_delete.png')
            action_url = entity._cw.build_url(eid=entity.eid,
                    vid='suppress_contact_view',
                    __redirectpath=entity._cw.relative_path(),
                    __redirectvid=entity._cw.form.get('__redirectvid', ''))
            w(u'<a href="%(actionurl)s" title="%(title)s">'
                    u'<img alt="%(title)s" src="%(url)s" /></a>'
                    % {'actionurl': xml_escape(action_url),
                       'title': _('remove from contacts'),
                       'url':icon_url})
    
        column_renderers = {
                'person': MainEntityColRenderer(),
                'email': RelatedEntityColRenderer(
                    getrelated=lambda x:x.primary_email and x.primary_email[0] \
                            or None),
                'phone': RelatedEntityColRenderer(
                    getrelated=lambda x:x.phone and x.phone[0] or None),
                'remove': EntityTableColRenderer(
                    renderfunc=cell_remove,
                    header=''),}
    

    A few explanations about the above view:

    • By default, the column attribute contains a list of displayable attributes of the entity. If one element of the list does not correspond to an attribute, which is the case for 'remove' here, it has to have rendering function defined in the dictionnary column_renderers.
    • However, when the column header refers to a related entity attribute, we can easily use the rendering function RelatedEntityColRenderer, as it is the case for the email and phone display.
    • As for concerns the 'remove' column, we render a clickable image in the cell_remove method. Here we have chosen an icon from famfamsilk that is putted in our data/ directory, but feel free to chose a predefined icon in the cubicweb shared data directory.

    The redirection URL associated to each image has to be a link to a specific action allowing the user to remove the selected person from its contacts. It is built using the self._cw.build_url() convenience function. The redirection view, 'suppress_contact_view', will be defined later on. The eid argument passed refers to the id of the contact person the user wants to remove.

    Calling the contact view

    The above view has to be called with a given rset which corresponds to the list of known contacts for the connected user. In our case, we have defined a StartupView for the contact management, in which in the call function we have added the following piece of code:

    person = self._cw.user.related('is_user', 'object').get_entity(0,0)
    rset = self._cw.execute(
            'Any X WHERE X is Person, X in_contact_with Y, '
            'Y eid %(eid)s', {'eid': person.eid})
    self.w(u'<h3>' + _('Number of contacts in my network:'))
    self.w(unicode(len(rset)) + u'</h3>')
    if len(rset) != 0:
        self.wview('contacts_tableview', rset)
    

    The Person corresponding to the connected user is retrieved thanks to the use of the related method and the is_user relation. The contact table view is displayed inside the parent StartupView.

    Creation of the deletion confirmation view

    Defining the confirmation view for contact deletion

    The corresponding view is a simple View class instance, that will display a confirmation message and the related buttons. It could be defined as follows:

    class SuppressContactView(View):
        __regid__ = 'suppress_contact_view'
    
        def cell_call(self, row, col):
            entity = self.cw_rset.get_entity(row, col)
            msg = self._cw._('Are you sure you want to remove %(name)s from your contacts?')
            self.w(u'<p>' + msg % {'name': entity.dc_long_title()} + u'</p>')
            form = self._cw.vreg['forms'].select('suppress_contact_form',
                    self._cw, rset=self.cw_rset)
            form.add_hidden(u'eidto', entity.eid)
            form.add_hidden(u'eidfrom', self._cw.user.related('is_user',
                'object').get_entity(0,0).eid)
            form.render(w=self.w)
    

    Inside the cell_call() method of this view, we will have to render a form which aims at displaying both buttons (confirm deletion or cancel deletion). This form will be described later on.

    The Person contact to remove is retrieved easily thanks to cw_rset. The Person corresponding to the connected user is here also retrieved thanks to the is_user relation. To make both of them available in the form, we add them at the instanciation of the form using the convenience function add_hidden(key,val).

    Defining the deletion form

    The deletion form as mentioned previously is only here to hold both buttons for the deletion confirmation or the cancelling. Both buttons are declared thanks to the form_buttons attribute of the form, which is instanciated from forms.FieldsForm:

    class SuppressContactForm(forms.FieldsForm):
        __regid__ = 'suppress_contact_form'
        domid = 'delete_contact_form'
        form_renderer_id = 'base'
    
        @property
        def action(self):
            return self._cw.build_url('suppress_contact_controller')
    
        form_buttons = [
                fw.Button(stdmsgs.BUTTON_DELETE, cwaction='delete'),
                fw.Button(stdmsgs.BUTTON_CANCEL, cwaction='cancel')]
    

    Specifying a given domid will ensure that your form will have a specific DOM identifier,the controller defined in the action method will be called without any ambiguity. The form_renderer_id is precised here so as to avoid additional display of informations which don't make sense here.

    Defining the controller

    The custom controller is instanciated from the Controller class in cubicweb.web.controller. The declaration of the controller should have the same domid than the calling form, as mentioned previously. The related actions are described in the publish() method of the controller:

    class SuppressContactController(Controller):
        __regid__ = 'suppress_contact_controller'
        domid = 'delete_contact_form'
    
        def publish(self, rset=None):
            if '__action_cancel' in self._cw.form.keys():
                msg = self._cw._('Deletion canceled')
                raise Redirect(self._cw.build_url(
                    vid='contact_management_view',
                    __message=msg))
            elif '__action_delete' in self._cw.form.keys():
                xid = self._cw.form['eidfrom']
                dead_contact = self._cw.entity_from_eid(xid)
                yid = self._cw.form['eidto']
                self._cw.execute(
                        'DELETE X in_contact_with Y'
                        '  WHERE X eid %(xid)s, Y eid %(yid)s',
                        {'xid': xid, 'yid': yid})
                msg = self._cw._('%s removed from your contacts') %\
                    dead_contact.dc_long_title()
                raise Redirect(self._cw.build_url(
                    vid='contact_management_view',
                    __message=msg))
    

    Retrieving of the user action is performed by testing if the '__action_<action>', where <action> refers to the cwaction in the button declaration, is present in the form keys. In the case of a cancelling, we simply redirect to the contact management view with a message specifying that the deletion has been cancelled. In the case of a deletion confirmation, both Person id's for the connected user and for the contact to remove are retrieved from the form hidden arguments.

    The deletion is performed using an RQL request on the relation in_contact_with. We also redirect the view to the contact management view, this time with another message confirming the deletion of the contact link.


  • Logilab at the LawFactory

    2012/07/16 by Vincent Michel

    We have been playing along with political data for a while, using CubicWeb to store and query various sets of open data (e.g. NosDeputes, data.gouv.fr), and testing different visualization tools. In particular, we have extended our prototype of News Analysis (see the presentation we made last year at Euroscipy), in order to use these political datasets as reference for the named entities extraction part. Last week's conference "The Law Factory" at Sciences Po was a really nice opportunity to meet people with similar interests in opendata for political sciences, and to find out which questions we should be asking our data ! Check out the talk of our presentation and a few screencasts (no sound) :

    Comments are welcome !

    Interresting things seen at #OLPC

    Among the different things that we have seen, we want to emphasize on:

    • Law is Code (http://gitorious.org/law-is-code/) - This project by the team of Regards Citoyens, aims at analysing the laws and amendments, by extracting information from the French National Assembly website, and by pushing the contributions of the members of parlement to a given law in a git repository. If we can find the time, we'll turn that into a mercurial repository and integrate it into our above application using cubicweb-vcsfile.
    http://www.cubicweb.org/file/2423768?vid=download
    • Both national websites (Assemblée Nationale, Sénat), do not allow (yet...) to get data any other way than parsing the sites. However, it seems that the people involved are aware of the issues of opendata, and this may changed in the next months. In particular, the Senat use two databases (Basile and Ameli), and opening them to the public could be really interesting
    • Different projects about African parlements can be found on the following website : http://www.parliaments.info
    • Check out, ITCparliement which gives tools to analyse and share data from many different parliments.

    Saturday, at La Cantine Numérique, the discussions focused on the possibilities to share tools, and the possible collaborations. I think that this is the crucial point: How people can share tools and use them in a efficient way, without being an IT expert ?

    How does this inspire us for CubicWeb ?

    In this way, we have are thinking about some evolutions of CubicWeb that can fullfill (part) of these requirements:

    • easier installation, especially on Windows, and easier Postgresql configuration. This could perhaps be made by allowing some graphical interface for creating/managing the instances and the databases.
    • a graphical tool for schema construction. Even if the construction of a data model in CubicWeb is quite simple, and rely on the straightforward Python syntax, it could be interesting to expose a graphical tool for adding/removing/modifying entities from the schema, as well as some attributes or relations.
    • easier ways to import data. This point is not trivial, and we don't want to develop a specific language for defining import rules, that could be used for 80% of the cases, but will be painful to extend to the 20% exotic cases. We would rather develop some helpers to ease the building of some import scripts in Python, and to upload some CubicWeb instances already filled with open databases.

    Demo of CubicWeb as a follow up

    As a follow up of the conference, we are openning a demo site using CubicWeb to expose data of the past legislative and presidential elections (2002, 2007, 2012)

    https://www.cubicweb.org/file/2425136?&vid=download

    The data used is published under Licence Ouverte / Open Licence by http://data.gouv.fr.

    This demo site allows you to deeply explore the data, with different visualisations, and complex queries. Again, comments are welcome, especially if you want to retrieve some information but you don't know how to! This demo site will probably evolve in the next weeks, and we will use it to test different cubes that we have been building.

    PS: We are sorry we cannot open the propotype of news aggregator for now, as there are still licensing issues concerning the reusability of the different news sources that we get articles from.


  • What's new in CubicWeb 3.15

    2012/05/14 by Sylvain Thenault

    CubicWeb 3.15 introduces a bunch of new functionalities. In short (more details below):

    • ability to use ZMQ instead of Pyro to connect to repositories
    • ZMQ inter-instances messages bus
    • new LDAP source using the datafeed approach, much more flexible than the legacy 'ldapuser' source
    • full undo support

    Plus some refactorings regarding Ajax function calls, WSGI, the registry, etc. Read more for the detail.

    New functionalities

    • Add ZMQ server, based on the cutting edge ZMQ socket library. This allows to access distant instances, in a similar way as Pyro.
    • Publish/subscribe mechanism using ZMQ for communication among cubicweb instances. The new zmq-address-sub and zmq-address-pub configuration variables define where this communication occurs. As of this release this mechanism is used for entity cache invalidation.
    • Improved WSGI support. While there are still some caveats, most of the code which was twisted only is now generic and allows related functionalities to work with a WSGI front-end.
    • Full undo/transaction support: undo of modifications has finally been implemented, and the configuration simplified (basically you activate it or not on an instance basis).
    • Controlling HTTP status code returns is now much easier:
      • WebRequest now has a status_out attribute to control the response status ;
      • most web-side exceptions take an optional status argument.

    API changes

    • The base registry implementation has been moved to a new logilab.common.registry module (see #1916014). This includes code from :

      • cubicweb.vreg (everything that was in there)
      • cw.appobject (base selectors and all).

      In the process, some renaming was done:

      • the top level registry is now RegistryStore (was VRegistry), but that should not impact CubicWeb client code;
      • former selectors functions are now known as "predicate", though you still use predicates to build an object'selector;
      • for consistency, the objectify_selector decorator has hence been renamed to objectify_predicate;
      • on the CubicWeb side, the selectors module has been renamed to predicates.

      Debugging refactoring dropped the need for the lltrace decorator. There should be full backward compat with proper deprecation warnings. Notice the yes predicate and objectify_predicate decorator, as well as the traced_selection function should now be imported from the logilab.common.registry module.

    • All login forms are now submitted to <app_root>/login. Redirection to requested page is now handled by the login controller (it was previously handled by the session manager).

    • Publisher.publish has been renamed to Publisher.handle_request. This method now contains a generic version of the logic previously handled by Twisted. Controller.publish is not affected.

    Unintrusive API changes

    • New 'ldapfeed' source type, designed to replace 'ldapuser' source with data-feed (i.e. copy based) source ideas.
    • New 'zmqrql' source type, similar to 'pyrorql' but using ømq instead of Pyro.
    • A new registry called 'services' has appeared, where you can register server-side cubicweb.server.Service child classes. Their call method can be invoked from a web-side AppObject instance using the new self._cw.call_service method or a server-side one using self.session.call_service. This is a new way to call server-side methods, much cleaner than monkey patching the Repository class, which becomes a deprecated way to perform similar tasks.
    • a new ajaxfunction registry now hosts all remote functions (i.e. functions callable through the asyncRemoteExec JS api). A convenience ajaxfunc decorator will let you expose your python functions easily without all the appobject standard boilerplate. Backwards compatibility is preserved.
    • the 'json' controller is now deprecated in favor of the 'ajax' one.
    • WebRequest.build_url can now take a __secure__ argument. When True, cubicweb tries to generate an https url.

    User interface changes

    A new 'undohistory' view exposes the undoable transactions and gives access to undo some of them.


  • Thoughts on CubicWeb 4.0

    2012/05/14 by Sylvain Thenault

    This is a fairly technical post talking about the structural changes I would like to see in CubicWeb's near future. Let's call that CubicWeb 4.0! It also drafts ideas on how to go from here to there. Draft, really. But that will eventually turn into a nice roadmap hopefully.

    The great simplification

    Some parts of cubicweb are sometimes too hairy for different reasons (some good, most bad). This participates in the difficulty to get started quickly. The goal of CubicWeb 4.0 should be to make things simpler :

    • Fix some bad old design.
    • Stop reinventing the wheel and use widely used libraries in the Python Web World. This extends to benefitting from state of the art libraries to build nice and flexible UI such as Bootstrap, on top of the JQuery foundations (which could become as prominent as the Python standard library in CubicWeb, the development team should get ready for it).
    • If there is a best way to do something, just do it and refrain from providing configurability and options.

    On the road to Bootstrap

    First, a few simple things could be done to simplify the UI code:

    • drop xhtml support: always return text/html content type, stop bothering with this stillborn stuff and use html5
    • move away everything that should not be in the framework: calendar?, embedding, igeocodable, isioc, massmailing, owl?, rdf?, timeline, timetable?, treeview?, vcard, wdoc?, xbel, xmlrss?

    Then we should probably move the default UI into some cubes (i.e. the content of cw.web.views and cw.web.data). Besides making the move to Bootstrap easier, this should also have the benefit of making clearer that this is the default way to build an (automatic) UI in CubicWeb, but one may use other, more usual, strategies (such as using a template language).

    At a first glance, we should start with the following core cubes:

    • corelayout, the default interface layout and generic components. Modules to backport there: application (not an appobject yet), basetemplates, error, boxes, basecomponents, facets, ibreadcrumbs, navigation, undohistory.
    • coreviews, the default generic views and forms. Modules to backport there: actions, ajaxedit, baseviews, autoform, dotgraphview, editcontroller, editforms, editviews, forms, formrenderers, primary, json, pyviews, tableview, reledit, tabs.
    • corebackoffice, the concrete views for the default back-office that let you handle users, sources, debugging, etc. through the web. Modules to backport here: cwuser, debug, bookmark, cwproperties, cwsources, emailaddress, management, schema, startup, workflow.
    • coreservices, the various services, not directly related to display of something. Modules to backport here: ajaxcontroller, apacherewrite, authentication, basecontrollers, csvexport, idownloadable, magicsearch, sessions, sparql, sessions, staticcontrollers, urlpublishing, urlrewrite.

    This is a first draft that will need some adjustements. Some of the listed modules should be split (e.g. actions, boxes,) and their content moved to different core cubes. Also some modules in cubicweb.web packages may be moved to the relevant cube.

    Each cube should provide an interface so that one could replace it with another one. For instance, move from the default coreviews and corelayout cube to bootstrap based ones. This should allow a nice migration path from the current UI to a Bootstrap based UI. Bootstrap should probably be introduced bottom-up: start using it for tables, lists, etc. then go up until the layout defined in the main template. The Orbui experience should greatly help us by pointing at hot spots that will have to be tackled, as well as by providing a nice code base from which we should start.

    Regarding current implementation, we should take care that Contextual components are a powerful way to build "pluggable" UI, but we should probably add an intermediate layer that would make more obvious / explicit:

    • what the available components are
    • what the available slots are
    • which component should go in which slot when possible

    Also at some point, we should take care to separate view's logic from HTML generation: our experience with client works shows that a common need is to use the logic but produce a different HTML. Though we should wait for more use of Bootstrap and related HTML simplification to see if the CSS power doesn't somewhat fulfill that need.

    On the road to proper tasks management

    The current looping task / repo thread mecanism is used for various sort of things and has several problems:

    • tasks don't behave similarly in a multi-instances configuration (some should be executed in a single instance, some in a subset); the tasks system has been originally written in a single instance context; as of today this is (sometimes) handled using configuration options (that will have to be properly set in each instance configuration file);
    • tasks is a repository only api but we also need web-side tasks;
    • there is probably some abuse of the system that may lead to unnecessary resources usage.

    Analyzing a sample http://www.logilab.org/ instance, below are the running looping task by categories. Tasks that have to run on each web instance:

    • clean_sessions, automatically closes unused repository sessions. Notice cw.etwist.server also records a twisted task to clean web sessions. Some changes are imminent on this, they will be addressed in the upcoming refactoring session (that will become more and more necessary to move on several points listed here).
    • regular_preview_dir_cleanup (preview cube), cleanup files in the preview filesystem directory. Could be executed by a (some of the) web instance(s) provided that the preview directory is shared.

    Tasks that should run on a single instance:

    • update_feeds, update copy based sources (e.g. datafeed, ldapfeed). Controlled by 'synchronize' source configuration (persistent source attribute that may be overridden by instance using CWSourceHostConfig entities)
    • expire_dataimports, delete CWDataImport entities older than an amount of time specified in the 'logs-lifetime' configuration option. Not controlled yet.
    • cleanup_auth_cookies (rememberme cube), delete CWAuthCookie entities whose life-time is exhausted. Not controlled yet.
    • cleaning_revocation_key (forgotpwd cube), delete Fpasswd entities with past revocation_date. Not controlled yet.
    • cleanup_plans (narval cube), delete Plan entities instance older than an amount of time specified in the configuration. If 'plan-cleanup-delay' is set to an empty value, the task isn't started.
    • refresh_local_repo_caches (vcsfile cube), pull or clone vcs repositories cache if the Repository entity ask to import_revision_content (hence web instance should have up to date cache to display files content) or if 'repository-import' configuration option is set to 'yes'; import vcs repository content as entities if 'repository-import' configuration option and it is coming from the system source.

    Some deeper thinking is needed here so we can improve things. That includes thinking about:

    • the inter-instances messages bus based on zmq and introduced in 3.15,
    • the Celery project (http://celeryproject.org/), an asynchronous task queue, widely used and written in Python,

    Remember the more cw independent the tasks are, the better it is. Though we still want an 'all-integrated' approach, e.g. not relying on external configuration of Unix specific tools such as CRON. Also we should see if a hard-dependency on Celery or a similar tool could be avoided, and if not if it should be considered as a problem (for devops).

    On the road to an easier configuration

    First, we should drop the different behaviour according to presence of a '.hg' in cubicweb's directory. It currently changes the location where cubicweb external resources (js, css, images, gettext catalogs) are searched for. Speaking of implementation:

    • shared_dir returns the cubicweb.web package path instead of the path to the shared cube,
    • i18n_lib_dir returns the cubicweb/i18n directory path instead of the path to the shared/i18n cube,
    • migration_scripts_dir returns the cubicweb/misc/migration directory path instead of share/cubicweb/migration.

    Moving web related objects as proposed in the Bootstrap section would resolve the problem for the content web/data and most of i18n (though some messages will remain and additional efforts will be needed here). By going further this way, we may also clean up some schema code by moving cubicweb/schemas and cubicweb/misc/migration to a cube (though only a small benefit is to be expected here).

    We should also have fewer environment variables... Let's see what we have today:

    • CW_INSTANCES_DIR, where to look for instances configuration
    • CW_INSTANCES_DATA_DIR, where to look for instances persistent data files
    • CW_RUNTIME_DIR, where to look for instances run-time data files
    • CW_MODE, set to 'system' or 'user' will predefine above environment variables differently
    • CW_CUBES_PATH, additional directories where to look for cubes
    • CW_CUBES_DIR, location of the system 'cubes' directory
    • CW_INSTALL_PREFIX, installation prefix, from which we can compute path to 'etc', 'var', 'share', etc.

    I would propose the following changes:

    • CW_INSTANCES_DIR is turned into CW_INSTANCES_PATH, and defaults to ~/etc/cubicweb.d if it exists and /etc/cubicweb.d (on Unix platforms) otherwise;
    • CW_INSTANCES_DATA_DIR and CW_RUNTIME_DIR are replaced by configuration file options, with smart values generated at instance creation time;
    • the above change should make CW_MODE useless;
    • CW_CUBES_DIR is to be dropped, CW_CUBES_PATH should be enough;
    • regarding CW_INSTALL_PREFIX, I'm lacking experience with non-hg-or-debian installations and don't know if this can be avoided or not.

    Last but not least (for the moment), the 'web' / 'repo' / 'all-in-one' configurations, and the fact that the associated configuration file changes stinks. Ideas to stop doing this:

    • one configuration file per instance, with all options provided by installed parts of the framework used by the application.
    • activate 'services' (or not): web server, repository, zmq server, pyro server. Default services to be started are stored in the configuration file.

    There is probably more that can be done here (less configuration options?), but that would already be a great step forward.

    On the road to...

    The following projects should be investigated to see if we could benefit from them:

    Discussion

    Remember the following goals: migration of legacy code should go smoothly. In a perfect world every application should be able to run with CubicWeb 4.0 until the backwards compatibility code is removed (and CubicWeb 4.0 will probably be released as 4.0 at that time).

    Please provide feedbacks:

    • do you think choices proposed above are good/bad choices? Why?
    • do you know some additional libraries that should be investigated?
    • do you have other changes in mind that could/should be done in cw 4.0?

  • Follow up of IRI conference about Museums and the Web #museoweb

    2012/04/12 by Arthur Lutz

    I attented the conference organised by IRI in a series of conferences about "Muséologie, muséographie et nouvelles formes d’adresse au public" (hashtag #museoweb). This particular occurence was about "Le Web devient audiovisuel" (the web is also audio and video content). Here are a few notes and links we gathered. The event was organised by Alexandre Monnin @aamonnz.

    http://polemictweet.com/2011-2012-museo-audiovisuel/images/slide4_museo_fr.png

    Yves Raimond from the BBC

    Yves Raimond @moustaki made a presentation about his work at the BBC around semantic web technologies and speech recognition over large quantities of digitized archives. Parts of the BCC web sites use semantic web data as the database and do mashups with external sources of data (musicbrainz, dbpedia, wikipedia). For example Tom Waits has an html web page : http://www.bbc.co.uk/music/artists/c3aeb863-7b26-4388-94e8-5a240f2be21b add .rdf at the end of the URL http://www.bbc.co.uk/music/artists/c3aeb863-7b26-4388-94e8-5a240f2be21b.rdf

    He also made an introduction about the ABC-IP The Automatic Broadcast Content Interlinking Project and the Kiwi-API project that uses CMU Sphinx on Amazon Web Services to process large quantities of archives. A screenshot of Kiwi-API is shown on the BBC R&D blog. The code should be open sourced soon and should appear on the BBC R&D github page.

    Following his presentation, the question was asked if using Wikipedia content on an institutional web site would be possible in France, I pointed to the use of Wikipedia on http://data.bnf.fr , for example at the bottom of the Victor Hugo page.

    Raphaël Troncy about Media Fragments

    Raphaël Troncy @rtroncy made a presentation about "Media Fragments" which will enable sharing parts of a video on the web. Two major features : the sharing of specific extracts and the optimization of bandwith use when streaming the extract (usefull for mobile devices for example). It is a W3C working draft : http://www.w3.org/TR/media-frags-reqs/. Here are a few links of demos and players :

    Part of the presentation was about the ACAV project done jointly with Dailymotion : http://www.capdigital.com/projet-acav/

    The slides of his presentation are available here : http://www.slideshare.net/troncy/addressing-and-annotating-multimedia-fragments

    IRI presentation

    Vincent Puig @vincentpuig and Raphaël Velt @raphv made a presentation of various projects led by IRI :

    http://www.iri.centrepompidou.fr/wp-content/themes/IRI-Theme/images/logo-iri-petit_fr_fr.png

    Final words

    The technologies seen during this conference are often related to semantic web technologies or at least web standards. Some of the visualizations are quite impressive and could mean new uses of the Web and an inspiration for CubicWeb projects.

    A few of the people present at the conference will be attending or presenting talks at SemWeb.Pro which will take place in Paris on the 2nd and 3rd of may 2012.


  • CubicWeb Sprint report for the "BugSquash" team

    2012/03/16 by Nicolas Chauvat

    Beginners fixed core bugs

    The first day of the CubicWeb sprint was dedicated to an introduction to a group of four beginners that included two people that do not work at Logilab. At the end of day, this team knew about Entity, Views and Schema and was ready to dive into the core in order to squash some bugs.

    The first steps into the CubicWeb core were not so easy, but these brave beginners, assisted by a skilled developer, managed to fix some bugs and add a few useful features, including one from a windows user that made it into the stable branch.

    The gen-static-datadir command

    We had a look at cubicweb-ctl gen-static-datadir, a feature that copies in a directory all the files that could be cached by a "front" web server instead of being served by cubicweb.

    Testing the feature

    At first run, we found that not all files where copied. We alas were unable to reproduce. So we need to keep an eye on this. On next tests, we tried several configuration. The files that were copied were always the ones containd in the "deepest" cube in the tree of cubes. So we can say that the command is working well.

    Approach used by the feature

    In the code, we browse all cubes used by the master cube to gather all filenames that we want to copy and afterwards we use "config.locate_resource(resource)" to find the best location for this file.

    Doing this, we sometimes copy a file from the cache. If we do not want to use the cache, we could be sort the cubes recursively copy the whole data folder and sometimes overwrite files with files located nearer to the master cube.

    New option

    We added a -r option that erases the target directory before launching the command.


  • Undoing changes in CubicWeb

    2012/02/29 by Anthony Truchet

    Many desktop applications offer the possibility for the user to undo the recent changes : a similar undo feature has now been integrated into the CubicWeb framework.

    Because a semantic web application and a common desktop application are not the same thing at all, especially as far as undoing is concerned, we will first introduce what is the undo feature for now.

    What's undoing in a CubicWeb application

    A CubicWeb application acts upon an Entity-Relationship model, described by a schema. This ensures some data integrity properties. It also implies that changes are made by group called transaction : so as to insure the data integrity the transaction is completely applied or none of it is applied. What may appear as a simple atomic action to a user can actually consist in several actions for the framework. The end-user has no need to know the details of all actions in those transactions. Only the so-called public actions will appear in the description of the an undoable transaction.

    Lets take a simple example: posting a "comment" for a blog entry will create the entity itself and the link to the blog entry.

    The undo feature for CubicWeb end-users

    For now there are two ways to access the undo feature when it has been activated in the instance configuration file with the option undo-support=yes. Immediately after having done something the undo** link appears in the "creation" message.

    Screenshot of the undo link in the message

    Otherwise, one can access at any time the undo-history view accessible from the start-up page.

    Screenshot of the undo link in the message

    This view shows the transactions, and each provides its own undo link. Only the transactions the user has permissions to see and undo will be shown.

    Screenshot of the **undo** link in the message

    If the user attempts to undo a transaction which can't be undone or whose undoing fails, then a message will explain the situation and no partial undoing will be left behind.

    What's next

    The undo feature is functional but the interface and configuration options are quite limited. One major, planned, improvement would be enable the user to filter which transactions or actions he sees in the undo-history view. Another critical improvement would be to selectively enable the undo feature on part of the entity-relationship schema to avoid storing too much data and reduce the underlying overhead.

    Feedback on this undo feature for specific CubicWeb applications is welcome. More detailed information regarding the undo feature will be published in the CubicWeb book when the patches make it through the review process.


  • CubicWeb Sprint report for the "ZMQ" team

    2012/02/27 by Julien Cristau

    There has been a growing interest in ZMQ in the past months, due to its ability to efficiently deal with message passing, while being light and robust. We have worked on introducing ZMQ in the CubicWeb framework for various uses :

    • As a replacement/alternative to the Pyro source, that is used to connect to distant instances. ZMQ may be used as a lighter and more efficient alternative to Pyro. The main idea here is to use the send_pyobj/recv_pyobj API of PyZMQ (python wrapper of ZMQ) to execute methods on the distant Repository in a totally transparent way for CubicWeb.
    http://www.cubicweb.org/file/2219158?vid=download
    • As a JSONServer. Indeed, ZMQ could be used to share data between a server and any requests done through ZMQ. The request is just a string of RQL, and the response is the result set formatted in Json.
    • As the building block for a simple notification (publish/subscribe) system between CubicWeb instances. A component can register its interest in a particular topic, and receive a callback whenever a corresponding message is received. At this point, this mechanism is used in CubicWeb to notify other instances that they should invalidate their caches when an entity is deleted.

  • CubicWeb Sprint report for the "WSGI" team

    2012/02/20 by Pierre-Yves David

    Cubicweb has had WSGI support for several years, but this support was incomplete.

    The WSGI team was in charge of turning WSGI support into a full featured backend that could replace Twisted in real production scenarii.

    Because we only had first class support for Twisted, some of the CubicWeb logic related to HTTP handling was implemented on the twisted side with twisted concepts. Our first task was to move this logic in CubicWeb itself. The handling of HTTP status in our response was improved in the process.

    Our second task was to focus on the "non-HTTP" part of CubicWeb (because the repository also manages background tasks). The developement mode for WSGI is now able to handle and run such tasks. For this purpose we have begun a process that aims to remove server related code from the repository object.

    We also Tested several WSGI middleware. One of the most promising is Firepython, integrating python logging and debugging feature with Firebug. werkzeug debugger seems neat too.

    http://www.cubicweb.org/file/2194267?vid=download

    All these improvements open the road to a simple and efficient multi-process architecture in CubicWeb.


  • CubicWeb Sprint report for the "Benchmarks" team

    2012/02/17 by Arthur Lutz

    One team during the CubicWeb sprint looked at issues around monitoring benchmark values for CubicWeb development. This is a huge task, so we tried to stay focused on a few aspects:

    • production reponse times (using tools such as smokeping and munin)
    • response times of test executions in continuous integration tests
    • response times of test instances runinng in continuous integration

    We looked at using cpu.clock() instead of cpu.time() in the xunit files that report test results so as to be a bit more independent of the load of the machine (but subprocesses won't be counted for).

    Graphing test times in hudson/jenkins already exists (/job/PROJECT/BUILDID/testReport/history/?) and can also be graphed by TestClass and by individual test. What is missing so far is a specific dashboard were one could select the significant graphs to look at.

    By the end of the first day we had a "lorem ipsum" test instance that is created on the fly on each hudson/jenkins build and a jmeter bench running on it, it's results processed by the performance plugin.

    http://www.cubicweb.org/file/2184036?vid=download

    By the end of the second day we had some visualisation of existing data collected by apycot using jqplot javascript visulation (cubicweb-jqplot):

    http://www.cubicweb.org/file/2184035?vid=download

    By the end of the sprint, we got patches submitted for the following cubes :

    • apycot
    • cubicweb-jqplot
    • the original jqplot library (update : patch accepted a few days later)

    On the last hour of the sprint, since we had a "lorem ipsum" test application running each time the tests went through the continuous integration, we hacked up a proof of concept to get automatic screenshots of this temporary test application. So far, we get screenshots for firefox only, but it opens up possibilities for other browsers. Inspiration could be drawn from https://browsershots.org/


  • "Data Fast-food": quick interactive exploratory processing and visualization of complex datasets with CubicWeb

    2012/01/19 by Vincent Michel

    With the emergence of the semantic web in the past few years, and the increasing number of high quality open data sets (cf the lod diagram), there is a growing interest in frameworks that allow to store/query/process/mine/visualize large data sets.

    We have seen in previous blog posts how CubicWeb may be used as an efficient knowledge management system for various types of data, and how it may be used to perform complex queries. In this post, we will see, using Geonames data, how CubicWeb may perform simple or complex data mining and machine learning procedures on data, using the datamining cube. This cube adds powerful tools to CubicWeb that make it easy to interactively process and visualize datasets.

    At this point, it is not meant to be used on massive datasets, for it is not fully optimized yet. If you try to perform a TF-IDF (term frequency–inverse document frequency) with a hierarchical clustering on the full dbpedia abstracts dataset, be prepared to wait. But it is a promising way to enrich the user experience while playing with different datasets, for quick interactive exploratory datamining processing (what I've called the "Data fast-food"). This cube is based on the scikit-learn toolbox that has recently gained a huge popularity in the machine learning and Python community. The release of this cube drastically increases the interest of CubicWeb for data management.

    The Datamining cube

    For a given query, similarly to SQL, CubicWeb returns a result set. This result set may be presented by a view to display a table, a map, a graph, etc (see documentation and previous blog posts).

    The datamining cube introduces the possibility to process the result set before presenting it, for example to apply machine learning algorithms to cluster the data.

    The datamining cube is based on two concepts:

    • the concept of processor: basically, a processor transforms a result set in a numpy array, given some criteria defining the mathematical processing, and the columns/rows of the result set to be taken into account. The numpy-array is a polyvalent structure that is widely used for numerical computation. This array could thus be efficiently used with any kind of datamining algorithms. Note that, in our context of knowledge management, it is more convenient to return a numpy array with additional meta-information, such as indices or labels, the result being stored in what we call a cw-array. Meta-information may be useful for display, but is not compulsory.
    • the concept of array-view: the "views" are basic components of CubicWeb, distinguish querying and displaying the data is key in this framework. So, on a given result set, many different views can be applied. In the datamining cube, we simply overload the basic view of CubicWeb, so that it works with cw-array instead of result sets. These array-views are associated to some machine learning or datamining processes. For example, one can apply the k-means (clustering process) view on a given cw-array.

    A very important feature is that the processor and the array-view are called directly through the URL using the two related parameters arid (for ARray ID) and vid (for View ID, standard in CubicWeb).

    http://www.cubicweb.org/file/2154793?vid=download

    Processors

    We give some examples of basic processors that may be found in the datamining cube:

    • AttributesAsFloatArrayProcessor (arid='attr-asfloat'): This processor turns all Int, BigInt and Float attributes in the result set to floats, and returns the corresponding array. The number of rows is equal to the number of rows in the result set, and the number of columns is equal to the number of convertible attributes in the result set.
    • EntityAsFloatArrayProcessor (arid='entity-asfloat'): This processor performs similarly to the AttributesAsFloatArrayProcessor, but keeps the reference to the entities used to create the numpy-array. Thus, this information could be used for display (map, label, ...).
    • AttributesAsTokenArrayProcessor (arid='attr-astoken'): This processor turns all String attributes in the result set in a numpy array, based on a Word-n-gram analyze. This may be used to tokenize a set of strings.
    • PivotTableCountArrayProcessor (arid='pivot-table-count'): This processor is used to create a pivot table, with a count function. Other functions, such as sum or product also exist. This may be used to create some spreadsheet-like views.
    • UndirectedRelationArrayProcessor (arid='undirected-rel'): This processor creates a binary numpy array of dimension (nb_entities, nb_entities), that represents the relations (or corelations) between entities. This may be used for graph-based vizualisation.

    We are also planning to extend the concept of processor to sparse matrix (scipy.sparse), in order to deal with very high dimensional data.

    Array Views

    The array views that are found in the datamining cube, are, for most of them, used for simple visualization. We used HTML-based templates and the Protovis Javascript Library.

    We will not detail all the views, but rather show some examples. Read the reference documentation for a complete and detailed description.

    Examples on numerical data

    Histogram

    The request:

    Any LO, LA WHERE X latitude LA, NOT X latitude NULL, X longitude LO,  NOT X longitude NULL,
    X country C, NOT X elevation NULL, C name "France"
    

    that may be translated as:

    All couples (latitude, longitude) of the locations in France, with an elevation not null
    

    and, using vid=protovis-hist and arid=attr-asfloat

    http://www.cubicweb.org/file/2154795?vid=download

    Scatter plot

    Using the notion of view, we can display differently the same result set, for example using a scatter plot (vid=protovis-scatterplot).

    http://www.cubicweb.org/file/2156233?vid=download

    Another example with the request:

    Any P, E WHERE X is Location, X elevation E, X elevation >1, X population P,
    X population >10, X country CO, CO name "France"
    

    that may be translated as:

    All couples (population, elevation) of locations in France,
    with a population higher than 10 (inhabitants),and an elevation higher than 1 (meter)
    

    and, using the same vid (vid=protovis-scatterplot) and the same arid (arid=attr-asfloat)

    http://www.cubicweb.org/file/2154802?vid=download

    If a third column is given in the result set (and thus in the numpy array), it will be encoded in the size/color of each dot of the scatter plot. For example with the request:

    Any LO, LA, E WHERE X latitude LA, NOT X latitude NULL, X longitude LO,  NOT X longitude NULL,
    X country C, NOT X elevation NULL, X elevation E, C name "France"
    

    that may be translated as:

    All tuples (latitude, longitude, elevation) of the locations in France, with an elevation not null
    

    and, using the same vid (vid=protovis-scatterplot) and the same arid (arid=attr-asfloat), we can visualize the elevation on a map, encoded in size/color

    http://www.cubicweb.org/file/2154805?vid=download

    Another example with the request:

    Any LO, LA LIMIT 50000 WHERE X is Location, X population  >1000, X latitude LA, X longitude LO,
    X country CO, CO name "France"
    

    that may be translated as:

    All couples (latitude, longitude) of 50000 locations in France, with a population higher than 100 (inhabitants)
    
    http://www.cubicweb.org/file/2156095?vid=download

    There also exist some AreaChart view, LineArray view, ...

    Examples on relational data

    Relational Matrix (undirected graph)

    The request:

    Any X,Y WHERE X continent CO, CO name "North America", X neighbour_of Y
    

    that may be translated as:

    All neighbour countries in North America
    

    and using the vid='protovis-binarymap' and arid='undirected-rel'

    http://www.cubicweb.org/file/2154796?vid=download

    Relational Matrix (directed graph)

    If we do not want a symmetric matrix, i.e. if we want to keep the direction of a link (X,Y is not the same relation as Y,X), we can use the directed*rel array processor. For example, with the following request:

    Any X,Y LIMIT 20 WHERE X continent Y
    

    that may be translated as:

    20 countries and their continent
    

    and using the vid='protovis-binarymap' and arid='directed-rel'

    http://www.cubicweb.org/file/2154797?vid=download

    Force directed graph

    For a dynamic representation of relations, we can use a force directed graph. The request:

    Any X,Y WHERE X neighbour_of Y
    

    that may be translated as:

    All neighbour countries in the World.
    

    and using the vid='protovis-forcedirected' and arid='undirected-rel', we can see the full graph, with small independent components (e.g. UK and Ireland)

    http://www.cubicweb.org/file/2154800?vid=download

    Again, a third column in the result set could be used to encode some labeling information, for example the continent.

    The request:

    Any X,Y,CO WHERE X neighbour_of Y, X continent CO
    

    that may be translated as:

    All neighbour countries in the World, and their corresponding continent.
    

    and again, using the vid='protovis-forcedirected' and arid='undirected-rel', we can see the full graph with the continents encoded in color (Americas in green, Africa in dark blue, ...)

    http://www.cubicweb.org/file/2154801?vid=download

    Dendrogram

    For hierarchical information, one can use the Dendrogram view. For example, with the request:

    Any X,Y WHERE X continent Y
    

    that may be translated as:

    All couple (country, continent) in the World
    

    and using vid='protovis-dendrogram' and arid='directed-rel', we have the following dendrogram (we only show a part due to lack of space)

    http://www.cubicweb.org/file/2154806?vid=download

    Unsupervised Learning

    We have also developed some machine learning view for unsupervised learning. This is more a proof of concept than a fully optimized development, but we can already do some cool stuff. Each machine learning processing is referenced by a mlid. For example, with the request:

    Any LO, LA WHERE X is Location, X elevation E, X elevation >1, X latitude LA, X longitude LO,
    X country CO, CO name "France"
    

    that may be translated as:

    All couples (latitude, longitude) of the locations in France, with an elevation higher than 1
    

    and using vid='protovis-scatterplot' arid='attr-asfloat' and mlid='kmeans', we can construct a scatter plot of all couples of latitude and longitude in France, and create 10 clusters using the kmeans clustering. The labeling information is thus encoded in color/size:

    http://www.cubicweb.org/file/2154804?vid=download

    Download

    Finally, we have also implement a download view, based on the Pickle of the numpy-array. It is thus possible to access remotely any data within a Python shell, allowing to process them as you want. Changing the request can be done very easily by changing the rql parameter in the URL. For example:

    import pickle, urllib
    data = pickle.loads(urllib.open('http://mydomain?rql=my request&vid=array-numpy&arid=attr-asfloat'))
    

  • CubicWeb sprint in Paris - 2012/02/07-10

    2011/12/21 by Nicolas Chauvat

    Topics

    To be decided. Some possible topics are :

    • optimization (still)
    • porting cubicweb to python3
    • porting cubicweb to pypy
    • persistent sessions
    • finish twisted / wsgi refactoring
    • inter-instance communication bus
    • use subprocesses to handle datafeeds
    • developing more debug-tools (debug console, view profiling, etc.)
    • pluggable / unpluggable external sources (as needed for the cubipedia and semantic family)
    • client-side only applications (javascript + http)
    • mercurial storage backend: see this thread of the mailing list
    • mercurial-server integration: see this email to the mailing list

    other ideas are welcome, please bring them up on cubicweb@lists.cubicweb.org

    Location

    This sprint will take place from in february 2012 from tuesday the 7th to friday the 10th. You are more than welcome to come along, help out and contribute. An introduction is planned for newcomers.

    Network resources will be available for those bringing laptops.

    Address : 104 Boulevard Auguste-Blanqui, Paris. Ring "Logilab" (googlemap)

    Metro : Glacière

    Contact : http://www.logilab.fr/contact

    Dates : 07/02/2012 to 10/02/2012


  • Geonames in CubicWeb !

    2011/12/14 by Vincent Michel

    CubicWeb is a semantic web framework written in Python that has been succesfully used in large-scale projects, such as data.bnf.fr (French National Library's opendata) or Collections des musées de Haute-Normandie (museums of Haute-Normandie).

    CubicWeb provides a high-level query language, called RQL, operating over a relational database (PostgreSQL in our case), and allows to quickly instantiate an entity-relationship data-model. By separating in two distinct steps the query and the display of data, it provides powerful means for data retrieval and processing.

    In this blog, we will demonstrate some of these capabilities on the Geonames data.

    Geonames

    Geonames is an open-source compilation of geographical data from various sources:

    "...The GeoNames geographical database covers all countries and contains over eight million placenames that are available for download free of charge..." (http://www.geonames.org)

    The data is available as a dump containing different CSV files:

    • allCountries: main file containing information about 8,000,000 places in the world. We won't detail the various attributes of each location, but we will focus on some important properties, such as population and elevation. Moreover, admin_code_1 and admin_code_2 will be used to link the different locations to the corresponding AdministrativeRegion, and feature_code will be used to link the data to the corresponding type.
    • admin1CodesASCII.txt and admin2Codes.txt detail the different administrative regions, that are parts of the world such as region (Ile-de-France), department (Department of Yvelines), US counties...
    • featureCodes.txt details the different types of location that may be found in the data, such as forest(s), first-order administrative division, aqueduct, research institute, ...
    • timeZones.txt, countryInfo.txt, iso-languagecodes.txt are additional files prodividing information about timezones, countries and languages. They will be included in our CubicWeb database but won't be explained in more details here.

    The Geonames website also provides some ways to browse the data: by Countries, by Largest Cities, by Highest mountains, by postal codes, etc. We will see that CubicWeb could be used to automatically create such ways of browsing data while allowing far deeper queries. There are two main challenges when dealing with such data:

    • the number of entries: with 8,000,000 placenames, we have to use efficient tools for storing and querying them.
    • the structure of the data: the different types of entries are separated in different files, but should be merged for efficient queries (i.e. we have to rebuild the different links between entities, e.g Location to Country or Location to AdministrativeRegion).

    Data model

    With CubicWeb, the data model of the application is written in Python. It defines different entity classes with their attributes, as well as the relationships between the different entity classes. Here is a sample of the schema.py that we have used for Geonames data:

    class Location(EntityType):
        name = String(maxsize=1024, indexed=True)
        uri = String(unique=True, indexed=True)
        geonameid = Int(indexed=True)
        latitude = Float(indexed=True)
        longitude = Float(indexed=True)
        feature_code = SubjectRelation('FeatureCode', cardinality='?*', inlined=True)
        country = SubjectRelation('Country', cardinality='?*', inlined=True)
        main_administrative_region = SubjectRelation('AdministrativeRegion',
                                  cardinality='?*', inlined=True)
        timezone = SubjectRelation('TimeZone', cardinality='?*', inlined=True)
        ...
    

    This indicates that the main Location class has a name attribute (string), an uri (string), a geonameid (integer), a latitude and a longitude (both floats), and some relation to other entity classes such as FeatureCode (the relation is named feature_code), Country (the relation is named country), or AdministrativeRegion called main_administrative_region.

    The cardinality of each relation is classically defined in a similar way as RDBMS, where * means any number, ? means zero or one and 1 means one and only one.

    We give below a visualisation of the schema (obtained using the /schema relative url)

    http://www.cubicweb.org/file/2124618?vid=download

    Import

    The data contained in the CSV files could be pushed and stored without any processing, but it is interesting to reconstruct the relations that may exist between different entities and entity classes, so that queries will be easier and faster.

    Executing the import procedure took us 80 minutes on regular hardware, which seems very reasonable given the amount of data (~7,000,000 entities, 920MB for the allCountries.txt file), and the fact that we are also constructing many indexes (on attributes or on relations) to improve the queries. This import procedure uses some low-level SQL commands to load the data into the underlying relational database.

    Queries and views

    As stated before, queries are performed in CubicWeb using RQL (Relational Query Language), which is similar to SPARQL, but with a syntax that is closer to SQL. This language may be used to query directly the concepts while abstracting the physical structure of the underlying database. For example, one can use the following request:

    Any X LIMIT 10 WHERE X is Location, X population > 1000000,
        X country C, C name "France"
    

    that means:

    Give me 10 locations that have a population greater than 1000000, and that are in a country named "France"

    The corresponding SQL query is:

    SELECT _X.cw_eid FROM cw_Country AS _C, cw_Location AS _X
    WHERE _X.cw_population>1000000
          AND _X.cw_country=_C.cw_eid AND _C.cw_name="France"
    LIMIT 10
    

    We can see that RQL is higher-level than SQL and abstracts the details of the tables and the joins.

    A query returns a result set (a list of results), that can be displayed using views. A main feature of CubicWeb is to separate the two steps of querying the data and displaying the results. One can query some data and visualize the results in the standard web framework, download them in different formats (JSON, RDF, CSV,...), or display them in some specific view developed in Python.

    In particular, we will use the mapstraction.map which is based on the Mapstraction and the OpenLayers libraries to display information on maps using data from OpenStreetMap. This mapstraction.map view uses a feature of CubicWeb called adapter. An adapter adapts a class of entity to some interface, hence views can rely on interfaces instead of types and be able to display entities with different attributes and relations. In our case, the IGeocodableAdapter returns a latitude and a longitude for a given class of entity (here, the mapping is trivial, but there are more complex cases... :) ):

    class IGeocodableAdapter(EntityAdapter):
          __regid__ = 'IGeocodable'
          __select__ = is_instance('Location')
          @property
          def latitude(self):
              return self.entity.latitude
          @property
          def longitude(self):
              return self.entity.longitude
    

    We will give some results of queries and views later. It is important to notice that the following screenshoots are taken without any modification of the standard web interface of CubicWeb. It is possible to write specific views and to define a specific CSS, but we only wanted to show how CubicWeb could handle such data. However, the default web template of CubicWeb is sufficient for what we want to do, as it dynamically creates web pages showing attributes and relations, as well as some specific forms and javascript applets adapted directly to the data (e.g. map-based tools). Last but not least, the query and the view could be defined within the url, and thus open a world of new possibilities to the user:

    http://baseurl:port/?rql=The query that I want&vid=Identifier-of-the-view
    

    Facets

    We will not get into too much details about Facets, but let's just say that this feature may be used to determine some filtering axis on the data, and thus may be used to post-filter a result set. In this example, we have defined four different facets: on the population, on the elevation, one the feature_code and one the main_administrative_region. We will see illustration of these facets below.

    We give here an example of the definition of a Facet:

    class LocationPopulationFacet(facet.RangeFacet):
        __regid__ = 'population-facet'
        __select__ = is_instance('Location')
        order = 2
        rtype = 'population'
    

    where __select__ defines which class(es) of entities are targeted by this facet, order defines the order of display of the different facets, and rtype defines the target attribute/relation that will be used for filtering.

    Geonames in CubicWeb

    The main page of the Geoname application is illustrated in the screenshot below. It provides general information on the database, in particular the number of entities in the different classes:

    • 7,984,330 locations.
    • 59,201 administrative regions (e.g. regions, counties, departments...)
    • 7,766 languages.
    • 656 features (e.g. types of location).
    • 410 time zones.
    • 252 countries.
    • 7 continents.
    http://www.cubicweb.org/file/2124617?vid=download

    Simple query

    We will first illustrate the possibilites of CubicWeb with the simple query that we have detailed before (that could be directly pasted in the url...):

    Any X LIMIT 10 WHERE X is Location, X population > 1000000,
        X country C, C name "France"
    

    We obtain the following page:

    http://www.cubicweb.org/file/2124615?vid=download

    This is the standard view of CubicWeb for displaying results. We can see (right box) that we obtain 10 locations that are indeed located in France, with a population of more than 1,000,000 inhabitants. The left box shows the search panel that could be used to launch queries, and the facet filters that may be used for filtering results, e.g. we may ask to keep only results with a population greater than 4,767,709 inhabitants within the previous results:

    http://www.cubicweb.org/file/2124616?vid=download

    and we obtain now only 4 results. We can also notice that the facets are linked: by restricting the result set using the population facet, the other facets also restricted their possibilities.

    Simple query (but with more information !)

    Let's say that we now want more information about the results that we have obtained previously (for example the exact population, the elevation and the name). This is really simple ! We just have to ask within the RQL query what we want (of course, the names N, P, E of the variables could be almost anything...):

    Any N, P, E LIMIT 10 WHERE X is Location,
        X population P, X population > 1000000,
        X elevation E, X name N, X country C, C name "France"
    
    http://www.cubicweb.org/file/2124619?vid=download

    The empty column for the elevation simply means that we don't have any information about elevation.

    Anyway, we can see that fetching particular information could not be simpler! Indeed, with more complex queries, we can access countless information from the Geonames database:

    Any N,E,LA,LO ORDERBY E DESC LIMIT 10  WHERE X is Location,
          X latitude LA, X longitude LO,
          X elevation E, NOT X elevation NULL, X name N,
          X country C, C name "France"
    

    which means:

    Give me the 10 highest locations (the 10 first when sorting by decreasing elevation) with their name, elevation, latitude and longitude that are in a country named "France"
    http://www.cubicweb.org/file/2124626?vid=download

    We can now use another view on the same request, e.g. on a map (view mapstraction.map):

    Any X ORDERBY E DESC LIMIT 10  WHERE X is Location,
           X latitude LA, X longitude LO, X elevation E,
           NOT X elevation NULL, X country C, C name "France"
    
    http://www.cubicweb.org/file/2124631?vid=download

    And now, we can add the fact that we want more results (20), and that the location should have a non-null population:

    Any N, E, P, LA, LO ORDERBY E DESC LIMIT 20  WHERE X is Location,
           X latitude LA, X longitude LO,
           X elevation E, NOT X elevation NULL, X population P,
           X population > 0, X name N, X country C, C name "France"
    
    http://www.cubicweb.org/file/2124632?vid=download

    ... and on a map ...

    http://www.cubicweb.org/file/2124633?vid=download

    Conclusion

    In this blog, we have seen how CubicWeb could be used to store and query complex data, while providing (among other...) Web-based views for data vizualisation. It allows the user to directly query data within the URL and may be used to interact with and explore the data in depth. In a next blog, we will give more complex queries to show the full possibilities of the system.


  • Importing thousands of entities into CubicWeb within a few seconds with dataimport

    2011/12/09 by Adrien Di Mascio

    In most cubicweb projects I've been developing on, there always comes a time where I need to import legacy data in the new application. CubicWeb provides Store and Controller objects in the dataimport module. I won't talk here about the recommended general procedure described in the module's docstring (I find it a bit convoluted for simple cases) but I will focus on Store objects. Store objects in this module are more or less a thin layer around session objects, they provide high-level helpers such as create_entity(), relate() and keep track of what was inserted, errors occurred, etc.

    In a recent project, I had to create a somewhat fair amount (a few million) of simple entities (strings, integers, floats and dates) and relations. Default object store (i.e. cubicweb.dataimport.RQLObjectStore) is painfully slow, the reason being all integrity / security / metadata hooks that are constantly selected and executed. For large imports, dataimport also provides the cubicweb.dataimport.NoHookRQLObjectStore. This store bypasses all hooks and uses the underlying system source primitives directly, making it around two-times faster than the standard store. The problem is that we're still doing each sql query sequentially and we're talking here of millions of INSERT / UPDATE queries.

    My idea was to create my own ObjectStore class inheriting from NoHookRQLObjectStore that would try to use executemany or even copy_from when possible [1]. It is actually not hard to make groups of similar SQL queries since create_entity() generates the same query for a given set of parameters. For instance:

    create_entity('Person', firstname='John', surname='Doe')
    create_entity('Person', firstname='Tim', surname='BL')
    

    will generate the following sql queries:

    INSERT INTO cw_Person ( cw_cwuri, cw_eid, cw_modification_date,
                            cw_creation_date, cw_firstname, cw_surname )
           VALUES ( %(cw_cwuri)s, %(cw_eid)s, %(cw_modification_date)s,
                    %(cw_creation_date)s, %(cw_firstname)s, %(cw_surname)s )
    INSERT INTO cw_Person ( cw_cwuri, cw_eid, cw_modification_date,
                            cw_creation_date, cw_firstname, cw_surname )
           VALUES ( %(cw_cwuri)s, %(cw_eid)s, %(cw_modification_date)s,
                    %(cw_creation_date)s, %(cw_firstname)s, %(cw_surname)s )
    

    The only thing that will differ is the actual data inserted. Well ... ahem ... CubicWeb actually also generates a "few" extra sql queries to insert metadata for each entity:

    INSERT INTO is_instance_of_relation(eid_from,eid_to) VALUES (%s,%s)
    INSERT INTO is_relation(eid_from,eid_to) VALUES (%s,%s)
    INSERT INTO cw_source_relation(eid_from,eid_to) VALUES (%s,%s)
    INSERT INTO owned_by_relation ( eid_to, eid_from ) VALUES ( %(eid_to)s, %(eid_from)s )
    INSERT INTO created_by_relation ( eid_to, eid_from ) VALUES ( %(eid_to)s, %(eid_from)s )
    

    Those extra queries are actually even exactly the same for each entity insterted, whatever the entity type is, hence craving for executemany or copy_from. Grouping together SQL queries is not that hard [2] but has a drawback : as you don't have an intermediate state (the data is actually inserted only at the very end of the process), you loose the ability to query your database to fetch the entities you've just created during the import.

    Now, a few benchmarks ...

    To create those benchmarks, I decided to use the workorder cube which is a simple cube, yet complete enough : it provides only two entity types (WorkOrder and Order), a relation between them (Order split_into WorkOrder) and uses different kind of attributes (String, Date, Float).

    Once the cube was instantiated, I ran the following script to populate the database with my 3 different stores:

    import sys
    from datetime import date
    from random import choice
    from itertools import count
    
    from logilab.common.decorators import timed
    
    from cubicweb import cwconfig
    from cubicweb.dbapi import in_memory_repo_cnx
    
    def workorders_data(n, seq=count()):
        for i in xrange(n):
            yield {'title': u'wo-title%s' % seq.next(), 'description': u'foo',
                   'begin_date': date.today(), 'end_date': date.today()}
    
    def orders_data(n, seq=count()):
        for i in xrange(n):
            yield {'title': u'o-title%s' % seq.next(), 'date': date.today(), 'budget': 0.8}
    
    def split_into(orders, workorders):
        for workorder in workorders:
            yield choice(orders), workorder
    
    def initial_state(session, etype):
        return session.execute('Any S WHERE S is State, WF initial_state S, '
                               'WF workflow_of ET, ET name %(etn)s', {'etn': etype})[0][0]
    
    
    @timed
    def populate(store, nb_workorders, nb_orders, set_state=False):
        orders = [store.create_entity('Order', **attrs)
                  for attrs in orders_data(nb_orders)]
        workorders = [store.create_entity('WorkOrder', **attrs)
                      for attrs in workorders_data(nb_workorders)]
        ## in_state is set by a hook, so NoHookObjectStore will need
        ## to set the relation manually
        if set_state:
            order_state = initial_state(store.session, 'Order')
            workorder_state = initial_state(store.session, 'WorkOrder')
            for order in orders:
                store.relate(order.eid, 'in_state', order_state)
            for workorder in workorders:
                store.relate(workorder.eid, 'in_state', workorder_state)
        for order, workorder in split_into(orders, workorders):
            store.relate(order.eid, 'split_into', workorder.eid)
        store.commit()
    
    
    if __name__ == '__main__':
        config = cwconfig.instance_configuration(sys.argv[1])
        nb_orders = int(sys.argv[2])
        nb_workorders = int(sys.argv[3])
        repo, cnx = in_memory_repo_cnx(config, login='admin', password='admin')
        session = repo._get_session(cnx.sessionid)
        from cubicweb.dataimport import RQLObjectStore, NoHookRQLObjectStore
        from cubes.mycube.dataimport.store import CopyFromRQLObjectStore
        print 'testing RQLObjectStore'
        store = RQLObjectStore(session)
        populate(store, nb_workorders, nb_orders)
        print 'testing NoHookRQLObjectStore'
        store = NoHookRQLObjectStore(session)
        populate(store, nb_workorders, nb_orders, set_state=True)
        print 'testing CopyFromRQLObjectStore'
        store = CopyFromRQLObjectStore(session)
    

    I ran the script and asked to create 100 Order entities, 1000 WorkOrder entities and to link each created WorkOrder to a parent Order

    adim@esope:~/tmp/bench_cwdi$ python bench_cwdi.py bench_cwdi 100 1000
    testing RQLObjectStore
    populate clock: 24.590000000 / time: 46.169721127
    testing NoHookRQLObjectStore
    populate clock: 8.100000000 / time: 25.712352991
    testing CopyFromRQLObjectStore
    populate clock: 0.830000000 / time: 1.180006981
    

    My interpretation of the above times is :

    • The clock time indicates the time spent on CubicWeb server side (i.e. hooks and data pre/postprocessing around SQL queries). The time time should be the sum of clock time + time spent in postgresql.
    • RQLObjectStore is slow ;-). Nothing new here, but the clock/time ratio means that we're speding a lot of time on the python side (i.e. hooks as I told earlier) and a fair amount of time in postgresql.
    • NoHookRQLObjectStore really takes down the time spent on the python side, the time in postgresql remains about the same as for RQLObjectStore, this is not surprising, queries performed are the same in both cases.
    • CopyFromRQLObjectStore seems blazingly fast in comparison (inserting a few thousands of elements in postgresql with a COPY FROM statement is not a problem). And ... yes, I checked the data was actually inserted, and I even a ran a cubicweb-ctl db-check on the instance afterwards.

    This probably opens new perspective for massive data imports since the client API remains the same as before for the programmer. It's still a bit experimental, can only be used for "dummy", brute-force import scenario where you can preprocess your data in Python before updating the database, but it's probably worth having such a store in the the dataimport module.

    [1]The idea is to promote an executemany('INSERT INTO ...', data) statement into a COPY FROM whenever possible (i.e. simple data types, easy enough to escape). In that case, the underlying database and python modules have to provide support for this functionality. For the record, the psycopg2 module exposes a copy_from() method and soon logilab-database will provide an additional high-level helper for this functionality (see this ticket).
    [2]The code will be posted later or even integrated into CubicWeb at some point. For now, it requires a bit of monkey patching around one or two methods in the source so that SQL is not executed but just recorded for later executions.

  • Reusing OpenData from Data.gouv.fr with CubicWeb in 2 hours

    2011/12/07 by Vincent Michel

    Data.gouv.fr is great news for the OpenData movement!

    Two days ago, the French government released thousands of data sets on http://data.gouv.fr/ under an open licensing scheme that allows people to access and play with them. Thanks to the CubicWeb semantic web framework, it took us only a couple hours to put some of that open data to good use. Here is how we mapped the french railway system.

    http://www.cubicweb.org/file/2110281?vid=download

    Train stations in french Britany

    Source Datasets

    We used two of the datasets available on data.gouv.fr:

    • Train stations : description of the 6442 train stations in France, including their name, type and geographic coordinates. Here is a sample of the file

      441000;St-Germain-sur-Ille;Desserte Voyageur;48,23955;-1,65358
      441000;Montreuil-sur-Ille;Desserte Voyageur-Infrastructure;48,3072;-1,6741
      
    • LevelCrossings : description of the 18159 level crossings on french railways, including their type and location. Here is a sample of the file

      558000;PN privé pour voitures avec barrières sans passage piétons accolé;48,05865;1,60697
      395000;PN privé pour voitures avec barrières avec passage piétons accolé public;;48,82544;1,65795
      

    Data Model

    Given the above datasets, we wrote the following data model to store the data in CubicWeb:

    class Location(EntityType):
        name = String(indexed=True)
        latitude = Float(indexed=True)
        longitude = Float(indexed=True)
        feature_type = SubjectRelation('FeatureType', cardinality='?*')
        data_source = SubjectRelation('DataGovSource', cardinality='1*', inlined=True)
    
    class FeatureType(EntityType):
        name = String(indexed=True)
    
    class DataGovSource(EntityType):
        name = String(indexed=True)
        description = String()
        uri = String(indexed=True)
        icon = String()
    

    The Location object is used for both train stations and level crossings. It has a name (text information), a latitude and a longitude (numeric information), it can be linked to multiple FeatureType objects and to a DataGovSource. The FeatureType object is used to store the type of train station or level crossing and is defined by a name (text information). The DataGovSource object is defined by a name, a description and a uri used to link back to the source data on data.gouv.fr.

    http://www.cubicweb.org/file/2110311?vid=download

    Schema of the data model

    Data Import

    We had to write a few lines of code to benefit from the massive data import feature of CubicWeb before we could load the content of the CSV files with a single command:

    $ cubicweb-ctl import-datagov-location datagov_geo gare.csv-fr.CSV  --source-type=gare
    $ cubicweb-ctl import-datagov-location datagov_geo passage_a_niveau.csv-fr.CSV  --source-type=passage
    

    In less than a minute, the import was completed and we had:

    • 2 DataGovSource objects, corresponding to the two data sets,
    • 24 FeatureType objects, corresponding to the different types of locations that exist (e.g. Non exploitée, Desserte Voyageur, PN public isolé pour piétons avec portillons or PN public pour voitures avec barrières gardé avec passage piétons accolé manoeuvré à distance),
    • 24601 Locations, corresponding to the different train stations and level crossings.

    Data visualization

    CubicWeb allows to build complex applications by assembling existing components (called cubes). Here we used a cube that wraps the Mapstraction and the OpenLayers libraries to display information on maps using data from OpenStreetMap.

    In order for the Location type defined in the data model to be displayable on a map, it is sufficient to write the following adapter:

    class IGeocodableAdapter(EntityAdapter):
          __regid__ = 'IGeocodable'
          __select__ = is_instance('Location')
          @property
          def latitude(self):
              return self.entity.latitude
          @property
          def longitude(self):
              return self.entity.longitude
    

    That was it for the development part! The next step was to use the application to browse the structure of the french train network on the map.

    Train stations in use:

    http://www.cubicweb.org/file/2110279?vid=download

    Train stations not in use:

    http://www.cubicweb.org/file/2110280?vid=download

    Zooming on some parts of the map, for example Brittany, we get to see more details and clicking on the train icons gives more information on the corresponding Location.

    Train stations in use:

    http://www.cubicweb.org/file/2110281?vid=download

    Train stations not in use:

    http://www.cubicweb.org/file/2110282?vid=download

    Since CubicWeb separates querying the data and displaying the result of a query, we can switch the view to display the same data in tables or to export it back to a CSV file.

    http://www.cubicweb.org/file/2110313?vid=download

    Querying Data

    CubicWeb implements a query langage very similar to SPARQL, that makes the data available without the need to learn a specific API.

    • Example 1: http:/some.url.demo/?rql=Any X WHERE X is Location, X name LIKE "%miny"

      This request gives all the Location with a name that ends with "miny". It returns only one element, the Firminy train station.

    http://www.cubicweb.org/file/2110286?vid=download
    • Example 2: http:/some.url.demo/?rql=Any X WHERE X is Location, X name LIKE "%ny"

      This request gives all the Location with a name that ends with "ny", and return 112 trainstations.

    http://www.cubicweb.org/file/2110287?vid=download
    • Example 3: http:/some.url.demo/?rql=Any X WHERE X latitude < 47.8, X latitude>47.6, X longitude >-1.9, X longitude<-1.8

      This request gives all the Location that have a latitude between 47.6 and 47.8, and a longitude between -1.9 and -1.8.

      We obtain 11 Location (9 levelcrossings and 2 trainstations). We can map them using the view mapstraction.map that we describe previously.

      http://www.cubicweb.org/file/2110288?vid=download
    • Example 4: http:/domainname:8080/?rql=Any X WHERE X latitude < 47.8, X latitude>47.6, X longitude >-1.9, X longitude<-1.8, X feature_type F, F name "Desserte Voyageur"

      Will limit the previous results set to train stations that are used for passenger service:

      http://www.cubicweb.org/file/2110289?vid=download
    • Example 5: http:/domainname:8080/?rql=Any X WHERE X feature_type F, F name "PN public pour voitures sans barrières sans SAL"&vid=mapstraction.map

      Finally, one can map all the level crossings for vehicules without barriers (there are 3704):

      http://www.cubicweb.org/file/2110290?vid=download http://www.cubicweb.org/file/2110291?vid=download

    As you could see in the last URL, the map view was chosen directly with the parameter vid, meaning that the URL is shareable and can be easily included in a blog with a iframe for example.

    Data sharing

    The result of a query can also be "displayed" in RDF, thus allowing users to download a semantic version of the information, without having to do the preprocessing themselves:

    <rdf:Description rdf:about="cwuri24684b3a955d4bb8830b50b4e7521450">
      <rdf:type rdf:resource="http://ns.cubicweb.org/cubicweb/0.0/Location"/>
      <cw:cw_source rdf:resource="http://some.url.demo/"/>
      <cw:longitude rdf:datatype="http://www.w3.org/2001/XMLSchema#float">-1.89599</cw:longitude>
      <cw:latitude rdf:datatype="http://www.w3.org/2001/XMLSchema#float">47.67778</cw:latitude>
      <cw:feature_type rdf:resource="http://some.url.demo/7222"/>
      <cw:data_source rdf:resource="http://some.url.demo/7206"/>
    </rdf:Description>
    

    Conclusion

    For someone who knows the CubicWeb framework, a couple hours are enough to create a CubicWeb application that stores, displays, queries and shares data downloaded from http://www.data.gouv.fr/

    The full source code for the above will be released before the end of the week.

    If you want to see more of CubicWeb in action, browse http://data.bnf.fr or learn how to develop your own application at http://docs.cubicweb.org/


  • Roundup of "Powered by Cubicweb" websites

    2011/11/15 by Arthur Lutz

    Here is a (incomplete) list of public websites powered by Cubicweb. A lot of CubicWeb technology is used for private web applications in large companies that we can not list here.

    Demos are listed here : http://www.cubicweb.org/card/demo

    You can also find a list of the companies providing services for Cubicweb (with a few extra examples) : https://www.cubicweb.org/card/CubicWebServiceProviders


  • What's new in CubicWeb 3.14?

    2011/11/10 by Sylvain Thenault

    The development of CubicWeb 3.14 was rather long and included a lot of API changes detailed here. As usual backward compatibility is provided for public APIs.

    Please note this release depends on yams 0.34 (which is incompatible with prior cubicweb releases regarding instance re-creation).

    API changes

    • Entity.fetch_rql the restriction argument has been deprecated and should be replaced with a call to the new Entity.fetch_rqlst method, get the returned value (a rql Select node) and use the RQL syntax tree API to include the above-mentioned restrictions.

      Backward compat is kept with proper warning.

    • Entity.fetch_order and Entity.fetch_unrelated_order class methods have been replaced by Entity.cw_fetch_order and Entity.cw_fetch_unrelated_order with a different prototype:

      • instead of taking (attr, var) as two string argument, they now take (select, attr, var) where select is the rql syntax tree being constructed and var the variable node.
      • instead of returning some string to be inserted in the 'ORDERBY' clause, it has to modify the syntax tree

      Backward compat is kept with proper warning, except if:

      • custom order method returns something else the a variable name with or without the sorting order (e.g. cases where you sort on the value of a registered procedure as it was done in the tracker for instance). In such case, an error is logged telling that this sorting is ignored until API upgrade.
      • client code uses direct access to one of those methods on an entity (no code known to do that).
    • Entity._rest_attr_info class method has been renamed to Entity.cw_rest_attr_info

      No backward compat since this is a protected method an no code is known to use it outside cubicweb itself.

    • AnyEntity.linked_to has been removed as part of a refactoring of this functionality (link a entity to another one at creation step). It was replaced by a EntityFieldsForm.linked_to property.

      In the same refactoring, cubicweb.web.formfield.relvoc_linkedto, cubicweb.web.formfield.relvoc_init and cubicweb.web.formfield.relvoc_unrelated were removed and replaced by RelationField methods with the same names, that take a form as a parameter.

      No backward compatibility yet. It's still time to cry for it. Cubes known to be affected: tracker, vcsfile, vcreview.

    • CWPermission entity type and its associated require_permission relation type (abstract) and require_group relation definitions have been moved to a new localperms cube. Some functions from the cubicweb.schemas package as well as some views where moved too. This makes cubicweb itself smaller while you get all the local permissions stuff into a single and documented place.

      Backward compat is kept for existing instances, though you should have installed the localperms cubes. A proper error should be displayed when trying to migrate to 3.14 an instance the use CWPermission without the new cube installed. For new instances / test, you should add a dependancy on the new cube in cubes using this feature, along with a dependancy on cubicweb >= 3.14.

    • jQuery has been updated to 1.6.4 and jquery-tablesorter to 2.0.5. No backward compat issue known.

    • Table views refactoring : new RsetTableView and EntityTableView, as well as rewritten an enhanced version of PyValTableView on the same bases, with logic moved to some column renderers and a layout. Those should be well documented and deprecates former TableView, EntityAttributesTableView and CellView, which are however kept for backward compat, with some warnings that may not be very clear unfortunatly (you may see your own table view subclass name here, which doesn't make the problem that clear). Notice that _cw.view('table', rset, *kwargs) will be routed to the new RsetTableView or to the old TableView depending on given extra arguments. See #1986413.

    • display_name don't call .lower() anymore. This may leads to changes in your user interface. Different msgid for upper/lower cases version of entity type names, as this is the only proper way to handle this with some languages.

    • IEditControlAdapter has been deprecated in favor of EditController overloading, which was made easier by adding dedicated selectors called match_edited_type and match_form_id.

    • Pre 3.6 API backward compat has been dropped, though data migration compatibility has been kept. You may have to fix errors due to old API usage for your instance before to be able to run migration, but then you should be able to upgrade even a pre 3.6 database.

    • Deprecated cubicweb.web.views.iprogress in favor of new iprogress cube.

    • Deprecated cubicweb.web.views.flot in favor of new jqplot cube.

    Unintrusive API changes

    • Refactored properties forms (eg user preferences and site wide properties) as well as pagination components to ease overridding.

    • New cubicweb.web.uihelper module with high-level helpers for uicfg.

    • New anonymized_request decorator to temporary run stuff as an anonymous user, whatever the currently logged in user.

    • New 'verbatimattr' attribute view.

    • New facet and form widget for Integer used to store binary mask.

    • New js_href function to generated proper javascript href.

    • match_kwargs and match_form_params selectors both accept a new once_is_enough argument.

    • printable_value is now a method of request, and may be given dict of formatters to use.

    • [Rset]TableView allows to set None in 'headers', meaning the label should be fetched from the result set as done by default.

    • Field vocabulary computation on entity creation now takes __linkto information into accounet.

    • Started a cubicweb.pylintext pylint plugin to help pylint analyzing cubes: you should now use

      pylint --load-plugins=cubicweb.pylintext
      

      to analyse your cubicweb code.

    RQL

    User interface changes

    • Datafeed source now present an history of the latest import's log, including global status and debug/info/warning/error messages issued during imports. Import logs older than a configurable amount of time are automatically deleted.
    • Breadcrumbs component is properly kept when creating an entity with '__linkto'.
    • users and groups management now really lead to that (i.e. includes groups management).
    • New 'jsonp' controller with 'jsonexport' and 'ejsonexport' views.

    Configuration

    • Added option 'resources-concat' to make javascript/css files concatenation optional, making JS debugging a lot easier when needed.

    As usual, the 3.14 also includes a bunch of other minor changes, and bug fixes, though this time an effort has been done so that every API changes / new API should be listed here. Please download and install CubicWeb 3.14 and report any problem on the tracker and/or the mailing-list!

    Enjoy!


  • ensure that 2 boolean attributes of an entity never have the same value

    2011/09/08

    I want to implement an entity with 2 boolean attributes, and a requirement is that these two attributes never have the same boolean value (think of some kind of radio buttons).

    Let's start with a simple schema example:

    # in schema.py
    class MyEntity(EntityType):
       use_option1 = Boolean(required=True, default=True)
       use_option2 = Boolean(required=True, default=False)
    

    So new entities will be conform to the spec.

    To do this, you need two things:

    • a constraint in the entity schema which will ring if both attributes have the same value
    • a hook which will toggle the other attribute when one attribute is changed.

    RQL constraints are generally meant to be used on relations, but you can use them on attributes too. Simply use 'S' to denote the entity, and write the constraint normally. You need to have the same constraint on both attributes, because the constraint evaluation is triggered by the modification of the attribute.

    # in schema.py
    class MyEntity(EntityType):
       use_option1 = Boolean(required=True, default=True,
                             constraints = [
                                  RQLConstraint('S use_option1 O1, S use_option2 != O1')
                                           ])
       use_option2 = Boolean(required=True, default=False,
                             constraints = [
                                  RQLConstraint('S use_option1 O1, S use_option2 != O1')
                                           ])
    

    With this update, it is no longer possible to have both options set to True or False (you will get a ValidationError). The nice thing to have is to get the other option to be updated when one of the two attributes is changed, which means that you don't have to take care of this when editing the entity in the web interface (which you cannot do anyway if you are using reledit for instance).

    A nice way of writing the hook is to use Python's sets to avoid tedious logic code:

    class RadioButtonUpdateHook(Hook):
       '''ensure use_option1 = not use_option2 (and conversely)'''
       __regid__ = 'mycube.radiobuttonhook'
       events = ('before_update_entity', 'before_add_entity')
       __select__ = Hook.__select__ & is_instance('MyEntity')
       # we prebuild the set of boolean attribute names
       _flag_attributes = set(('use_option1', 'use_option2'))
       def __call__(self):
           entity = self.entity
           edited = set(entity.cw_edited)
           attributes = self._flag_attributes
           if attributes.issubset(edited):
               # both were changed, let the integrity hooks do their job
               return
           if not attributes & edited:
               # none of our attributes where changed, do nothing
               return
           # find which attribute was modified
           modified_set = attributes & edited
           # find the name of the other attribute
           to_change = (attributes - modified_set).pop()
           modified_name = modified_set.pop()
           # set the value of that attribute
           entity.cw_edited[to_change] = not entity.cw_edited[modified_name]
    

    That's it!


  • What's new in CubicWeb 3.13?

    2011/07/21 by Sylvain Thenault

    CubicWeb 3.13 has been developed for a while and includes some cool stuff:

    • generate and handle Apache's modconcat compatible URLs, to minimize the number of HTTP requests necessary to retrieve JS and CSS files, along with a new cubicweb-ctl command to generate a static 'data' directory that can be served by a front-end instead of CubicWeb
    • major facet enhancements:
      • nicer layout and visual feedback when filtering is in-progress
      • new RQLPathFacet to easily express new filters that are more than one hop away from the filtered entities
      • a more flexibile API, usable in cases where it wasn't previously possible
    • some form handling refactorings and cleanups, notably introduction of a new method to process posted content, and updated documentation
    • support for new base types : BigInt, TZDateTime and TZTime (in 3.12 actually for those two)
    • write queries optimization, and several RQL fixes on complex queries (e.g. using HAVING, sub-queries...), as well as new support for CAST() function and REGEXP operator
    • datafeed source and default CubicWeb xml parsers:
      • refactored into smaller and overridable chunks
      • easier to configure
      • make it work

    As usual, the 3.13 also includes a bunch of other minor enhancements, refactorings and bug fixes. Please download and install CubicWeb 3.13 and report any problem on the tracker and/or the mailing-list!

    Enjoy!


  • CubicWeb sprint in Paris / Need for Speed

    2011/03/22 by Adrien Di Mascio

    Logilab is hosting a CubicWeb sprint - 3 days in our Paris offices.

    The general focus will be on speed :

    • on cubicweb-server side : improve performance of massive insertions / deletions
    • on cubicweb-client side : cache implementation, HTTP server, massive parallel usage, etc.

    This sprint will take place from in April 2011 from tuesday the 26th to thursday the 28th. You are more than welcome to come along and help out, contribute, but unlike previous sprints, at least basic knowledge of CubicWeb will be required for participants since no introduction is planned.

    Network resources will be available for those bringing laptops.

    Address : 104 Boulevard Auguste-Blanqui, Paris. Ring "Logilab" (googlemap)

    Metro : Glacière

    Contact : http://www.logilab.fr/contact

    Dates : 26/04/2011 to 28/04/2011


  • What's new in CubicWeb 3.11?

    2011/02/18 by Sylvain Thenault

    Unlike recent major version of CubicWeb, the 3.11 doesn't come with many API changes or refactorings and introduces a fairly small set of new features. But those are important features!

    • 'pyrorql' sources mapping is now stored in the database instead of a python file in the instance's home. This eases the deployment and maintenance of distributed aplications.

    • A new 'datafeed' source was introduced, inspired by the soon to be deprecated datafeed cube. It needs polishing but sets the foundation for advanced semantic web applications that import content from others site using simple http request.

      A 'datafeed' source is associated to a parser that analyses the imported data and then creates/updates entities accordingly. There is currently a single parser in the core that imports CubicWeb-generated xml and needs to be configured with a mapping information that defines how relations are to be followed. It provides a viable alternative to 'pyrorql' sources. Other parsers to import RDF, RSS, etc should come soon.

      A new facet to filter entities based on the source they came from is now available.

    • The management interface for users, groups, sources and site preferences was simplified so it should be more intuitive to newbies (and others). Most items have been dropped from the user drop-down menu and the simpler views were made available through the '/manage' url.

    • The default 'index' / 'manage' view has been simplified to deprecate features that rely on external folder and card cubes. That's almost the only deprecation warning you'll get in upgrading to 3.11. Just this one won't hurt!

    • The old_calendar module has been dropped in favor of jQuery's fullcalendar powered views. That's a great news for applications using calendar features. Since it was added to the exising calendar module, you shouldn't have to change anything to get it working, unless you were using old_calendar in which case you may have to update a few things. This work was initiated by our mexican friends from Crealibre.

    As usual, the 3.11 also includes a bunch of other minor enhancements, refactorings and bug fixes. Please download and install CubicWeb 3.11 and report any problem to the mailing-list!

    Enjoy!


  • A simple scalable web server HA architecture suitable for medium sized projects

    2011/02/14 by Florent Cayré

    Having deployed and maintained several public medium sized web sites running CubicWeb when I worked at SecondWeb, I was asked by my friends from Logilab to write a blog post describing how we managed our deployment while working with the customer and the hosting company.

    Non technical (albeit important) considerations

    Customers that want to run such a medium traffic web site either tell you which hosting company they partner with, or ask you to find one, so you have no other choice to deal with an external hosting structure to manage the servers. I prefer this by the way because:

    1. High Availability (HA) hosting really requires skills and hardware that are neither common nor cheap;
    2. HA hosting requires 24/7/365 availability that SecondWeb could not (and did not even want to) offer.

    It is clearly difficult for all parties (try to put yourself in the shoes of the customer...) to manage a website with 3 partners involved, each with their own goals. From the development leader point of view, you will notice that the technical people of the hosting company continuously change and you keep seeing the same operational errors even if you provide and keep improving high quality documentation. The software upgrade documentation has to be particularly clear as it greatly influences the overall web site availability. You also have to keep an history of the interventions on the servers yourself and maintain an up-to-date copy of the configuration files.

    The overall architecture proposed here partly benefits from this experience with managed hosting company, in that we tried to keep it simple.

    Which traffic size ? Why not bigger ?

    The architecture proposed here has been successfully tested with sites delivering web pages to up to 2 millions unique visitors per month. It should scale further up depending on your site database access needs: if you need very fresh data and have a lot of write operations to the database, you will need to distribute database access amongst several servers, which is beyond the scope of this post.

    This is the main limitation of the proposed architecture and the reason why it is not well-suited for a bigger traffic.

    Design choices

    Load balancing - Preserve user sessions

    To achieve very high availability for your web site, you must have no single point of failure in the whole architecture, which can be far from reasonable from the costs point of view. However, hosting companies can share costs between their customers and have them benefit from a double network infrastructure all along the way from the Internet to your web servers, themselves hosted on two distant locations. You may then choose an even number of web servers, half of them hosted on each network infrastructure.

    The important thing is that you must preserve user sessions. As of CubicWeb 3.10, DB persistent sessions have not been implemented yet (it will soon, there is a ticket planned for this functionality), thus you must preserve session cookies by always directing a given user to the same web server, which is usually achieved by configuring the load balancer(s) in IP hash mode (it is faster than balancing on the session cookie, which implies reaching the http stack rather than staying at the TCP/IP level).

    Squid caching, processor load balancing

    Now if you have multi-processor web servers (which is very likely these times) you will need to use one CubicWeb application instance per processor or the Python GIL will limit the CPU of your application to a fraction of the available power. This is pretty easy, you just have to duplicate configuration directories from /etc/cubicweb.d, changing instance names and ports. You can use a simple sed-based script to generate these copies automatically and keep them in sync.

    Now that we have one instance per processor, the problem of preserving sessions is back. It can be elegantly solved using Squid, which can of course deliver cached objects (in particular images, more on this later), but also listen on several ports and distribute incoming requests evenly among the CubicWeb instances based on their port of origin. Note that the load balancer must be set up to balance between ports of the web servers, one port for each processor. The Squid configuration file to achieve this, looks like:

    http_port 81 defaultsite=www.example.org vhost
    acl portA myport 81
    
    http_port 82 defaultsite=www.example.org vhost
    acl portB myport 82
    
    acl site1 dstdomain www.example.org
    
    cache_peer 127.0.0.1 parent 8081 0 no-query originserver default name=server_1
    cache_peer_access server_1 allow portA site1
    cache_peer_access server_1 deny all
    
    cache_peer 127.0.0.1 parent 8082 0 no-query originserver default name=server_2
    cache_peer_access server_2 allow portB site1
    cache_peer_access server_2 deny all
    

    This is a way to setup Squid to listen to ports 81 and 82 and distribute requests for www.example.org to ports 8081 and 8082 respectively. This way, requests should be evenly balanced between the processors a on bi-processor web server.

    You can now setup Squid more classically to achieve what it is initially done for: caching. See Squid docs for this, particularly the refresh_pattern directive. Note you do not need to force any HTTP cache standard feature in Squid, as CubicWeb enables you to fine tune caching using simple HTTPCacheManager classes found in cubicweb/web/httpcache.py (at the end of this file, you will also find default cache manager configuration for the entity and startup views).

    CubicWeb with Apache frontend

    This is controversial but it did not hurt for me: I like to put an Apache frontend between Squid and the Twisted-based CubicWeb application, because the hosting companies are usually pretty good at setting it up, like to use server status for monitoring, mod_deflate for textual content compression, mod_rewrite and other modules to customize, monitor or fine tune the web servers.

    It can however be argued that Apache is a huge piece of software for such a restrictive usage, and its memory footprint would be better used for caching.

    No shared disk

    This is an interesting part that simplifies the overall setup: if you want to save data on disk, it is likely that you also want to keep it in sync between the web servers, or use a highly secure network storage solution.

    As we already have a data store accessible from the web servers, namely the database itself, I often choose to use it even for images. This looks like the nightmare of every sysadmin, but if you make sure the images are not fetched every second from the database, by using fine tuned cache settings, it will not hurt. And this way you still benefit from the flexibility of a database and the easier maintenance of a single data store. We can use CubicWeb cache settings to allow squid caching images for 1 hour for example. If you have a very dynamic web site however, you will then need to force a URL change when an image is edited. This can easily be achieved in CubicWeb using a custom edit controller that creates a new image when the data attribute of an Image instance was edited, as illustrated here:

    from cubicweb import typed_eid
    from cubicweb.selectors import yes
    from cubicweb.web.views.editcontroller import EditController
    
    
    class CustomEditController(EditController):
        __select__ = EditController.__select__ & yes()
    
        def handle_updated_image(self, old_eid):
            'modify submitted form to change old_eid into a new entity eid in all key/ values'
            old_eid = unicode(old_eid)
            form = self._cw.form
            new_eid = self._cw.varmaker.next()
            # handle image eid
            del form['__type:%s' % old_eid]
            form['__type:%s' % new_eid] = u'Image'
            # handle eid list
            index = form['eid'].index(old_eid)
            form['eid'] = form['eid'][:index] + [new_eid] + form['eid'][index+1:]
            # handle attribute and relations
            for (k, v) in form.iteritems():
                if v == old_eid:
                    form[k] = new_eid
                if k.endswith(u':%s' % old_eid):
                    form[k[:-len(old_eid)] + new_eid] = v
                    del form[k]
    
        def _default_publish(self):
            # implement image creation when data image was updated, so that we can use
            # a far expiry date cache on download view
            images = []
            for (k, v) in self._cw.form.iteritems():
                if v != 'Image' or not k.startswith('__type') or k == self._cw.form['__maineid']:
                    continue
                try:
                    eid = typed_eid(k[7:])
                except ValueError:
                    continue
                if self._cw.form.get('data-subject:%s' % eid, None):
                    self.handle_updated_image(eid)
                    images.append(eid)
            super(CustomEditController, self)._default_publish()
            for eid in images:
                self._cw.execute('DELETE Image I WHERE I eid %(eid)s', {'eid': eid})
    

    To add the 1 hour expiry date for image download view, you can use:

    from cubicweb.selectors import yes
    from cubicweb.web import httpcache
    from cubicweb.web.views.idownloadable import DownloadView
    
    class CustomDownloadView(DownloadView):
        __select__ = DownloadView.__select__ & yes()
        http_cache_manager = httpcache.MaxAgeHTTPCacheManager
        cache_max_age = 3600
    

    Database server

    Hosting companies now often have a pretty good knowledge of PostgreSQL, the favorite DB back end for CubicWeb. They usually propose to replicate the database for data safety at a low cost, using PostgreSQL log shipping feature. Note that new PostgreSQL 9 versions should make it easier to setup replication modes that could be useful to improve performance and scalability, but there is still a lack of production level experience for the moment. Please share if you have, because it is the main issue to deal with to scale up further.

    Pre-production

    This is worth mentioning you need a pre-production server hosted by the same company on the same hardware (or virtual machine), because:

    • software upgrade will run smoother if the technical staff of the hosting company has already performed the same upgrade operation once: check the same person does both within a short timeframe if possible;
    • you will feel better if your migration scripts have successfully run on a fresh copy of the production data: ask for a db copy before a pre-production upgrade; this is much easier to do if you do not have to copy the database dumps remotely.
    • the pre-production server can host its own database server and the replication of the production one.

    Monitoring

    When you experience a web site downtime, it is much too late to take a look at the available monitoring. It is important to prepare the tools you need to diagnose a problem, get used to read the graphs and have the orders of magnitude of the values and their variations in mind.

    Even the simplest graphs, like CPU usage, need to be correctly interpreted. In a recent setup, I did not realize that only one CPU was used on a bi-pro server, delivering half the power it should... When you cannot access the machine and use top, you only see the information of the monitoring graphs, so you must know how to read them !

    Apart from the classical CPU, CPU load, (detailed) memory usage, and network traffic, ask for PostgreSQL, Squid, and Apache specific graphs (plug-ins for them are easy to find and install for classic monitoring solutions).

    For CubicWeb web sites, it is also worth setting up following views and use them for automatic alerts:

    • a software / db version consistency monitoring
    • a db pool size monitoring
    • a simple db connection check view
    • a view writing the server host name is not interesting for automatic alerts but to see on which server your IP is directed to: this is needed when you do not reproduce the behaviour the customer is complaining about...

    There are some classes I use for these tasks. Feel free to reuse and adapt them to your needs:

    from socket import gethostname
    
    from cubicweb.view import View
    
    
    class _MonitoringView(View):
        __abstract__ = True
        __select__ = yes()
        content_type = 'text/plain'
        templatable = False
    
    
    class PoolMonitoringView(_MonitoringView):
        __regid__ = 'monitor_pool'
    
        def call(self):
            repo = self._cw.cnx._repo
            max_pool = self._cw.vreg.config['connections-pool-size']
            percent = ((max_pool - repo._available_pools.qsize()) * 100.0) / max_pool
            self.w(u'%s%%' % percent)
    
    
    class DBMonitoringView(_MonitoringView):
        __regid__ = 'monitor_db'
    
        def call(self):
            try:
                count = self._cw.execute('Any COUNT(X) WHERE X is CWUser')[0][0]
                self.w(u'ServiceOK : %s users in DB' % count)
            except:
                self.w(u'ServiceKO')
    
    
    class VersionMonitoringView(_MonitoringView):
        __regid__ = 'monitor_version'
    
        def versions_text(self, versions):
            return u' | '.join(cube + u': ' + u'.'.join(unicode(x) for x in version)
                               for (cube, version) in versions)
    
        def call(self):
            config = self._cw.vreg.config
            vc_config = config.vc_config()
            db_config = [('cubicweb', vc_config.get('cubicweb', '?'))]
            fs_config = [('cubicweb', config.cubicweb_version())]
            for cube in sorted(config.cubes()):
                db_config.append((cube, vc_config.get(cube, '?')))
                try:
                    fs_version = config.cube_version(cube)
                except:
                    fs_version = '?'
                fs_config.append((cube, fs_version))
            db_config = self.versions_text(db_config)
            fs_config = self.versions_text(fs_config)
            if db_config == fs_config:
                self.w(u'ServiceOK : FS config %s == DB config %s' % (fs_config, db_config))
            else:
                self.w(u'ServiceKO : FS config %s !$ DB config %s' % (fs_config, db_config))
    
    
    class HostnameMonitoringView(_MonitoringView):
        __regid__ = 'monitor_hostname'
    
        def call(self):
            self.w(unicode(gethostname()))
    

    Sketch of the architecture and conclusion

    There is a sketch of the proposed architecture. Please comment on it and share your experience on the topic, I would be happy to learn your tips and tricks.

    I would conclude with an important remark regarding performance: a good scalable architecture is of great help to run a busy web site smoothly, however the performance boost you get by optimizing your software performance is usually worth it and must be seriously considered before any hardware upgrade, may it seem costly at first glance.

    /file/1521968?vid=download

  • Building my photos web site with CubicWeb part V: let's make it even more user friendly

    2011/01/24 by Sylvain Thenault

    We'll now see how to benefit from features introduced in 3.9 and 3.10 releases of CubicWeb

    Step 1: tired of the default look?

    OK... Now our site has its most desired features. But... I would like to make it look somewhat like my website. It is not www.cubicweb.org after all. Let's tackle this first!

    The first thing we can to is to change the logo. There are various way to achieve this. The easiest way is to put a logo.png file into the cube's data directory. As data files are looked at according to cubes order (CubicWeb resources coming last), that file will be selected instead of CubicWeb's one.

    Note

    As the location for static resources are cached, you'll have to restart your instance for this to be taken into account.

    Though there are some cases where you don't want to use a logo.png file. For instance if it's a JPEG file. You can still change the logo by defining in the cube's uiprops.py file:

    LOGO = data('logo.jpg')
    

    The uiprops machinery has been introduced in CubicWeb 3.9. It is used to define some static file resources, such as the logo, default Javascript / CSS files, as well as CSS properties (we'll see that later).

    Note

    This file is imported specifically by CubicWeb, with a predefined name space, containing for instance the data function, telling the file is somewhere in a cube or CubicWeb's data directory.

    One side effect of this is that it can't be imported as a regular python module.

    The nice thing is that in debug mode, change to a uiprops.py file are detected and then automatically reloaded.

    Now, as it's a photos web-site, I would like to have a photo of mine as background... After some trials I won't detail here, I've found a working recipe explained here. All I've to do is to override some stuff of the default CubicWeb user interface to apply it as explained.

    The first thing to to get the <img/> tag as first element after the <body> tag. If you know a way to avoid this by simply specifying the image in the CSS, tell me! The easiest way to do so is to override the HTMLPageHeader view, since that's the one that is directly called once the <body> has been written. How did I find this? By looking in the cubiweb.web.views.basetemplates module, since I know that global page layouts sits there. I could also have grep the "body" tag in cubicweb.web.views... Finding this was the hardest part. Now all I need is to customize it to write that img tag, as below:

    class HTMLPageHeader(basetemplates.HTMLPageHeader):
        # override this since it's the easier way to have our bg image
        # as the first element following <body>
        def call(self, **kwargs):
            self.w(u'<img id="bg-image" src="%sbackground.jpg" alt="background image"/>'
                   % self._cw.datadir_url)
            super(HTMLPageHeader, self).call(**kwargs)
    
    
    def registration_callback(vreg):
        vreg.register_all(globals().values(), __name__, (HTMLPageHeader))
        vreg.register_and_replace(HTMLPageHeader, basetemplates.HTMLPageHeader)
    

    As you may have guessed, my background image is in a background.jpg file in the cube's data directory, but there are still some things to explain to newcomers here:

    • The call method is there the main access point of the view. It's called by the view's render method. It is not the only access point for a view, but this will be detailed later.
    • Calling self.w writes something to the output stream. Except for binary views (which do not generate text), it must be passed an Unicode string.
    • The proper way to get a file in data directory is to use the datadir_url attribute of the incoming request (e.g. self._cw).

    I won't explain again the registration_callback stuff, you should understand it now! If not, go back to previous posts in the series :)

    Fine. Now all I've to do is to add a bit of CSS to get it to behave nicely (which is not the case at all for now). I'll put all this in a cubes.sytweb.css file, stored as usual in our data directory:

    /* fixed full screen background image
     * as explained on http://webdesign.about.com/od/css3/f/blfaqbgsize.htm
     *
     * syt update: set z-index=0 on the img instead of z-index=1 on div#page & co to
     * avoid pb with the user actions menu
     */
    img#bg-image {
        position: fixed;
        top: 0;
        left: 0;
        width: 100%;
        height: 100%;
        z-index: 0;
    }
    
    div#page, table#header, div#footer {
        background: transparent;
        position: relative;
    }
    
    /* add some space around the logo
     */
    img#logo {
        padding: 5px 15px 0px 15px;
    }
    
    /* more dark font for metadata to have a chance to see them with the background
     *  image
     */
    div.metadata {
        color: black;
    }
    

    You can see here stuff explained in the cited page, with only a slight modification explained in the comments, plus some additional rules to make things somewhat cleaner:

    • a bit of padding around the logo
    • darker metadata which appears by default below the content (the white frame in the page)

    To get this CSS file used everywhere in the site, I have to modify the uiprops.py file introduced above:

    STYLESHEETS = sheet['STYLESHEETS'] + [data('cubes.sytweb.css')]
    

    Note

    sheet is another predefined variable containing values defined by already process uiprops.py file, notably the CubicWeb's one.

    Here we simply want our CSS in addition to CubicWeb's base CSS files, so we redefine the STYLESHEETS variable to existing CSS (accessed through the sheet variable) with our one added. I could also have done:

    sheet['STYLESHEETS'].append(data('cubes.sytweb.css'))
    

    But this is less interesting since we don't see the overriding mechanism...

    At this point, the site should start looking good, the background image being resized to fit the screen.

    http://www.cubicweb.org/file/1440508?vid=download

    The final touch: let's customize CubicWeb's CSS to get less orange... By simply adding

    contextualBoxTitleBg = incontextBoxTitleBg = '#AAAAAA'
    

    and reloading the page we've just seen, we know have a nice greyed box instead of the orange one:

    http://www.cubicweb.org/file/1440510?vid=download

    This is because CubicWeb's CSS include some variables which are expanded by values defined in uiprops file. In our case we controlled the properties of the CSS background property of boxes with CSS class contextualBoxTitleBg and incontextBoxTitleBg.

    Step 2: configuring boxes

    Boxes present to the user some ways to use the application. Let's first do a few user interface tweaks in our views.py file:

    from cubicweb.selectors import none_rset
    from cubicweb.web.views import bookmark
    from cubes.zone import views as zone
    from cubes.tag import views as tag
    
    # change bookmarks box selector so it's only displayed on startup views
    bookmark.BookmarksBox.__select__ = bookmark.BookmarksBox.__select__ & none_rset()
    # move zone box to the left instead of in the context frame and tweak its order
    zone.ZoneBox.context = 'left'
    zone.ZoneBox.order = 100
    # move tags box to the left instead of in the context frame and tweak its order
    tag.TagsBox.context = 'left'
    tag.TagsBox.order = 102
    # hide similarity box, not interested
    tag.SimilarityBox.visible = False
    

    The idea is to move all boxes in the left column, so we get more space for the photos. Now, serious things: I want a box similar to the tags box but to handle the Person displayed_on File relation. We can do this simply by adding a AjaxEditRelationCtxComponent subclass to our views, as below:

    from logilab.common.decorators import monkeypatch
    from cubicweb import ValidationError
    from cubicweb.web import uicfg, component
    from cubicweb.web.views import basecontrollers
    
    # hide displayed_on relation using uicfg since it will be displayed by the box below
    uicfg.primaryview_section.tag_object_of(('*', 'displayed_on', '*'), 'hidden')
    
    class PersonBox(component.AjaxEditRelationCtxComponent):
        __regid__ = 'sytweb.displayed-on-box'
        # box position
        order = 101
        context = 'left'
        # define relation to be handled
        rtype = 'displayed_on'
        role = 'object'
        target_etype = 'Person'
        # messages
        added_msg = _('person has been added')
        removed_msg = _('person has been removed')
        # bind to js_* methods of the json controller
        fname_vocabulary = 'unrelated_persons'
        fname_validate = 'link_to_person'
        fname_remove = 'unlink_person'
    
    
    @monkeypatch(basecontrollers.JSonController)
    @basecontrollers.jsonize
    def js_unrelated_persons(self, eid):
        """return tag unrelated to an entity"""
        rql = "Any F + ' ' + S WHERE P surname S, P firstname F, X eid %(x)s, NOT P displayed_on X"
        return [name for (name,) in self._cw.execute(rql, {'x' : eid})]
    
    
    @monkeypatch(basecontrollers.JSonController)
    def js_link_to_person(self, eid, people):
        req = self._cw
        for name in people:
            name = name.strip().title()
            if not name:
                continue
            try:
                firstname, surname = name.split(None, 1)
            except:
                raise ValidationError(eid, {('displayed_on', 'object'): 'provide <first name> <surname>'})
            rset = req.execute('Person P WHERE '
                               'P firstname %(firstname)s, P surname %(surname)s',
                               locals())
            if rset:
                person = rset.get_entity(0, 0)
            else:
                person = req.create_entity('Person', firstname=firstname,
                                                surname=surname)
            req.execute('SET P displayed_on X WHERE '
                        'P eid %(p)s, X eid %(x)s, NOT P displayed_on X',
                        {'p': person.eid, 'x' : eid})
    
    @monkeypatch(basecontrollers.JSonController)
    def js_unlink_person(self, eid, personeid):
        self._cw.execute('DELETE P displayed_on X WHERE P eid %(p)s, X eid %(x)s',
                         {'p': personeid, 'x': eid})
    

    You basically subclass to configure with some class attributes. The fname_* attributes give the name of methods that should be defined on the json control to make the AJAX part of the widget work: one to get the vocabulary, one to add a relation and another to delete a relation. These methods must start by a js_ prefix and are added to the controller using the @monkeypatch decorator. In my case, the most complicated method is the one which adds a relation, since it tries to see if the person already exists, and else automatically create it, assuming the user entered "firstname surname".

    Let's see how it looks like on a file primary view:

    http://www.cubicweb.org/file/1440509?vid=download

    Great, it's now as easy for me to link my pictures to people than to tag them. Also, visitors get a consistent display of these two pieces of information.

    Note

    The ui component system has been refactored in CubicWeb 3.10, which also introduced the AjaxEditRelationCtxComponent class.

    Step 3: configuring facets

    The last feature we'll add today is facet configuration. If you access to the '/file' url, you'll see a set of 'facets' appearing in the left column. Facets provide an intuitive way to build a query incrementally, by proposing to the user various way to restrict the result set. For instance CubicWeb proposes a facet to restrict based on who created an entity; the tag cube proposes a facet to restrict based on tags; the zoe cube a facet to restrict based on geographical location, and so on. In that gist, I want to propose a facet to restrict based on the people displayed on the picture. To do so, there are various classes in the cubicweb.web.facet module which simply have to be configured using class attributes as we've done for the box. In our case, we'll define a subclass of RelationFacet.

    Note

    Since that's ui stuff, we'll continue to add code below to our views.py file. Though we begin to have a lot of various code their, so it's may be a good time to split our views module into submodules of a view package. In our case of a simple application (glue) cube, we could start using for instance the layout below:

    views/__init__.py   # uicfg configuration, facets
    views/layout.py     # header/footer/background stuff
    views/components.py # boxes, adapters
    views/pages.py      # index view, 404 view
    
    from cubicweb.web import facet
    
    class DisplayedOnFacet(facet.RelationFacet):
        __regid__ = 'displayed_on-facet'
        # relation to be displayed
        rtype = 'displayed_on'
        role = 'object'
        # view to use to display persons
        label_vid = 'combobox'
    

    Let's say we also want to filter according to the visibility attribute. This is even simpler as we just have to derive from the AttributeFacet class:

    class VisibilityFacet(facet.AttributeFacet):
        __regid__ = 'visibility-facet'
        rtype = 'visibility'
    

    Now if I search for some pictures on my site, I get the following facets available:

    http://www.cubicweb.org/file/1440517?vid=download

    Note

    By default a facet must be applyable to every entity in the result set and provide at leat two elements of vocabulary to be displayed (for instance you won't see the created_by facet if the same user has created all entities). This may explain why you don't see yours...

    Conclusion

    We started to see the power behind the infrastructure provided by the framework, both on the pure ui (CSS, Javascript) side and on the Python side (high level generic classes for components, including boxes and facets). We now have, with a few lines of code, a full-featured web site with a personalized look.

    Of course we'll probably want more as time goes, but we can now concentrate on making good pictures, publishing albums and sharing them with friends...


  • CubicWeb sprint in Paris on january 19/20/21 2011

    2010/12/03 by Sylvain Thenault
    http://farm1.static.flickr.com/183/419945378_4ead41a76d_m.jpg

    Almost everything is in the title: we'll hold a CubicWeb sprint in our Paris office after the first French Semantic Web conference, so on 19, 20 and 21 of january 2011.

    The main topic will be to enhance newcomers experience in installing and using CubicWeb.

    If you wish to come, you're welcome, that's a great way to meet us, learn the framework and share thoughts about it. Simply contact us so we can check there is still some room available.

    photo by Sebastian Mary under creative commons licence.


  • HTML5 features presented at Paris Web 2010 by Paul Rouget

    2010/10/19 by Arthur Lutz

    While at Paris Web 2010 we were all impressed by the presentation and demos by Paul Rouget on HTML5 (tech evangelist must be a hard job!). Here is my take and a few URLs on the things that were presented.

    http://hacks.mozilla.org/wp-content/themes/Hacks2010/img/mozilla.png
    • Websockets with persistent connections between the server and the browser. That way you can avoid pulling information every 5 seconds, the server can tell the web page a new info is available. The immediate uses we have for this are :
      • realtime feed display
      • jabber web chat rooms
      • in cubicweb's forge : new comment indication on a ticket
      • in cubicweb in general : notification that the edited element has been openned by another user (instead of a lock mechanism)
      • real time collaborative editing (etherpad style functionality)
    • File upload demo : http://demos.hacks.mozilla.org/openweb/uploadingFiles/
    • File EXIF extraction, client side resize or geolocalisation http://demos.hacks.mozilla.org/openweb/FileAPI/ . That could be very cool for things such as resizing an image before it is sent to the server (you know, for your mother who doesn't know how to resize that 2 Mbytes photo before sending it to the site). Reference : https://developer.mozilla.org/en/Using_files_from_web_applications
    • Using File IO, you can do some heavy Drag'n'drop from your computer to your browser directly in the browser (yes, you can get rid of that nasty java applet). Apparently Google implemented in Chromium a non-standard drag'n'drop the other way around : from the web app to your desktop, which could be cool as well.
    http://farm5.static.flickr.com/4147/5085028912_173337f0ba.jpg
    • XHR - XMLHttpRequest. Usually this type of requests is not possible cross-domain. Now they will be (with an authorization mechanism). That way, you will be able to post and control websites from the page in your browser.
    • Audio Data API : you can now access & modify audio files directly in your browser (before uploading them server side). This makes me think of the first time I realized people where implementing traditionally "heavy" applications (photo editing, music editing, even movie edition) in web applications. I was (and still am) very surprised and skeptic, but this kind of evolution makes me believe that there can be a day when you don't even need to send massive files to the server to edit them.
    http://farm1.static.flickr.com/191/513636061_98d07f7966_t.jpg

    Admittedly, you probably need to see the thrilling presentation and demos to be tempted to go and dip into these technologies. Reading the documentation will probably not encourage you to go and code some cool new features.

    One of the things that the audience commented about at the end of the presentation is that there was still a huge lack of "authoring tools" for HTML5. For some coders that never leave vim or emacs, this is heresy, but we have to admit that the adoption of flash and silverlight (apparently) is very much driven by simple click'n'program tools.

    http://www.mozilla.org/images/minefield_168.png

    During the presentation, I used a Chrome 6 that I had lying around on my Ubuntu, but by the end of the presentation I had installed Firefox4 using the mozilla PPA

    sudo add-apt-repository ppa:ubuntu-mozilla-daily/ppa
    sudo apt-get update
    sudo apt-get -uVf install firefox-4.0
    

    The PPA version keeps config files separate so you can easily switch between your "standard" Firefox3 profile and the cutting edge Firefox4 (obviously the big downside is not having all your cool extensions).

    The only thing missing from the presentation was the code... a request I hope Paul will grant to the community (a bunch of tweets about that followed the presentation).


  • What's new in CubicWeb 3.10?

    2010/10/18 by Sylvain Thenault

    The 3.10 development started during August, with two important patches: one on the repository / entity API, another one on the boxes / content navigation components unification (more on this later). Then it somewhat came to a halt, as more work was done on other projects and to stabilize the 3.9 branch. We finally got back on it during September, adding several other major changes or enhancements.

    • Cleanup of the repository side entity API, i.e. the API you may use when writing hooks. Beside simple namespace cleanup (a few renamings), the API has been modified to move out attributes being edited from the read cache. So now:

      • entities do not inherit from dict anymore; access to the dict protocol on an entity will raise deprecation warnings
      • the attributes cache is now a cw_attr_cache dictionary on the entity
      • edited attributes are in a cw_edited attribute special object, which is only available in hooks for a modified entity (i.e. '[before|after]_[add|update]_entity', you should use the dict protocol on that object to get modified attributes or to modify what is edited (in 'before' hooks only, and this is now enforced). This deprecates the former edited_attributes attribute.
    • Unification of 'boxes' / 'contentnavigation' registries and base classes, into "contextual components" stored in the 'ctxcomponents' registry. This implied the introduction of "layout" objects which are appobjects responsible of displaying the components according to the context they are displayed in.

      This separation of content / layout and some css cleanups allows us to move former boxes and content components into each other's place in the user interface: for instance, go to your preferences pages and try to move the search box. You now have many more different locations available. Though one component may not go anywhere, so forthcoming releases should tweak this to avoid proposing dumb choices. But the hot stuff is there!

      Also, a cache has been set on the registry to avoid recomputing possible components for each context (place in the ui).

    • Upgraded jQuery and jQuery UI respectively to version 1.4.2 and 1.8. Removed jquery.autocomplete.js since jQuery UI provides its own autocomplete plugin. A cwautocomplete plugin was added in order to keep widgets as backward compatible as possible. If you used custom autocomplete feature, you should take a look at this guide.

    • The RelationFacet base class now automatically proposes to search for entities without the relation if this is allowed by the schema and if there are some in current results. Example: search for tickets which are not planned in a version.

    • Data sources have been modeled as CubicWeb entity type CWSource. The 'sources' file is still there but will now only contains definition of the system source, as well as default manager account login and password. This implied changes in instance initialization commands, introduction of a new 'add-source' command to cubicweb-ctl, as well as change in the repository startup. Also, on a multi-sources instance, we can now search using a facet on the cw_source relation (a new mandatory metadata relation on each entities) to filter according to the data source entities are coming from.

    • Although introduced during 3.9 releases, it's worth mentioning the new support for multi-columns unicity constraint through yams's __unique_together__ entity type attribute, allowing for unicity constraint enforced by the underlying database instead of CubicWeb hooks. This is limited and doesn't work in every configuration, but is a must have when running several distributed CubicWeb instance of the same application (hence database).

    Also as usual, the 3.10 includes a bunch of other minor enhancements, refactorings and bug fixes. Every introduced change should be backward compatible, except probably some minor ui details due to the css box simplification. That's it.

    So please download and install CubicWeb 3.10 and report us any problem on the mailing-list!

    Enjoy!


  • CubicWeb presentation at the JDLL (Lyon)

    2010/10/07 by Arthur Lutz

    For the "Journées Du Logiciel Libre (JDLL)" in Lyon which will take place the 14th, 15th et 16th of octobre 2010, we will be presenting the semantic side of CubicWeb on Friday 15th. There will be a talk and a tutorial. Details can be found here and there.

    If you're around, come and see us!

    http://www.jdll.org/sites/default/files/banniere.png

  • Debugging a memory leak in a cube

    2010/09/24

    We recently discovered that the cubicweb.org site (the one you are probably visiting right now) was suffering from a memory leak. The munin graphs showed a memory consumption steadily increasing soon after the instance was started, and this would only stop when all the memory on the host was exhausted. This was clearly caused by a memory leak somewhere, either in CubicWeb itself or in a cube used by the instance.

    Munin graphs showing the memory leak in cubicweb.org

    Fig. 1: Munin graphs showing the memory leak in cubicweb.org

    Notice the associated service downtimes, and the stabilized memory consumption on Sept 23, after the leak was fixed.

    Since Python has a garbage collector, either the leak was occuring in a C extension, or it was caused by some objects which were not garbage collectable. A common cause for the latter, as explained in the gc module documentation, are objects with a __del__ method which are part of a cycle.

    We used the "gc" view, which is an administrative view in CubicWeb, reachable by appending "?vid=gc" at the end of the url of the root of your instance, if you are a member of the managers group. This view uses the gc module from the python standard library to see which objects are not garbage collected.

    This view showed thousands of instances of mercurial.url.httphandler. This class indeed has a __del__ method and instances have a cycle with urllib2.OpenerDirector. Mercurial is used by the vcsfile cube which regularly polls remote repository over HTTP, which causes httphandler to be instantiated (and a reference to be leaked). This problem had gone undetected in mercurial because most of the time, processes using mercurial over http are shortlived and the leaked memory is quickly collected by the operating system. Discussion ensued on the IRC forum #mercurial with the developers and a patch was submitted which fixes the leak. In order to avoid the problem with versions of mercurial up to the current one, a new version of vcsfile including a monkey patch for mercurial was released and deployed on cubicweb.org!


  • We'll be attending the Paris Web 2010 conference

    2010/09/09 by Arthur Lutz

    A few of us from Logilab will be attending the Paris Web 2010 conference. This is the fifth edition of the conference and we're looking forward to some of the conferences. It's a bit of a shame that the web site of the conference does not offer the attendees the possibility of building your own schedule by virtually "registering" to certain talks. This is one of the cool things about cubicweb-conference, a cubicweb application that helps build websites for organizing and hosting conferences.

    http://www.cubicweb.org/file/1251255?vid=download

    We're glad there will be talks about the semantic web, accessibility, html5, css, javascript, etc...

    I recently discovered Lanyrd which claims to be a social conference directory, with your twitter account you can track or show that you will be attending a conference. Twittter and identi.ca can be a good way of following a conference that you cannot physically attend. Check out Paris Web 2010 on Lanyrd : http://lanyrd.com/2010/parisweb/


  • Summer CubicWeb/Narval Sprint - Final report

    2010/08/18 by Sylvain Thenault

    For that last sprint day, each team made some nice achievements:

    • Steph & Alain worked on the mv/cp actions implementation to makes them working properly and supporting globs. Last but not least, with a full set of tests.
    • Alex & Charles got back what we call apycot 'full' tests, eg running test with coverage enabled, checking that code coverage is greater than a given threshold, but also running pylint and checking that its global evaluation is at least 7 (configurable, of course).
    • Katia & Aurélien provided a sharp implementation of recipe checking, so that we know we don't launch a recipe badly constructed, as well as informing the user nicely from what errors his recipe suffer.
    • Julien managed to set up a recipe managing from Debian package construction to Debian repository publication, going through lintian on the way
    • Pierre-Yves helped other teams to solve the narval related bugs they encountered, and finished by writing a thread-safe implementation of apycot's writer so we can run several checker simultaneously.
    • Celso continued working on a proof of concept blue-theme cube, wondering how to make CubicWeb looks nicer and be easily customisable in future versions.
    • Sylvain helped there and there and integrated patches...

    So we finally didn't get up to the demo. But we now have everything to set it up, so I've a good hope that we will have a beta version of our brand new production chain up and running before the end of August!

    Thanks to everyone for all this good work, and for this time spent all together!


  • CubicWeb gets press coverage at SemanticWeb.com

    2010/08/15 by Nicolas Chauvat

    Following the presentation of CubicWeb at OSCON 2010 in July, the editor of SemanticWeb.com wrote an article describing the CubicWeb framwork. Read the article and ask your questions on the mailing list!


  • Summer CubicWeb/Narval Sprint - Day 4

    2010/08/13 by Pierre-Yves David

    In this fourth day of the our Summer Sprint important progress have been made.

    • Stéphanie and Alain cleaned up the Apycot's bot sources from deprecated code and rewrite part of the test suite to follow the new way to launch apycot. They cleaned up the handling of VCS sources for tested project taking advantages of the new mercurial cache for vcsfile implemented by Katia and Aurélien last Tuesday. This feature keep a local clone of the remote repository and allow much faster checkout during test runs.
    • Julien made significant progress in the writing of the Debian recipe. A recipes can now successfully build Debian packages of a project and validate them with lintian and lgp. He later paired with Pierre-Yves and they improved the annotation of Apycot's Narval variable to enhance Input validation in Apycot's Narval recipes. For example, the action building a Debian package will explicitly refuse to run on a project not yet checked-out.
    • Aurelien first paired with Pierre-Yves to improve some views and the consistency of the database schema, then he worked on a dashboard displaying various indicators useful to the version publishing process.
    • Pierre-Yves spent some time improving the ability of Narval to recover on errors and to display meaningful logs about them.
    • Alexandre and Charles finished the re-implementation of the full python recipe.They used options at the Narval level to run test suite with the coverage enabled and re-enabled the coverage checker to process the result, discovering some problems in Narval's engine on the way...
    • Celso finished Spanish translation of Cubicweb's core and started to work on a new css theme
    • Sylvain helped several groups along the day and reviewed patches from them.

  • Summer CubicWeb/Narval Sprint - Day 3

    2010/08/13

    CubicWeb/Narval Sprint is going on !

    The third day of our sprint focused on the following points:

    • Pierre-Yves worked to prevent duplicate test executions (eg running several time the same test with the same version configuration),
    • Celso has terminated the spanish translation of CubicWeb. He's now working on various cubes translation,
    • Stéphanie and Alain spent some time on the narval bot view. They also modified ProjectEnvironement's attributes in order to use similar information available on the vcsfile repository, hence simplifying the configuration (more to do on this!),
    • Julien worked on the debian package recipe,
    • Katia and Aurélien worked on recipe security (using CWPermission),
    • Alexandre and Charles produced a first template of a full test recipe using pyunit and pycoverage,
    • Finally, our captain, Sylvain, is at the helm !

    We'll hopefuly be able to present a functionnal demo at the end of the week.

    Narval/Cubicweb left off !


  • Summer CubicWeb/Narval Sprint - Day 2

    2010/08/11 by Katia Saurfelt

    During the second day of our Summer CubicWeb/Narval Sprint, several tasks started on the first day were completed and new tasks started:

    • Charles, Alexandre and Julien finished writing the "copy" and "move" Narval actions, and then started transforming existing apycot checkers into Narval actions.
    • Pierre-Yves managed to improve Narval reports with more explicit and relevant content.
    • Stéphanie and Alain finished the bot status view as well as the recipe graph view.
    • Katia and Aurélien finished writing the new mercurial cache solution for vcsfile and started improving the security of Narval recipes (i.e. who can start which recipe).
    • Celso kept on his life-long work of translating CubicWeb to Spanish.
    • Sylvain wrote some Narval views, improved Narval execution logs handling and kept on reviewing patches and helping various people...

  • Summer CubicWeb/Narval Sprint - Day 1

    2010/08/10 by Sylvain Thenault

    We started this first day by several presentations by Sylvain about Logilab's current development process workflow, and compared it to what it should be after the sprint. Sylvain also introduced Narval.

    We then set up a dev environment on everyone's computer: a working forge with a local Narval agent that can be used for tests during the week.

    Regarding more concrete tasks:

    • Charles and Alexandre started writing some basic Narval actions such as move, to move a file from a place to another, and had to grasp narval's concepts on the way.
    • Pierre-Yves dug into the code to understand how exceptions are propagated in the Narval engine, his goal was to get better reports.
    • Stéphanie and Alain worked on a nice bot status view.
    • Katia, Aurélien studied the new mercurial cache solution for vcsfile
    • Julien started some piece of documentation.
    • Celso, our Mexican friend, discovered some new features of recent cubicweb releases and setup his environment to later work on Spanish translation, CSS, etc.
    • Sylvain came with a basically working narval implementation on top of cubicweb, and spent the day helping various people...

  • Summer CubicWeb/Narval Sprint

    2010/08/10 by Sylvain Thenault

    Although this week is normally the regular annual holidays here at Logilab, some of us will sprint in Paris exceptionally.

    Focus

    We're starting this week with an exciting goal: integrating all our release process into our continuous integration suite (through the apycot cube). Including Debian repository management, pypi registration, etc...

    The hot stuff to achieve this is the third resurrection of Narval, the project Logilab was originaly based on, but this time it is built on top of CubicWeb framework. Narval will be used to rewrite some parts of apycot, in order to make it more flexible and powerful.

    It is not just a refactoring or a simple upgrade! We hope to automate common tasks, simplify maintenance, and thus enhance release quality, but also gain a lot of functionality in near future.

    Sprint roadmap

    • merge Apycotbot process manager into a new Narval incarnation, and rewrite it as Narval actions and recipes
    • improve vcsfile cube with a new cache system for mercurial
    • define Logilab's release process as new Narval recipes, triggered by actions such as adding release tag into the source repository

    More detailed stuff will come with the sprint reports that we'll try to issue each day.

    Information

    This sprint is taking place in Logilab's offices in Paris from Monday the 9th to the 13th of August 2010.


  • HOWTO change the value of a variable in all-in-one.conf with a migration script

    2010/08/03

    Here is a sample migration script (see also the cubicweb documentation on that topic) which changes the variable 'sender-addr'. There is an additional twist in that the variable is only updated if the instance is configured with a known value for that variable.

    wrong_addr = 'cubicweb@loiglab.fr' # known wrong address
    fixed_addr = 'cubicweb@logilab.fr'
    configured_addr = config.get('sender-addr')
    # check that the address has not been hand fixed by a sysadmin
    if configured_addr == wrong_addr:
        config['sender-addr'] = fixed-addr
        config.save()
    

    This is very useful in cases such as:

    • automatically changing the value of a variable which used a default value set by cubicweb-ctl create
    • changing the configuration of an instance with limited intervention from the local sysadmin (because asking him to hand edit the config file is error prone): he just has to deploy the new release and run cubicweb-ctl upgrade
    • fixing issues caused by settings in the all-in-one.conf file (e.g. changing the value of max-post-length)

show 133 results