
CubicWeb Blog

News about the framework and its uses.

What's new in CubicWeb 3.13?

2011/07/21 by Sylvain Thenault

CubicWeb 3.13 has been developed for a while and includes some cool stuff:

  • generate and handle Apache's modconcat compatible URLs, to minimize the number of HTTP requests necessary to retrieve JS and CSS files, along with a new cubicweb-ctl command to generate a static 'data' directory that can be served by a front-end instead of CubicWeb
  • major facet enhancements:
    • nicer layout and visual feedback when filtering is in-progress
    • new RQLPathFacet to easily express new filters that are more than one hop away from the filtered entities
    • a more flexible API, usable in cases where it wasn't previously possible
  • some form handling refactorings and cleanups, notably introduction of a new method to process posted content, and updated documentation
  • support for new base types : BigInt, TZDateTime and TZTime (in 3.12 actually for those two)
  • write queries optimization, and several RQL fixes on complex queries (e.g. using HAVING, sub-queries...), as well as new support for CAST() function and REGEXP operator
  • datafeed source and default CubicWeb xml parsers:
    • refactored into smaller and overridable chunks
    • easier to configure
    • make it work
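
To give an idea of what a modconcat-compatible URL looks like, here is a minimal sketch. It assumes the `??` separator convention of Apache's mod_concat; the `concat_url` helper, the base URL and the file names are hypothetical and are not part of the CubicWeb API.

```python
def concat_url(base_url, paths):
    """Build a single mod_concat-style URL for several static files.

    Assumes the mod_concat convention: '<base>??file1,file2,...'.
    """
    if not paths:
        raise ValueError('at least one path is required')
    return '%s??%s' % (base_url, ','.join(paths))


# One HTTP request instead of three (file names are made up).
url = concat_url('http://www.example.org/data/',
                 ['jquery.js', 'cubicweb.js', 'cubicweb.ajax.js'])
print(url)
```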

As usual, the 3.13 also includes a bunch of other minor enhancements, refactorings and bug fixes. Please download and install CubicWeb 3.13 and report any problem on the tracker and/or the mailing-list!

Enjoy!


CubicWeb sprint in Paris / Need for Speed

2011/03/22 by Adrien Di Mascio

Logilab is hosting a CubicWeb sprint - 3 days in our Paris offices.

The general focus will be on speed :

  • on cubicweb-server side : improve performance of massive insertions / deletions
  • on cubicweb-client side : cache implementation, HTTP server, massive parallel usage, etc.

This sprint will take place in April 2011, from Tuesday the 26th to Thursday the 28th. You are more than welcome to come along, help out and contribute, but unlike previous sprints, at least basic knowledge of CubicWeb will be required of participants since no introduction is planned.

Network resources will be available for those bringing laptops.

Address : 104 Boulevard Auguste-Blanqui, Paris. Ring "Logilab" (googlemap)

Metro : Glacière

Contact : http://www.logilab.fr/contact

Dates : 26/04/2011 to 28/04/2011


What's new in CubicWeb 3.11?

2011/02/18 by Sylvain Thenault

Unlike recent major versions of CubicWeb, the 3.11 doesn't come with many API changes or refactorings and introduces a fairly small set of new features. But those are important features!

  • 'pyrorql' sources mapping is now stored in the database instead of a python file in the instance's home. This eases the deployment and maintenance of distributed applications.

  • A new 'datafeed' source was introduced, inspired by the soon to be deprecated datafeed cube. It needs polishing but sets the foundation for advanced semantic web applications that import content from other sites using simple HTTP requests.

    A 'datafeed' source is associated to a parser that analyses the imported data and then creates/updates entities accordingly. There is currently a single parser in the core that imports CubicWeb-generated xml and needs to be configured with a mapping information that defines how relations are to be followed. It provides a viable alternative to 'pyrorql' sources. Other parsers to import RDF, RSS, etc should come soon.

    A new facet to filter entities based on the source they came from is now available.

  • The management interface for users, groups, sources and site preferences was simplified so it should be more intuitive to newbies (and others). Most items have been dropped from the user drop-down menu and the simpler views were made available through the '/manage' url.

  • The default 'index' / 'manage' view has been simplified to deprecate features that rely on external folder and card cubes. That's almost the only deprecation warning you'll get in upgrading to 3.11. Just this one won't hurt!

  • The old_calendar module has been dropped in favor of jQuery's fullcalendar powered views. That's great news for applications using calendar features. Since it was added to the existing calendar module, you shouldn't have to change anything to get it working, unless you were using old_calendar in which case you may have to update a few things. This work was initiated by our Mexican friends from Crealibre.

As usual, the 3.11 also includes a bunch of other minor enhancements, refactorings and bug fixes. Please download and install CubicWeb 3.11 and report any problem to the mailing-list!

Enjoy!


A simple scalable web server HA architecture suitable for medium sized projects

2011/02/15 by Florent Cayré

Having deployed and maintained several public medium sized web sites running CubicWeb when I worked at SecondWeb, I was asked by my friends from Logilab to write a blog post describing how we managed our deployment while working with the customer and the hosting company.

Non technical (albeit important) considerations

Customers that want to run such a medium traffic web site either tell you which hosting company they partner with, or ask you to find one, so you have no other choice than to deal with an external hosting structure to manage the servers. I prefer this by the way because:

  1. High Availability (HA) hosting really requires skills and hardware that are neither common nor cheap;
  2. HA hosting requires 24/7/365 availability that SecondWeb could not (and did not even want to) offer.

It is clearly difficult for all parties (try to put yourself in the shoes of the customer...) to manage a website with 3 partners involved, each with their own goals. From the development leader's point of view, you will notice that the technical people of the hosting company continuously change and you keep seeing the same operational errors even if you provide and keep improving high quality documentation. The software upgrade documentation has to be particularly clear as it greatly influences the overall web site availability. You also have to keep a history of the interventions on the servers yourself and maintain an up-to-date copy of the configuration files.

The overall architecture proposed here partly benefits from this experience with managed hosting companies, in that we tried to keep it simple.

Which traffic size ? Why not bigger ?

The architecture proposed here has been successfully tested with sites delivering web pages to up to 2 million unique visitors per month. It should scale further up depending on your site database access needs: if you need very fresh data and have a lot of write operations to the database, you will need to distribute database access amongst several servers, which is beyond the scope of this post.

This is the main limitation of the proposed architecture and the reason why it is not well-suited for higher traffic.

Design choices

Load balancing - Preserve user sessions

To achieve very high availability for your web site, you must have no single point of failure in the whole architecture, which can be far from reasonable from the costs point of view. However, hosting companies can share costs between their customers and have them benefit from a double network infrastructure all along the way from the Internet to your web servers, themselves hosted on two distant locations. You may then choose an even number of web servers, half of them hosted on each network infrastructure.

The important thing is that you must preserve user sessions. As of CubicWeb 3.10, DB persistent sessions have not been implemented yet (this will come soon, a ticket is planned for this functionality), thus you must preserve session cookies by always directing a given user to the same web server, which is usually achieved by configuring the load balancer(s) in IP hash mode (it is faster than balancing on the session cookie, which implies reaching the HTTP stack rather than staying at the TCP/IP level).
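
The principle behind IP hash balancing can be illustrated with a few lines of Python. This is only a sketch of the idea, not what any particular load balancer does internally: the `pick_backend` helper, the CRC32 hash and the backend names are assumptions chosen for the example.

```python
import zlib


def pick_backend(client_ip, backends):
    """Return the backend that serves every request coming from client_ip."""
    # The same IP always yields the same checksum, hence the same backend,
    # so the session cookie stays valid without looking at the HTTP layer.
    checksum = zlib.crc32(client_ip.encode('ascii'))
    return backends[checksum % len(backends)]


# Hypothetical backends: two web servers, one port per processor.
backends = ['web1:81', 'web1:82', 'web2:81', 'web2:82']
first = pick_backend('203.0.113.7', backends)
second = pick_backend('203.0.113.7', backends)
print(first == second)  # True: same IP, same backend
```

Note that with such a scheme, adding or removing a backend changes the mapping for most clients, who then lose their session.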

Squid caching, processor load balancing

Now if you have multi-processor web servers (which is very likely these days) you will need to use one CubicWeb application instance per processor or the Python GIL will limit the CPU of your application to a fraction of the available power. This is pretty easy: you just have to duplicate configuration directories from /etc/cubicweb.d, changing instance names and ports. You can use a simple sed-based script to generate these copies automatically and keep them in sync.
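
For those who prefer Python to sed, the duplication could be sketched as below. This is an assumption-laden example: the `clone_instance_config` helper, the instance names, the file name and the option names are all made up, and the demo runs on a throw-away directory rather than on /etc/cubicweb.d.

```python
import os
import re
import tempfile


def clone_instance_config(src_dir, dst_dir, src_name, dst_name, src_port, dst_port):
    """Copy every file of src_dir into dst_dir, rewriting instance name and port."""
    os.makedirs(dst_dir)
    for filename in os.listdir(src_dir):
        src_path = os.path.join(src_dir, filename)
        if not os.path.isfile(src_path):
            continue  # keep the sketch simple: sub-directories are ignored
        with open(src_path) as stream:
            content = stream.read()
        # \b avoids touching a longer name or number that contains the old one
        content = re.sub(r'\b%s\b' % re.escape(src_name), dst_name, content)
        content = re.sub(r'\b%d\b' % src_port, str(dst_port), content)
        with open(os.path.join(dst_dir, filename), 'w') as stream:
            stream.write(content)


# Demo on a temporary directory (file and option names are hypothetical).
root = tempfile.mkdtemp()
src_dir = os.path.join(root, 'mysite1')
os.makedirs(src_dir)
with open(os.path.join(src_dir, 'example.conf'), 'w') as stream:
    stream.write('instance=mysite1\nport=8081\nother-port=18081\n')

dst_dir = os.path.join(root, 'mysite2')
clone_instance_config(src_dir, dst_dir, 'mysite1', 'mysite2', 8081, 8082)
with open(os.path.join(dst_dir, 'example.conf')) as stream:
    cloned = stream.read()
print(cloned)
```

Running the same function again after each configuration change (on a fresh destination directory) is one way to keep the copies in sync.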

Now that we have one instance per processor, the problem of preserving sessions is back. It can be elegantly solved using Squid, which can of course deliver cached objects (in particular images, more on this later), but also listen on several ports and distribute incoming requests evenly among the CubicWeb instances based on their port of origin. Note that the load balancer must be set up to balance between ports of the web servers, one port for each processor. The Squid configuration file to achieve this looks like:

http_port 81 defaultsite=www.example.org vhost
acl portA myport 81

http_port 82 defaultsite=www.example.org vhost
acl portB myport 82

acl site1 dstdomain www.example.org

cache_peer 127.0.0.1 parent 8081 0 no-query originserver default name=server_1
cache_peer_access server_1 allow portA site1
cache_peer_access server_1 deny all

cache_peer 127.0.0.1 parent 8082 0 no-query originserver default name=server_2
cache_peer_access server_2 allow portB site1
cache_peer_access server_2 deny all

This is a way to set up Squid to listen on ports 81 and 82 and distribute requests for www.example.org to ports 8081 and 8082 respectively. This way, requests should be evenly balanced between the processors on a bi-processor web server.
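
With more than two processors the configuration becomes repetitive, so it may be worth generating it. The following sketch reproduces the directives shown above for an arbitrary number of instances; the `squid_config` function and its parameters are hypothetical, and the generated ACL names differ slightly from the hand-written ones (port81 instead of portA).

```python
def squid_config(site, nb_instances, first_squid_port=81, first_instance_port=8081):
    """Return Squid directives mapping one listening port to each instance."""
    lines = ['acl site1 dstdomain %s' % site, '']
    for index in range(nb_instances):
        squid_port = first_squid_port + index
        instance_port = first_instance_port + index
        peer = 'server_%d' % (index + 1)
        acl = 'port%d' % squid_port
        lines += [
            'http_port %d defaultsite=%s vhost' % (squid_port, site),
            'acl %s myport %d' % (acl, squid_port),
            'cache_peer 127.0.0.1 parent %d 0 no-query originserver default name=%s'
            % (instance_port, peer),
            # only requests received on this port go to this instance
            'cache_peer_access %s allow %s site1' % (peer, acl),
            'cache_peer_access %s deny all' % peer,
            '',
        ]
    return '\n'.join(lines)


config = squid_config('www.example.org', 4)
print(config)
```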

You can now set up Squid more classically to achieve what it was originally designed for: caching. See the Squid docs for this, particularly the refresh_pattern directive. Note you do not need to force any HTTP cache standard feature in Squid, as CubicWeb enables you to fine tune caching using simple HTTPCacheManager classes found in cubicweb/web/httpcache.py (at the end of this file, you will also find the default cache manager configuration for the entity and startup views).

CubicWeb with Apache frontend

This is controversial but it did not hurt for me: I like to put an Apache frontend between Squid and the Twisted-based CubicWeb application, because the hosting companies are usually pretty good at setting it up, like to use server status for monitoring, mod_deflate for textual content compression, mod_rewrite and other modules to customize, monitor or fine tune the web servers.

It can however be argued that Apache is a huge piece of software for such a restrictive usage, and its memory footprint would be better used for caching.

No shared disk

This is an interesting part that simplifies the overall setup: if you want to save data on disk, it is likely that you also want to keep it in sync between the web servers, or use a highly secure network storage solution.

As we already have a data store accessible from the web servers, namely the database itself, I often choose to use it even for images. This looks like the nightmare of every sysadmin, but if you make sure the images are not fetched every second from the database, by using fine tuned cache settings, it will not hurt. And this way you still benefit from the flexibility of a database and the easier maintenance of a single data store. We can use CubicWeb cache settings to allow Squid to cache images for 1 hour for example. If you have a very dynamic web site however, you will then need to force a URL change when an image is edited. This can easily be achieved in CubicWeb using a custom edit controller that creates a new image when the data attribute of an Image instance was edited, as illustrated here:

from cubicweb import typed_eid
from cubicweb.selectors import yes
from cubicweb.web.views.editcontroller import EditController


class CustomEditController(EditController):
    __select__ = EditController.__select__ & yes()

    def handle_updated_image(self, old_eid):
        'modify submitted form to change old_eid into a new entity eid in all key/ values'
        old_eid = unicode(old_eid)
        form = self._cw.form
        new_eid = self._cw.varmaker.next()
        # handle image eid
        del form['__type:%s' % old_eid]
        form['__type:%s' % new_eid] = u'Image'
        # handle eid list
        index = form['eid'].index(old_eid)
        form['eid'] = form['eid'][:index] + [new_eid] + form['eid'][index+1:]
        # handle attribute and relations (iterate over a copy of the items
        # since the form is modified inside the loop)
        for (k, v) in list(form.items()):
            if v == old_eid:
                form[k] = new_eid
            if k.endswith(u':%s' % old_eid):
                form[k[:-len(old_eid)] + new_eid] = v
                del form[k]

    def _default_publish(self):
        # implement image creation when data image was updated, so that we can use
        # a far expiry date cache on download view
        images = []
        # iterate over a copy: handle_updated_image() modifies the form
        for (k, v) in list(self._cw.form.items()):
            if v != 'Image' or not k.startswith('__type') or k == self._cw.form['__maineid']:
                continue
            try:
                eid = typed_eid(k[7:])
            except ValueError:
                continue
            if self._cw.form.get('data-subject:%s' % eid, None):
                self.handle_updated_image(eid)
                images.append(eid)
        super(CustomEditController, self)._default_publish()
        for eid in images:
            self._cw.execute('DELETE Image I WHERE I eid %(eid)s', {'eid': eid})
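
The key rewriting done by handle_updated_image can be tried outside of CubicWeb on a plain dictionary. The sketch below is a simplified variant that builds a new dictionary instead of modifying the form in place; the `rename_eid` helper and the 'illustrates-subject' key are made up for the example.

```python
def rename_eid(form, old_eid, new_eid):
    """Return a copy of form where old_eid is replaced by new_eid in keys and values."""
    suffix = ':%s' % old_eid
    renamed = {}
    for key, value in form.items():
        if isinstance(value, list):
            # e.g. the 'eid' list of all edited entities
            value = [new_eid if item == old_eid else item for item in value]
        elif value == old_eid:
            value = new_eid
        if key.endswith(suffix):
            key = key[:-len(old_eid)] + new_eid
        renamed[key] = value
    return renamed


form = {'eid': ['12', '34'],
        '__type:34': 'Image',
        'data-subject:34': 'new bytes',
        'illustrates-subject:12': '34'}
new_form = rename_eid(form, '34', 'A')
print(sorted(new_form))
```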

To add the 1 hour expiry date for image download view, you can use:

from cubicweb.selectors import yes
from cubicweb.web import httpcache
from cubicweb.web.views.idownloadable import DownloadView

class CustomDownloadView(DownloadView):
    __select__ = DownloadView.__select__ & yes()
    http_cache_manager = httpcache.MaxAgeHTTPCacheManager
    cache_max_age = 3600

Database server

Hosting companies now often have a pretty good knowledge of PostgreSQL, the favorite DB back end for CubicWeb. They usually propose to replicate the database for data safety at a low cost, using PostgreSQL's log shipping feature. Note that new PostgreSQL 9 versions should make it easier to set up replication modes that could be useful to improve performance and scalability, but there is still a lack of production level experience for the moment. Please share if you have some, because it is the main issue to deal with to scale up further.

Pre-production

It is worth mentioning that you need a pre-production server hosted by the same company on the same hardware (or virtual machine), because:

  • software upgrade will run smoother if the technical staff of the hosting company has already performed the same upgrade operation once: check the same person does both within a short timeframe if possible;
  • you will feel better if your migration scripts have successfully run on a fresh copy of the production data: ask for a db copy before a pre-production upgrade; this is much easier to do if you do not have to copy the database dumps remotely.
  • the pre-production server can host its own database server and the replication of the production one.

Monitoring

When you experience a web site downtime, it is much too late to take a look at the available monitoring. It is important to prepare the tools you need to diagnose a problem, get used to reading the graphs and have the orders of magnitude of the values and their variations in mind.

Even the simplest graphs, like CPU usage, need to be correctly interpreted. In a recent setup, I did not realize that only one CPU was used on a bi-pro server, delivering half the power it should... When you cannot access the machine and use top, you only see the information of the monitoring graphs, so you must know how to read them!

Apart from the classical CPU, CPU load, (detailed) memory usage, and network traffic, ask for PostgreSQL, Squid, and Apache specific graphs (plug-ins for them are easy to find and install for classic monitoring solutions).

For CubicWeb web sites, it is also worth setting up the following views and using them for automatic alerts:

  • a software / db version consistency monitoring
  • a db pool size monitoring
  • a simple db connection check view
  • a view writing the server host name: it is of no use for automatic alerts, but it shows which server your IP is directed to, which is needed when you cannot reproduce the behaviour the customer is complaining about...

Here are some classes I use for these tasks. Feel free to reuse and adapt them to your needs:

from socket import gethostname

from cubicweb.selectors import yes
from cubicweb.view import View


class _MonitoringView(View):
    __abstract__ = True
    __select__ = yes()
    content_type = 'text/plain'
    templatable = False


class PoolMonitoringView(_MonitoringView):
    __regid__ = 'monitor_pool'

    def call(self):
        repo = self._cw.cnx._repo
        max_pool = self._cw.vreg.config['connections-pool-size']
        percent = ((max_pool - repo._available_pools.qsize()) * 100.0) / max_pool
        self.w(u'%s%%' % percent)


class DBMonitoringView(_MonitoringView):
    __regid__ = 'monitor_db'

    def call(self):
        try:
            count = self._cw.execute('Any COUNT(X) WHERE X is CWUser')[0][0]
            self.w(u'ServiceOK : %s users in DB' % count)
        except Exception:
            self.w(u'ServiceKO')


class VersionMonitoringView(_MonitoringView):
    __regid__ = 'monitor_version'

    def versions_text(self, versions):
        return u' | '.join(cube + u': ' + u'.'.join(unicode(x) for x in version)
                           for (cube, version) in versions)

    def call(self):
        config = self._cw.vreg.config
        vc_config = config.vc_config()
        db_config = [('cubicweb', vc_config.get('cubicweb', '?'))]
        fs_config = [('cubicweb', config.cubicweb_version())]
        for cube in sorted(config.cubes()):
            db_config.append((cube, vc_config.get(cube, '?')))
            try:
                fs_version = config.cube_version(cube)
            except Exception:
                fs_version = '?'
            fs_config.append((cube, fs_version))
        db_config = self.versions_text(db_config)
        fs_config = self.versions_text(fs_config)
        if db_config == fs_config:
            self.w(u'ServiceOK : FS config %s == DB config %s' % (fs_config, db_config))
        else:
            self.w(u'ServiceKO : FS config %s != DB config %s' % (fs_config, db_config))


class HostnameMonitoringView(_MonitoringView):
    __regid__ = 'monitor_hostname'

    def call(self):
        self.w(unicode(gethostname()))
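
On the monitoring side, the text produced by these views is easy to turn into an alert. Here is a minimal sketch of the parsing part, assuming the response bodies have already been fetched by your monitoring tool; the function names and the 90% threshold are arbitrary choices for the example.

```python
def is_service_ok(body):
    """True when a monitor_db or monitor_version response reports success."""
    return body.strip().startswith('ServiceOK')


def pool_usage(body):
    """Parse the monitor_pool response (e.g. '37.5%') into a float percentage."""
    return float(body.strip().rstrip('%'))


def needs_alert(db_body, version_body, pool_body, max_pool_usage=90.0):
    """Combine the three checks into a single alert decision."""
    return (not is_service_ok(db_body)
            or not is_service_ok(version_body)
            or pool_usage(pool_body) > max_pool_usage)


print(needs_alert('ServiceOK : 42 users in DB',
                  'ServiceOK : FS config ... == DB config ...',
                  '25.0%'))  # False
```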

Sketch of the architecture and conclusion

Here is a sketch of the proposed architecture. Please comment on it and share your experience on the topic, I would be happy to learn your tips and tricks.

I would conclude with an important remark regarding performance: a good scalable architecture is of great help to run a busy web site smoothly, however the performance boost you get by optimizing your software is usually worth it and must be seriously considered before any hardware upgrade, even if it seems costly at first glance.

/file/1521968?vid=download

Building my photos web site with CubicWeb part V: let's make it even more user friendly

2011/01/24 by Sylvain Thenault

We'll now see how to benefit from features introduced in the 3.9 and 3.10 releases of CubicWeb.

Step 1: tired of the default look?

OK... Now our site has its most desired features. But... I would like to make it look somewhat like my website. It is not www.cubicweb.org after all. Let's tackle this first!

The first thing we can do is to change the logo. There are various ways to achieve this. The easiest way is to put a logo.png file into the cube's data directory. As data files are looked up according to cubes order (CubicWeb resources coming last), that file will be selected instead of CubicWeb's one.

Note

As the locations of static resources are cached, you'll have to restart your instance for this to be taken into account.
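
The lookup order can be pictured with the following sketch. It is not CubicWeb's actual implementation: the `find_data_file` helper is hypothetical and two temporary directories stand in for the cube's and CubicWeb's data directories.

```python
import os
import tempfile


def find_data_file(filename, data_dirs):
    """Return the first existing path for filename, searching data_dirs in order."""
    for directory in data_dirs:
        path = os.path.join(directory, filename)
        if os.path.exists(path):
            return path
    return None


# Both directories provide a logo.png, the cube's directory comes first.
cube_data = tempfile.mkdtemp(prefix='cube-data-')
cubicweb_data = tempfile.mkdtemp(prefix='cubicweb-data-')
for directory in (cube_data, cubicweb_data):
    with open(os.path.join(directory, 'logo.png'), 'w') as stream:
        stream.write('fake image')

selected = find_data_file('logo.png', [cube_data, cubicweb_data])
print(selected.startswith(cube_data))  # True: the cube's file wins
```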

Though there are some cases where you don't want to use a logo.png file. For instance if it's a JPEG file. You can still change the logo by defining in the cube's uiprops.py file:

LOGO = data('logo.jpg')

The uiprops machinery has been introduced in CubicWeb 3.9. It is used to define some static file resources, such as the logo, default Javascript / CSS files, as well as CSS properties (we'll see that later).

Note

This file is imported specifically by CubicWeb, with a predefined name space containing, for instance, the data function, which tells that the file is somewhere in a cube's or CubicWeb's data directory.

One side effect of this is that it can't be imported as a regular python module.

The nice thing is that in debug mode, changes to a uiprops.py file are detected and the file is then automatically reloaded.

Now, as it's a photos web-site, I would like to have a photo of mine as background... After some trials I won't detail here, I've found a working recipe explained here. All I've to do is to override some stuff of the default CubicWeb user interface to apply it as explained.

The first thing to do is to get the <img/> tag as first element after the <body> tag. If you know a way to avoid this by simply specifying the image in the CSS, tell me! The easiest way to do so is to override the HTMLPageHeader view, since that's the one that is directly called once the <body> has been written. How did I find this? By looking in the cubicweb.web.views.basetemplates module, since I know that global page layouts sit there. I could also have grepped the "body" tag in cubicweb.web.views... Finding this was the hardest part. Now all I need is to customize it to write that img tag, as below:

from cubicweb.web.views import basetemplates


class HTMLPageHeader(basetemplates.HTMLPageHeader):
    # override this since it's the easier way to have our bg image
    # as the first element following <body>
    def call(self, **kwargs):
        self.w(u'<img id="bg-image" src="%sbackground.jpg" alt="background image"/>'
               % self._cw.datadir_url)
        super(HTMLPageHeader, self).call(**kwargs)


def registration_callback(vreg):
    # the last argument must be a tuple, hence the trailing comma
    vreg.register_all(globals().values(), __name__, (HTMLPageHeader,))
    vreg.register_and_replace(HTMLPageHeader, basetemplates.HTMLPageHeader)

As you may have guessed, my background image is in a background.jpg file in the cube's data directory, but there are still some things to explain to newcomers here:

  • The call method is the main access point of the view. It's called by the view's render method. It is not the only access point for a view, but this will be detailed later.
  • Calling self.w writes something to the output stream. Except for binary views (which do not generate text), it must be passed a Unicode string.
  • The proper way to get a file in data directory is to use the datadir_url attribute of the incoming request (e.g. self._cw).

I won't explain again the registration_callback stuff, you should understand it now! If not, go back to previous posts in the series :)

Fine. Now all I've to do is to add a bit of CSS to get it to behave nicely (which is not the case at all for now). I'll put all this in a cubes.sytweb.css file, stored as usual in our data directory:

/* fixed full screen background image
 * as explained on http://webdesign.about.com/od/css3/f/blfaqbgsize.htm
 *
 * syt update: set z-index=0 on the img instead of z-index=1 on div#page & co to
 * avoid pb with the user actions menu
 */
img#bg-image {
    position: fixed;
    top: 0;
    left: 0;
    width: 100%;
    height: 100%;
    z-index: 0;
}

div#page, table#header, div#footer {
    background: transparent;
    position: relative;
}

/* add some space around the logo
 */
img#logo {
    padding: 5px 15px 0px 15px;
}

/* more dark font for metadata to have a chance to see them with the background
 *  image
 */
div.metadata {
    color: black;
}

You can see here stuff explained in the cited page, with only a slight modification explained in the comments, plus some additional rules to make things somewhat cleaner:

  • a bit of padding around the logo
  • darker metadata which appears by default below the content (the white frame in the page)

To get this CSS file used everywhere in the site, I have to modify the uiprops.py file introduced above:

STYLESHEETS = sheet['STYLESHEETS'] + [data('cubes.sytweb.css')]

Note

sheet is another predefined variable containing values defined by already processed uiprops.py files, notably CubicWeb's one.

Here we simply want our CSS in addition to CubicWeb's base CSS files, so we redefine the STYLESHEETS variable to the existing CSS (accessed through the sheet variable) with ours added. I could also have done:

sheet['STYLESHEETS'].append(data('cubes.sytweb.css'))

But this is less interesting since we don't see the overriding mechanism...
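
The difference between the two forms is plain Python and can be checked outside of CubicWeb. In the sketch below, `data` is a stand-in with a made-up implementation and `sheet` is an ordinary dictionary; the real objects provided to uiprops.py may behave differently.

```python
def data(path):
    """Stand-in for the uiprops data function (hypothetical implementation)."""
    return '/data/' + path


base = ['/data/cubicweb.css']   # what a previously processed uiprops.py defined
sheet = {'STYLESHEETS': base}

# Redefinition: a new list is built, the previous value is left untouched.
STYLESHEETS = sheet['STYLESHEETS'] + [data('cubes.sytweb.css')]
base_after_redefinition = list(base)

# In-place variant: the list shared through sheet is itself modified.
sheet['STYLESHEETS'].append(data('cubes.sytweb.css'))
base_after_append = list(base)

print(base_after_redefinition)
print(base_after_append)
```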

At this point, the site should start looking good, the background image being resized to fit the screen.

http://www.cubicweb.org/file/1440508?vid=download

The final touch: let's customize CubicWeb's CSS to get less orange... By simply adding

contextualBoxTitleBg = incontextBoxTitleBg = '#AAAAAA'

and reloading the page we've just seen, we now have a nice greyed box instead of the orange one:

http://www.cubicweb.org/file/1440510?vid=download

This is because CubicWeb's CSS includes some variables which are expanded with values defined in the uiprops file. In our case we controlled the CSS background property of boxes with the CSS classes contextualBoxTitleBg and incontextBoxTitleBg.
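
The expansion mechanism can be pictured as a simple string substitution. The sketch below assumes the variables appear in the CSS as Python-style `%(name)s` placeholders; the selectors and the template are made up for the illustration and are not copied from CubicWeb's stylesheets.

```python
# Hypothetical CSS template using Python string formatting placeholders.
css_template = """
div.contextualBoxTitleBg { background: %(contextualBoxTitleBg)s; }
div.incontextBoxTitleBg { background: %(incontextBoxTitleBg)s; }
"""

# Values as they would be defined in a uiprops.py file.
props = {'contextualBoxTitleBg': '#AAAAAA',
         'incontextBoxTitleBg': '#AAAAAA'}

css = css_template % props
print(css)
```

With this kind of formatting, a literal percent sign in the template (as in a percentage width) has to be written twice.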

Step 2: configuring boxes

Boxes present to the user some ways to use the application. Let's first do a few user interface tweaks in our views.py file:

from cubicweb.selectors import none_rset
from cubicweb.web.views import bookmark
from cubes.zone import views as zone
from cubes.tag import views as tag

# change bookmarks box selector so it's only displayed on startup views
bookmark.BookmarksBox.__select__ = bookmark.BookmarksBox.__select__ & none_rset()
# move zone box to the left instead of in the context frame and tweak its order
zone.ZoneBox.context = 'left'
zone.ZoneBox.order = 100
# move tags box to the left instead of in the context frame and tweak its order
tag.TagsBox.context = 'left'
tag.TagsBox.order = 102
# hide similarity box, not interested
tag.SimilarityBox.visible = False

The idea is to move all boxes in the left column, so we get more space for the photos. Now, serious things: I want a box similar to the tags box but to handle the Person displayed_on File relation. We can do this simply by adding an AjaxEditRelationCtxComponent subclass to our views, as below:

from logilab.common.decorators import monkeypatch
from cubicweb import ValidationError
from cubicweb.web import uicfg, component
from cubicweb.web.views import basecontrollers

# hide displayed_on relation using uicfg since it will be displayed by the box below
uicfg.primaryview_section.tag_object_of(('*', 'displayed_on', '*'), 'hidden')

class PersonBox(component.AjaxEditRelationCtxComponent):
    __regid__ = 'sytweb.displayed-on-box'
    # box position
    order = 101
    context = 'left'
    # define relation to be handled
    rtype = 'displayed_on'
    role = 'object'
    target_etype = 'Person'
    # messages
    added_msg = _('person has been added')
    removed_msg = _('person has been removed')
    # bind to js_* methods of the json controller
    fname_vocabulary = 'unrelated_persons'
    fname_validate = 'link_to_person'
    fname_remove = 'unlink_person'


@monkeypatch(basecontrollers.JSonController)
@basecontrollers.jsonize
def js_unrelated_persons(self, eid):
    """return names of persons not yet related to an entity"""
    rql = "Any F + ' ' + S WHERE P surname S, P firstname F, X eid %(x)s, NOT P displayed_on X"
    return [name for (name,) in self._cw.execute(rql, {'x' : eid})]


@monkeypatch(basecontrollers.JSonController)
def js_link_to_person(self, eid, people):
    req = self._cw
    for name in people:
        name = name.strip().title()
        if not name:
            continue
        try:
            firstname, surname = name.split(None, 1)
        except ValueError:
            raise ValidationError(eid, {('displayed_on', 'object'): 'provide <first name> <surname>'})
        rset = req.execute('Person P WHERE '
                           'P firstname %(firstname)s, P surname %(surname)s',
                           locals())
        if rset:
            person = rset.get_entity(0, 0)
        else:
            person = req.create_entity('Person', firstname=firstname,
                                            surname=surname)
        req.execute('SET P displayed_on X WHERE '
                    'P eid %(p)s, X eid %(x)s, NOT P displayed_on X',
                    {'p': person.eid, 'x' : eid})

@monkeypatch(basecontrollers.JSonController)
def js_unlink_person(self, eid, personeid):
    self._cw.execute('DELETE P displayed_on X WHERE P eid %(p)s, X eid %(x)s',
                     {'p': personeid, 'x': eid})

You basically subclass to configure with some class attributes. The fname_* attributes give the name of methods that should be defined on the json controller to make the AJAX part of the widget work: one to get the vocabulary, one to add a relation and another to delete a relation. These methods must start with a js_ prefix and are added to the controller using the @monkeypatch decorator. In my case, the most complicated method is the one which adds a relation, since it tries to see if the person already exists, and otherwise automatically creates it, assuming the user entered "firstname surname".

Let's see what it looks like on a file primary view:
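
The "firstname surname" assumption deserves a closer look, so here is the splitting logic of js_link_to_person isolated in a standalone function. The `split_full_name` name is made up, and a plain ValueError replaces the ValidationError raised in the real method.

```python
def split_full_name(name):
    """Split 'firstname surname' the way js_link_to_person does.

    Return None for blank input, raise ValueError for a single word.
    """
    name = name.strip().title()
    if not name:
        return None
    try:
        # split on the first run of whitespace only: everything after the
        # first word is considered to be the surname
        firstname, surname = name.split(None, 1)
    except ValueError:
        raise ValueError('provide <first name> <surname>')
    return firstname, surname


print(split_full_name('  sylvain thenault '))   # ('Sylvain', 'Thenault')
print(split_full_name('jean de la fontaine'))   # ('Jean', 'De La Fontaine')
```

As the second call shows, title() also capitalizes particles, which may or may not be what you want.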

http://www.cubicweb.org/file/1440509?vid=download

Great, it's now as easy for me to link my pictures to people as to tag them. Also, visitors get a consistent display of these two pieces of information.

Note

The ui component system has been refactored in CubicWeb 3.10, which also introduced the AjaxEditRelationCtxComponent class.

Step 3: configuring facets

The last feature we'll add today is facet configuration. If you access the '/file' url, you'll see a set of 'facets' appearing in the left column. Facets provide an intuitive way to build a query incrementally, by proposing to the user various ways to restrict the result set. For instance CubicWeb proposes a facet to restrict based on who created an entity; the tag cube proposes a facet to restrict based on tags; the zone cube a facet to restrict based on geographical location, and so on. In that spirit, I want to propose a facet to restrict based on the people displayed on the picture. To do so, there are various classes in the cubicweb.web.facet module which simply have to be configured using class attributes as we've done for the box. In our case, we'll define a subclass of RelationFacet.

Note

Since that's ui stuff, we'll continue to add code below to our views.py file. Though we begin to have a lot of various code there, so it may be a good time to split our views module into submodules of a views package. In our case of a simple application (glue) cube, we could start using for instance the layout below:

views/__init__.py   # uicfg configuration, facets
views/layout.py     # header/footer/background stuff
views/components.py # boxes, adapters
views/pages.py      # index view, 404 view

from cubicweb.web import facet

class DisplayedOnFacet(facet.RelationFacet):
    __regid__ = 'displayed_on-facet'
    # relation to be displayed
    rtype = 'displayed_on'
    role = 'object'
    # view to use to display persons
    label_vid = 'combobox'

Let's say we also want to filter according to the visibility attribute. This is even simpler as we just have to derive from the AttributeFacet class:

class VisibilityFacet(facet.AttributeFacet):
    __regid__ = 'visibility-facet'
    rtype = 'visibility'

Now if I search for some pictures on my site, I get the following facets available:

http://www.cubicweb.org/file/1440517?vid=download

Note

By default a facet must be applicable to every entity in the result set and provide at least two elements of vocabulary to be displayed (for instance you won't see the created_by facet if the same user has created all entities). This may explain why you don't see yours...
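To make the display rule above concrete, here is a minimal sketch in plain Python. It does not use the CubicWeb API: entities are modelled as simple dictionaries, and the helper names facet_vocabulary and should_display, as well as the sample values, are assumptions made up for the illustration.

```python
def facet_vocabulary(entities, attr):
    """Return the sorted distinct values of `attr`, or None when the
    facet does not apply to every entity of the result set."""
    if not all(attr in entity for entity in entities):
        return None
    return sorted({entity[attr] for entity in entities})


def should_display(entities, attr):
    """A facet is only worth showing if it applies to all entities and
    offers at least two values to choose from."""
    vocabulary = facet_vocabulary(entities, attr)
    return vocabulary is not None and len(vocabulary) >= 2


# hypothetical result set: two pictures created by the same user
pictures = [
    {'title': 'beach', 'visibility': 'public', 'created_by': 'syt'},
    {'title': 'party', 'visibility': 'restricted', 'created_by': 'syt'},
]

print(should_display(pictures, 'visibility'))
print(should_display(pictures, 'created_by'))
print(should_display(pictures, 'displayed_on'))
```

Here only the visibility facet qualifies: created_by has a single value, and displayed_on is not set on every picture.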

Conclusion

We started to see the power behind the infrastructure provided by the framework, both on the pure ui (CSS, Javascript) side and on the Python side (high level generic classes for components, including boxes and facets). We now have, with a few lines of code, a full-featured web site with a personalized look.

Of course we'll probably want more as time goes on, but we can now concentrate on making good pictures, publishing albums and sharing them with friends...


CubicWeb sprint in Paris on January 19/20/21 2011

2010/12/03 by Sylvain Thenault
http://farm1.static.flickr.com/183/419945378_4ead41a76d_m.jpg

Almost everything is in the title: we'll hold a CubicWeb sprint in our Paris office after the first French Semantic Web conference, so on the 19th, 20th and 21st of January 2011.

The main topic will be to enhance the newcomer's experience of installing and using CubicWeb.

If you wish to come, you're welcome, that's a great way to meet us, learn the framework and share thoughts about it. Simply contact us so we can check there is still some room available.

photo by Sebastian Mary under creative commons licence.


HTML5 features presented at Paris Web 2010 by Paul Rouget

2010/10/19 by Arthur Lutz

While at Paris Web 2010 we were all impressed by the presentation and demos by Paul Rouget on HTML5 (tech evangelist must be a hard job!). Here is my take and a few URLs on the things that were presented.

http://hacks.mozilla.org/wp-content/themes/Hacks2010/img/mozilla.png
  • Websockets with persistent connections between the server and the browser. That way you can avoid polling for information every 5 seconds: the server can tell the web page when new information is available. The immediate uses we have for this are :
    • realtime feed display
    • jabber web chat rooms
    • in cubicweb's forge : new comment indication on a ticket
    • in cubicweb in general : notification that the edited element has been opened by another user (instead of a lock mechanism)
    • real time collaborative editing (etherpad style functionality)
  • File upload demo : http://demos.hacks.mozilla.org/openweb/uploadingFiles/
  • File EXIF extraction, client side resize or geolocation http://demos.hacks.mozilla.org/openweb/FileAPI/ . That could be very cool for things such as resizing an image before it is sent to the server (you know, for your mother who doesn't know how to resize that 2 Mbytes photo before sending it to the site). Reference : https://developer.mozilla.org/en/Using_files_from_web_applications
  • Using File IO, you can do some heavy Drag'n'drop from your computer to your browser directly in the browser (yes, you can get rid of that nasty java applet). Apparently Google implemented in Chromium a non-standard drag'n'drop the other way around : from the web app to your desktop, which could be cool as well.
http://farm5.static.flickr.com/4147/5085028912_173337f0ba.jpg
  • XHR - XMLHttpRequest. Usually this type of request is not possible cross-domain. Now it will be (with an authorization mechanism). That way, you will be able to post to and control websites from the page in your browser.
  • Audio Data API : you can now access & modify audio files directly in your browser (before uploading them server side). This makes me think of the first time I realized people were implementing traditionally "heavy" applications (photo editing, music editing, even movie editing) as web applications. I was (and still am) very surprised and skeptical, but this kind of evolution makes me believe that there can be a day when you don't even need to send massive files to the server to edit them.
http://farm1.static.flickr.com/191/513636061_98d07f7966_t.jpg

Admittedly, you probably need to see the thrilling presentation and demos to be tempted to go and dip into these technologies. Reading the documentation will probably not encourage you to go and code some cool new features.

One of the things that the audience commented on at the end of the presentation is that there was still a huge lack of "authoring tools" for HTML5. For some coders that never leave vim or emacs, this is heresy, but we have to admit that the adoption of flash and silverlight (apparently) is very much driven by simple click'n'program tools.

http://www.mozilla.org/images/minefield_168.png

During the presentation, I used a Chrome 6 that I had lying around on my Ubuntu, but by the end of the presentation I had installed Firefox4 using the mozilla PPA:

sudo add-apt-repository ppa:ubuntu-mozilla-daily/ppa
sudo apt-get update
sudo apt-get -uVf install firefox-4.0

The PPA version keeps config files separate so you can easily switch between your "standard" Firefox3 profile and the cutting edge Firefox4 (obviously the big downside is not having all your cool extensions).

The only thing missing from the presentation was the code... a request I hope Paul will grant to the community (a bunch of tweets about that followed the presentation).


What's new in CubicWeb 3.10?

2010/10/18 by Sylvain Thenault

The 3.10 development started during August, with two important patches: one on the repository / entity API, another one on the boxes / content navigation components unification (more on this later). Then it somewhat came to a halt, as more work was done on other projects and to stabilize the 3.9 branch. We finally got back on it during September, adding several other major changes or enhancements.

  • Cleanup of the repository side entity API, i.e. the API you may use when writing hooks. Beside simple namespace cleanup (a few renamings), the API has been modified to move out attributes being edited from the read cache. So now:

    • entities do not inherit from dict anymore; access to the dict protocol on an entity will raise deprecation warnings
    • the attributes cache is now a cw_attr_cache dictionary on the entity
    • edited attributes are in a special cw_edited object on the entity, which is only available in hooks for a modified entity (i.e. '[before|after]_[add|update]_entity'). You should use the dict protocol on that object to get modified attributes or to modify what is edited (in 'before' hooks only, and this is now enforced). This deprecates the former edited_attributes attribute.
  • Unification of 'boxes' / 'contentnavigation' registries and base classes, into "contextual components" stored in the 'ctxcomponents' registry. This implied the introduction of "layout" objects which are appobjects responsible for displaying the components according to the context they are displayed in.

    This separation of content / layout and some css cleanups allows us to move former boxes and content components into each other's place in the user interface: for instance, go to your preferences pages and try to move the search box. You now have many more different locations available. Not every component makes sense in every location though, so forthcoming releases should tweak this to avoid proposing dumb choices. But the hot stuff is there!

    Also, a cache has been set on the registry to avoid recomputing possible components for each context (place in the ui).

  • Upgraded jQuery and jQuery UI respectively to version 1.4.2 and 1.8. Removed jquery.autocomplete.js since jQuery UI provides its own autocomplete plugin. A cwautocomplete plugin was added in order to keep widgets as backward compatible as possible. If you used a custom autocomplete feature, you should take a look at this guide.

  • The RelationFacet base class now automatically proposes to search for entities without the relation if this is allowed by the schema and if there are some in current results. Example: search for tickets which are not planned in a version.

  • Data sources have been modeled as CubicWeb entity type CWSource. The 'sources' file is still there but will now only contain the definition of the system source, as well as the default manager account login and password. This implied changes in instance initialization commands, introduction of a new 'add-source' command to cubicweb-ctl, as well as a change in the repository startup. Also, on a multi-sources instance, we can now search using a facet on the cw_source relation (a new mandatory metadata relation on each entity) to filter according to the data source entities are coming from.

  • Although introduced during 3.9 releases, it's worth mentioning the new support for multi-columns unicity constraint through yams's __unique_together__ entity type attribute, allowing for unicity constraint enforced by the underlying database instead of CubicWeb hooks. This is limited and doesn't work in every configuration, but is a must have when running several distributed CubicWeb instances of the same application (hence database).
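
To illustrate what "enforced by the underlying database" means for the multi-columns unicity constraint mentioned in the last item above, here is a small sketch using Python's sqlite3 module. This is not the SQL that CubicWeb or yams generates: the person table, its columns and the try_insert helper are assumptions chosen for the example.

```python
import sqlite3

# in-memory database, with a unicity constraint spanning two columns
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE person ('
    ' firstname TEXT, surname TEXT,'
    ' UNIQUE (firstname, surname))'
)


def try_insert(conn, firstname, surname):
    """Insert a row, returning False if the database rejects it."""
    try:
        conn.execute('INSERT INTO person VALUES (?, ?)', (firstname, surname))
        return True
    except sqlite3.IntegrityError:
        return False


first_accepted = try_insert(conn, 'Ada', 'Lovelace')
# same first name but another surname: the pair is still unique
other_accepted = try_insert(conn, 'Ada', 'Byron')
# exact same pair: rejected by the database itself
duplicate_accepted = try_insert(conn, 'Ada', 'Lovelace')
print(first_accepted, other_accepted, duplicate_accepted)
```

Since the check lives in the database, it holds whichever process performs the insertion, which is the point when several instances share the same database.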

Also as usual, the 3.10 includes a bunch of other minor enhancements, refactorings and bug fixes. Every introduced change should be backward compatible, except probably some minor ui details due to the css box simplification. That's it.

So please download and install CubicWeb 3.10 and report any problem to us on the mailing-list!

Enjoy!


CubicWeb presentation at the JDLL (Lyon)

2010/10/07 by Arthur Lutz

For the "Journées Du Logiciel Libre (JDLL)" in Lyon, which will take place on the 14th, 15th and 16th of October 2010, we will be presenting the semantic side of CubicWeb on Friday 15th. There will be a talk and a tutorial. Details can be found here and there.

If you're around, come and see us!

http://www.jdll.org/sites/default/files/banniere.png

Debugging a memory leak in a cube

2010/09/24

We recently discovered that the cubicweb.org site (the one you are probably visiting right now) was suffering from a memory leak. The munin graphs showed a memory consumption steadily increasing soon after the instance was started, and this would only stop when all the memory on the host was exhausted. This was clearly caused by a memory leak somewhere, either in CubicWeb itself or in a cube used by the instance.

Fig. 1: Munin graphs showing the memory leak in cubicweb.org

Notice the associated service downtimes, and the stabilized memory consumption on Sept 23, after the leak was fixed.

Since Python has a garbage collector, either the leak was occurring in a C extension, or it was caused by some objects which were not garbage collectable. A common cause for the latter, as explained in the gc module documentation, is objects with a __del__ method which are part of a cycle.

We used the "gc" view, which is an administrative view in CubicWeb, reachable by appending "?vid=gc" at the end of the url of the root of your instance, if you are a member of the managers group. This view uses the gc module from the python standard library to see which objects are not garbage collected.
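
For readers who want to try this kind of inspection outside of CubicWeb, here is a rough sketch of counting live objects per class with the gc module. It is only an approximation of the idea, not the code of the "gc" view; the count_live_instances helper and the FakeHttpHandler class are made up for the example. Note also that on Python 3.4 and later, cycles are collected even when the objects define __del__, so the sketch simulates a leak by keeping the objects referenced.

```python
import gc
from collections import Counter


def count_live_instances():
    """Return a Counter mapping class names to the number of objects
    of that class currently tracked by the garbage collector."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


class FakeHttpHandler:
    """Hypothetical stand-in for a class whose instances pile up."""

    def __init__(self):
        # reference cycle: handler -> dict -> handler
        self.opener = {'handler': self}


# simulate the leak: the instances stay reachable through this list
leaked = [FakeHttpHandler() for _ in range(1000)]
before = count_live_instances()['FakeHttpHandler']

# once nothing references them anymore, the collector reclaims the cycles
del leaked
gc.collect()
after = count_live_instances()['FakeHttpHandler']
print(before, after)
```

Sorting such a counter by decreasing count (for instance with its most_common method) is usually enough to spot a class with a suspiciously large number of instances.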

This view showed thousands of instances of mercurial.url.httphandler. This class indeed has a __del__ method and instances have a cycle with urllib2.OpenerDirector. Mercurial is used by the vcsfile cube, which regularly polls remote repositories over HTTP, which causes httphandler to be instantiated (and a reference to be leaked). This problem had gone undetected in mercurial because most of the time, processes using mercurial over http are short-lived and the leaked memory is quickly reclaimed by the operating system. Discussion ensued on the #mercurial IRC channel with the developers and a patch was submitted which fixes the leak. In order to avoid the problem with versions of mercurial up to the current one, a new version of vcsfile including a monkey patch for mercurial was released and deployed on cubicweb.org!