
CubicWeb Blog

News about the framework and its uses.

  • Reusing OpenData from Data.gouv.fr with CubicWeb in 2 hours

    2011/12/07 by Vincent Michel

    Data.gouv.fr is great news for the OpenData movement!

    Two days ago, the French government released thousands of data sets on http://data.gouv.fr/ under an open licensing scheme that allows people to access and play with them. Thanks to the CubicWeb semantic web framework, it took us only a couple of hours to put some of that open data to good use. Here is how we mapped the French railway system.

    http://www.cubicweb.org/file/2110281?vid=download

    Train stations in French Brittany

    Source Datasets

    We used two of the datasets available on data.gouv.fr:

    • Train stations: description of the 6442 train stations in France, including their name, type and geographic coordinates. Here is a sample of the file:

      441000;St-Germain-sur-Ille;Desserte Voyageur;48,23955;-1,65358
      441000;Montreuil-sur-Ille;Desserte Voyageur-Infrastructure;48,3072;-1,6741
      
    • LevelCrossings: description of the 18159 level crossings on French railways, including their type and location. Here is a sample of the file:

      558000;PN privé pour voitures avec barrières sans passage piétons accolé;48,05865;1,60697
      395000;PN privé pour voitures avec barrières avec passage piétons accolé public;;48,82544;1,65795
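
    Both samples share a layout: semicolon-separated fields, with the latitude and longitude in the last two columns, written with a decimal comma. A minimal Python sketch of how such a line could be parsed follows. The function name and the keys of the returned dictionary are illustrative assumptions, not the code used for the actual import.

```python
def parse_location_line(line):
    """Parse one semicolon-separated record from the sample files.

    Hypothetical helper: the field layout is inferred from the samples above.
    """
    fields = line.strip().split(";")
    # The coordinates are the last two fields and use a decimal comma.
    latitude = float(fields[-2].replace(",", "."))
    longitude = float(fields[-1].replace(",", "."))
    # Everything between the leading code and the coordinates is a label
    # (name and/or type); empty fields are skipped.
    labels = [field for field in fields[1:-2] if field]
    return {
        "code": fields[0],
        "labels": labels,
        "latitude": latitude,
        "longitude": longitude,
    }


station = parse_location_line(
    "441000;St-Germain-sur-Ille;Desserte Voyageur;48,23955;-1,65358"
)
```

    Taking the coordinates from the end of the record keeps the sketch tolerant of the empty field that appears in the second level crossing sample.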
      

    Data Model

    Given the above datasets, we wrote the following data model to store the data in CubicWeb:

    from yams.buildobjs import EntityType, Float, String, SubjectRelation
    
    class Location(EntityType):
        name = String(indexed=True)
        latitude = Float(indexed=True)
        longitude = Float(indexed=True)
        feature_type = SubjectRelation('FeatureType', cardinality='?*')
        data_source = SubjectRelation('DataGovSource', cardinality='1*', inlined=True)
    
    class FeatureType(EntityType):
        name = String(indexed=True)
    
    class DataGovSource(EntityType):
        name = String(indexed=True)
        description = String()
        uri = String(indexed=True)
        icon = String()
    

    The Location object is used for both train stations and level crossings. It has a name (text information), a latitude and a longitude (numeric information). With the cardinalities declared above, each Location is linked to at most one FeatureType object and to exactly one DataGovSource, while a FeatureType or a DataGovSource can be shared by many Locations. The FeatureType object is used to store the type of train station or level crossing and is defined by a name (text information). The DataGovSource object is defined by a name, a description and a uri used to link back to the source data on data.gouv.fr.

    http://www.cubicweb.org/file/2110311?vid=download

    Schema of the data model

    Data Import

    We had to write a few lines of code to benefit from the massive data import feature of CubicWeb before we could load the content of the CSV files with a single command:

    $ cubicweb-ctl import-datagov-location datagov_geo gare.csv-fr.CSV  --source-type=gare
    $ cubicweb-ctl import-datagov-location datagov_geo passage_a_niveau.csv-fr.CSV  --source-type=passage
    

    In less than a minute, the import was completed and we had:

    • 2 DataGovSource objects, corresponding to the two data sets,
    • 24 FeatureType objects, corresponding to the different types of locations that exist (e.g. Non exploitée, Desserte Voyageur, PN public isolé pour piétons avec portillons or PN public pour voitures avec barrières gardé avec passage piétons accolé manoeuvré à distance),
    • 24601 Locations, corresponding to the different train stations and level crossings (6442 + 18159).

    Data visualization

    CubicWeb makes it possible to build complex applications by assembling existing components (called cubes). Here we used a cube that wraps the Mapstraction and the OpenLayers libraries to display information on maps using data from OpenStreetMap.

    In order for the Location type defined in the data model to be displayable on a map, it is sufficient to write the following adapter:

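    from cubicweb.selectors import is_instance
    from cubicweb.view import EntityAdapter
    
    class IGeocodableAdapter(EntityAdapter):
        # expose the coordinates of a Location to the map views
        __regid__ = 'IGeocodable'
        __select__ = is_instance('Location')
    
        @property
        def latitude(self):
            return self.entity.latitude
    
        @property
        def longitude(self):
            return self.entity.longitude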
    

    That was it for the development part! The next step was to use the application to browse the structure of the French train network on the map.

    Train stations in use:

    http://www.cubicweb.org/file/2110279?vid=download

    Train stations not in use:

    http://www.cubicweb.org/file/2110280?vid=download

    Zooming in on some parts of the map, for example Brittany, we get to see more details, and clicking on the train icons gives more information on the corresponding Location.

    Train stations in use:

    http://www.cubicweb.org/file/2110281?vid=download

    Train stations not in use:

    http://www.cubicweb.org/file/2110282?vid=download

    Since CubicWeb separates querying the data from displaying the result of a query, we can switch the view to display the same data in tables or to export it back to a CSV file.

    http://www.cubicweb.org/file/2110313?vid=download
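
    The idea of one result set rendered by interchangeable views can be illustrated outside CubicWeb. The sketch below is a toy illustration in plain Python, not CubicWeb's view API: the rows, the header and both function names are assumptions made for the example.

```python
import csv
import io


def csv_view(rows, header):
    """Render the rows as semicolon-separated CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=";", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def table_view(rows, header):
    """Render the same rows as a plain-text table with aligned columns."""
    # Each column is as wide as its longest cell, header included.
    widths = [max(len(str(value)) for value in column)
              for column in zip(header, *rows)]

    def format_row(row):
        return " | ".join(str(value).ljust(width)
                          for value, width in zip(row, widths))

    return "\n".join([format_row(header)] + [format_row(row) for row in rows])


# The "query result" is computed once, then handed to either view.
header = ("name", "latitude", "longitude")
rows = [
    ("St-Germain-sur-Ille", 48.23955, -1.65358),
    ("Montreuil-sur-Ille", 48.3072, -1.6741),
]
as_csv = csv_view(rows, header)
as_table = table_view(rows, header)
```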

    Querying Data

    CubicWeb implements a query language (RQL) very similar to SPARQL, which makes the data available without the need to learn a specific API.

    • Example 1: http://some.url.demo/?rql=Any X WHERE X is Location, X name LIKE "%miny"

      This request returns every Location whose name ends with "miny". It returns only one element, the Firminy train station.

    http://www.cubicweb.org/file/2110286?vid=download
    • Example 2: http://some.url.demo/?rql=Any X WHERE X is Location, X name LIKE "%ny"

      This request returns every Location whose name ends with "ny", which amounts to 112 train stations.

    http://www.cubicweb.org/file/2110287?vid=download
    • Example 3: http://some.url.demo/?rql=Any X WHERE X latitude < 47.8, X latitude>47.6, X longitude >-1.9, X longitude<-1.8

      This request returns every Location that has a latitude between 47.6 and 47.8, and a longitude between -1.9 and -1.8.

      We obtain 11 Locations (9 level crossings and 2 train stations). We can map them using the mapstraction.map view that we described previously.

      http://www.cubicweb.org/file/2110288?vid=download
    • Example 4: http://domainname:8080/?rql=Any X WHERE X latitude < 47.8, X latitude>47.6, X longitude >-1.9, X longitude<-1.8, X feature_type F, F name "Desserte Voyageur"

      This limits the previous result set to train stations that are used for passenger service:

      http://www.cubicweb.org/file/2110289?vid=download
    • Example 5: http://domainname:8080/?rql=Any X WHERE X feature_type F, F name "PN public pour voitures sans barrières sans SAL"&vid=mapstraction.map

      Finally, one can map all the level crossings for vehicles without barriers (there are 3704):

      http://www.cubicweb.org/file/2110290?vid=download

      http://www.cubicweb.org/file/2110291?vid=download

    As you can see in the last URL, the map view was chosen directly with the vid parameter, meaning that the URL is shareable and can easily be included in a blog with an iframe, for example.
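
    The URLs above are shown unescaped for readability. A minimal sketch of how such URLs could be assembled from a script is given below, using only the Python standard library. The helper names are assumptions made for the example; the rql and vid parameters and the base URL are the ones used in the examples above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def bounding_box_rql(lat_min, lat_max, lon_min, lon_max):
    """Build the RQL of example 3 for an arbitrary bounding box."""
    return (
        "Any X WHERE X latitude > %s, X latitude < %s, "
        "X longitude > %s, X longitude < %s"
        % (lat_min, lat_max, lon_min, lon_max)
    )


def build_query_url(base_url, rql, vid=None):
    """Return base_url with the RQL query (and optional view id) URL-encoded."""
    params = {"rql": rql}
    if vid is not None:
        params["vid"] = vid
    return base_url + "?" + urlencode(params)


rql = bounding_box_rql(47.6, 47.8, -1.9, -1.8)
url = build_query_url("http://some.url.demo/", rql, vid="mapstraction.map")
# Decoding the query string gives back the original parameters.
decoded = parse_qs(urlsplit(url).query)
```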

    Data sharing

    The result of a query can also be "displayed" in RDF, thus allowing users to download a semantic version of the information, without having to do the preprocessing themselves:

    <rdf:Description rdf:about="cwuri24684b3a955d4bb8830b50b4e7521450">
      <rdf:type rdf:resource="http://ns.cubicweb.org/cubicweb/0.0/Location"/>
      <cw:cw_source rdf:resource="http://some.url.demo/"/>
      <cw:longitude rdf:datatype="http://www.w3.org/2001/XMLSchema#float">-1.89599</cw:longitude>
      <cw:latitude rdf:datatype="http://www.w3.org/2001/XMLSchema#float">47.67778</cw:latitude>
      <cw:feature_type rdf:resource="http://some.url.demo/7222"/>
      <cw:data_source rdf:resource="http://some.url.demo/7206"/>
    </rdf:Description>
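
    Such an export can be consumed with any XML or RDF tool. As a minimal sketch, the Python standard library is enough to read the coordinates back. The excerpt above does not show its namespace declarations, so the enclosing rdf:RDF element and the URI bound to the cw prefix are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: the cw prefix is bound to the namespace seen in the rdf:type URI.
CW_NS = "http://ns.cubicweb.org/cubicweb/0.0/"

SAMPLE = """<rdf:RDF xmlns:rdf="%s" xmlns:cw="%s">
<rdf:Description rdf:about="cwuri24684b3a955d4bb8830b50b4e7521450">
  <rdf:type rdf:resource="http://ns.cubicweb.org/cubicweb/0.0/Location"/>
  <cw:longitude rdf:datatype="http://www.w3.org/2001/XMLSchema#float">-1.89599</cw:longitude>
  <cw:latitude rdf:datatype="http://www.w3.org/2001/XMLSchema#float">47.67778</cw:latitude>
</rdf:Description>
</rdf:RDF>""" % (RDF_NS, CW_NS)


def extract_coordinates(rdf_xml):
    """Return a list of (about, latitude, longitude) tuples."""
    root = ET.fromstring(rdf_xml)
    results = []
    for description in root.iter("{%s}Description" % RDF_NS):
        latitude = description.find("{%s}latitude" % CW_NS)
        longitude = description.find("{%s}longitude" % CW_NS)
        # Skip descriptions that carry no coordinates.
        if latitude is None or longitude is None:
            continue
        about = description.get("{%s}about" % RDF_NS)
        results.append((about, float(latitude.text), float(longitude.text)))
    return results


coordinates = extract_coordinates(SAMPLE)
```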
    

    Conclusion

    For someone who knows the CubicWeb framework, a couple of hours are enough to create a CubicWeb application that stores, displays, queries and shares data downloaded from http://www.data.gouv.fr/

    The full source code for the above will be released before the end of the week.

    If you want to see more of CubicWeb in action, browse http://data.bnf.fr or learn how to develop your own application at http://docs.cubicweb.org/


  • Roundup of "Powered by Cubicweb" websites

    2011/11/15 by Arthur Lutz

    Here is an (incomplete) list of public websites powered by CubicWeb. A lot of CubicWeb technology is used for private web applications in large companies that we cannot list here.

    Demos are listed here : http://www.cubicweb.org/card/demo

    You can also find a list of the companies providing services for Cubicweb (with a few extra examples) : https://www.cubicweb.org/card/CubicWebServiceProviders


  • What's new in CubicWeb 3.14?

    2011/11/10 by Sylvain Thenault

    The development of CubicWeb 3.14 was rather long and included a lot of API changes detailed here. As usual backward compatibility is provided for public APIs.

    Please note this release depends on yams 0.34 (which is incompatible with prior cubicweb releases regarding instance re-creation).

    API changes

    • Entity.fetch_rql: the restriction argument has been deprecated. Instead, call the new Entity.fetch_rqlst method, take the returned value (an RQL Select node) and use the RQL syntax tree API to add the restrictions.

      Backward compat is kept with proper warning.

    • Entity.fetch_order and Entity.fetch_unrelated_order class methods have been replaced by Entity.cw_fetch_order and Entity.cw_fetch_unrelated_order with a different prototype:

      • instead of taking (attr, var) as two string arguments, they now take (select, attr, var) where select is the rql syntax tree being constructed and var the variable node.
      • instead of returning some string to be inserted in the 'ORDERBY' clause, they have to modify the syntax tree.

      Backward compat is kept with proper warning, except if:

      • custom order method returns something other than a variable name with or without the sorting order (e.g. cases where you sort on the value of a registered procedure, as was done in the tracker for instance). In such cases, an error is logged telling that this sorting is ignored until API upgrade.
      • client code uses direct access to one of those methods on an entity (no code known to do that).
    • Entity._rest_attr_info class method has been renamed to Entity.cw_rest_attr_info

      No backward compat since this is a protected method and no code is known to use it outside cubicweb itself.

    • AnyEntity.linked_to has been removed as part of a refactoring of this functionality (linking an entity to another one at creation step). It was replaced by an EntityFieldsForm.linked_to property.

      In the same refactoring, cubicweb.web.formfield.relvoc_linkedto, cubicweb.web.formfield.relvoc_init and cubicweb.web.formfield.relvoc_unrelated were removed and replaced by RelationField methods with the same names, that take a form as a parameter.

      No backward compatibility yet. It's still time to cry for it. Cubes known to be affected: tracker, vcsfile, vcreview.

    • CWPermission entity type and its associated require_permission relation type (abstract) and require_group relation definitions have been moved to a new localperms cube. Some functions from the cubicweb.schemas package as well as some views were moved too. This makes cubicweb itself smaller while you get all the local permissions stuff into a single and documented place.

      Backward compat is kept for existing instances, though you should have installed the localperms cube. A proper error should be displayed when trying to migrate to 3.14 an instance that uses CWPermission without the new cube installed. For new instances / tests, you should add a dependency on the new cube in cubes using this feature, along with a dependency on cubicweb >= 3.14.

    • jQuery has been updated to 1.6.4 and jquery-tablesorter to 2.0.5. No backward compat issue known.

    • Table views refactoring: new RsetTableView and EntityTableView, as well as a rewritten and enhanced version of PyValTableView on the same bases, with logic moved to some column renderers and a layout. Those should be well documented and deprecate the former TableView, EntityAttributesTableView and CellView, which are however kept for backward compat, with some warnings that may unfortunately not be very clear (you may see your own table view subclass name here, which doesn't make the problem that clear). Notice that _cw.view('table', rset, **kwargs) will be routed to the new RsetTableView or to the old TableView depending on given extra arguments. See #1986413.

    • display_name doesn't call .lower() anymore. This may lead to changes in your user interface. There are now different msgids for the upper and lower case versions of entity type names, as this is the only proper way to handle this with some languages.

    • IEditControlAdapter has been deprecated in favor of EditController overloading, which was made easier by adding dedicated selectors called match_edited_type and match_form_id.

    • Pre 3.6 API backward compat has been dropped, though data migration compatibility has been kept. You may have to fix errors due to old API usage for your instance before being able to run the migration, but then you should be able to upgrade even a pre 3.6 database.

    • Deprecated cubicweb.web.views.iprogress in favor of new iprogress cube.

    • Deprecated cubicweb.web.views.flot in favor of new jqplot cube.

    Unintrusive API changes

    • Refactored properties forms (e.g. user preferences and site wide properties) as well as pagination components to ease overriding.

    • New cubicweb.web.uihelper module with high-level helpers for uicfg.

    • New anonymized_request decorator to temporarily run stuff as an anonymous user, whatever the currently logged in user.

    • New 'verbatimattr' attribute view.

    • New facet and form widget for Integer used to store binary mask.

    • New js_href function to generate proper javascript href.

    • match_kwargs and match_form_params selectors both accept a new once_is_enough argument.

    • printable_value is now a method of request, and may be given dict of formatters to use.

    • [Rset]TableView allows setting None in 'headers', meaning the label should be fetched from the result set as done by default.

    • Field vocabulary computation on entity creation now takes __linkto information into account.

    • Started a cubicweb.pylintext pylint plugin to help pylint analyzing cubes: you should now use

      pylint --load-plugins=cubicweb.pylintext
      

      to analyse your cubicweb code.

    RQL

    User interface changes

    • Datafeed sources now present a history of the logs of the latest imports, including global status and debug/info/warning/error messages issued during imports. Import logs older than a configurable amount of time are automatically deleted.
    • Breadcrumbs component is properly kept when creating an entity with '__linkto'.
    • Users and groups management now really leads to that (i.e. includes groups management).
    • New 'jsonp' controller with 'jsonexport' and 'ejsonexport' views.

    Configuration

    • Added option 'resources-concat' to make javascript/css files concatenation optional, making JS debugging a lot easier when needed.

    As usual, the 3.14 also includes a bunch of other minor changes and bug fixes, though this time an effort has been made so that every API change / new API is listed here. Please download and install CubicWeb 3.14 and report any problem on the tracker and/or the mailing-list!

    Enjoy!


  • ensure that 2 boolean attributes of an entity never have the same value

    2011/09/08

    I want to implement an entity with 2 boolean attributes, and a requirement is that these two attributes never have the same boolean value (think of some kind of radio buttons).

    Let's start with a simple schema example:

    # in schema.py
    class MyEntity(EntityType):
       use_option1 = Boolean(required=True, default=True)
       use_option2 = Boolean(required=True, default=False)
    

    So new entities will conform to the spec.

    To keep it that way when entities are edited, you need two things:

    • a constraint in the entity schema which will raise an error if both attributes have the same value
    • a hook which will toggle the other attribute when one attribute is changed.

    RQL constraints are generally meant to be used on relations, but you can use them on attributes too. Simply use 'S' to denote the entity, and write the constraint normally. You need to have the same constraint on both attributes, because the constraint evaluation is triggered by the modification of the attribute.

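    # in schema.py
    from yams.buildobjs import EntityType, Boolean
    from cubicweb.schema import RQLConstraint
    
    class MyEntity(EntityType):
       # the same constraint is set on both attributes, since it is only
       # evaluated when the attribute holding it is modified
       use_option1 = Boolean(required=True, default=True,
                             constraints=[
                                 RQLConstraint('S use_option1 O1, S use_option2 != O1')
                             ])
       use_option2 = Boolean(required=True, default=False,
                             constraints=[
                                 RQLConstraint('S use_option1 O1, S use_option2 != O1')
                             ])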
    

    With this update, it is no longer possible to have both options set to True or False (you will get a ValidationError). The nice thing to have is to get the other option to be updated when one of the two attributes is changed, which means that you don't have to take care of this when editing the entity in the web interface (which you cannot do anyway if you are using reledit for instance).

    A nice way of writing the hook is to use Python's sets to avoid tedious logic code:

    from cubicweb.selectors import is_instance
    from cubicweb.server.hook import Hook
    
    class RadioButtonUpdateHook(Hook):
       '''ensure use_option1 = not use_option2 (and conversely)'''
       __regid__ = 'mycube.radiobuttonhook'
       events = ('before_update_entity', 'before_add_entity')
       __select__ = Hook.__select__ & is_instance('MyEntity')
       # we prebuild the set of boolean attribute names
       _flag_attributes = set(('use_option1', 'use_option2'))
       def __call__(self):
           entity = self.entity
           edited = set(entity.cw_edited)
           attributes = self._flag_attributes
           if attributes.issubset(edited):
               # both were changed, let the integrity hooks do their job
               return
           if not attributes & edited:
               # none of our attributes were changed, do nothing
               return
           # find which attribute was modified
           modified_set = attributes & edited
           # find the name of the other attribute
           to_change = (attributes - modified_set).pop()
           modified_name = modified_set.pop()
           # set the value of that attribute
           entity.cw_edited[to_change] = not entity.cw_edited[modified_name]
    

    That's it!
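
    The set arithmetic used by the hook can be tried outside CubicWeb. In the sketch below a plain dict stands in for entity.cw_edited; the function name is an assumption made for the example, and no CubicWeb API is involved.

```python
FLAG_ATTRIBUTES = frozenset(("use_option1", "use_option2"))


def sync_flags(edited):
    """Mirror the hook logic on a dict standing in for entity.cw_edited."""
    changed = FLAG_ATTRIBUTES & set(edited)
    # Both flags edited (left to the constraint) or none edited: do nothing.
    if len(changed) != 1:
        return edited
    (modified_name,) = changed
    (to_change,) = FLAG_ATTRIBUTES - changed
    # The other flag always takes the opposite value.
    edited[to_change] = not edited[modified_name]
    return edited


only_one = sync_flags({"use_option1": False})
both = sync_flags({"use_option1": True, "use_option2": True})
neither = sync_flags({"name": "unrelated"})
```

    When both flags are edited at once, the sketch leaves them untouched, just as the hook does, so that the schema constraint is the one reporting an invalid combination.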


  • What's new in CubicWeb 3.13?

    2011/07/21 by Sylvain Thenault

    CubicWeb 3.13 has been developed for a while and includes some cool stuff:

    • generate and handle Apache's modconcat compatible URLs, to minimize the number of HTTP requests necessary to retrieve JS and CSS files, along with a new cubicweb-ctl command to generate a static 'data' directory that can be served by a front-end instead of CubicWeb
    • major facet enhancements:
      • nicer layout and visual feedback when filtering is in-progress
      • new RQLPathFacet to easily express new filters that are more than one hop away from the filtered entities
      • a more flexible API, usable in cases where it wasn't previously possible
    • some form handling refactorings and cleanups, notably introduction of a new method to process posted content, and updated documentation
    • support for new base types : BigInt, TZDateTime and TZTime (in 3.12 actually for those two)
    • write queries optimization, and several RQL fixes on complex queries (e.g. using HAVING, sub-queries...), as well as new support for CAST() function and REGEXP operator
    • datafeed source and default CubicWeb xml parsers:
      • refactored into smaller and overridable chunks
      • easier to configure
      • make it work

    As usual, the 3.13 also includes a bunch of other minor enhancements, refactorings and bug fixes. Please download and install CubicWeb 3.13 and report any problem on the tracker and/or the mailing-list!

    Enjoy!


  • CubicWeb sprint in Paris / Need for Speed

    2011/03/22 by Adrien Di Mascio

    Logilab is hosting a CubicWeb sprint - 3 days in our Paris offices.

    The general focus will be on speed :

    • on cubicweb-server side : improve performance of massive insertions / deletions
    • on cubicweb-client side : cache implementation, HTTP server, massive parallel usage, etc.

    This sprint will take place in April 2011, from Tuesday the 26th to Thursday the 28th. You are more than welcome to come along and help out or contribute but, unlike previous sprints, at least basic knowledge of CubicWeb will be required for participants since no introduction is planned.

    Network resources will be available for those bringing laptops.

    Address : 104 Boulevard Auguste-Blanqui, Paris. Ring "Logilab" (googlemap)

    Metro : Glacière

    Contact : http://www.logilab.fr/contact

    Dates : 26/04/2011 to 28/04/2011


  • What's new in CubicWeb 3.11?

    2011/02/18 by Sylvain Thenault

    Unlike recent major versions of CubicWeb, the 3.11 doesn't come with many API changes or refactorings and introduces a fairly small set of new features. But those are important features!

    • 'pyrorql' sources mapping is now stored in the database instead of a python file in the instance's home. This eases the deployment and maintenance of distributed applications.

    • A new 'datafeed' source was introduced, inspired by the soon to be deprecated datafeed cube. It needs polishing but sets the foundation for advanced semantic web applications that import content from other sites using simple HTTP requests.

      A 'datafeed' source is associated to a parser that analyses the imported data and then creates/updates entities accordingly. There is currently a single parser in the core that imports CubicWeb-generated xml and needs to be configured with a mapping information that defines how relations are to be followed. It provides a viable alternative to 'pyrorql' sources. Other parsers to import RDF, RSS, etc should come soon.

      A new facet to filter entities based on the source they came from is now available.

    • The management interface for users, groups, sources and site preferences was simplified so it should be more intuitive to newbies (and others). Most items have been dropped from the user drop-down menu and the simpler views were made available through the '/manage' url.

    • The default 'index' / 'manage' view has been simplified to deprecate features that rely on external folder and card cubes. That's almost the only deprecation warning you'll get in upgrading to 3.11. Just this one won't hurt!

    • The old_calendar module has been dropped in favor of jQuery's fullcalendar powered views. That's great news for applications using calendar features. Since it was added to the existing calendar module, you shouldn't have to change anything to get it working, unless you were using old_calendar, in which case you may have to update a few things. This work was initiated by our Mexican friends from Crealibre.

    As usual, the 3.11 also includes a bunch of other minor enhancements, refactorings and bug fixes. Please download and install CubicWeb 3.11 and report any problem to the mailing-list!

    Enjoy!


  • A simple scalable web server HA architecture suitable for medium sized projects

    2011/02/15 by Florent Cayré

    Having deployed and maintained several public medium sized web sites running CubicWeb when I worked at SecondWeb, I was asked by my friends from Logilab to write a blog post describing how we managed our deployment while working with the customer and the hosting company.

    Non technical (albeit important) considerations

    Customers that want to run such a medium traffic web site either tell you which hosting company they partner with, or ask you to find one, so you have no choice but to deal with an external hosting structure to manage the servers. I prefer this by the way because:

    1. High Availability (HA) hosting really requires skills and hardware that are neither common nor cheap;
    2. HA hosting requires 24/7/365 availability that SecondWeb could not (and did not even want to) offer.

    It is clearly difficult for all parties (try to put yourself in the shoes of the customer...) to manage a website with 3 partners involved, each with their own goals. From the development leader's point of view, you will notice that the technical people of the hosting company continuously change and you keep seeing the same operational errors even if you provide and keep improving high quality documentation. The software upgrade documentation has to be particularly clear as it greatly influences the overall web site availability. You also have to keep a history of the interventions on the servers yourself and maintain an up-to-date copy of the configuration files.

    The overall architecture proposed here partly benefits from this experience with managed hosting companies, in that we tried to keep it simple.

    Which traffic size ? Why not bigger ?

    The architecture proposed here has been successfully tested with sites delivering web pages to up to 2 million unique visitors per month. It should scale further up depending on your site database access needs: if you need very fresh data and have a lot of write operations to the database, you will need to distribute database access amongst several servers, which is beyond the scope of this post.

    This is the main limitation of the proposed architecture and the reason why it is not well-suited for higher traffic.

    Design choices

    Load balancing - Preserve user sessions

    To achieve very high availability for your web site, you must have no single point of failure in the whole architecture, which can be far from reasonable from the costs point of view. However, hosting companies can share costs between their customers and have them benefit from a double network infrastructure all along the way from the Internet to your web servers, themselves hosted on two distant locations. You may then choose an even number of web servers, half of them hosted on each network infrastructure.

    The important thing is that you must preserve user sessions. As of CubicWeb 3.10, DB persistent sessions have not been implemented yet (they will be soon, there is a ticket planned for this functionality), thus you must preserve session cookies by always directing a given user to the same web server. This is usually achieved by configuring the load balancer(s) in IP hash mode, which is faster than balancing on the session cookie because the latter implies reaching the HTTP stack rather than staying at the TCP/IP level.
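
    The principle of IP hash balancing fits in a few lines: hash the client address and use the result to pick a server, so that a given address always lands on the same server. The Python sketch below only illustrates the idea; real load balancers implement it themselves, and the function name, server names and choice of MD5 are assumptions made for the example.

```python
import hashlib


def pick_server(client_ip, servers):
    """Map a client IP address to one server, always the same for that IP."""
    # A stable hash is needed here: Python's built-in hash() of a string
    # changes from one process to the next.
    digest = hashlib.md5(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


servers = ["web1", "web2", "web3", "web4"]  # hypothetical server names
first = pick_server("203.0.113.7", servers)
second = pick_server("203.0.113.7", servers)
```

    Note that the mapping changes for most clients when the number of servers changes, which is one more reason to keep the set of web servers stable.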

    Squid caching, processor load balancing

    Now if you have multi-processor web servers (which is very likely these days) you will need to use one CubicWeb application instance per processor or the Python GIL will limit the CPU of your application to a fraction of the available power. This is pretty easy, you just have to duplicate configuration directories from /etc/cubicweb.d, changing instance names and ports. You can use a simple sed-based script to generate these copies automatically and keep them in sync.
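
    For those who prefer Python to sed, the duplication could be sketched as follows. This is a rough sketch under strong assumptions: it supposes that the configuration directory only contains text files and that the instance name and port appear literally in them, and the function name, directory names and replacement values are hypothetical.

```python
import os
import shutil


def clone_instance_config(src_dir, dst_dir, replacements):
    """Copy a configuration directory and apply literal text replacements."""
    # copytree() refuses to overwrite an existing dst_dir.
    shutil.copytree(src_dir, dst_dir)
    for dirpath, _dirnames, filenames in os.walk(dst_dir):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8") as stream:
                text = stream.read()
            for old, new in replacements.items():
                text = text.replace(old, new)
            with open(path, "w", encoding="utf-8") as stream:
                stream.write(text)


# Hypothetical usage, with made-up instance names and ports:
# clone_instance_config("/etc/cubicweb.d/myapp_1", "/etc/cubicweb.d/myapp_2",
#                       {"myapp_1": "myapp_2", "8081": "8082"})
```

    Literal replacement is as blunt as the sed approach: check that the port number does not also appear in unrelated values before relying on it.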

    Now that we have one instance per processor, the problem of preserving sessions is back. It can be elegantly solved using Squid, which can of course deliver cached objects (in particular images, more on this later), but also listen on several ports and distribute incoming requests evenly among the CubicWeb instances based on their port of origin. Note that the load balancer must be set up to balance between ports of the web servers, one port for each processor. The Squid configuration file to achieve this looks like:

    http_port 81 defaultsite=www.example.org vhost
    acl portA myport 81
    
    http_port 82 defaultsite=www.example.org vhost
    acl portB myport 82
    
    acl site1 dstdomain www.example.org
    
    cache_peer 127.0.0.1 parent 8081 0 no-query originserver default name=server_1
    cache_peer_access server_1 allow portA site1
    cache_peer_access server_1 deny all
    
    cache_peer 127.0.0.1 parent 8082 0 no-query originserver default name=server_2
    cache_peer_access server_2 allow portB site1
    cache_peer_access server_2 deny all
    

    This is a way to set up Squid to listen to ports 81 and 82 and distribute requests for www.example.org to ports 8081 and 8082 respectively. This way, requests should be evenly balanced between the processors of a bi-processor web server.

    You can now set up Squid more classically to achieve what it was originally designed for: caching. See Squid docs for this, particularly the refresh_pattern directive. Note you do not need to force any HTTP cache standard feature in Squid, as CubicWeb enables you to fine tune caching using simple HTTPCacheManager classes found in cubicweb/web/httpcache.py (at the end of this file, you will also find default cache manager configuration for the entity and startup views).
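
    With more than two processors, writing these stanzas by hand becomes repetitive. The sketch below generates the same directives for any number of instances. It is only an illustration: the function name is an assumption, the ACLs are named port_1, port_2... instead of portA, portB, and the output should be reviewed against the Squid documentation before use.

```python
def squid_config(site, listen_ports, backend_ports):
    """Generate one http_port / cache_peer stanza per CubicWeb instance."""
    lines = ["acl site1 dstdomain %s" % site, ""]
    pairs = zip(listen_ports, backend_ports)
    for index, (listen, backend) in enumerate(pairs, start=1):
        lines += [
            "http_port %d defaultsite=%s vhost" % (listen, site),
            "acl port_%d myport %d" % (index, listen),
            "cache_peer 127.0.0.1 parent %d 0 no-query originserver default"
            " name=server_%d" % (backend, index),
            "cache_peer_access server_%d allow port_%d site1" % (index, index),
            "cache_peer_access server_%d deny all" % index,
            "",
        ]
    return "\n".join(lines)


config = squid_config("www.example.org", [81, 82], [8081, 8082])
```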

    CubicWeb with Apache frontend

    This is controversial but it has never hurt me: I like to put an Apache frontend between Squid and the Twisted-based CubicWeb application, because the hosting companies are usually pretty good at setting it up and like to use server status for monitoring, mod_deflate for textual content compression, and mod_rewrite and other modules to customize, monitor or fine tune the web servers.

    It can however be argued that Apache is a huge piece of software for such a restrictive usage, and its memory footprint would be better used for caching.

    No shared disk

    This is an interesting part that simplifies the overall setup: if you want to save data on disk, it is likely that you also want to keep it in sync between the web servers, or use a highly secure network storage solution.

    As we already have a data store accessible from the web servers, namely the database itself, I often choose to use it even for images. This looks like the nightmare of every sysadmin, but if you make sure the images are not fetched every second from the database, by using fine tuned cache settings, it will not hurt. And this way you still benefit from the flexibility of a database and the easier maintenance of a single data store. We can use CubicWeb cache settings to allow Squid to cache images for 1 hour for example. If you have a very dynamic web site however, you will then need to force a URL change when an image is edited. This can easily be achieved in CubicWeb using a custom edit controller that creates a new image when the data attribute of an Image instance was edited, as illustrated here:

    from cubicweb import typed_eid
    from cubicweb.selectors import yes
    from cubicweb.web.views.editcontroller import EditController
    
    
    class CustomEditController(EditController):
        __select__ = EditController.__select__ & yes()
    
        def handle_updated_image(self, old_eid):
            'modify submitted form to change old_eid into a new entity eid in all key/ values'
            old_eid = unicode(old_eid)
            form = self._cw.form
            new_eid = self._cw.varmaker.next()
            # handle image eid
            del form['__type:%s' % old_eid]
            form['__type:%s' % new_eid] = u'Image'
            # handle eid list
            index = form['eid'].index(old_eid)
            form['eid'] = form['eid'][:index] + [new_eid] + form['eid'][index+1:]
            # handle attribute and relations
            # iterate over a snapshot: keys are added and removed inside the loop
            for (k, v) in form.items():
                if v == old_eid:
                    form[k] = new_eid
                if k.endswith(u':%s' % old_eid):
                    form[k[:-len(old_eid)] + new_eid] = v
                    del form[k]
    
        def _default_publish(self):
            # implement image creation when data image was updated, so that we can use
            # a far expiry date cache on download view
            images = []
            # iterate over a snapshot: handle_updated_image() modifies the form
            for (k, v) in self._cw.form.items():
                if v != 'Image' or not k.startswith('__type') or k == self._cw.form['__maineid']:
                    continue
                try:
                    eid = typed_eid(k[7:])
                except ValueError:
                    continue
                if self._cw.form.get('data-subject:%s' % eid, None):
                    self.handle_updated_image(eid)
                    images.append(eid)
            super(CustomEditController, self)._default_publish()
            for eid in images:
                self._cw.execute('DELETE Image I WHERE I eid %(eid)s', {'eid': eid})
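
    The key rewriting done by handle_updated_image can be tried outside CubicWeb. The following is a minimal standalone Python 3 sketch working on a plain dict; the function name rename_eid_in_form and the sample form are hypothetical, and the keys only mimic those used by the controller above. It builds a new dict instead of editing the form in place, which sidesteps any mutation during iteration.

```python
def rename_eid_in_form(form, old_eid, new_eid):
    """Return a copy of `form` where `old_eid` is replaced by `new_eid`,
    both in values and in keys ending with ':<old_eid>'."""
    suffix = ':%s' % old_eid
    renamed = {}
    for key, value in form.items():
        if value == old_eid:
            value = new_eid
        elif isinstance(value, list):
            # e.g. the 'eid' entry, which lists every edited eid
            value = [new_eid if item == old_eid else item for item in value]
        if key.endswith(suffix):
            key = key[:-len(old_eid)] + new_eid
        renamed[key] = value
    return renamed


# hypothetical submitted form: image 123 got new data, entity 456 did not
form = {
    '__type:123': 'Image',
    'eid': ['123', '456'],
    'data-subject:123': 'new-bytes',
    'title-subject:456': 'unchanged',
}
new_form = rename_eid_in_form(form, '123', 'A')
print(sorted(new_form.items()))
```

    Entries that do not mention the old eid are copied untouched, and the original form is left as it was.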
    

    To set a 1-hour expiry on the image download view, you can use:

    from cubicweb.selectors import yes
    from cubicweb.web import httpcache
    from cubicweb.web.views.idownloadable import DownloadView
    
    class CustomDownloadView(DownloadView):
        __select__ = DownloadView.__select__ & yes()
        http_cache_manager = httpcache.MaxAgeHTTPCacheManager
        cache_max_age = 3600
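
    At the HTTP level, a one-hour lifetime is usually conveyed to a cache such as Squid through headers like the ones built below. This is a standalone Python 3 sketch, not CubicWeb code: the helper max_age_headers is hypothetical, and which headers the cache manager above really emits is not shown in this post.

```python
from email.utils import formatdate


def max_age_headers(max_age, now):
    """Build HTTP caching headers for a response valid `max_age` seconds.

    `now` is a Unix timestamp, passed explicitly to keep the result
    reproducible."""
    return {
        'Cache-Control': 'max-age=%d' % max_age,
        'Expires': formatdate(now + max_age, usegmt=True),
    }


headers = max_age_headers(3600, now=0)
print(headers)
```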
    

    Database server

    Hosting companies now often have pretty good knowledge of PostgreSQL, the favorite database back end for CubicWeb. They usually offer to replicate the database for data safety at a low cost, using PostgreSQL's log shipping feature. Note that the new PostgreSQL 9 versions should make it easier to set up replication modes that could improve performance and scalability, but production-level experience is still lacking for the moment. Please share yours if you have some, because replication is the main issue to deal with in order to scale up further.

    Pre-production

    It is worth mentioning that you need a pre-production server hosted by the same company on the same hardware (or virtual machine), because:

    • software upgrades will run more smoothly if the technical staff of the hosting company has already performed the same upgrade operation once: check that the same person does both within a short timeframe if possible;
    • you will feel better if your migration scripts have successfully run on a fresh copy of the production data: ask for a db copy before a pre-production upgrade; this is much easier to do if you do not have to copy the database dumps remotely;
    • the pre-production server can host its own database server and the replication of the production one.

    Monitoring

    When you experience web site downtime, it is much too late to take a first look at the available monitoring. It is important to prepare the tools you need to diagnose a problem, get used to reading the graphs and keep in mind the orders of magnitude of the values and their variations.

    Even the simplest graphs, like CPU usage, need to be correctly interpreted. In a recent setup, I did not realize that only one CPU was used on a dual-processor server, delivering half the power it should... When you cannot access the machine and use top, the monitoring graphs are the only information you have, so you must know how to read them!

    Apart from the classical CPU, CPU load, (detailed) memory usage, and network traffic, ask for PostgreSQL, Squid, and Apache specific graphs (plug-ins for them are easy to find and install for classic monitoring solutions).

    For CubicWeb web sites, it is also worth setting up the following views and using them for automatic alerts:

    • a software / db version consistency monitoring
    • a db pool size monitoring
    • a simple db connection check view
    • a view writing the server host name: it is of no use for automatic alerts, but it shows which server your IP is directed to, which is needed when you cannot reproduce the behaviour the customer is complaining about...

    Here are some classes I use for these tasks. Feel free to reuse and adapt them to your needs:

    from socket import gethostname
    
    from cubicweb.selectors import yes
    from cubicweb.view import View
    
    
    class _MonitoringView(View):
        __abstract__ = True
        __select__ = yes()
        content_type = 'text/plain'
        templatable = False
    
    
    class PoolMonitoringView(_MonitoringView):
        __regid__ = 'monitor_pool'
    
        def call(self):
            repo = self._cw.cnx._repo
            max_pool = self._cw.vreg.config['connections-pool-size']
            percent = ((max_pool - repo._available_pools.qsize()) * 100.0) / max_pool
            self.w(u'%s%%' % percent)
    
    
    class DBMonitoringView(_MonitoringView):
        __regid__ = 'monitor_db'
    
        def call(self):
            try:
                count = self._cw.execute('Any COUNT(X) WHERE X is CWUser')[0][0]
                self.w(u'ServiceOK : %s users in DB' % count)
            except Exception:
                self.w(u'ServiceKO')
    
    
    class VersionMonitoringView(_MonitoringView):
        __regid__ = 'monitor_version'
    
        def versions_text(self, versions):
            return u' | '.join(cube + u': ' + u'.'.join(unicode(x) for x in version)
                               for (cube, version) in versions)
    
        def call(self):
            config = self._cw.vreg.config
            vc_config = config.vc_config()
            db_config = [('cubicweb', vc_config.get('cubicweb', '?'))]
            fs_config = [('cubicweb', config.cubicweb_version())]
            for cube in sorted(config.cubes()):
                db_config.append((cube, vc_config.get(cube, '?')))
                try:
                    fs_version = config.cube_version(cube)
                except Exception:
                    fs_version = '?'
                fs_config.append((cube, fs_version))
            db_config = self.versions_text(db_config)
            fs_config = self.versions_text(fs_config)
            if db_config == fs_config:
                self.w(u'ServiceOK : FS config %s == DB config %s' % (fs_config, db_config))
            else:
                self.w(u'ServiceKO : FS config %s != DB config %s' % (fs_config, db_config))
    
    
    class HostnameMonitoringView(_MonitoringView):
        __regid__ = 'monitor_hostname'
    
        def call(self):
            self.w(unicode(gethostname()))
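
    On the alerting side, a probe only has to look at the text these views write. Below is a small standalone Python 3 sketch of such checks; the function names and the 90% threshold are assumptions, and fetching the view output over HTTP is left out on purpose.

```python
def service_is_ok(body):
    """True when a monitoring view answered with the 'ServiceOK' prefix."""
    return body.strip().startswith('ServiceOK')


def pool_is_saturated(body, threshold=90.0):
    """True when the pool usage written as e.g. '75.0%' reaches `threshold`.

    The default threshold is an arbitrary value chosen for this sketch."""
    return float(body.strip().rstrip('%')) >= threshold


# sample outputs, shaped like what the views above write
print(service_is_ok('ServiceOK : 12 users in DB'))
print(service_is_ok('ServiceKO'))
print(pool_is_saturated('75.0%'))
```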
    

    Sketch of the architecture and conclusion

    Here is a sketch of the proposed architecture. Please comment on it and share your experience on the topic; I would be happy to learn your tips and tricks.

    I would conclude with an important remark regarding performance: a good scalable architecture is of great help to run a busy web site smoothly, but the boost you get by optimizing your software is usually worth the effort and must be seriously considered before any hardware upgrade, even if it seems costly at first glance.

    /file/1521968?vid=download

  • Building my photos web site with CubicWeb part V: let's make it even more user friendly

    2011/01/24 by Sylvain Thenault

    We'll now see how to benefit from features introduced in the 3.9 and 3.10 releases of CubicWeb.

    Step 1: tired of the default look?

    OK... Now our site has its most desired features. But... I would like to make it look somewhat like my website. It is not www.cubicweb.org after all. Let's tackle this first!

    The first thing we can do is change the logo. There are various ways to achieve this. The easiest is to put a logo.png file into the cube's data directory. As data files are looked up according to cube order (CubicWeb resources coming last), that file will be selected instead of CubicWeb's own.

    Note

    As the location of static resources is cached, you'll have to restart your instance for this to be taken into account.

    There are some cases, though, where you don't want to use a logo.png file, for instance if your logo is a JPEG file. You can still change the logo by defining in the cube's uiprops.py file:
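
    The lookup order described above (cube directories first, CubicWeb's own resources last) boils down to a first-match search. Here is a standalone Python 3 sketch of that idea, using temporary directories; find_resource and the directory layout are hypothetical and do not reproduce CubicWeb's actual implementation.

```python
import os
import tempfile


def find_resource(name, directories):
    """Return the path of `name` in the first directory that contains it."""
    for directory in directories:
        path = os.path.join(directory, name)
        if os.path.exists(path):
            return path
    return None


with tempfile.TemporaryDirectory() as root:
    cube_data = os.path.join(root, 'sytweb', 'data')
    cubicweb_data = os.path.join(root, 'cubicweb', 'data')
    for directory in (cube_data, cubicweb_data):
        os.makedirs(directory)
        open(os.path.join(directory, 'logo.png'), 'w').close()
    # cube directories come first, CubicWeb's own resources last
    found = find_resource('logo.png', [cube_data, cubicweb_data])
    found_in_cube = found.startswith(cube_data)
    print(found_in_cube)
```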

    LOGO = data('logo.jpg')
    

    The uiprops machinery was introduced in CubicWeb 3.9. It is used to define some static file resources, such as the logo and default JavaScript / CSS files, as well as CSS properties (we'll see that later).

    Note

    This file is loaded in a specific way by CubicWeb, with a predefined namespace containing for instance the data function, which tells CubicWeb that the file is located in the data directory of a cube or of CubicWeb itself.

    One side effect of this is that it can't be imported as a regular Python module.

    The nice thing is that in debug mode, changes to a uiprops.py file are detected and the file is automatically reloaded.
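
    To make the "predefined namespace" idea concrete, here is a standalone Python 3 sketch that executes uiprops-like source with data and sheet already defined. It is an illustration only: load_uiprops and the '/data/' URL prefix are hypothetical, and CubicWeb's real loader is not shown in this post.

```python
def load_uiprops(source, base_sheet):
    """Execute uiprops-like `source` with `data` and `sheet` predefined and
    return the names it defined."""
    namespace = {
        'data': lambda name: '/data/' + name,  # hypothetical URL scheme
        'sheet': base_sheet,
    }
    exec(compile(source, '<uiprops>', 'exec'), namespace)
    # drop the predefined names and what exec() adds (e.g. __builtins__)
    return {key: value for key, value in namespace.items()
            if key not in ('data', 'sheet') and not key.startswith('__')}


props = load_uiprops("LOGO = data('logo.jpg')\n", base_sheet={})
print(props)
```

    Since data only exists in the namespace handed to exec, importing such a file as a regular module would fail with a NameError.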

    Now, as it's a photo web site, I would like to have a photo of mine as background... After some trials I won't detail here, I've found a working recipe explained here. All I have to do is override some parts of the default CubicWeb user interface to apply it as explained.

    The first thing to do is to get the <img/> tag as the first element after the <body> tag. If you know a way to avoid this by simply specifying the image in the CSS, tell me! The easiest way to do so is to override the HTMLPageHeader view, since that's the one that is directly called once the <body> has been written. How did I find this? By looking in the cubicweb.web.views.basetemplates module, since I know that global page layouts sit there. I could also have grepped for the "body" tag in cubicweb.web.views... Finding this was the hardest part. Now all I need is to customize it to write that img tag, as below:

    from cubicweb.web.views import basetemplates
    
    
    class HTMLPageHeader(basetemplates.HTMLPageHeader):
        # override this since it's the easier way to have our bg image
        # as the first element following <body>
        def call(self, **kwargs):
            self.w(u'<img id="bg-image" src="%sbackground.jpg" alt="background image"/>'
                   % self._cw.datadir_url)
            super(HTMLPageHeader, self).call(**kwargs)
    
    
    def registration_callback(vreg):
        # the last argument must be a tuple, hence the trailing comma
        vreg.register_all(globals().values(), __name__, (HTMLPageHeader,))
        vreg.register_and_replace(HTMLPageHeader, basetemplates.HTMLPageHeader)
    

    As you may have guessed, my background image is in a background.jpg file in the cube's data directory, but there are still some things to explain to newcomers here:

    • The call method is the main access point of the view. It's called by the view's render method. It is not the only access point for a view, but this will be detailed later.
    • Calling self.w writes something to the output stream. Except for binary views (which do not generate text), it must be passed a Unicode string.
    • The proper way to get a file in the data directory is to use the datadir_url attribute of the incoming request (i.e. self._cw).
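
    The three points above can be mimicked with a few plain classes. The following standalone Python 3 sketch uses simplified stand-ins (View, BaseHeader and the '/data/' URL are all hypothetical, not CubicWeb classes) to show render calling call, w collecting text, and an override writing its tag before delegating to the parent.

```python
class View:
    """Simplified stand-in for a view: render() calls call(), w() collects text."""

    def __init__(self):
        self._chunks = []

    def w(self, text):
        if not isinstance(text, str):
            raise TypeError('w() expects text, got %r' % type(text))
        self._chunks.append(text)

    def render(self, **kwargs):
        self.call(**kwargs)
        return ''.join(self._chunks)

    def call(self, **kwargs):
        raise NotImplementedError


class BaseHeader(View):
    def call(self, **kwargs):
        self.w('<div id="header"></div>')


class BackgroundHeader(BaseHeader):
    datadir_url = '/data/'  # hypothetical; the real value comes from the request

    def call(self, **kwargs):
        # write the image first, then let the parent write the usual header
        self.w('<img id="bg-image" src="%sbackground.jpg"/>' % self.datadir_url)
        super().call(**kwargs)


html = BackgroundHeader().render()
print(html)
```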

    I won't explain again the registration_callback stuff, you should understand it now! If not, go back to previous posts in the series :)

    Fine. Now all I've to do is to add a bit of CSS to get it to behave nicely (which is not the case at all for now). I'll put all this in a cubes.sytweb.css file, stored as usual in our data directory:

    /* fixed full screen background image
     * as explained on http://webdesign.about.com/od/css3/f/blfaqbgsize.htm
     *
     * syt update: set z-index=0 on the img instead of z-index=1 on div#page & co to
     * avoid pb with the user actions menu
     */
    img#bg-image {
        position: fixed;
        top: 0;
        left: 0;
        width: 100%;
        height: 100%;
        z-index: 0;
    }
    
    div#page, table#header, div#footer {
        background: transparent;
        position: relative;
    }
    
    /* add some space around the logo
     */
    img#logo {
        padding: 5px 15px 0px 15px;
    }
    
    /* more dark font for metadata to have a chance to see them with the background
     *  image
     */
    div.metadata {
        color: black;
    }
    

    You can see here stuff explained in the cited page, with only a slight modification explained in the comments, plus some additional rules to make things somewhat cleaner:

    • a bit of padding around the logo
    • darker metadata which appears by default below the content (the white frame in the page)

    To get this CSS file used everywhere in the site, I have to modify the uiprops.py file introduced above:

    STYLESHEETS = sheet['STYLESHEETS'] + [data('cubes.sytweb.css')]
    

    Note

    sheet is another predefined variable containing the values defined by already processed uiprops.py files, notably CubicWeb's own.

    Here we simply want our CSS in addition to CubicWeb's base CSS files, so we redefine the STYLESHEETS variable as the existing CSS list (accessed through the sheet variable) with ours added. I could also have done:

    sheet['STYLESHEETS'].append(data('cubes.sytweb.css'))
    

    But this is less interesting since we don't see the overriding mechanism...
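
    The difference between the two spellings is plain Python: the first builds a new list and rebinds the name, the second mutates the existing list in place. A standalone Python 3 sketch, with a hypothetical base sheet holding a single 'cubicweb.css' entry:

```python
# hypothetical content of the already processed base sheet
base_sheet = {'STYLESHEETS': ['cubicweb.css']}

# first spelling: a new list is built, the base list is left untouched
overridden = base_sheet['STYLESHEETS'] + ['cubes.sytweb.css']

# second spelling: the existing list is modified in place
mutated_sheet = {'STYLESHEETS': ['cubicweb.css']}
mutated_sheet['STYLESHEETS'].append('cubes.sytweb.css')

print(base_sheet['STYLESHEETS'])
print(overridden)
print(mutated_sheet['STYLESHEETS'])
```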

    At this point, the site should start looking good, the background image being resized to fit the screen.

    http://www.cubicweb.org/file/1440508?vid=download

    The final touch: let's customize CubicWeb's CSS to get less orange... By simply adding

    contextualBoxTitleBg = incontextBoxTitleBg = '#AAAAAA'
    

    and reloading the page we've just seen, we now have a nice greyed box instead of the orange one:

    http://www.cubicweb.org/file/1440510?vid=download

    This is because CubicWeb's CSS includes some variables which are expanded using the values defined in uiprops files. In our case, we changed the CSS background property of boxes whose CSS class is contextualBoxTitleBg or incontextBoxTitleBg.
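
    The expansion itself can be pictured as ordinary string templating. The standalone Python 3 sketch below assumes Python %-style placeholders and uses a made-up selector; both are assumptions for illustration, not a copy of CubicWeb's stylesheets.

```python
# hypothetical CSS rule with a placeholder named after the uiprops variable
css_template = 'div.boxTitle { background: %(contextualBoxTitleBg)s; }'

# value as defined in our uiprops.py
uiprops = {'contextualBoxTitleBg': '#AAAAAA'}

css = css_template % uiprops
print(css)
```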

    Step 2: configuring boxes

    Boxes offer the user some ways to use the application. Let's first do a few user interface tweaks in our views.py file:

    from cubicweb.selectors import none_rset
    from cubicweb.web.views import bookmark
    from cubes.zone import views as zone
    from cubes.tag import views as tag
    
    # change bookmarks box selector so it's only displayed on startup views
    bookmark.BookmarksBox.__select__ = bookmark.BookmarksBox.__select__ & none_rset()
    # move zone box to the left instead of in the context frame and tweak its order
    zone.ZoneBox.context = 'left'
    zone.ZoneBox.order = 100
    # move tags box to the left instead of in the context frame and tweak its order
    tag.TagsBox.context = 'left'
    tag.TagsBox.order = 102
    # hide similarity box, not interested
    tag.SimilarityBox.visible = False
    

    The idea is to move all boxes to the left column, so we get more space for the photos. Now for the serious part: I want a box similar to the tags box, but handling the Person displayed_on File relation. We can do this simply by adding an AjaxEditRelationCtxComponent subclass to our views, as below:

    from logilab.common.decorators import monkeypatch
    from cubicweb import ValidationError
    from cubicweb.web import uicfg, component
    from cubicweb.web.views import basecontrollers
    
    # hide displayed_on relation using uicfg since it will be displayed by the box below
    uicfg.primaryview_section.tag_object_of(('*', 'displayed_on', '*'), 'hidden')
    
    class PersonBox(component.AjaxEditRelationCtxComponent):
        __regid__ = 'sytweb.displayed-on-box'
        # box position
        order = 101
        context = 'left'
        # define relation to be handled
        rtype = 'displayed_on'
        role = 'object'
        target_etype = 'Person'
        # messages
        added_msg = _('person has been added')
        removed_msg = _('person has been removed')
        # bind to js_* methods of the json controller
        fname_vocabulary = 'unrelated_persons'
        fname_validate = 'link_to_person'
        fname_remove = 'unlink_person'
    
    
    @monkeypatch(basecontrollers.JSonController)
    @basecontrollers.jsonize
    def js_unrelated_persons(self, eid):
        """return names of persons not yet displayed on an entity"""
        rql = "Any F + ' ' + S WHERE P surname S, P firstname F, X eid %(x)s, NOT P displayed_on X"
        return [name for (name,) in self._cw.execute(rql, {'x' : eid})]
    
    
    @monkeypatch(basecontrollers.JSonController)
    def js_link_to_person(self, eid, people):
        req = self._cw
        for name in people:
            name = name.strip().title()
            if not name:
                continue
            try:
                firstname, surname = name.split(None, 1)
            except ValueError:
                raise ValidationError(eid, {('displayed_on', 'object'): 'provide <first name> <surname>'})
            rset = req.execute('Person P WHERE '
                               'P firstname %(firstname)s, P surname %(surname)s',
                               locals())
            if rset:
                person = rset.get_entity(0, 0)
            else:
                person = req.create_entity('Person', firstname=firstname,
                                                surname=surname)
            req.execute('SET P displayed_on X WHERE '
                        'P eid %(p)s, X eid %(x)s, NOT P displayed_on X',
                        {'p': person.eid, 'x' : eid})
    
    @monkeypatch(basecontrollers.JSonController)
    def js_unlink_person(self, eid, personeid):
        self._cw.execute('DELETE P displayed_on X WHERE P eid %(p)s, X eid %(x)s',
                         {'p': personeid, 'x': eid})
    

    You basically subclass it and configure it through class attributes. The fname_* attributes give the names of the methods that must be defined on the JSON controller to make the AJAX part of the widget work: one to get the vocabulary, one to add a relation and another to delete a relation. These methods must start with a js_ prefix and are added to the controller using the @monkeypatch decorator. In my case, the most complicated method is the one which adds a relation, since it checks whether the person already exists and otherwise creates it automatically, assuming the user entered "firstname surname".

    Let's see what it looks like on a file primary view:

    http://www.cubicweb.org/file/1440509?vid=download

    Great, it's now as easy for me to link my pictures to people as to tag them. Also, visitors get a consistent display of these two pieces of information.
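
    The name handling of js_link_to_person can be checked in isolation. Here is a standalone Python 3 sketch of that step; split_person_name is a hypothetical helper, and it raises a plain ValueError where the controller raises a ValidationError.

```python
def split_person_name(raw):
    """Split user input into (firstname, surname), title-cased.

    Return None for blank input; raise ValueError when only one word is given."""
    name = raw.strip().title()
    if not name:
        return None
    parts = name.split(None, 1)
    if len(parts) != 2:
        raise ValueError('provide <first name> <surname>')
    return tuple(parts)


print(split_person_name('  john doe '))
print(split_person_name('jean de la fontaine'))
```

    Everything after the first word goes to the surname, so multi-word surnames are kept whole.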

    Note

    The ui component system has been refactored in CubicWeb 3.10, which also introduced the AjaxEditRelationCtxComponent class.

    Step 3: configuring facets

    The last feature we'll add today is facet configuration. If you access the '/file' URL, you'll see a set of 'facets' appearing in the left column. Facets provide an intuitive way to build a query incrementally, by offering the user various ways to restrict the result set. For instance, CubicWeb provides a facet to restrict based on who created an entity; the tag cube provides a facet to restrict based on tags; the zone cube provides a facet to restrict based on geographical location, and so on. In the same spirit, I want to provide a facet to restrict based on the people displayed on the picture. To do so, there are various classes in the cubicweb.web.facet module which simply have to be configured using class attributes, as we've done for the box. In our case, we'll define a subclass of RelationFacet.

    Note

    Since that's UI stuff, we'll continue to add the code below to our views.py file. However, we are beginning to have a lot of varied code there, so it may be a good time to split our views module into submodules of a views package. In our case of a simple application (glue) cube, we could start with, for instance, the layout below:

    views/__init__.py   # uicfg configuration, facets
    views/layout.py     # header/footer/background stuff
    views/components.py # boxes, adapters
    views/pages.py      # index view, 404 view
    
    from cubicweb.web import facet
    
    class DisplayedOnFacet(facet.RelationFacet):
        __regid__ = 'displayed_on-facet'
        # relation to be displayed
        rtype = 'displayed_on'
        role = 'object'
        # view to use to display persons
        label_vid = 'combobox'
    

    Let's say we also want to filter according to the visibility attribute. This is even simpler as we just have to derive from the AttributeFacet class:

    class VisibilityFacet(facet.AttributeFacet):
        __regid__ = 'visibility-facet'
        rtype = 'visibility'
    

    Now if I search for some pictures on my site, I get the following facets available:

    http://www.cubicweb.org/file/1440517?vid=download

    Note

    By default a facet must be applicable to every entity in the result set and provide at least two elements of vocabulary to be displayed (for instance you won't see the created_by facet if the same user has created all entities). This may explain why you don't see yours...
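
    The display rule stated in this note can be written down as a small predicate. This is a standalone Python 3 sketch over plain dicts; facet_is_displayed and the sample data are hypothetical and only model the rule, not CubicWeb's facet code.

```python
def facet_is_displayed(entities, attribute):
    """True when every entity has `attribute` and at least two distinct
    values appear among them."""
    values = set()
    for entity in entities:
        if attribute not in entity:
            return False  # facet not applicable to this entity
        values.add(entity[attribute])
    return len(values) >= 2


photos = [
    {'created_by': 'syt', 'visibility': 'public'},
    {'created_by': 'syt', 'visibility': 'private'},
]
print(facet_is_displayed(photos, 'visibility'))
print(facet_is_displayed(photos, 'created_by'))
```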

    Conclusion

    We started to see the power behind the infrastructure provided by the framework, both on the pure ui (CSS, Javascript) side and on the Python side (high level generic classes for components, including boxes and facets). We now have, with a few lines of code, a full-featured web site with a personalized look.

    Of course we'll probably want more as time goes by, but we can now concentrate on making good pictures, publishing albums and sharing them with friends...


  • CubicWeb sprint in Paris on January 19/20/21 2011

    2010/12/03 by Sylvain Thenault
    http://farm1.static.flickr.com/183/419945378_4ead41a76d_m.jpg

    Almost everything is in the title: we'll hold a CubicWeb sprint in our Paris office right after the first French Semantic Web conference, on the 19th, 20th and 21st of January 2011.

    The main topic will be to improve the newcomers' experience of installing and using CubicWeb.

    If you wish to come, you're welcome, that's a great way to meet us, learn the framework and share thoughts about it. Simply contact us so we can check there is still some room available.

    photo by Sebastian Mary under creative commons licence.

