CubicWeb Blog

News about the framework and its uses.

Continuous Integration platform for Mercurial with apycot

2010/03/15 by Arthur Lutz

Since the Mercurial 1.5 sprint, Pierre-Yves has been working on improving Continuous Integration for Mercurial. All developers are encouraged to run the test suites and code quality checkers, but it's not always feasible to test every case: different OSes, different Python versions, strange test dependencies, slow coverage runs, etc. Moreover, it's generally useful to keep track of the results of previous tests, especially for benchmarks.

At http://apycot.hg-scm.org/ you will find a production setup that now runs several variants of the test suite for all official repositories and checks code style and documentation. Notification by email or RSS is available. For more details, check out the FAQ.

apycot is open source and uses the CubicWeb platform. If you want to set one up for your project, check out the step-by-step documentation.

http://www.cubicweb.org/image/749160?vid=download

CubicWeb 3.6 is (almost) out!

2010/02/10 by Sylvain Thenault

And that's great news: after several months of development (things started moving at the beginning of August 2009...), it should be available on our Debian repositories and FTP site in the next few hours.

So, we can say this release contains a (too) large set of improvements and refactorings. I'll talk about the most important ones here.

Appobject/Entity classes namespace cleanup

First of all, the namespace cleanup... 3.6 is a step towards cleaning up the entity classes (and, more generally, appobjects), which are used for a lot of things, making it impossible to tell for sure what can be used as an attribute or relation name. We decided to reserve identifiers starting with _cw or cw_ for the core classes. A lot of methods have been deprecated to clean up the base appobject class namespace. The remaining methods on entity classes will be removed in future versions, through the introduction of an ORM for database-related methods and the (most probable) introduction of ZCA adapters for other aspects. The most notable renamings are:

  • .req -> ._cw
  • .rset -> .cw_rset
  • .row -> .cw_row
  • .col -> .cw_col

This is probably what you'll see first when upgrading to 3.6: a huge stack of deprecation warnings on your screen :)
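
To give an idea of how old and new names can coexist during such a transition, here is a minimal sketch. It is not CubicWeb's actual implementation: ToyAppObject and deprecated_alias are hypothetical names made up for the illustration, and only the attribute names come from the list above.

```python
import warnings


def deprecated_alias(old_name, new_name):
    """Build a read-only property forwarding old_name to new_name with a warning."""
    def getter(self):
        warnings.warn(
            '.%s is deprecated, use .%s instead' % (old_name, new_name),
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(self, new_name)
    return property(getter)


class ToyAppObject(object):
    """Hypothetical stand-in for an appobject; not the real CubicWeb class."""

    def __init__(self, req, rset=None, row=None, col=None):
        self._cw = req
        self.cw_rset = rset
        self.cw_row = row
        self.cw_col = col

    # old names kept as warning aliases during the transition
    req = deprecated_alias('req', '_cw')
    rset = deprecated_alias('rset', 'cw_rset')
    row = deprecated_alias('row', 'cw_row')
    col = deprecated_alias('col', 'cw_col')
```

With something like this in place, reading obj.req still works but emits a DeprecationWarning pointing at the caller, which is the kind of output to expect until the code is updated to the new names.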

Another step towards a nice and powerful form system

  • cleaner separation of responsibilities between form, field and widget

  • fields and widgets are now responsible for handling POSTed values (the editcontroller was handling this, making things really inflexible). The editcontroller has been rewritten and now properly gets values from fields. Another benefit is that you can now easily have a widget handling multiple inputs (see the new datetime picker for instance, or the custom widget for Bookmark.path)

  • refactored automatic forms:

    • rewrite 'generic relations' as a field
    • inlined forms are now encapsulated into a field

    so you get much more control over these parts of automatic forms by using the mechanisms that fields provide in general

    • clearer form relations tags: removed autoform_is_inlined, more understandable autoform_field_section

Hooks refactoring

Hooks are now regular appobjects, with selectors (don't forget to reuse Hook.__select__, remember that!). They should simply implement __call__ with no argument (well, only self) and will get the info previously passed as arguments as instance attributes, according to the matching event.
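
The shape of this calling convention can be sketched with a toy model. This is not the CubicWeb Hook class: ToyHook, LogNewEntity and call_hooks are hypothetical names, selectors are left out entirely, and the event name and the entity attribute are only used as plausible examples.

```python
class ToyHook(object):
    """Hypothetical stand-in for a hook base class; not CubicWeb's Hook."""
    events = ()

    def __init__(self, event, **kwargs):
        self.event = event
        # what used to be call arguments becomes instance state
        for name, value in kwargs.items():
            setattr(self, name, value)

    def __call__(self):
        raise NotImplementedError


class LogNewEntity(ToyHook):
    events = ('after_add_entity',)

    def __call__(self):
        # no argument besides self: the entity is read from the instance
        return 'added %s' % self.entity


def call_hooks(hook_classes, event, **kwargs):
    """Instantiate and call every hook class declared for `event`."""
    return [cls(event, **kwargs)() for cls in hook_classes
            if event in cls.events]
```

Calling call_hooks([LogNewEntity], 'after_add_entity', entity=some_entity) builds the hook with the event data attached, then calls it without arguments.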

Test API cleanup

EnvBasedTC, ControllerTC, WebTest, RepoBasedTC are all gone. Simply use CubicWebTC, with a unified API similar to what you use in cubicweb-ctl shell and in usual development.

The Bytes File System Storage

You can now specify a custom storage for attributes of entities stored in the system source. This mechanism is used to provide a way to store Bytes attributes (such as File.data for instance) as files on the file-system instead of BLOBs in the database. You can configure which attributes should use this storage for your instance and then everything is transparent.

Schema definition changes (yams 0.27)

In your schema definition file:

  • "symetric" should be correctly spelled "symmetric" :)
  • "permissions" was renamed to "__permissions__"

Also, permissions for relations are now supported per definition, not per type, at the cost of a visible impact when writing/reading the schema.

Note about backward compatibility

We worked hard to keep backward compatibility, but you shouldn't upgrade to 3.6 without checking that everything is fine... Check notably:

  • forms, if you're using custom forms by overriding internal methods
  • imports of date functions from cubicweb.utils (they moved to logilab.common.date)

And also

CubicWeb 3.6 comes with a set of 37 "3.6"-ready cubes to avoid too many warnings!

Enjoy!


CubicWeb documentation mini-sprint report

2010/02/10 by Sylvain Thenault

We held a one-day sprint last week in our Paris office, trying to improve CubicWeb's documentation.

There is a huge amount of work to do on this, much more than we can do in a one-day sprint, even with many people. But you have to begin with something :)

So, after a quick meeting to define priorities:

  • Stéphanie, Charles and later Sandrine (from her US home-office), began to add some documentation and screenshots to cubes. They started with the following cubes: addressbook, person, basket, tag, folder, forgotpwd, forge, tracker, vcsfile, keyword, blog and comment.
  • Julien explored Sphinx's abilities to build the index and extract docstrings. He applied this to improve the documentation of selectors.
  • Adrien (ach) and Celso, our friend from Mexico, tackled the task of improving the tutorial from a beginner's point of view.
  • Arthur added some pieces of documentation found in our intranet, mailing-list...
  • Pyves worked on a cubicweb-ctl command to generate schema images (png) for cubes, to include them in the cube's documentation.
  • Adrien (adim) and I helped the various teams.

Huum, I think I did not forget anyone...

Although there is still a lot to do (we need more doc sprints, stay tuned), this is really a nice start! This site should soon be updated to include more valuable cube descriptions and online documentation extracted from the contributed doc.


CubicWeb documentation sprint in feb. 2010

2010/01/22 by Nicolas Chauvat
http://farm4.static.flickr.com/3042/2871708248_950831962c_s.jpg

On February 2nd, 2010 Logilab will host in its head offices a one-day sprint dedicated to the improvement of the CubicWeb documentation.

Get in touch with Logilab if you want to participate in person or via the net: contact at logilab dot fr.

Photo by Adam Hyde from the FLOSS blog


MS SQL Server backuping gotcha

2010/01/19

While working on the port of CubicWeb to the Windows platform, including supporting MS SQL Server as the database backend, I got bitten by a weird behavior of that database engine. When working with CubicWeb, most administration commands are wrapped by the cubicweb-ctl utility, and database backups are performed by running cubicweb-ctl db-dump <instancename>. If the instance uses PostgreSQL as the backend, this will call the pg_dump utility.

When porting to SQL Server, I could not find such a utility, but I found that Transact-SQL has a BACKUP DATABASE command, so I was able to call it using Python's pyodbc module. I tested it interactively, and was satisfied with the result:

>>> from logilab.common.db import get_connection
>>> cnx = get_connection(driver='sqlserver2005', database='mydb', host='localhost', extra_args='autocommit;trusted_connection')
>>> cursor = cnx.cursor()
>>> cursor.execute('BACKUP DATABASE ? TO DISK = ?', ('mydb', 'C:\\Data\\mydb.dump'))
>>> cnx.close()

However, testing that very same code through cubicweb-ctl produced no file in C:\\Data\\. To make a (quite) long story short, the thing is that the BACKUP DATABASE command is asynchronous (or maybe the odbc driver is) and the call to cursor.execute(...) will return immediately, before the backup actually starts. When running interactively, by the time I got to type cnx.close() the backup was finished but when running in a function, the connection was closed before the backup started (which effectively killed the backup operation).

I worked around this by monitoring the size of the backup file in a loop and waiting until that size gets stable before closing the connection:

import os
import time
from logilab.common.db import get_connection

filename = 'c:\\data\\toto.dump'
dbname = 'mydb'
cnx = get_connection(driver='sqlserver2005',
                     host='localhost',
                     database=dbname,
                     extra_args='autocommit;trusted_connection')
cursor = cnx.cursor()
cursor.execute("BACKUP DATABASE ? TO DISK = ?", (dbname, filename))
prev_size = -1
err_count = 0
same_size_count = 0
# poll until the size has been stable for 10 checks, or until the file
# could not be read 10 times
while err_count < 10 and same_size_count < 10:
    time.sleep(1)
    try:
        size = os.path.getsize(filename)
    except OSError as exc:
        # the file may not exist yet: count the error and try again
        err_count += 1
        print(exc)
        continue
    print('file size: %d' % size)
    if size > prev_size:
        same_size_count = 0
        prev_size = size
    else:
        same_size_count += 1
cnx.close()

I hope sharing this will save some people time...
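
The polling part does not depend on the database at all, so it can be isolated and reused. Here is a minimal sketch of that idea; the function name wait_until_stable and its parameters are made up for this example, and the defaults simply mirror the values used above.

```python
import os
import time


def wait_until_stable(path, stable_checks=10, max_errors=10, interval=1.0):
    """Poll `path` until its size has stopped growing.

    Return True once the size was unchanged for `stable_checks` polls in a
    row, False if the file could not be read `max_errors` times.
    """
    prev_size = -1
    errors = 0
    unchanged = 0
    while errors < max_errors and unchanged < stable_checks:
        time.sleep(interval)
        try:
            size = os.path.getsize(path)
        except OSError:
            # file not there (yet): count the failure and poll again
            errors += 1
            continue
        if size > prev_size:
            prev_size = size
            unchanged = 0
        else:
            unchanged += 1
    return unchanged >= stable_checks
```

The connection would then be closed only after wait_until_stable(filename) returns. Note that a stable size is just a heuristic: it says the file stopped growing, not that the backup succeeded.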

Note: get_connection() comes from logilab.common.db, a wrapper module which tries to simplify writing code for different database backends by handling their various idiosyncrasies once and for all. If you want pure pyodbc code, you can replace it with:

from pyodbc import connect
cnx = connect(driver='SQL Server Native Client 10.0',
              host='localhost',
              database=dbname,
              trusted_connection='yes',
              autocommit=True)

The autocommit=True part is especially important, because BACKUP DATABASE will fail if run from within a transaction.


Distributed scalable architecture using CubicWeb

2010/01/14 by Arthur Lutz

Here is a small example of one of the things you can do with CubicWeb's scalable architecture when serving a large number of users.

http://www.cubicweb.org/image/619085?vid=download

Obviously you can easily add machines hosting CubicWeb to the middle bit to scale up. Adding multiple Postgres servers is possible but trickier. In a later blog post I will also show a way to split CubicWeb across multiple servers (separating the web engine from the data repository part). Debian is one of the possible host systems; you can use something else, it's just easier with Debian...

If you want a more detailed explanation of how we setup such an environment, please comment and we'll try to find the time to document it.

As a systems administrator, I can then enjoy the use of the following tools :

  • clusterssh - to access all machines at once and perform common tasks by typing them only once (a must!)
  • htop - to monitor resources in a nicer way than the simple top
  • iotop - to monitor input/output load
  • varnishhist - to check varnish is properly caching some content
  • apachetop - to watch in real time what is being accessed on the apache server
  • jnettop - to watch network flows
  • apt-get (on Debian) - to install all this in a few simple commands...

CubicWeb 3.6 sprint report

2009/12/14 by Sylvain Thenault

Last week we held a cubicweb sprint in our new Paris office !

We were a nice number of people: 7 from the Logilab crew, including Sandrine, our US representative, Celso and Carlos from Mexico, plus some other guests and colleagues working on (CubicWeb-based, of course) customer projects.

The objective of the sprint was to kick out the 3.6 version of CubicWeb, a big refactoring release started by Adrien and me a few months ago. Unfortunately we had been preempted by some other projects, and the CubicWeb development branch was simply painfully following changes done in the stable branch.

Also, we decided to start using mq as a basis for code review. The sprint was a nice opportunity to test and see if it was actually usable for both developer and code reviewer. But more on this later :)

The tasks to achieve to get this release out were:

  1. resurrect the default branch after 3 months of nasty bugs introduced by simply merging from the stable branch without any time to test
  2. update main cubes to the new test / uicfg / hooks / members api
  3. finish the editcontroller (which handles the POST of most web forms) refactoring
  4. finish the relation permissions change, including migration
  5. update the documentation
  6. test real applications

Of course this was ambitious :) Among those, points 1, 2 and 4 took us much more time than I expected. The editcontroller work (3) has not been finished yet, and we didn't find any time for the documentation (5).

Besides this, everyone (well, me at least ;) enjoyed their time working hard all together in our new meeting room! The 3.6 version still needs a little work before being released, but the development branch is definitely back, with a great bunch of cubes ready. Among them: comment, tag, blog, keyword, tracker, forge, card, nosylist, etc.

So many thanks to everyone, and particularly to our Mexican friends Carlos and Celso... Tequila! ;)

By the way the good news is that we plan to do more sprints like this now that we've some room for it!


Customizing search box with magicsearch

2009/12/13 by Adrien Di Mascio

During the last CubicWeb sprint, I was asked if it was possible to customize the search box CubicWeb comes with. By default, you can use it to type RQL queries, plain text queries or standard shortcuts such as <EntityType> or <EntityType> <attrname> <value>.

Ultimately, all queries are translated to RQL since it's the only language understood on the server (data) side. To transform the user query into RQL, CubicWeb uses the so-called magicsearch component, which in turn delegates to a number of query preprocessors that are responsible for interpreting the user query and generating the corresponding RQL.

The code of the main processor loop is easy to understand:

for proc in self.processors:
    try:
        return proc.process_query(uquery, req)
    except (RQLSyntaxError, BadRQLQuery):
        pass

The idea is simple: for each query processor, try to translate the query. If it fails, try the next processor; if it succeeds, we're done and the RQL query will be executed.
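
This fall-through behaviour can be tried outside of CubicWeb with a small self-contained sketch. Everything in it is hypothetical (CannotProcess stands in for RQLSyntaxError / BadRQLQuery, PrefixProcessor and run_chain are made-up names), and it assumes that processors with a lower priority value are tried first.

```python
class CannotProcess(Exception):
    """Hypothetical stand-in for RQLSyntaxError / BadRQLQuery."""


class PrefixProcessor(object):
    """Toy processor: only understands queries starting with `prefix`."""

    def __init__(self, priority, prefix):
        self.priority = priority
        self.prefix = prefix

    def process_query(self, uquery):
        if not uquery.startswith(self.prefix):
            raise CannotProcess(uquery)
        return (self.prefix, uquery)


def run_chain(processors, uquery):
    """Return the result of the first processor that accepts `uquery`."""
    # assumption: a lower priority value means "tried earlier"
    for proc in sorted(processors, key=lambda p: p.priority):
        try:
            return proc.process_query(uquery)
        except CannotProcess:
            continue
    return None  # nobody could handle the query
```

A catch-all processor (here, one with an empty prefix) placed last plays the role of the plain text search: it only sees the queries that every more specific processor rejected.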

Now that the general mechanism is understood, here's an example of code that could be used in a forge-based cube to add a new search shortcut to find tickets. We'd like to use the project_name:text syntax to search for tickets of project_name containing text (e.g. pylint:warning).

Here's the corresponding preprocessor code:

from cubicweb.web.views.magicsearch import BaseQueryProcessor

class MyCustomQueryProcessor(BaseQueryProcessor):
    priority = 0 # controls order in which processors are tried

    def preprocess_query(self, uquery, req):
        """
        :param uquery: the query as sent by the browser
        :param req: the standard, omnipresent, cubicweb's req object
        """
        try:
            project_name, text = uquery.split(':')
        except ValueError:
            return None # the shortcut doesn't apply
        return (u'Any T WHERE T is Ticket, T concerns P, P name %(p)s, '
                u'T has_text %(t)s', {'p': project_name, 't': text})

The code is rather self-explanatory, but here are a few additional comments:

  • the class is registered with the standard vregistry mechanism and should be defined along with the views
  • the priority attribute is used to sort and define the order in which processors will be tried in the main processor loop
  • the preprocess_query method returns None or raises an exception if the query can't be processed

To summarize, if you want to customize the search box, you have to:

  1. define a new query preprocessor component
  2. define its priority wrt other standard processors
  3. implement the preprocess_query method

and CubicWeb will do the rest !
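
If you want to check the parsing logic of such a shortcut without a running instance, it can be pulled out into a plain function. This is only a sketch for experimenting: ticket_shortcut is a made-up name, and the RQL string is the one from the preprocessor above.

```python
def ticket_shortcut(uquery):
    """Return (rql, args) for a project_name:text query, else None."""
    try:
        project_name, text = uquery.split(':')
    except ValueError:
        # no colon, or more than one: the shortcut does not apply
        return None
    rql = (u'Any T WHERE T is Ticket, T concerns P, P name %(p)s, '
           u'T has_text %(t)s')
    return rql, {'p': project_name, 't': text}
```

Note that a query containing several colons is rejected because the unpacking fails, so such queries fall through to the next processor.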


Using gettext on windows

2009/12/01
http://www.gnu.org/graphics/gnu-head-sm.jpg

CubicWeb relies on GNU gettext for its translation management. However, the binary installers easily found for gettext (such as the one in python(x,y)) are for older versions, and compiling it is not that easy (especially in the Python world, where people do not necessarily have a C compiler at hand).

We did the job, and a binary installer for GNU gettext 0.17 is available on our FTP server.


Browsing the Semantic Web

2009/10/31 by Nicolas Chauvat
http://www.cubicweb.org/image/502157?vid=download

Now that the Web of Data has become a reality, innovative applications are springing up everywhere. Here is a selection of web apps that help you browse the semantic web.

  • Parallax is a faceted browser that is demonstrated by displaying the content of Freebase.
  • Neofonie demonstrates its faceted browser by displaying the content of DBpedia at dbpedia.neofonie.de
  • VisiNav is a search engine that lets you refine searches in a way reminiscent of facets.
  • Falcons is a search engine that indexes RDF data.
  • Sindice is a search engine that indexes RDF data as well as data extracted from Microformats. It offers a public Sindice API that can be used to retrieve the search results as RDF, JSON or Atom.
  • SameAs is a service that returns all the equivalent URIs for a search term or a given URI.
  • When you enter search terms, Sig.ma collates the data from the resources included in the results of a search on Sindice.
  • When you publish your product data according to the GoodRelations ontology, information such as the price shows up in Yahoo's search results.

More and more services will appear in the coming months that make use of these new resources. Just for tagging, you may look at CommonTag, Zemanta and OpenCalais and imagine new ways to automate and facilitate the process of publishing information on the web.