Blog entries

  • Unittesting with CubicWeb

    2009/02/17 by Arthur Lutz

    In test driven development (TDD), you write the test before you write the code. In a web application, a number of levels can be tested. Here are a few hints on how we manage some of the testing with CubicWeb.

    We use pytest (an extension of Python's unittest framework, available in logilab-common) to execute all tests across the cubes. Even in the core of CubicWeb, the tests are spread out across the server, web part, repository, common tools... so a simple pytest command crawls through all these tests and runs them.

    The problem: one of the tricky things with testing CubicWeb is that the structure of the data is imported into the database (which enables us to easily modify the schema on running data), and that test data can take a long time to generate and fake for a web application that normally talks to a proper database server (postgres). So we thought of inserting test data into an sqlite database. After a bit of work on compatibility, it was up and running. But setting up that database was (and still is) quite slow, so testing was becoming way too slow and TDD (with frequent testing) was becoming impossible.

    The solution: we ended up storing the sqlite database in a temporary file, which is reused if it's not too old, and TDD was back in the loop. So if you're developing for CubicWeb, don't worry about those test/tmpdb files; on the contrary, they mean you're running tests. For writing tests, check out the relevant content in the book.
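
    The "reuse it if it's not too old" idea can be sketched in a few lines of plain Python. This is only an illustration, not CubicWeb's actual implementation: the get_test_db and build_schema names, the returned (path, rebuilt) pair and the one hour threshold are all assumptions made for the example.

```python
import os
import sqlite3
import time


def get_test_db(path, build, max_age=3600.0):
    """Return (path, rebuilt) for a cached sqlite test database.

    `build` is a hypothetical callback that fills an empty connection;
    `max_age` (in seconds) is an arbitrary freshness threshold.
    """
    fresh = (os.path.exists(path)
             and time.time() - os.path.getmtime(path) < max_age)
    if fresh:
        return path, False  # recent enough: reuse the file as-is
    if os.path.exists(path):
        os.remove(path)  # stale file: start from scratch
    conn = sqlite3.connect(path)
    try:
        build(conn)
        conn.commit()
    finally:
        conn.close()
    return path, True


def build_schema(conn):
    # stand-in for the slow schema and test data setup
    conn.execute('CREATE TABLE entities (eid INTEGER PRIMARY KEY, type TEXT)')
    conn.execute("INSERT INTO entities (type) VALUES ('Folder')")
```

    With the hypothetical build_schema above, the first call pays the setup cost and later calls within the hour return the existing file untouched.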


  • Building my photos web site with CubicWeb part II: security, testing and migration

    2010/04/13 by Sylvain Thenault

    This post will cover various topics:

    • configuring security
    • migrating an existing instance
    • writing some unit tests

    Goal

    Here are the read permissions I want:

    • folders, files, images and comments should have one of the following visibility rules:
      • 'public', everyone can see it
      • 'authenticated', only authenticated users can see it
      • 'restricted', only a subset of authenticated users can see it
    • managers (e.g. me) can see everything
    • only authenticated users can see people
    • everyone can see classifier entities (tag and zone)

    Also, unless explicitly specified, the visibility of an image should be the same as the visibility of its parent folder, and the visibility of a comment should be the same as that of the commented entity. If there is no parent entity, the default visibility is 'authenticated'.

    Regarding write permissions, that's much easier:

    • the anonymous user can't write
    • authenticated users can only add comments
    • managers will add the remaining stuff

    Now, let's implement that!

    Proper security in CubicWeb is done at the schema level, so you don't have to bother with it in the views, for the users will only see what they have access to.

    Step 1: adding permissions to the schema

    In the schema, you can grant access according to groups or RQL expressions (users get access if the expression returns some results). To implement the read security defined above, groups are not enough; we'll need to use RQL expressions. Here is the idea:

    • add a visibility attribute on folder, image and comment, with a vocabulary ('public', 'authenticated', 'restricted', 'parent')
    • add a may_be_read_by relation that links folder, image or comment to users,
    • add hooks to propagate permission changes.

    So the first thing to do is to modify the schema.py of my cube to define these relations:

    from yams.buildobjs import RelationDefinition
    from yams.constraints import StaticVocabularyConstraint
    
    class visibility(RelationDefinition):
        subject = ('Folder', 'File', 'Image', 'Comment')
        object = 'String'
        constraints = [StaticVocabularyConstraint(('public', 'authenticated',
                                                   'restricted', 'parent'))]
        default = 'parent'
        cardinality = '11' # required
    
    class may_be_read_by(RelationDefinition):
        subject = ('Folder', 'File', 'Image', 'Comment',)
        object = 'CWUser'
    

    We can note the following points:

    • we've added a new visibility attribute to folder, file, image and comment using a RelationDefinition
    • cardinality = '11' means this attribute is required. This is usually hidden under the required argument given to the String constructor, but we can't rely on that here, since we're using a RelationDefinition (same thing for StaticVocabularyConstraint, which is usually hidden by the vocabulary argument)
    • the 'parent' possible value will be used for visibility propagation

    Now, we should be able to define security rules in the schema, based on this new attribute and relation. Here is the code to add to schema.py:

    from cubicweb.schema import ERQLExpression
    
    VISIBILITY_PERMISSIONS = {
        'read':   ('managers',
                   ERQLExpression('X visibility "public"'),
                   ERQLExpression('X visibility "authenticated", U in_group G, G name "users"'),
                   ERQLExpression('X may_be_read_by U')),
        'add':    ('managers',),
        'update': ('managers', 'owners',),
        'delete': ('managers', 'owners'),
        }
    AUTH_ONLY_PERMISSIONS = {
            'read':   ('managers', 'users'),
            'add':    ('managers',),
            'update': ('managers', 'owners',),
            'delete': ('managers', 'owners'),
            }
    CLASSIFIERS_PERMISSIONS = {
            'read':   ('managers', 'users', 'guests'),
            'add':    ('managers',),
            'update': ('managers', 'owners',),
            'delete': ('managers', 'owners'),
            }
    
    from cubes.folder.schema import Folder
    from cubes.file.schema import File, Image
    from cubes.comment.schema import Comment
    from cubes.person.schema import Person
    from cubes.zone.schema import Zone
    from cubes.tag.schema import Tag
    
    Folder.__permissions__ = VISIBILITY_PERMISSIONS
    File.__permissions__ = VISIBILITY_PERMISSIONS
    Image.__permissions__ = VISIBILITY_PERMISSIONS
    Comment.__permissions__ = VISIBILITY_PERMISSIONS.copy()
    Comment.__permissions__['add'] = ('managers', 'users',)
    Person.__permissions__ = AUTH_ONLY_PERMISSIONS
    Zone.__permissions__ = CLASSIFIERS_PERMISSIONS
    Tag.__permissions__ = CLASSIFIERS_PERMISSIONS
    

    What's important in there:

    • VISIBILITY_PERMISSIONS provides read access to an entity:
      • if user is in the 'managers' group,
      • or if visibility attribute's value is 'public',
      • or if visibility attribute's value is 'authenticated' and user (designated by the 'U' variable in the expression) is in the 'users' group (all authenticated users are expected to be in this group)
      • or if user is linked to the entity (the 'X' variable) through the may_be_read_by permission
    • we modify permissions of the entity types we use by importing them and modifying their __permissions__ attribute
    • notice the .copy(): we only want to modify 'add' permission for Comment, not for all entity types using VISIBILITY_PERMISSIONS!
    • the remaining parts of the security model are handled using regular groups:
      • 'users' is the group to which all authenticated users will belong
      • 'guests' is the group of anonymous users
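
    To make the 'read' rule easier to reason about, here is a toy version of it in plain Python. It does not use CubicWeb at all: can_read and its arguments are hypothetical names, and the may_be_read_by link is reduced to a boolean for the sake of the example.

```python
def can_read(visibility, user_groups, in_may_be_read_by=False):
    """Toy version of the 'read' rule in VISIBILITY_PERMISSIONS.

    visibility: value of the entity's visibility attribute
    user_groups: set of group names the user belongs to
    in_may_be_read_by: True if the entity is linked to the user
        through may_be_read_by
    """
    if 'managers' in user_groups:
        return True
    if visibility == 'public':
        return True
    if visibility == 'authenticated' and 'users' in user_groups:
        return True
    # last resort: an explicit may_be_read_by link
    return in_may_be_read_by
```

    As in the schema, the conditions are alternatives: any one of them is enough to grant read access, and a 'restricted' entity is only visible to managers and to explicitly linked users.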

    Step 2: security propagation in hooks

    To fulfill our requirements, we have to implement:

    Also, unless explicitly specified, the visibility of an image should be the same as
    the visibility of its parent folder and the visibility of a comment should be the same as the
    one of the commented entity. If there is no parent entity, the default visibility is
    'authenticated'.
    

    This kind of 'active' rule will be done using CubicWeb's hook system. Hooks are triggered on database events such as the addition of a new entity or relation.

    The tricky part of the requirement is in unless explicitly specified, notably because when the entity addition hook is executed, we don't yet know its 'parent' entity (e.g. the folder of an image, or the image commented on by a comment). To handle such things, CubicWeb provides Operation, which allows scheduling things to do at commit time.

    In our case we will:

    • on entity creation, schedule an operation that will set default visibility
    • when a "parent" relation is added, propagate parent's visibility unless the child already has a visibility set

    Here is the code in cube's hooks.py:

    from cubicweb.selectors import implements
    from cubicweb.server import hook
    
    class SetVisibilityOp(hook.Operation):
        def precommit_event(self):
            for eid in self.session.transaction_data.pop('pending_visibility'):
                entity = self.session.entity_from_eid(eid)
                if entity.visibility == 'parent':
                    entity.set_attributes(visibility=u'authenticated')
    
    class SetVisibilityHook(hook.Hook):
        __regid__ = 'sytweb.setvisibility'
        __select__ = hook.Hook.__select__ & implements('Folder', 'File', 'Image', 'Comment')
        events = ('after_add_entity',)
        def __call__(self):
            hook.set_operation(self._cw, 'pending_visibility', self.entity.eid,
                               SetVisibilityOp)
    
    class SetParentVisibilityHook(hook.Hook):
        __regid__ = 'sytweb.setparentvisibility'
        __select__ = hook.Hook.__select__ & hook.match_rtype('filed_under', 'comments')
        events = ('after_add_relation',)
    
        def __call__(self):
            parent = self._cw.entity_from_eid(self.eidto)
            child = self._cw.entity_from_eid(self.eidfrom)
            if child.visibility == 'parent':
                child.set_attributes(visibility=parent.visibility)
    

    Remarks:

    • hooks are application objects, hence have selectors that should match entity or relation type to which the hook applies. To match relation type, we use the hook specific match_rtype selector.
    • usage of set_operation: instead of adding an operation for each added entity, set_operation creates a single one and stores the eids of the entities to be processed in the session transaction data. This is a good practice to avoid the cost of manipulating many operations when creating a lot of entities in the same transaction.
    • the precommit_event method of the operation will be called at transaction's commit time.
    • in a hook, self._cw is the repository session, not a web request as is usual in views
    • depending on the hook's event, you have access to different members on the hook instance. Here:
      • self.entity is the newly added entity on 'after_add_entity' events
      • self.eidfrom / self.eidto are the eids of the subject / object entities on 'after_add_relation' events (you may also get the relation type using self.rtype)

    The 'parent' visibility value is used to mean "propagate using the parent's security": we want that attribute to be required, so we can't use a None value, or we'd get an error before we get any chance to propagate...
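
    The interplay between the two hooks and the commit-time operation can be replayed with a toy model in plain Python. Nothing here is CubicWeb API: ToyTransaction and its methods are hypothetical stand-ins, with each method playing the role of one of the classes above.

```python
class ToyTransaction:
    """Toy stand-in for a repository session (not the CubicWeb API)."""

    def __init__(self):
        self.visibility = {}   # eid -> visibility value
        self.pending = set()   # eids to check at commit time

    def add_entity(self, eid, visibility='parent'):
        # plays the role of SetVisibilityHook: only record the eid
        self.visibility[eid] = visibility
        self.pending.add(eid)

    def add_parent_relation(self, child, parent):
        # plays the role of SetParentVisibilityHook
        if self.visibility[child] == 'parent':
            self.visibility[child] = self.visibility[parent]

    def commit(self):
        # plays the role of SetVisibilityOp.precommit_event
        while self.pending:
            eid = self.pending.pop()
            if self.visibility[eid] == 'parent':
                self.visibility[eid] = 'authenticated'
```

    An entity that is still marked 'parent' when commit() runs never got a parent with a real visibility, so it falls back to 'authenticated'; an explicit value such as 'public' is never overwritten.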

    Now, we also want to propagate the may_be_read_by relation. Fortunately, CubicWeb provides some base hook classes for such things, so we only have to add the following code to hooks.py:

    # relations where the "parent" entity is the subject
    S_RELS = set()
    # relations where the "parent" entity is the object
    O_RELS = set(('filed_under', 'comments',))
    
    class AddEntitySecurityPropagationHook(hook.PropagateSubjectRelationHook):
        """propagate permissions when new entities are added"""
        __regid__ = 'sytweb.addentity_security_propagation'
        __select__ = (hook.PropagateSubjectRelationHook.__select__
                      & hook.match_rtype_sets(S_RELS, O_RELS))
        main_rtype = 'may_be_read_by'
        subject_relations = S_RELS
        object_relations = O_RELS
    
    class AddPermissionSecurityPropagationHook(hook.PropagateSubjectRelationAddHook):
        __regid__ = 'sytweb.addperm_security_propagation'
        __select__ = (hook.PropagateSubjectRelationAddHook.__select__
                      & hook.match_rtype('may_be_read_by',))
        subject_relations = S_RELS
        object_relations = O_RELS
    
    class DelPermissionSecurityPropagationHook(hook.PropagateSubjectRelationDelHook):
        __regid__ = 'sytweb.delperm_security_propagation'
        __select__ = (hook.PropagateSubjectRelationDelHook.__select__
                      & hook.match_rtype('may_be_read_by',))
        subject_relations = S_RELS
        object_relations = O_RELS
    
    • the AddEntitySecurityPropagationHook will propagate the relation when filed_under or comments relations are added
      • the S_RELS and O_RELS sets, as well as the match_rtype_sets selector, are used here so that if my cube is used by another one, it'll be able to configure security propagation by simply adding relations to one of the two sets.
    • the two others will propagate permission changes on parent entities to their children
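
    What these three hooks achieve together can be pictured with a small model in plain Python. It is a sketch under simplifying assumptions, not the behaviour of CubicWeb's base classes: ToyGraph is a hypothetical name, users are plain strings, and parent/child links are assumed to form a tree (no cycles).

```python
from collections import defaultdict


class ToyGraph:
    """Toy model of may_be_read_by propagation (not the CubicWeb API)."""

    def __init__(self):
        self.children = defaultdict(set)   # parent eid -> child eids
        self.readers = defaultdict(set)    # eid -> user names

    def add_child(self, child, parent):
        # like AddEntitySecurityPropagationHook: a new child inherits
        # the readers already set on its parent
        self.children[parent].add(child)
        for user in set(self.readers[parent]):
            self.grant(child, user)

    def grant(self, eid, user):
        # like AddPermissionSecurityPropagationHook
        self.readers[eid].add(user)
        for child in self.children[eid]:
            self.grant(child, user)

    def revoke(self, eid, user):
        # like DelPermissionSecurityPropagationHook
        self.readers[eid].discard(user)
        for child in self.children[eid]:
            self.revoke(child, user)
```

    Granting 'toto' access to a folder therefore reaches its images and their comments, whether they were filed before or after the grant.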

    Step 3: testing our security

    Security is tricky. Writing some tests for it is a very good idea. You should even write them first, as Test Driven Development recommends!

    Here is a small test case that'll check the basics of our security model, in test/unittest_sytweb.py:

    from cubicweb.devtools.testlib import CubicWebTC
    from cubicweb import Binary
    
    class SecurityTC(CubicWebTC):
    
        def test_visibility_propagation(self):
            # create a user for later security checks
            toto = self.create_user('toto')
            # init some data using the default manager connection
            req = self.request()
            folder = req.create_entity('Folder',
                                       name=u'restricted',
                                       visibility=u'restricted')
            photo1 = req.create_entity('Image',
                                       data_name=u'photo1.jpg',
                                       data=Binary('xxx'),
                                       filed_under=folder)
            self.commit()
            photo1.clear_all_caches() # good practice, avoid request cache effects
            # visibility propagation
            self.assertEquals(photo1.visibility, 'restricted')
            # unless explicitly specified
            photo2 = req.create_entity('Image',
                                       data_name=u'photo2.jpg',
                                       data=Binary('xxx'),
                                       visibility=u'public',
                                       filed_under=folder)
            self.commit()
            self.assertEquals(photo2.visibility, 'public')
            # test security
            self.login('toto')
            req = self.request()
            self.assertEquals(len(req.execute('Image X')), 1) # only the public one
            self.assertEquals(len(req.execute('Folder X')), 0) # restricted...
            # may_be_read_by propagation
            self.restore_connection()
            folder.set_relations(may_be_read_by=toto)
            self.commit()
            photo1.clear_all_caches()
            self.failUnless(photo1.may_be_read_by)
            # test security with permissions
            self.login('toto')
            req = self.request()
            self.assertEquals(len(req.execute('Image X')), 2) # now toto has access to photo1 too
            self.assertEquals(len(req.execute('Folder X')), 1) # and to restricted folder
    
    if __name__ == '__main__':
        from logilab.common.testlib import unittest_main
        unittest_main()
    

    It is not complete, but it shows most of the things you will want to do in tests: adding some content, creating users and connecting as them in the test, etc...

    To run it type:

    [syt@scorpius test]$ pytest unittest_sytweb.py
    ========================  unittest_sytweb.py  ========================
    -> creating tables [....................]
    -> inserting default user and default groups.
    -> storing the schema in the database [....................]
    -> database for instance data initialized.
    .
    ----------------------------------------------------------------------
    Ran 1 test in 22.547s
    
    OK
    

    The first execution takes time, since it creates an sqlite database for the test instance. The second one will be much quicker:

    [syt@scorpius test]$ pytest unittest_sytweb.py
    ========================  unittest_sytweb.py  ========================
    .
    ----------------------------------------------------------------------
    Ran 1 test in 2.662s
    
    OK
    

    If you make changes to your schema, you'll have to force regeneration of that database. You do that by removing the tmpdb* files before running the test:

    [syt@scorpius test]$ rm tmpdb*
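
    If you'd rather do this cleanup from Python (for instance in a small helper script), the same thing can be sketched as follows. The remove_test_dbs name is made up for this example; it only assumes that the cached databases are regular files whose names start with tmpdb.

```python
from pathlib import Path


def remove_test_dbs(test_dir):
    """Delete tmpdb* files in test_dir and return the removed names."""
    removed = []
    for path in sorted(Path(test_dir).glob('tmpdb*')):
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return removed
```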
    

    BTW, pytest is a very convenient utility to control test execution, from the logilab-common package.

    Step 4: writing the migration script and migrating the instance

    Prior to those changes, I've created an instance and fed it with some data, so I don't want to create a new one, but to migrate the existing one. Let's see how to do that.

    Migration commands should be put in the cube's migration directory, in a file named <X.Y.Z>_Any.py ('Any' being there mostly for historical reasons).

    Here I'll create a migration/0.2.0_Any.py file containing the following instructions:

    add_relation_type('may_be_read_by')
    add_relation_type('visibility')
    sync_schema_props_perms()
    

    Then I update the version number in cube's __pkginfo__.py to 0.2.0. And that's it! Those instructions will:

    • update the instance's schema by adding our two new relations and update the underlying database tables accordingly (the first two instructions)
    • update the schema's permissions definition (the last instruction)
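
    The version-based file naming suggests how an upgrade can decide which scripts to run. The sketch below is a guess at that logic in plain Python, not the code behind cubicweb-ctl upgrade: pending_migrations, MIGRATION_RE and the "installed < version <= target" rule are assumptions made for illustration.

```python
import re

MIGRATION_RE = re.compile(r'^(\d+)\.(\d+)\.(\d+)_Any\.py$')


def pending_migrations(filenames, installed, target):
    """Return scripts with installed < version <= target, oldest first.

    Versions are (major, minor, patch) tuples; this selection rule is
    an assumption made for illustration.
    """
    found = []
    for name in filenames:
        match = MIGRATION_RE.match(name)
        if match is None:
            continue  # not a versioned migration script
        version = tuple(int(part) for part in match.groups())
        if installed < version <= target:
            found.append((version, name))
    return [name for version, name in sorted(found)]
```

    Comparing versions as integer tuples rather than strings matters as soon as a component reaches two digits: 0.10.0 has to sort after 0.2.0.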

    To migrate my instance I simply type:

    [syt@scorpius ~]$ cubicweb-ctl upgrade sytweb
    

    I will then be asked some questions to do the migration step by step. You should say YES when it asks if a backup of your database should be done, so you can get back to the initial state if anything goes wrong...

    Conclusion

    This is a somewhat long post that I bet you will have to read at least twice ;) There is a hell of a lot of information hidden in there... But that should start to give you an idea of CubicWeb's power...

    See you next time for part III!