cubicweb #3040091 Improve migrations of old instances [open]

Description of the problem

When one wants to migrate old applications, the current migration system shows some limitations, best illustrated by the following example :

Imagine you want to migrate your instance from version 0.4 to version 2.0 of a cube it uses, where the schema of that cube looks as follows :

  • in version 0.4.0 :

    class Book(EntityType):
        title = String(required=True)
        page_number = Int()
    
    class Author(EntityType):
        name = String()
        writer_of = SubjectRelation('Book', cardinality='*1', composite='subject')
    
  • in version 1.0.0 :

    class Book(EntityType):
        title = String(required=True)
        page_number = Int()
    
    class Article(EntityType):
        title = String(required=True)
    
    class Author(EntityType):
        name = String()
        writer_of = SubjectRelation(('Book', 'Article'), cardinality='*1', composite='subject')
    
  • in version 2.0.0 :

    class Book(EntityType):
        title = String(required=True)
        page_number = Int()
    
    class Article(EntityType):
        title = String(required=True)
    
    class Author(EntityType):
        name = String()
    
    class written_by(RelationDefinition):
        subject = ('Book', 'Article')
        object = 'Author'
        cardinality = '**'
    

In such a case, your migration scripts could look like :

  • 1.0.0_Any.py :

    add_entity_type('Article')
    add_relation_definition('Author', 'writer_of', 'Article')
    
  • 2.0.0_Any.py :

    add_relation_type('written_by')
    
    # an author may have written zero or several works, so copy every link
    for author in rql('Author A').entities(0):
        for work in author.writer_of:
            work.cw_set(written_by=author)
    
    drop_relation_type('writer_of')
    

Now imagine you want to migrate a pretty old instance that uses the 0.4.0 version directly to version 2.0.0 :

  • only the 0.4.0 and the 2.0.0 schemas are known to the system : the former is stored in the database, the latter is on disk ; when the 1.0.0_Any.py migration script is executed, the writer_of relation is no longer present in the schema on disk, so the add_relation_definition instruction fails, because neither of the two schemas known to the system describes the relation definition between Author and Article.
  • the ways to work around this problem are :
    • first install the 1.0.0 version, migrate your instance, then upgrade to 2.0.0 ; this may not even be possible if the application that uses the cube above has no version compatible with the cube's 1.0.0 version (they may have completely different development cycles) ; it is painful at best.
    • introduce conditional instructions in the 1.0.0 script to handle the case where writer_of does not exist ; this works, but it is really painful to test and often leads to unsafe migrations.
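
The failure described above can be reproduced with a small toy model. This is only a sketch: the ToyMigration class, the representation of a schema as a set of (subject, relation, object) triples and the helper names are assumptions made for illustration, not part of the CubicWeb API, and entity types are left out for brevity.

```python
# Toy model: a schema is a set of (subject, relation, object) triples.
# None of these names belong to CubicWeb; they only illustrate the problem.
SCHEMA_0_4_0 = frozenset({("Author", "writer_of", "Book")})
SCHEMA_1_0_0 = frozenset({("Author", "writer_of", "Book"),
                          ("Author", "writer_of", "Article")})
SCHEMA_2_0_0 = frozenset({("Book", "written_by", "Author"),
                          ("Article", "written_by", "Author")})


class ToyMigration:
    """Knows exactly two schemas: the one in the database, the one on disk."""

    def __init__(self, db_schema, fs_schema):
        self.db_schema = set(db_schema)
        self.fs_schema = frozenset(fs_schema)

    def add_relation_definition(self, subject, rtype, obj):
        rdef = (subject, rtype, obj)
        # The definition to add must be described by the schema on disk.
        if rdef not in self.fs_schema:
            raise KeyError("%s %s %s is not described by the schema on disk" % rdef)
        self.db_schema.add(rdef)


def run_1_0_0_script(migration):
    """Toy equivalent of the relation part of 1.0.0_Any.py."""
    migration.add_relation_definition("Author", "writer_of", "Article")


def direct_upgrade_fails():
    """Run the 1.0.0 script with the 2.0.0 schema on disk."""
    migration = ToyMigration(SCHEMA_0_4_0, SCHEMA_2_0_0)
    try:
        run_1_0_0_script(migration)
    except KeyError:
        return True
    return False
```

In this model the same script succeeds when the schema on disk is SCHEMA_1_0_0, which corresponds to the first workaround listed above.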

None of these short-term solutions is satisfactory, hence the proposal below.

Proposal

Basically, the proposal is to always keep a copy of the schema of each released version in all later versions of the cube.

The migration system would then read these schema files one after another and perform the migration ; at the end of each migration step, the new db schema becomes the last file schema and the process goes on.
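
The loop described above could be sketched as follows. This is a toy model under stated assumptions, not the actual migration engine: schemas are plain sets of (subject, relation, object) triples, and migrate_stepwise, require and the script functions are hypothetical names.

```python
# Hypothetical schema snapshots, one per released version.
S_0_4_0 = {("Author", "writer_of", "Book")}
S_1_0_0 = {("Author", "writer_of", "Book"), ("Author", "writer_of", "Article")}
S_2_0_0 = {("Book", "written_by", "Author"), ("Article", "written_by", "Author")}


def require(rdef, fs_schema):
    """Fail like the real system would when the file schema lacks rdef."""
    if rdef not in fs_schema:
        raise KeyError(rdef)


def script_1_0_0(db, fs):
    rdef = ("Author", "writer_of", "Article")
    require(rdef, fs)
    db.add(rdef)


def script_2_0_0(db, fs):
    for rdef in (("Book", "written_by", "Author"),
                 ("Article", "written_by", "Author")):
        require(rdef, fs)
        db.add(rdef)
    # drop every writer_of definition
    db.difference_update({r for r in db if r[1] == "writer_of"})


def migrate_stepwise(db_schema, snapshots, scripts):
    """Replay each step against the snapshot of the version it targets.

    snapshots: list of (version, schema) pairs, oldest first.
    scripts: dict mapping a version to a callable(db_schema, fs_schema).
    """
    current = set(db_schema)
    applied = []
    for version, fs_schema in snapshots:
        script = scripts.get(version)
        if script is not None:
            script(current, fs_schema)
        # End of the step: the file schema becomes the new db schema.
        current = set(fs_schema)
        applied.append(version)
    return current, applied


final_schema, applied_versions = migrate_stepwise(
    S_0_4_0,
    [("1.0.0", S_1_0_0), ("2.0.0", S_2_0_0)],
    {"1.0.0": script_1_0_0, "2.0.0": script_2_0_0},
)
```

Each script only ever sees the snapshot of the version it was written for, so the 1.0.0 step no longer depends on what the 2.0.0 schema contains.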

The different schema files should probably be stored in the migration folder of the cube, with names like schema_1.0.0.py, schema_2.0.0.py, etc.
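
As an illustration of how such files could be picked up, the sketch below orders the schema_X.Y.Z.py names found in a migration folder and derives the successive steps between an installed version and a target version. The function names and the assumption that versions are purely numeric and dot-separated are hypothetical.

```python
import re

# Matches names such as schema_1.0.0.py (assumed naming scheme).
SCHEMA_FILE = re.compile(r"^schema_(\d+(?:\.\d+)*)\.py$")


def parse_version(text):
    """Turn '1.0.0' into (1, 0, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def snapshot_versions(filenames):
    """Return the sorted versions of all schema snapshot files."""
    versions = []
    for name in filenames:
        match = SCHEMA_FILE.match(name)
        if match:
            versions.append(parse_version(match.group(1)))
    return sorted(versions)


def plan_steps(filenames, installed, target):
    """List the (from_version, to_version) steps from installed to target."""
    current = parse_version(installed)
    goal = parse_version(target)
    steps = []
    for version in snapshot_versions(filenames):
        if current < version <= goal:
            steps.append((current, version))
            current = version
    return steps


migration_folder = ["1.0.0_Any.py", "2.0.0_Any.py",
                    "schema_2.0.0.py", "schema_1.0.0.py"]
steps = plan_steps(migration_folder, "0.4.0", "2.0.0")
```

Comparing versions as integer tuples avoids the pitfalls of string ordering, where '10.0.0' would sort before '2.0.0'.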

priority: normal
type: enhancement
done in: <not specified>
closed by: <not specified>