Blog entries by Denis Laxalde [4]
  • Hypermedia API with cubicweb-jsonschema

    2017/04/04 by Denis Laxalde

    This is the second post of a series about cubicweb-jsonschema. The first post mainly dealt with JSON Schema representations of CubicWeb entities along with a brief description of the JSON API. In this second post, I'll describe another aspect of the project that aims at building a hypermedia API by leveraging the JSON Hyper Schema specification.

    Hypermedia APIs and JSON Hyper Schema

    Hypermedia API is more or less a synonym of RESTful API, but the term makes it clearer that the API serves hypermedia responses, i.e. content that helps discoverability of other resources.

    At the heart of a hypermedia API is the concept of link relation, which aims both at describing relationships between resources and at providing ways to manipulate them.

    In JSON Hyper Schema terminology, link relations take the form of a collection of Link Description Objects gathered into a links property of a JSON Schema document. These Link Description Objects thus describe relationships between the instance described by the JSON Schema document at stake and other resources; they hold a number of properties that make it possible to manipulate these relationships:

    • rel is the name of the relation; it is usually one of the relation names registered at IANA;
    • href indicates the URI of the target of the relation, it may be templated by a JSON Schema;
    • targetSchema is a JSON Schema document (or reference) describing the target of the link relation;
    • schema (recently renamed as submissionSchema) is a JSON Schema document (or reference) describing what the target of the link expects when submitting data.
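
    To make these properties concrete, here is a minimal sketch of how a client could model a Link Description Object in Python. The LinkDescription class and its from_dict helper are hypothetical names chosen for illustration; they are not part of cubicweb-jsonschema. The sketch accepts both the schema and submissionSchema spellings, since either may appear depending on the draft in use.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class LinkDescription:
    """Client-side view of a JSON Hyper Schema Link Description Object."""

    rel: str
    href: str
    title: Optional[str] = None
    target_schema: Optional[Dict[str, Any]] = None
    submission_schema: Optional[Dict[str, Any]] = None

    @classmethod
    def from_dict(cls, ldo: Dict[str, Any]) -> "LinkDescription":
        return cls(
            rel=ldo["rel"],
            href=ldo["href"],
            title=ldo.get("title"),
            target_schema=ldo.get("targetSchema"),
            # "schema" is the older name of "submissionSchema".
            submission_schema=ldo.get("submissionSchema", ldo.get("schema")),
        )


link = LinkDescription.from_dict({
    "href": "/author/",
    "rel": "collection",
    "schema": {"$ref": "/author/schema?role=creation"},
    "targetSchema": {"$ref": "/author/schema"},
    "title": "Authors",
})
print(link.rel, link.href, link.submission_schema["$ref"])
```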

    Hypermedia walkthrough

    In the remainder of the article, I'll walk through a navigation path that is made possible by hypermedia controls provided by cubicweb-jsonschema. I'll continue with the example application described in the first post of the series, whose schema consists of Book, Author and Topic entity types. In essence, this walkthrough is typical of what an intelligent client could do when exposed to the API, i.e. from any resource, discover other resources and navigate or manipulate them.

    This walkthrough assumes that, given any resource (i.e. something that has a URL like /book/1), the server would expose data at the main URL when the client asks for JSON through the Accept header and it would expose the JSON Schema of the resource at a schema view of the same URL (i.e. /book/1/schema). This assumption can be regarded as a kind of client/server coupling, which might go away in later implementation.
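
    As a rough sketch of this convention, a client could derive the schema URL from a resource URL as follows. The schema_url function is a hypothetical helper; it only encodes the "append /schema" assumption stated above, and would have to change if that coupling goes away.

```python
from urllib.parse import urlsplit, urlunsplit


def schema_url(resource_url: str) -> str:
    """Return the URL of the "schema" view for a resource URL."""
    parts = urlsplit(resource_url)
    # Drop any trailing slash so that "/book/" gives "/book/schema".
    path = parts.path.rstrip("/") + "/schema"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(schema_url("/book/1"))  # /book/1/schema
print(schema_url("/book/"))   # /book/schema
print(schema_url("/"))        # /schema
```

    The data itself would be requested at the resource URL with Accept: application/json, and the schema at the derived URL with Accept: application/schema+json, as in the exchanges shown in this post.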

    Site root

    While client navigation could start from any resource, we start from the root resource and retrieve its schema:

    GET /schema
    Accept: application/schema+json
    
    HTTP/1.1 200 OK
    Content-Type: application/json
    
    {
        "links": [
            {
                "href": "/author/",
                "rel": "collection",
                "schema": {
                    "$ref": "/author/schema?role=creation"
                },
                "targetSchema": {
                    "$ref": "/author/schema"
                },
                "title": "Authors"
            },
            {
                "href": "/book/",
                "rel": "collection",
                "schema": {
                    "$ref": "/book/schema?role=creation"
                },
                "targetSchema": {
                    "$ref": "/book/schema"
                },
                "title": "Books"
            },
            {
                "href": "/topic/",
                "rel": "collection",
                "schema": {
                    "$ref": "/topic/schema?role=creation"
                },
                "targetSchema": {
                    "$ref": "/topic/schema"
                },
                "title": "Topics"
            }
        ]
    }
    

    So at root URL, our application serves a JSON Hyper Schema that only consists of links. It has no JSON Schema document, which is natural since there's usually no data bound to the root resource (think of it as empty rset in CubicWeb terminology).

    These links correspond to top-level entity types, i.e. those that would appear in the default startup page of a CubicWeb application. They all have "rel": "collection" relation name (this comes from RFC6573) as their target is a collection of entities. We also have schema and targetSchema properties.
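
    A client could discover the available collections from this root document with something like the sketch below. The collections function is hypothetical and the root_schema dictionary simply reproduces the response shown above.

```python
root_schema = {
    "links": [
        {"href": "/author/", "rel": "collection", "title": "Authors",
         "schema": {"$ref": "/author/schema?role=creation"},
         "targetSchema": {"$ref": "/author/schema"}},
        {"href": "/book/", "rel": "collection", "title": "Books",
         "schema": {"$ref": "/book/schema?role=creation"},
         "targetSchema": {"$ref": "/book/schema"}},
        {"href": "/topic/", "rel": "collection", "title": "Topics",
         "schema": {"$ref": "/topic/schema?role=creation"},
         "targetSchema": {"$ref": "/topic/schema"}},
    ]
}


def collections(hyper_schema: dict) -> dict:
    """Map each collection title to its URI and creation schema reference."""
    result = {}
    for link in hyper_schema.get("links", []):
        if link.get("rel") != "collection":
            continue
        result[link["title"]] = {
            "href": link["href"],
            # Schema of the data expected when submitting to the collection.
            "creation_schema": link.get("schema", {}).get("$ref"),
        }
    return result


for title, info in collections(root_schema).items():
    print(title, info["href"], info["creation_schema"])
```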

    From collection to items

    Now that we have added a new book, let's step back and use our books link to retrieve data (verb GET):

    GET /book/
    Accept: application/json
    
    HTTP/1.1 200 OK
    Allow: GET, POST
    Content-Type: application/json
    
    [
        {
            "id": "859",
            "title": "L'homme qui rit"
        },
        {
            "id": "858",
            "title": "The Old Man and the Sea"
        }
    ]
    

    which, as always, needs to be completed by a JSON Schema:

    GET /book/schema
    Accept: application/schema+json
    
    
    HTTP/1.1 200 OK
    Content-Type: application/json
    
    {
        "$ref": "#/definitions/Book_plural",
        "definitions": {
            "Book_plural": {
                "items": {
                    "properties": {
                        "id": {
                            "type": "string"
                        },
                        "title": {
                            "type": "string"
                        }
                    },
                    "type": "object"
                },
                "title": "Books",
                "type": "array"
            }
        },
        "links": [
            {
                "href": "/book/",
                "rel": "collection",
                "schema": {
                    "$ref": "/book/schema?role=creation"
                },
                "targetSchema": {
                    "$ref": "/book/schema"
                },
                "title": "Books"
            },
            {
                "href": "/book/{id}",
                "rel": "item",
                "targetSchema": {
                    "$ref": "/book/schema?role=view"
                },
                "title": "Book"
            }
        ]
    }
    

    Consider the last item of links in the above schema. It has a "rel": "item" property which indicates how to access items of the collection; its href property is a templated URI which can be expanded using instance data and schema (here we only have a single id template variable).
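
    The expansion of such a templated href can be sketched as follows. This hypothetical expand function only handles plain {name} variables, which is all the example above needs; the full URI Template syntax (RFC 6570) has more operators that are not covered here.

```python
import re
from urllib.parse import quote


def expand(template: str, instance: dict) -> str:
    """Expand simple "{name}" variables of a URI template from instance data."""
    def replace(match):
        # Percent-encode the value so it is safe inside a URI path segment.
        return quote(str(instance[match.group(1)]), safe="")
    return re.sub(r"\{(\w+)\}", replace, template)


books = [
    {"id": "859", "title": "L'homme qui rit"},
    {"id": "858", "title": "The Old Man and the Sea"},
]
item_link = {"href": "/book/{id}", "rel": "item"}
print([expand(item_link["href"], book) for book in books])
```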

    So our client may navigate to the first item of the collection (id="859") at /book/859 URI, and retrieve resource data:

    GET /book/859
    Accept: application/json
    
    HTTP/1.1 200 OK
    Allow: GET, PUT, DELETE
    Content-Type: application/json
    
    {
        "author": [
            "Victor Hugo"
        ],
        "publication_date": "1869-04-01T00:00:00",
        "title": "L'homme qui rit"
    }
    

    and schema:

    GET /book/859/schema
    Accept: application/schema+json
    
    HTTP/1.1 200 OK
    Content-Type: application/json
    
    {
        "$ref": "#/definitions/Book",
        "definitions": {
            "Book": {
                "additionalProperties": false,
                "properties": {
                    "author": {
                        "items": {
                            "type": "string"
                        },
                        "title": "author",
                        "type": "array"
                    },
                    "publication_date": {
                        "format": "date-time",
                        "title": "publication date",
                        "type": "string"
                    },
                    "title": {
                        "title": "title",
                        "type": "string"
                    },
                    "topics": {
                        "items": {
                            "type": "string"
                        },
                        "title": "topics",
                        "type": "array"
                    }
                },
                "title": "Book",
                "type": "object"
            }
        },
        "links": [
            {
                "href": "/book/",
                "rel": "up",
                "targetSchema": {
                    "$ref": "/book/schema"
                },
                "title": "Book_plural"
            },
            {
                "href": "/book/859/",
                "rel": "self",
                "schema": {
                    "$ref": "/book/859/schema?role=edition"
                },
                "targetSchema": {
                    "$ref": "/book/859/schema?role=view"
                },
                "title": "Book #859"
            }
        ]
    }
    

    Entity resource

    The resource obtained above as an item of a collection is actually an entity. Notice the rel="self" link. It indicates how to manipulate the current resource (i.e. at which URI, using a given schema depending on what actions we want to perform). Still this link does not indicate what actions may be performed. This indication is found in the Allow header of the data response above:

    Allow: GET, PUT, DELETE
    

    With these pieces of information, our intelligent client is able to, for instance, form a request to delete the resource. On the other hand, the action to update the resource (which is allowed because of the presence of PUT in the Allow header, per HTTP semantics) would take the form of a request whose body conforms to the JSON Schema pointed at by the schema property of the link.

    Also note the rel="up" link which makes it possible to navigate to the collection of books.
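
    Combining the Allow header with the rel="self" link, a client could decide how to update the entity along the lines of the sketch below. The helper names (allowed_methods, find_link, plan_update) are hypothetical, and the function only describes the request rather than sending it.

```python
def allowed_methods(headers: dict) -> set:
    """Parse the Allow header into a set of upper-case HTTP methods."""
    value = headers.get("Allow", "")
    return {m.strip().upper() for m in value.split(",") if m.strip()}


def find_link(hyper_schema: dict, rel: str) -> dict:
    """Return the first link with the given relation name."""
    for link in hyper_schema.get("links", []):
        if link.get("rel") == rel:
            return link
    raise LookupError("no link with rel=%r" % rel)


def plan_update(headers: dict, hyper_schema: dict) -> dict:
    """Describe the PUT request an update would need, if PUT is allowed."""
    if "PUT" not in allowed_methods(headers):
        raise PermissionError("PUT is not advertised in the Allow header")
    self_link = find_link(hyper_schema, "self")
    return {
        "method": "PUT",
        "url": self_link["href"],
        # The request body must conform to the schema found at this URL.
        "body_schema": self_link["schema"]["$ref"],
    }


headers = {"Allow": "GET, PUT, DELETE"}
book_schema = {"links": [
    {"href": "/book/", "rel": "up",
     "targetSchema": {"$ref": "/book/schema"}},
    {"href": "/book/859/", "rel": "self",
     "schema": {"$ref": "/book/859/schema?role=edition"},
     "targetSchema": {"$ref": "/book/859/schema?role=view"}},
]}
print(plan_update(headers, book_schema))
```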

    Conclusions

    This post introduced the main hypermedia capabilities of cubicweb-jsonschema, built on top of the JSON Hyper Schema specification. The resulting Hypermedia API makes it possible for an intelligent client to navigate through hypermedia resources and manipulate them by using both link relation semantics and HTTP verbs.

    In the next post, I'll deal with relationships description and manipulation both in terms of API (endpoints) and hypermedia representation.


  • Introducing cubicweb-jsonschema

    2017/03/23 by Denis Laxalde

    This is the first post of a series introducing the cubicweb-jsonschema project that is currently under development at Logilab. In this post, I'll first introduce the general goals of the project and then present in more detail two aspects: data models (the connection between Yams and JSON Schema in particular) and the basic features of the API. This post does not always present how things work in the current implementation but rather how they should.

    Goals of cubicweb-jsonschema

    From a high-level point of view, cubicweb-jsonschema mainly addresses two interconnected aspects. One is related to modelling for client-side development of user interfaces to CubicWeb applications, while the other concerns the HTTP API.

    As far as modelling is concerned, cubicweb-jsonschema essentially aims at providing a transformation mechanism between a Yams schema and JSON Schema that is both automatic and extensible. This means that we can ultimately expect that Yams definitions alone would be sufficient to generate JSON Schema definitions that are consistent enough to build a UI, pretty much as is currently the case with the automatic web UI in CubicWeb. A corollary of this goal is that we want JSON Schema definitions to match their context of usage, meaning that a JSON Schema definition would not be the same in the context of viewing, editing or manipulating relationships.

    In terms of API, cubicweb-jsonschema essentially aims at providing an HTTP API to manipulate entities based on their JSON Schema definitions.

    Finally, the ultimate goal is to expose a hypermedia API for a CubicWeb application in order to be able to build an intelligent client. For this we'll build upon the JSON Hyper-Schema specification. This aspect will be discussed in a later post.

    Basic usage as an HTTP API library

    Consider a simple case where one wants to manipulate entities of type Author described by the following Yams schema definition:

    from yams.buildobjs import EntityType, String

    class Author(EntityType):
        name = String(required=True)
    

    With cubicweb-jsonschema one can get the JSON Schema for this entity type in different contexts such as view, creation or edition. For instance:

    • in a view context, the JSON Schema will be:

      {
          "$ref": "#/definitions/Author",
          "definitions": {
              "Author": {
                  "additionalProperties": false,
                  "properties": {
                      "name": {
                          "title": "name",
                          "type": "string"
                      }
                  },
                  "title": "Author",
                  "type": "object"
              }
          }
      }
      
    • whereas in creation context, it'll be:

      {
          "$ref": "#/definitions/Author",
          "definitions": {
              "Author": {
                  "additionalProperties": false,
                  "properties": {
                      "name": {
                          "title": "name",
                          "type": "string"
                      }
                  },
                  "required": [
                      "name"
                  ],
                  "title": "Author",
                  "type": "object"
              }
          }
      }
      

      (Notice the required keyword, which lists the name property.)
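
    The practical effect of this difference can be sketched with a small hand-written check (a real client would more likely rely on a JSON Schema validation library). The missing_required function is hypothetical, and the two dictionaries are trimmed copies of the view and creation schemas above.

```python
def author_schema(required=None) -> dict:
    """Build a trimmed copy of the Author schema shown above."""
    definition = {
        "additionalProperties": False,
        "properties": {"name": {"title": "name", "type": "string"}},
        "title": "Author",
        "type": "object",
    }
    if required:
        definition["required"] = list(required)
    return {"$ref": "#/definitions/Author",
            "definitions": {"Author": definition}}


view_schema = author_schema()
creation_schema = author_schema(required=["name"])


def missing_required(schema: dict, instance: dict) -> list:
    """List required properties of the referenced definition absent from instance."""
    # Resolve a local "#/definitions/<Name>" reference by hand.
    name = schema["$ref"].rsplit("/", 1)[-1]
    definition = schema["definitions"][name]
    return [p for p in definition.get("required", []) if p not in instance]


print(missing_required(view_schema, {}))
print(missing_required(creation_schema, {}))
print(missing_required(creation_schema, {"name": "Victor Hugo"}))
```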

    Such JSON Schema definitions are automatically generated from Yams definitions. In addition, cubicweb-jsonschema exposes some endpoints for basic CRUD operations on resources through an HTTP (JSON) API. From the client point of view, requests on these endpoints are of course expected to match JSON Schema definitions. Some examples:

    Get an author resource:

    GET /author/855
    Accept:application/json
    
    HTTP/1.1 200 OK
    Content-Type: application/json
    {"name": "Ernest Hemingway"}
    

    Update an author:

    PATCH /author/855
    Accept:application/json
    Content-Type: application/json
    {"name": "Ernest Miller Hemingway"}
    
    HTTP/1.1 200 OK
    Location: /author/855/
    Content-Type: application/json
    {"name": "Ernest Miller Hemingway"}
    

    Create an author:

    POST /author
    Accept:application/json
    Content-Type: application/json
    {"name": "Victor Hugo"}
    
    HTTP/1.1 201 Created
    Content-Type: application/json
    Location: /Author/858
    {"name": "Victor Hugo"}
    

    Delete an author:

    DELETE /author/858
    
    HTTP/1.1 204 No Content
    

    Now if the client sends invalid input with respect to the schema, they'll get an error:

    (We provide a wrong born property in request body.)

    PATCH /author/855
    Accept:application/json
    Content-Type: application/json
    {"born": "1899-07-21"}
    
    HTTP/1.1 400 Bad Request
    Content-Type: application/json
    
    {
        "errors": [
            {
                "details": "Additional properties are not allowed ('born' was unexpected)",
                "status": 422
            }
        ]
    }
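
    The rule behind this error is the "additionalProperties": false keyword of the Author definition. A minimal sketch of that single check, with a hypothetical unexpected_properties helper, could look like this; it deliberately ignores the case where additionalProperties is itself a schema.

```python
def unexpected_properties(definition: dict, instance: dict) -> list:
    """Return property names rejected by "additionalProperties": false."""
    if definition.get("additionalProperties", True) is not False:
        # Extra properties are allowed (or constrained by a sub-schema,
        # which this sketch does not evaluate).
        return []
    allowed = definition.get("properties", {})
    return [name for name in instance if name not in allowed]


author_definition = {
    "additionalProperties": False,
    "properties": {"name": {"title": "name", "type": "string"}},
    "title": "Author",
    "type": "object",
}
print(unexpected_properties(author_definition, {"born": "1899-07-21"}))
```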
    

    From Yams model to JSON Schema definitions

    The example above illustrates automatic generation of JSON Schema documents based on Yams schema definitions. These documents are expected to help developing views and forms for a web client. Clearly, we expect that cubicweb-jsonschema serves JSON Schema documents for viewing and editing entities as cubicweb.web serves HTML documents for the same purposes. The underlying logic for JSON Schema generation is currently heavily inspired by the logic of primary view and automatic entity form as they exist in cubicweb.web.views. That is: the Yams schema is introspected to determine how properties should be generated and any additional control over this can be performed through uicfg declarations [1].

    To illustrate, let's consider the following schema definitions:

    from yams.buildobjs import (
        Bytes, Datetime, EntityType, RelationDefinition, String)

    class Book(EntityType):
        title = String(required=True)
        publication_date = Datetime(required=True)
    
    class Illustration(EntityType):
        data = Bytes(required=True)
    
    class illustrates(RelationDefinition):
        subject = 'Illustration'
        object = 'Book'
        cardinality = '1*'
        composite = 'object'
        inlined = True
    
    class Author(EntityType):
        name = String(required=True)
    
    class author(RelationDefinition):
        subject = 'Book'
        object = 'Author'
        cardinality = '1*'
    
    class Topic(EntityType):
        name = String(required=True)
    
    class topics(RelationDefinition):
        subject = 'Book'
        object = 'Topic'
        cardinality = '**'
    

    and consider, as before, JSON Schema documents in different contexts for the Book entity type:

    • in view context:

      {
          "$ref": "#/definitions/Book",
          "definitions": {
              "Book": {
                  "additionalProperties": false,
                  "properties": {
                      "author": {
                          "items": {
                              "type": "string"
                          },
                          "title": "author",
                          "type": "array"
                      },
                      "publication_date": {
                          "format": "date-time",
                          "title": "publication_date",
                          "type": "string"
                      },
                      "title": {
                          "title": "title",
                          "type": "string"
                      },
                      "topics": {
                          "items": {
                              "type": "string"
                          },
                          "title": "topics",
                          "type": "array"
                      }
                  },
                  "title": "Book",
                  "type": "object"
              }
          }
      }
      

      We have a single Book definition in this document, in which we find attributes defined in the Yams schema (title and publication_date). We also find the two relations where Book is involved: topics and author, both appearing as a single array of "string" items. The author relationship appears like that because it is mandatory but not composite. On the other hand, the topics relationship has the following uicfg rule:

      uicfg.primaryview_section.tag_subject_of(('Book', 'topics', '*'), 'attributes')
      

      so that its definition appears embedded in the document of the Book definition.

      A typical JSON representation of a Book entity would be:

      {
          "author": [
              "Ernest Miller Hemingway"
          ],
          "title": "The Old Man and the Sea",
          "topics": [
              "sword fish",
              "cuba"
          ]
      }
      
    • in creation context:

      {
          "$ref": "#/definitions/Book",
          "definitions": {
              "Book": {
                  "additionalProperties": false,
                  "properties": {
                      "author": {
                          "items": {
                              "oneOf": [
                                  {
                                      "enum": [
                                          "855"
                                      ],
                                      "title": "Ernest Miller Hemingway"
                                  },
                                  {
                                      "enum": [
                                          "857"
                                      ],
                                      "title": "Victor Hugo"
                                  }
                              ],
                              "type": "string"
                          },
                          "maxItems": 1,
                          "minItems": 1,
                          "title": "author",
                          "type": "array"
                      },
                      "publication_date": {
                          "format": "date-time",
                          "title": "publication_date",
                          "type": "string"
                      },
                      "title": {
                          "title": "title",
                          "type": "string"
                      }
                  },
                  "required": [
                      "title",
                      "publication_date"
                  ],
                  "title": "Book",
                  "type": "object"
              }
          }
      }
      

      Notice the differences: we now only have attributes and required relationships (author) in this schema, and the required keyword lists mandatory attributes. The author property is represented as an array whose items consist of pre-existing objects of the author relationship (namely Author entities).
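
      A form-building client could turn this oneOf/enum structure into a list of choices, for instance to feed a select widget. The choices function below is a hypothetical sketch and author_property copies the author property from the schema above.

```python
def choices(property_schema: dict) -> list:
    """Return (value, label) pairs from an array property using oneOf/enum items."""
    options = property_schema.get("items", {}).get("oneOf", [])
    return [
        # Each option carries a single-valued enum and a human-readable title.
        (option["enum"][0], option.get("title", option["enum"][0]))
        for option in options
    ]


author_property = {
    "items": {
        "oneOf": [
            {"enum": ["855"], "title": "Ernest Miller Hemingway"},
            {"enum": ["857"], "title": "Victor Hugo"},
        ],
        "type": "string",
    },
    "maxItems": 1,
    "minItems": 1,
    "title": "author",
    "type": "array",
}
for value, label in choices(author_property):
    print(value, label)
```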

      Now assume we add the following uicfg declaration:

      uicfg.autoform_section.tag_object_of(('*', 'illustrates', 'Book'), 'main', 'inlined')
      

      the JSON Schema for creation context will be:

      {
          "$ref": "#/definitions/Book",
          "definitions": {
              "Book": {
                  "additionalProperties": false,
                  "properties": {
                      "author": {
                          "items": {
                              "oneOf": [
                                  {
                                      "enum": [
                                          "855"
                                      ],
                                      "title": "Ernest Miller Hemingway"
                                  },
                                  {
                                      "enum": [
                                          "857"
                                      ],
                                      "title": "Victor Hugo"
                                  }
                              ],
                              "type": "string"
                          },
                          "maxItems": 1,
                          "minItems": 1,
                          "title": "author",
                          "type": "array"
                      },
                      "illustrates": {
                          "items": {
                              "$ref": "#/definitions/Illustration"
                          },
                          "title": "illustrates_object",
                          "type": "array"
                      },
                      "publication_date": {
                          "format": "date-time",
                          "title": "publication_date",
                          "type": "string"
                      },
                      "title": {
                          "title": "title",
                          "type": "string"
                      }
                  },
                  "required": [
                      "title",
                      "publication_date"
                  ],
                  "title": "Book",
                  "type": "object"
              },
              "Illustration": {
                  "additionalProperties": false,
                  "properties": {
                      "data": {
                          "format": "data-url",
                          "title": "data",
                          "type": "string"
                      }
                  },
                  "required": [
                      "data"
                  ],
                  "title": "Illustration",
                  "type": "object"
              }
          }
      }
      

      We now have an additional illustrates property modelled as an array of #/definitions/Illustration, the latter also added to the document as an additional definition entry.
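
      On the client side, filling the data property of an Illustration could look like the sketch below. It assumes that the "data-url" format denotes an RFC 2397 data URL with base64 content; this is an assumption on my side about what the server expects, and to_data_url is a hypothetical helper.

```python
import base64


def to_data_url(content: bytes,
                mime_type: str = "application/octet-stream") -> str:
    """Encode raw bytes as a base64 "data:" URL."""
    encoded = base64.b64encode(content).decode("ascii")
    return "data:%s;base64,%s" % (mime_type, encoded)


# An item suitable for the "illustrates" array of a Book being created.
illustration = {"data": to_data_url(b"hello", "text/plain")}
print(illustration)
```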

    Conclusion

    This post illustrated how a basic (CRUD) HTTP API based on JSON Schema could be built for a CubicWeb application using cubicweb-jsonschema. We have seen a couple of details on JSON Schema generation and how it can be controlled. Feel free to comment and provide feedback on this feature set as well as open the discussion with more use cases.

    Next time, we'll discuss how hypermedia controls can be added to the HTTP API that cubicweb-jsonschema provides.

    [1] This choice is essentially driven by simplicity and conformance with the existing behavior, to help migration of existing applications.

  • Using JSONAPI as a Web API format for CubicWeb

    2016/01/26 by Denis Laxalde

    Following the introduction post about rethinking the web user interface of CubicWeb, this article will address the topic of the Web API to exchange data between the client and the server. As mentioned earlier, this question is central and deserves particular interest, and better early than late. Of the two candidate representations previously identified, Hydra and JSON API, this article will focus on the latter. Hopefully, this will give better insight into the capabilities and limits of this specification and help make a decision, though a similar experiment with another candidate would be good to have. Still in the process of blog driven development, this post has several open questions from which a discussion would hopefully emerge...

    A glance at JSON API

    JSON API is a specification for building APIs that use JSON as a data exchange format between clients and a server. The media type is application/vnd.api+json. Version 1.0 has been available since mid-2015. The format has interesting features such as the ability to build compound documents (i.e. responses made of several, usually related, resources) or to specify filtering, sorting and pagination.

    A document following the JSON API format basically represents resource objects, their attributes and relationships as well as some links also related to the data of primary concern.

    Taking the example of a Ticket resource modeled after the tracker cube, we could have a JSON API document formatted as:

    GET /ticket/987654
    Accept: application/vnd.api+json
    
    {
      "links": {
        "self": "https://www.cubicweb.org/ticket/987654"
      },
      "data": {
        "type": "ticket",
        "id": "987654",
        "attributes": {
          "title": "Let's use JSON API in CubicWeb",
          "description": "Well, let's try, at least..."
        },
        "relationships": {
          "concerns": {
            "links": {
              "self": "https://www.cubicweb.org/ticket/987654/relationships/concerns",
              "related": "https://www.cubicweb.org/ticket/987654/concerns"
            },
            "data": {"type": "project", "id": "1095"}
          },
          "done_in": {
            "links": {
              "self": "https://www.cubicweb.org/ticket/987654/relationships/done_in",
              "related": "https://www.cubicweb.org/ticket/987654/done_in"
            },
            "data": {"type": "version", "id": "998877"}
          }
        }
      },
      "included": [{
        "type": "project",
        "id": "1095",
        "attributes": {
            "name": "CubicWeb"
        },
        "links": {
          "self": "https://www.cubicweb.org/project/cubicweb"
        }
      }]
    }
    

    In this JSON API document, top-level members are links, data and included. The latter is used here to ship some resources (here a "project") related to the "primary data" (a "ticket") through the "concerns" relationship as denoted in the relationships object (more on this later).
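
    To illustrate how a client could tie these members together, here is a sketch that looks up the resource a relationship points at within the included member. The function names are hypothetical and the document dictionary is a trimmed copy of the example above.

```python
document = {
    "data": {
        "type": "ticket",
        "id": "987654",
        "relationships": {
            "concerns": {"data": {"type": "project", "id": "1095"}},
            "done_in": {"data": {"type": "version", "id": "998877"}},
        },
    },
    "included": [{
        "type": "project",
        "id": "1095",
        "attributes": {"name": "CubicWeb"},
    }],
}


def index_included(doc: dict) -> dict:
    """Index "included" resources by their (type, id) pair."""
    return {(res["type"], res["id"]): res for res in doc.get("included", [])}


def related_resource(doc: dict, relationship: str):
    """Return the included resource a to-one relationship points at, or None."""
    identifier = doc["data"]["relationships"][relationship]["data"]
    return index_included(doc).get((identifier["type"], identifier["id"]))


print(related_resource(document, "concerns")["attributes"]["name"])
# The "version" resource is not shipped in "included" in this document.
print(related_resource(document, "done_in"))
```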

    While the decision of including or not these related resources along with the primary data is left to the API designer, JSON API also offers a specification to build queries for inclusion of related resources. For example:

    GET /ticket/987654?include=done_in
    Accept: application/vnd.api+json
    

    would lead to a response including the full version resource along with the above content.
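
    Building such a URL on the client side is straightforward; the with_include helper below is a hypothetical sketch that joins several relationship names with commas, as the include parameter expects.

```python
from urllib.parse import urlencode


def with_include(path: str, *relationships: str) -> str:
    """Append a JSON API "include" query parameter listing relationships."""
    if not relationships:
        return path
    # The value is a comma-separated list; keep the commas unescaped.
    query = urlencode({"include": ",".join(relationships)}, safe=",")
    return path + "?" + query


print(with_include("/ticket/987654", "done_in"))
print(with_include("/ticket/987654", "done_in", "concerns"))
```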

    Enough for the JSON API overview. Next I'll present how various aspects of data fetching and modification can be achieved through the use of JSON API in the context of a CubicWeb application.

    CRUD

    CRUD of resources is handled in a fairly standard way in JSON API, relying on HTTP protocol semantics.

    For instance, creating a ticket could be done as:

    POST /ticket
    Content-Type: application/vnd.api+json
    Accept: application/vnd.api+json
    
    {
      "data": {
        "type": "ticket",
        "attributes": {
          "title": "Let's use JSON API in CubicWeb",
          "description": "Well, let's try, at least..."
        },
        "relationships": {
          "concerns": {
            "data": { "type": "project", "id": "1095" }
          }
        }
      }
    }
    

    Then updating it (assuming we got its id from a response to the above request):

    PATCH /ticket/987654
    Content-Type: application/vnd.api+json
    Accept: application/vnd.api+json
    
    {
      "data": {
        "type": "ticket",
        "id": "987654",
        "attributes": {
          "description": "We'll succeed, for sure!"
        }
      }
    }
    

    Relationships

    In JSON API, a relationship is in fact a first-class resource as it is defined by a noun and a URI through a link object. In this respect, the client just receives a couple of links and can then operate on them using the proper HTTP verb. Fetching or updating relationships is done using the special <resource url>/relationships/<relation type> endpoint (self member of relationships items in the first example). Quite naturally, the specification relies on the GET verb for fetching targets, PATCH for (re)setting a relation (i.e. replacing its targets), POST for adding targets and DELETE to drop them.

    GET /ticket/987654/relationships/concerns
    Accept: application/vnd.api+json
    
    {
      "data": {
        "type": "project",
        "id": "1095"
      }
    }
    
    PATCH /ticket/987654/relationships/done_in
    Content-Type: application/vnd.api+json
    Accept: application/vnd.api+json
    
    {
      "data": {
        "type": "version",
        "id": "998877"
      }
    }
    

    The body of requests and responses of this <resource url>/relationships/<relation type> endpoint consists of so-called resource identifier objects, which are lightweight representations of resources usually only containing information about their "type" and "id" (enough to uniquely identify them).
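
    A client could prepare such a relationship update as sketched below. The three helpers are hypothetical; the sketch only describes the PATCH request that would replace the target of a to-one relationship and does not send anything.

```python
MEDIA_TYPE = "application/vnd.api+json"


def resource_identifier(resource: dict) -> dict:
    """Reduce a resource object to its resource identifier object."""
    return {"type": resource["type"], "id": resource["id"]}


def relationship_url(resource_url: str, relation_type: str) -> str:
    """Build the <resource url>/relationships/<relation type> endpoint."""
    return "%s/relationships/%s" % (resource_url.rstrip("/"), relation_type)


def set_relationship_request(resource_url: str, relation_type: str,
                             target: dict) -> dict:
    """Describe the PATCH request replacing the target of a to-one relation."""
    return {
        "method": "PATCH",
        "url": relationship_url(resource_url, relation_type),
        "headers": {"Content-Type": MEDIA_TYPE, "Accept": MEDIA_TYPE},
        "body": {"data": resource_identifier(target)},
    }


version = {"type": "version", "id": "998877", "attributes": {"number": 4.2}}
request = set_relationship_request("/ticket/987654", "done_in", version)
print(request["method"], request["url"], request["body"])
```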

    Related resources

    Remember the related member appearing in relationships links in the first example?

      [ ... ]
      "done_in": {
        "links": {
          "self": "https://www.cubicweb.org/ticket/987654/relationships/done_in",
          "related": "https://www.cubicweb.org/ticket/987654/done_in"
        },
        "data": {"type": "version", "id": "998877"}
      }
      [ ... ]
    

    While this is not a mandatory part of the specification, it has an interesting usage for fetching relationship targets. In contrast with the .../relationships/... endpoint, this one is expected to return plain resource objects (with attributes and relationships information in particular).

    GET /ticket/987654/done_in
    Accept: application/vnd.api+json
    
    {
      "links": {
        "self": "https://www.cubicweb.org/998877"
      },
      "data": {
        "type": "version",
        "id": "998877",
        "attributes": {
            "number": 4.2
        },
        "relationships": {
          "version_of": {
            "links": {
              "self": "https://www.cubicweb.org/998877/relationships/version_of"
            },
            "data": { "type": "project", "id": "1095" }
          }
        }
      },
      "included": [{
        "type": "project",
        "id": "1095",
        "attributes": {
            "name": "CubicWeb"
        },
        "links": {
          "self": "https://www.cubicweb.org/project/cubicweb"
        }
      }]
    }
    

    Meta information

    The JSON API specification allows including non-standard information using a so-called meta object. This can be found in various places of the document (top-level, resource objects or relationships objects). Usage of this field is completely free (and optional). For instance, we could use this field to store the workflow state of a ticket:

    {
      "data": {
        "type": "ticket",
        "id": "987654",
        "attributes": {
          "title": "Let's use JSON API in CubicWeb"
        },
        "meta": { "state": "open" }
    }
    

    Permissions

    Permissions are part of the metadata to be exchanged during request/response cycles. As such, the best place to convey this information is probably within the headers. According to JSON API's FAQ, this is also the recommended way for a resource to advertise supported actions.

    So for instance, the response to a GET request could include an Allow header, indicating which request methods are allowed on the primary resource requested:

    GET /ticket/987654
    Allow: GET, PATCH, DELETE
    

    A HEAD request could also be used for querying allowed actions on links (such as relationships):

    HEAD /ticket/987654/relationships/comments
    Allow: POST
    

    This approach has the advantage of being standard HTTP: no particular knowledge of the permissions model is required and the response body is not cluttered with this metadata.
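    On the client side, reading this information could look like the following sketch. It only assumes that response headers are available as a plain dictionary; the helper names allowed_methods and can are made up for the illustration.

```python
def allowed_methods(headers):
    """Return the set of HTTP methods listed in an Allow header."""
    value = headers.get("Allow", "")
    # The header value is a comma-separated list of method names.
    return {m.strip().upper() for m in value.split(",") if m.strip()}


def can(headers, method):
    """Tell whether `method` is advertised as allowed by `headers`."""
    return method.upper() in allowed_methods(headers)
```

    A client could then enable or disable its "edit" and "delete" controls depending on can(headers, "PATCH") and can(headers, "DELETE"), without knowing anything about the server's permissions model.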

    Another possibility would be to rely on the meta member of JSON API data.

    {
      "data": {
        "type": "ticket",
        "id": "987654",
        "attributes": {
          "title": "Let's use JSON API in CubicWeb"
        },
        "meta": {
          "permissions": ["read", "update"]
        }
      }
    }
    

    Clearly, this would minimize the number of client/server requests.

    More Hypermedia controls

    With the example implementation described above, it already appears possible to manipulate several aspects of the entity-relationship database following a CubicWeb schema: resource fetching, CRUD operations on entities, set/delete operations on relationships. All these "standard" operations are discoverable by the client simply because they are baked into the JSON API format: for instance, adding a target to some relationship is possible by POSTing to the corresponding relationship resource something that conforms to the schema.

    So, implicitly, this already gives us a fairly good level of Hypermedia control so that we're not so far from having a mature REST architecture according to the Richardson Maturity Model. But beyond these "standard" discoverable actions, the JSON API specification does not yet address Hypermedia controls in a generic manner (see this interesting discussion about extending the specification for this purpose).

    So the question is: would we want more? Or, in other words, do we need to define "actions" which would not map directly to a concept in the application model?

    In the case of a CubicWeb application, the most obvious example (that I could think of) of where such an "action" would be needed is workflow state handling. Roughly, workflows in CubicWeb are modeled through two entity types State and TrInfo (for "transition information"), the former being handled through the latter, and a relationship in_state between the workflowable entity type at stake and its current State. It is not so clear how one would model this in terms of HTTP resources. (Arguably we wouldn't want to expose the complexity of the Workflow/TrInfo/State data model to the client, nor can we simply expose this in_state relationship, as a client would not be able to change the state of an entity simply by updating the relation.) So what would be a custom "action" to handle the state of a workflowable resource? Back in our tracker example, how would we advertise to the client the possibility to perform "open"/"close"/"reject" actions on a ticket resource? Open question...

    Request for comments

    In this post, I tried to give an overview of a possible usage of JSON API to build a Web API for CubicWeb. Several aspects were discussed, from simple CRUD operations to relationship handling and non-standard actions. In many cases, there are open questions for which I'd love to receive feedback from the community. Recalling that this topic is a central part of the experiment towards building a client-side user interface to CubicWeb, the more discussion it gets, the better!

    For those wanting to try the experiments for themselves, have a look at the code. This is a work-in-progress/experimental implementation, relying on Pyramid for content negotiation and route traversals.

    What's next? Maybe an alternative experiment relying on Hydra? Or an orthogonal one playing with the schema client-side?


  • Towards building a JavaScript user interface to CubicWeb

    2016/01/08 by Denis Laxalde

    This post is an introduction to a series of articles dealing with an ongoing experiment on building a JavaScript user interface to CubicWeb, to ultimately replace the web component of the framework. The idea of this series is to present the main topics of the experiment, with open questions, in order to engage the community as much as possible. The other side of this is to experiment with a blog-driven development process, so getting feedback is the very point of it!

    As of today, three main topics have been identified:

    • the Web API to let the client and server communicate,
    • the issue of representing the application schema client-side, and,
    • the construction of components of the web interface (client-side).

    As part of the first topic, we'll probably rely on another experimental work about REST-fulness undertaken recently in pyramid-cubicweb (see this head for source code). Then, it appears quite clearly that we'll need sooner or later a representation of data on the client-side and that, quite obviously, the underlying format would be JSON. Apart from exchanging entity (database) information, we already anticipate the need for the HATEOAS part of REST. We already took some time to look at the existing possibilities. At first glance, it seems that hydra is the most promising in terms of capabilities. It's also built using semantic web technologies, which definitely earns bonus points for CubicWeb. On the other hand, it seems a bit isolated and very experimental, while JSON API follows a more pragmatic approach (it describes itself as an anti-bikeshedding tool) and appears to have more traction from various people. For this reason, we chose it for our first draft, but this topic seems so central in a new UI, and so hard to hide as an implementation detail, that it definitely deserves more discussion. Other candidates could be Siren, HAL or Uber.

    Concerning the schema, it seems that there is consensus around JSON-Schema so we'll certainly give it a try.

    Finally, while nothing is certain as of today, we'll probably start building components of the web interface using React, which is also getting quite popular these days. Beyond that choice, the first practical task in this topic will concern the primary view system. This task, being neither too simple nor too complicated, will hopefully result in a clearer overview of what the project will imply. Then, the question of edition will come up at some point. In this respect, perhaps it'll be a good time to put the UX question at a central place, in order to avoid design issues that we had in the past.

    Feedback welcome!


  • Exploring the datafeed API in CubicWeb

    2014/09/26 by Denis Laxalde

    The datafeed API is one of the nice features of the CubicWeb framework. It makes it possible to easily build such things as a news aggregator (or even a semantic news feed reader), an LDAP importer or an application importing data from another web platform. The underlying API is quite flexible and powerful. Yet, the documentation being quite thin, it may be hard to find one's way through. In this article, we'll describe the basics of the datafeed API and provide guiding examples.

    The datafeed API is essentially built around two things: a CWSource entity and a parser, which is a kind of AppObject.

    The CWSource entity defines a list of URLs from which to fetch data to be imported in the current CubicWeb instance; it is linked to a parser through the parser's __regid__. So something like the following should be enough to create a usable datafeed source [1].

    create_entity('CWSource', name=u'some name', type=u'datafeed', parser=u'myparser')
    

    The parser is usually a subclass of DataFeedParser (from cubicweb.server.sources.datafeed). It should at least implement the two methods process and before_entity_copy. To make it easier, there are specialized parsers such as DataFeedXMLParser that already define process so that subclasses only have to implement the process_item method.

    Overview of the datafeed API

    Before going into further details about the actual implementation of a DataFeedParser, it's worth having in mind a few details about the datafeed parsing and import process. This involves various players from the CubicWeb server, namely: a DataFeedSource (from cubicweb.server.sources.datafeed), the Repository and the DataFeedParser.

    • Everything starts from the Repository which loops over its sources and pulls data from each of these (this is done using a looping task which is set up upon repository startup). In the case of datafeed sources, Repository sources are instances of the aforementioned DataFeedSource class [2].
    • The DataFeedSource selects the appropriate parser from the registry and loops on each uri defined in the respective CWSource entity by calling the parser's process method with that uri as argument (methods pull_data and process_urls of DataFeedSource).
    • If the parsing step is successful, the DataFeedSource will call the parser's handle_deletion method with the URIs of the previously imported entities.
    • Then, the import log is formatted and the transaction committed. The DataFeedSource and DataFeedParser are connected to an import_log which feeds the CubicWeb instance with a CWDataImport per data pull. This usually contains the number of created and updated entities along with any error/warning message logged by the parser. All this is visible in a table from the CWSource primary view.
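    The sequence of steps above can be summarized by the following toy model. It is only a sketch of the control flow as described in this article, not the actual CubicWeb implementation: RecordingParser and pull are hypothetical names, the parser merely records the calls it receives, and the import log is reduced to a list of strings.

```python
class RecordingParser(object):
    """Hypothetical stand-in for a parser; it only records calls."""

    def __init__(self, failing_urls=()):
        self.failing_urls = set(failing_urls)
        self.calls = []

    def process(self, url):
        # A real parser would fetch and import data from `url` here.
        if url in self.failing_urls:
            raise IOError("cannot fetch %s" % url)
        self.calls.append(("process", url))

    def handle_deletion(self, uris):
        self.calls.append(("handle_deletion", sorted(uris)))


def pull(urls, parser, previously_imported_uris):
    """Toy version of a data pull; return the import log as a list."""
    log = []
    success = True
    for url in urls:
        try:
            parser.process(url)
        except Exception as exc:
            success = False
            log.append("error: %s" % exc)
    if success:
        # Deletions are only handled when every URL was parsed.
        parser.handle_deletion(previously_imported_uris)
    log.append("pull finished (%s)" % ("ok" if success else "with errors"))
    return log
```

    In this model, a failure on any URL prevents handle_deletion from being called, so a temporarily unreachable feed does not look like a feed whose items have all disappeared.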

    So now, you might wonder what actually happens during the parser's process method call. This method takes a URL from which to fetch data and further processes each piece of data (using a process_item method for instance). For each data item:

    1. the repository is queried to retrieve or create an entity in the system source: this is done using the extid2entity method;
    2. this extid2entity method essentially needs two pieces of information:
      • a so-called extid, which uniquely identifies an item in the distant source
      • any other information needed to create or update the corresponding entity in the system source (this will be later referred to as the sourceparams)
    3. then, given the (new or existing) entity returned by extid2entity, the parser can perform further postprocessing (for instance, updating any relation on this entity).

    In step 1 above, the parser method extid2entity in turn calls the repository method extid2eid given the current source and the extid value. If an entry in the entities table matches the specified extid, the corresponding eid (identifier in the system source) is returned. Otherwise, a new eid is created. It's worth noting that the created entity (in case the entity is to be created) is not complete with respect to the data model at this point. In order for the entity to be completed, the source method before_entity_insertion is called. This is where the aforementioned sourceparams are used. More specifically, on the parser side the before_entity_copy method is called: it usually just updates (using entity.cw_set() for instance) the fetched entity with any relevant information.
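    The lookup-or-create logic described above can be illustrated with a small in-memory model. This is a sketch under simplifying assumptions, not CubicWeb code: ToySource is a hypothetical class in which a dictionary plays the role of the entities table, entities are plain dictionaries, and the sourceparams key "item" is made up for the example.

```python
class ToySource(object):
    """In-memory stand-in for the system source (hypothetical)."""

    def __init__(self):
        self.extid_to_eid = {}  # plays the role of the entities table
        self.entities = {}      # eid -> attributes dictionary
        self._next_eid = 1

    def extid2eid(self, extid):
        """Return (eid, created) for `extid`, allocating an eid if needed."""
        if extid in self.extid_to_eid:
            return self.extid_to_eid[extid], False
        eid = self._next_eid
        self._next_eid += 1
        self.extid_to_eid[extid] = eid
        self.entities[eid] = {}  # incomplete entity at this point
        return eid, True


def before_entity_copy(entity, sourceparams):
    """Complete a freshly created entity from the source parameters."""
    entity.update(sourceparams["item"])


def extid2entity(source, extid, **sourceparams):
    """Retrieve or create the entity identified by `extid`."""
    eid, created = source.extid2eid(extid)
    entity = source.entities[eid]
    if created:
        before_entity_copy(entity, sourceparams)
    return eid, entity
```

    Calling extid2entity twice with the same extid returns the same eid, and in this toy model the completion hook only runs the first time, when the entity is created.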

    Case study: a news feeds parser

    Now we'll go through a concrete example to illustrate all those fairly abstract concepts and implement a datafeed parser which can be used to import news feeds. Our parser will create entities of type FeedArticle, whose minimal data model would be:

    from yams.buildobjs import EntityType, RichString, String
    
    class FeedArticle(EntityType):
        title = String(fulltextindexed=True)
        uri = String(unique=True)
        author = String(fulltextindexed=True)
        content = RichString(fulltextindexed=True, default_format='text/html')
    

    Here we'll reuse the DataFeedXMLParser, not because we have XML data to parse, but because its interface fits well with our purpose, namely: it ships an item-based processing (a process_item method) and it relies on a parse method to fetch raw data. The underlying parsing of the news feed resources will be handled by feedparser.

    class FeedParser(DataFeedXMLParser):
        __regid__ = 'newsaggregator.feed-parser'
    

    The parse method is called by process; it should return a list of tuples with item information.

    def parse(self, url):
        """Delegate to feedparser to retrieve feed items"""
        data = feedparser.parse(url)
        return zip(data.entries)
    

    Then the process_item method takes an individual item (i.e. an entry of the result obtained from feedparser in our case). It essentially defines an extid, here the uri of the feed entry (a good candidate for uniqueness), and calls extid2entity with that extid, the entity type to be created / retrieved and any additional data useful for entity completion passed as keyword arguments. (The process_feed method call just transforms the results obtained from feedparser into a dict suitable for entity creation following the data model described above.)

    def process_item(self, entry):
        data = self.process_feed(entry)
        extid = data['uri']
        entity = self.extid2entity(extid, 'FeedArticle', feeddata=data)
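    The process_feed helper is not shown in this article. As a rough idea of what it could look like, here is a sketch written as a plain function rather than a method. It assumes that an entry behaves like a mapping with the usual feedparser keys (title, link, author, summary); the actual implementation in the news aggregator cube may well differ.

```python
def process_feed(entry):
    """Map a feed entry to a dict of FeedArticle attributes.

    `entry` is any mapping with feedparser-like keys; missing keys
    fall back to an empty string.
    """
    return {
        "title": entry.get("title", u""),
        "uri": entry.get("link", u""),
        "author": entry.get("author", u""),
        "content": entry.get("summary", u""),
    }
```

    The keys of the returned dictionary match the attributes of the FeedArticle entity type defined earlier, which is what allows the result to be used directly for entity completion.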
    

    The before_entity_copy method is called before the entity is actually created (or updated) in order to give the parser a chance to complete it with any other attribute that could be set from source data (namely feedparser data in our case).

    def before_entity_copy(self, entity, sourceparams):
        feeddata = sourceparams['feeddata']
        entity.cw_edited.update(feeddata)
    

    And this is essentially all that's needed for a simple parser. Further details can be found in the news aggregator cube. More sophisticated parsers may use other concepts not described here, such as source mappings.

    Testing datafeed parsers

    Testing a datafeed parser often involves pulling data from the corresponding datafeed source. Here is a minimal test snippet that illustrates how to retrieve the datafeed source from a CWSource entity and to pull data from it.

    with self.admin_access.repo_cnx() as cnx:
        # Assuming one knows the URI of a CWSource.
        rset = cnx.execute('CWSource X WHERE X uri %(uri)s', {'uri': uri})
        # Retrieve the datafeed source instance.
        dfsource = self.repo.sources_by_eid[rset[0][0]]
        # Make sure its parser matches the expected one.
        self.assertEqual(dfsource.parser_id, '<my-parser-id>')
        # Pull data using an internal connection.
        with self.repo.internal_cnx() as icnx:
            stats = dfsource.pull_data(icnx, force=True, raise_on_error=True)
            icnx.commit()
    

    The resulting stats is a dictionary containing the eids of entities created and updated during the pull. In addition, all entities created should have the cw_source relation set to the corresponding CWSource entity.

    Notes

    [1]

    It is possible to add some configuration to the CWSource entity in the form of a string of configuration items (one per line). Noteworthy items are:

    • the synchronization-interval;
    • use-cwuri-as-url=no, which avoids using external URL inside the CubicWeb instance (leading to any link on an imported entity to point to the external source URI);
    • delete-entities=[yes,no] which controls if entities not found anymore in the distant source should be deleted from the CubicWeb instance.
    [2]

    The mapping between CWSource entities' type (e.g. "datafeed") and DataFeedSource object is quite unusual as it does not rely on the vreg but uses a specific sources registry (defined in cubicweb.server.SOURCE_TYPES).

  • Handling dependencies between form fields in CubicWeb

    2014/07/11 by Denis Laxalde

    This post considers the issue of building an edition form of a CubicWeb entity with dependencies on its fields. It's a quite common issue that needs to be handled client-side, based on user interaction.

    Consider the following example schema:

    from yams.buildobjs import EntityType, RelationDefinition, String, SubjectRelation
    from cubicweb.schema import RQLConstraint
    
    _ = unicode
    
    class Country(EntityType):
        name = String(required=True)
    
    class City(EntityType):
        name = String(required=True)
    
    class in_country(RelationDefinition):
        subject = 'City'
        object = 'Country'
        cardinality = '1*'
    
    class Citizen(EntityType):
        name = String(required=True)
        country = SubjectRelation('Country', cardinality='1*',
                                  description=_('country the citizen lives in'))
        city = SubjectRelation('City', cardinality='1*',
                               constraints=[
                                   RQLConstraint('S country C, O in_country C')],
                               description=_('city the citizen lives in'))
    

    The main entity of interest is Citizen which has two relation definitions towards Country and City. Then, a City is bound to a Country through the in_country relation definition.

    In the automatic edition form of Citizen entities, we would like to restrict the choices of cities depending on the selected Country, as determined from the value of the country field. (In other words, we'd like the constraint on the city relation defined above to be fulfilled during form rendering, not just validation.) Typically, in the image below, cities not in Italy should not be available in the city select widget:

    Example of Citizen entity edition form.

    The issue will be solved by a little customization of the automatic entity form, some uicfg rules and a bit of Javascript. In the following, the country field will be referred to as the master field and the city field as the dependent field.

    So here is the code of the views.py module:

    from cubicweb.predicates import is_instance
    from cubicweb.web.views import autoform, uicfg
    from cubicweb.uilib import js
    
    _ = unicode
    
    
    class CitizenAutoForm(autoform.AutomaticEntityForm):
        """Citizen autoform handling dependencies between Country/City form fields
        """
        __select__ = is_instance('Citizen')
    
        needs_js = autoform.AutomaticEntityForm.needs_js + ('cubes.demo.js', )
    
        def render(self, *args, **kwargs):
            master_domid = self.field_by_name('country', 'subject').dom_id(self)
            dependent_domid = self.field_by_name('city', 'subject').dom_id(self)
            self._cw.add_onload(js.cw.cubes.demo.initDependentFormField(
                master_domid, dependent_domid))
            return super(CitizenAutoForm, self).render(*args, **kwargs)
    
    
    def city_choice(form, field):
        """Vocabulary function grouping city choices by country."""
        req = form._cw
        vocab = [(req._('<unspecified>'), '')]
        for eid, name in req.execute('Any X,N WHERE X is Country, X name N'):
            rset = req.execute('Any N,E ORDERBY N WHERE'
                               ' X name N, X eid E, X in_country C, C eid %(c)s',
                               {'c': eid})
            if rset:
                # 'optgroup' tag.
                oattrs = {'id': 'country_%s' % eid}
                vocab.append((name, None, oattrs))
                for label, value in rset.rows:
                    # 'option' tag.
                    vocab.append((label, str(value)))
        return vocab
    
    
    uicfg.autoform_field_kwargs.tag_subject_of(('Citizen', 'city', '*'),
                                               {'choices': city_choice, 'sort': False})
    

    The first thing (reading from the bottom of the file) is that we've added a choices function on the city relation of the Citizen automatic entity form via uicfg. This function city_choice essentially generates the HTML content of the field value by grouping available cities by respective country through the addition of some optgroup tags.

    Then, we've overridden the automatic entity form for the Citizen entity type by essentially calling a piece of Javascript code fed with the DOM ids of the master and dependent fields. Fields are retrieved by their name (field_by_name method) and their respective ids using the dom_id method.

    Now the Javascript part of the picture:

    cw.cubes.demo = {
        // Initialize the dependent form field select and bind update event on
        // change on the master select.
        initDependentFormField: function(masterSelectId,
                                         dependentSelectId) {
            var masterSelect = cw.jqNode(masterSelectId);
            cw.cubes.demo.updateDependentFormField(masterSelect, dependentSelectId);
            masterSelect.change(function(){
                cw.cubes.demo.updateDependentFormField(this, dependentSelectId);
            });
        },
    
        // Update the dependent form field select.
        updateDependentFormField: function(masterSelect,
                                           dependentSelectId) {
            // Clear previously selected value.
            var dependentSelect = cw.jqNode(dependentSelectId);
            $(dependentSelect).val('');
            // Hide all optgroups.
            $(dependentSelect).find('optgroup').hide();
            // But the one corresponding to the master select.
            $('#country_' + $(masterSelect).val()).show();
        }
    }
    

    It consists of two functions. The initDependentFormField function is called during form rendering and essentially binds the second function, updateDependentFormField, to the change event of the master select field. The latter "update" function retrieves the dependent select field, hides all optgroup nodes (i.e. the whole content of the select widget) and then only shows the dependent options that match the selected master option, identified by a custom country_<eid> id set by the vocabulary function above.