Blog entries by Sandrine Ribeau [3]

OSCON 2010 - Data freedom and the semantic web

2010/07/29 by Sandrine Ribeau

I presented CubicWeb at OSCON 2010. I could only stay for a day and I did not get a chance to see a lot of talks, but judging from the conference schedule it seems only a few of them were related to making data available on the web. I will focus on these talks, for they are very relevant to us who are building the semantic web.

http://assets.en.oreilly.com/1/event/45/oscon2010_125x125.jpg

I highly encourage you to watch this video of Stormy Peters, "Is Your Data Free?". It addresses the issue of the privacy of data that you think belongs to you but actually doesn't. This is exactly what is behind the CubicWeb design: build your own web of data in a permission-based environment in order to preserve your privacy.

http://wiki.freebase.com/skins/freebaseUpdate/freebaselogo.png

Open source, Open data, presented by the Freebase folks, draws a very interesting parallel between open source and open data, and raises the problems of versioning open data and of providing quality data. Open source software has methodologies and tools to ensure well-designed and reliable code. There is nothing so far that properly handles data versioning and data quality assurance. That is Freebase's biggest concern, and through this talk they asked the open source community for help, so that more people get involved in finding solutions to serve open data.

An attendee raised an interesting question about the format that everybody would agree to use to represent the data. I was surprised by the answer. So far they do not seem to consider this a concern; I would not say they don't care, but almost. For Freebase, the main concern and the most challenging part of data representation is to have a unique identifier. I am not quite sure I agree. Yes, this is important, even mandatory, but there is also the need to define or use a known format (RDF for example) to represent this data, so that we can source it. To be semantic, data needs to be both identifiable and readable. And I do not see the point of publishing data on the web if it is not ready to use.
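
To make the point about identifiers and formats concrete, here is a minimal sketch in plain Python (no RDF library) that serializes one record as N-Triples. The example.org URI, the sample values and the `to_ntriples` and `escape_literal` helpers are made up for illustration; only the FOAF property URIs are real vocabulary terms.

```python
# Hypothetical sketch: a record becomes reusable once it has a unique
# identifier (a URI) and is published in a known format (here N-Triples).
FOAF = "http://xmlns.com/foaf/0.1/"


def escape_literal(value):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n"))


def to_ntriples(subject_uri, properties):
    """Return one N-Triples line per (predicate URI, literal value) pair."""
    lines = []
    for predicate_uri, value in properties:
        lines.append('<%s> <%s> "%s" .' % (
            subject_uri, predicate_uri, escape_literal(value)))
    return "\n".join(lines)


# Made-up identifier and values, for illustration only.
person = "http://example.org/person/1"
doc = to_ntriples(person, [
    (FOAF + "name", "Jane Doe"),
    (FOAF + "nick", "jdoe"),
])
print(doc)
```

The URI makes the record identifiable, and the shared format and vocabulary make it readable by any consumer that knows them, without a custom parser.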

Just for fun, look at Rewrite or Refactor: When to Declare Technical Bankruptcy; it might sound familiar to you...

The CubicWeb presentation went well: the audience was interested and very happy to see that we could aggregate multiple types of sources in a CubicWeb application. Of course, it would be even better if we supported an RDF source such as DBpedia: don't worry, that's going to happen. The semantic views already integrated in the framework also raised interest: SIOC for blog entries, OWL for the schema, FOAF for users and DOAP for projects.

 


 

By providing a platform for using data from multiple sources and publishing semantic data, CubicWeb is already a piece of the web of open data!


Django, lessons learned in the world of startup companies

2010/06/02 by Sandrine Ribeau

I went to the BayPIGgies meeting last Thursday. This session's talk was given by the chief software architect of RubberCan, Barnaby Bienkowski. The idea was to explain why Django turns out to be the choice a lot of startups make when building their web applications.

Government 2.0

http://assets.sunlightfoundation.com/site/3.0/images/sf_logo_trans.png

The fact that Django is recommended by the Sunlight Foundation is important. This foundation is a non-partisan, non-profit organization based in Washington, DC that focuses on the digitization of government data and the creation of tools and Web sites to make that data easily accessible for all citizens. This is part of what is called Government 2.0, a neologism for attempts to apply the social networking and integration advantages of Web 2.0 to the practice of government (see E-Government).

It looks like the Sunlight Foundation recommends Django because it comes from the publishing industry. I am not sure what is so special about this, and I wish I could get more details on it, so please add your comments below.

Since the CubicWeb community is still small, we are not yet recommended by such a large foundation, but we will make more of an effort to talk about CubicWeb and try to expand our community.

Geolocation

http://geodjango.org/images/globe.png

These days, geolocation is a big deal in most applications. On that matter, what Django has to offer is GeoDjango, which recently became part of the Django core. It is integrated with the ORM and has pre-generated SQL queries, but these queries are not optimized. It uses PostGIS, which adds support for geographic objects to the PostgreSQL object-relational database. GeoDjango strives to make it as simple as possible to create geographic web applications, like location-based services. Some of the features it provides are:

  • Extensions to Django’s ORM for the querying and manipulation of spatial data
  • Editing of geometry fields inside the administration panels
  • Loosely-coupled, high-level Python interfaces for GIS geometry operations and data formats.
http://openstreetmap.org/images/osm_logo.png?1271689861

OpenStreetMap is used for the backend. It provides geographic data for any part of the world. This is a nice feature and we should consider it for CubicWeb. What we provide so far is the IGeocodable interface with the related views gmap-view, gmap-bubble, geocoding-json and gmap-legend. We do not query this data yet; we simply render it nicely in a Google Map. You can find the details on how to use it here.

Online stores

Numerous web applications are not only service or data providers, they sell something. Satchmo is the Django tool to easily build online stores. It provides a shopping cart framework with checkout using different payment modules such as Authorize.net, TrustCommerce, CyberSource, PayPal, Google Checkout or Protx.

CubicWeb does not provide a component for building an online store; it is not a domain we have worked on yet. But I'd like to talk a bit about the cubicweb-shoppingcart cube. This cube defines shopping items and a shopping cart, and makes it possible to add items to the cart. It defines the types of shopping items, and only those can be added to the shopping cart. Whereas Satchmo requires you to define categories and add items within a category, cubicweb-shoppingcart does not force you to define categories: creating shopping items is the only thing you need to do. That makes this component usable for more than online stores. For example, we used this cube to manage EuroSciPy registration fees, reusing the generic schema of a "virtual" shopping cart and its related resources (web widgets, validation hook, ...).
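
The idea can be pictured with a small plain-Python sketch. This is not the code of cubicweb-shoppingcart; the `ShoppingItem` and `ShoppingCart` classes and the sample fees are hypothetical, and only illustrate a cart that accepts declared item types without any category layer.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ShoppingItem:
    """A sellable thing; note there is no category attribute."""
    name: str
    price_cents: int


@dataclass
class ShoppingCart:
    # Only instances of these types may be added to the cart.
    allowed_types: tuple = (ShoppingItem,)
    items: list = field(default_factory=list)

    def add(self, item):
        if not isinstance(item, self.allowed_types):
            raise TypeError("%r is not a shopping item" % (item,))
        self.items.append(item)

    def total_cents(self):
        return sum(item.price_cents for item in self.items)


# Made-up fees, in the spirit of a conference registration.
cart = ShoppingCart()
cart.add(ShoppingItem("conference registration", 10000))
cart.add(ShoppingItem("tutorial registration", 5000))
print(cart.total_cents())
```

Because nothing in the cart assumes a product catalogue, the same structure fits registration fees as well as goods.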

Re-usable components

http://pinaxproject.com/site_media/img/pinax_logo.png

Pinax gets good overall satisfaction, as it provides basic components for blogging, tagging, registration, notification and so on. But one point that was raised is the difficulty of customizing Pinax components. It seems easy to write your own version of a Pinax component, but integrating it is a pain. All the components are tightly coupled, and customizing one is very likely to affect the others.

This last point is a big disadvantage. Why? Well, as a developer, there is always something you need to adjust to fit your needs, so customizing components is something you cannot avoid while developing your web application. Something I'd like to point out about CubicWeb is how simple it is to reuse existing components, which are independent of each other. It is as easy as Python inheritance. And with its VRegistry, selectors and application objects (see The VRegistry, selectors and application objects for more details), customization is well integrated into the framework.
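
As a rough mental model of selector-based customization, here is a simplified sketch in plain Python. It does not use CubicWeb's actual API: the `Registry` class, the `any_entity` and `is_book` selectors and the two view classes are hypothetical. It only illustrates the principle that several objects can be registered under the same identifier, each with a selector that scores the context, and that the best-scoring one is picked.

```python
class Registry:
    """Toy registry: picks the registered class whose selector scores highest."""

    def __init__(self):
        self._objects = {}

    def register(self, cls):
        self._objects.setdefault(cls.regid, []).append(cls)
        return cls

    def select(self, regid, entity):
        scored = [(cls.selector(entity), cls) for cls in self._objects[regid]]
        score, best = max(scored, key=lambda pair: pair[0])
        if score == 0:
            raise LookupError("no applicable object for %r" % regid)
        return best()


def any_entity(entity):
    """Generic selector: applies to everything, with a low score."""
    return 1


def is_book(entity):
    """Specific selector: scores higher, but only for books."""
    return 2 if entity.get("type") == "Book" else 0


registry = Registry()


@registry.register
class PrimaryView:
    regid = "primary"
    selector = staticmethod(any_entity)

    def render(self, entity):
        return "<h1>%s</h1>" % entity["title"]


@registry.register
class BookPrimaryView(PrimaryView):
    # Customization by plain inheritance: reuse render() and extend it.
    selector = staticmethod(is_book)

    def render(self, entity):
        return super().render(entity) + "<p>by %s</p>" % entity["author"]


book = {"type": "Book", "title": "Dune", "author": "Frank Herbert"}
note = {"type": "Note", "title": "Todo"}
print(registry.select("primary", book).render(book))
print(registry.select("primary", note).render(note))
```

The generic view is left untouched: the customized one is simply registered next to it and wins only where its selector applies.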

Assembling cubes and functionalities is very easy as well. Let's take an example. We have three cubes: cubicweb-book, cubicweb-tag and cubicweb-comment. cubicweb-book defines the Book entity type. cubicweb-tag defines Tag entities and the ability to tag other entity types. cubicweb-comment defines the Comment entity type and the ability to comment on other entity types. What if we want to create an application in which we can tag and comment on Books? Well, this is done with the following schema definition, where we explicitly define the relations between the Book, Tag and Comment entity types:

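from yams.buildobjs import RelationDefinition


class comments(RelationDefinition):
    """Allow comments on books."""
    subject = 'Comment'
    object = 'Book'
    # each Comment comments exactly one Book; a Book can have any number of comments
    cardinality = '1*'
    # the Book is the composite: deleting a Book also deletes its comments
    composite = 'object'


class tag(RelationDefinition):
    """Allow tags on books."""
    subject = 'Tag'
    object = 'Book'
    # a Tag can be put on any number of Books; a Book can have any number of Tags
    cardinality = '**'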
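
For readers new to yams, the two-character cardinality strings can be decoded as follows. The `describe_cardinality` helper is a hypothetical illustration in plain Python, not part of yams; it assumes the usual yams convention where the first character constrains how many objects each subject may have, and the second how many subjects each object may have.

```python
# Meaning of each cardinality character, following the yams convention.
MEANING = {
    "1": "exactly one",
    "?": "zero or one",
    "+": "one or more",
    "*": "any number of",
}


def describe_cardinality(subject, relation, obj, cardinality):
    """Spell out a two-character cardinality string in plain English."""
    if len(cardinality) != 2 or any(c not in MEANING for c in cardinality):
        raise ValueError("invalid cardinality: %r" % cardinality)
    subject_side, object_side = cardinality
    return [
        "each %s %s %s %s" % (subject, relation, MEANING[subject_side], obj),
        "each %s is the object of %s for %s %s" % (
            obj, relation, MEANING[object_side], subject),
    ]


for line in describe_cardinality("Comment", "comments", "Book", "1*"):
    print(line)
for line in describe_cardinality("Tag", "tag", "Book", "**"):
    print(line)
```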

Forms

Although forms are easy in Django, there is no way, at least for now (see this proposition), to add inline entities as easily as in CubicWeb (see HTML form construction for more details). Inline entities are very neat when you create or edit related entities. Plus, since CubicWeb 3.6, forms are much easier to handle, and we are still putting a lot of effort into making them simpler.

So, yes, overall Django is selected as the best compromise, but for the reasons I listed, CubicWeb should be considered.

Watch out Django, we are coming your way ;)


OSCON 2010 discount!!

2010/05/21 by Sandrine Ribeau
http://assets.en.oreilly.com/1/event/45/oscon2010_12year.png

Since Logilab will be presenting CubicWeb at OSCON, we have a discount code giving 20% off OSCON registration. Please feel free to use this discount code while registering: os10fos.

See you there!