Blog entries May 2012 [2]
CubicWeb 3.15 introduces a number of new features. In short (more details below):
- ability to use ZMQ instead of Pyro to connect to repositories
- ZMQ inter-instances messages bus
- new LDAP source using the datafeed approach, much more flexible than the legacy 'ldapuser' source
- full undo support
Plus some refactorings regarding Ajax function calls, WSGI, the registry, etc. Details follow.
- Added a ZMQ server, based on the cutting-edge ZMQ socket
library. It allows access to remote instances, in much the same way as Pyro.
- Publish/subscribe mechanism using ZMQ for communication among cubicweb
instances. The new zmq-address-sub and zmq-address-pub configuration variables
define where this communication occurs. As of this release this mechanism is
used for entity cache invalidation.
- Improved WSGI support. While there are still some caveats, most of the code
that was Twisted-only is now generic, which allows the related features to work
with a WSGI front-end.
- Full undo/transaction support: undo of modifications has finally been
implemented, and the configuration simplified (basically you activate it or not
on a per-instance basis).
- Controlling the returned HTTP status code is now much easier:
  - WebRequest now has a status_out attribute to control the response status;
  - most web-side exceptions take an optional status argument.
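To make the status handling above concrete, here is a minimal sketch. It is a toy model, not CubicWeb code: ToyRequest, ToyWebError and render are hypothetical stand-ins, and only the status_out attribute name and the optional status argument come from the notes above.

```python
class ToyWebError(Exception):
    """Hypothetical web-side exception carrying an optional HTTP status."""

    def __init__(self, message, status=500):
        super().__init__(message)
        self.status = status


class ToyRequest:
    """Hypothetical request object; status_out mirrors the attribute name above."""

    def __init__(self):
        self.status_out = 200


def render(req, view):
    """Call the view; if it raises, copy the exception status onto the request."""
    try:
        return view(req)
    except ToyWebError as exc:
        req.status_out = exc.status
        return str(exc)


def created_view(req):
    # A view can set the response status directly.
    req.status_out = 201
    return "created"


def gone_view(req):
    # Or let an exception carry the status.
    raise ToyWebError("resource is gone", status=410)
```

In this sketch the publisher never needs to know which view ran: it only reads req.status_out once rendering is over.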
The base registry implementation has been moved to a new
logilab.common.registry module (see #1916014). This includes code from:
- cubicweb.vreg (everything that was in there)
- cw.appobject (base selectors and all).
In the process, some renaming was done:
- the top level registry is now RegistryStore (was VRegistry), but that
should not impact CubicWeb client code;
- former selector functions are now known as "predicates", though you still use
predicates to build an object's selector;
- for consistency, the objectify_selector decorator has hence been renamed to
objectify_predicate;
- on the CubicWeb side, the selectors module has been renamed to
predicates.
The debugging refactoring removed the need for the lltrace decorator. There
should be full backward compatibility, with proper deprecation warnings. Note that the
yes predicate, the objectify_predicate decorator and the
traced_selection function should now be imported from the
logilab.common.registry module.
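For readers new to the vocabulary, the sketch below shows how predicates combine into a selector and how a registry picks an object. It is a self-contained toy, not the logilab.common.registry implementation: the Predicate class, the select function and the two view classes are hypothetical, and the name yes is reused only to echo the real predicate. The scoring rule (0 means "not selectable", the highest score wins) is an assumption of the sketch.

```python
class Predicate:
    """Toy predicate: wraps a function returning a score (0 = not selectable)."""

    def __init__(self, func):
        self.func = func

    def __call__(self, context):
        return self.func(context)

    def __and__(self, other):
        # Both predicates must match; the combined score is the sum.
        def both(context):
            left = self(context)
            if not left:
                return 0
            right = other(context)
            return left + right if right else 0

        return Predicate(both)


@Predicate
def yes(context):
    # Always selectable, with the lowest positive score.
    return 1


@Predicate
def is_logged_in(context):
    return 1 if context.get("user") else 0


class DefaultView:
    selector = yes


class UserView:
    selector = yes & is_logged_in


def select(candidates, context):
    """Return the candidate whose selector gives the highest non-zero score."""
    score, best = max(
        ((cls.selector(context), cls) for cls in candidates),
        key=lambda pair: pair[0],
    )
    if score == 0:
        raise LookupError("no selectable object")
    return best
```

With a logged-in user in the context, UserView scores higher than DefaultView and is selected; with an empty context only DefaultView remains selectable.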
All login forms are now submitted to <app_root>/login. Redirection to the requested
page is now handled by the login controller (it was previously handled by the
session manager).
Publisher.publish has been renamed to Publisher.handle_request. This
method now contains a generic version of the logic previously handled by
Twisted. Controller.publish is not affected.
- New 'ldapfeed' source type, designed to replace the 'ldapuser' source and built
on the data-feed (i.e. copy-based) source approach.
- New 'zmqrql' source type, similar to 'pyrorql' but using ømq instead of Pyro.
- A new registry called 'services' has appeared, where you can register
server-side cubicweb.server.Service child classes. Their call method can be
invoked from a web-side AppObject instance using the new self._cw.call_service
method or a server-side one using self.session.call_service. This is a new
way to call server-side methods, much cleaner than monkey patching the
Repository class, which becomes a deprecated way to perform similar tasks.
- a new ajaxfunction registry now hosts all remote functions (i.e. functions
callable through the asyncRemoteExec JS api). A convenience ajaxfunc
decorator will let you expose your python functions easily without all the
appobject standard boilerplate. Backwards compatibility is preserved.
- the 'json' controller is now deprecated in favor of the 'ajax' one.
- WebRequest.build_url can now take a __secure__ argument. When True, cubicweb
tries to generate an https url.
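The 'services' and 'ajaxfunction' registries described above follow the same pattern: objects are registered under an identifier and looked up by name at call time. The sketch below is a toy model of that pattern, not CubicWeb code. REGISTRIES, register, ToyService and the simplified ajaxfunc and call_service functions are all hypothetical stand-ins; the real ajaxfunc decorator and call_service method have their own signatures.

```python
# Toy registries, keyed by registry name then by object identifier.
REGISTRIES = {"services": {}, "ajaxfunction": {}}


def register(registry, regid):
    """Class decorator storing the decorated object in the named registry."""

    def decorator(obj):
        REGISTRIES[registry][regid] = obj
        return obj

    return decorator


class ToyService:
    """Stand-in for a server-side service: subclasses implement call()."""

    def call(self, **kwargs):
        raise NotImplementedError


@register("services", "count-words")
class CountWords(ToyService):
    def call(self, text):
        return len(text.split())


def call_service(regid, **kwargs):
    """Look up a service by identifier, instantiate it and invoke call()."""
    service_cls = REGISTRIES["services"][regid]
    return service_cls().call(**kwargs)


def ajaxfunc(func):
    """Simplified decorator: expose a plain function in the 'ajaxfunction' registry."""
    REGISTRIES["ajaxfunction"][func.__name__] = func
    return func


@ajaxfunc
def word_count(text):
    # A remote function delegating to a server-side service.
    return call_service("count-words", text=text)
```

The point of the design is that callers depend on an identifier such as "count-words" rather than on a method patched onto the Repository class.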
A new 'undohistory' view lists the undoable transactions and lets you undo
some of them.
This is a fairly technical post talking about the structural changes I would like to see in CubicWeb's near future. Let's call that CubicWeb 4.0! It also drafts ideas on how to go from here to there. Draft, really. But that will eventually turn into a nice roadmap hopefully.
Some parts of cubicweb are sometimes too hairy for different reasons (some good,
most bad). This makes it harder to get started quickly. The goal of CubicWeb 4.0 should be to make things simpler:
- Fix some bad old design.
- Stop reinventing the wheel and use widely used libraries in the Python Web
World. This extends to benefiting from state-of-the-art libraries to build nice
and flexible UI such as Bootstrap, on top of the jQuery foundations (which could
become as prominent as the Python standard library in CubicWeb; the development team should get
ready for it).
- If there is a best way to do something, just do it and refrain from providing configurability and options.
First, a few simple things could be done to simplify the UI code:
- drop xhtml support: always return text/html content type, stop bothering
with this stillborn stuff and use html5
- move away everything that should not be in the framework: calendar?, embedding,
igeocodable, isioc, massmailing, owl?, rdf?, timeline, timetable?, treeview?,
vcard, wdoc?, xbel, xmlrss?
Then we should probably move the default UI into some cubes (i.e. the content of
cw.web.views and cw.web.data). Besides making the move to Bootstrap easier, this
should also have the benefit of making clearer that this is the default way to
build an (automatic) UI in CubicWeb, but one may use other, more usual,
strategies (such as using a template language).
At first glance, we should start with the following core cubes:
- corelayout, the default interface layout and generic components. Modules to
backport there: application (not an appobject yet), basetemplates, error,
boxes, basecomponents, facets, ibreadcrumbs, navigation, undohistory.
- coreviews, the default generic views and forms. Modules to backport there:
actions, ajaxedit, baseviews, autoform, dotgraphview, editcontroller,
editforms, editviews, forms, formrenderers, primary, json, pyviews, tableview,
reledit, tabs.
- corebackoffice, the concrete views for the default back-office that let you
handle users, sources, debugging, etc. through the web. Modules to backport
here: cwuser, debug, bookmark, cwproperties, cwsources, emailaddress,
management, schema, startup, workflow.
- coreservices, the various services that are not directly related to displaying
something. Modules to backport here: ajaxcontroller, apacherewrite,
authentication, basecontrollers, csvexport, idownloadable, magicsearch,
sessions, sparql, staticcontrollers, urlpublishing, urlrewrite.
This is a first draft that will need some adjustments. Some of the listed
modules should be split (e.g. actions, boxes) and their content moved to
different core cubes. Also some modules in the cubicweb.web package may be moved
to the relevant cube.
Each cube should provide an interface so that one could replace it with another
one. For instance, move from the default coreviews and corelayout cubes to
Bootstrap-based ones. This should allow a nice migration path from the current UI
to a Bootstrap-based UI. Bootstrap should probably be introduced bottom-up: start
using it for tables, lists, etc. then go up until the layout defined in the main
template. The Orbui experience should greatly help us by pointing at hot spots
that will have to be tackled, as well as by providing a nice code base from which
we should start.
Regarding the current implementation, we should keep in mind that contextual components
are a powerful way to build a "pluggable" UI, but we should probably add an
intermediate layer that would make the following more obvious / explicit:
- what the available components are
- what the available slots are
- which component should go in which slot when possible
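One possible shape for such an intermediate layer is a plain declaration of slots and components, checked when the layout is built. The sketch below is purely illustrative: the slot names, component names and build_layout function are hypothetical and do not exist in CubicWeb.

```python
# Hypothetical declaration: the available slots, in display order.
SLOTS = ("header", "left", "main", "footer")

# Hypothetical declaration: which component goes in which slot.
COMPONENTS = {
    "breadcrumbs": "header",
    "search_box": "left",
    "bookmarks": "left",
    "copyright": "footer",
}


def build_layout(slots, components):
    """Group component names by slot, rejecting unknown slots early."""
    layout = {slot: [] for slot in slots}
    for name, slot in components.items():
        if slot not in layout:
            raise ValueError("component %r targets unknown slot %r" % (name, slot))
        layout[slot].append(name)
    return layout


layout = build_layout(SLOTS, COMPONENTS)
```

Keeping the mapping in one explicit structure answers the three questions above by inspection, and a typo in a slot name fails at build time instead of silently hiding a component.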
Also at some point, we should take care to separate a view's logic from HTML
generation: our experience with client work shows that a common need is to use
the logic but produce different HTML. Though we should wait for more use of
Bootstrap and related HTML simplification to see if the CSS power doesn't
somewhat fulfill that need.
The current looping task / repo thread mechanism is used for various sorts of
things and has several problems:
- tasks don't behave consistently in a multi-instance configuration (some should
be executed in a single instance, some in a subset); the task system was
originally written in a single-instance context; as of today this is (sometimes)
handled using configuration options (that have to be properly set in each
instance configuration file);
- tasks are a repository-only API but we also need web-side tasks;
- there is probably some abuse of the system that may lead to unnecessary
resource usage.
Analyzing a sample http://www.logilab.org/ instance, below are the running looping
tasks, by category. Tasks that have to run on each web instance:
- clean_sessions, automatically closes unused repository sessions. Notice
cw.etwist.server also records a twisted task to clean web sessions. Some
changes are imminent on this; they will be addressed in the upcoming refactoring session (which will
become more and more necessary to move forward on several points listed here).
- regular_preview_dir_cleanup (preview cube), cleans up files in the
preview filesystem directory. Could be executed by one (or some) of the web
instances, provided that the preview directory is shared.
Tasks that should run on a single instance:
- update_feeds, update copy based sources (e.g. datafeed, ldapfeed). Controlled
by 'synchronize' source configuration (persistent source attribute that may be
overridden by instance using CWSourceHostConfig entities)
- expire_dataimports, delete CWDataImport entities older than an amount of
time specified in the 'logs-lifetime' configuration option. Not controlled
yet.
- cleanup_auth_cookies (rememberme cube), delete CWAuthCookie entities
whose life-time is exhausted. Not controlled yet.
- cleaning_revocation_key (forgotpwd cube), delete Fpasswd entities with
past revocation_date. Not controlled yet.
- cleanup_plans (narval cube), delete Plan entity instances older than an
amount of time specified in the configuration. If 'plan-cleanup-delay' is set
to an empty value, the task isn't started.
- refresh_local_repo_caches (vcsfile cube), pull or clone the vcs repository
cache if the Repository entity asks for import_revision_content (hence web
instances should have an up-to-date cache to display file contents) or if the
'repository-import' configuration option is set to 'yes'; import vcs repository
content as entities if the 'repository-import' configuration option is set and it is
coming from the system source.
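The split above between tasks that run on every instance and tasks that should run on a single one could be made explicit in the task declaration itself instead of in per-instance configuration options. The sketch below is only an illustration of that idea: ToyTask, the scope values and the is_elected flag are hypothetical and say nothing about how the electing instance would actually be chosen.

```python
from dataclasses import dataclass

EVERY_INSTANCE = "every-instance"
SINGLE_INSTANCE = "single-instance"


@dataclass
class ToyTask:
    """Hypothetical task declaration carrying its own execution scope."""

    name: str
    scope: str = EVERY_INSTANCE


# Task names taken from the lists above; scopes follow the same categories.
TASKS = [
    ToyTask("clean_sessions"),
    ToyTask("update_feeds", scope=SINGLE_INSTANCE),
    ToyTask("expire_dataimports", scope=SINGLE_INSTANCE),
]


def tasks_for(tasks, is_elected):
    """Return the names of the tasks this instance should start.

    is_elected tells whether this instance is the one chosen to run
    single-instance tasks (how it gets chosen is out of scope here).
    """
    return [
        task.name
        for task in tasks
        if task.scope == EVERY_INSTANCE or is_elected
    ]
```

With such a declaration, the "Not controlled yet" cases above would be covered by the same rule as update_feeds rather than each needing its own option.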
Some deeper thinking is needed here so we can improve things. That includes
thinking about:
- the inter-instances messages bus based on zmq and introduced in 3.15,
- the Celery project (http://celeryproject.org/), an asynchronous task queue,
widely used and written in Python.
Remember that the more independent of CubicWeb the tasks are, the better. Though we still want an
'all-integrated' approach, e.g. not relying on external configuration of Unix-specific
tools such as cron. Also we should see if a hard dependency on Celery or
a similar tool could be avoided and, if not, whether it should be considered a
problem (for devops).
First, we should drop the different behaviour depending on the presence of a '.hg' in
cubicweb's directory. It currently changes the location where cubicweb external
resources (js, css, images, gettext catalogs) are searched for. In terms of
implementation:
- shared_dir returns the cubicweb.web package path instead of the path to the
shared cube,
- i18n_lib_dir returns the cubicweb/i18n directory path instead of the path to the
shared/i18n cube,
- migration_scripts_dir returns the cubicweb/misc/migration directory path
instead of share/cubicweb/migration.
Moving web-related objects as proposed in the Bootstrap section would resolve the
problem for the content of web/data and most of i18n (though some messages
will remain and additional effort will be needed here). By going further this
way, we may also clean up some schema code by moving cubicweb/schemas and
cubicweb/misc/migration to a cube (though only a small benefit is to be expected
here).
We should also have fewer environment variables... Let's see what we have today:
- CW_INSTANCES_DIR, where to look for instances configuration
- CW_INSTANCES_DATA_DIR, where to look for instances persistent data files
- CW_RUNTIME_DIR, where to look for instances run-time data files
- CW_MODE, set to 'system' or 'user' will predefine above environment variables differently
- CW_CUBES_PATH, additional directories where to look for cubes
- CW_CUBES_DIR, location of the system 'cubes' directory
- CW_INSTALL_PREFIX, installation prefix, from which we can compute path to 'etc', 'var', 'share', etc.
I would propose the following changes:
- CW_INSTANCES_DIR is turned into CW_INSTANCES_PATH, and defaults to
~/etc/cubicweb.d if it exists and /etc/cubicweb.d (on Unix platforms) otherwise;
- CW_INSTANCES_DATA_DIR and CW_RUNTIME_DIR are replaced by configuration file
options, with smart values generated at instance creation time;
- the above change should make CW_MODE useless;
- CW_CUBES_DIR is to be dropped, CW_CUBES_PATH should be enough;
- regarding CW_INSTALL_PREFIX, I'm lacking experience with non-hg-or-debian
installations and don't know if this can be avoided or not.
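The proposed default for CW_INSTANCES_PATH can be written down in a few lines. This is a sketch of the proposal, not existing CubicWeb behaviour: the function name is hypothetical, and splitting the variable on os.pathsep (so it can hold several directories, as the "PATH" suffix suggests) is an assumption on my side.

```python
import os


def default_instances_path(environ=None, home=None, exists=os.path.isdir):
    """Resolve the directories where instance configurations are looked up.

    Order: explicit CW_INSTANCES_PATH, then ~/etc/cubicweb.d if it exists,
    then /etc/cubicweb.d (Unix platforms).
    """
    environ = os.environ if environ is None else environ
    explicit = environ.get("CW_INSTANCES_PATH")
    if explicit:
        # Assumption: several directories may be given, separated by os.pathsep.
        return explicit.split(os.pathsep)
    home = os.path.expanduser("~") if home is None else home
    user_dir = os.path.join(home, "etc", "cubicweb.d")
    if exists(user_dir):
        return [user_dir]
    return ["/etc/cubicweb.d"]
```

The environ, home and exists parameters are only there to make the function easy to test without touching the real environment or filesystem.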
Last but not least (for the moment): the 'web' / 'repo' / 'all-in-one'
configurations, and the fact that the associated configuration file changes
accordingly, stink. Ideas to stop doing this:
- one configuration file per instance, with all options provided by installed
parts of the framework used by the application.
- activate 'services' (or not): web server, repository, zmq server, pyro
server. Default services to be started are stored in the configuration file.
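As an illustration of the two ideas above, a single per-instance file could carry a section listing which services to start. The layout below is hypothetical (section name, option names and INI syntax are all assumptions made for the sketch); it only uses the standard library configparser module.

```python
import configparser

# Hypothetical single configuration file for one instance.
SAMPLE = """
[services]
web = yes
repository = yes
zmq = no
pyro = no
"""

# Service names taken from the list above.
KNOWN_SERVICES = ("web", "repository", "zmq", "pyro")


def enabled_services(text):
    """Return the services to start, in a stable order.

    A missing section or option means the service is not started.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return [
        name
        for name in KNOWN_SERVICES
        if parser.getboolean("services", name, fallback=False)
    ]
```

An 'all-in-one' instance would then simply be one whose file enables both web and repository, with no separate configuration class behind it.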
There is probably more that can be done here (fewer configuration options?), but
that would already be a great step forward.
The following projects should be investigated to see if we could benefit from them:
Remember the following goal: migration of legacy code should go smoothly. In a perfect world every application should be able to run with CubicWeb 4.0 until the backwards compatibility code is removed (and CubicWeb 4.0 will probably be released as 4.0 at that time).
Please provide feedback:
- do you think choices proposed above are good/bad choices? Why?
- do you know some additional libraries that should be investigated?
- do you have other changes in mind that could/should be done in cw 4.0?