Having deployed and maintained several public medium-sized web sites running
CubicWeb when I worked at SecondWeb, I was asked
by my friends from Logilab to write a blog post
describing how we managed our deployment while working with the customer and the
hosting company.
Customers that want to run such a medium-traffic web site either tell you
which hosting company they partner with, or ask you to find one, so you have no
other choice but to deal with an external hosting structure to manage the servers.
I prefer this, by the way, because:
- High Availability (HA) hosting really requires skills and hardware that are
neither common nor cheap;
- HA hosting requires 24/7/365 availability that SecondWeb could not (and did
not even want to) offer.
It is clearly difficult for all parties (try to put yourself in the shoes of the
customer...) to manage a web site with 3 partners involved, each with their own
goals. From the development lead's point of view, you will notice that the
technical staff of the hosting company keeps changing, and you keep seeing
the same operational errors even if you provide and keep improving high-quality
documentation. The software upgrade documentation has to be particularly clear,
as it greatly influences the overall web site availability. You also have to
keep a history of the interventions on the servers yourself and maintain an
up-to-date copy of the configuration files.
The overall architecture proposed here partly benefits from this experience with
managed hosting companies, in that we tried to keep it simple.
To achieve very high availability for your web site, you must have no single
point of failure in the whole architecture, which can be far from reasonable
from a cost point of view. However, hosting companies can share costs
between their customers and have them benefit from a doubled network
infrastructure all the way from the Internet to your web servers,
themselves hosted in two distant locations. You may then choose an even number
of web servers, half of them hosted on each network infrastructure.
The important thing is that you must preserve user sessions. As of CubicWeb
3.10, DB persistent sessions have not been implemented yet (they should be soon;
a ticket is planned for this
functionality), so you must preserve session cookies by always directing a
given user to the same web server. This is usually achieved by configuring the
load balancer(s) in IP hash mode (it is faster than balancing on the session
cookie, which implies reaching the HTTP stack rather than staying at the TCP/IP level).
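To make the IP hash idea concrete, here is a minimal sketch of the principle, not of any particular load balancer's implementation. The server names in `BACKENDS` are hypothetical, and real load balancers use their own hash functions; the only point is that the same client address always maps to the same web server.

```python
import hashlib

# Hypothetical web servers, half of them on each network infrastructure.
BACKENDS = [
    "web1.example.org",
    "web2.example.org",
    "web3.example.org",
    "web4.example.org",
]


def pick_backend(client_ip, backends):
    """Return the backend that should serve client_ip.

    The choice depends only on the client address and on the backend
    list, so a given user keeps reaching the same server and the
    session stored on that server stays valid.
    """
    digest = hashlib.md5(client_ip.encode("ascii")).hexdigest()
    return backends[int(digest, 16) % len(backends)]


if __name__ == "__main__":
    for ip in ("192.0.2.10", "192.0.2.11", "198.51.100.7"):
        print(ip, "->", pick_backend(ip, BACKENDS))
```

Note that with this simple modulo scheme, adding or removing a server changes the mapping for most clients, whose sessions are then lost.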
Now if you have multi-processor web servers (which is very likely these days),
you will need to run one CubicWeb application instance per processor, or the
Python GIL will limit the CPU usage of your application to a fraction of the
available power. This is pretty easy: you just have to duplicate configuration
directories from /etc/cubicweb.d, changing instance names and ports. You can use
a simple sed-based script to generate these copies automatically and keep them in sync.
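As an alternative to sed, the duplication can be sketched in Python. This is only an illustration under several assumptions that are not part of the original setup: the per-instance settings are assumed to live in a file named `all-in-one.conf`, the HTTP port is assumed to be set by a `port=` line, and the instance name is assumed to appear only in the `log-file` and `pid-file` values. Every other file (database settings in particular) is copied untouched, so that all copies keep talking to the same database.

```python
import os
import shutil

CONF_NAME = "all-in-one.conf"         # assumed name of the instance config file
NAME_KEYS = ("log-file", "pid-file")  # assumed options embedding the instance name


def rewrite_line(line, source, target, port):
    """Adapt one configuration line for the cloned instance."""
    key = line.split("=", 1)[0].strip()
    if key == "port":
        return "port=%d\n" % port
    if key in NAME_KEYS:
        return line.replace(source, target)
    return line


def clone_instance(etc_dir, source, index, base_port=8081):
    """Copy etc_dir/source to etc_dir/source_<index> and adapt the copy.

    Clone number `index` (starting at 1) listens on base_port + index - 1.
    An existing copy is replaced, which keeps the clones in sync with the
    reference instance each time the script is run.
    """
    target = "%s_%d" % (source, index)
    dst_path = os.path.join(etc_dir, target)
    if os.path.exists(dst_path):
        shutil.rmtree(dst_path)
    shutil.copytree(os.path.join(etc_dir, source), dst_path)
    conf_path = os.path.join(dst_path, CONF_NAME)
    with open(conf_path) as stream:
        lines = stream.readlines()
    port = base_port + index - 1
    with open(conf_path, "w") as stream:
        for line in lines:
            stream.write(rewrite_line(line, source, target, port))
    return dst_path
```

For a bi-processor server, calling `clone_instance('/etc/cubicweb.d', 'mysite', 1)` and `clone_instance('/etc/cubicweb.d', 'mysite', 2)` would produce two copies listening on ports 8081 and 8082 (`mysite` being a hypothetical instance name).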
Now that we have one instance per processor, the problem of preserving sessions
is back. It can be elegantly solved using Squid,
which can of course deliver cached objects (in particular images, more on this
later), but also listen on several ports and distribute incoming requests evenly
among the CubicWeb instances based on their port of origin. Note that the load
balancer must be set up to balance between ports of the web servers, one port
for each processor. A Squid configuration file achieving this looks like:
http_port 81 defaultsite=www.example.org vhost
acl portA myport 81
http_port 82 defaultsite=www.example.org vhost
acl portB myport 82
acl site1 dstdomain www.example.org
cache_peer 127.0.0.1 parent 8081 0 no-query originserver default name=server_1
cache_peer_access server_1 allow portA site1
cache_peer_access server_1 deny all
cache_peer 127.0.0.1 parent 8082 0 no-query originserver default name=server_2
cache_peer_access server_2 allow portB site1
cache_peer_access server_2 deny all
This is a way to set up Squid to listen on ports 81 and 82 and distribute requests
for www.example.org to ports 8081 and 8082 respectively. This way, requests
should be evenly balanced between the processors of a bi-processor web server.
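With more than two processors, writing these stanzas by hand becomes error-prone. The following sketch generates the same directives as above for any number of instances. It is only an illustration: the function name and the `port_N` / `server_N` naming are my own choices, and the directives are copied from the configuration shown above without any addition.

```python
def squid_port_mapping(site, instances, first_listen_port=81,
                       first_peer_port=8081):
    """Return the Squid lines mapping one listening port to one instance."""
    lines = []
    # One listening port and one matching ACL per CubicWeb instance.
    for i in range(instances):
        listen = first_listen_port + i
        lines.append("http_port %d defaultsite=%s vhost" % (listen, site))
        lines.append("acl port_%d myport %d" % (i + 1, listen))
    lines.append("acl site1 dstdomain %s" % site)
    # One origin server per instance, reachable only from its own port.
    for i in range(instances):
        name = "server_%d" % (i + 1)
        lines.append(
            "cache_peer 127.0.0.1 parent %d 0 no-query originserver "
            "default name=%s" % (first_peer_port + i, name))
        lines.append("cache_peer_access %s allow port_%d site1" % (name, i + 1))
        lines.append("cache_peer_access %s deny all" % name)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(squid_port_mapping("www.example.org", 2))
```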
You can now set up Squid more classically to achieve what it was originally
designed for: caching. See the Squid docs for this, particularly the
refresh_pattern
directive. Note that you do not need to force any standard HTTP cache feature in
Squid, as CubicWeb enables you to fine-tune caching using the simple
HTTPCacheManager classes found in cubicweb/web/httpcache.py (at the end of this
file, you will also find the default cache manager configuration for the entity
and startup views).
This is controversial, but it has never hurt me: I like to put an Apache front end
between Squid and the Twisted-based CubicWeb application, because hosting
companies are usually pretty good at setting it up and like to use server status
for monitoring, mod_deflate for textual content compression, and mod_rewrite and
other modules to customize, monitor or fine-tune the web servers.
It can however be argued that Apache is a huge piece of software for such a
restrictive usage, and its memory footprint would be better used for caching.
This is an interesting part that simplifies the overall setup: if you want to
save data on disk, it is likely that you also want to keep it in sync between
the web servers, or use a highly secure network storage solution.
As we already have a data store accessible from the web servers, namely the
database itself, I often choose to use it even for images. This looks like the
nightmare of every sysadmin, but if you make sure the images are not fetched
every second from the database, by using fine tuned cache settings, it will not
hurt. And this way you still benefit from the flexibility of a database and the
easier maintenance of a single data store. We can use CubicWeb cache settings
to allow Squid to cache images for 1 hour, for example. If you have a very dynamic
web site, however, you will then need to force a URL change when an image is
edited. This can easily be achieved in CubicWeb using a custom edit controller
that creates a new image when the data attribute of an Image instance is
edited, as illustrated here:
from cubicweb import typed_eid
from cubicweb.selectors import yes
from cubicweb.web.views.editcontroller import EditController


class CustomEditController(EditController):
    __select__ = EditController.__select__ & yes()

    def handle_updated_image(self, old_eid):
        """Modify the submitted form to change old_eid into a new entity eid
        in all keys/values."""
        old_eid = unicode(old_eid)
        form = self._cw.form
        new_eid = self._cw.varmaker.next()
        # handle image eid
        del form['__type:%s' % old_eid]
        form['__type:%s' % new_eid] = u'Image'
        # handle eid list
        index = form['eid'].index(old_eid)
        form['eid'] = form['eid'][:index] + [new_eid] + form['eid'][index+1:]
        # handle attributes and relations; iterate over a copy (items()
        # returns a list in Python 2) because the form is modified in the loop
        for (k, v) in form.items():
            if v == old_eid:
                form[k] = new_eid
            if k.endswith(u':%s' % old_eid):
                form[k[:-len(old_eid)] + new_eid] = v
                del form[k]

    def _default_publish(self):
        # implement image creation when image data was updated, so that we
        # can use a far expiry date cache on the download view
        images = []
        # iterate over a copy: handle_updated_image() modifies the form
        for (k, v) in self._cw.form.items():
            if v != 'Image' or not k.startswith('__type') or k == self._cw.form['__maineid']:
                continue
            try:
                eid = typed_eid(k[7:])
            except ValueError:
                continue
            if self._cw.form.get('data-subject:%s' % eid, None):
                self.handle_updated_image(eid)
                images.append(eid)
        super(CustomEditController, self)._default_publish()
        # remove the old images now that they have been replaced
        for eid in images:
            self._cw.execute('DELETE Image I WHERE I eid %(eid)s', {'eid': eid})

To add the 1 hour expiry date to the image download view, you can use:

from cubicweb.selectors import yes
from cubicweb.web import httpcache
from cubicweb.web.views.idownloadable import DownloadView


class CustomDownloadView(DownloadView):
    __select__ = DownloadView.__select__ & yes()
    http_cache_manager = httpcache.MaxAgeHTTPCacheManager
    cache_max_age = 3600  # seconds, i.e. 1 hour

Hosting companies now often have a pretty good knowledge of PostgreSQL, the
favorite DB back end for CubicWeb. They usually propose to replicate the database
for data safety at a low cost, using PostgreSQL's log shipping feature. Note that
the new PostgreSQL 9 versions should make it easier to set up replication modes
that could be useful to improve performance and scalability, but there is still a
lack of production-level experience for the moment. Please share yours if you
have some, because the database is the main issue to deal with to scale up further.
It is worth mentioning that you need a pre-production server hosted by the same
company on the same hardware (or virtual machine), because:
- software upgrades will run smoother if the technical staff of the hosting company
has already performed the same upgrade operation once: check that the same person
does both within a short timeframe if possible;
- you will feel better if your migration scripts have successfully run on a
fresh copy of the production data: ask for a db copy before a pre-production
upgrade; this is much easier to do if you do not have to copy the database
dumps remotely;
- the pre-production server can host its own database server and the replica
of the production one.
When you experience web site downtime, it is much too late to take a first look at
the available monitoring. It is important to prepare the tools you need to
diagnose a problem, get used to reading the graphs and have the orders of
magnitude of the values and their variations in mind.
Even the simplest graphs, like CPU usage, need to be correctly interpreted. In
a recent setup, I did not realize that only one CPU was used on a bi-processor
server, delivering half the power it should... When you cannot access the machine
and use top, you only see the information of the monitoring graphs, so you must
know how to read them!
Apart from the classical CPU, CPU load, (detailed) memory usage, and network
traffic, ask for PostgreSQL, Squid, and Apache specific graphs (plug-ins for them
are easy to find and install for classic monitoring solutions).
For CubicWeb web sites, it is also worth setting up the following views and using
them for automatic alerts:
- a software / db version consistency monitoring
- a db pool size monitoring
- a simple db connection check view
- a view writing the server host name, which is not useful for automatic alerts but
shows which server your IP is directed to: this is needed when you cannot
reproduce the behaviour the customer is complaining about...
Here are some classes I use for these tasks. Feel free to reuse and adapt them
to your needs:
from socket import gethostname

from cubicweb.selectors import yes
from cubicweb.view import View


class _MonitoringView(View):
    __abstract__ = True
    __select__ = yes()
    content_type = 'text/plain'
    templatable = False


class PoolMonitoringView(_MonitoringView):
    __regid__ = 'monitor_pool'

    def call(self):
        # write the percentage of connections pools currently in use
        repo = self._cw.cnx._repo
        max_pool = self._cw.vreg.config['connections-pool-size']
        percent = ((max_pool - repo._available_pools.qsize()) * 100.0) / max_pool
        self.w(u'%s%%' % percent)


class DBMonitoringView(_MonitoringView):
    __regid__ = 'monitor_db'

    def call(self):
        try:
            count = self._cw.execute('Any COUNT(X) WHERE X is CWUser')[0][0]
            self.w(u'ServiceOK : %s users in DB' % count)
        except Exception:
            self.w(u'ServiceKO')


class VersionMonitoringView(_MonitoringView):
    __regid__ = 'monitor_version'

    def versions_text(self, versions):
        return u' | '.join(cube + u': ' + u'.'.join(unicode(x) for x in version)
                           for (cube, version) in versions)

    def call(self):
        # compare versions recorded in the database with installed ones
        config = self._cw.vreg.config
        vc_config = config.vc_config()
        db_config = [('cubicweb', vc_config.get('cubicweb', '?'))]
        fs_config = [('cubicweb', config.cubicweb_version())]
        for cube in sorted(config.cubes()):
            db_config.append((cube, vc_config.get(cube, '?')))
            try:
                fs_version = config.cube_version(cube)
            except Exception:
                fs_version = '?'
            fs_config.append((cube, fs_version))
        db_config = self.versions_text(db_config)
        fs_config = self.versions_text(fs_config)
        if db_config == fs_config:
            self.w(u'ServiceOK : FS config %s == DB config %s' % (fs_config, db_config))
        else:
            self.w(u'ServiceKO : FS config %s != DB config %s' % (fs_config, db_config))


class HostnameMonitoringView(_MonitoringView):
    __regid__ = 'monitor_hostname'

    def call(self):
        self.w(unicode(gethostname()))
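
On the monitoring side, these views can be polled by a small script that turns their text output into an alert status. The sketch below is written for a current Python 3 interpreter, independently of the CubicWeb code above, and rests on assumptions of mine: the views are assumed to be reachable at `<base_url>/view?vid=<regid>`, and the 80% threshold on pool usage is an arbitrary example value.

```python
import urllib.request


def interpret(view, body, pool_warning=80.0):
    """Return True when the text written by a monitoring view looks healthy."""
    body = body.strip()
    if view == "monitor_pool":
        # The pool view writes a percentage such as "25.0%".
        return float(body.rstrip("%")) < pool_warning
    if view in ("monitor_db", "monitor_version"):
        return body.startswith("ServiceOK")
    raise ValueError("unknown monitoring view: %s" % view)


def check(base_url, view, timeout=10):
    """Fetch one monitoring view and report whether the service is fine.

    Any network error or unexpected answer counts as a failure.
    """
    # Assumed URL layout: views are selected with the `vid` parameter.
    url = "%s/view?vid=%s" % (base_url.rstrip("/"), view)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8")
        return interpret(view, body)
    except (OSError, ValueError):
        return False


if __name__ == "__main__":
    for name in ("monitor_pool", "monitor_db", "monitor_version"):
        ok = check("http://www.example.org", name)
        print(name, "OK" if ok else "KO")
```

Since each web server runs several instances, such a check should be run against every instance port rather than only through the load balancer.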