Approaching the top of the stack, this post looks at the software components we currently use for performant and stable PHP hosting, and why we chose them.
We've already explained we're closely tied to Linux for many reasons, and Debian specifically. But what do we put on top of that? For the purposes of this post, I'll focus on PHP hosting. We started out as a specialist Drupal company and although we do all sorts of other specialist Linux hosting these days, hosting Drupal, a PHP-based CMS, is still a large chunk of our work.
Drupal is designed to run on a classic "LAMP" stack, that is to say Linux, Apache (the popular web server), MySQL (a popular native Linux open source DBMS, now owned by Oracle) and PHP, the light-weight, C-like programming language for web applications. So the easiest way to talk about this is to go through this stack, starting with the A:
Or rather, in our case, usually Nginx.
Apache is the default choice for Drupal, because it supports something called .htaccess files, which are very handy. You can control the behaviour of the web server from within your application: set redirect rules, protect directories, set memory limits, all kinds of cool stuff (hosting company permitting). Handy it is, but the flip sides are ugly: you lose administrative control over large aspects of server behaviour, and you take a big old performance hit. Every time Apache is asked for a resource, it has to look for and read the .htaccess files to see how it should process the request, and this is very slow. The developers at Nginx take the position that the whole .htaccess concept is fundamentally a very bad idea, and they've explained why in some detail.
We (mostly) use Nginx, because it's so much more efficient than Apache, for a number of reasons (it has a fundamentally different architecture). This means when we're presented with an application (like Drupal) that relies on .htaccess, we have a bit of work to do. We have to replicate the .htaccess rules in a configuration file for Nginx, which we can include for that application's vhost (the configuration that tells the server where to listen for requests and what to do with them). This is a small overhead on our part: a one-off job up front, plus a bit of longer term maintenance. We don't think it's a big deal. We prefer the performance and security improvements over the minor inconvenience of having to write some extra web server config.
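To give a flavour of what that translation looks like, here is a minimal sketch of an Nginx include file covering a few of the things Drupal's .htaccess does. It is an illustration rather than our production config: the file path is hypothetical, the list of protected extensions is a shortened subset, and the `try_files` line assumes Drupal 7 or later.

```nginx
# Hypothetical include file, e.g. /etc/nginx/snippets/drupal.conf (path is an assumption).
# Pulled into the site's server block with an "include" directive.

# Refuse requests for hidden files and directories such as .htaccess or .git.
# Note this also blocks /.well-known/, which some sites need to allow.
location ~ /\. {
    deny all;
}

# Block direct access to source and metadata files.
# This is a shortened, illustrative subset of what Drupal's .htaccess protects.
location ~* \.(engine|inc|install|make|module|profile|po|sh|sql|theme)$ {
    deny all;
}

# Clean URLs: serve the file if it exists on disk,
# otherwise hand the request to Drupal's front controller.
location / {
    try_files $uri /index.php?$query_string;
}
```

Because these rules are read once when Nginx loads its configuration, rather than on every request, there is no per-request lookup cost.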
(That said, we do sometimes use Apache, and we have Apache ready to go in our configuration management system. Sometimes customers just prefer Apache - better the devil you know! Sometimes they want to do something for which there is mature support in Apache, but not in Nginx, and we need to be pragmatic about that. Single sign-on support for Microsoft Active Directory with Kerberos is a case in point.)
Here, again, we make a small departure from convention. MySQL is a free open source product, so there are several companies out there who have forked the main MySQL project and created versions of their own, also free open source and drop-in replacements for standard MySQL. The main two are MariaDB and Percona. We use Percona.
Features-wise there's little to call between MariaDB and Percona Server. Both teams seem to mostly avoid benchmarking against each other. The benchmarks Percona officially release against MySQL show Percona is significantly quicker and more consistent under heavy load, and MariaDB sometimes publish comparisons on their blog (note, they use some Percona benchmarking tools - interesting!). So it's almost a no-brainer to use a quicker drop-in replacement. As for why Percona: not only did they do a better sales job at the time but, if I recall, another decision point was software management. Percona already maintained a Debian repository when we were shopping around. I stand to be corrected if this is not the case, but I'm fairly sure that at the time MariaDB only offered .deb files to download.
(Like with Apache, we do keep standard MySQL in our configuration manager, because sometimes customers just want it. I can't honestly think of a good reason why someone would choose MySQL over Percona, at this moment in time, but Percona and MariaDB do lag behind MySQL sometimes, and they haven't always been the more performant option. You need to keep your ear to the ground and cover all bases.)
At the base, PHP is PHP. You install PHP 5.6 on your server and it's the same code no matter how you execute it, from the command line, from a web server, from some other application, it matters not. In most cases we're talking about executing PHP via a web server, we're serving Drupal websites for the most part after all, so our servers are optimised for that. So what changes from configuration to configuration is the way in which web servers call PHP, because by default the web server and PHP are not connected. You need to install something to tell the web server to point PHP files at the PHP application for execution.
The most common way to achieve this in a LAMP stack is to use the Apache web server module, mod_php. I suspect the main reason it's common is that it's really easy. In most Linux package managers you install the module and you're done! PHP *just works*. However, because we mostly use Nginx (as already described) we can't use mod_php, because that is Apache only. So we find ourselves casting around for how we can hook PHP up to Nginx, our web server of choice.
And in fact, there's only one real option. FastCGI is a generic interface for allowing scripting engines (like PHP, but not only PHP) to be called from a web server. It so happens some nice people have made an application called PHP-FPM (FastCGI Process Manager) which takes away most of the pain of configuring FastCGI and PHP, so we just use that. (There are some other CGI flavours, but because we have no issue with FastCGI, we see no particular reason to start testing less popular applications on the off chance they're a few nanoseconds quicker.)
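For illustration, the Nginx side of that hook-up can be as small as the sketch below. The socket path is an assumption (it has to match whatever the PHP-FPM pool is configured to listen on), and a real Drupal vhost would need more around it than this.

```nginx
# Hand any request for a .php file to PHP-FPM over FastCGI.
location ~ \.php$ {
    # Return 404 rather than passing a non-existent script to PHP.
    try_files $uri =404;

    # Standard FastCGI parameters file shipped with Nginx.
    include fastcgi_params;

    # Tell PHP which script on disk to execute.
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;

    # Assumed socket path; must match the "listen" setting of the PHP-FPM pool.
    fastcgi_pass unix:/var/run/php5-fpm.sock;
}
```

Nginx only forwards the request here. PHP-FPM itself runs as a separate service with its own configuration and its own process pool.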
Sure, this makes for slightly more complex configuration. PHP-FPM needs to be configured and optimised for your web application, both for performance and security. And there are some "gotchas" that those used to Apache and mod_php may need to get used to. For example, restarting the web server with this setup has no effect on PHP, so it will not clear the opcode cache. Little quirks like that. But actually, we find the separation has more pros than cons and offers a great deal more potential flexibility.
In fact, while we still run mod_php in a few places, we are moving towards using PHP-FPM for Apache clients too. I mentioned above we still support Apache, and because FastCGI is webserver agnostic, Apache is perfectly capable of using FastCGI to execute PHP. We have a lot of experience with PHP-FPM these days and mod_php is demonstrably less efficient than PHP-FPM (long blog post here with worked up benchmark tests by someone with more time to do this than I have) so it feels like a good idea to standardise on PHP-FPM. (Indeed, it's a frustration of some that people keep using PHP-FPM as a reason to not use Apache, which is a fallacy. Apache is slower than Nginx in many circumstances, but it has nothing to do with PHP execution.)
We have many other components of course. For example, software load balancing is done with HAProxy, we use Tomcat to serve Apache Solr, Jenkins and other Java applications, we have Ruby running on servers to support Redmine and GitLab, etc. etc. But I'd be writing all day if I covered everything, hence the LAMP focus here.
So that's that. The end of my series on our hosting stack. We've gone right through from infrastructure and virtualisation up to the application layer on top of Linux. I hope you've found this interesting and maybe even useful. Our choices might not be your choices, these selections are born out of time, experience and very specific requirements, but I hope if nothing else the explanation of the thought process is useful to someone who might be embarking on a similar venture.
Thanks for reading!