It is now a common practice to use composer as part of the deployment stack. Is this always such a good idea?
The recipe goes like this : gitignore your "vendor" directory (or whatever folder your dependencies end up in), but commit your composer.lock file, then deploy. Your CI job will then « composer install » all the dependencies where there belong to, magically reproducing your initial files layout exactly how they were.
There are generally a few additional steps involved in between though. Typically, you lost half a day figuring out the right file permissions so that the var/cache of your app can be cleared and recreated properly by the webserver user, wondered for days why some of the builds were randomly failing before realizing that no token was set in this given job, meaning github API rate limit was sometime hit. Then another good day or two finding out how to apply two patches for the same projects when they slightly conflict. And your sysadmin might be slightly suspicious about those files being downloaded and executed directly on production outside of any VCS, and anxiously watching for exploit reports.
Now, you will assure me, you’ve nailed all that, and, apart from an occasional network glitch preventing packages to be fetched, all is running smoothly. Great.
But please, re-read above. Why are you doing all this? To "reproduce your initial files layout exactly how they were". Then why don’t you just commit the files and push them, then?
That is normally the point in the discussion when you’re supposed to use the words « reproducible » and « best practices ».
Right, but what is more reproducible than moving prebuilt files around? You played the recipe once already on your dev environment, taking the risk of re-running it on production feels a bit like recompiling binaries from source simply because you can.
Composer is not magic. What it does is grab a bunch of PHP files, ensuring they are at the right version and that they end up in the right place, so they can play nicely together. Once you have the resulting file set already, why would you want to redo this over and over on each and every environment?
Let’s have a closer looks at what is stated in the Composer documentation and the reasons why the project recommends not committing your dependencies:
- large VCS repository size and diffs when you update code;
- duplication of the history of all your dependencies in your own VCS; and
- adding dependencies installed via git to a git repo will show them as submodules.
I’ll just ignore the repository size (because, frankly ?) and focus on the diff and history parts.
For one, the argument here is slightly misleading: reading the statement, you might be under the impression your repository will contain the whole git history of each and every dependency in your project. Nope, it will not. What you will end up with, along the life of your project, is the history of updates to your dependencies after your initial commit.
Which is rather the most important point here. This is a good thing! Why would not you want to be able to look at - and keep track of in your VCS - what was in update from Guzzle 3.8.0 to 3.8.1 or what difference there is between ctools 8.x-3.0-alpha27 and alpha26? Your « live » project is not only your custom code.
What would you find most useful, next time your client opens a ticket because the image embedding in the WYSIWYG editor has stopped working since the last release, when looking back at the commit « Upgrade contrib module media_entity from 8.x-1.5 to 8.x-1.6 » ? Seeing a one line hash change in composer.lock, or seeing a nice diff of the actual changes in code, so you can track down what went wrong?
The .git submodules point is fair, but easy to workaround, as explained on this very same best practices page. Also keep in mind it only applies if you use dev versions or obscure non-packaged dependencies.
So, to sum it up, if you use Composer to build your code in production, you get:
- Un-needed and time consuming deployment complexity increase, with small but real risks of failure on each and every build for external cause
- No auditing of changes that are not your own custom code
+ Easier handling of .git « false » submodules for a few dev dependencies
On the other hand, if you commit the "vendor" directory, you get:
+ Easier and straightforward deployment
+ All code that lands on production gets audited/versioned
- Small amount of work involved in dealing with possible .git « false » submodules
Then why ?
But then, why is that such a widespread practice? I can only guess here, but I suspect there are several factors at play:
- Fashion, to some extent, must play a role. There are very good reasons to do this for certain workflows, which may lead people to think that it can apply to any deployment workflow.
- The fact that it is presented as « best practice » on the Composer project page. Many people apply it without questioning whether it is applicable to their use case.
My interpretation is that, more fundamentally, the root cause is confusion between "deploying" code and "distributing" code.
Moving a « living thing » for one environment over to another environment is not the same process as making a component or app available for other projects to reuse and build upon. Composer is a fantastic building tool, it is great for the latter case, and using it to assemble your project totally makes sense. Using it as a deployment tool, less so.
If we take another look at the arguments above from a distribution perspective, the analysis is totally different:
- Large VCS repository size and diffs when you update code.
- Duplication of the history of all your dependencies in your own VCS.
Indeed, in this use case, it all makes total sense: you definitively do not need the whole git history of any component you are re-using for your project. Nor do you want your repo for the nice web-crawler library you contribute on GitHub to contain the Guzzle codebase you depend upon.
In short, think about the usage. If you maintain, say, a Drupal custom distro that you use internally as a starting point for your projects, by all means yes, ignore the vendor directory. Build it with Composer when you use it to start a new project. And continue to use Composer to manage dependencies updates in your dev environment. However, once this is no longer a re-usable component, but instead a living project that will need to be deployed from environment to environment, do yourself a favour and consider carefully whether using Composer to deploy really brings any benefit.