The cloud market is showing similar characteristics to the packaged database market and poses a similar risk: customers become locked in to a vendor's price rises and technology constraints.
While most of the advice below is generic common sense, it’s worth going over again in the context of the cloud. Sometimes we suspend our disbelief, especially when we really want things to be different. Cloud is a great enabler, bringing comprehensive facilities to all and allowing development of ideas or products at high speed. However, you must be sceptical and use proper abstraction processes to remove dependencies that might come back to bite you later on.
How it Begins
You typically get one shot at laying down your software infrastructure for a given product, and the fundamental choices you make at the beginning tend to stick with you. (Unless you are smart enough to factor a redesign cycle into your project.) Even if computing technologies are similar on the surface, it can be costly to re-engineer a substitute product into the system.
With proprietary systems, certain key vendors can slowly jack up the price of their offering as they re-evaluate their share of the overall system value. The calculation is simple and effective: is it cheaper to pay the increased cost of a core package (say, the database) than to re-engineer around a whole new product? Many software vendors with older, established code make a speciality of this and so extend the useful revenue life of their product.
When it comes to data storage and handling, cloud vendors also offer many examples of proprietary services and APIs. The price of a particular product may look good, but if no other organisation can provide the same system, then you are faced with a re-engineering challenge if you ever need to move away from the technology.
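The vendor's calculation can be sketched from the customer's side as a simple break-even sum. The figures and the function name below are invented for illustration only; they assume a one-off migration cost and flat monthly charges, which real contracts rarely are.

```python
from typing import Optional


def months_to_break_even(migration_cost: float,
                         current_monthly: float,
                         alternative_monthly: float) -> Optional[float]:
    """Months until a one-off migration pays for itself.

    Returns None when the alternative is not cheaper per month,
    because in that case the migration never pays back.
    """
    monthly_saving = current_monthly - alternative_monthly
    if monthly_saving <= 0:
        return None
    return migration_cost / monthly_saving


# Hypothetical figures: a 120,000 re-engineering project that would
# cut the monthly bill from 10,000 to 6,000.
print(months_to_break_even(120_000, 10_000, 6_000))  # 30.0
```

A vendor that understands this sum can raise prices until the payback period is only just long enough to discourage a move.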
Most cloud providers offer an infrastructure product: bare-bones compute plus operating system. This is a good start towards a universal platform. It is improved by virtualisation and, most recently, by containers, which add features such as clustering for high availability, scale-out and easy deployment.
So, one way to avoid data lock-in is to bring your own middleware standards to rented infrastructure. This solves the problem but prevents users from getting many benefits of cloud use, such as managed data storage. For many, however, this is a good compromise and is one model of working with clouds.
Another alternative is for the vendor to run the data platform, especially if it is open source. MySQL is a popular example: most vendors have an implementation that they will run for you, leaving the developer to connect to the service by API. This too is a good way to proceed, as you are likely to find the same service replicated with competitors, although sometimes at a surprisingly high price.
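A managed MySQL service stays substitutable only if its address is kept out of the code. As a minimal sketch, assume the endpoint arrives in an environment variable; the name DATABASE_URL, the host and the credentials below are all made up for the example, and no particular database driver is implied.

```python
import os
from urllib.parse import urlparse

MYSQL_DEFAULT_PORT = 3306  # standard MySQL port, used when the URL omits one


def database_settings(url: str) -> dict:
    """Split a URL such as mysql://user:pass@host:port/dbname into settings."""
    parts = urlparse(url)
    return {
        "engine": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or MYSQL_DEFAULT_PORT,
        "user": parts.username,
        "password": parts.password,
        "database": parts.path.lstrip("/"),
    }


# Hypothetical fallback value; in practice each environment or vendor
# would supply its own DATABASE_URL.
settings = database_settings(
    os.environ.get("DATABASE_URL",
                   "mysql://app:secret@db.example.internal:3306/orders")
)
print(settings["host"], settings["port"], settings["database"])
```

Moving to a competitor's hosted MySQL then becomes a configuration change plus a data copy, rather than a code change.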
If you are in control of your development stack, a further method is to create your own image that includes an operating system and use the supplier for bare metal alone. The first advantage is speed: all of your construction and stacking took place on the build servers, not on production infrastructure. The second advantage is image immutability, so it can’t be changed once deployed, which is a reassurance when confirming security hygiene.
Finally, we should return to container technology in the form of Docker, which is evolving into a distinct platform of its own. This abstraction allows all or part of your application stack to be deployed in a dynamic framework, independent of the choices of your provider. It includes libraries and non-kernel parts of the operating system. This emerging technology, more than any other, shows the way for future investment preservation.
Data at Rest
The most significant potential lock-in comes from more innovative, big-data style storage. It usually comes with very competitive costs, allowing large volume at a low price, and is the answer to many situations. It also comes ready-to-run as a platform, without the developer having to do system engineering work; for many projects, it is ideal. However, if you embed the vendor's API directly in your code, you will find it hard to engineer out later. The defence is to take specific steps to abstract the interface, so that the service becomes easy to substitute with an alternative.
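Abstracting the interface can be as modest as the sketch below. The names BlobStore, InMemoryStore, LocalDirStore and archive_report are all hypothetical; a real project would add one more subclass that wraps the chosen vendor's SDK, which is deliberately left out here.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class BlobStore(ABC):
    """The only storage interface that application code is allowed to see."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class InMemoryStore(BlobStore):
    """Stand-in backend, useful for tests."""

    def __init__(self) -> None:
        self._items: dict = {}

    def put(self, key: str, data: bytes) -> None:
        self._items[key] = data

    def get(self, key: str) -> bytes:
        return self._items[key]


class LocalDirStore(BlobStore):
    """Backend that keeps each key as a file under a root directory."""

    def __init__(self, root: str) -> None:
        self._root = Path(root)

    def put(self, key: str, data: bytes) -> None:
        target = self._root / key
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self._root / key).read_bytes()


def archive_report(store: BlobStore, name: str, body: str) -> None:
    """Application code: it never names a vendor, only the interface."""
    store.put(f"reports/{name}", body.encode("utf-8"))


# The same application call works against either backend.
mem = InMemoryStore()
archive_report(mem, "q1.txt", "sales up")
with tempfile.TemporaryDirectory() as folder:
    disk = LocalDirStore(folder)
    archive_report(disk, "q1.txt", "sales up")
    assert disk.get("reports/q1.txt") == mem.get("reports/q1.txt")
```

The price of this approach is that the interface can only expose what every backend can do, so vendor-specific features have to be left unused or added as optional extensions.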
The cloud vendors do not set out to be expensive: it is a consequence of both their cost base and the innovation. The cost of MySQL might be greater than expected because of high availability features needing more equipment at lower utilisation. Alternatively, proprietary NoSQL storage methods might be cheaper than expected because of the scale of their implementation and reaping benefits from the vendor’s investment.
However, one area where most vendors can be accused of a cost surprise is outbound data transfer. All vendors keep inbound rates low, so there is little friction moving data from your premises onto theirs. However, try moving data in the opposite direction or to a competitor, and you will find a much higher cost, especially when compared with the monthly storage rental. If you have sufficient scale, many vendors offer an ‘enterprise’ connection product that provides an intermediate level of network cost.
For some technical or academic computing, there is already an effective lock-in. If a large data corpus is hosted for free or at a low cost in one vendor’s facility, a project that wishes to leverage it will want to co-locate to keep its costs down and reduce data logistics. One can see this as a benign lock-in that benefits everyone over time, but it is still a restriction that may have an impact later.
Technical deployment is another area where lock-in should be considered and minimised. Avoid vendor-specific deployment APIs and go with vendor-neutral tools, like Ansible, Puppet or Terraform, where possible. It may mean that some features are unavailable or that extensions are kept to a minimum. However, it means that the costs of migrating to a competitor will be lower. Additionally, if a scripted approach is taken, the time to set up a system will be much shorter than if the infrastructure is created by clicking through a website shopping list. This is important when planning a migration to another vendor, and tooling is starting to become available to help speed this up.
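The size of the surprise is easiest to see as a ratio between the cost of leaving and the monthly rent. The two rates below are placeholders invented for the example, not quotes from any vendor's price list; substitute the figures from your own contract.

```python
# Assumed, illustrative rates in an arbitrary currency.
STORAGE_PER_GB_MONTH = 0.02  # hypothetical storage rental per GB per month
EGRESS_PER_GB = 0.09         # hypothetical outbound transfer charge per GB


def monthly_storage_cost(stored_gb: float,
                         rate: float = STORAGE_PER_GB_MONTH) -> float:
    """Monthly rental for the data at rest."""
    return stored_gb * rate


def exit_cost(stored_gb: float, rate: float = EGRESS_PER_GB) -> float:
    """One-off charge for moving the whole data set out."""
    return stored_gb * rate


def exit_cost_in_months_of_rent(stored_gb: float) -> float:
    """How many months of storage rental the exit charge is equivalent to."""
    return exit_cost(stored_gb) / monthly_storage_cost(stored_gb)


stored_gb = 50_000  # hypothetical 50 TB data set
print(f"monthly rent: {monthly_storage_cost(stored_gb):,.0f}")
print(f"exit charge:  {exit_cost(stored_gb):,.0f}")
print(f"equivalent months of rent: {exit_cost_in_months_of_rent(stored_gb):.1f}")
```

Under these assumed rates, leaving costs several months of rent before any re-engineering work is counted, which is worth budgeting for at the start.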
The cloud industry is very capital intensive and has more in common, economically, with power generation or chemical manufacturing. The life cycle runs a little like this: borrow money, buy kit, five-year run, keep utilisation up, dispose and start again. Discounts on basic equipment are unlikely if the plant was set up competitively, unless a new source of cheaper finance is found.
However, discounts can be found in software or in techniques that improve the utilisation of equipment. To get at these as a customer, you would need to move from one vendor to another or adapt your product to use the lower-cost techniques. This is another change from standard enterprise practice, where regular sessions of hard negotiation can sometimes yield discounts. In the future, developers will have to move vendor and platform to get the best price.
Getting locked in is nothing new and future relationships with suppliers are likely to be less personal than they historically have been. Getting the best price is going to require non-functional work from developers, which itself is going to need funding from sponsors. The best advice to give is to start thinking about portability from the start and build it into your construction processes. Assume that you will have to move vendors or have two or more at the same time covering different aspects of the data.
Finally, spare a thought for future government control. If you are dealing with personal data or data of value, governments are going to want to control where the data resides and also where you process it. This will have a substantial impact on the commercial and practical choices that you will make. It may even drive a small return to company data centres in the form of local cloud facilities, where key data can be stored and bridging technologies like messaging can be hosted.
In some ways, the future holds much easier access to the things you need to make your project successful. But to prevent surprises later on, you will need to put in work upfront to make sure your software works on multiple clouds and that the best future deals with the latest technology are available to you.
[Note: Updated with more deployment techniques]
About the Author
Architect, Automator, PM. Follow me on Twitter @nigelstuckey
Agile Infrastructure for Enterprise DevOps.
Design from diagrams, document and deploy to your cloud.
SystemGarden.com, Twitter @systemgarden