Well-embedded patterns and workflows create a complex environment for Data & AI transformations within enterprises. These patterns, which their large and diverse workforces are familiar with and have proven to work, were created to accelerate new projects by building on past success and to reduce risk by learning from past failure. There were likely good reasons for things to be that way.
However, failing to evolve in an environment of rapid technological advancement risks poor business decisions, so large enterprises may choose to turn to tech consultancies for help.
Building PoCs is, frankly, relatively easy. They are built under highly controlled conditions, often isolated from production systems, and are usually not serving a large number of users who can provide immediate feedback by testing the application in unexpected ways.
Moving them to production in an enterprise environment, however, is a completely different beast. That needs a thorough understanding of software systems, software delivery and cloud technologies: enter Platform Engineering. But what is it?
Data is stored somewhere, and applications run somewhere. Traditionally, this would happen on hardware (servers) owned by the companies themselves, but nowadays that somewhere is often a platform in the cloud. Platforms host, run and enable new data and software products.
Platform engineering involves building and maintaining the things that deploy and run a product's source code, so development teams can focus on developing. It enables everyone's work (Data Scientists, Machine Learning Engineers, Data Engineers, etc.) to be delivered in a production environment to end-consumers.
Unfortunately, platform engineering is often overlooked because its value is not immediately visible and much of it consists of non-functional system components. Yet any time you interact with an application or website along with thousands of other users, you are indirectly benefiting from it. Applications run on servers, process data and need to store it. Those resources need to be connected in secure ways, limiting risk to the owners and their users, but they also need to be developed quickly and run at a reasonable cost.
More often than not, platforms will already exist and there will be existing layers that can be reused (e.g. networking to company systems) to decrease time-to-market in good enough ways. Therefore, a person who can quickly grasp the existing pattern and how to navigate it most efficiently has a better chance of success. A platform engineering consultant combines both the technical and the soft skills required to ship software products to production.
At Mesh-AI, our team has a wealth of experience that informs our approach to successfully delivering outcomes within Data & AI. A prerequisite for this is to have a clear vision of why we are here and what we are doing. A team’s first order of business is to read the Statement of Work to understand the mission and make sure the scope of the work is reasonable for the timelines specified. We follow this with a day or two of discovery workshops, although these may be held before the Statement of Work is written, to ensure it is clear and contains good estimates of the solution, team size, etc.
The outcome of our work is removing pain from the customer’s team by providing software designed and built to fulfil requirements that benefit humans, e.g. integrating data sources so people can answer key business questions such as "What is our current exposure and how do we reserve capital accordingly?" (in insurance). It is often tempting to build a feature-rich and sophisticated solution, but start simple; get something working and validate it with the end user whilst you’re building. Produce an MVP architecture for the solution within the first two weeks, get it reviewed and start building.
During the build phase of a project we ask the customer what their path to production is: both the business processes required to deploy to Production and the physical process for doing so (e.g. existing automated deployment tooling).
Understanding the steps and gates required to deploy to an environment is key to success, as it is often not the technical challenges but the organisational processes that cause delays. We seek out tech and key process owners to gain insight into their concerns. Most enterprises include the following on their path to production:
Start work towards passing gates to production early and find out how to expedite those processes, as some of them can sometimes take weeks to months in large enterprises. I have seen Security & Threat Models take 4 months to be completed and approved.
Show you have listened, understood their concerns and exercised diligence to meet their expectations. Giving respect and showing care towards a specialist’s domain can stand you in good stead. People will be more likely to share their time and understanding with you to help navigate processes in an unfamiliar environment. It can turn a checklist process into a genuine conversation and help to address any gaps in approval submissions.
However, no matter how diligent you are in your approach to discovery and delivery (for both technical and business processes), there might be unforeseen events that cause delays to projects.
In such instances, the first thing to do is to acknowledge the problem as soon as it is identified (e.g. missing necessary developer permissions) and raise it with the rest of the team to find domain experts or owners.
In the meantime, finding out who runs the system is the best course of action. This can sometimes be done through the network of relationships you have built, by searching internal documentation (e.g. Confluence) or by using the code repository history (e.g. `git blame`). When interacting with system owners, speak reasonably and assume you could be wrong, as you do not know the history of why a system is set up the way it is. This is where excellent engineers can provide alternative solutions or workarounds. It is fine to identify a problem, but it is even better to also propose a solution.
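As a rough illustration of the repository-history route, the sketch below counts which authors last touched the most lines of a file, using the output of `git blame --line-porcelain`. The function names `count_blame_authors` and `likely_owners` are hypothetical helpers introduced here, and the result is only a hint: the person who last edited a line is not necessarily the current owner of the system.

```python
import subprocess
from collections import Counter


def count_blame_authors(porcelain_output: str) -> Counter:
    """Count lines per author from `git blame --line-porcelain` text.

    Hypothetical helper: a heuristic for who to ask, not proof of ownership.
    """
    authors = Counter()
    for line in porcelain_output.splitlines():
        # Each blamed line has an "author <name>" header; file content
        # lines start with a tab, so they never match this prefix.
        if line.startswith("author "):
            authors[line[len("author "):]] += 1
    return authors


def likely_owners(path: str, top: int = 3) -> list[tuple[str, int]]:
    """Run git blame on `path` (must be inside a Git working tree) and
    return the `top` authors by number of lines last touched."""
    result = subprocess.run(
        ["git", "blame", "--line-porcelain", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return count_blame_authors(result.stdout).most_common(top)
```

Calling `likely_owners("path/to/file")` from inside a repository returns a short list of names to start a conversation with; treat it as one input alongside documentation and the relationships you have built.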
If it becomes apparent that blockers will cause substantial delays, it is time to engage sponsors or senior stakeholders. You need to set the expectation of how delivery timelines may be delayed and ask if they can assist by connecting you with the relevant domain or process owners, or apply pressure when needed to accelerate the resolution of dependencies.
After resolving the issue, report back to the team and stakeholders, document and share the solution, so others can benefit from your findings.
As mentioned earlier, our deliverables are mainly composed of software. As engineers we pride ourselves on writing and delivering good software. In this regard, some non-negotiable conditions are that it must be:
As tech consultants, we don't sell a tangible product; we offer specialised expertise accumulated through repeatedly solving similar problems within different environments. If we can't do this in a timely manner and at a cost that can be justified by the expected ROI, we have failed at our job.
In a data-enriched world full of AI opportunities, this ROI can only be realised if data and software are in the hands of real people, to make decisions and serve a business purpose.
Information is power and with it comes great responsibility. This is why businesses in highly regulated industries have thorough internal mechanisms that exist to minimise the risk involved in managing valuable data in Production.
A good platform engineering consultant shares their client's concerns, understands the framework within which solutions need to be delivered and enables the customer to achieve its business goals. If you're not in production after six months, something has gone wrong.