The Journey to Create an Internal Developer Platform

Alon Shirion

Our journey at monday.com developing an internal developer platform called sphera, improving our developers’ productivity and sense of ownership.

Intro

At monday.com, we believe in end-to-end ownership of services for our developers. However, as the infra (infrastructure) team, we cannot allow direct provisioning or modification of services or resources due to the sensitivity of our architecture. Initially, we managed all requests through a board on monday.com: developers would open a ticket describing what they needed, and the “infra of the week” (an internal shift in our group dedicated to developer support) would handle it.

While this worked well in the beginning, as R&D grew, the infra team became a bottleneck. We had a growing queue of requests that needed to be addressed, such as creating a service in AWS or concerns about approvals and reviews in the microservice creation process. We realized that we needed to automate some of the DevOps work we were doing and stop acting as a bottleneck for the developers.

Our goal was to create a centralized hub where developers could quickly access self-service tools and gain a clear understanding of our system through an easy-to-use interface. 

First Solution – CLI

When I joined monday.com, the team introduced me to our CLI, which is an essential part of our onboarding process. The CLI gives our developers secure access to service secrets in AWS and lets them create new microservices for development and production. It also enables developers to connect safely to production resources such as databases and Kubernetes pods.

Despite the CLI’s effectiveness, we noticed additional issues as our flow became more complex. For example, developers were forgetting to configure CI/CD after creating a microservice, or to provide the correct configuration in their GitHub repo, which delayed both development and production. It was also hard for the infra engineers to remember all the necessary steps when creating a new microservice, and increasingly hard to enforce these steps systematically while keeping responsibility with the developers.

After recognizing the limits of what a command-line tool alone could enforce, we realized that we needed a developer portal with a user-friendly interface that would enhance our developers’ velocity, productivity, and sense of ownership. By providing a golden path, we aim to empower our developers to serve themselves.

Build vs Buy

When it comes to internal tools, the question of whether to build or buy often arises. We faced the same question when we needed a tool to manage our microservices. After exploring various products on the market, we decided to build our own tool from scratch.

The decision to build our own tool was not an easy one, but it was the right one for us. We wanted a tool that we could customize to our needs and manage with code, not YAMLs like other tools on the market. We also needed a secure on-prem solution with minimal public access, which was best achieved through custom development.

Our proof of concept was a microservice creation flow that allowed developers to create and run a microservice in minutes on their dev environment. We realized we needed a user-friendly UI that doesn’t feel like a terminal and has a more “natural flow”. Our initial UI was basic and the server was a mess, but we quickly focused on improving it, and it now accounts for 70-80% of our daily work.

We believe that building tools for ourselves, using them in our daily work, and continuously improving them is the best way to create high-quality products that meet our needs. The decision to build our own internal tool was a significant investment of time and resources, but it has paid off in terms of efficiency, security, and flexibility.


After the POC: Conclusions

We decided to call it “sphera”: a place all of our developers can go to when they need to start working. After the POC, we asked ourselves: what do we want to see on this platform? How will it simplify our work?

We had a lot of services at that point (about 80 at the time, now about 120), and it became a hassle to manage/discover them. Who is the owner? Why did we create this microservice/microfrontend? Is this service currently active in our environments? Which environment (just staging or staging and production)?

Those were a few of the questions we asked ourselves as developers kept creating more and more services. We started with a simple table on the main page that shows all of our services.

sphera’s first official version – 16.05.2022

The goal was “one click to start development”. The moment you create a service through a simple form and a submit button, your new service is added to the sphera database, and all the configuration and logic we require is applied to it: CI/CD setup, adding the image to our Docker registry, GitHub repo configuration, extracting from the template what the user asked for in the form, and so on.
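To make the “one click” idea concrete, here is a minimal sketch of how a creation request could be run through an ordered list of provisioning steps. It is an illustration under assumptions, not sphera's actual code: the `ServiceRequest` fields, the step names, and the `createService` function are all hypothetical, and each step only returns a log line where a real one would call GitHub, the CI/CD system, the Docker registry, or the database.

```typescript
// Hypothetical shape of the creation form; field names are assumptions.
interface ServiceRequest {
  name: string;
  owner: string;
  kind: "microservice" | "microfrontend";
}

// Each provisioning step receives the request and returns a short log line.
type Step = { name: string; run: (req: ServiceRequest) => string };

// Placeholder steps mirroring the list in the text above.
const creationSteps: Step[] = [
  { name: "create-repo-from-template", run: (r) => `repo ${r.name} created from ${r.kind} template` },
  { name: "configure-github-repo", run: (r) => `repo settings applied to ${r.name}` },
  { name: "register-docker-image", run: (r) => `image ${r.name} registered` },
  { name: "configure-ci-cd", run: (r) => `pipeline created for ${r.name}` },
  { name: "save-to-database", run: (r) => `${r.name} saved with owner ${r.owner}` },
];

interface CreationResult {
  completed: string[];
  failedStep?: string;
  error?: string;
}

// Run every step in order and stop at the first failure, so the caller
// always knows exactly which step broke and which ones already ran.
function createService(req: ServiceRequest, steps: Step[] = creationSteps): CreationResult {
  const completed: string[] = [];
  for (const step of steps) {
    try {
      step.run(req);
      completed.push(step.name);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { completed, failedStep: step.name, error: message };
    }
  }
  return { completed };
}

const creationResult = createService({ name: "billing-api", owner: "payments-team", kind: "microservice" });
console.log(creationResult.completed.join(" -> "));
```

Keeping the steps in one list is what prevents the “forgot to configure CI/CD” class of mistakes: a developer cannot skip a step, and adding a new requirement means adding one entry.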

Working with sphera: A Closer Look

Behind the scenes, sphera is a client built with React, a Node.js server, and a MySQL database. The server has many integrations with the tools that we use for development and production: Codefresh, Datadog, GitHub, our internal tools, and more.

We have also developed microservices that integrate with sphera, so that not all of the logic lives in one big place.

The main focus was to give as much as we can out-of-the-box with minimum configuration. With that mindset, we kept developing features for the platform and tried to make it as easy and simple as possible.

Today, developers can log in to sphera, authenticate through Okta, and use the many features we already have. Here is a short explanation of some of them:

Creating a microservice/microfrontend – create a new service through a simple form in which you request the resources you need for development. You get a GitHub repo extracted from a template with the right configuration and standards, and a simple view of the service in sphera once it has been created.

A service “life cycle” management – we wanted to ensure that developers understand what stage their service is in and what resources it currently has, so we created a page that gives visibility into its current state. It starts with a design review of the service and continues through the creation of the development environment, remote resources like databases, CI/CD flows, and so on, until the service gets to production. Each step runs according to the progress of the development.
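One way to picture the life cycle page is as an ordered list of stages where a stage can only be completed once all earlier ones are done. The sketch below assumes exactly that; the stage names are taken loosely from the description above, and `currentStage` and `completeStage` are hypothetical helpers, not sphera's API.

```typescript
// Stage names are assumptions based on the steps mentioned in the text.
const STAGES = ["design-review", "dev-environment", "remote-resources", "ci-cd", "production"] as const;
type Stage = (typeof STAGES)[number];

interface ServiceLifecycle {
  service: string;
  done: Stage[];
}

// The current stage is the first one that has not been completed yet;
// undefined means every stage, including production, is done.
function currentStage(lc: ServiceLifecycle): Stage | undefined {
  return STAGES.find((s) => lc.done.indexOf(s) === -1);
}

// A stage may only be completed when it is the current one, which keeps
// the order (design review first, production last) enforced in one place.
function completeStage(lc: ServiceLifecycle, stage: Stage): ServiceLifecycle {
  const expected = currentStage(lc);
  if (expected !== stage) {
    throw new Error(`cannot complete "${stage}" while at "${expected ?? "finished"}"`);
  }
  return { service: lc.service, done: [...lc.done, stage] };
}

let billingLifecycle: ServiceLifecycle = { service: "billing-api", done: [] };
billingLifecycle = completeStage(billingLifecycle, "design-review");
console.log(currentStage(billingLifecycle));
```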

Secrets management – CRUD actions for all the secrets related to your service. Every action is performed according to our standards, which the developer doesn’t need to know about. For example, say one of our developers wants to create a new secret in our EU region. They know the name, the value, and of course the region in which they want to create the secret. They enter this information in a simple form in the platform’s UI and click create. Behind the scenes, sphera sends a request to our company secrets tool (in our case AWS Secrets Manager), adds all the prefixes/tags and other information we require, and creates the secret according to our standard, leaving no room for mistakes. We also have a complete view of our secrets across regions, which lets us know if there is any kind of “drift” between regions/environments.
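The two ideas in this feature, applying a naming standard and spotting drift, can be sketched in a few lines. Everything here is an assumption made for illustration: the `<environment>/<service>/<key>` layout, the tag keys, and the `standardize` and `findSecretDrift` helpers are hypothetical, and no real secrets API is called.

```typescript
interface SecretRequest {
  service: string;
  key: string;
  region: string;
  environment: "staging" | "production";
}

// Build the standardized name and tags from what the developer typed.
// The naming layout and tag keys are illustrative, not a real convention.
function standardize(req: SecretRequest): { name: string; tags: Record<string, string> } {
  return {
    name: `${req.environment}/${req.service}/${req.key}`,
    tags: { service: req.service, environment: req.environment, region: req.region, managedBy: "sphera" },
  };
}

// Given the secret names present in each region, report which names are
// missing from which region. An empty result means no drift.
function findSecretDrift(byRegion: Record<string, string[]>): Record<string, string[]> {
  const allNames = new Set<string>();
  for (const region of Object.keys(byRegion)) {
    (byRegion[region] ?? []).forEach((n) => allNames.add(n));
  }
  const drift: Record<string, string[]> = {};
  for (const region of Object.keys(byRegion)) {
    const present = new Set(byRegion[region] ?? []);
    const missing = Array.from(allNames).filter((n) => !present.has(n)).sort();
    if (missing.length > 0) drift[region] = missing;
  }
  return drift;
}

console.log(standardize({ service: "billing", key: "DB_PASSWORD", region: "eu", environment: "production" }).name);
console.log(findSecretDrift({ us: ["a", "b"], eu: ["a"] }));
```

Because the name and tags are derived in one function, a developer only supplies the key, value, and region, and the standard cannot be bypassed by accident.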

Lock management – we store a “lock” status for each service that tells us whether it may be deployed. For example, if we have an incident and don’t have time to tell everyone that they shouldn’t deploy anything, we simply lock all of the services (or just the service with the issue), and we know there won’t be any accidental deployments.

We also have a “freeze merge” feature for when we want to block merges to master altogether for some time. Every developer has access to lock/unlock and freeze/unfreeze.
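A deployment pipeline could consult these statuses with a check like the one sketched below. The `DeployGateState` shape and the `canDeploy` and `canMerge` functions are hypothetical names chosen for this example; the source does not describe how the lock is stored or queried.

```typescript
interface DeployGateState {
  lockedServices: Set<string>; // individual services that must not be deployed
  allLocked: boolean;          // global lock, e.g. during an incident
  mergeFrozen: boolean;        // "freeze merge" on master
}

type Decision = { allowed: true } | { allowed: false; reason: string };

// A deployment is blocked by the global lock or by a lock on the service itself.
function canDeploy(state: DeployGateState, service: string): Decision {
  if (state.allLocked) return { allowed: false, reason: "all services are locked" };
  if (state.lockedServices.has(service)) return { allowed: false, reason: `${service} is locked` };
  return { allowed: true };
}

// Merges are controlled separately from deployments.
function canMerge(state: DeployGateState): Decision {
  return state.mergeFrozen ? { allowed: false, reason: "merges to master are frozen" } : { allowed: true };
}

const incidentState: DeployGateState = {
  lockedServices: new Set(["billing-api"]),
  allLocked: false,
  mergeFrozen: false,
};
console.log(canDeploy(incidentState, "billing-api"));
console.log(canDeploy(incidentState, "search-api"));
```

Returning a reason along with the decision lets the pipeline tell the developer why a deployment was refused, which matters when there was no time to announce the lock.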

Service links – GitHub, Coralogix, Datadog, ArgoCD, and more related links depending on the use case, going directly to the context of that service.

Kubernetes resources view and actions (limited as we want) – I think one of the biggest fears of DevOps engineers is a developer changing something in the infrastructure that breaks it or doesn’t follow the infra team’s standards. After such a change, finding the issue is hard and time-consuming for both sides.

So how do you get over that fear without becoming a bottleneck?

The answer is the theme of this blog post: self-service. Through sphera, developers can view the k8s resources related to their service (deployments and HPAs, for example), restart their deployment safely through the UI, or change the HPA within limits that are enforced behind the scenes.
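The “logic limitations” on HPA changes could be as simple as validating the requested replica counts against guardrails before anything is applied. This is a sketch under that assumption; `HpaLimits`, its fields, and `validateHpaChange` are hypothetical, and the numbers in the usage line are placeholders.

```typescript
interface HpaSpec {
  minReplicas: number;
  maxReplicas: number;
}

// Guardrails the infra team might define per service or per environment.
interface HpaLimits {
  minFloor: number;   // lowest minReplicas a developer may set
  maxCeiling: number; // highest maxReplicas a developer may set
}

// Return a list of human-readable problems; an empty list means the change
// is within limits and may be applied.
function validateHpaChange(requested: HpaSpec, limits: HpaLimits): string[] {
  const errors: string[] = [];
  if (!Number.isInteger(requested.minReplicas) || !Number.isInteger(requested.maxReplicas)) {
    errors.push("replica counts must be integers");
  }
  if (requested.minReplicas < limits.minFloor) {
    errors.push(`minReplicas must be at least ${limits.minFloor}`);
  }
  if (requested.maxReplicas > limits.maxCeiling) {
    errors.push(`maxReplicas must be at most ${limits.maxCeiling}`);
  }
  if (requested.minReplicas > requested.maxReplicas) {
    errors.push("minReplicas cannot exceed maxReplicas");
  }
  return errors;
}

console.log(validateHpaChange({ minReplicas: 1, maxReplicas: 500 }, { minFloor: 2, maxCeiling: 50 }));
```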

Deployments view – see the status of all deployments in all regions and clusters: a quick way to spot drift between deployments and get clear visibility.
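Drift in this sense can be detected by grouping deployments by service and flagging any service that runs more than one version. The sketch below assumes the input covers a single environment (staging and production would normally differ on purpose); the `Deployment` fields and `findVersionDrift` are hypothetical.

```typescript
interface Deployment {
  service: string;
  region: string;
  cluster: string;
  version: string;
}

// Map each drifting service to the sorted list of distinct versions found.
// Services running the same version everywhere are omitted.
function findVersionDrift(deployments: Deployment[]): Map<string, string[]> {
  const versionsByService = new Map<string, Set<string>>();
  for (const d of deployments) {
    const versions = versionsByService.get(d.service) ?? new Set<string>();
    versions.add(d.version);
    versionsByService.set(d.service, versions);
  }
  const drift = new Map<string, string[]>();
  versionsByService.forEach((versions, service) => {
    if (versions.size > 1) drift.set(service, Array.from(versions).sort());
  });
  return drift;
}

const observed: Deployment[] = [
  { service: "billing-api", region: "us", cluster: "main", version: "1.4.0" },
  { service: "billing-api", region: "eu", cluster: "main", version: "1.3.9" },
  { service: "search-api", region: "us", cluster: "main", version: "2.0.1" },
  { service: "search-api", region: "eu", cluster: "main", version: "2.0.1" },
];
console.log(findVersionDrift(observed));
```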

Rollback – a button on each service lets the user choose the region and environment, and then rolls back the version of that microservice, making all the necessary changes behind the scenes (a commit to the GitOps repo, updates to all the relevant places, a Slack message about the rollback, etc.).
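One way to keep a rollback predictable is to turn the button click into an explicit plan of actions before executing any of them. The sketch below only builds such a plan; the action kinds, the file path layout, and `planRollback` are assumptions for illustration, and nothing here talks to Git or Slack.

```typescript
interface RollbackRequest {
  service: string;
  region: string;
  environment: string;
  targetVersion: string;
  requestedBy: string;
}

// The behind-the-scenes changes, expressed as data so they can be reviewed,
// logged, and executed in order.
type RollbackAction =
  | { kind: "gitops-commit"; file: string; message: string }
  | { kind: "update-record"; service: string; version: string }
  | { kind: "slack-message"; text: string };

function planRollback(req: RollbackRequest, currentVersion: string): RollbackAction[] {
  if (req.targetVersion === currentVersion) {
    throw new Error(`${req.service} is already running ${currentVersion}`);
  }
  const where = `${req.environment}/${req.region}`;
  return [
    // Hypothetical path layout for the GitOps repo.
    { kind: "gitops-commit", file: `${where}/${req.service}.yaml`, message: `rollback ${req.service} to ${req.targetVersion}` },
    { kind: "update-record", service: req.service, version: req.targetVersion },
    { kind: "slack-message", text: `${req.requestedBy} rolled back ${req.service} in ${where} from ${currentVersion} to ${req.targetVersion}` },
  ];
}

const rollbackPlan = planRollback(
  { service: "billing-api", region: "eu", environment: "production", targetVersion: "1.3.9", requestedBy: "dana" },
  "1.4.0",
);
rollbackPlan.forEach((action) => console.log(action.kind));
```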

Audit logs – all actions in sphera are audited and we can track who did what and when (compliance).

Feature flags management – monday is a dynamic, single-tenant deployment, which naturally has a lot of feature flags to allow a smooth rollout of new features. We added the ability to manage those feature flags in sphera to add visibility and drift detection and to make them easier to manage. Since we manage flags in different clusters and regions to which we don’t have direct access, the operations are done asynchronously using Kafka.
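The asynchronous pattern can be illustrated without a Kafka client: the platform appends a command to a log, and an agent inside each cluster reads the log at its own pace and applies only the commands addressed to it. In this sketch `InMemoryLog` is a stand-in for a Kafka topic, not a Kafka API, and `FlagCommand` and `ClusterFlagAgent` are hypothetical names.

```typescript
interface FlagCommand {
  flag: string;
  enabled: boolean;
  cluster: string;
  region: string;
}

// In-memory stand-in for a topic: an append-only list that consumers read
// from a given offset. Real Kafka producers and consumers are not shown.
class InMemoryLog<T> {
  private readonly entries: T[] = [];
  append(entry: T): void {
    this.entries.push(entry);
  }
  readFrom(offset: number): T[] {
    return this.entries.slice(offset);
  }
}

// Runs inside a cluster. It tracks its own offset, so every agent sees every
// command but applies only those targeted at its cluster and region.
class ClusterFlagAgent {
  readonly flags = new Map<string, boolean>();
  private offset = 0;
  private readonly cluster: string;
  private readonly region: string;

  constructor(cluster: string, region: string) {
    this.cluster = cluster;
    this.region = region;
  }

  // Process everything published since the last poll; return how many
  // commands were applied here.
  poll(log: InMemoryLog<FlagCommand>): number {
    const batch = log.readFrom(this.offset);
    this.offset += batch.length;
    let applied = 0;
    for (const cmd of batch) {
      if (cmd.cluster === this.cluster && cmd.region === this.region) {
        this.flags.set(cmd.flag, cmd.enabled);
        applied += 1;
      }
    }
    return applied;
  }
}

const flagLog = new InMemoryLog<FlagCommand>();
const euMainAgent = new ClusterFlagAgent("main", "eu");
flagLog.append({ flag: "new-board-view", enabled: true, cluster: "main", region: "eu" });
console.log(euMainAgent.poll(flagLog), euMainAgent.flags.get("new-board-view"));
```

The platform never needs network access into the cluster: it only writes commands, and the flag changes once the agent on the other side picks them up.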

While there are numerous features available in sphera, I have chosen to highlight only a few in this blog post, to give a concise and focused overview rather than overwhelm you with an exhaustive list.


The Exciting Future of sphera: a Developer’s Dream Tool!

sphera has revolutionized the way we work. With this tool, we have significantly reduced the number of tickets from developers, and the time they spend on their tasks has been cut down considerably. Thanks to sphera, developers can create and test new microservices and microfrontends within minutes, whereas before it could take up to a week.

Incident management has also been made easier, with easy access to all the necessary tools and a simplified view of the service’s state. As a result, we’ve seen a significant uptick in developer adoption, with about 60% of our R&D now using sphera for their needs.

As the demand for sphera grows, so do our feature requests. However, our team is working tirelessly to make sure that developers have everything they need at their fingertips. Our vision for sphera is to make it the only tool developers need to get their work done.

Imagine coming into work, opening sphera, and never having to leave. With our platform, developers will have access to all the essential features, including infrastructure setup, monitoring, logging, resource creation, and more. We believe that the possibilities are endless, and there is always room for improvement and new features.

At sphera, we are developers building a platform for developers. Developers from our R&D always have fresh ideas on how to make our tool better, and they actively contribute to the code themselves. We believe that developing features that simplify or automate tasks in our daily work is not only beneficial but also essential for improving the efficiency of our platform. To give extra motivation to contribute, we even offer a shirt with the sphera logo on it for the first contribution to the project.

With sphera, developers can do their job more quickly and efficiently than ever before, making it a win-win situation for both the developer and the company. 

The future of sphera is exciting, and we are committed to making it the ultimate tool for developers. With our ongoing development efforts and commitment to excellence, we are confident that sphera will continue to transform the way developers work, making their lives easier and more productive.

A Platform by Developers for Developers – sphera