Scaling A SaaS Company While Keeping Developers Sane


How we’re simplifying our developers’ day-to-day and helping them create an environment that’s all about writing the next feature

Ben Yitzhaki

monday.com has experienced hyper-growth over the past few years: our product has become more complex and our R&D organization has doubled in size every year. This introduced new challenges around onboarding new engineers, sharing knowledge and configurations, and the overall development experience.

We are constantly driven to improve the development experience for our engineers, from the new hire who has just been handed a laptop to the most experienced team member. To support that vision, we had to evolve in many ways, some of which are also relevant to non-development teams.

Finding out which issues your team has and understanding what you wish to improve is a broad subject that deserves its own post. This time, we’ll focus on the difficulties we had in our team and how we overcame them.

Sharing knowledge

Documenting procedures and services is important to preserve knowledge. Moreover, it is also important to ensure that documents are easily searchable and accessible when needed.

We had a lot of documentation scattered across different locations. On some occasions, we couldn’t even find the documentation when we needed it and had to rely on the memory of team members (who usually found it several hours later in the wiki of some abandoned git repository).

As part of the effort to reduce complexity, we decided to adopt a knowledge base system that keeps all information in one place. We chose a service named Document360 because it was easy to start with, but you can pick any other service that meets your requirements.

Normally our own platform would be a perfect candidate for this use case. However, some of our engineering documentation deals with environment operations and disaster recovery, so we wanted it to live somewhere as isolated as possible from our product and the services it depends on, and to remain accessible in a doomsday scenario. It was also important that the service be as simple as possible for readers and editors alike, so that creating new documents and updating existing ones takes little effort. Lastly, we wanted to ensure that it would become the single source of information.

This change had a huge impact for us. Information became accessible and easy to find. Procedures such as the initial setup of a new environment were moved into the proper context and hierarchy, which made orientation easier. For example, “gaining permission to git” is located just above “initial local setup”, which requires git permission.

 

Simplifying local working environment

Our working environment needs to be simple, yet provide all the tools that assist us in our daily tasks. This is challenging because development environments are dynamic and change often. A key factor for us was ensuring that changes to the environment won’t break anything or require each developer to make manual changes.

Instead of having each developer configure their own environment based on guides and tutorials, we created shared configurations. For example, monday.com is a web service and so requires a web server, such as nginx, to run locally alongside the application. For that to work, one must tailor nginx to our application, e.g. create a specific listener and resolve a local domain to the application. Once the configuration is shared, we can easily change it. For instance, when a new service that requires changes to nginx is added, the change applies to all developers at once.

There are plenty of ways to implement shared configurations. We used dotfiles, as they are simple and versatile. When creating our dotfiles, we wanted engineers to need minimal effort to set them up on their machines, while giving us a good entry point for future changes. We decided to hook into the shell profile (.bash_profile or .zshrc) for exactly that reason. All we asked of engineers was to git clone our new dotfiles repository and include our shared .bash_profile in theirs.

This is how their .bash_profile would look; they can still add whatever they want above the include:
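A minimal sketch of such a personal profile, assuming the dotfiles repository was cloned to ~/dotfiles. The path and the DOTFILES_DIR variable name are assumptions for illustration:

```shell
# ~/.bash_profile (personal aliases and settings go above this block)

# Hypothetical clone location; adjust to wherever the dotfiles repository lives.
DOTFILES_DIR="${DOTFILES_DIR:-$HOME/dotfiles}"

# Load the shared profile only if the repository has actually been cloned,
# so a missing checkout never breaks the terminal.
if [ -f "$DOTFILES_DIR/.bash_profile" ]; then
  . "$DOTFILES_DIR/.bash_profile"
fi
```

Guarding the include with a file check keeps a fresh laptop usable even before the repository is cloned.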

To let us update the configurations easily, we added a “git pull” to the shared profile, so the shared .bash_profile looks like this:
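A minimal sketch of a shared profile with a background update, under the same assumed ~/dotfiles location. The update_dotfiles function name is hypothetical:

```shell
# Shared .bash_profile (sketch). DOTFILES_DIR is a hypothetical location.
DOTFILES_DIR="${DOTFILES_DIR:-$HOME/dotfiles}"

# Update the dotfiles repository without blocking the new terminal.
update_dotfiles() {
  # The subshell plus "&" detaches the pull from the interactive shell;
  # output is discarded so nothing is printed over the prompt.
  ( git -C "$DOTFILES_DIR" pull --quiet --ff-only >/dev/null 2>&1 & )
}

# Only attempt an update when the directory is a git checkout.
if [ -d "$DOTFILES_DIR/.git" ]; then
  update_dotfiles
fi

# Shared aliases, environment variables and helpers would follow here.
```

Because the pull runs in the background while the profile is already being loaded, an update fetched in one terminal typically takes effect in the next terminal that is opened.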

You might have noticed that we pulled data from git using a bash script that runs in the background. This is because the script runs each time a new terminal is opened, so we wanted to ensure it won’t delay our engineers when they open a new terminal (as most of the time, there won’t be any updates).

Our secrets, like database credentials and API keys, are also shared and need to be quickly accessible when running the application locally. Unlike the shared configuration files, we don’t distribute them through the dotfiles, as that would require committing secrets to git, which is a big no-no. Instead, each developer pulls the secrets from a shared service, in our case AWS Secrets Manager. That way they are secure, yet shared among all developers.
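As an illustration, pulling a single secret with the AWS CLI could look like the sketch below. The fetch_secret helper, the secret name and the DATABASE_URL variable are hypothetical, and the sketch assumes the developer already has AWS credentials configured:

```shell
# Fetch one secret value from AWS Secrets Manager and print it to stdout.
# $1 is the secret name or ARN.
fetch_secret() {
  aws secretsmanager get-secret-value \
    --secret-id "$1" \
    --query SecretString \
    --output text
}

# Example (hypothetical secret name): export the value for the current
# shell session only, so nothing is written to disk or committed to git.
# export DATABASE_URL="$(fetch_secret dev/app/database-url)"
```

Access is then governed by each developer’s AWS permissions rather than by who can read the repository.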

Real-time code inputs using code analyzers

Clear coding conventions among developers working on the same project are crucial. They make the code easier for developers to read, and also make it easier for code analyzers to detect known patterns or anomalies.

The first tool that should be configured is a linter. A linter is a code analyzer that helps enforce code styling (no more battles about spaces vs tabs), spot programming errors, bugs and suspicious code. 

Here’s an example of an issue in our product that a linter discovered and that would otherwise have gone unnoticed: a developer mistakenly initialized a variable as a global instead of a locally scoped one. The code passed end-to-end tests, unit tests, and even code review. The linter was configured to detect initializations of global variables, since they can affect most of the code. It doesn’t block them, but it warns developers about them. In this case, the developer noticed the warning and fixed the code accordingly.

We are using CodeClimate for most of our static code analysis. It supports various types of analyzers, making it very flexible.

MondayBot

But how do we avoid repeating the same mistakes? As an example, we use Redis on our platform. We connect to it using a standard Redis library, but we have wrapped that library with a custom service that handles exceptions and follows best practices. It is still possible to use the standard library directly, but we discourage it because doing so bypasses our guidelines, so we created a rule using Danger that flags direct usage.

Danger allows us to write custom rules and attach those to different triggers. That way, developers can add rules based on their experience, just like they would with unit tests.
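The sketch below is not Danger code. It is a plain shell illustration of the kind of check such a rule performs: scan the lines added in a diff and emit a warning when a forbidden pattern appears. The warn_on_direct_redis name and the default pattern are hypothetical. In an actual Danger rule, the same condition would call Danger’s warn function instead of echo.

```shell
# Read a unified diff from stdin and print a warning for every added line
# that matches a forbidden pattern. The default pattern is a hypothetical
# example of using a Redis client directly.
warn_on_direct_redis() {
  pattern="${1:-Redis[.]new}"
  # Added lines start with "+"; lines starting with "+++" are file headers.
  grep -E '^\+' | grep -Ev '^\+\+\+' | grep -E "$pattern" |
    while IFS= read -r line; do
      # Strip the leading "+" before echoing the offending line.
      echo "warn: direct Redis client usage, prefer the shared wrapper: ${line#+}"
    done
}

# Usage (branch name is an assumption):
# git diff origin/master...HEAD | warn_on_direct_redis
```

Only added lines are checked, so existing code does not trigger the warning until someone touches it.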

The output (“warn”) can appear wherever it is relevant to the context of the message: as a comment on the relevant line of the PR, as a failure in a CI/CD flow, and so on. You can be very creative here!

As the rule is written in custom code, it can be enhanced as much as we want.

Having these tools, on top of the code review and tests, added a lot of stability to our product.

Conclusion

Having a simple, convenient and effective working environment improved our developers’ efficiency and helped them focus on what’s important. Changes can be small and still make a big impact, like adding an integration to a Slack channel that will notify you once a pull request is pending your review, or shared bookmarks for the entire R&D department (we are using TeamSync bookmarks and it’s awesome!).

If you’ve found any of the above suitable for your team’s workflow, I’d recommend first understanding what the primary time consumers or distractions are, and focusing on those. You could start by getting feedback from the developers through an open discussion, a survey, or any other means you find convenient. Remember that as your team grows, you might need to adapt. Collecting periodic feedback lets you know how happy or frustrated your team really is.

Found this interesting? Want to join us?

We build our infrastructure solutions in the same way we build everything at monday.com. We’re looking specifically for Full Stack developers with a special passion for infrastructure. Like any engineer at monday.com, if we need to change the application code to comply with infrastructure changes, we do it ourselves without waiting for someone to do it for us.

If you’re a team player with strong communication skills, this just may be the team for you. You can see all of our team’s open positions here, which include Development Experience Engineer, Infrastructure Engineer (SRE), Infrastructure Backend Engineer, Production Engineer and DBA.

If you’re excited by what you just read and our challenges, we will be happy to meet and share the knowledge. Let’s be in touch! 🙂