The statement "Data is the new currency" has become quite a cliché. However, it's not incorrect; companies strive to understand their customers better to serve them more effectively. Data is a company's most significant asset, and they establish processes to protect it. When a company uses a SaaS platform, they entrust their most prized possession to a third party. Any compromise to this data could pose a severe risk to their survival. Consequently, SaaS providers must adhere to stringent compliance and security protocols.
SaaS providers also need to be cost-efficient, and this is where the concept of multi-tenancy comes into play. Multi-tenancy allows for both security and cost efficiency: hardware resources are shared among multiple clients, while each client's data is stored securely and kept isolated from the others.
Let's dive deeper into the types of multi-tenant architecture and their respective benefits.
What is Multi-Tenant Architecture?
Imagine an apartment building where each tenant has their own individual apartment. This apartment building represents a SaaS platform, and the tenants are the different companies using this platform.
In a traditional housing scenario, each family would need to build and maintain their own house, which is resource-intensive and costly. Similarly, without multi-tenant architecture, each company would need to create and manage their own separate infrastructure, leading to high costs and inefficiencies.
However, in our apartment building analogy, the building's structure (like the multi-tenant architecture) provides shared resources such as electricity, water, and security systems. Each apartment is secure and isolated, ensuring that what happens in one apartment doesn't affect the others.
This mirrors how multi-tenant architecture allows multiple companies to share the same hardware resources while keeping their data secure and isolated from each other.
Just as tenants trust the building management to maintain the shared resources and keep the building secure, companies trust SaaS providers to manage the infrastructure and ensure data security. By sharing resources efficiently, costs are reduced for everyone, while security and isolation are maintained.
How to Implement Multi-Tenant Architecture
Design Principles
Data Isolation
Data isolation ensures that each tenant's data is segregated from others. This can be achieved through different database strategies, which we'll discuss later. The goal is to prevent any tenant's data from being accessible to others, maintaining privacy and security.
Tenant Identification
Tenant identification involves uniquely identifying each tenant within the system. This is typically done using a tenant ID, which is passed along with every request. Ensuring accurate and efficient tenant identification is crucial for maintaining data integrity and security.
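As a minimal sketch of this idea, the function below pulls a tenant ID out of the request headers and rejects anything missing or malformed. The header name `x-tenant-id`, the allowed ID format, and the function name are assumptions made for illustration; a real system might instead take the tenant from a subdomain or a token claim.

```typescript
// Hypothetical header name; pick whatever your clients and gateway agree on.
const TENANT_HEADER = "x-tenant-id";

// Assumed ID format: starts with a lowercase letter, then letters, digits, or underscores.
// A strict format keeps the ID safe to reuse later, e.g. as part of a schema name.
const TENANT_ID_PATTERN = /^[a-z][a-z0-9_]{2,39}$/;

type RequestHeaders = Record<string, string | string[] | undefined>;

// Returns the tenant ID for a request, or throws if it is missing or malformed.
function resolveTenantId(headers: RequestHeaders): string {
  const raw = headers[TENANT_HEADER];
  // Some HTTP servers expose repeated headers as arrays; take the first value.
  const value = Array.isArray(raw) ? raw[0] : raw;
  if (value === undefined || !TENANT_ID_PATTERN.test(value)) {
    throw new Error("Missing or malformed tenant ID");
  }
  return value;
}
```

Failing fast here means no later layer ever runs without knowing which tenant it is serving.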
Resource Allocation
Resource allocation involves distributing system resources such as CPU, memory, and storage among tenants. A common problem here is the "noisy neighbor", where one tenant consumes a disproportionate amount of resources and starves the others.
This can be mitigated through the following techniques:
Isolating resources by running tenants in their own containers or VMs.
Applying rate limits, quotas and/or throttling utilization of resources.
Using QoS policies to ensure high priority tasks receive necessary resources under heavy load.
Monitoring and auto-scaling to allow load balancing.
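To make the rate-limiting technique concrete, here is a rough sketch of a per-tenant token bucket. The class name, capacity, and refill rate are all hypothetical, and the state lives in memory, so this only illustrates the idea for a single process; a multi-instance deployment would need a shared store.

```typescript
// Hypothetical in-memory limiter: each tenant gets its own token bucket.
class TenantRateLimiter {
  private readonly buckets = new Map<string, { tokens: number; updatedAt: number }>();
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  // Returns true if the tenant may proceed, false if the request should be throttled.
  // nowMs is injectable so the behavior can be tested without real waiting.
  tryAcquire(tenantId: string, nowMs: number = Date.now()): boolean {
    const bucket = this.buckets.get(tenantId) ?? { tokens: this.capacity, updatedAt: nowMs };
    // Refill in proportion to the time elapsed, never above capacity.
    const elapsedSeconds = Math.max(0, nowMs - bucket.updatedAt) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSeconds * this.refillPerSecond);
    bucket.updatedAt = nowMs;
    const allowed = bucket.tokens >= 1;
    if (allowed) {
      bucket.tokens -= 1;
    }
    this.buckets.set(tenantId, bucket);
    return allowed;
  }
}
```

Because every tenant has a separate bucket, one tenant exhausting its budget does not reduce what the others are allowed to do.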
Architectural Patterns
Shared Database, Shared Schema
In this pattern, all tenants share the same database and schema. Tenant data is distinguished by a tenant identifier column in each table. This approach is cost-effective and simple to implement, but may pose challenges in terms of performance and data isolation.
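One way to picture this pattern is a small helper that refuses to build a query without the tenant filter. This is only a sketch: the `tenant_id` column name, the helper name, and the `$1`-style placeholders (the convention used by node-postgres) are assumptions, and table and column names are assumed to come from trusted code rather than user input.

```typescript
interface ScopedQuery {
  text: string;
  values: unknown[];
}

// Builds a parameterized SELECT that always filters on tenant_id first.
// Table and column names must come from trusted code; only values are parameterized.
function selectForTenant(
  table: string,
  tenantId: string,
  filters: Record<string, unknown> = {},
): ScopedQuery {
  const conditions = ["tenant_id = $1"];
  const values: unknown[] = [tenantId];
  for (const [column, value] of Object.entries(filters)) {
    values.push(value);
    conditions.push(`${column} = $${values.length}`);
  }
  return {
    text: `SELECT * FROM ${table} WHERE ${conditions.join(" AND ")}`,
    values,
  };
}
```

Routing all reads through a helper like this reduces, but does not remove, the risk of a query that forgets the tenant condition.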
Shared Database, Separate Schema
Here, all tenants share the same database, but each tenant has its own schema. This offers better data isolation compared to the shared schema approach and can simplify data management. However, it can lead to higher complexity in managing the database and potentially increased costs.
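A rough sketch of how a request could be pointed at the right schema in Postgres: map the tenant ID to a schema name, then set `search_path` before running queries. The `tenant_` prefix, the ID format, and both function names are assumptions for illustration.

```typescript
// Assumed tenant ID format; strict validation matters because the ID ends up inside SQL.
const SAFE_TENANT_ID = /^[a-z][a-z0-9_]{0,39}$/;

// Maps a tenant ID to its schema name, e.g. "acme" -> "tenant_acme".
function schemaNameFor(tenantId: string): string {
  if (!SAFE_TENANT_ID.test(tenantId)) {
    throw new Error(`Unsafe tenant ID: ${tenantId}`);
  }
  return `tenant_${tenantId}`;
}

// Builds the statement that makes unqualified table names resolve to the tenant's schema,
// falling back to "public" for shared tables.
function searchPathStatement(tenantId: string): string {
  return `SET search_path TO "${schemaNameFor(tenantId)}", public`;
}
```

With a connection pool, a plain `SET` stays in effect on the connection after it is returned to the pool, so it is worth considering `SET LOCAL` inside a transaction or resetting the path before release.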
Separate Database
Each tenant has its own database. This pattern provides the highest level of data isolation and security but at the cost of increased resource usage and management complexity. It's ideal for tenants with very different requirements or high security needs.
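In practice this pattern usually needs some registry that maps a tenant to its own connection. The sketch below is one hypothetical shape for it: the class, the config fields, and the injected `connect` function are all assumptions, and the connection type is left generic so no particular database driver is implied.

```typescript
// Hypothetical per-tenant database settings.
interface TenantDatabaseConfig {
  host: string;
  database: string;
}

// Lazily opens one connection (or pool) per tenant and reuses it afterwards.
class TenantConnectionRegistry<C> {
  private readonly connections = new Map<string, C>();
  private readonly configs: Map<string, TenantDatabaseConfig>;
  private readonly connect: (config: TenantDatabaseConfig) => C;

  constructor(
    configs: Map<string, TenantDatabaseConfig>,
    connect: (config: TenantDatabaseConfig) => C,
  ) {
    this.configs = configs;
    this.connect = connect;
  }

  get(tenantId: string): C {
    const existing = this.connections.get(tenantId);
    if (existing !== undefined) {
      return existing;
    }
    const config = this.configs.get(tenantId);
    if (config === undefined) {
      throw new Error(`Unknown tenant: ${tenantId}`);
    }
    const connection = this.connect(config);
    this.connections.set(tenantId, connection);
    return connection;
  }
}
```

The per-tenant connections are also where the extra resource usage of this pattern shows up: each one holds memory and database sessions, even for tenants that are mostly idle.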
Hybrid Approach
This approach is not very common, but where it appears it is mostly in microservices architectures. It is based on the observation that not all resources and data require the same level of isolation and security. Some parts of the data can therefore be housed in a single-tenant architecture while other parts live in a multi-tenant environment, creating a mixed-tenant architecture.
Technology Stack
Database Technologies
Choosing the right database technology is crucial. Relational databases like PostgreSQL and MySQL are popular for their robust support of schemas and complex queries. NoSQL databases like MongoDB can offer more flexibility for certain use cases.
Middleware
Middleware components, such as API gateways and message brokers, play a vital role in handling tenant-specific routing and communication. They ensure requests are directed correctly based on the tenant ID and can help with load balancing and rate limiting. Dedicated components like these are mostly needed when every tenant has a separate database; otherwise, this logic usually lives in middleware at the application level.
Application Layer
The application layer must be designed to handle tenant-specific logic, ensuring that tenant IDs are correctly processed at every level. This logic can also be placed in the middleware mechanisms provided by common backend frameworks such as Nest.js, Express.js, and Spring Boot.
Implementation Details
Which approach is best for you, you may ask. Well, the answer is... it depends. Like every other system design decision, it depends on your actual use case. Every approach has its own pros and cons, and you have to pick the one that works best for your system.
I can only tell you what I did and what I learned along the way.
Use Case
Our system was an ERP for restaurants with potentially 10,000+ tenants. A few notable aspects that I considered when making the decision were:
Tech stack: Nest.js, Postgres, TypeORM, and AWS Cognito for authentication.
Monorepo with monolith architecture of the system.
Tenants had two main types of users (each with subtypes) in their systems: customers and employees.
Being a startup, our company also had 2 major constraints: Limited Time and Limited Funding.
So we had to keep the system scalable and complete the refactor in as little time as possible, while another team kept building features on the old architecture.
We went for the shared database, separate schema approach. It gave us the fastest route to refactoring our code, and Postgres can reportedly support around 300,000 schemas, which meant we would have room to scale further down the line.
We didn't go for shared database & shared schema, because we wanted to ensure logical isolation of data. We also didn't want to be at the mercy of a developer's mistake. If a single 'where' condition is missed, the entire data isolation goes down the drain.
Separate databases were out of the question at this stage of the startup. We didn't have clients demanding utmost isolation, and our limited financial resources played a part in this decision.
Implementations
There are a bunch of implementation details that I can't go into because of the NDA, but I found this repository a good starting point for anyone trying to implement it: https://github.com/thomasvds/nestjs-multitenants
This is an earlier draft of what our request lifecycle looked like.
Lessons Learned
I have to put it out there that using TypeORM in this project was not a wise decision. It doesn't have much to do with multi-tenant architecture, but I want to stress it. If you have a simple, basic project, be my guest and use it as you like. But if you are building an ERP or an enterprise solution, I would advise against using an ORM directly. Build your own wrappers around the database connectors, or at least build wrappers around the ORM.
Another challenge in such architectures is performing database migrations, since every single migration has to be run across all schemas. In our case, the time required to run migrations was directly proportional to the number of our clients :/ So it could take hours before our system was ready to serve requests.
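One possible mitigation, which is not what we had in place, is to run the per-schema migrations with bounded concurrency and collect failures instead of stopping at the first one. The sketch below assumes a hypothetical `migrate` function that applies pending migrations to a single schema; the total work still grows with the number of tenants, only the wall-clock time shrinks.

```typescript
// Hypothetical callback that applies pending migrations to one schema.
type MigrateSchema = (schema: string) => Promise<void>;

// Runs `migrate` for every schema, with at most `concurrency` migrations in flight.
// Failures are collected so one broken tenant does not block the rest.
async function migrateAllSchemas(
  schemas: string[],
  migrate: MigrateSchema,
  concurrency: number,
): Promise<{ failed: string[] }> {
  const queue = [...schemas];
  const failed: string[] = [];

  // Each worker keeps pulling the next schema until the queue is empty.
  async function worker(): Promise<void> {
    for (let schema = queue.shift(); schema !== undefined; schema = queue.shift()) {
      try {
        await migrate(schema);
      } catch {
        failed.push(schema);
      }
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, schemas.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return { failed };
}
```

The concurrency limit matters because each in-flight migration holds a database connection and may take locks, so an unbounded `Promise.all` over thousands of schemas could overwhelm the database.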
If I had to make this decision again, I might go for the Shared Database, Shared Schema approach combined with Postgres row-level security. I have heard good things about it.
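I have not run this in production, so treat the following as a sketch of what row-level security could look like rather than a tested setup. It generates the Postgres statements for a table that has a text `tenant_id` column. The setting name `app.current_tenant`, the policy name, and both helper names are assumptions; `ENABLE ROW LEVEL SECURITY`, `CREATE POLICY`, `current_setting`, and `set_config` are standard Postgres features.

```typescript
// Hypothetical custom setting used to carry the current tenant into Postgres.
const TENANT_SETTING = "app.current_tenant";

// One-time DDL per table. The table name must come from trusted code.
// Assumes the table has a text column named tenant_id.
function rowLevelSecurityStatements(table: string): string[] {
  return [
    `ALTER TABLE ${table} ENABLE ROW LEVEL SECURITY`,
    // FORCE makes the policy apply to the table owner as well.
    `ALTER TABLE ${table} FORCE ROW LEVEL SECURITY`,
    `CREATE POLICY tenant_isolation ON ${table} ` +
      `USING (tenant_id = current_setting('${TENANT_SETTING}'))`,
  ];
}

// Query to run at the start of each transaction. The third argument (is_local = true)
// limits the setting to the current transaction, which matters with pooled connections.
function setTenantQuery(tenantId: string): { text: string; values: string[] } {
  return {
    text: "SELECT set_config($1, $2, true)",
    values: [TENANT_SETTING, tenantId],
  };
}
```

The appeal is that the database itself enforces the tenant filter, so a forgotten 'where' condition in application code no longer exposes another tenant's rows.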
Conclusion
Multi-tenant architecture offers numerous benefits, from cost savings to scalability. There's no perfect architecture and no one-size-fits-all solution. It's all a game of nuances, of finding the pieces of the puzzle that fit your needs. You try your best to make things work within the unique set of circumstances you are faced with.
Make sure to weigh the pros and cons carefully and make the decision with an open mind. Keep studying, stay open to new ideas, look at what others are doing, try to understand their perspectives, and keep evolving.