Hello, everyone! In this video, we'll look at how we can design a notification service that is scalable enough. This is never a standalone system; it is always embedded in some other system design. I have used it in a lot of the other system design videos that I have made. Let's say you are building an e-commerce application or a booking system or anything of that sort. You'll always have a notification service which will be used to notify your consumers. Let's look at how we can build one.
Now let's look at some Functional Requirements (FRs) and Non-Functional Requirements (NFRs) that this platform should support. The very obvious first thing is that it should be able to send notifications. The next thing is that it should be pluggable. Now what does pluggable mean? Let's say we want to support SMS and email as forms of notification. Tomorrow somebody might say, I want to support in-app notifications as well. So it should be easy enough to add that. It can be further extended to a lot of kinds of notifications. For example, you could have something like a WhatsApp notification or anything of that sort. So extensibility is something that should be taken care of.
The next thing is that it should be built as a SaaS product. Why SaaS? Because the main idea is that you should know who's sending what number of notifications, and it should be possible for you to rate limit. There are two use cases where you'll need that. You'll need rate limiting as an offering if you are giving this out to other companies as a product to use. But even internally, you would need to do some kind of rate limiting. Let's just say it's being used by a company like Amazon. There are multiple business verticals. If all of them send notifications, a user will end up getting hundreds of notifications in a day. That's a very bad user experience.
So the notification system should be able to put an overall rate limit across all the users and across all the platforms, saying a particular user should not get more than five notifications in a day. There could be a certain amount of classification done. There are two kinds of notifications. One is a transactional notification saying you have made an order, this much money has been deducted from your account, something of that sort. Transactional notifications are OK to get. But promotional notifications should always have a rate limit. So if you're building this for use within your company, you'll build user-level rate limiting. But if you're giving it out externally to other companies as a product, then you might want to put a rate limit on how many requests a client can make to this server. Or maybe make billing tiers and price them accordingly. So basically this is something that you need to at least capture and then act upon.
The next thing is prioritization. We'll support multiple priorities of messages. The idea is that some messages are low priority and certain messages are high priority. Let's say you are sending a one-time password (OTP); that's a very high priority message. Why? Because if the user doesn't get that SMS or email, they will not be able to log into your platform and they'll not do the transaction that they wanted to do. But if it's a promotional message, it is okay if it's low priority and gets delayed. If it reaches the user, let's say, half an hour late, it doesn't really matter. So we'll have this prioritization, based on which you'll always process the high priority messages first and then the low priority messages.
Coming to the non-functional requirements (NFRs). This platform should always be available. Why? Because if you are planning to build it as a SaaS product which can be used by other companies, then downtime would really cost us a lot.
Plus, it should be built in a way that it's easy enough to add more clients. And whenever we want, we should be able to attribute requests, saying which client has made how many requests.
Now, let's look at the overall architecture of the whole system. Just a disclaimer to start with: if you are building it for a small enough use case, let's say you want to send out some email notifications to some customers given some criteria, you can build all of this as one deployable service and put all the logic in one place. But if you truly want to build it as a SaaS product wherein an enormous number of clients would be using it for a lot of notifications, then this kind of a system would probably be worth it.
Let's look at the starting point. The starting point of the whole system is a couple of clients, i.e. Client 1, Client 2, could be any number of clients, who want to send out a notification. Now, there are 2-3 kinds of requests that they can send you. But all of those requests would come into something called a Notification service, which is an interface for us to talk to the other teams in the company, other companies or anyone else we want. There are mainly two kinds of requests. One is where they tell you, I want to send this particular content to this particular user, identified by, let's say, an email id, as an email. The other request could be saying, I have this user id, send them this notification, and you decide how you want to send it, whether as an email or as an SMS or whatever. Normally the first model would be used by other companies, where they want to decide whether to send an SMS or an email or whatever. The second model would generally be used when you are building this to be consumed within your own company. But normally, as a SaaS product, it's good to have both the interfaces.
The idea of the Notification service is that, for most kinds of requests, it will just take the request, put it into Kafka and respond back to the client saying, I have taken the request and I will send the notification in a couple of seconds at max. You could make this a synchronous flow, but that will take a bit more time and will keep the client blocked. So it's probably best to take the request, dump it into Kafka and move on. But let's say that for a very critical scenario you need it to be a synchronous flow. You can basically make the whole process synchronous through API calls. But assuming it's Kafka for now, let's look at what happens next.
This Notification service would just do a very basic set of validations, saying the email id should not be null or the user id should not be null, the content should not be null, something of that sort. There might be a lot of other validations that you want to do for certain kinds of events. All of those validations would happen in this component, which is called the Notification Validator and Prioritizer. It does a couple of things. Validation is one part of it. The main thing it does is decide the priority of a message. So based on some attribute within the message, let's say we'll keep a message type identifier in the request, it will decide the priority of the message. Normally, an OTP is something which will be high priority, because if the user can't log into the system, they can't place an order and things like that. The second highest priority would be a transactional message saying your order has been placed, it will be delivered by so and so time. The third, lowest priority one would be a promotional message wherein you are sending out offers and coupons and stuff like that. So it decides the priority and puts the event into a Kafka topic specific to each priority.
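To make the validate-and-prioritize step a bit more concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the `NotificationRequest` fields, the three message type names and the topic names are all made up for this example, and the function returns the topic name instead of actually publishing to Kafka.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class Priority(IntEnum):
    HIGH = 0    # e.g. OTP
    MEDIUM = 1  # e.g. transactional messages
    LOW = 2     # e.g. promotional messages


# Hypothetical message type identifiers carried in the request.
PRIORITY_BY_TYPE = {
    "OTP": Priority.HIGH,
    "TRANSACTIONAL": Priority.MEDIUM,
    "PROMOTIONAL": Priority.LOW,
}

# Hypothetical topic names, one topic per priority.
TOPIC_BY_PRIORITY = {
    Priority.HIGH: "notifications.high",
    Priority.MEDIUM: "notifications.medium",
    Priority.LOW: "notifications.low",
}


@dataclass
class NotificationRequest:
    client_id: str
    message_type: str
    content: str
    email: Optional[str] = None    # first model: caller names the address
    user_id: Optional[str] = None  # second model: caller names the user


def validate(req: NotificationRequest) -> List[str]:
    """Return a list of validation errors (empty means the request is valid)."""
    errors = []
    if not req.content:
        errors.append("content must not be empty")
    if not req.email and not req.user_id:
        errors.append("either email or user_id is required")
    if req.message_type not in PRIORITY_BY_TYPE:
        errors.append(f"unknown message_type: {req.message_type}")
    return errors


def route(req: NotificationRequest) -> str:
    """Validate the request and pick the topic that matches its priority."""
    errors = validate(req)
    if errors:
        raise ValueError("; ".join(errors))
    return TOPIC_BY_PRIORITY[PRIORITY_BY_TYPE[req.message_type]]
```

In this sketch an OTP request is routed to `notifications.high` and a promotional one to `notifications.low`. A real deployment would hand the request to a Kafka producer at that point.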
So there'll be a topic for high priority messages, a different topic for medium priority and a different topic for low priority. And while consuming, the consumers will first consume the high priority messages and then the medium and low priority ones. The idea is that we don't want any lag in the high priority messages, but it's okay if there's a spike in low priority messages and they take some time.
Now comes something called a Rate Limiter. This component and the next one can be interchanged depending upon the conversation with your interviewer. But I'm keeping the Rate Limiter first because the next component is a slightly heavier component. The Rate Limiter does two kinds of rate limiting. 1) It checks whether the client who's calling me is allowed to call me this many times. 2) It checks whether I'm supposed to send this notification to this user this many times.
Let's go into detail. There could be a subscription that we have with a certain client saying that they can call us maybe 10 times in a second. That would be one kind of rate limiting. The other thing is there could be a configuration saying I cannot send more than three promotional messages to a particular user in a day. That would be another kind of rate limiting. Both of these would be implemented in a very similar way. There would be a key. It could be a client id, a user id, anything. And whenever you get a request, you basically increment that key in Redis for a certain time window. The window need not be an exact timestamp; it could be at the level of a day, a second, a minute or any other level. The moment you exceed the threshold is when you drop the request. There's one more thing that it does, which is called request counting.
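The key-plus-time-window counting described here can be sketched as a fixed-window counter. This is a minimal sketch, assuming a plain in-memory dictionary as a stand-in for Redis; the class name, the key format and the limits are hypothetical. With a real Redis you would typically increment the key with `INCR` and give it a time to live with `EXPIRE`, whereas this sketch never cleans up old windows.

```python
import time
from collections import defaultdict
from typing import Optional


class FixedWindowRateLimiter:
    """Counts requests per key per time window and rejects above a limit."""

    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window_seconds = window_seconds
        # In-memory stand-in for Redis: "<key>:<window index>" -> count.
        self._counters = defaultdict(int)

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # All timestamps inside the same window share one counter.
        window = int(now // self.window_seconds)
        bucket = f"{key}:{window}"
        self._counters[bucket] += 1
        return self._counters[bucket] <= self.limit


# Client-level limit: at most 10 calls per second for a client.
per_client = FixedWindowRateLimiter(limit=10, window_seconds=1)

# User-level limit: at most 3 promotional messages per user per day.
per_user_promo = FixedWindowRateLimiter(limit=3, window_seconds=24 * 60 * 60)
```

The same class serves both kinds of limits; only the key (client id versus user id) and the window size change.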
Let's say that with some of the clients you have a pay-per-use kind of subscription model, saying they can call you any number of times, there is no rate limiting as such, but as the number of calls increases, the amount of money we charge them also increases. So they pay on a per-request basis. For those kinds of clients, we'll implement it in a very similar way. It's just that we will not restrict the requests. We'll keep incrementing the counter, and then there can be reporting built on top of it.
The next component is something called the Notification Handler and User Preferences. First let's look at what user preferences are. A user might give us a preference saying don't send me SMS, send me emails instead. Or a user could have said, I want to unsubscribe from all your promotional messages. All of those would be handled by this User Preference Service. It talks to two components. One is a Preferences DB, which is the single source of truth for all the user preferences that we have in our system. This would mainly be used when we are building it for our own use. Let's say Amazon is building this notification system for their own use. This will normally not be used when you're integrating with third parties. Third parties would handle this piece on their end. The second thing it talks to is a User service. Let's say you got a request saying, to this user id 123, send this particular text. Who is user 123? That's something you can look up in the User service to get the email id or phone number or anything of that sort, given a user id.
This component could also handle one more kind of rate limiting. A user might say, don't send me more than three promotional messages in a week, or more than one promotional message in a week. So if that kind of rate limiting also exists and you want to support it, then this whole Rate Limiting module would come after this Notification Handler.
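As a rough illustration of how the Notification Handler could combine the Preferences DB and the User service, here is a minimal sketch. Everything in it is an assumption for the example: the two dictionaries stand in for the real Preferences DB and User service, and the field names, channel names and sample users are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class UserPreferences:
    disabled_channels: Set[str] = field(default_factory=set)
    promotional_opt_out: bool = False


# Hypothetical stand-in for the Preferences DB.
PREFERENCES_DB: Dict[str, UserPreferences] = {
    "123": UserPreferences(disabled_channels={"sms"}),  # "don't send me SMS"
    "456": UserPreferences(promotional_opt_out=True),   # unsubscribed from promos
}

# Hypothetical stand-in for the User service: user id -> contact details.
USER_SERVICE: Dict[str, Dict[str, str]] = {
    "123": {"email": "user123@example.com", "sms": "+10000000123"},
    "456": {"email": "user456@example.com", "sms": "+10000000456"},
}

CHANNEL_ORDER = ["sms", "email"]


def resolve_deliveries(user_id: str, message_type: str) -> List[Tuple[str, str]]:
    """Return (channel, address) pairs this user should receive the message on."""
    prefs = PREFERENCES_DB.get(user_id, UserPreferences())
    if message_type == "PROMOTIONAL" and prefs.promotional_opt_out:
        return []
    contacts = USER_SERVICE.get(user_id, {})
    return [
        (channel, contacts[channel])
        for channel in CHANNEL_ORDER
        if channel in contacts and channel not in prefs.disabled_channels
    ]
```

For user 123, who has turned SMS off, a transactional message resolves to email only. For user 456, a promotional message resolves to nothing.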
But because the Rate Limiter is a very lightweight thing and the Notification Handler calls a DB and calls an external service, I have kept the handler after the Rate Limiter. Based on your conversation with your interviewer, you could swap these two.
Once we have all of this finalized, we basically have everything we need to actually send the notification. We have a phone number, we know that we'll send it as an SMS and we have the content; or we have the email id, we know we'll send it as an email and we have the content. All of that is put into another Kafka for sending it out. Why do we need so many Kafkas? Because let's say we had a lot of SMSes and the SMS service is not able to handle them. One way is to increase the number of consumers. But if there are only occasional spikes from certain clients, it's probably not worth adding hardware that runs all the time for that. So we might as well put the messages into Kafka, and the SMS Handler can handle things at its own pace.
There are multiple handlers that sit on top of this Kafka. You could decide to build all of these as one deployable unit, but again, with the SaaS (Software as a Service) thing in mind, I'd rather keep each of them as a separate service. Let's take a couple of examples. All the SMS requests that come into this Kafka would be handled by the SMS Handler. Plus, there could be different topics for each of these. So email could have one topic, SMS could have one topic and so on. Now, the SMS Handler might integrate with multiple vendors for sending SMSes. Let's say you are a global company. You have one vendor that deals with SMSes in Asia, another vendor that deals with SMSes in the USA, another vendor for Europe, stuff like that. So you could have multiple vendors that you are integrated with, and all the SMS vendor integration could be done through this SMS Handler. Now, let's say you are in India and you have 5 vendors that work in India.
So which request should go to which vendor? What is the priority? How do we distribute requests amongst them? That piece could also be handled within the handler itself. By the way, the communication between a handler and a vendor could be a sync call.
You have the same thing for emails. You have an Email Handler. The Email Handler takes all the email requests and calls the email vendor, which will send out the real emails. This email vendor could be a simple SMTP server that you have at your end. Then you will have an In-App Handler, which basically handles all the in-app notifications or push notifications that you want to send out. You can use Firebase for Android and probably the Apple Push Notification service for sending out such notifications on the iOS platform. You could also have an IVRS Handler. A classic example would be, I don't know, either Amazon or Flipkart, one of them. When you place a very high value cash on delivery (CoD) order, they actually place an IVRS call to you saying, we have got this order from you, are you really sure you want this as a cash on delivery order, and you can press 1 or 2 to confirm or reject. All of those things could be handled by the IVRS Handler. This is a very low throughput scenario, and it would happen once in a while compared to SMS or emails or anything of that sort. You can again scale these handlers independently based on the kind of traffic you have.
Now coming to pluggability. Let's say somebody comes and says that you need to add WhatsApp also as a notification mechanism. You just need to add a WhatsApp Handler and introduce a type called WhatsApp Messaging at this layer, and all of that can flow all the way through. The Notification Handler will have a new topic for WhatsApp messages, and a WhatsApp Handler will take care of sending those notifications out.
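One way to picture this pluggability is a small handler registry, where adding a channel means adding one class and one registration. This is a minimal sketch under assumptions: the class names, the `channel` attribute and the returned strings are hypothetical, and the `send` methods only format a string where a real handler would call its vendor.

```python
from abc import ABC, abstractmethod
from typing import Dict


class ChannelHandler(ABC):
    """One handler per notification channel."""

    channel: str

    @abstractmethod
    def send(self, recipient: str, content: str) -> str:
        ...


class SmsHandler(ChannelHandler):
    channel = "sms"

    def send(self, recipient: str, content: str) -> str:
        # A real handler would pick an SMS vendor and call it here.
        return f"SMS to {recipient}: {content}"


class EmailHandler(ChannelHandler):
    channel = "email"

    def send(self, recipient: str, content: str) -> str:
        return f"Email to {recipient}: {content}"


class HandlerRegistry:
    def __init__(self) -> None:
        self._handlers: Dict[str, ChannelHandler] = {}

    def register(self, handler: ChannelHandler) -> None:
        self._handlers[handler.channel] = handler

    def dispatch(self, channel: str, recipient: str, content: str) -> str:
        handler = self._handlers.get(channel)
        if handler is None:
            raise KeyError(f"no handler registered for channel: {channel}")
        return handler.send(recipient, content)


registry = HandlerRegistry()
registry.register(SmsHandler())
registry.register(EmailHandler())


# Adding WhatsApp later touches nothing above: one new class, one registration.
class WhatsAppHandler(ChannelHandler):
    channel = "whatsapp"

    def send(self, recipient: str, content: str) -> str:
        return f"WhatsApp to {recipient}: {content}"


registry.register(WhatsAppHandler())
```

In the architecture from the video each handler would be its own service consuming its own topic; the registry here only mirrors that idea inside a single process.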
So that's how you can make it fairly pluggable.
Now coming to what this Notification Tracker is. You always need to keep track of all the notifications you have sent. In case somebody sues you or there is some auditing required, you need to know what communication you have sent out. For all of that, you have a Notification Tracker which stores all the notifications that you have sent out in its own Cassandra. This is normally a write-only thing, which would be read once in a while when there's an audit or something of that sort.
Depending upon your traffic and throughput, you could decide to club some of these components. For example, you can decide to club all of these components as a single deployable service. You could also club both these components into a single deployable service, and you can move the Prioritizer right there and have fewer components, or try to club all of this as one service if it's a very low throughput thing. You can decide that based on your conversation with your interviewer.
Now, let's do something different. What if you want to send bulk notifications? What if you want to do something like: for all the users who have ordered a TV in the last 24 hours, send them a notification about an installation service or anything of that sort. Or for all the users who ordered milk 3 days back, send them a notification asking whether they want to order the same thing again. All of those things come under something called bulk notifications. Now, how do you do that? The very first thing is there would be a UI, which is represented as the Bulk Notification UI, which will talk to something called the Bulk Notification service, which takes a filter criteria and a notification and sends it out. The filter criteria would be something like: find all the users who have placed a milk order in the last three days and send them a notification with certain text.
How does that work? There's something called User Transaction Data. This basically abstracts out a lot of services which are outside the scope of this notification service but are real business functional services. For example, again taking the example of Amazon, let's say there are a couple of services that handle their e-commerce orders. There are a couple of services that handle their pantry orders. All of those services would be putting their transactional information, that your order has been placed, it has been delivered, it has been canceled, all of those kinds of attributes, into various Kafka topics. What we could do is build a search functionality on top of those topics. Normally, in most companies, you do have that kind of product already built. But assuming you don't have that, let's try to build it. And if we do have it, we can just leverage it.
What will happen is this will put information into a lot of Kafkas, a lot of topics. I am abstracting it out as this one Kafka. All of that information would be consumed by something called a Transaction Data Parser. This Transaction Data Parser, first of all, parses all of that information. Because you have multiple clients and multiple services sending data, they could be putting in multiple formats of JSON or XML. For example, pantry (Amazon) could have one signature and regular e-commerce could have a different signature. The data parser takes all of those messages, converts them into a format that it understands and then puts them into its own data store. It could be Elasticsearch or a MongoDB kind of a structure where you could do complicated aggregations and nested queries. On top of this data store, there's something called a Query Engine.
That Query Engine basically takes a query, which could be an aggregation plus a filter kind of a thing, saying, find all the users who are in Bangalore, let's say, who have ordered some food item in the last few days, or find all the people who have some items in their cart but have not placed the order. Any kind of filter criteria that you want to build would be powered by this Query Engine. It would have its own DSL (Domain Specific Language), which would be a format for how you structure your query, and that would be the signature that it exposes. Now, this Query Engine would run that query on this data store. So it also understands the schema of this data store, and it would return us a list of users that match the criteria. Now this Bulk Notification service has a list of users that it got from the Query Engine and a message that it got from the UI. It basically calls the Notification service saying, send this notification to all of these users.
This Query Engine is not something which would be used only by this Notification service. It could be used by a lot of services. You could have a very simple Rule Engine which says that if a user has cancelled 10 consecutive orders in the last 3 days, go deactivate their account. They are probably, you know, committing fraud. Or there could be a simple Fraud Engine running which has some criteria and takes some action based on that. All these condition-and-action kinds of things, wherever you have a condition being evaluated and an action happening, could be powered by this kind of a thing. In fact, your Notification service itself is that kind of a thing. You have the filter criteria as your condition, and the action is basically the notification sending process. So this could also be a part of your Rule Engine.
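The bulk flow (filter criteria in, list of users out, one notification per user) can be sketched like this. It is a minimal sketch under assumptions: the in-memory `ORDERS` list stands in for the Elasticsearch or MongoDB data store, the tiny dictionary-based query format is a made-up stand-in for the Query Engine's DSL, and `notify` stands in for a call to the Notification service. All the sample data is invented.

```python
from datetime import datetime, timedelta
from typing import Any, Callable, Dict, List

# Hypothetical normalized records, as the Transaction Data Parser might store them.
ORDERS: List[Dict[str, Any]] = [
    {"user_id": "u1", "city": "Bangalore", "category": "food", "ordered_at": datetime(2023, 10, 16)},
    {"user_id": "u2", "city": "Bangalore", "category": "tv", "ordered_at": datetime(2023, 10, 16)},
    {"user_id": "u3", "city": "Mumbai", "category": "food", "ordered_at": datetime(2023, 10, 15)},
    {"user_id": "u1", "city": "Bangalore", "category": "food", "ordered_at": datetime(2023, 10, 1)},
    {"user_id": "u4", "city": "Bangalore", "category": "food", "ordered_at": datetime(2023, 10, 10)},
]


def find_users(query: Dict[str, Any], now: datetime) -> List[str]:
    """Return distinct user ids whose orders match the filter criteria.

    Query format (an assumption for this sketch):
      {"equals": {field: value, ...}, "within_days": N}
    """
    cutoff = now - timedelta(days=query["within_days"]) if "within_days" in query else None
    matched: List[str] = []
    for order in ORDERS:
        if any(order.get(f) != v for f, v in query.get("equals", {}).items()):
            continue
        if cutoff is not None and order["ordered_at"] < cutoff:
            continue
        if order["user_id"] not in matched:
            matched.append(order["user_id"])
    return matched


def send_bulk(query: Dict[str, Any], content: str,
              notify: Callable[[str, str], None], now: datetime) -> int:
    """Condition (query) plus action (notify): returns how many users were notified."""
    users = find_users(query, now)
    for user_id in users:
        notify(user_id, content)
    return len(users)
```

Because the action is passed in as a function, the same condition-plus-action shape would also fit the Rule Engine example: the action could deactivate an account instead of sending a notification.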
Something like: 5 days after an order has been delivered, send a mail to the consumer asking for a product review. So something of that sort. And this could then also be used to build a Search platform altogether. But overall, if this is something that is already built in the company, you could just leverage that and have this Bulk Notification service talk to the querying layer of that platform.
So, yeah, I think that should be it for a Notification service. Thanks for watching.
Notification Service System Design Interview Question to handle Billions of users & Notifications (B1 intermediate; posted by meowu on October 17, 2023)