hello viewers today's video is a bit different from the rest of my system design interview videos
in today's video i have asked a friend of mine who is an ex microsoft and ex airbnb employee
to conduct a mock interview and he will be asking a system design interview question which
he hasn't told me yet so i haven't prepared for what he will be asking me and so i will
try to design this whole system in one go so please consider this youtube video as an example
of how to take a system design interview especially when the interviewer asks you something
which you haven't prepared in advance and you will see in this video during the mock interview how
i am thinking out loud how i am going through all the different things how i'm expressing
my opinion with confidence how i'm discussing different design approaches and why one is better than
the other and you will also see that even in the interview there are some things which i don't
personally like but i still did them for example the back of the envelope
calculation i don't personally prefer it but since the interviewer asked me to
perform the back of the envelope calculation instead of arguing with the interviewer
i did perform a back of the envelope calculation very quickly so i hope that you
will find this video useful so let me know in the comments below if you find this video useful and
if you find it valuable and you learn something about how to tackle a system design interview
by going through this mock interview then please also like this video and if you haven't subscribed
to my channel yet then please subscribe to this channel and click the bell icon so that you get
notified when i upload new videos also please note that i have not edited this
mock interview video so that you will have the experience of a real mock interview so let's go and
see the mock interview hi let's do a mock design interview and i want you to design
youtube okay so before designing youtube i think we should first
try to scope down the whole big system and see what are the different
functional and non-functional requirements that youtube has and that we would like to
tackle in this interview so let's see whether we have a text box or something yes
okay so let's see what are the different functional requirements
that we can think of for youtube so first of all i think we would like the user to
have an account of course a user needs a gmail account in order to upload
videos to youtube so user account the second functional requirement i would say is
that we would like the user to upload a video so upload video functionality and in the same
way once the user has uploaded the video he should be able to provide some metadata
information about the video for example the title the description the tags etc so
metadata information about the video so this functionality is only related to the
user who is a creator or who is a youtube video creator but
let's now discuss the functionality and the requirements from the perspective of a user
who's watching different youtube videos so the very first functionality i would say would be
that the user should be able to search for a youtube video
and of course once the user finds the
video then the user should be able to play the video and of course
all the different actions that the user can perform like fast
forwarding the video moving to a particular time in the video etc apart from that
we would also like youtube to recommend videos to the user
so on a user's home page the user could see different videos recommended by youtube
at the same time users should be able to subscribe to different channels on youtube
and of course this is all the functionality for a user watching videos and this is the
functionality for a user creating or publishing videos and of course here
we could also say that the user should be able to create
a youtube channel and in the channel he can share different youtube videos
so i think what do you think i think we can keep all the
functional and non-functional requirements for this interview i think we can just keep these
functional requirements they look good yeah yeah
so now let's see what are the different non-functional requirements that youtube could
have so i think the very first requirement is always that we want our system to be highly available
we don't want youtube to go down and we don't want the user to
see a message that the service is down and we would also like our system to be fault
tolerant so it's totally possible that some servers in youtube may go down but it will not
affect the overall functionality or overall availability of the service
we would also like our service to be highly scalable as the load increases as the number
of users increases as the number of videos increases we would like our system to be highly scalable
we would like our youtube service to be durable so once a video has been uploaded to youtube we don't
want the video to be lost and we would like to keep that video in the system
then as far as consistency is concerned
for the consistency requirements i would say
there will be different consistency requirements for different parts of youtube for example when
a user is uploading a video then after uploading the video he should be able to
see the video that he has uploaded so we need somewhat strong consistency there
but when a user is searching for a video and let's say another user uploads a video at
that time and that video does not appear in the search it's totally fine because at that level
we can have some sort of eventual consistency similarly when a user makes a video public
at that time it's totally possible that some users will see that the video is public and some may not
but i think here also it would be okay to have eventual consistency where some users
would see that the video is public a bit later it's fine i think from the consistency requirement
yeah so i think these would be some functional and non-functional requirements
i would also say that of course when we are designing this youtube service we will
have some monitoring and analytics service because as part of service development
and also operations of the service we always need some monitoring and analytics
which will tell us the health of the service the load on the service
etc so which i'm just mentioning right now if you want we can
go into that detail as well but i think right now i will keep the focus only on
these functional and non-functional requirements what do you think i
think that's the right approach we can discuss monitoring later if we have time okay cool so
usually i know that there are people who prefer some back of the envelope
estimation but usually i don't want to spend time
doing this back of the envelope calculation because i think that we should
first try to design the service based on the functional and non-functional requirements and try to
keep the design flexible enough so that if it needs to be highly scalable and available we can
make it so and as far as the estimation is concerned of course we can go and spend
five ten minutes doing that but eventually i think at the end what we would need
is basically just to know okay if let's say you're uploading videos what would
be the size of the video how much storage we would require and of course these would be
estimations and we will have an analytics service to keep an eye on all those metrics
and that's why we can start with some initial storage and we can always increase that that's
why i don't want to waste time on that but if you would like i can go into back of the envelope
estimation as well otherwise we can go directly to the high level design and then the detailed design
so is the estimation useless like is doing the cost estimation really never helpful or
is it going to be useful at some point later i mean it is usually useful when you are designing
your service in real life it would be a bit useful because we would like to know at the
beginning what scale we could have of course it will be just an
estimation and the real numbers might be different but it will help you decide okay
do i need to allocate let's say three app servers or six app servers or 10 app servers
for my service based on the load that i'm predicting will happen from day one
because as far as adding new servers is concerned we can always add new servers
if we see that the load is increasing and we can also start with some bare minimum number of
servers that we really need for the service okay let's do some five minutes
of cost estimation some estimation of numbers so that we have some data available
okay let's do some cost estimation
so let's just assume that right now i don't know what would
be the current number i think how many users youtube has maybe
i would say almost a billion users but as far as the users who are uploading videos
i'm not sure how many of them there are right now let's consider maybe i would say
100 million let's say there are 100 million users who are publishing videos okay
and let's suppose they are publishing a video on average i would say every week
like each user is uploading a video once a week for example once a week video upload
and let's say in that case a video on average well nowadays you will find videos from
under a minute to more than hours like almost six seven
eight hours of video you will find on youtube but let's just assume right now that on average
a video will be around 15 to 20 minutes let's say a 20 minute video on average there will be videos
which will be of course less than 20 minutes and there will be videos which will be more than 20
minutes but let's just assume that on average there will be a video of length 20 minutes
and of course it will be using let's say 1080p and usually when you
upload a video on youtube youtube saves those videos at multiple different bitrates in
different formats so for example if you upload a video at 1080p youtube usually
stores that video not just at 1080p but also at 720p and 480p and 320p etc because we will
discuss later also that youtube uses adaptive bitrate streaming to make the
user experience good
this is also i think one of the non-functional requirements that we didn't discuss that we
would like the experience of the user to be fast we would like video playback to be fast
so it means that the server would be able to recognize the client
so the client will inform the server about the bitrate that it has and also the
screen size etc and the server can send the right size video to the client
so apart from that i think so let's say if we are doing a 20 minute average 1080p video
so usually as i said there are videos like usually a 20 minute video
depending on the bitrate could take around on average
10 to 20 gb if it's a very high quality video so let's say on average we say it's a
10 gb video and of course youtube also stores this video in other different
formats as well like if you're uploading at 1080p youtube also stores
the other formats of the video which have reduced resolution like 720p etc so usually i
would say on average 10 gb for that around 4 to 5 gb for 720p around 1 gb for i would say
let's say 480p videos so usually i would say on average let's just say that there will be around
15 to 20 gb of storage that we will have for a video and now depending on
let's say 100 million videos per week so you can say 100 million multiplied by almost let's say 20
gb which is almost 2 exabytes of video per week
that would be uploaded and based on that we can see how much video we would see
let's say in a month or in a year for example do you want to go further into that or should we
go to the high level yeah i think this is good enough we can move to the next section okay
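The estimate above can be reproduced as a quick calculation. This is a minimal sketch that uses only the figures assumed in the interview (100 million uploads per week, roughly 20 GB stored per video across all renditions). The constant and function names are illustrative, decimal units are an assumption, and none of the numbers are measured YouTube data. With these inputs the weekly total comes out on the order of exabytes.

```python
# Back-of-the-envelope storage estimate using the figures assumed in the interview.
# All constants are the speaker's assumptions, not measured YouTube data.

UPLOADS_PER_WEEK = 100_000_000   # 100 million creators, one upload per week each
GB_PER_VIDEO = 20                # ~20 GB stored per video across all renditions
GB_PER_EB = 1_000_000_000        # decimal units: 1 EB = 10^9 GB (assumption)


def weekly_storage_gb(uploads_per_week: int, gb_per_video: int) -> int:
    """Total new storage needed per week, in gigabytes."""
    return uploads_per_week * gb_per_video


def yearly_storage_gb(uploads_per_week: int, gb_per_video: int, weeks: int = 52) -> int:
    """Total new storage needed per year, in gigabytes."""
    return weekly_storage_gb(uploads_per_week, gb_per_video) * weeks


weekly_gb = weekly_storage_gb(UPLOADS_PER_WEEK, GB_PER_VIDEO)
yearly_gb = yearly_storage_gb(UPLOADS_PER_WEEK, GB_PER_VIDEO)

print(f"per week: {weekly_gb / GB_PER_EB:.0f} EB")
print(f"per year: {yearly_gb / GB_PER_EB:.0f} EB")
```

Changing either constant shows how sensitive the result is to the assumed video size, which is the main reason such numbers are treated as a starting point to be corrected by monitoring.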
so let me draw the high level diagram i think is there a way to make it bigger
okay so i am going to draw the high level diagram so there are multiple parts if you see
from the user's perspective the youtube service will be comprised of multiple
microservices there will be an upload service this upload service will be responsible for uploading
a youtube video and we will go further into it and see
what the functionality of the upload service is how it works etc apart from that there will be of course
let me copy
there will be
a user service
there will be of course
i would say a channel service
a user can have multiple channels so the channel service will hold all the
information about the channels the user has
apart from the user service and the channel service there will also be other services like let's say
a search service
there will be
a recommendation service
so these will be the different services
that we will see in our system and if you want we can go into the
design of these services one by one and see how these services will work together
yeah let's go so usually what happens is that when we upload a video
so we will go and see how the upload service works and
then we will see onward from there let me draw arrows
so once a video is uploaded that information goes to the
channel service as well also the same information goes to the search service because
once the video is uploaded the video is searchable
and of course the same video can be recommended to other users also so it also
goes to the recommendation service and we will see the design of all those things like how
these are implemented okay so if you like we can first go and see the design of the upload service yeah
okay
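The arrows described above, where one upload is announced to the channel, search, and recommendation services, can be sketched as a simple publish and subscribe fan-out. This is only an illustration under assumptions: the `EventBus` class, the `video_uploaded` topic name, and the event fields are hypothetical, and a real deployment would use a message broker between separate services.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventBus:
    """Hypothetical in-process stand-in for messaging between microservices."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # Deliver the same event to every service that subscribed to the topic.
        for handler in self._subscribers[topic]:
            handler(event)


# Records which service saw which video, so the fan-out is easy to inspect.
received: Dict[str, List[str]] = defaultdict(list)


def make_service(name: str) -> Callable[[dict], None]:
    def handle(event: dict) -> None:
        received[name].append(event["video_id"])
    return handle


bus = EventBus()
for service in ("channel", "search", "recommendation"):
    bus.subscribe("video_uploaded", make_service(service))

# The upload service announces a finished upload once; all three services hear it.
bus.publish("video_uploaded", {"video_id": "v1", "title": "designing youtube"})
```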
so usually the upload service if you check
will be comprised of multiple app servers so
there will be at least three or more app servers here i'm just going to draw
two app servers but there will be multiple and they will be behind a load balancer
so they will be behind a load balancer
apart from this there will be an object storage service where the video will be uploaded
so
it's like s3 for example and in the same way we will also have some distributed
queue as well so let me see whether i can draw a distributed queue is there a suitable
shape for it let's see right now let's just say this one is a distributed queue
so usually what happens is that
when the user is uploading a video at the client side
the client app in that case the browser for example when you open
the youtube studio inside the youtube studio when you upload a video the video
gets uploaded to the object storage so usually what happens is that if you
have done this you will see that once you start uploading the video you will see a
message in your browser saying please do not close this tab because you don't want
to drop the connection that your browser has with this app server and usually what
happens is that even if the connection gets dropped in between the browser then makes
another connection and that connection can go to any other app server also so what happens
right now once this connection is established from the user's browser let me draw it also here
this is a user browser you will be uploading the video
in the form of small packets or segments that will be uploaded to the object storage and
once the upload is done the app server at that
point sends a message into this queue this is a distributed queue
it will insert a message in this distributed queue informing about
the video that has been uploaded into the storage
and now there are multiple agents that are listening to this queue and what they do is
they post process this video and what those agents do is that whenever they find a
message in the queue for a video that has been uploaded they go to that video they
break that video into different small segments and they do the process of transcoding so transcoding
is basically if you go into the detail you will understand that
usually when you upload a video you upload it in some format it might be mp4 it might
be some other video format and usually if it's mp4 it's some sort of i would say compressed video
format so in the process of transcoding they first decompress the video to get the raw video
and then they encode the video again into another compressed format and of course as i told
you whenever we upload a video on youtube let's say you are uploading a
video at let's say 1080p youtube does not only store that video but it also post
processes this video after the upload and also stores other formats of this video with
lower bitrate as well like 720p etc and it also depends on not just the bitrate
this is i would say important later on because what
happens is that later on the client that is trying to access the video could be either a desktop or
could be a smartphone or mobile phone and in that case the client could have different
screen sizes or the client could support different sizes and that's why youtube
after the user uploads the video post processes that video and does this transcoding from
the original format that the user has uploaded into different other formats so what happens now
these app servers these are the upload servers and once they upload the video into object storage
they will insert a message in the queue and then there will be agents that will be listening to
that queue there will be multiple agents and those agents what they do is they
do the transcoding on that video and they also at the same time divide that video into different
small segments and usually these segment sizes are from i would say three to ten seconds
and then they store them in the object storage again in those small chunks
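The flow just described (app server stores the raw upload, posts a message to the queue, and an agent picks it up and writes small segments back to storage) can be sketched as follows. This is a minimal single-process illustration under assumptions: a dict stands in for object storage, `queue.Queue` stands in for the distributed queue, the key layout is hypothetical, and splitting by byte count stands in for splitting into three to ten seconds of playback. Transcoding is left out here.

```python
import queue
from typing import Dict, Iterable, List

object_storage: Dict[str, bytes] = {}       # stand-in for S3-like object storage
upload_queue: queue.Queue = queue.Queue()   # stand-in for the distributed queue

SEGMENT_BYTES = 4  # tiny for the demo; real segments hold 3 to 10 seconds of video


def upload_video(video_id: str, chunks: Iterable[bytes]) -> None:
    """App server: collect the uploaded chunks, store the raw video, then enqueue a message."""
    key = f"raw/{video_id}"
    object_storage[key] = b"".join(chunks)
    upload_queue.put({"video_id": video_id, "key": key})


def process_next_upload() -> List[str]:
    """Agent: take one message, split the raw video into segments, store each segment."""
    message = upload_queue.get()
    raw = object_storage[message["key"]]
    segment_keys = []
    for index, start in enumerate(range(0, len(raw), SEGMENT_BYTES)):
        key = f"segments/{message['video_id']}/{index:05d}"
        object_storage[key] = raw[start:start + SEGMENT_BYTES]
        segment_keys.append(key)
    upload_queue.task_done()
    return segment_keys


# The browser sends the video in small pieces; the agent later segments the stored copy.
upload_video("v1", [b"abcd", b"efgh", b"ij"])
segment_keys = process_next_upload()
```

Because the agent only needs the message and the storage key, any number of identical agents could consume from the same queue.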
so this is what the upload service would look like what is the need to make them in small chunks
to write them in small chunks okay yeah i can tell you about this so this is the concept this
is something which is called adaptive bitrate streaming so let me
adaptive streaming
so let me give an example of what happens in adaptive bitrate streaming
let's suppose we have a video and it has different segments let's say segment a
b c let's say this is in 1080p the same segments of the video also get
stored at let's say a lower resolution which is let's say 720p so these upper ones
are the 1080p these are the 720p and then also there will be another one for 480p
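A minimal sketch of how a client might pick among the renditions just listed for each segment, based on the network speed it measures. The bitrates in the ladder, the 0.8 safety factor, and the names `pick_rendition` and `plan_playback` are illustrative assumptions, not YouTube's actual player logic.

```python
from typing import Dict, List

# Rendition ladder: resolution -> bitrate (kbit/s) needed to stream it without stalling.
# The bitrates are illustrative assumptions, not a real encoding ladder.
LADDER_KBPS: Dict[str, int] = {"1080p": 5000, "720p": 2500, "480p": 1000}


def pick_rendition(measured_kbps: float, safety: float = 0.8) -> str:
    """Pick the highest rendition whose bitrate fits within the measured throughput."""
    budget = measured_kbps * safety
    by_quality = sorted(LADDER_KBPS.items(), key=lambda item: item[1], reverse=True)
    for name, kbps in by_quality:
        if kbps <= budget:
            return name
    return by_quality[-1][0]  # nothing fits: fall back to the lowest rendition


def plan_playback(segments: List[str], throughput_kbps: List[float]) -> List[str]:
    """Choose a rendition per segment from the throughput seen just before fetching it."""
    return [f"{seg}@{pick_rendition(kbps)}" for seg, kbps in zip(segments, throughput_kbps)]


# Segment a is fetched on a fast link, b after the link slows down, c on a poor link.
plan = plan_playback(["a", "b", "c"], [8000, 3500, 900])
print(plan)
```

Because every rendition is cut at the same segment boundaries, the client can change quality between any two segments without restarting playback.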
now usually what happens is that let's say a client comes
let's suppose that there's a client i'm just going to
this is a client machine and the client initially when it starts up
when it starts downloading the video it says okay i'm on a very good network connection and
it's a laptop so it starts with this one let me copy
so this is 1080p this is 720p this is 480p so the client starts downloading this chunk a or
segment a and starts playing and while it's doing that it realizes that its network speed is not
that good so in that case what happens is that it can always switch from downloading this b at 1080p to
this b at 720p okay because let's suppose the client connection is slow then
after playing this segment a if this segment b is still being downloaded and is not yet downloaded then
what will happen the video will get stuck at a place and you will see an hourglass and this is a
bad experience for the user and that's why what youtube does is that it
uses adaptive bitrate streaming and this works between the server and the client
they both work together in that case and the client from time to time checks how many
segments it has downloaded and what is the current network speed that it is
seeing and based on that it can tell the server okay i don't want this
b segment at 1080p i can take the one from 720p because my connection
speed is not that good yeah that's good so let's move back to the upload
design is everything complete there or is there anything else needed in the upload
service so i think there are multiple agents pulling these events from the queue
are these all the same kind or are there different kinds of agents doing some other stuff as well so
now it depends let me see these are the different agents so these agents which are
here there are multiple types of them the first of them is basically the ones
which are doing the transcoding so there will be transcoding agents which are doing transcoding
apart from that there will be other agents also which
i think work before these transcoding agents what they will be doing is
they will be breaking this video into different segments and then
putting them again in the queue so that they can be further processed so let's say
an agent will break a video let's say it's a big video and it will break it
into let's say a hundred or a thousand segments depending on the
size of the video and it will upload those segments into
s3 or blob or object storage and it will also insert a notification in the queue and
then there will be other agents which will then receive those messages from the queue
and based on those they will start transcoding each segment individually in parallel
okay in this case what will happen is that multiple agents would be transcoding different segments
of a video in parallel okay so that would be one thing and so once all of them have
and that's why you will see usually when you upload a video it takes some time to
post process it depends on the number of agents that are currently working and of
course there will be multiple users who will be uploading their videos so it all depends on
the load factor and of course if we see that the load is increasing we
can always introduce more agents because these agents are nothing but just stateless
hosts or components which just get data from the queue and perform the
transcoding on the segment and then upload the segment into s3 again so this will
be one thing apart from that once they have completed the transcoding at that time
what we can do is these agents can also inform the
channel service where they will say okay now this video has been uploaded and it's
now available usually when you upload a video when you start uploading the video there will be
a message that will be sent to the channel service and so you will see
that there is a video that is getting uploaded and you can go and update all the metadata
information for that video in the channel service but still you won't be able to do anything else because
right now as far as the video is concerned you will see that the video is still being uploaded
but once the video has been uploaded and
all the transcoding agents are done with transcoding all the segments of the video
at that time they can notify the channel service that okay now this video is ready to
be shared so at the end they can send the message to the channel service again
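The parallel transcoding step and the final notification can be sketched like this. It is a minimal illustration under assumptions: a thread pool stands in for the fleet of stateless agents, `transcode_segment` is a placeholder that only tags the bytes rather than a real decode and re-encode, and `notify_channel_service` is a hypothetical callback standing in for the message sent to the channel service.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

RENDITIONS = ("1080p", "720p", "480p")


def transcode_segment(segment: bytes, rendition: str) -> bytes:
    """Placeholder for a real transcoder (decode, then re-encode at the target rendition)."""
    return rendition.encode() + b":" + segment


def transcode_video(
    video_id: str,
    segments: List[bytes],
    notify_channel_service: Callable[[str], None],
    workers: int = 4,
) -> Dict[Tuple[int, str], bytes]:
    """Transcode every (segment, rendition) pair in parallel, then announce completion once."""
    jobs = [(index, rendition) for index in range(len(segments)) for rendition in RENDITIONS]

    def run(job: Tuple[int, str]) -> bytes:
        index, rendition = job
        return transcode_segment(segments[index], rendition)

    # Each job is independent, so workers need no shared state between them.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outputs = list(pool.map(run, jobs))

    # Only after all segments are done is the video reported as ready to share.
    notify_channel_service(video_id)
    return dict(zip(jobs, outputs))


notifications: List[str] = []
results = transcode_video("v1", [b"s0", b"s1"], notifications.append)
```

Raising `workers` mirrors adding more agents when the load increases, since no job depends on another.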
so let's talk about the search service how will the search service interact with the
videos or how will it index them okay yeah we can go into that design now so how much time do we have now
944 okay we have 10 minutes you have 10 minutes okay let me try to then quickly let me just
so usually as far as the
search service is concerned we can divide the search service into
i would say two parts the first part is basically the part which generates
the search index for the video so what happens is that once the video is
uploaded a message is passed to the search service that there is a new video available
and usually this is the case when the video is made public
if the video is kept private of course there's no point sending it to the search service
because the video is private but once the video is made public at that point
this will be the channel service where you will make the video public and the channel
service will notify the search service that a video has been made public now
and at that time what happens is in the search service there will be two
parts let me
search indexer
and the other part would be
i would say so the search indexer is the one which is responsible for creating the
inverted index for the video i will let you know what
the inverted index is but there will be a search indexer and then i would
say let's give it a name search finder for example where this is the
service or microservice that will be serving the user search requests
so usually what happens now is the channel service notifies the search index service the
search index service will be again different app servers behind a load balancer and what it will
do is it will provide some information about the youtube video that information will contain
the description the title and the keywords that the user has specified while uploading
the youtube video and now what this search index service will do is first of all it will
try to figure out what are the different so first it has all the tags which
the user has specified these are the keywords already it will also go through the description
and the title to get the tags or the keywords for which
it can create the search index so these are the keywords or the search terms for which
the search index service will generate the inverted search index in that case so
what happens now let's say we have a video let's say youtube video a
that was passed to it the search indexer will go through the description
and the title and it will create these terms and now it will store
this information in an inverted index and how the inverted index looks like let me
give you an example so this information we can store in let's say some data store
and that data store will be partitioned there are now two ways we have to partition
the data store for example we can partition the data store based on the
document itself or the youtube video itself so in that case what will happen is all the
search keywords there will be an index on the search keywords for that youtube video
that will be stored in the same let's say partition let me copy it this is a data store
and in the data store we have let's say
we have a table let's say a video table
and what this video table has if you go and check this table
the video table has let's say a video id and then of course other information like title
path to the video for example etc and then keywords
and so this now depends on how we shard this and this data store will be sharded by let's
say video id and what happens now is that when this search indexer creates an index for the video
these keywords basically have a secondary index on them as well
there will be a secondary index on these keywords
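A minimal sketch of what the search indexer does with a video's title, description, and tags: extract terms and record, for each term, which video ids contain it (the inverted index). The tokenizer, the stop word list, and the function names are illustrative assumptions; the interview does not specify how terms are extracted.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Set

# A tiny illustrative stop word list (assumption); real systems use larger ones.
STOP_WORDS = {"a", "an", "the", "to", "of", "and", "how"}


def extract_terms(title: str, description: str, tags: Iterable[str]) -> Set[str]:
    """Lowercase the text, split it into words, and drop stop words."""
    text = " ".join([title, description, *tags]).lower()
    return {word for word in re.findall(r"[a-z0-9]+", text) if word not in STOP_WORDS}


# Inverted index: term -> set of video ids whose metadata contains that term.
inverted_index: Dict[str, Set[str]] = defaultdict(set)


def index_video(video_id: str, title: str, description: str, tags: Iterable[str]) -> None:
    """Add one video to the inverted index under every term found in its metadata."""
    for term in extract_terms(title, description, tags):
        inverted_index[term].add(video_id)


index_video("a", "Designing YouTube", "How to design a video platform", ["system design"])
index_video("b", "Designing Twitter", "A system design mock interview", ["system design"])

print(sorted(inverted_index["designing"]))
```

A lookup by term is then a single dictionary access, which is what makes the index "inverted" relative to the video table keyed by video id.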
the video id is the primary key and now we have two approaches to store these
secondary indexes the first approach is that
in the partition where the video data is stored we also store all the keywords or the terms
for that video in that partition so let's say there was a youtube video about
designing youtube so it has a keyword like designing youtube and that keyword which is a secondary index
will be stored here in the same partition where the video is stored
and now in that case what happens is that whenever a search needs to be done
let's say a user is searching for designing youtube then the search finder will
send the search query and it will scan all the different partitions
for all the videos which have this keyword designing youtube in them so in this case
the search finder will send the search query and scan
all the partitions in parallel this is one approach the other approach we have is basically
that we have the secondary indexes these are also
so the first approach we already discussed is called a local index where the indexes are
stored locally there's another approach a global index where we partition the data store
also by these keywords or terms so for example if designing youtube is
a term it will be stored in one partition and so all the information about
all the videos which have this keyword designing youtube will be stored in one partition only
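The two partitioning approaches can be compared side by side. The sketch below is an in-memory illustration under assumptions: three partitions, a CRC32 hash to place keys, and hypothetical class names `LocalIndex` (partitioned by video id) and `GlobalIndex` (partitioned by term). Each `search` also returns how many partitions it had to contact, and each `write` returns how many partitions it touched.

```python
import zlib
from typing import Dict, Iterable, List, Set, Tuple

NUM_PARTITIONS = 3  # illustrative assumption


def partition_of(key: str) -> int:
    """Stable hash so the same key always maps to the same partition."""
    return zlib.crc32(key.encode()) % NUM_PARTITIONS


class LocalIndex:
    """Partitioned by video id; each partition indexes only the videos it stores."""

    def __init__(self) -> None:
        self.partitions: List[Dict[str, Set[str]]] = [{} for _ in range(NUM_PARTITIONS)]

    def write(self, video_id: str, terms: Iterable[str]) -> int:
        shard = self.partitions[partition_of(video_id)]
        for term in terms:
            shard.setdefault(term, set()).add(video_id)
        return 1  # a write always touches exactly one partition

    def search(self, term: str) -> Tuple[Set[str], int]:
        hits: Set[str] = set()
        for shard in self.partitions:  # scatter to every partition, then gather
            hits |= shard.get(term, set())
        return hits, len(self.partitions)


class GlobalIndex:
    """Partitioned by term; every video for a given term lives in one partition."""

    def __init__(self) -> None:
        self.partitions: List[Dict[str, Set[str]]] = [{} for _ in range(NUM_PARTITIONS)]

    def write(self, video_id: str, terms: Iterable[str]) -> int:
        touched: Set[int] = set()
        for term in terms:
            target = partition_of(term)
            self.partitions[target].setdefault(term, set()).add(video_id)
            touched.add(target)
        return len(touched)  # a write may touch several partitions

    def search(self, term: str) -> Tuple[Set[str], int]:
        shard = self.partitions[partition_of(term)]
        return set(shard.get(term, set())), 1


videos = {"a": {"designing", "youtube"}, "b": {"designing", "twitter"}, "c": {"cooking"}}
local_index, global_index = LocalIndex(), GlobalIndex()
for video_id, terms in videos.items():
    local_index.write(video_id, terms)
    global_index.write(video_id, terms)
```

Both indexes return the same videos for a term; what differs is where the cost falls, on the write for the global index and on the search for the local one.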
so in that case what happens is that there are different pros and cons
in both these approaches in the first approach the search indexer only writes to a single partition
both the information about the video and its keywords on which we can perform the search
and then the search finder scans all the partitions when searching for some term
and in the second approach the search indexer will for each keyword
go to the partition of that keyword or term and will write information about the video there
so in that case the write will be a bit slower but as far as the search is concerned
the search will based on the keyword just go to a single partition and
read the data from there so what is the scheme for the first partition what
are you choosing to partition on in the first strategy are you using the video id or title
how are you partitioning that data so that's what in the inverted index case
that's what i'm saying what will be happening is that
as i said this is the inverted search index and inverted means that there are some terms let me
say there are some terms or keywords mapped to a video id we need that so in the first approach
we were partitioning everything by video id okay so that's why it's totally possible
some videos are stored in one partition and some other videos are stored in another partition
and when a video a is stored in one partition all the terms which that video has also
get stored in the same partition or on the same machine in that case for example yeah okay that is
the first approach and that's why the write is way faster but the searching is slower
because that means if a user is searching for a video designing youtube then you have to search
all the different partitions you have to send the search query to all the partitions and then
all the partitions which have any video with the term designing youtube
will return you the results okay of course then you can perform the ranking on
which strategy should we use first or second so i think as far as the
writes are concerned so there are other issues as well first of all let's say
if we use the second approach where search terms are written to one partition then it's totally
possible that there could be some search terms which are hot search terms so there are a lot of videos
that are associated with those search terms in that case that partition will become very hot
yeah and it also depends on whether a lot of people are searching for something yeah also
the same partition will become very hot in that case so that's why i would say you know
we have to decide of course on one approach and usually in that case the easiest would be i
would say that we start with the first approach because it will avoid any hot term
issue where a search term becomes so hot that everyone is searching for it and so we are hitting
the same partition all the time we will avoid those types of scenarios if we use the first approach
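The hot term concern can be made concrete by counting how many requests each partition receives under the two schemes. This is a rough sketch with made-up traffic: the query mix, the four partitions, and the CRC32 placement are all assumptions chosen only to show the skew.

```python
import zlib
from collections import Counter
from typing import Iterable

NUM_PARTITIONS = 4  # illustrative assumption


def partition_of(term: str) -> int:
    """Stable hash so the same term always maps to the same partition."""
    return zlib.crc32(term.encode()) % NUM_PARTITIONS


def load_term_partitioned(queries: Iterable[str]) -> Counter:
    """Global index: each query goes only to the partition that owns its term."""
    return Counter(partition_of(query) for query in queries)


def load_video_partitioned(queries: Iterable[str]) -> Counter:
    """Local index: each query is sent to every partition."""
    load: Counter = Counter()
    for _ in queries:
        for partition in range(NUM_PARTITIONS):
            load[partition] += 1
    return load


# Made-up traffic: one very popular search term plus a few rare ones.
queries = ["designing youtube"] * 97 + ["cooking", "chess", "guitar"]

print("term partitioned :", dict(load_term_partitioned(queries)))
print("video partitioned:", dict(load_video_partitioned(queries)))
```

With term partitioning nearly all requests land on the single partition that owns the popular term, while partitioning by video id spreads every query evenly at the price of contacting all partitions each time.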
and of course in that case what will happen is that the scan query can take a lot of time but now what
the search finder can do is the search finder will have some aggregator servers
where the users query them and then they initiate the
search query and what these aggregator servers can do is that
when they initiate the search it's possible that some partitions may return the results early
and some may not return early so as soon as they have enough results they can
rank those results and present them to the user okay so they don't have to wait for
all the results from all the different partitions to come back all right so we are running
out of time let's wrap it up now so thank you very much it was great learning from you
doing the interview with you and i learned a lot while doing this interview thank you okay thanks so guys
this was the mock interview i hope you found it very useful and valuable
this mock interview video was not edited we didn't edit anything in this mock interview so that you
can have the real experience of what an interview looks like so let me know in the comments below if
you find this video useful and if you have any other comments if you want me to make videos on
any other topic then please let me know in the comments below as well thank you and take care