~/codewithstu

Live: Serverless SaaS

Transcript

Hello and welcome, sorry I'm a few minutes late starting the stream tonight. Hello to the replay squad as well, thank you very much for tuning in after the fact. Today we've got a bit of an interesting one for you. I'm going to be designing something from scratch. It's going to be a serverless SaaS application that I'm maybe planning on building myself, I'm not 100% sure yet, but we're going to go through and design the application. I'll give you a few of the requirements and stuff of what I'm thinking, and then we'll take it from there. So I'm going to jump into it.

Okay, so for my serverless SaaS application there are basically two things I want to build out, at least initially, and if we've got time I'll add in some more features. We're going to build out DNS plus HTTP monitoring, and I want this application to be able to trigger around the world. That's requirement number one, so I'll document it over here on the right: multi-region from the start. I also want it to cost as little as possible. What else? Obviously some form of payment system in place — payments and billing — and that's it for now.

So if I think about where I want to host this — no idea why the pen is being a little bit silly, let's try that again — with our requirements in that corner, I want to host it in the EU, the US and Asia, probably something in MENA as well, at least to start off with, and that gives me nice global coverage for the monitoring solution I'm going to build. We could have multiple points of presence inside Europe, but not at the start, to save costs — I don't particularly want every single region in Europe, though we can definitely go into that depth a little bit more later. And the other one is South America. Okay, the pen is obviously going to carry on playing up. So if we're going to serve from those regions we need a multi-region design. I don't think there are any other particular requirements here; I think this is going to be quite a straightforward solution, so let's get cracking into it.

At the core, whenever we want somebody to sign up to our SaaS we have to have some kind of access solution, and for me I'm going to base this on AWS Cognito. Why Cognito? Partly because of cost, since one of my requirements is to keep costs low. With AWS Cognito
I get a free tier, which means I can get up and running and do a lot of development really easily. I can also enable advanced features later on — the classic enterprise-level features — should I want to, so if this does turn into something pretty awesome, which obviously I hope it does, I can push through to those enterprise-level features later. It also means that if I start running Lambdas, which I'm probably going to do quite a lot of, Cognito can talk to them nice and easily through Lambda authorizers.

That brings me on to two parts: we're going to need a front-end UI for this solution, and we're going to need a back-end API. The UI I'm probably going to host statically with CloudFront. Nothing here is going to be too dynamic — I don't need to dynamically render pages, it'll be quite simple, probably Blazor underneath. I'll put a question mark there; it's something I need to investigate a bit more, exactly which front-end technologies I want. The back-end API is going to be hosted on Lambda, and probably I'll have API Gateway in front of it, so API Gateway goes into my Lambda and we split the work between the two. From a costing perspective I do need to go and double-check API Gateway's pricing versus an AWS load balancer for this solution — it's not something I did before the stream, so I'll have to check it out — but ultimately I want to keep costs low, so whichever is the cheaper option is likely the one I'll pick, because I don't need any advanced functionality from API Gateway; I don't need it to serve Swagger or REST documentation endpoints or anything like that, I can do all of that through the Lambda API. Me being a C# dev, that's going to be ASP.NET Core sitting in there in the background, and that'll be our main API surface.

That's going to need some kind of persistence as well, and I think there are going to be two forms of it. The first is more of a database solution, and for that I'm going to have DynamoDB — I'll come on to why I'm picking that in a second. The other one is S3. Why do I need S3 in the solution? I'll start with it because it's a little bit quicker to go through: it's primarily for billing-focused bits and pieces. Billing is going to go through something like Stripe — we'll talk through that a bit later on — and I'm going to want to serve invoice files out from my API, but I don't want to serve them directly from S3; I want to proxy them through my Lambda. I know you can expose S3 straight to the internet, but I'd rather avoid that and limit what can access my S3 bucket. If you watched one of the videos I posted recently about creating time-limited endpoints, you can do this kind of proxied access and file-download scenario, so I'll probably implement something like that to ensure we've got secure downloads of invoices that people can access via the API but also through the UI. We'll just stream those files out to the user from there, and if I change my mind in the future about where I keep things like invoices — if I want to generate them dynamically, change buckets, et cetera — I'm free to do so, because I'm not tying download URLs to specific S3 buckets, which is always good.

So why am I picking DynamoDB as the backing database for my little monitoring solution? Partly because I want its multi-region capabilities. I could use Aurora Serverless, where I can get global databases and such with Aurora, but I'm not going to, because I want more of a change capture feed,
which I can get with DynamoDB quite easily, and it also integrates with Lambda really, really easily — so that's why I'm picking DynamoDB here.

If I think about the things I want to store: I'm going to need the domains, a frequency, and the result of each check stored as well. What else? Something I've thought about for the future is rule conditions — rules that basically say this DNS address has got to match this specific thing, or be within these values, or be one of these values — so we can build in some nice rule-based systems to make sure a domain is matching what we think it should match, and provide that extra validation. That's the domains; it's the same for the URLs — again frequency, the result, and the rules, plus maybe some additional config, such as a specific authorization header to send with the request. Future scope: one thing we could do, since we're hitting these URLs anyway, is offer a screenshotting service — what did the page look like at this time? That should be really easy to do on top.

So we've got our domains, we've got our URLs, we're going to need some form of notifications — that would be awesome — and last but not least we'll probably want to store some billing information. Do I need anything spectacular around that? I'm unsure about the billing — whether it goes in Dynamo or, as I'm leaning, towards a time series database for a few different reasons — so I might come back to billing in a bit. Everything we've got so far we can probably deploy in all regions.
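To make the storage model concrete, here's a rough sketch in Python of how those items might look in a single-table DynamoDB layout. All the attribute names here (pk, sk, frequency, rules, regions) are my illustrative guesses at a schema, not a finalized design:

```python
# Hypothetical single-table layout for the monitoring config.
# Attribute and key names are illustrative, not a finalized schema.

def domain_check_item(customer_id, domain, frequency_cron, rules, regions):
    """Build a DynamoDB item for one DNS check configuration."""
    return {
        "pk": f"CUSTOMER#{customer_id}",   # partition: all config for one customer
        "sk": f"DOMAIN#{domain}",          # sort: one item per monitored domain
        "type": "dns_check",
        "frequency": frequency_cron,       # cron expression, e.g. "*/5 * * * *"
        "rules": rules,                    # e.g. [{"record": "A", "equals": "1.2.3.4"}]
        "regions": regions,                # which edge regions should run this check
    }

def url_check_item(customer_id, url, frequency_cron, rules, headers=None):
    """Build a DynamoDB item for one HTTP check configuration."""
    return {
        "pk": f"CUSTOMER#{customer_id}",
        "sk": f"URL#{url}",
        "type": "http_check",
        "frequency": frequency_cron,
        "rules": rules,                    # e.g. [{"status": 200}]
        "headers": headers or {},          # optional auth header etc.
    }
```

The customer-as-partition choice is just one option; it keeps all of one customer's checks in a single query, which suits the per-region scheduler scan.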
One thing I do need to check on Cognito is its multi-region capabilities — I don't know if it's a global resource or if it's limited to a single region, so I'd need to go and verify that — but most of this stuff, the Lambdas, API Gateway, CloudFront, DynamoDB, S3, all works multi-region, so I'm covering off my requirements over here quite nicely. That is great from that perspective.

We said on the left that we want our domains and URLs scanned on some kind of frequency. In my head I've got this down as cron expressions, so we can run at set frequencies and bill differently depending on how often people want to be checked — if you want to be checked every minute, yeah, we can 100% do that, but it's going to cost you. So let's look at how I'd build out this frequency piece. At the core of the solution I'd have EventBridge, because with EventBridge we can put a rule in place that triggers on a cron expression, and when it triggers we can scan the table for everything matching that frequency, the last time something ran, and bits and pieces like that. We'll have a Lambda function that gets triggered on the rule, and that Lambda goes to our database to get the URLs and domains that need checking — that's step one of the process. With the result coming back, step two is to publish something to EventBridge for another Lambda to pick up — this could become a bit of a sprawling diagram from here — so we'll have another Lambda down here that talks to, say, Cloudflare's DNS over HTTPS, and we might have Google's as well for example, and we may have another EventBridge-triggered Lambda over here that's just our HTTP one.
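The scheduler Lambda's job — find everything whose frequency has elapsed — can be sketched like this. I'm assuming a simple frequency_minutes/last_run pair per item rather than full cron parsing, just to show the shape; the real table would be queried per region rather than scanned:

```python
from datetime import datetime, timedelta, timezone

def due_checks(checks, now=None):
    """Return the subset of check configs whose frequency has elapsed
    since they last ran. `frequency_minutes` and `last_run` are assumed
    fields; a check that has never run is always due."""
    now = now or datetime.now(timezone.utc)
    due = []
    for check in checks:
        last_run = check.get("last_run")
        interval = timedelta(minutes=check["frequency_minutes"])
        if last_run is None or now - last_run >= interval:
            due.append(check)
    return due
```

In practice you'd probably index on a "next due" timestamp instead of scanning, but the selection logic is the same.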
That one goes off out to the user's website, and this way we've got a nice separation between the DNS-level stuff and the HTTP-level stuff. Now, to make this work multi-region we'd have to set up the rule in each region, which means we're going to need an EventBridge bus in each region — I don't think you can have a single EventBridge rule that spans regions yet — and that actually works in our favour anyway, because we can isolate different rules to different regions. Something I've got to add on here is which regions we want per check; we haven't fully gone through the design of the database, but we've got a regions attribute on both our URLs and our domains.

So in, say, one of the South American regions, the EventBridge rule triggers, we go off and get the URLs and domains from our region-local copy of DynamoDB, and it publishes back onto EventBridge — "URL check requested" or something like that — and it goes through to the right Lambda, whether that's the DNS one or the HTTP one, which actually performs the check and says, right, here are the results. Following the HTTP flow for now, that's step three; then the result goes into step four, which I think makes a lot of sense: we publish everything back to EventBridge, and then we can start recording the results. Now, how we record the results is up to us — by writing them back to EventBridge everybody gets them, but we still have a choice of where to put the data afterwards. From a UI perspective we do want graphs showing latency, and we want status pages to show uptime — whether it's up or down or failed and all that kind of stuff. This is why I was leaning earlier towards a time series database.
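The "URL check requested" message from step two is just an EventBridge entry. A sketch of what the scheduler might publish — the source, detail-type strings and bus name are placeholders I've made up, and the check dict follows the hypothetical pk/sk schema from earlier:

```python
import json

def check_requested_entry(check, region, bus_name="monitoring-bus"):
    """Build a PutEvents entry announcing that a check should run.
    Downstream DNS/HTTP lambdas would filter on DetailType."""
    detail_type = ("dns.check.requested" if check["type"] == "dns_check"
                   else "http.check.requested")
    return {
        "Source": "monitoring.scheduler",
        "DetailType": detail_type,
        # Detail must be a JSON string for EventBridge
        "Detail": json.dumps({
            "check_id": check["sk"],
            "customer": check["pk"],
            "region": region,
        }),
        "EventBusName": bus_name,
    }
```

The real publish would batch these through the EventBridge PutEvents API; this just shows the entry shape.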
I can store records in there to say: this URL ID was accessed for this customer and it took this amount of time, or this DNS record with this provider took this amount of time. That kind of data suits a time series database quite nicely — I can squash and aggregate it, and there are some really cool things I've been looking at in the last week or so with Timestream around pre-aggregation of data. If you're not aware of what Timestream is, it's AWS's time series database, and from the looks of things it seems pretty cost-effective until you get to a really, really large scale — which I'm probably not going to reach, at least not in the next year, so I'm not worried about the cost on that one for now. If we write to our time series database we can then start producing nice pretty graphs for our UI, which we've got above here.

This is quite a messy design, but I want to keep it for later on. So, with my EventBridge — if we assume the starting point for this section of the diagram is "I've just finished my HTTP request and I want to publish back to EventBridge", that's step one — I'm actually going to dual-write to two different places. I'll have one Lambda here that writes to DynamoDB, and the reason it writes to DynamoDB is that as part of this HTTP status result I'm thinking about capturing at least the headers and the response payload, and storing them alongside the overall result — what was the status code, what values came back, all that kind of stuff. We pop that into DynamoDB and can retrieve it later in the UI, which is great from a debugging perspective: hey, I received this alert, but why did this alert trigger? Oh, okay, it returned me a 401. Why did it return a 401? Oh, I've changed some config somewhere and it's all messed up, that key was actually being
used — something like that. That stuff always happens, so being able to debug it from our perspective is really great, and we can also capture things like trace IDs. We want all that information, but we don't necessarily want to store it in a time series database. So the second part of my little dual write — I'm going to call those number two and number three — number three is writing to another database, and this is Timestream. It's going to store basically the result classification: was it successful or did it fail? I'm not going to go into detail inside Timestream — we don't want to pay for the extra storage there, considering we're storing the rest in DynamoDB. We just need to know: was it successful or not, how long did it take, and how to uniquely identify it. There are only going to be three or four fields in here.

Like I said, we'll have some scheduled queries in here, so we can pre-compute things like the 99th and 95th percentiles, and we can pre-aggregate as much as we want. I saw a talk last night, actually, when I was thinking about this design a little, and with scheduled queries in Timestream I think they took one of their queries from around 38 seconds down to around 300 milliseconds on quite a large data set — it was in the terabytes — because the scheduled query did all the hard work in the background for them. I thought: that is great. If it reduces latency and the overall monetary cost associated with it, I am all for it.

This is where the design starts to get a bit interesting, though, because one thing to consider is that I don't think Timestream is a global resource — we can't set up a global table like we can with DynamoDB or Aurora.
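What a Timestream scheduled query would pre-compute can be shown in plain Python: nearest-rank percentiles over one window of latency samples. This is just the maths behind the aggregation, not Timestream's actual SQL, and the field names are illustrative:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile over a list of latency samples (ms)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

def aggregate_window(samples):
    """Pre-aggregate one time window the way a scheduled query would,
    so dashboard reads never have to scan the raw measurements."""
    return {
        "count": len(samples),
        "p95_ms": percentile(samples, 95),
        "p99_ms": percentile(samples, 99),
        "max_ms": max(samples),
    }
```

The whole point of the scheduled query is that this runs once per window in the background, so the UI reads a handful of aggregate rows instead of terabytes of raw points.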
So we do have to think about how we access stuff globally while only having one time series database. I'm thinking we come up with two different region types to handle this. First of all we'll have a primary region, where we do most of our data processing — this is what holds the time series database — and then we'll likely have edge regions. What do I mean by edge region? One of the other regions I've got up here: if our primary region is the EU, then our edge regions will be the US, Asia, MENA and South America.

How does the architecture differ between what we've got in the primary region and what we'll have in an edge region? In an edge region I'd still have an EventBridge instance, and I'd still have my Lambda that writes to DynamoDB, but then I'd probably call it done at the edge. There is a question about notifications, though: do we want to do notifications from the edge or from the primary region? If for whatever reason the primary region goes down, we lose some stuff on the UI — okay, that's not great — but the actual monitoring system, i.e. the Lambda functions from the other part of our diagram over here, keeps on working, because in an edge region we'd have access to our global table, and the Lambda functions are region-local as well, which is great.

So do we want our notification system at the edge? Initially, for my SaaS, I'm going to say no, to keep it simple. I'll eventually take a look at this core system here, where we've got CloudFront, which is global, API Gateway and stuff like that — and in fact I can just call back into the API and trigger a notification, because we're going to have Route 53 in there anyway. So let's draw this out: the second thing that comes off here is we still have our Lambda to say, hey, right, we've breached the rule, and to notify, we can go back to our API. Now, we have to figure out how to do the authorization for that API call, but we could probably do that with IAM authentication so we don't have to deal with any keys — I'll just note that down. The way this works with IAM authentication: on API Gateway, which is this lovely little component right here in the centre of the screen, you can basically say "I want IAM authentication into this API Gateway" as one of your authorizers, and once you've done that you can grant fine-grained access. I'd have to check whether you can run dual authorizers for different endpoints — actually, yes, we can, because you can set it per endpoint — so yeah, we can run IAM authentication here and go from there. And hello Dungy — yes, it has been a very long time, mate, how are you doing? So yeah, that IAM approach would work, and that kind of gives us our
notifications — and my writing is terrible, but I'll understand what it means afterwards. Because API Gateway will be our globally reachable resource, we can also use a feature inside Route 53 where you can do health checks on endpoints and do region-based failovers, which is awesome. As long as we've got our API Gateway set up the right way, we can authorize the notification Lambda — this guy over here — and say that all the notification Lambdas, regardless of region, so long as they're from my account, can go and talk to my API Gateway at an IAM level. We don't have to deal with any API keys, which is awesome — zero trust for the win.

"What am I scoping out?" — good question. Basically I'm taking a look at an idea I've had for a DNS and HTTP monitoring solution. There's a bunch of other stuff I want to do after this, in terms of incident response, architecture monitoring, rule-based notifications and so on, but I'm starting with: if I'm going to build something, build something minimal that I can get off the ground and running pretty quickly. So I'm starting out with basic DNS and HTTP monitoring before I add some of the advanced stuff later on. We're going to track DNS records — hey, do these match what we think they should match? — and URLs — do they match what we think they should? — with some kind of notification system in there, and billing as well. For the primary pieces I'm going to have Cognito, API Gateway, Lambda and CloudFront from AWS — nothing too magical there. Then we get into the nitty-gritty: if I'm going to monitor DNS or HTTP from different regions, how do I go about doing that? That's the left-hand side of the screen now, and it's what we'd put into an edge region to actually do the monitoring.
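For that IAM-authenticated call from the notification Lambda back into API Gateway, the request gets signed with SigV4. In practice you'd let botocore or the AWS SDK do this for you, but the signing-key derivation at the heart of it is just an HMAC chain, sketched here from the published SigV4 algorithm:

```python
import hashlib
import hmac

def _hmac(key, msg):
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

def sigv4_signing_key(secret_key, date_stamp, region, service):
    """Derive the SigV4 signing key: HMAC chain over date, region,
    service and the literal 'aws4_request'. A real Lambda would let
    the SDK sign requests rather than hand-rolling this."""
    k_date = _hmac(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    return _hmac(k_service, "aws4_request")
```

Scoping the key to date, region and service is what lets the Lambdas in every edge region sign with their own role credentials while API Gateway verifies at the IAM level, with no shared API keys anywhere.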
So: we trigger from EventBridge, a Lambda picks up the rule, we get the configuration, then we pass the configuration back to say, hey, we've asked for this thing to be checked; another Lambda actually does the checking and publishes the result. And then there's the piece we just started to get onto with the primary and edge regions: for billing purposes we only really want one place in our architecture that captures the billing information, and we're going to base that on AWS Timestream. That lives in what we're calling the primary region, and the edge regions are what I just discussed. That's the second side of things: once we've got the result of the DNS or HTTP request, what do we do from there, and how do we do notifications? The billing side is basically taken care of by writing into DynamoDB — we can use the change feed from that for additional processing if we need to, but otherwise it's taken care of by Timestream. So yeah, I had a little debate about notifications — should I do them at the edge locations, where the checks are getting triggered, or from a central place? They're going to be triggered initially from the edge, but because a lot of what I'm trying to use is multi-region focused, I'm going to push them out through Route 53 onto the notification endpoint from there. Hopefully that catches you up nice and quickly.

Okay, with that in mind, what are we going to use for our notifications back end — what do we need? I think I'm going to need three things out the back of this. One is the standard email and SMS — you could argue voice as well, though I'll put a question mark behind that, I don't think it's something I'd initially implement. Then there's the good old webhook. Oh — thinking about it, for a notification system there's a really cool thing AWS does
now with EventBridge called partner events. I've not set it up from the publisher's side, but I wonder if we can do EventBridge-to-EventBridge communication now — so long as it's in the same region we're publishing the notifications from, it should be fine. That's a good thing to check into: partner events. I probably won't put it in the first version anyway, to be honest. I probably wouldn't put voice in, and I'll put a question mark by SMS as well — probably not in an initial version. I'll be looking at email, so the default there, I guess, would be SendGrid. Webhooks we can implement ourselves with Lambda — we'll probably back them off with SQS or something first, and then have the retry policies and so on in place from Lambda. If I were going to do SMS and voice, I'd probably go via Twilio for that as well. And that covers everything we've got in terms of notifications — I think that would be good from a notification perspective.

To be honest, that's pretty much the core architecture for a monitoring-based service — a very simple one, admittedly. There's a whole bunch we'd need to configure here in terms of retry policies and things like that: how long should we wait if we can't contact an endpoint, should we even retry if we can't get to the user's endpoint? There are a few more complicated things too — if customers want us to connect over VPNs or PrivateLink endpoints, I think we can deal with that another time, because that's just some VPC configuration. It definitely won't be needed or supported in the very first versions, because there's a lot of custom provisioning that would have to happen for that.

So that's the main bit — let me just check my notes.
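For the webhook retry policy mentioned above, an exponential backoff schedule driven off the SQS queue might look like this. The base delay, attempt cap and ceiling are all numbers I've picked for illustration, not a decided policy:

```python
def backoff_schedule(base_seconds=5, max_attempts=5, cap_seconds=300):
    """Exponential backoff delays for webhook redelivery attempts.
    After the final attempt the message would go to a dead-letter
    queue rather than being retried forever."""
    delays = []
    for attempt in range(max_attempts):
        # double the delay each attempt, never exceeding the cap
        delays.append(min(base_seconds * 2 ** attempt, cap_seconds))
    return delays
```

With SQS you'd implement this by setting a per-message visibility timeout (or delay) to the next value in the schedule on each failed delivery.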
The other bit I'm going to go into now is expanding this beyond domains and URLs into architecture monitoring — and this is where I've had some really cool ideas. I think for the majority of things it'll work in a similar way to this section down here, so I'm just going to redraw that: we'll have the Dynamo table we had above, which holds our configuration, and we'll have EventBridge and a Lambda sitting in between — our standard access pattern — and this is to do the architecture monitoring.

Basically, what I'm looking at implementing here is: on a given frequency, get me all the EC2 servers, and for each EC2 server, does it match a specific rule set — and I want to let users define the rules as well. There's something called Open Policy Agent, which you may or may not be aware of, where you can basically take a load of JSON, give it a load of rules, and ask "does this JSON match these rules?" — and it does that really, really well. I decided to try and implement something very similar in .NET, and it's actually already on my GitHub page, which you can go and find if you want — it's under the aspect repo. The premise is that I want users to be able to define what they consider good architecture. I'll provide some default rules that people can just enable, but for every kind of resource — you've got EC2, you've got S3, Lambdas; I'll cover all the popular ones first and then work my way out to the edge cases — how do you know it's well architected, and what are your own rule sets? You might want to say "I want these things tagged", compliance policies, all that kind of stuff. So if I start configuring that in my lovely Dynamo configuration table and it gets triggered onto a rule, how am I going to process things from there? We know we're going to have a little loop through the Lambda here.
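The OPA-style idea — resources as JSON, user-defined rules over them — can be sketched in a few lines. The rule operators here (equals, one_of, has_tags) are invented for illustration; they're not what the aspect repo actually implements:

```python
def matches(resource, rule):
    """Evaluate one rule against a resource described as a JSON-ish dict.
    Supports 'equals', 'one_of' and 'has_tags' — a tiny sketch of the
    kind of checks an OPA-style policy would express."""
    op = rule["op"]
    if op == "equals":
        return resource.get(rule["field"]) == rule["value"]
    if op == "one_of":
        return resource.get(rule["field"]) in rule["values"]
    if op == "has_tags":
        tags = resource.get("tags", {})
        return all(t in tags for t in rule["tags"])
    raise ValueError(f"unknown op {op!r}")

def inspect(resource, rules):
    """Return the rules a resource fails; an empty list means compliant."""
    return [r for r in rules if not matches(resource, r)]
```

The inspection Lambda would run something like `inspect` over each object the discovery Lambdas publish, using the rule set loaded from the Dynamo configuration table.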
Let me go through the flow as I'm thinking about it. Step one: an event starts — a discovery has started — and I think for each of the services along the top, EC2, S3, Lambda, we're going to need a discovery Lambda; I'm just going to call it DL for now. We'll have loads of these, one per service type, and they'll take their results and push them back to EventBridge. Then, step three, we'll have our — how do you describe this — inspection Lambda, which is going to need a configuration source, i.e. Dynamo, and this inspection Lambda is the thing that runs the rule set over here. It picks up the configuration for whatever object comes through from the discovery Lambdas, runs the inspection and, to keep things consistent, publishes the results back. Once discovery is done it hooks into the system in the primary region — we follow the same pattern as before: if we're at the edge, we publish down into Dynamo; any failures go back through the Lambda and notification system for the edge regions; and then the primary region picks up the Dynamo records — I forgot to put the CDC in here — via the change data capture feed, which feeds the Lambda for the Timestream database from there.

That gives us a nice little loop: we do a discovery, we do our inspection, we publish the results to a local database — and that's the point, that's the last step in here, the publish, and this is our edge Lambda again (it's been a long day and I cannot write today). In fact, I'm going to change my design, because I don't think I need to hook off EventBridge here if I'm always writing into Dynamo from every
single region and I've got my change data capture feed. So let's just pretend that EventBridge hop doesn't exist any more, and we'll make our Lambda come off the change data capture feed instead. It doesn't matter where the information comes from, per se — just that we actually capture it — and that goes through the Lambda into the Timestream database, which will be a primary-only service. I'll note that down as "primary only" so I know for later on.

Okay, so overall that's actually way easier than I thought it was going to be. I had a whole complicated design scribbled down six or seven weeks ago that was far more complicated than this and used a bunch of SQS queues, but the use of EventBridge here simplifies things massively in terms of the number of resources and bits and pieces. We just need to make sure we do our mapping between the discovery Lambdas and the actual resources themselves. We could create just one giant Lambda, but I think we want to expand this out over time and have nice deployable little units — maybe offer them in a repo or something so they can share a common code base, but otherwise keep them as individual Lambdas. Assuming all that, it'll take me three or four months to build out.

Payments is the one bit I said I was going to cover as well. We're looking at the primary region here, and I think I'm going to base this on EventBridge again — literally just for the cron expressions, which make it really nice and easy; nothing needs to run more than it has to. So we'll have a cron of, I don't know, let's call it every month; every month it runs my Lambda, and that Lambda needs access to the Timestream database so we can get the billing for that month. Then I guess we'll need to — so if
that's step one and this is step two, then step three, once we've got all that information, is to split onto two different paths, I think. No — actually, the first thing we want to do is generate the report, and that goes to S3 so it can be downloaded later; we'll have to look at storage tiers, whether it's infrequent access or not. Once that's done — call it step four — step five is saying "the report is generated", except that's not going to come from the Lambda: I want to use an S3 event notification for that, I think. That tells us the object is in the bucket, we can associate metadata with it, and we can have a nice folder structure around S3. When that's done we then need to do two things, potentially in parallel, and this is where my lack of knowledge of something like Stripe doesn't help me — I can bill through Stripe, but I don't know exactly how Stripe works, which is kind of ironic. It's probably what I'd use for billing, so if you know anything that's super simple for doing ad hoc billing of people, please let me know, that would be amazing. So we go to Stripe, and then somehow that notifies the user, and presumably we can attach some information like the S3 report link and so on — so we'd be able to give them an itemized breakdown of exactly how many requests we've been doing, how many were successful, how many were unsuccessful; if you wanted a request breakdown for absolutely everything, we could definitely do that.

Now, the one thing we didn't talk about, which I do need to consider from a cost perspective, is the data in DynamoDB that lives up here. Yes, we're going to have a lot of configuration data, but that's not going to be the bulk of where our storage cost goes, because if we're talking about doing a check every minute for
a domain or url um you know that's a couple of thousand records a day uh for especially for a number of urls we can make this data turn out quite quickly so if we we can keep it live in the in the database for i don't know call it seven days i think it's reasonable to be able to see anything from the last seven days out of the api um but if you want it longer than that then you know we can actually archive into s3 here and then look at the storage tiers that's associated with that um and to do that we can use the change data capture feed to archive data into s3 and we can do a nice hierarchical order based on customers and stuff like this so if we needed to go and get a load of data back or they needed like a specific record then we can go back and go and get it all we can temporarily from a support aspect go in restore it in dynamic db extract and restore so i might have to build a tool for that but deal with that problem and come to it extract the data store into dynamic again and it'll be accessible from the api obviously apply like a ttl because we don't want to keep that data forever in dynamo um yeah and then storage and that's really super cheap and we can start looking at aggregating that data on like a daily kind of process and stuff like that um or we could even get a lambda to go through and aggregate it into a big json file or something we can figure out a way of like nicely aggregating that on a per day basis or something like that um so yeah that is i think that's pretty much everything that i would want to do because everything else i think would just be basic crud operations with apigee around land and there's going to be anything um really interesting in there so to put it into perspective some of the other stuff that i've been thinking about um with this is incidents just helping you manage those that's basically the crowd operations um what we can do is use the monitoring stuff to start triggering incidents which is great we've also got if you've 
got an incident yeah i want to post one uh for notification uh monitoring you might want some light around that kind of this stuff you might want some scheduling and when i say scheduling um what i mean is like on-call rotors and stuff like that cool so we've got a question here of i wonder what this setup cost per hour on aws of course what i said right at the beginning is i needed to keep an eye on costs for this so it's a great question thank you for calling out on it and let's go through and answer that question for you now so a lot of what i'm looking at i'm going to start off in the same place where i've started a short diagram is everything that i'm looking at here is either got a free tier or um is very low cost so for example aws cognito i think even if you i added enterprise users with like samwell-based authentication um i get 53 users a month which is perfect for bootstrapping i think it might be a thousand users with social logins so um imagine i had 10 people sign up and they paid me like one pound per month to monitor their domain you know i'm not actually spending anything on aws at this point to kind of do it because cognito is free lambda's got a free tier uh i would need to check the cost of api gateway uh to be sure but i think that's relatively cheap because all the cost is usually in the lambdas behind the scenes dynamdb is an interesting one there's two modes you can run in um initially i would run it in pay as you go kind of mode so i'm not provisioning any read or write units to start off with um what this means is if there's nobody using it for three months then no worries i'm not actually paying for anything nothing's running apart from api gateway so i might be paying like 20 dollars a month or something like that for it just to sit there and do nothing same with us three i'm not gonna be paying for anything if it's in there and if it is in there i'm gonna be paying peanuts um so i think i when i started going through this and looking up 
some of the costings i looked at the costings of everything um off the top of my head but i know there's a lot of stuff in freitas i think this all comes out about 100 a month if people are using it quite a bit so i reckon with you know if people start looking at a couple of domains or a couple of urls they want to monitor depending on like we can set up like frequency tiers and stuff like this if you want it done every minute it's going to cost you more than it's if you want you want it like once an hour or something like that and we can definitely do that with this kind of stuff so the most expensive thing is going to be scanning dynamic db for the configuration um and getting that detail eventbridge again there's i believe there's a free tier so i get like a couple of thousand events a month for free um off of that i would have to check the outbound pricing for this as well uh yeah most of this like if it's not being used i'm not going to be paying for basically anything uh stripe i think you pay when you actually get a payment uh so that's fine report generation is fine because you know csv works for everyone right um again there's nothing up here i am just having a look through to see if there's anything that's gonna particularly cost uh i actually suspect the two big cost centers will be over here at least initially sangria and twilio but to be honest if i call that like 50 a month that's not bad um just having a look time stream is pretty cheap as well um i think i costed that out about 20 a month and dynamo was about 50.
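To make the "couple of thousand records a day" figure concrete, here's a back-of-envelope sketch of the check volume and the DynamoDB on-demand write cost. The per-million write price here is my own assumption (roughly the historical us-east-1 on-demand rate) and is worth re-checking against the current pricing page; this isn't from the stream.

```python
# Rough volume and write-cost estimate for per-minute health checks.
ON_DEMAND_WRITE_PER_MILLION = 1.25  # USD; assumed us-east-1 rate, re-check

def records_per_day(num_targets, interval_minutes=1):
    """Check results written per day across all monitored domains/URLs."""
    return num_targets * (24 * 60 // interval_minutes)

def monthly_write_cost(num_targets, interval_minutes=1, days=30):
    """Approximate monthly DynamoDB on-demand write spend in USD."""
    writes = records_per_day(num_targets, interval_minutes) * days
    return writes / 1_000_000 * ON_DEMAND_WRITE_PER_MILLION
```

Two URLs checked every minute works out at 2,880 records a day, which at on-demand rates is pennies per month; the storage of those records, not the writes, is what makes the seven-day TTL worthwhile.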
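Going back to the report pipeline, the step-5 fan-out described above might look roughly like this: a Lambda fired by an S3 event notification when a generated report lands in the bucket. The handler wiring and the fan-out targets are my assumptions, not a fixed design; only the S3 event payload shape is standard.

```python
def parse_report_events(event):
    """Pull (bucket, key) pairs out of an S3 event notification payload."""
    return [
        (rec["s3"]["bucket"]["name"], rec["s3"]["object"]["key"])
        for rec in event.get("Records", [])
    ]

def handler(event, context):
    """Fired when a report object lands in S3; fans out the two paths."""
    for bucket, key in parse_report_events(event):
        # In a real build this is where the two parallel paths start:
        # create the Stripe charge/invoice, and notify the customer with
        # a link to s3://{bucket}/{key} (e.g. a pre-signed URL).
        print(f"report ready: s3://{bucket}/{key}")
```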
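And a sketch of the seven-day retention idea: a TTL attribute on each check result, plus a hierarchical S3 key for the Streams-driven archive Lambda. The key layout, attribute names, and the seven-day figure are just the assumptions from above.

```python
import time

RETENTION_DAYS = 7  # assumption: results stay queryable via the API for a week

def ttl_epoch(now_epoch=None):
    """Value for the item's DynamoDB TTL attribute: now + RETENTION_DAYS."""
    now = int(time.time()) if now_epoch is None else int(now_epoch)
    return now + RETENTION_DAYS * 86400

def archive_key(customer_id, day, check_id):
    """Hierarchical S3 key so per-customer, per-day data is easy to restore."""
    return f"archive/{customer_id}/{day}/{check_id}.json"

def handler(event, context):
    """DynamoDB Streams handler: archive TTL-expired (REMOVE) items to S3."""
    for rec in event.get("Records", []):
        if rec.get("eventName") == "REMOVE":
            old_image = rec["dynamodb"].get("OldImage", {})
            # In a real build: deserialize OldImage and put it to S3 under
            # archive_key(...), then rely on S3 storage tiers for cost.
            print(f"would archive item keys: {rec['dynamodb'].get('Keys')}")
```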

You might have to put a bit more space in between your messages; the alternative to Twilio came through as stars for me. If there is an alternative that's cheaper than Twilio, then great; I haven't done too much research, because that was near the end of what I'd thought about. To be honest, before I started the stream I'd only really thought about this primary-and-edge-region concept and how to make multiple regions work together in parallel, and even whilst I was designing it today on stream I changed my mind about this big squiggly bit here, which I'm going to get rid of now because it's no longer needed. Let's get rid of that. Yeah, I think I costed this architecture, even with a couple of million Lambda requests a month, at under a hundred dollars a month. I might have some initial setup costs, like Route 53 maybe, though I could get away with hosting it on a different DNS provider and not paying anything for Route 53 on AWS, at least initially. That might be an option, because I've got things like DNSimple already set up, fully automated with Pulumi, so I can just leverage that and make updates to the domain from there. CloudFront is the other one, I don't know. But if I call it, absolute worst case, 150 dollars a month, I don't think that's bad at all. This is why I wanted to design it as a serverless SaaS system, which is why I titled the stream the way I did: serverless is meant to give us the promise of scale-to-zero, infinite scalability and a whole bunch of other stuff, and it also makes my design pretty simple. Don't get me wrong, there are a lot of Lambda functions in here, but they're all things that would probably have been services anyway. The alternative design would be to run it all on Kubernetes and use Kubernetes CronJobs to do some of the timed jobs, and it would have been fine; we could have written custom pods to look at specific configuration for specific users, but then you have to pay for the underlying compute, and I'm instantly talking two or three hundred pounds a month, plus a load balancer, plus all the other bits and pieces. So this definitely makes it way more cost efficient. And with this design I don't have to run everything everywhere to start off with: I know I want to run in these regions to get some nice global monitoring, but from day one I can just stick with the EU region, one region, not pay for anything global, and that's kind of perfect.

So yeah, I'm really interested to know what that Twilio alternative you mentioned is, because it came out as stars on my screen; it might be because you put in an actual link instead of the name of the company, so if you just let me know, that'd be great. Wire, thanks, I'll just put that on screen for everyone. So that was the Twilio alternative that was mentioned. To be honest, if I was going to build this out, building the infrastructure would be very, very quick with Pulumi or Terraform. The bit that's honestly going to take time for me is the UI part: getting the UI to work with the API and Cognito. I've never used Cognito before, but I think that's the direction I want to go, so that trio of things would be fun to set up and play with to start off with. The other bit that would take me some time is the integration with Stripe, because I've never integrated with Stripe before, and I'd have to see exactly how that goes. I'd be interested to know if you use Stripe or something else to do your billing; again, the screen name is Talented in the Twitch chat. What's the last thing I was going to say? Ah, the only other thing I would even possibly remotely consider doing, though I wouldn't do it until I had a decent number of paying customers, is what I would put in front of the UI and the API Gateway: an AWS Global Accelerator, just to optimize access for everyone. So if you're down in the far reaches of South America, you jump onto AWS's network at the earliest possible point and go from there. But that's not something I would implement straight away; that's something I'd implement when I know I've got customers in that location, accessing it from there.

So yeah, I think that is how I would go through and design my serverless solution that does DNS and HTTP monitoring, and in the future goes on to do architecture monitoring, incidents, postmortems, and full custom rule sets. Interesting. Yep, I'm just having a look around. If we look back to my original requirements: we've got payments covered with Stripe; we've just gone through the low-cost aspect and all the costings, and I'll double-check, but like I said, I think this is only 100 to 150 dollars a month, which is fine for me; and the multi-region aspect is covered, based predominantly on the architecture choices I'm making. So I'll stick around for the next minute or so to answer any questions you might have; otherwise, thank you very much for joining, it's much appreciated to see you all. Talented, I'm interested to check your SaaS out; you should be able to find my Twitter handle either on Twitch or on my YouTube page, so just send me a link on there. I don't think links will send in chat, I think I turned some moderation settings on to stop them, but if you send me a message on Twitter or something with your SaaS, I'll definitely check it out; I'm interested to see how other people are building their stuff as well.

Alrighty then, since it seems to be all quiet and there don't seem to be any questions, I'm going to call it a day; I should probably go and say hello to my wife. Thank you very much, everyone, some good questions and stuff, and I will see you again, hopefully next week; I'll confirm by the end of the week. You'll be able to see the schedule live on YouTube as well, I always publish a little thing on YouTube.

If you enjoyed this video, consider subscribing to the YouTube channel for more content like this.
