K002: The First Factor – Network For Modern Infrastructure
Transcript
Kamalika: Hey there. What's up everyone? Thanks for listening to Cloudkata, the modern infrastructure show. My name is Kamalika Majumder and I am an independent consultant and practitioner of DevOps-driven modern infrastructure. This is my first podcast, where I share my 12 years of experience from legacy servers to data centres and then to modern-day cloud. These are some of the most exciting and interesting stories of this journey, which started this very day 12 years back. This show is available on all podcast platforms, as well as on cloudkata.com. So listen in to Cloudkata on your favourite podcast platform as I take you through my journey of modern infrastructure. You can also subscribe to the playlist and transcripts available on www.cloudkata.com. I repeat, that's cloudkata.com. You can share your views, queries and feedback, or connect with me, on cloudkata.com. So without further delay, let's get started.
Intro Music:
Kamalika: This is season one, and it's called Anatomy of Modern Infrastructure. In order to master something you should always start from the basics of that topic. That is why this season is dedicated to a deep dive into the anatomy of modern infrastructure through the Ten Factor Infra, and these factors are: network, system, storage, identity management, logging, monitoring, security, availability, disaster recovery and environment on demand. These ten factors integrate together to form the framework for a robust modern infrastructure. Today's episode is about the first factor, networks. I feel very much at home whenever I speak about networks, probably because I started as a network administrator 12 years back, and that opened my mind to how things really work in this IT world. It made me curious to know the end-to-end process of how networks really work. The network is like the veins that connect each and every point of infrastructure. So let's see how networks help drive the modern infrastructure of today's digitally transformed world.
What are the driving factors of digital transformation? Today's modern infrastructure is driven by the four demands of digital transformation, and these four demands are accessibility, performance, privacy, and security. What is accessibility? Providing users seamless connectivity to a product or service. From anytime access to always-on services, every business wants to be online 24/7 in today's digitally transformed world. And once you have given users that level of accessibility, the next demand that comes up is performance. Accessing a well-performing service or application is the topmost priority of any business today. And that's what all of us want, isn't it: a highly performing application that is available in our palms 24/7. Once you achieve that kind of service, the third demand is about providing privacy to your customers or users. Today's world is moving towards a zero trust policy, and that idea is gaining momentum with the wider adoption of cloud and the software-as-a-service model. Along the same line comes the fourth demand, which is security. Protecting intellectual property is one of the topmost criteria to meet the compliance and regulations of any country. These four demands, that is accessibility, performance, privacy and security, are widely driving today's digital transformation. The Ten Factor Infra that I am presenting to you will give you the mechanisms to meet these demands. So let's begin our season with the first factor, networks, and see how we should supply the four demands of business today, and how the network plays a key role in laying the first level of foundation to meet them. The network is the foundation of a solid, secure and stable infrastructure. It is the root of the infra: the stronger the root, the healthier the entire tree. Any loophole in it will lead to compromising the entire system.
So it is very important to build a robust network for secure and seamless connectivity in and out. How do we do that? Let's see.
In order to meet these four topmost driving factors, that is accessibility, performance, privacy and security, I present to you the four design parameters of the network. So what does the network factor say? It says that these four demands can be met by four design parameters: a segregated network, perimeter security, a single secure entry point, and dedicated peer-to-peer connections. Let me begin with some real-life stories to understand these four design parameters and how you should shape your configuration around these four layers. The first design parameter is a segregated network; let's see how a segregated network helps in building a robust infrastructure today. I begin with a story where a well-segregated network helped achieve the desired data privacy for customers. This story is about two businesses from entirely different domains which had one single need: privacy of intellectual property. Both had multi-tenant business models with customer-focused services. One was a leading business consulting agency, while the other was a leading retail consumer appliance company. Both needed to protect their IP.
The first one, the business consulting agency, provided services to their customers on various business analytics and metrics, and that is why giving privacy to each customer, so that one customer's data is not leaked to another, was the topmost requirement. The second one, the home appliance company, had to protect their pre-launch appliance information from leaking out to the public before launch day, so they had to protect even the information and images of their soon-to-be-launched products; that is why they needed to protect their intellectual property. Both were aiming to move to the cloud, and it was a very interesting situation: both wanted the performance, flexibility and enhancements of modern-day infrastructure on the cloud, however they had that one topmost requirement of protecting their intellectual property on cloud. And this was public cloud I am talking about, not any private cloud on premise.
So how did they achieve that, and what was the design that I created for them in order to achieve what they needed? The first thing needed was segregation of the network topology, because, as I said, they were multi-tenant and they had to separate out one layer of data from another, so they had to really segregate the data layer.
The first design parameter that was adopted was a segregated network topology. So what is a segregated network? Imagine you're building your house. What's the first thing you do? Hire an architect and start designing the rooms based on your needs. Interestingly, if you observe carefully, that design is driven by what level of visibility you want to give to external or incoming parties; in short, it is driven by privacy. Your living room is your publicly exposed area, where your guests can sit, whereas your bedroom is your private area, where you may restrict access to only yourself or your family members. Then you have your kitchen, which may allow selective access to your guests. Now, building a network is similar to building your house, because you are laying the foundation stone. That is why it is important to segregate your network based on the incoming and outgoing accesses that have to happen. For example, in your network's first layer you will have to isolate the traffic which has to be public, which has to be private, and which has to be semi-protected or protected. I call it the public-private-protected model, the PPP model. The public tier is the layer of the network which hosts the services that are open to the internet, say your API gateway or your load balancers, which expose your mobile APIs or your web portals to the public internet. Then comes the private layer. What is your private layer? The private layer hosts those services which are data critical, which should not be opened up to the bare internet; it should be accessible only to internal services or the internal network. And then there is a third layer, which is protected, which should be partially accessible: it needs outgoing access, it needs to send data outside of your network, however incoming access is still restricted.
So with this three-tier network segmentation you have very clearly segregated network traffic. Once you have this model, this template, you can implement the three-tier network for as many of your customers as you want, and they need not have cross access between them. On the cloud this can be achieved through VPCs, or if you are on premise you can achieve it via DMZs or VLANs. So using network segregation, for one customer you can have one VPC, for another customer you can have another VPC, and within each VPC you further segregate into the three-tier network of public, private and protected, and then you can choose whom to allow traffic from and whom not to. This level of segregation should also be implemented between your production and non-production services, data and setup. So production and non-production, or staging and production, are separated out; however, use the same three-tier network for both, so that you have one template implemented multiple times for multiple setups.
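As a toy illustration (my own sketch, not any cloud provider's API), the public-private-protected template could be modelled like this, stamped out once per customer VPC:

```python
# Illustrative sketch of the public-private-protected (PPP) tier template.
# Tier names and access rules are assumptions for illustration only.

PPP_TEMPLATE = {
    "public":    {"inbound_internet": True,  "outbound_internet": True},
    "protected": {"inbound_internet": False, "outbound_internet": True},
    "private":   {"inbound_internet": False, "outbound_internet": False},
}

def make_customer_vpc(customer: str) -> dict:
    """Stamp the same three-tier template out for each tenant's VPC,
    so tenants stay isolated but share one network design."""
    return {f"{customer}/{tier}": dict(rules) for tier, rules in PPP_TEMPLATE.items()}

vpc_a = make_customer_vpc("customer-a")
vpc_b = make_customer_vpc("customer-b")
# Distinct subnet names per tenant: no accidental cross-tenant references.
assert set(vpc_a).isdisjoint(vpc_b)
```

The same `make_customer_vpc` call would also cover a production versus staging split: one template, instantiated once per setup.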
So that's how, using a segregated network, you take the first step towards providing privacy for the intellectual property of the customers you are hosting.
Now let's come to the second design parameter. Once you have achieved a segregated network you are not done; you still have to apply three more design parameters. The second one is perimeter security. So now let's see: you have designed the borders, but you still have to identify who is coming in and who is going out. Just because something is made public should not mean that anything and everything can enter. I can understand that there may still be services which must be opened up without any restriction on who accesses them; however, I am going to explain how you can still protect that kind of accessibility in the next design parameter.
First let's talk about perimeter security, and what happens if you do not implement it. To explain that, I would like to share one of the biggest security incidents that I experienced firsthand. This is a story about a company's master website, the marketing website with the company's profile. Because it was the company's business website it was, of course, open to the public internet, and it was not protected by any strict security measures or firewall policies. On top of that, the software on which the website was hosted had vulnerabilities. The attackers must have run some kind of scanner to find out what ports were open on that website; it was an HTTPS website, and they found that it was running a version of a particular software which was vulnerable. They broke in through that vulnerability, right into the machines, and planted a very sophisticated JavaScript injection. And what was the result? Suddenly that business website was flagged as a malware-infected website by major search engines, and the news spread like wildfire. Because it was the company's business website, the first place anybody would go, a .com website, the company's reputation was heavily impacted. Immediately after it was identified, the first thing that was done was to stop that website, stop that IP, and then fix the vulnerability, patch, rebuild the website, format the servers. It was actually not on cloud; it was a privately hosted server, but it was open to the internet. So all those measures were taken; however, the attacker kept compromising that website, no matter where we moved it, and we tried multiple times.
At last, what we had to do was build everything from scratch in a demilitarised zone set up in one of our data centres, which was very strictly protected. Strict firewall rules were applied, so much so that they filtered who could access the website and which website pages were exposed.
The lesson learned from that incident is that even if you're opening up an HTTPS port to the internet, it is not automatically secure; just installing certificates will not give you the level of security you need, and these days it is especially important to protect HTTP-based websites which are open to the internet. So how can you protect your web-based sites using perimeter security? I would like to clarify that this is not the end; there are still more parameters to design after this. So what are we doing in perimeter security? Remember, you have already segregated your network; you have identified, using your network subnets, which is your private subnet, which is your public subnet, and which is your protected subnet. Now it comes to defining who, what and where can go through these three layers of public, private and protected, and that is achieved through perimeter security. It is done using network policies, in some cases called firewall rules, or security groups. Basically, these are network policies based on whitelisting. This is how you do it: you blacklist everything, you deny everything. That is the first rule you implement, deny any any. Then you start whitelisting who can access your website; this specifically applies to your public hosts. And for all three tiers that I mentioned earlier, public, private and protected, apply this deny rule by default. When you are whitelisting the IPs for the private layer, you may allow the private subnet to be accessed only from your publicly hosted load balancers, or, if you are using Akamai or something similar, only from that. In some cases you might be lucky enough to restrict access to only your satellite offices or your client sites.
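The "deny any any, then whitelist" policy can be sketched as follows; the CIDR ranges are documentation-only example addresses (RFC 5737), not real sources:

```python
from ipaddress import ip_address, ip_network

# Default-deny firewall sketch: everything is blocked unless a rule
# explicitly whitelists the source. Ranges below are illustrative examples.
WHITELIST = [
    ip_network("203.0.113.0/24"),   # e.g. CDN / load balancer egress range
    ip_network("198.51.100.0/24"),  # e.g. a satellite office
]

def is_allowed(source_ip: str) -> bool:
    """Implicit 'deny any any': return True only on an explicit match."""
    addr = ip_address(source_ip)
    return any(addr in net for net in WHITELIST)
```

The key property is that the function has no "allow all" fallback: an unknown source fails every rule and is therefore denied by default.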
However, in many cases, especially for mobile applications, you may not be able to whitelist the IPs, because mobile applications are accessed over the mobile internet and those are all dynamic IP addresses; it is not possible to restrict to one IP address or one country. I will come to the next point, how to achieve that level of security in these cases, especially for mobile applications where things are open to the internet. Now, the second thing that you apply in the network policy is filtering. What filters do you need? The first is an IP-level filter: the source and the destination, from where traffic is coming and where it should go. Remember, if you are allowing traffic to the public layer, allow it only to the public layer; and for the private layer, where your databases may be hosted, allow access only from your own internal subnets. For the public subnet, allow access from the whitelisted sources, and also apply filters on ports and protocols.
Just because you trust a source does not mean that you should allow everything. Remember, most of these satellite offices or third-party partners might also be prone to compromise. That is why you should allow only particular ports, the minimum of what is needed. Do not give a blanket list of ports and protocols to be accessed from your allowed source IPs. The next level of filtering is system-to-system access. Imagine you have public, protected and private tiers. What you can place in the public subnet are your web servers, your web-facing load balancers or any kind of HTTP-based front-end service. Then comes the protected layer. What can you keep in the protected layer? Your compute servers, say your application servers or your microservices, which will accept traffic only from the web servers in the public subnet. The web servers accept external incoming access and also have outgoing access, but that outgoing access is only towards the internal servers. The protected layer can also be allowed to send traffic outside, because sometimes your application might need to connect to a third-party library or a third-party API to integrate into your functionality. The third layer, and the most critical, is the private layer: do not allow any kind of outgoing access, no internet access at all, only incoming access from your protected subnet. So it is a drop-down access model: outside traffic lands on your web servers in the public subnet, from the web servers it gets transferred to the protected layer, and the protected layer can send traffic outside and can also send traffic to the private layer, on a read-only or read-write basis. The private layer is completely cordoned off from the external world, and if you observe carefully, even the protected layer is cordoned off from the external world; the only thing the external world knows about is certain ports and protocols exposed on your web layer.

So that's how you apply perimeter-level security to your network or your environment. Remember the example of building your house that I mentioned earlier: first you design your rooms and their accessibility, and then with perimeter security you get the boundary walls. Now you need to create the gateway. Remember the public subnet, which is open to the internet and through which people access your service; now who enters it has to be controlled. Imagine your house: you build a very solid boundary wall, and to compare with our network example, we have built the segregated network with perimeter security. Now we are ready to open up our public layer to the users. When you open up the public layer, it is like your house: you cannot just keep a big gap in your boundary wall and let everybody enter, right? You need a gate through which only trusted users enter and no troublemakers get in. That brings me to my next point, which is a single secure entry point. In today's digital world, with everything being delivered as mobile applications, it is very important to make sure that you have a protected entry point for your mobile APIs; otherwise it is very easy for someone to hack in, enter your internal subnets and compromise them. I would like to explain the challenges you can face if you do not have one. We had one example that goes way back in my journey; it is another story of how, due to the lack of a strong secure entry point, one of our web services was brought down by a very standard DDoS attack. This was a set of app servers exposed via an Apache web server; it was a client web portal that had been built.
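The drop-down flow between the tiers described earlier can be encoded as an allow-list of (source, destination, port) flows; the specific ports here are illustrative assumptions, not a prescribed standard:

```python
# Sketch of the tier-to-tier "drop-down" flow: internet -> public ->
# protected -> private, with outbound internet allowed only from protected.
ALLOWED_FLOWS = {
    ("internet",  "public"):    {443},   # only HTTPS from outside
    ("public",    "protected"): {8080},  # web tier to app tier
    ("protected", "private"):   {5432},  # app tier to database
    ("protected", "internet"):  {443},   # outbound third-party API calls
}

def flow_permitted(src: str, dst: str, port: int) -> bool:
    """Anything not explicitly listed is denied."""
    return port in ALLOWED_FLOWS.get((src, dst), set())
```

Note that ("internet", "private") appears nowhere in the table, so the private tier is unreachable from outside by construction, exactly the "cordoned off" property described above.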
What the attacker did was, he probably scanned it and identified that it was an Apache HTTP website, so he launched a massive DDoS attack. He did that by using our website as some kind of proxy server, sending all sorts of invalid requests to our servers, trying to access all sorts of illegal sites and proxying them through our web server. Our Apache web server got so busy dropping those 404 Not Found requests that it barely had any time to serve the actual requests, and the server was heavily slowed down. The whole client portal was degraded and down for more than a week, and that caused a huge loss in business. The attack was so massive that we actually had to blacklist the public IP on which that web server was exposed, because it had been registered as a malware server, a proxy server, on the internet, and we were getting a lot of complaints. In those days we didn't have a very fancy tool, so we applied iptables rules and SELinux policies, and we also built a software IPS in front of the web server so that it could filter out the invalid requests. There is no way to stop a DDoS attack outright; the only way is to protect yourself and prevent your servers from getting busy serving the DDoS traffic. That is why a single secure entry point is very important in today's modern infrastructure. Fortunately, there are services available that help you achieve that. So how do you do it? Remember your publicly exposed web servers. How do you expose them? You may do it using a public-facing load balancer, right? Now, you may apply certificates and a security policy to that load balancer.
And you might think, oh, that is enough, I secured my load balancer with a certificate. No, that is not enough, because you are still prone to DDoS attacks. When someone sends a massive DDoS attack at your load balancer, it will get busy rejecting those requests and forget about serving the valid ones. It will slow down your website and might even make it inaccessible. So what you need is a shield in front of it, and that can be achieved with various tools. Most cloud providers provide a CDN and services like AWS Shield, which are specifically built to handle these kinds of attacks. In some cloud providers these are called web application firewalls, and sometimes they provide a combination of all of them. If you want to stay agnostic of the cloud provider, say you don't want to stick to the services provided by only one, or you have a hybrid cloud model, say something hosted on AWS, something on another cloud, something on premise, then in order to keep the solution provider-agnostic you can go with services like Cloudflare or Akamai. They provide a cloud security gateway. And how do they do that? They give you bandwidth reservation for your IP addresses: dedicated IP addresses, a range of public IP addresses, and they make sure they purchase enough internet bandwidth for their customers, so that even if there is a DDoS attack, there is enough internet bandwidth in their pipe for their customers to serve their applications and not be bothered by the attack. There are other functionalities that have to be implemented in a single secure entry point.
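One small ingredient of such a shield is per-source rate limiting, which keeps a flood from monopolising the server. Here is a toy token-bucket sketch; this is my own illustration of the idea, not how Cloudflare or AWS Shield actually work internally:

```python
import time

class TokenBucket:
    """Per-source token bucket: absorbs short bursts, rejects sustained floods."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate              # tokens refilled per second
        self.capacity = capacity      # maximum burst size
        self.tokens = capacity        # start with a full bucket
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

In practice each client IP would get its own bucket, and the filtering would happen at the edge, before traffic ever reaches the origin load balancer.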
Things like web application firewall rules and policies to prevent SQL injection, JavaScript injection, XML injection, all sorts of security hacks that might compromise your website. Instead of implementing multiple tools to achieve this, I would recommend you take one of the tools like Cloudflare or Akamai, because they give you a whole package to implement all these rules. So what do you need? You need DNS-based mapping, a DNS resolver, and certificates, to make sure that the website does not throw any unnecessary errors. Certificates are also important. Don't always go with self-signed certificates, because even if you automate it, you need to renew the certificate every two or three months, like Let's Encrypt certificates; if that works for you, you can continue with it, but sometimes you will have to keep renewing, and self-signed certificates are not really accepted everywhere. Another requirement which is really taking its place today is mutual TLS. Remember the zero trust policy I spoke about: even if you know the person who is at the door, you still establish some kind of validation so that you can confirm he really is the person who should have access, and that is achieved using mutual TLS. When you have mutual TLS implemented between your services and your third-party partners or any other APIs, it is always good to have a publicly signed certificate, signed by a valid authority. I will explain more about certificates in another episode, which will focus purely on security. So again, to repeat what you need: DNS-based mapping and a DNS resolver for your single secure entry point, web application firewall rules, DDoS protection, dedicated IP addresses and IP segregation. Providers like Cloudflare and Akamai give you a dedicated slot so that you do not get dynamic IPs; your load balancers or your API gateways can then allow traffic only from services like Cloudflare or Akamai. They also give you an encrypted channel of transmission so that there is no man-in-the-middle attack. So once you have a single secure entry point, a request first lands on your cloud security gateway, which can be Cloudflare or Akamai or AWS Shield or Google Cloud Armor or anything like that, and then it lands on your load balancer. Your load balancer is well protected from any kind of unnecessary requests which it does not need to serve. So this gives you the gateway.

Now you will ask me: I have got a segregated network, very solid perimeter security and a well-protected gateway; it should be enough, right? Remember the four demands: we have only achieved three of them, which are accessibility, privacy and security. There is still one very important demand, which is performance. Ironically, one might think that security should be considered first, but unfortunately security is usually thought about last and performance first. That is why I wanted to bring up security first and performance last, because in most cases we are so focused on getting the best performance, on making everything accessible over a very good internet supply, that we forget about these protections, which really hit us once we have gone live. That's why I wanted to make sure that when we build the infrastructure, the network is foolproof, protected and secure with seamless connectivity, and performance comes last, because performance will anyway be our priority when we build our application. So what should be done to achieve performance? There is still a myth among people that when you apply so many security practices the performance degrades: of course, I cannot ping a server, I cannot get into the server, only port 80 or 443 is opened up (actually, in today's day and age even port 80 is no longer opened up, only 443 is), and people will think, okay, do I really get the performance that I need?
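Coming back to the mutual TLS point above: in Python's standard `ssl` module, requiring a client certificate on the server side looks roughly like this. The CA file path is a placeholder; issuing and signing the certificates themselves is out of scope here:

```python
import ssl

def make_mtls_server_context(ca_file: str = "") -> ssl.SSLContext:
    """Server-side TLS context that refuses clients without a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.verify_mode = ssl.CERT_REQUIRED  # mutual TLS: client must present a cert
    if ca_file:
        # CA bundle that signed the client certificates (placeholder path).
        ctx.load_verify_locations(cafile=ca_file)
    return ctx
```

The server would additionally load its own certificate and key with `load_cert_chain` before wrapping sockets; the point of the sketch is only the `CERT_REQUIRED` verification mode, which is what turns one-way TLS into mutual TLS.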
Yes, you will get it, but you will have to add certain tuning to your network design. And what is the design parameter we should focus on? Dedicated peer-to-peer connectivity. Once you have built your network and your application is hosted on it, in today's scenario, with so many third-party and SaaS products available, we tend to integrate with all of these providers. Very rarely will today's companies or startups develop something which is already built; there is a saying, don't reinvent the wheel, right? If there is a payment gateway, you will not build your own payment gateway; if you are a bank, you would like to use the best payment gateway available, or the best switching provider available. And when you integrate, say you have built your services, your functionality or e-commerce portfolio, and you want to integrate with a card provider, make sure that there is dedicated peer-to-peer connectivity. Do not connect to them over the bare internet, because there are certain issues with it. If you just keep things open over the internet, people will only think, oh, there may be a security issue, so let's set up a secure or encrypted channel. So what do we do? We set up a site-to-site VPN. Now, a site-to-site VPN might give you protection for your data in transit, however it does not guarantee the bandwidth and latency that your data transmission needs.

The second important thing is data replication. In today's 24/7 always-on services, one or more copies of your data have to be replicated across your sites, and that's why you see in clouds the multi-AZ or multi-region model, and clouds like Google have their own dedicated network to provide multi-region connectivity. As long as you are on a single cloud provider, you are within their network setup, so you are on a dedicated internet. However, once your traffic leaves your cloud provider and goes, say, to another cloud provider where your services are hosted, or even to a data centre with legacy systems, you will have to make sure that the bandwidth is stable and the latency is minimal, at least less than one millisecond; otherwise you will see a performance hit in your application. So what do you need for performance? Make sure that there is a dedicated direct connection between your peers, no matter what peer it is: it may be your own internal offices and your services on cloud, it may be your cloud services and your satellite centres or your kiosks, or it may be your third-party partners. This dedicated connectivity can be achieved in various ways; nowadays most cloud providers give physical-level direct links to interconnect the clouds, and these provide a lot of advantages: as I said, minimum latency, better performance, privacy of data in transit. Even after you have established a VPN, there are still chances of a man-in-the-middle attack; with a dedicated connection you make sure that no one else's traffic is flowing through it, so you protect the privacy of data in transit. In the first section I described protecting your IP, which is data at rest; this is data in transit. It also gives you fault tolerance, so you do not have to depend on your internet connection. Say your internet is down: then everything will be down if that connectivity is not available, right? Imagine you are a bank with a mobile application and suddenly the internet connection is down and people are unable to transfer money; that is immediately a setback for a banking application. So dedicated peer-to-peer connectivity gives you fault tolerance, with dual-ISP models which help you mitigate the issues that might come up.

This is also important for multi-site data replication. Sometimes you may want to keep a second copy of your data on premise, or you may not be able to move your core data onto the cloud due to regulations and compliance, and you need data replication across sites, say between a DC and a DR. To this date, and I have experienced this very recently, no matter that you say you are on cloud in a multi-AZ setup, the regulators of a country will ask you to provide evidence that your data replication happens correctly and there is no loss of data. That is why, when you have clusters of data to replicate across data centres, you need to make sure that the latency between those data centres is less than one millisecond. If the latency is more than one millisecond, your data replication will fail. We observed this with one of our setups, where we chose the DC and DR from the same vendor. Everything promised was going fine until our data size exceeded 500 GB; we were close to one TB of data, and the replication suddenly stopped one day. To get the data replication working again we literally had to wait for windows when the latency was less than one millisecond, and when we asked the vendor, they said, oh, you did not actually ask for a dedicated connection, you just asked for a connection, so we gave you connectivity. That brought me to thinking that we should really validate with the cloud providers what the latency between their data centres is. Well, with most of the top cloud providers, at least AWS or GCP, the inter-data-centre latency is handled by their own private backbone, so you can rest assured. So if you are on a single cloud provider you might have already achieved dedicated peer-to-peer connectivity, and it is better to use a direct connection. And with respect to cloud, suppose you are hosted on a cloud and your partner is also on the same cloud: instead of having a VPN connection with them, try to see if you can set up peering directly. Technically it is possible, but there may be some compliance needs and approvals required. So when you are on a standard cloud platform, utilise the VPC peering functionality instead of a site-to-site VPN or VPN gateway, so that your traffic remains private on the dedicated links of that cloud provider, and utilise their bandwidth as much as you need.
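The latency budget and link choices discussed above can be summarised in a small decision sketch; the one-millisecond threshold comes from the replication experience described, and the link labels are my own shorthand rather than any provider's product names:

```python
# Replication is treated as safe only under the ~1 ms latency budget
# described above; the threshold and link names are illustrative.
REPLICATION_LATENCY_BUDGET_MS = 1.0

def replication_safe(measured_latency_ms: float) -> bool:
    """Synchronous DC-DR replication needs inter-site latency under the budget."""
    return measured_latency_ms < REPLICATION_LATENCY_BUDGET_MS

def pick_link(same_cloud_provider: bool) -> str:
    """Prefer the provider's private backbone (VPC peering) when both peers
    sit on the same cloud; otherwise ask for a dedicated direct link,
    not a site-to-site VPN over the public internet."""
    return "vpc-peering" if same_cloud_provider else "dedicated-direct-link"
```

A check like `replication_safe` is also the kind of evidence a regulator might ask for: measure the inter-site latency continuously and alert before replication starts failing, rather than after.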
So that completes the four design parameters that I wanted to cover for the network factor.
So, in the 10 Factor Infra, for security, privacy, accessibility and performance, what does the network factor say?
To summarise, the network factor says that a robust infrastructure needs secure and seamless connectivity across all systems and services. For secure connectivity, the network should be segregated, or subnetted, with respect to incoming and outgoing access; use firewall policies to secure the perimeter; there should be a single secure entry point for traffic landing on your services from the internet; and for seamless connectivity and better performance, make sure you use dedicated private links for peer-to-peer connectivity and data transmission.
With that note I would like to conclude today's episode. Let me know how you liked it, and if you would like to know more about networks, share your feedback, your queries and your thoughts on cloudkata.com. The transcript of this episode, along with other reference materials, the examples I spoke about and the deck I discussed, will be available on cloudkata.com.
So do share your feedback and queries, and subscribe to the show to get the latest updates. I'll be back next Friday with another brand new episode on modern infrastructure on Cloudkata, season one. The next episode will be about the second factor, that is system, so stay tuned and continue on this journey through the anatomy of modern infrastructure with the 10 Factor Infra. Talk to you next week on Cloudkata, Mastering Modern Infrastructure.
Till then, take care, stay healthy, stay safe. This is your infra coach Kamalika, signing off. Bye bye!