Network protocols and proxies are among the most fundamental building blocks of system design. And if you’re preparing for a system design interview (or just want to be more familiar with system design), then understanding these topics is a good place to start.
Below, we’ll briefly define network protocols and proxies, and then we’ll dig into the essential details that you’ll need to know about each one. Here’s what we’ll cover:
- Network Protocols
- Proxies
- Example network protocol and proxy questions
- System design interview preparation
Networks of computers make up everything on the internet - from local private networks enabling communication between web service components to the global public network linking services with billions of users.
Network protocols make it possible for any networked computers to talk to each other, no matter where they are or what hardware or software they’re running. They do this by establishing standard forms for sending and receiving data over the network’s physical infrastructure.
The most common network protocols are time-tested, widely used, and well understood. Knowing what these protocols do and when they are used is key to understanding and designing technical systems.
Before we describe the specific protocols, we’ll need some context on where they fit in network infrastructure. We’re going to go over the two most common models of networking that present a framework for how internet communication works. These models are widely used to categorize protocols, hardware, and many other system components, so it’s important to understand and be able to talk about them.
1.1 Networking Models
Both of these models we’re about to discuss divide the network stack into “layers”. These layers provide increasingly complex guarantees that the next layer can build off of. This enables communications at the highest layer to deal with application-related concerns like the meaning of the data, and not hardware-related issues like data loss from faulty wires.
There are protocols at each layer of the networking models. Protocols work together through encapsulation: each protocol treats the complete message produced by the layer above as opaque input data, adds its own header, and doesn’t modify or interfere with that data. This way each protocol can do its portion of the work to transmit data over the network, and rely on the other protocols’ guarantees without having to know how they work.
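To make the layering idea concrete, here is a minimal sketch of how each layer could wrap the message from the layer above without touching it. The bracketed headers are hypothetical, human-readable stand-ins; real protocol headers are binary and carry many more fields.

```python
# Hypothetical text headers, listed from the top layer down.
HEADERS = [b"[HTTP]", b"[TCP]", b"[IP]", b"[ETH]"]


def encapsulate(payload: bytes, header: bytes) -> bytes:
    """Wrap the message from the layer above with this layer's header."""
    return header + payload


def decapsulate(message: bytes, header: bytes) -> bytes:
    """Strip this layer's header and hand the untouched payload upward."""
    if not message.startswith(header):
        raise ValueError("unexpected header")
    return message[len(header):]


def send(data: bytes) -> bytes:
    """Pass application data down the stack, one layer at a time."""
    for header in HEADERS:
        data = encapsulate(data, header)
    return data


def receive(frame: bytes) -> bytes:
    """Pass a received frame up the stack, outermost header first."""
    for header in reversed(HEADERS):
        frame = decapsulate(frame, header)
    return frame


frame = send(b"hello")
print(frame)           # b'[ETH][IP][TCP][HTTP]hello'
print(receive(frame))  # b'hello'
```

Note that no layer inspects or changes the bytes it was given; it only adds or removes its own header.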
1.1.1 TCP/IP Model
TCP/IP, also known as the Internet Protocol Suite, is one of the oldest networking models and greatly influenced the development of the internet as we know it. It’s named after its two primary protocols, TCP and IP. This model has four layers and associated protocols that define the abstraction:
- Link Layer: protocols relevant to a local network, called a “link” or “IP network”. These computers are physically wired on the same network and don’t need a router to communicate, e.g. MAC addresses
- Internet Layer: protocols relevant to connecting different IP networks, e.g. IPv6
- Transport Layer: protocols for direct communication channels over the internet, e.g. TCP
- Application Layer: protocols relevant to applications sending data to and from users over the internet, e.g. HTTP
1.1.2 OSI Model
The Open Systems Interconnection model (OSI model) is a well-established model of networking that is a useful conceptual tool. It has 7 layers, which allows more specificity when talking about networking than the TCP/IP model does. OSI doesn’t specify protocols, and many popular protocols don’t fit neatly into a single one of its layers:
- Physical Layer: transmission of raw data on hardware, e.g. Ethernet
- Data Link Layer: establishing connection for data transfer between computers in the same physical network, e.g. MAC addresses
- Network Layer: establishing connection for data transfer in packets between computers in different networks, e.g. IP
- Transport Layer: transferring data reliably between endpoints, e.g. TCP
- Session Layer: managing data transfer sessions between computers
- Presentation Layer: translating lower layer data formats for use by the application layer
- Application Layer: application-enabling functionality, e.g. HTTP
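The two models are often compared side by side. The sketch below encodes one commonly cited correspondence between them as a lookup table. The mapping is approximate by nature, and the function name `osi_layers_for` is just an illustrative choice.

```python
from typing import Dict, List

OSI_LAYERS: List[str] = [
    "Physical", "Data Link", "Network", "Transport",
    "Session", "Presentation", "Application",
]

# Approximate correspondence; real protocols do not always fit cleanly.
TCP_IP_TO_OSI: Dict[str, List[str]] = {
    "Link": ["Physical", "Data Link"],
    "Internet": ["Network"],
    "Transport": ["Transport"],
    "Application": ["Session", "Presentation", "Application"],
}


def osi_layers_for(tcp_ip_layer: str) -> List[int]:
    """Return the OSI layer numbers (1-7) covered by a TCP/IP layer."""
    return [OSI_LAYERS.index(name) + 1 for name in TCP_IP_TO_OSI[tcp_ip_layer]]


print(osi_layers_for("Internet"))     # [3]
print(osi_layers_for("Application"))  # [5, 6, 7]
```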
Now that we’ve gone over the TCP/IP and OSI models, let’s go into how some of the most commonly-used protocols work, and how they fit into these models.
The Internet Protocol (IP) is the key protocol that allows computers in different physical networks to communicate with each other. It’s defined in the Internet Layer of the TCP/IP model, and corresponds approximately to Layer 3 of the OSI model.
IP defines and works with the fundamental data unit of a packet. It also provides addressing, in the form of IP addresses, so packets can be correctly routed from their source to destination.
An IP packet consists of a header and some data. The IP header contains information including the source and destination address. The data is formatted and contains whatever is useful for the next layers.
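As a rough illustration of the header-plus-data structure, the sketch below packs and parses the fixed 20-byte IPv4 header with Python’s `struct` module. It is a simplification: options are not handled, the checksum is left at zero, and the addresses come from documentation-only ranges, so the result is not a packet you could put on a wire.

```python
import ipaddress
import struct

# Fixed IPv4 header fields: version/IHL, DSCP/ECN, total length, identification,
# flags/fragment offset, TTL, protocol, checksum, source, destination.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def build_packet(src: str, dst: str, payload: bytes) -> bytes:
    """Prepend a simplified IPv4 header (checksum left as 0) to the payload."""
    version_ihl = (4 << 4) | 5  # version 4, header length of 5 32-bit words
    total_length = IPV4_HEADER.size + len(payload)
    header = IPV4_HEADER.pack(
        version_ihl, 0, total_length, 0, 0,
        64,  # time to live
        6,   # protocol number 6 means the data is a TCP segment
        0,   # checksum, not computed in this sketch
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
    )
    return header + payload


def parse_packet(packet: bytes) -> dict:
    """Split a packet into a few header fields and the data it carries."""
    fields = IPV4_HEADER.unpack(packet[:IPV4_HEADER.size])
    header_length = (fields[0] & 0x0F) * 4
    return {
        "version": fields[0] >> 4,
        "ttl": fields[5],
        "protocol": fields[6],
        "source": str(ipaddress.IPv4Address(fields[8])),
        "destination": str(ipaddress.IPv4Address(fields[9])),
        "data": packet[header_length:fields[2]],
    }


parsed = parse_packet(build_packet("192.0.2.1", "198.51.100.7", b"hello"))
print(parsed["source"], "->", parsed["destination"], parsed["data"])
```

IP itself never looks inside `data`; that is left to the next layer up.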
The Transmission Control Protocol (TCP) manages reliability of data transferred with IP. TCP is defined in the Transport layer of the TCP/IP model, and corresponds approximately to Layer 4 of the OSI model.
TCP works by first establishing a connection between the client and server, and then transferring data. It builds on IP to add guarantees that data messages are delivered reliably, in order, and checked for errors.
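The connect-then-transfer pattern can be sketched with Python’s standard `socket` module. This example runs a tiny echo server and a client on the loopback interface; the helper names are illustrative, and the operating system performs the actual TCP handshake, ordering, and retransmission behind these calls.

```python
import socket
import threading


def run_echo_server(server: socket.socket) -> None:
    """Accept one connection and send back whatever arrives on it."""
    conn, _ = server.accept()
    with conn:
        while chunk := conn.recv(1024):
            conn.sendall(chunk)


def tcp_round_trip(message: bytes) -> bytes:
    """Open a TCP connection to a local echo server and return its reply."""
    # Port 0 asks the operating system for any free port.
    with socket.create_server(("127.0.0.1", 0)) as server:
        port = server.getsockname()[1]
        worker = threading.Thread(target=run_echo_server, args=(server,), daemon=True)
        worker.start()
        # create_connection() returns only after the connection is established.
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # signal that we are done sending
            received = b""
            while chunk := client.recv(1024):
                received += chunk
        worker.join(timeout=5)
    return received


print(tcp_round_trip(b"hello over tcp"))
```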
If the application needs faster data transfer and doesn’t require a confirmed connection, it can use the User Datagram Protocol (UDP) instead. UDP works at the same layer as TCP, but makes no guarantees about data delivery or ordering, which works well for situations like broadcasting.
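For comparison, here is a minimal UDP sketch on the loopback interface, again using the standard `socket` module. There is no connection step: the sender simply addresses a datagram to the receiver. On a real network the datagram could be lost or reordered, which is why the receiver sets a timeout instead of waiting forever.

```python
import socket


def udp_round_trip(message: bytes) -> bytes:
    """Send one datagram to a local receiver and return what it received."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(5)           # UDP gives no delivery guarantee
        sender.sendto(message, receiver.getsockname())  # no handshake first
        data, _ = receiver.recvfrom(2048)
    return data


print(udp_round_trip(b"hello over udp"))
```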
TCP is reliable and widely supported, and as such many other protocols build on it, like TLS encryption and WebSockets.
The Hypertext Transfer Protocol (HTTP) allows applications to view and modify data over the network. HTTP corresponds to the Application layer of the TCP/IP model, and Layer 7 of the OSI model.
To use HTTP, and its secure variant HTTPS, a client makes a coded request to a server, which sends back a coded response. HTTP requests and responses are divided into the header, which contains metadata about the request, and the body, which contains data in some specified format (e.g. JSON).
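To show the header and body split, the sketch below parses a hand-written HTTP/1.1 response. The sample message and the function name `parse_http_response` are made up for illustration, and the parser ignores details a real client must handle, such as chunked encoding or repeated headers.

```python
import json


def parse_http_response(raw: bytes) -> dict:
    """Split a raw HTTP/1.1 response into status line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")  # a blank line ends the header
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, status, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "version": version,
        "status": int(status),
        "reason": reason,
        "headers": headers,
        "body": body,
    }


RAW_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: application/json\r\n"
    b"Content-Length: 16\r\n"
    b"\r\n"
    b'{"status": "ok"}'
)

response = parse_http_response(RAW_RESPONSE)
print(response["status"], response["reason"])  # 200 OK
print(response["headers"]["content-type"])     # application/json
print(json.loads(response["body"]))            # {'status': 'ok'}
```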
The codes included in HTTP requests and responses convey information about the kind of request or response. HTTP methods (verbs) specify what kind of request is being made. We won’t list all 9 methods here, but the key ones to know are:
- GET - a request to read data
- POST - a request to create the data in the body
- PUT - a request to create or update data at a specified url with the data in the body
- DELETE - a request to delete the data at the specified url
- OPTIONS - a request for a listing of the HTTP methods a server supports
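One way to picture these methods is as operations on a store keyed by URL. The `NotesService` class below is a hypothetical in-memory stand-in for a server, not a real framework API, and the status codes it returns follow common conventions rather than any single required behavior.

```python
import itertools
from typing import Dict, Tuple


class NotesService:
    """Hypothetical in-memory resource store keyed by URL."""

    def __init__(self) -> None:
        self.notes: Dict[str, str] = {}
        self._ids = itertools.count(1)

    def handle(self, method: str, url: str, body: str = "") -> Tuple[int, str]:
        """Return a (status code, payload) pair for a request."""
        if method == "GET":      # read data
            if url in self.notes:
                return 200, self.notes[url]
            return 404, ""
        if method == "POST":     # create the data in the body; server picks the URL
            new_url = f"{url}/{next(self._ids)}"
            self.notes[new_url] = body
            return 201, new_url
        if method == "PUT":      # create or update data at the specified URL
            status = 200 if url in self.notes else 201
            self.notes[url] = body
            return status, ""
        if method == "DELETE":   # delete the data at the specified URL
            if url in self.notes:
                del self.notes[url]
                return 204, ""
            return 404, ""
        if method == "OPTIONS":  # list the supported methods
            return 204, "GET, POST, PUT, DELETE, OPTIONS"
        return 405, ""           # any other method is not supported here


service = NotesService()
print(service.handle("POST", "/notes", "buy milk"))  # (201, '/notes/1')
print(service.handle("GET", "/notes/1"))             # (200, 'buy milk')
print(service.handle("DELETE", "/notes/1"))          # (204, '')
print(service.handle("GET", "/notes/1"))             # (404, '')
```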
HTTP status codes indicate what kind of response has been sent back. They are 3-digit codes that are grouped by the first digit of the code. Each code has a corresponding “reason phrase” to make human interpretation easier.
- 1XX - informational response e.g. `102 Processing`
- 2XX - successful response e.g. `200 OK`
- 3XX - redirection response e.g. `302 Found`
- 4XX - client error response e.g. `404 Not Found`
- 5XX - server error response e.g. `500 Internal Server Error`
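The grouping by first digit is easy to express in code. This sketch uses the `HTTPStatus` enum from Python’s standard library to look up reason phrases; the `describe` helper and the class labels are illustrative names taken from the list above.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "informational",
    2: "successful",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return the code, its reason phrase, and its class."""
    status = HTTPStatus(code)  # raises ValueError for codes the enum lacks
    return f"{status.value} {status.phrase} ({STATUS_CLASSES[code // 100]})"


for code in (102, 200, 302, 404, 500):
    print(describe(code))
# 102 Processing (informational)
# 200 OK (successful)
# 302 Found (redirection)
# 404 Not Found (client error)
# 500 Internal Server Error (server error)
```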
Other features of HTTP include sessions, which can be established and maintained either server side, or client side with HTTP cookies. HTTP also supports authentication in a variety of ways.
A proxy is a server that sits between a client and application server to provide some intermediary service to the communication. There are two kinds of proxies that provide different services: forward proxies and reverse proxies.
2.1 Forward Proxies
A forward proxy sits between a pool of clients and the public internet. The goal of a forward proxy is to protect the particular client pool by filtering outgoing requests and incoming responses.
The common use cases for forward proxies are:
- Blocking malicious websites
- Anonymizing network traffic by using the IP address of the proxy instead of the client
For example, a school network might decide to block requests going out to certain social media websites. Alternatively a business network might try to mitigate phishing attacks by not allowing employee requests to known malicious domain names.
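The filtering decision in these examples could be sketched as a blocklist check on the requested hostname. The domain names below are hypothetical placeholders, and a real forward proxy would also need to handle the actual forwarding, HTTPS tunnelling, and logging.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist; a real deployment would load this from policy.
BLOCKED_DOMAINS = {"social.example", "phishing.example"}


def is_blocked(host: str) -> bool:
    """Match the blocked domain itself and any of its subdomains."""
    host = host.lower().rstrip(".")
    return any(
        host == domain or host.endswith("." + domain)
        for domain in BLOCKED_DOMAINS
    )


def forward_proxy_decision(url: str) -> str:
    """Decide whether an outgoing client request should leave the network."""
    host = urlsplit(url).hostname or ""
    return "block" if is_blocked(host) else "forward"


print(forward_proxy_decision("https://www.social.example/feed"))  # block
print(forward_proxy_decision("https://docs.example.org/guide"))   # forward
```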
2.2 Reverse Proxies
A reverse proxy sits between the public internet and a pool of servers. Because of their location in the system as an intermediary, reverse proxies can provide a number of services, including:
- Anonymizing the cluster servers
- TLS (SSL) termination
- Load balancing
- Filtering requests
- Attack prevention (e.g. DoS detection)
For example, if a company wanted to expose a public API for querying data, but not modifying it, they could filter out any requests that used an HTTP verb other than GET before passing the requests on to the servers that actually process and generate the responses.
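The read-only API example can be sketched together with simple round-robin load balancing. The `ReverseProxy` class and the backend names are hypothetical, and the sketch only decides where a request would go rather than actually forwarding it.

```python
import itertools
from typing import Sequence, Tuple


class ReverseProxy:
    """Toy reverse proxy: allow only GET, then pick a backend round-robin."""

    def __init__(self, backends: Sequence[str]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = itertools.cycle(backends)

    def route(self, method: str, path: str) -> Tuple[int, str]:
        """Return (405, reason) for filtered requests, else (200, backend)."""
        if method.upper() != "GET":
            # Reject anything that could modify data before it reaches a server.
            return 405, "Method Not Allowed"
        return 200, next(self._backends)


# Hypothetical backend names inside the private network.
proxy = ReverseProxy(["app-1.internal", "app-2.internal"])
print(proxy.route("GET", "/items"))     # (200, 'app-1.internal')
print(proxy.route("GET", "/items/42"))  # (200, 'app-2.internal')
print(proxy.route("POST", "/items"))    # (405, 'Method Not Allowed')
print(proxy.route("GET", "/items"))     # (200, 'app-1.internal')
```

Because the filtered request never consumes a backend from the rotation, the application servers only ever see read traffic.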
As another example, a service could use a reverse proxy to handle TLS termination (the decryption of HTTPS requests) so that the application servers don’t have to handle encryption/decryption. The proxy would then pass on the requests to the servers within a private network so the communication is still secure.
The questions asked in system design interviews tend to begin with a broad problem or goal, so it’s unlikely that you’ll get an interview question entirely about network protocols or proxies.
However, you may be asked to solve a problem where these topics will be relevant. As a result, what you really need to know is WHEN (or IF) you should bring them up and how you should approach them.
To help you with this, we’ve compiled the below list of sample system design interview questions, where network protocols or proxies are relevant.
- Design Twitter
- Design a highly-scalable system
- Design a Web Crawler
- TCP vs. UDP
Network protocols and proxies each describe one element of a broader system. But to succeed on system design interviews, you’ll also need to familiarize yourself with a few other concepts. And you’ll need to practice how you communicate your answers.
It’s best to take a systematic approach to make the most of your practice time, and we recommend the following steps:
4.1 Learn the concepts
There is a base level of knowledge required to be able to speak intelligently about system design. To help you get this foundational knowledge (or to refresh your memory), we’ve published a full series of articles like this one, which cover the primary concepts that you’ll need to know:
- Network protocols and proxies
- Latency, throughput, and availability
- Load balancing
- Leader election
- Polling, SSE, and WebSockets
- Queues and pub-sub
We’d encourage you to begin your preparation by reviewing the above concepts and by studying our system design interview guide, which covers a step-by-step method for answering system design questions. Once you're familiar with the basics, you should begin practicing with example questions.
4.2 Practice by yourself or with peers
Next, you’ll want to get some practice with system design questions. You can start with the examples listed above, or with the example questions in our system design guide.
We’d recommend that you start by interviewing yourself out loud. You should play both the role of the interviewer and the candidate, asking and answering questions. This will help you develop your communication skills and your process for breaking down questions.
We would also strongly recommend that you practice solving system design questions with a peer interviewing you. A great place to start is to practice with friends or family if you can. If you don't have anyone in your network who can interview you, then you might want to check out our system design mock interview peer group.
4.3 Practice with ex-interviewers
Practicing with peers can be a great help, and it's usually free. But, at some point you'll start noticing that the feedback you are getting from peers isn't helping you that much anymore. Once you reach that stage, we recommend practicing with ex-interviewers from top tech companies.
If you know someone who has experience running interviews at Facebook, Google, or another big tech company, then that's fantastic. But for most of us, it's tough to find the right connections to make this happen. And it might also be difficult to practice multiple hours with that person unless you know them really well.
Here's the good news. We've already made the connections for you. We’ve created a coaching service where you can practice system design interviews 1-on-1 with ex-interviewers from leading tech companies. Learn more and start scheduling sessions today.
This is just one of 9 concept guides that we've published about system design interviews. Check out all of our system design articles on our Tech blog.