Stanford CS144 Computer Networks Podcast: Introduction to Computer Networks - Layered Model, ARP, IP, and the Application Layer

When we type a URL into a browser and press Enter, a gorgeous webpage appears before our eyes. Behind this seemingly instantaneous action lies a complex yet elegant system composed of countless protocols, routers, and links, silently at work. This system is the Internet.

This article will build on the core ideas of Stanford University's CS 144 course and take you deep into the cornerstones of the Internet—from the grand blueprint of network design (the four-layer model), to its core protocol (IP), to the subtle details of address allocation and routing.

Macro Design: The Four Pillars of the Internet

To manage the complexity of network communications, computer scientists have adopted the layered approach from software engineering and divided the functionality of the entire internet into four layers. This layered design not only reduces system complexity but also allows each layer to evolve independently, greatly promoting technological innovation.

The Four-Layer Internet Model

Imagine sending a package. You first package the item (data) (application layer), then write the recipient and sender information (transport layer), then affix the specific street address (network layer), and finally, the courier chooses the specific transportation and route (link layer) to deliver it. The four-layer model of the Internet is similar to this:

Application Layer: This is the layer we interact with most often. It defines how applications communicate with each other. For example, a web browser uses the HTTP protocol to retrieve web pages, and an email client uses the SMTP protocol to send email. This layer is concerned with the specific semantics of the communication content.
Transport Layer: Responsible for end-to-end data transmission. It provides a communication channel for the application layer. The best-known protocol is the Transmission Control Protocol (TCP), which ensures that data arrives at its destination intact and in order, and handles network congestion. For applications with strict real-time requirements but a tolerance for minor data loss, such as video calls, the lighter User Datagram Protocol (UDP) is used.
Network Layer: This is the heart of the entire Internet. It is responsible for relaying data packets from the source host across multiple networks all the way to the destination host. The core protocol of the network layer is the Internet Protocol (IP). It defines the address format and routing rules, gluing countless subnetworks around the world into a unified Internet.
Link Layer: This is the lowest layer, responsible for transmitting data over a single physical link (for example, between your computer and a router, or between two routers). Ethernet and Wi-Fi are both link-layer technologies.

Encapsulation: The Art of Packaging Layer by Layer

The layered model introduces a key operation: encapsulation. When application data is passed down, each layer adds its own "header information" to it, just like packaging.

The journey of an HTTP request is as follows:

The application layer generates the HTTP data.
The transport layer packages it into a TCP segment by adding a TCP header (including information such as port numbers).
The network layer packages the TCP segment into an IP datagram by adding an IP header (containing information such as the source and destination IP addresses).
The link layer packages the IP datagram into a frame by adding a link-layer header (including information such as MAC addresses) and finally sends it out over the physical medium.

This process can be clearly represented by the following diagram:

+---------------+---------------+---------------+---------------+---------------+
| Link-layer    | Network-layer | Transport-    | Application-  | Link-layer    |
| header        | header        | layer header  | layer data    | trailer       |
| (Ethernet)    | (IP)          | (TCP)         | (HTTP)        | (Ethernet)    |
+---------------+---------------+---------------+---------------+---------------+
                |<---------------- IP datagram ---------------->|
                                |<-------- TCP segment -------->|

The receiver will perform the opposite "unpacking" process, stripping off the header layer by layer, and finally handing over the original application layer data to the application.

In Ethernet, a frame consists of three parts:

[ Ethernet Header ] + [ IP Datagram (Payload) ] + [ Ethernet Trailer ]

Ethernet Header: Contains the destination MAC address, the source MAC address, and a type field (EtherType).
IP Datagram (Payload): The entire network-layer unit, that is, the IP header plus the upper-layer data.
Ethernet Trailer: Usually contains only one field, the FCS (Frame Check Sequence), which is used to detect link-layer transmission errors. It is 4 bytes long.

So the link-layer trailer shown in the diagram above is simply the FCS. It has no deeper meaning: it is just a check field that the link layer uses for error detection.

Some textbooks/diagrams omit it because it is not core to understanding the layered structure, but strictly speaking it is there.

HTTP sits at the application layer, and its "data" is what we really care about (web page content, request messages, and so on).

During encapsulation, HTTP data is packaged layer by layer:

HTTP data → add a TCP header → TCP segment
TCP segment → add an IP header → IP datagram
IP datagram → add an Ethernet header and trailer → Ethernet frame

So:

In the eyes of TCP, HTTP is "data".
From the perspective of IP, the entire TCP segment (including HTTP) is "data".
From the perspective of Ethernet, the entire IP datagram is "data".

This is the key to layered encapsulation: each layer looks only at the header it needs to process and treats everything handed down from the layer above as "data".
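The nesting described above can be sketched in a few lines of Python. This is a toy illustration only: the "headers" below are made-up placeholder byte strings (`[TCP]`, `[IP]`, `[ETH]`, `[FCS]`), not real protocol header layouts, and the helper names `encapsulate` and `decapsulate` are hypothetical.

```python
# Toy illustration of layered encapsulation. The "headers" are made-up
# placeholder byte strings, not real TCP/IP/Ethernet header formats.

def encapsulate(payload: bytes, header: bytes, trailer: bytes = b"") -> bytes:
    """Wrap a payload; the wrapping layer treats it as opaque data."""
    return header + payload + trailer


def decapsulate(unit: bytes, header_len: int, trailer_len: int = 0) -> bytes:
    """Strip this layer's header (and trailer) and return the payload."""
    return unit[header_len:len(unit) - trailer_len]


TCP_HDR, IP_HDR, ETH_HDR, ETH_FCS = b"[TCP]", b"[IP]", b"[ETH]", b"[FCS]"

http_data = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"

# Sender: wrap layer by layer on the way down the stack.
segment = encapsulate(http_data, TCP_HDR)        # TCP segment
datagram = encapsulate(segment, IP_HDR)          # IP datagram
frame = encapsulate(datagram, ETH_HDR, ETH_FCS)  # Ethernet frame

# Receiver: unwrap layer by layer on the way up the stack.
rx_datagram = decapsulate(frame, len(ETH_HDR), len(ETH_FCS))
rx_segment = decapsulate(rx_datagram, len(IP_HDR))
rx_http = decapsulate(rx_segment, len(TCP_HDR))

print(frame[:14])            # b'[ETH][IP][TCP]'
print(rx_http == http_data)  # True
```

Note that neither helper ever inspects the payload, which mirrors the point that each layer processes only its own header.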

The core of the network: the "thin waist" philosophy of the IP protocol

In the four-layer model, the network layer's IP protocol plays an irreplaceable role and is known as the "thin waist" of the Internet. This means that no matter how upper-layer applications (HTTP, FTP, DNS) change, or how lower-layer link technologies (Ethernet, Wi-Fi, 5G) evolve, they must all connect through this single "waist" protocol, IP.

There is a profound philosophy behind this design.

IP Service Model: Simple is Beautiful

The designers of the IP protocol chose an extremely simple service model, whose core features can be summarized as follows:

Unreliable: IP does not guarantee that a packet will be delivered. It may be lost, damaged, delayed, or duplicated in transit.
Best-Effort: Although unreliable, IP promises to do its best to deliver packets and will not drop them without cause (unless the network is congested or the packet's time to live expires).
Connectionless: IP does not establish any connection between the source and destination before sending data. Each packet is treated independently and finds its own path, which allows the routers at the core of the network to be very simple and efficient.

End-to-End Principle

Since IP is so "unreliable", how can we achieve reliable communication? The answer is the end-to-end principle.

This principle states that complex functions such as reliability and congestion control should be the responsibility of the two communicating endpoints (that is, the end hosts, such as the user's computer) rather than of the routers in the middle of the network.

The benefits of doing this are enormous:

Keep the network core simple: Routers only need to focus on forwarding packets quickly, which lets them be cheaper, faster, and more reliable.
Flexible application customization: Applications can choose the level of reliability they need on the end system. File transfers that require high reliability can use TCP, while real-time video can use the simpler UDP and leave any reliability it needs to the application itself.
Easy to innovate: New application functions can be quickly deployed and iterated on end systems without changing the core devices of the entire Internet.

List of IPv4 header fields

To achieve the above functions, each IP datagram has a header that contains all the information required for routing and processing. The following are the main fields of the IPv4 header:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Identification        |Flags|      Fragment Offset    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Time to Live |    Protocol   |         Header Checksum       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Source Address                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Destination Address                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Options                    |    Padding    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.

Version: Indicates the protocol version. For IPv4, this value is 4.
Time to Live (TTL): An 8-bit value that each router decrements by 1 as it forwards the packet. When the TTL reaches 0, the packet is discarded. This prevents packets from looping endlessly in the network because of routing errors.
Protocol (Protocol ID): Tells the receiving host which transport-layer protocol should handle the packet's payload. For example, 6 denotes TCP and 17 denotes UDP.
Source/Destination Address: 32-bit IP addresses that serve as the basis for the entire forwarding process.
Header Checksum: Used to detect whether the IP header was corrupted in transit.
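To make the field layout concrete, here is a minimal Python sketch that packs and unpacks a 20-byte IPv4 header (no options) with the standard `struct` module and computes the header checksum as a 16-bit one's-complement sum. The helper names (`build_header`, `parse_header`, `internet_checksum`) and the sample addresses are assumptions chosen for illustration; Identification, Flags, and Fragment Offset are simply left at zero.

```python
import socket
import struct

# Version/IHL, ToS, Total Length, Identification, Flags/Fragment Offset,
# TTL, Protocol, Header Checksum, Source Address, Destination Address.
IPV4_FORMAT = "!BBHHHBBH4s4s"  # "!" = network (big-endian) byte order


def internet_checksum(data: bytes) -> int:
    """One's-complement of the one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(src: str, dst: str, payload_len: int,
                 ttl: int = 64, protocol: int = 6) -> bytes:
    """Build a 20-byte header; protocol 6 = TCP, 17 = UDP."""
    fields = [(4 << 4) | 5, 0, 20 + payload_len, 0, 0, ttl, protocol, 0,
              socket.inet_aton(src), socket.inet_aton(dst)]
    # Checksum is computed with the checksum field set to zero.
    fields[7] = internet_checksum(struct.pack(IPV4_FORMAT, *fields))
    return struct.pack(IPV4_FORMAT, *fields)


def parse_header(header: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header."""
    (ver_ihl, _tos, total_len, _ident, _flags_frag,
     ttl, protocol, _checksum, src, dst) = struct.unpack(IPV4_FORMAT, header[:20])
    return {
        "version": ver_ihl >> 4,
        "ihl_words": ver_ihl & 0x0F,
        "total_length": total_len,
        "ttl": ttl,
        "protocol": protocol,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
        # Summing a valid header, checksum included, yields zero.
        "checksum_ok": internet_checksum(header[:20]) == 0,
    }


header = build_header("192.0.2.1", "198.51.100.7", payload_len=100)
print(parse_header(header))
```

If any bit of the header is flipped in transit, `checksum_ok` becomes false, which is how a receiver would notice the corruption.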

Addresses and Byte Order: Network Communication Conventions

Byte Order: Big-Endian and Little-Endian Agreement

When we store a multi-byte number in memory (such as a 32-bit integer), there are two possible arrangements:

Big-endian : The most significant byte is stored at the lowest memory address. This is consistent with human reading habits.
Little-endian : The least significant byte is stored at the lowest memory address. Intel x86 architecture processors are typical little-endian processors.

Let's use a 32-bit integer 0x12345678 as an example:

Memory address    0x100  0x101  0x102  0x103    (low ----> high)
                 +------+------+------+------+
Big-endian       |  12  |  34  |  56  |  78  |
                 +------+------+------+------+
Little-endian    |  78  |  56  |  34  |  12  |
                 +------+------+------+------+

Different machines may use different byte orders. To ensure that both ends interpret data the same way, Internet protocols specify that multi-byte integer fields in protocol headers are transmitted in big-endian order, which is therefore also called network byte order.

To facilitate developers, network libraries in programming languages such as C provide functions such as htons() (Host to Network Short) and ntohl() (Network to Host Long) to automatically convert between host byte order and network byte order.
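The same conversions can be tried from Python, whose standard `socket` module exposes `htons()`/`ntohs()` and whose `struct` module lets you choose the byte order explicitly. The snippet below is a small sketch using the 0x12345678 example from above; the port number 8080 is just an arbitrary sample value.

```python
import socket
import struct

value = 0x12345678

big = struct.pack(">I", value)     # big-endian (network byte order)
little = struct.pack("<I", value)  # little-endian

print(big.hex(" "))     # 12 34 56 78
print(little.hex(" "))  # 78 56 34 12

# "!" in a struct format string also means network byte order.
port = 8080
wire = struct.pack("!H", port)
print(wire.hex(" "))    # 1f 90

# htons()/ntohs() convert a 16-bit value between host and network order.
# The intermediate number depends on the host CPU, but the round trip
# always restores the original value.
assert socket.ntohs(socket.htons(port)) == port
```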

IPv4 Addresses and CIDR

An IPv4 address is a 32-bit number, usually written in dotted decimal notation (e.g. 171.64.0.0). For efficient management and routing, an IP address is divided into a network portion and a host portion.

The early classification of Class A, B, and C addresses was very rigid, resulting in a large amount of address waste. Today, we use Classless Inter-Domain Routing (CIDR) to allocate and represent IP address blocks.

CIDR uses the format "IP address/prefix length", for example 171.64.0.0/16. Here, /16 means that the first 16 bits of the address are the network portion and the remaining 32 - 16 = 16 bits are the host portion. The prefix length can be any value from 0 to 32, which greatly increases the flexibility of address allocation.
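Python's standard `ipaddress` module can be used to explore a CIDR block such as 171.64.0.0/16. The sketch below also shows the underlying bit-mask arithmetic; the helper name `network_of` and the test address 171.64.15.9 are assumptions made for illustration.

```python
import ipaddress

net = ipaddress.ip_network("171.64.0.0/16")

print(net.prefixlen)      # 16
print(net.netmask)        # 255.255.0.0
print(net.num_addresses)  # 65536  (2 ** (32 - 16))

print(ipaddress.ip_address("171.64.15.9") in net)  # True
print(ipaddress.ip_address("171.65.0.1") in net)   # False


def network_of(addr: str, prefix_len: int) -> str:
    """Keep the first prefix_len bits of addr and zero the host bits."""
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return str(ipaddress.ip_address(int(ipaddress.ip_address(addr)) & mask))


print(network_of("171.64.15.9", 16))  # 171.64.0.0
```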

Longest Prefix Match

When a router receives a data packet, how does it decide which port to forward it out? The answer is to query its internal forwarding table and follow the Longest Prefix Match (LPM) principle.

A router's forwarding table consists of a series of CIDR entries. When a packet arrives, the router compares its destination IP address against all entries in the table and selects the matching entry with the longest prefix.

Let’s look at an example:

Router forwarding table
+--------------------+--------------+
|       Prefix       |   Next hop   |
+--------------------+--------------+
| 128.10.0.0/16      | Interface A  |
| 128.10.1.0/24      | Interface B  |
| 0.0.0.0/0          | Interface C  |  <-- default route
+--------------------+--------------+

If a packet's destination address is 128.10.2.5, it matches both /16 and /0, but /16 is longer, so the packet is forwarded out of interface A.
If the destination address is 128.10.1.5, it matches all three entries (/16, /24, and /0). /24 is the longest prefix, so the packet is forwarded out of interface B.
If the destination address is 192.168.1.1, it matches only /0 (the default route), so it is forwarded out of interface C.
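The three lookups above can be reproduced with a short Python sketch. It uses a simple linear scan over the table, which is an assumption made for clarity; real routers typically use specialized data structures or hardware for this lookup. The function name `lookup` is hypothetical.

```python
import ipaddress

# The forwarding table from the example above: (prefix, outgoing interface).
FORWARDING_TABLE = [
    (ipaddress.ip_network("128.10.0.0/16"), "A"),
    (ipaddress.ip_network("128.10.1.0/24"), "B"),
    (ipaddress.ip_network("0.0.0.0/0"), "C"),  # default route
]


def lookup(dst: str) -> str:
    """Return the interface of the matching entry with the longest prefix."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, iface) for net, iface in FORWARDING_TABLE if addr in net]
    # The default route 0.0.0.0/0 matches everything, so matches is never empty.
    _best_net, best_iface = max(matches, key=lambda entry: entry[0].prefixlen)
    return best_iface


print(lookup("128.10.2.5"))   # A
print(lookup("128.10.1.5"))   # B
print(lookup("192.168.1.1"))  # C
```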

Address Resolution Protocol (ARP)

IP addresses operate at the network layer, solving the global routing problem. However, once a packet reaches the local area network (LAN), how can it be delivered to exactly the right machine? This requires a link-layer address, commonly known as a MAC address.

The Address Resolution Protocol (ARP) is the bridge that establishes the mapping between IP addresses and MAC addresses.

The workflow is as follows:

Host A wants to send data to host B (IP: 192.168.1.100) on the same LAN, but it knows only B's IP address, not its MAC address.
Host A first checks its own ARP cache to see whether there is an entry for 192.168.1.100.
If none is found, host A broadcasts an ARP request on the LAN that says, in effect: "Who has IP address 192.168.1.100? Please tell me your MAC address."
All hosts on the LAN receive this request, but only host B responds.
Host B sends an ARP reply to host A that reads: "The MAC address for IP 192.168.1.100 is XX:XX:XX:XX:XX:XX."
After receiving the reply, host A knows host B's MAC address and stores the mapping in its own ARP cache. It can then build the link-layer frame and send the data.
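The workflow above can be mimicked with a small in-memory simulation. This is a sketch of the logic only, not of the ARP packet format: the classes `Host` and `Lan`, the function `resolve`, and the MAC addresses are all invented for illustration, and details such as cache expiry are omitted.

```python
from typing import Dict, List, Optional


class Host:
    def __init__(self, name: str, ip: str, mac: str) -> None:
        self.name, self.ip, self.mac = name, ip, mac
        self.arp_cache: Dict[str, str] = {}  # IP address -> MAC address

    def handle_arp_request(self, target_ip: str) -> Optional[str]:
        """Reply with our MAC only if the request asks for our IP."""
        return self.mac if target_ip == self.ip else None


class Lan:
    def __init__(self) -> None:
        self.hosts: List[Host] = []
        self.broadcasts = 0  # how many ARP requests were broadcast

    def broadcast_arp_request(self, sender: Host, target_ip: str) -> Optional[str]:
        """Deliver the request to every other host; at most one replies."""
        self.broadcasts += 1
        for host in self.hosts:
            if host is sender:
                continue
            reply = host.handle_arp_request(target_ip)
            if reply is not None:
                return reply
        return None


def resolve(lan: Lan, sender: Host, target_ip: str) -> Optional[str]:
    """Check the cache first; broadcast only on a cache miss."""
    if target_ip in sender.arp_cache:
        return sender.arp_cache[target_ip]
    mac = lan.broadcast_arp_request(sender, target_ip)
    if mac is not None:
        sender.arp_cache[target_ip] = mac
    return mac


lan = Lan()
host_a = Host("A", "192.168.1.10", "02:00:00:00:00:0a")
host_b = Host("B", "192.168.1.100", "02:00:00:00:00:0b")
lan.hosts += [host_a, host_b]

print(resolve(lan, host_a, "192.168.1.100"))  # 02:00:00:00:00:0b (via broadcast)
print(resolve(lan, host_a, "192.168.1.100"))  # 02:00:00:00:00:0b (from cache)
print(lan.broadcasts)                         # 1
```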

From the macro-level layered design, to the IP protocol's concise yet powerful "thin waist" model, to the specific mechanics of byte ordering, address allocation, route matching, and address resolution, we see the wisdom and trade-offs inherent in Internet design. By adhering to the end-to-end principle of "keeping the core simple and pushing complexity to the edges," the Internet has built a stable and innovative global network. Understanding these fundamental principles provides a solid foundation for delving deeper into more advanced networking concepts, such as reliable transport with TCP and domain name resolution with DNS.

In this discussion, we've explored the underlying foundations of the internet: from the elegant four-layer model to the IP protocol as its "thin waist." We've understood how packets are addressed and routed within the network. Now, let's shift our focus upwards to examine the applications that run on this solid foundation and explore how this great global network has evolved, how it is governed, and where it's headed.

Applications in all shapes and sizes: different communication models

The core requirement of nearly every network application, from web browsing to video calls, boils down to a simple model: establishing a reliable two-way byte stream between two computers. This means that a program can write data to one end of this "pipe," just like reading or writing a local file, and be confident that the data will appear intact and in order at the other end. TCP was created to address this need.

However, different applications use very different architectures to achieve this goal.

World Wide Web: Classic Client-Server Model

This is the model we are most familiar with. When you visit a website:

Model: Strict client-server model. Your browser is the client, and the machine running the website is the server.
Protocol: HTTP (HyperText Transfer Protocol).
Process:

Your browser (client) opens a TCP connection to the web server named in the URL, on port 80 (HTTP) or 443 (HTTPS).
After the connection is established, the browser sends an HTTP GET request for a specific web resource.
The server processes the request and sends back an HTTP response that contains a status code (such as 200 OK) and the HTML content of the web page.
The browser receives and parses the HTML, and finally renders the page we see.

This model is simple and intuitive, but all the pressure is concentrated on the server. If the number of visits is too large, the server will be overwhelmed.
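The request/response exchange described above can be observed end to end with nothing but Python's standard library. The sketch below starts a tiny local web server on an ephemeral port (an assumption made so that the example runs without Internet access) and then plays the role of the browser by writing a GET request onto a plain TCP socket. `HelloHandler` and `http_get` are hypothetical names.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """A stand-in web server that returns one small HTML page."""

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def http_get(host: str, port: int, path: str = "/") -> bytes:
    """Open a TCP connection, send a GET request, read until the server closes."""
    request = (f"GET {path} HTTP/1.1\r\n"
               f"Host: {host}\r\n"
               "Connection: close\r\n\r\n").encode("ascii")
    chunks = []
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request)
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

response = http_get("127.0.0.1", server.server_address[1])
server.shutdown()
server.server_close()

head, _, body = response.partition(b"\r\n\r\n")
print(head.split(b"\r\n")[0].decode())  # status line, ending in "200 OK"
print(body.decode())                    # <html><body>hello</body></html>
```

A real browser would additionally negotiate TLS for port 443 and parse and render the returned HTML; both steps are left out here.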

BitTorrent: A decentralized peer-to-peer model

To solve the bottleneck of centralized downloading, BitTorrent takes a completely different approach:

Model: Peer-to-Peer (P2P) model. In this network, each participant is both a downloader (client) and an uploader (server).
Core concepts:

Pieces: A large file is divided into many small data blocks.
Swarm: The collection of all users who are downloading or sharing the same file.
Tracker: A coordinator. It does not store the files themselves; it only keeps track of which users (peers) are currently online in the swarm. Peers then tell each other directly which pieces they hold.

Process:

You open a .torrent file, and your client contacts the tracker to get a list of the peers currently online.
Your client opens TCP connections directly to other peers.
You download different pieces from different peers and, at the same time, upload the pieces you have already downloaded to other peers who need them.

With this "all for one, one for all" model, every additional downloader also contributes upload capacity, so downloads tend to get faster as more people join, and the bandwidth load is spread across the whole swarm.
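The idea of fetching different pieces from different peers can be sketched as follows. This is a deliberately simplified, in-memory model: the peers are plain dictionaries, the names `split_into_pieces` and `download` are hypothetical, and no networking or tracker is involved. The per-piece SHA-1 check reflects the fact that a .torrent file carries a hash for every piece, which lets a downloader verify data received from untrusted peers.

```python
import hashlib
from typing import Dict, List


def split_into_pieces(data: bytes, piece_len: int) -> List[bytes]:
    """Cut the file into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + piece_len] for i in range(0, len(data), piece_len)]


def download(piece_hashes: List[bytes],
             peers: Dict[str, Dict[int, bytes]]) -> bytes:
    """Fetch each piece from any peer that has a copy matching its hash."""
    assembled = []
    for index, expected in enumerate(piece_hashes):
        for have in peers.values():
            piece = have.get(index)
            if piece is not None and hashlib.sha1(piece).digest() == expected:
                assembled.append(piece)
                break
        else:
            raise LookupError(f"no peer in the swarm has piece {index}")
    return b"".join(assembled)


original = bytes(range(256)) * 4           # a 1024-byte stand-in "file"
pieces = split_into_pieces(original, 100)  # 11 pieces
piece_hashes = [hashlib.sha1(p).digest() for p in pieces]

# Two peers, each holding only half of the pieces.
peers = {
    "peer_even": {i: p for i, p in enumerate(pieces) if i % 2 == 0},
    "peer_odd": {i: p for i, p in enumerate(pieces) if i % 2 == 1},
}

rebuilt = download(piece_hashes, peers)
print(len(pieces), rebuilt == original)  # 11 True
```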

Skype: A hybrid model for a complex reality

Skype's goal is to make voice and video calls between users as smooth as possible. However, a significant obstacle is the Network Address Translator (NAT). Most home and business devices sit behind NAT routers. These devices have private IP addresses and are not directly reachable from the public Internet, which creates significant challenges for P2P communication.

To penetrate NAT, Skype has designed a clever hybrid model:

Rendezvous Server: When you log in to Skype, your client establishes a connection with a rendezvous server on the public Internet. This server acts as an "introducer".
Reverse Connection: When Alice wants to call Bob, who is behind a NAT, she cannot connect to him directly.

Alice's client tells the rendezvous server: "I want to call Bob."
The server finds the connection that Bob has already established and uses it to send a command to Bob's client: "Alice wants to call you; please initiate a connection to Alice's public IP and port."
Bob's client initiates a TCP connection to Alice. Because this connection is initiated from inside the NAT, it can pass through the NAT successfully.

Relay Server: What if Alice and Bob are both behind NATs, so that even a reverse connection fails? Skype then falls back to its last resort:

Alice's and Bob's clients each establish a connection with a relay server on the public Internet.
The call data from each party is sent to the relay server first, which then forwards it to the other party.

The schematic diagrams of these three connection methods are as follows:

// Ideal case: direct connection
Alice <----------------------> Bob

// Option 2: reverse connection (Bob is behind NAT)
1. Alice -> Rendezvous Server -> Bob
2. Bob ----------------------> Alice (Bob initiates the connection)

// Option 3: relay (both parties are behind NAT)
Alice <--------> Relay Server <--------> Bob

Skype's success is largely due to this pragmatic, multi-tiered connection strategy, which prioritizes the most efficient P2P connections and gracefully degrades to server-assisted mode if they fail.
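The fallback order can be summarized as a small decision function. This is a simplified sketch of the strategy as described here, not Skype's actual implementation: it reduces each side to a single "behind NAT" flag, whereas a real client would attempt each method in turn and move on when one fails. The names `Method` and `choose_method` are hypothetical.

```python
from enum import Enum


class Method(Enum):
    DIRECT = "direct"    # caller connects straight to the callee
    REVERSE = "reverse"  # callee is asked, via the rendezvous server, to call back
    RELAY = "relay"      # both sides connect outward to a relay server


def choose_method(caller_behind_nat: bool, callee_behind_nat: bool) -> Method:
    """Pick the cheapest method that can work under this simplified model."""
    if not callee_behind_nat:
        return Method.DIRECT   # callee is reachable from the public Internet
    if not caller_behind_nat:
        return Method.REVERSE  # callee can open an outbound connection to the caller
    return Method.RELAY        # neither side can accept inbound connections


print(choose_method(False, False).value)  # direct
print(choose_method(False, True).value)   # reverse
print(choose_method(True, True).value)    # relay
```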

The Pulse of the Internet: Evolution, Governance, and Challenges

The Internet is not a static technological artifact; it is a living organism that is constantly evolving and governed by a global community.

Who controls the internet?

The Internet has no single "owner"; its technical standards and resource allocation are managed collaboratively by multiple international organizations:

IETF (Internet Engineering Task Force): Responsible for developing the vast majority of Internet protocol standards (such as IP, TCP, and HTTP). The IETF is known for its unique "meritocracy of ideas" and its pragmatic culture of "rough consensus and running code."
W3C (World Wide Web Consortium): Focuses on standards at the Web application level, such as HTML and CSS.
ICANN (Internet Corporation for Assigned Names and Numbers): Manages the global Domain Name System (DNS) and the allocation of IP addresses.

Evolution Case: The SIP and VoIP Revolution

In the late 1990s, with the increase in network bandwidth and computing power, voice transmission over IP networks, also known as VoIP (Voice over IP), became possible. One of the key technologies in this revolution was the Session Initiation Protocol (SIP).

SIP stems from the concept of soft switching: replacing the control logic of traditional, expensive, and cumbersome telephone switches with software running on general-purpose computers. SIP is a signaling protocol that establishes, modifies, and terminates sessions (such as voice calls or video conferences).

SIP has become a bridge between the traditional telephony world and IP networks. Services like Skype Out/In use the SIP protocol, allowing you to call a regular phone from your computer, and vice versa. However, to gain widespread market adoption and acceptance among traditional telecom operators, SIP introduced some design compromises, such as allowing intermediary servers to view signaling information, which deviates somewhat from the pure end-to-end principle of the Internet.

Emerging Trends and Future Challenges

The evolution of the Internet never stops, and new technologies and challenges continue to emerge:

Software Defined Networking (SDN) : Similar to soft switching, SDN separates the network's control plane (which determines how packets are routed) from the data plane (which actually forwards packets). This makes network management more centralized, programmable, and flexible, posing a disruptive challenge to traditional network equipment manufacturers.
Privacy and Security: Events like the PRISM scandal have sparked global concerns about online surveillance. The IETF has launched efforts such as PerPass to explore ways to enhance privacy protections on the Internet through encryption and improved protocol design.
Content Distribution Networks (CDNs) : Traffic giants like Netflix and YouTube deploy CDN servers globally, caching content closer to users, significantly improving access speed and network efficiency. In the future, collaboration between different CDNs will become a new research hotspot.
The pains of traditional network integration : As the internet and traditional telephone networks converge, some old problems are finding new fertile ground. For example, robocalling, exploiting the low cost and anonymity of VoIP technology, has become increasingly rampant. Establishing a cross-network identity authentication system to track and prevent these harassing behaviors has become a pressing legal and technical challenge.

Conclusion

From the sophisticated design of applications to the collaborative governance of global communities, to the never-ending technological innovation and challenges, we see that the internet is more than just a set of cold protocols; it is a vibrant, constantly improving ecosystem. While connecting information, it also profoundly shapes our society. Understanding its past and gaining insight into its present allows us to better participate in and shape its future.
