TCP/IP Networking
This chapter introduces the networking concepts you need to understand before using Corosio. If you’re already comfortable with TCP/IP, sockets, and the client-server model, you can skip to I/O Context.
What is a Network?
A network is simply computers talking to each other. Your laptop sending a request to a web server, two game consoles playing together, a phone streaming video—all involve computers exchanging data over a network.
Local vs Remote Communication
Programs on the same machine communicate through shared memory, pipes, or local sockets. Network programming adds complexity because the communicating programs run on different machines, potentially thousands of miles apart, connected by unreliable links with varying latency.
The Need for Protocols
When two programs exchange data, they need to agree on the rules: How do you start a conversation? How do you know when a message ends? What happens if data gets corrupted? These rules form a protocol.
The Internet uses a family of protocols called TCP/IP. Corosio gives your programs access to the TCP portion, providing reliable communication channels over the network.
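One of the questions above, "how do you know when a message ends?", is often answered with length-prefixed framing. The sketch below assumes a hypothetical rule (a 4-byte big-endian length followed by the payload); the function names `encode_frame` and `decode_frame` are illustrative and not part of Corosio.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical framing rule for illustration: a 4-byte big-endian length,
// followed by that many payload bytes.
std::string encode_frame(const std::string& payload)
{
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::string frame;
    frame.push_back(static_cast<char>((n >> 24) & 0xFF));
    frame.push_back(static_cast<char>((n >> 16) & 0xFF));
    frame.push_back(static_cast<char>((n >> 8) & 0xFF));
    frame.push_back(static_cast<char>(n & 0xFF));
    frame += payload;
    return frame;
}

// Extracts one complete message from the front of `buffer` and removes it.
// Returns std::nullopt when the buffer does not yet hold a whole frame.
std::optional<std::string> decode_frame(std::string& buffer)
{
    if (buffer.size() < 4)
        return std::nullopt;
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i)
        n = (n << 8) | static_cast<unsigned char>(buffer[i]);
    if (buffer.size() - 4 < n)
        return std::nullopt;
    std::string payload = buffer.substr(4, n);
    buffer.erase(0, 4 + static_cast<std::size_t>(n));
    return payload;
}
```

A receiver would append incoming bytes to its buffer and call `decode_frame` until it returns `std::nullopt`, which handles both several messages arriving together and one message arriving in pieces.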
Physical Foundation
Data travels across networks as electrical signals, light pulses, or radio waves. Understanding the physical layer helps explain why network programming has certain constraints.
Signals on Wires
At the lowest level, networks transmit bits as voltage changes on copper wire, light pulses in fiber optic cables, or radio waves for wireless. These physical signals have limitations: maximum distance, susceptibility to interference, and finite propagation speed.
NICs and MAC Addresses
A Network Interface Card (NIC) connects your computer to the network. Each NIC has a nominally unique Media Access Control (MAC) address: 48 bits, typically written as six hexadecimal pairs like 00:1A:2B:3C:4D:5E.
MAC addresses identify devices on the local network segment. They matter to switches and to frame delivery within that segment, but are invisible to your application code.
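As a small illustration of the notation, the following sketch formats six raw bytes as colon-separated hexadecimal pairs. The helper name `format_mac` is made up for this example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Formats a 48-bit MAC address as six colon-separated hexadecimal pairs.
std::string format_mac(const std::array<std::uint8_t, 6>& mac)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string text;
    for (std::size_t i = 0; i < mac.size(); ++i) {
        if (i != 0)
            text += ':';
        text += digits[mac[i] >> 4];   // high nibble
        text += digits[mac[i] & 0x0F]; // low nibble
    }
    return text;
}
```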
Ethernet
Ethernet is the dominant local area network (LAN) technology. It defines how devices share a network medium, how frames are formatted, and how collisions are handled. Modern Ethernet runs at 1 Gbps or faster over twisted-pair cables.
Network Devices
| Device | Purpose |
|---|---|
| Hub | Broadcasts all traffic to all connected devices (obsolete) |
| Switch | Forwards traffic only to the destination MAC address |
| Router | Forwards traffic between different networks using IP addresses |
Your home router typically combines a switch (for local devices) with a router (connecting to your ISP).
Network Models
Networks use layered architectures where each layer provides services to the layer above while hiding implementation details.
Why Layers?
Layering provides modularity. You can replace Ethernet with Wi-Fi without changing your application. TCP can run over any network technology that supports IP. Each layer has a well-defined interface to the layers above and below.
The OSI Model
The Open Systems Interconnection (OSI) model defines seven layers:
| Layer | Name | Function |
|---|---|---|
| 7 | Application | User-facing protocols (HTTP, SMTP, DNS) |
| 6 | Presentation | Data encoding, encryption |
| 5 | Session | Connection management |
| 4 | Transport | End-to-end communication (TCP, UDP) |
| 3 | Network | Routing across networks (IP) |
| 2 | Data Link | Local network transmission (Ethernet) |
| 1 | Physical | Bits on the wire |
The OSI model is useful for discussion but doesn’t perfectly map to real protocols.
The TCP/IP Model
The Internet uses a four-layer model that better reflects actual implementation:
| Layer | Purpose | Examples |
|---|---|---|
| Application | Your program’s logic | HTTP, DNS, SMTP |
| Transport | Delivery between processes (reliable with TCP, best effort with UDP) | TCP, UDP |
| Internet | Routing packets across networks | IP (IPv4, IPv6) |
| Link | Physical transmission | Ethernet, Wi-Fi |
Corosio operates at the Transport layer, providing TCP sockets. Your application logic sits above, sending and receiving data through Corosio’s abstractions.
Encapsulation
When you send data, each layer wraps it with its own header:
Application Data: "Hello"
↓
TCP adds header: [TCP Header][Hello]
↓
IP adds header: [IP Header][TCP Header][Hello]
↓
Ethernet adds header/trailer: [Eth Header][IP Header][TCP Header][Hello][Eth Trailer]
The receiving side reverses this process, stripping headers at each layer until the application data emerges.
Internet Protocol (IP)
IP handles routing packets across networks. It provides addressing (where to send) and fragmentation (breaking large packets into smaller pieces).
IP Addresses
An IP address identifies a host on the network.
IPv4 addresses are 32 bits, written as four decimal numbers separated by dots: 192.168.1.100. There are about 4 billion possible addresses, not enough for the modern Internet.
IPv6 addresses are 128 bits, written as eight groups of hexadecimal digits separated by colons: 2001:0db8:0000:0000:0000:0000:0000:0001. Leading zeros can be omitted, and consecutive groups of zeros can be replaced with ::, so this becomes 2001:db8::1.
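To see both notations in practice, the sketch below uses the POSIX functions `inet_pton()` and `inet_ntop()` (not Corosio) to turn IPv4 text into its 32-bit value and to shorten an IPv6 address. The wrapper names `parse_ipv4` and `compress_ipv6` are invented for this example, and the code assumes a POSIX platform.

```cpp
#include <arpa/inet.h>   // inet_pton, inet_ntop, ntohl
#include <netinet/in.h>  // in_addr, in6_addr, INET6_ADDRSTRLEN
#include <sys/socket.h>  // AF_INET, AF_INET6
#include <cstdint>
#include <optional>
#include <string>

// Parses dotted-decimal IPv4 text into a 32-bit value in host byte order.
std::optional<std::uint32_t> parse_ipv4(const std::string& text)
{
    in_addr addr{};
    if (inet_pton(AF_INET, text.c_str(), &addr) != 1)
        return std::nullopt;
    return ntohl(addr.s_addr);
}

// Parses IPv6 text and prints it back in compressed form.
std::optional<std::string> compress_ipv6(const std::string& text)
{
    in6_addr addr{};
    if (inet_pton(AF_INET6, text.c_str(), &addr) != 1)
        return std::nullopt;
    char out[INET6_ADDRSTRLEN];
    if (inet_ntop(AF_INET6, &addr, out, sizeof out) == nullptr)
        return std::nullopt;
    return std::string(out);
}
```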
Special Addresses
| Address | Meaning |
|---|---|
| 127.0.0.1 | Loopback (the local machine), IPv4 |
| ::1 | Loopback (the local machine), IPv6 |
| 0.0.0.0 | All interfaces (for binding), IPv4 |
| :: | All interfaces (for binding), IPv6 |
Subnets and CIDR
Networks are divided into subnets using a netmask. The netmask defines which bits of the address identify the network versus the host.
CIDR (Classless Inter-Domain Routing) notation combines the address and prefix length: 192.168.1.0/24 means the first 24 bits identify the network, leaving 8 bits for the host part (256 addresses, 254 of them usable by hosts).
Common prefix lengths:
| CIDR | Netmask | Hosts |
|---|---|---|
| /8 | 255.0.0.0 | 16 million |
| /16 | 255.255.0.0 | 65,534 |
| /24 | 255.255.255.0 | 254 |
| /32 | 255.255.255.255 | 1 (single host) |
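The netmask and host counts in the table follow directly from the prefix length. A minimal sketch of that arithmetic, with addresses held as 32-bit host-order integers (the function names are illustrative):

```cpp
#include <cstdint>

// Netmask for an IPv4 prefix length. `prefix` must be in the range 0-32.
constexpr std::uint32_t netmask(unsigned prefix)
{
    return prefix == 0 ? 0u : ~std::uint32_t{0} << (32 - prefix);
}

// Usable host addresses in a prefix. Ordinary subnets reserve the first
// address (network) and the last (broadcast); /31 and /32 do not.
constexpr std::uint64_t usable_hosts(unsigned prefix)
{
    return prefix >= 31 ? (std::uint64_t{1} << (32 - prefix))
                        : (std::uint64_t{1} << (32 - prefix)) - 2;
}

// True when both addresses fall inside the same network for this prefix.
constexpr bool same_subnet(std::uint32_t a, std::uint32_t b, unsigned prefix)
{
    return (a & netmask(prefix)) == (b & netmask(prefix));
}
```

For example, `same_subnet(0xC0A80164, 0xC0A801FE, 24)` compares 192.168.1.100 with 192.168.1.254 and is true, because only the last 8 bits differ.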
Public vs Private Addresses
Some address ranges are reserved for private networks and aren’t routable on the public Internet:
| Range | Common Use |
|---|---|
| 10.0.0.0/8 | Large enterprise networks |
| 172.16.0.0/12 | Medium networks |
| 192.168.0.0/16 | Home and small office networks |
Your home devices likely use 192.168.x.x addresses.
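Checking whether an address is private is a prefix comparison against the three ranges in the table. The following sketch assumes addresses are held as 32-bit host-order integers; `ipv4`, `in_range`, and `is_private_ipv4` are names chosen for this example.

```cpp
#include <cstdint>

// Builds a host-order IPv4 value from its four dotted-decimal parts.
constexpr std::uint32_t ipv4(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return (a << 24) | (b << 16) | (c << 8) | d;
}

// True when `addr` lies inside network/prefix (prefix in the range 0-32).
constexpr bool in_range(std::uint32_t addr, std::uint32_t network, unsigned prefix)
{
    return prefix == 0
        || (addr & (~std::uint32_t{0} << (32 - prefix))) == network;
}

// Matches the three private ranges listed in the table above.
constexpr bool is_private_ipv4(std::uint32_t addr)
{
    return in_range(addr, ipv4(10, 0, 0, 0), 8)
        || in_range(addr, ipv4(172, 16, 0, 0), 12)
        || in_range(addr, ipv4(192, 168, 0, 0), 16);
}
```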
NAT
Network Address Translation (NAT) allows multiple devices with private addresses to share a single public IP address. Your router maintains a translation table, mapping internal (private IP + port) to external (public IP + port).
NAT complicates peer-to-peer connections because devices behind NAT can’t directly receive incoming connections without port forwarding or NAT traversal techniques.
Packets and Fragmentation
IP transmits data in packets (also called datagrams). Each packet is independent and may take a different route to the destination.
If a packet is too large for a network link, IP can fragment it into smaller pieces. The receiving host reassembles fragments before delivering to the transport layer. Fragmentation adds overhead, and if any single fragment is lost the whole packet is lost, because the receiver cannot reassemble it.
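To make the splitting concrete, here is a sketch of how an IPv4 payload would be divided for a given link MTU. It assumes a fixed 20-byte header with no options, and relies on the IPv4 rule that every fragment except the last carries a multiple of 8 bytes. The `fragment` struct and `fragment_ipv4` function are illustrative.

```cpp
#include <cstddef>
#include <vector>

struct fragment
{
    std::size_t offset_units; // fragment offset field, in 8-byte units
    std::size_t data_bytes;   // payload bytes carried by this fragment
    bool more_fragments;      // set on every fragment except the last
};

// Splits `payload_bytes` of IP payload for a link with the given MTU,
// assuming a 20-byte IPv4 header on every fragment.
std::vector<fragment> fragment_ipv4(std::size_t payload_bytes, std::size_t mtu)
{
    constexpr std::size_t header_bytes = 20;
    std::vector<fragment> out;
    if (mtu < header_bytes + 8)
        return out; // link too small to carry any fragment data
    // Largest multiple of 8 that fits beside the header.
    const std::size_t max_data = (mtu - header_bytes) / 8 * 8;
    std::size_t sent = 0;
    do {
        const std::size_t remaining = payload_bytes - sent;
        const std::size_t n = remaining < max_data ? remaining : max_data;
        out.push_back({sent / 8, n, sent + n < payload_bytes});
        sent += n;
    } while (sent < payload_bytes);
    return out;
}
```

With 3980 bytes of payload and an MTU of 1500, this yields three fragments carrying 1480, 1480, and 1020 bytes at offsets 0, 185, and 370.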
Routing
Routers forward packets toward their destination based on routing tables. Each router examines the destination IP address and forwards the packet to the next hop. The path from source to destination may traverse many routers.
TTL and Hop Counts
The Time To Live (TTL) field prevents packets from circulating forever due to routing loops. Each router decrements TTL; when it reaches zero, the packet is discarded. The sender sets the initial TTL (commonly 64 or 128).
Tools like traceroute exploit TTL to discover the network path by sending packets with increasing TTL values and observing which router discards each one.
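The TTL rule and the traceroute trick can be simulated without touching the network. In this sketch the path is just a list of router names, and `send_with_ttl` and `trace` are hypothetical helpers that model the behavior described above.

```cpp
#include <string>
#include <vector>

// Forwards a packet through `routers` toward `destination`. A router that
// would decrement the TTL to zero discards the packet instead of forwarding
// it. Returns the name of the node where the packet ended up.
std::string send_with_ttl(const std::vector<std::string>& routers,
                          const std::string& destination, unsigned ttl)
{
    for (const std::string& router : routers) {
        if (ttl <= 1)
            return router; // TTL reaches zero here: packet discarded
        --ttl;
    }
    return destination;
}

// Traceroute-style discovery: probe with TTL 1, 2, 3, ... and record which
// node answers, stopping once the destination itself is reached.
std::vector<std::string> trace(const std::vector<std::string>& routers,
                               const std::string& destination,
                               unsigned max_ttl = 30)
{
    std::vector<std::string> hops;
    for (unsigned ttl = 1; ttl <= max_ttl; ++ttl) {
        hops.push_back(send_with_ttl(routers, destination, ttl));
        if (hops.back() == destination)
            break;
    }
    return hops;
}
```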
TCP: Reliable Streams
TCP provides reliable, ordered, byte-stream delivery between two endpoints. Understanding TCP’s mechanisms helps explain timeout behaviors, performance characteristics, and error conditions.
Connection-Oriented
TCP establishes a connection before transferring data. Both sides maintain state about the connection: sequence numbers, window sizes, timers. This contrasts with UDP, which just sends packets without setup.
Three-Way Handshake
TCP connections begin with a three-way handshake:
    Client                            Server
      |                                  |
      |------ SYN seq=x ---------------->|  "I want to connect, my sequence starts at x"
      |                                  |
      |<----- SYN-ACK seq=y ack=x+1 -----|  "OK, my sequence starts at y, I got your x"
      |                                  |
      |------ ACK ack=y+1 -------------->|  "I got your y, connection established"
      |                                  |
After the handshake, both sides can send and receive data.
Sequence Numbers and Acknowledgments
TCP numbers each byte in the stream. The sender assigns sequence numbers; the receiver acknowledges which bytes it has received. This enables detection of lost, duplicated, or reordered data.
If the sender doesn’t receive an acknowledgment within a timeout, it retransmits. The receiver uses sequence numbers to put out-of-order data back in sequence and discard duplicates.
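The receiver-side bookkeeping can be sketched as a small class that buffers segments by sequence number and releases bytes only once they are contiguous. This is a simplified model, not how any real TCP stack is written: it ignores sequence-number wraparound and partially overlapping segments, and the `reassembler` name is invented for this example.

```cpp
#include <cstdint>
#include <map>
#include <string>

class reassembler
{
public:
    explicit reassembler(std::uint32_t initial_seq) : next_seq_(initial_seq) {}

    // Accepts a segment and returns any bytes that became deliverable in order.
    std::string receive(std::uint32_t seq, const std::string& data)
    {
        // Segments entirely before next_seq_ are duplicates and are dropped;
        // emplace() keeps the first copy if the same segment is buffered twice.
        if (seq >= next_seq_)
            pending_.emplace(seq, data);

        std::string delivered;
        auto it = pending_.find(next_seq_);
        while (it != pending_.end()) {
            delivered += it->second;
            next_seq_ += static_cast<std::uint32_t>(it->second.size());
            pending_.erase(it);
            it = pending_.find(next_seq_);
        }
        return delivered;
    }

    // Cumulative acknowledgment: the next byte the receiver expects.
    std::uint32_t ack() const { return next_seq_; }

private:
    std::uint32_t next_seq_;
    std::map<std::uint32_t, std::string> pending_;
};
```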
Flow Control
The receiver advertises a window size: how many bytes it can buffer. The sender must not have more unacknowledged bytes in flight than the receiver’s window allows. This prevents a fast sender from overwhelming a slow receiver.
The window size dynamically adjusts. When the receiver processes data, it opens the window; when it falls behind, the window shrinks.
Congestion Control
TCP also limits sending rate to avoid overwhelming the network itself. Algorithms such as slow start and congestion avoidance probe for available bandwidth, while fast retransmit recovers quickly from isolated losses.
When packet loss occurs (detected by missing acknowledgments), TCP assumes network congestion and reduces its sending rate. This is why TCP throughput can vary significantly depending on network conditions.
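A heavily simplified model of this behavior is shown below. It tracks the congestion window in whole segments, grows it once per round trip, and reacts only to timeouts; real implementations are considerably more involved and also use fast retransmit. The struct name and the initial threshold of 16 are assumptions made for the sketch.

```cpp
#include <algorithm>
#include <cstddef>

struct congestion_window
{
    std::size_t cwnd = 1;      // congestion window, in segments
    std::size_t ssthresh = 16; // slow-start threshold (assumed starting value)

    // Call once per round trip in which all outstanding data was acknowledged.
    void on_round_trip_acked()
    {
        if (cwnd < ssthresh)
            cwnd = std::min(cwnd * 2, ssthresh); // slow start: doubles
        else
            cwnd += 1;                           // congestion avoidance: linear
    }

    // Call when a retransmission timeout signals loss.
    void on_timeout()
    {
        ssthresh = std::max<std::size_t>(cwnd / 2, 2);
        cwnd = 1;
    }

    // The sender is bounded by both the network (cwnd) and the receiver's
    // advertised window from flow control.
    std::size_t send_limit(std::size_t receiver_window) const
    {
        return std::min(cwnd, receiver_window);
    }
};
```

Starting from the defaults, successive round trips give windows of 2, 4, 8, 16, 17, 18; a timeout at that point halves the threshold to 9 and restarts the window at 1.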
Retransmission
When TCP detects packet loss (no acknowledgment received), it retransmits. The retransmission timeout (RTO) is calculated from measured round-trip times. On a slow or variable network, RTO can be several seconds.
This is why network operations can take a long time to fail—TCP may retry multiple times before giving up.
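The timeout calculation can be sketched with the smoothing scheme from RFC 6298: keep a smoothed round-trip time and a variance estimate, set the timeout to the smoothed value plus four times the variance, and double it after each timeout. The class below is an illustration in milliseconds; the 60-second cap is a choice made for this sketch, and operating systems use their own bounds.

```cpp
#include <algorithm>
#include <cmath>

constexpr double min_rto_ms = 1000.0;  // lower bound recommended by RFC 6298
constexpr double max_rto_ms = 60000.0; // upper bound chosen for this sketch

class rto_estimator
{
public:
    // Feed one measured round-trip time.
    void on_rtt_sample(double rtt_ms)
    {
        if (!has_sample_) {
            srtt_ = rtt_ms;
            rttvar_ = rtt_ms / 2;
            has_sample_ = true;
        } else {
            // The variance is updated first, using the previous smoothed RTT.
            rttvar_ = 0.75 * rttvar_ + 0.25 * std::abs(srtt_ - rtt_ms);
            srtt_ = 0.875 * srtt_ + 0.125 * rtt_ms;
        }
        rto_ = std::min(max_rto_ms, std::max(min_rto_ms, srtt_ + 4 * rttvar_));
    }

    // Each expiry doubles the timer (exponential backoff), up to the cap.
    void on_timeout() { rto_ = std::min(max_rto_ms, rto_ * 2); }

    double rto_ms() const { return rto_; }

private:
    bool has_sample_ = false;
    double srtt_ = 0.0;
    double rttvar_ = 0.0;
    double rto_ = min_rto_ms; // initial timeout before any measurement
};
```

Because every retry waits twice as long as the previous one, a handful of retransmissions can add up to tens of seconds before the connection is declared dead.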
Connection Teardown
TCP connections close with a four-way handshake:
    Client                            Server
      |                                  |
      |------ FIN ---------------------->|  "I'm done sending"
      |                                  |
      |<----- ACK -----------------------|  "Got it"
      |                                  |
      |<----- FIN -----------------------|  "I'm done too"
      |                                  |
      |------ ACK ---------------------->|  "Got it, goodbye"
      |                                  |
Either side can initiate the close. The FIN indicates "I have no more data to send" but the connection remains half-closed: the other side can still send.
A RST (reset) immediately terminates the connection without the graceful handshake, used when something goes wrong.
TCP States
A TCP connection moves through states:
| State | Meaning |
|---|---|
| LISTEN | Server waiting for connections |
| SYN_SENT | Client has sent SYN, waiting for response |
| SYN_RECEIVED | Server has received SYN, sent SYN-ACK |
| ESTABLISHED | Connection open, data can flow |
| FIN_WAIT_1 | Sent FIN, waiting for ACK |
| FIN_WAIT_2 | FIN acknowledged, waiting for peer’s FIN |
| CLOSE_WAIT | Received peer’s FIN, waiting for application to close |
| TIME_WAIT | Waiting to ensure peer received final ACK |
| CLOSED | Connection fully terminated |
The TIME_WAIT state lasts 2× the maximum segment lifetime (typically 1-2 minutes). This prevents old packets from a closed connection being mistaken for a new connection on the same port. It’s why restarting a server may fail with "address already in use."
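The states in the table can be read as a small state machine driven by application calls and arriving segments. The sketch below covers only the common open and close paths. It adds LAST_ACK, which is part of the standard TCP state machine but not listed in the table, and it omits resets and simultaneous open or close. All names are illustrative.

```cpp
enum class tcp_state {
    closed, listen, syn_sent, syn_received, established,
    fin_wait_1, fin_wait_2, close_wait, last_ack, time_wait
};

enum class tcp_event {
    active_open,  // application connects: SYN is sent
    passive_open, // application starts listening
    recv_syn,     // SYN arrives: SYN-ACK is sent
    recv_syn_ack, // SYN-ACK arrives: ACK is sent
    recv_ack,     // ACK arrives
    close,        // application closes: FIN is sent
    recv_fin,     // peer's FIN arrives: ACK is sent
    timeout       // the TIME_WAIT timer expires
};

tcp_state next_state(tcp_state s, tcp_event e)
{
    switch (s) {
    case tcp_state::closed:
        if (e == tcp_event::active_open)  return tcp_state::syn_sent;
        if (e == tcp_event::passive_open) return tcp_state::listen;
        break;
    case tcp_state::listen:
        if (e == tcp_event::recv_syn) return tcp_state::syn_received;
        break;
    case tcp_state::syn_sent:
        if (e == tcp_event::recv_syn_ack) return tcp_state::established;
        break;
    case tcp_state::syn_received:
        if (e == tcp_event::recv_ack) return tcp_state::established;
        break;
    case tcp_state::established:
        if (e == tcp_event::close)    return tcp_state::fin_wait_1;
        if (e == tcp_event::recv_fin) return tcp_state::close_wait;
        break;
    case tcp_state::fin_wait_1:
        if (e == tcp_event::recv_ack) return tcp_state::fin_wait_2;
        break;
    case tcp_state::fin_wait_2:
        if (e == tcp_event::recv_fin) return tcp_state::time_wait;
        break;
    case tcp_state::close_wait:
        if (e == tcp_event::close) return tcp_state::last_ack;
        break;
    case tcp_state::last_ack:
        if (e == tcp_event::recv_ack) return tcp_state::closed;
        break;
    case tcp_state::time_wait:
        if (e == tcp_event::timeout) return tcp_state::closed;
        break;
    }
    return s; // events that do not apply leave the state unchanged
}
```

Note that only the side that closes first passes through TIME_WAIT, which is why a restarted server that closed its own connections is the one that may see "address already in use."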
UDP: Unreliable Datagrams
UDP provides a simpler, connectionless service. It sends datagrams without guaranteeing delivery, ordering, or duplicate detection.
Connectionless Communication
UDP doesn’t establish connections. You just send a datagram to a destination address and port. There’s no handshake, no state maintained, no guaranteed delivery.
If a UDP packet is lost, the sender doesn’t know (unless the application builds its own acknowledgment mechanism).
When to Use UDP vs TCP
| Property | TCP | UDP |
|---|---|---|
| Reliability | Guaranteed delivery | Best effort |
| Ordering | Preserved | Not guaranteed |
| Connection | Connection-oriented | Connectionless |
| Overhead | Higher (headers, state) | Lower |
| Use cases | Web, email, file transfer | Video streaming, gaming, DNS |
UDP is appropriate when:
- Timeliness matters more than reliability (live video, where a late frame is useless)
- The application can tolerate loss (VoIP, where missing audio is better than delayed audio)
- Messages are small and independent (DNS queries)
- You need multicast or broadcast
TCP is appropriate for most applications where correctness matters.
Ports and Sockets
Ports and sockets connect the transport layer to applications.
Port Numbers
A port is a 16-bit number (0–65535) identifying an application on a host.
| Range | Usage |
|---|---|
| 0–1023 | Well-known ports (require admin privileges) |
| 1024–49151 | Registered ports (user applications) |
| 49152–65535 | Ephemeral (dynamic) ports |
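Expressed as code, the three ranges are a pair of comparisons. The `port_class` enum and `classify_port` function are names made up for this illustration.

```cpp
#include <cstdint>

enum class port_class { well_known, registered, ephemeral };

// Classifies a port number according to the ranges in the table above.
constexpr port_class classify_port(std::uint16_t port)
{
    if (port <= 1023)
        return port_class::well_known;
    if (port <= 49151)
        return port_class::registered;
    return port_class::ephemeral;
}
```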
Standard services use well-known ports:
| Port | Service |
|---|---|
| 22 | SSH |
| 25 | SMTP (email) |
| 53 | DNS |
| 80 | HTTP |
| 443 | HTTPS |
Socket as (IP, Port) Pair
A socket endpoint is the combination of an IP address and port number. A TCP connection is uniquely identified by four values:
- Source IP address
- Source port
- Destination IP address
- Destination port
This means a server on port 80 can have thousands of simultaneous connections from different clients—each connection has a unique tuple.
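One way to picture this is a table keyed by the full four-tuple. The sketch below is a toy model of such a table, not how an operating system stores connections. The type names are hypothetical and the addresses come from the ranges reserved for documentation.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

struct connection_id
{
    std::string src_ip;
    std::uint16_t src_port;
    std::string dst_ip;
    std::uint16_t dst_port;

    // Orders by all four fields so the struct can serve as a map key.
    bool operator<(const connection_id& o) const
    {
        return std::tie(src_ip, src_port, dst_ip, dst_port)
             < std::tie(o.src_ip, o.src_port, o.dst_ip, o.dst_port);
    }
};

using connection_table = std::map<connection_id, std::string>;

// Three connections to the same server port remain distinct entries because
// each differs in source address or source port.
std::size_t count_connections_to_port_80()
{
    connection_table table;
    table[connection_id{"203.0.113.5", 50001, "198.51.100.7", 80}] = "client A, first";
    table[connection_id{"203.0.113.5", 50002, "198.51.100.7", 80}] = "client A, second";
    table[connection_id{"192.0.2.44", 50001, "198.51.100.7", 80}] = "client B";
    return table.size();
}
```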
Listening vs Connected Sockets
A listening socket waits for incoming connections on a specific port. When a client connects, the operating system creates a new connected socket for that specific connection. The listening socket continues accepting more connections.
Socket API Primitives
The traditional socket API has operations that map to TCP concepts:
| Operation | Purpose |
|---|---|
| socket() | Create a socket handle |
| bind() | Associate a local address and port |
| listen() | Mark socket as accepting connections |
| accept() | Wait for and accept an incoming connection |
| connect() | Initiate a connection to a remote endpoint |
| send()/recv() | Transfer data on a connected socket |
| close() | Terminate the connection |
Corosio wraps these operations in coroutine-friendly abstractions.
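For reference, here is what the primitives in the table look like when called directly through the blocking POSIX sockets API, without Corosio. The sketch runs a client and a server in one thread over loopback, which works only for short messages that fit in the socket buffers. The function name `loopback_round_trip` is invented, and error handling is reduced to returning an empty string.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstddef>
#include <string>

// Sends `message` from a client socket to a server socket over loopback and
// returns what the server received.
std::string loopback_round_trip(const std::string& message)
{
    std::string received;

    // socket() + bind() + listen(): a listening socket on an OS-chosen port.
    int listener = ::socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0)
        return received;
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; // 0 asks the OS to pick a free port
    socklen_t len = sizeof addr;
    int client = -1, server = -1;
    if (::bind(listener, reinterpret_cast<sockaddr*>(&addr), sizeof addr) == 0
        && ::listen(listener, 1) == 0
        && ::getsockname(listener, reinterpret_cast<sockaddr*>(&addr), &len) == 0)
    {
        // connect(): the kernel completes the handshake and queues the
        // connection until accept() picks it up.
        client = ::socket(AF_INET, SOCK_STREAM, 0);
        if (client >= 0
            && ::connect(client, reinterpret_cast<sockaddr*>(&addr), sizeof addr) == 0)
        {
            // accept(): returns a new connected socket; the listener stays open.
            server = ::accept(listener, nullptr, nullptr);
        }
    }
    if (server >= 0) {
        // send()/recv() may transfer fewer bytes than asked, so both loop.
        std::size_t sent = 0;
        while (sent < message.size()) {
            ssize_t n = ::send(client, message.data() + sent, message.size() - sent, 0);
            if (n <= 0)
                break;
            sent += static_cast<std::size_t>(n);
        }
        ::shutdown(client, SHUT_WR); // sends FIN: the client has no more data
        char buf[256];
        ssize_t n;
        while ((n = ::recv(server, buf, sizeof buf, 0)) > 0)
            received.append(buf, static_cast<std::size_t>(n));
    }
    // close(): release every handle that was opened.
    if (server >= 0) ::close(server);
    if (client >= 0) ::close(client);
    ::close(listener);
    return received;
}
```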
DNS
The Domain Name System translates human-readable names to IP addresses.
Hostname Resolution
When you connect to www.example.com, your system must find its IP address. This involves querying DNS servers, which form a hierarchical, distributed database.
Record Types
| Type | Purpose |
|---|---|
| A | Maps hostname to IPv4 address |
| AAAA | Maps hostname to IPv6 address |
| CNAME | Alias (canonical name) pointing to another hostname |
| MX | Mail server for a domain |
| TXT | Arbitrary text (often used for verification) |
DNS Lookup Flow
- Application calls getaddrinfo() (or Corosio’s resolver::resolve())
- System checks local cache
- If not cached, queries configured DNS server(s)
- DNS server may recursively query other servers
- Response returns IP address(es)
DNS responses often include multiple addresses (for load balancing or redundancy). Your application should try each address until one succeeds.
Corosio provides the resolver class for asynchronous DNS lookups:
    corosio::resolver r(ioc);
    auto [ec, results] = co_await r.resolve("www.example.com", "https");
    for (auto const& entry : results)
    {
        auto ep = entry.get_endpoint();
        // Try connecting to ep...
    }
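For comparison, the blocking POSIX call mentioned in the lookup flow, `getaddrinfo()`, returns a linked list of candidate addresses. The sketch below walks that list and converts each entry to text; the wrapper name `resolve_all` is invented for this example, and a real client would attempt `connect()` on each result in turn rather than just collecting them.

```cpp
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <string>
#include <vector>

// Resolves host/service with getaddrinfo() and returns every address as text.
// Returns an empty vector if resolution fails.
std::vector<std::string> resolve_all(const std::string& host,
                                     const std::string& service)
{
    addrinfo hints{};
    hints.ai_family = AF_UNSPEC;     // accept both IPv4 and IPv6 results
    hints.ai_socktype = SOCK_STREAM; // TCP only, avoids one entry per protocol

    std::vector<std::string> out;
    addrinfo* results = nullptr;
    if (getaddrinfo(host.c_str(), service.c_str(), &hints, &results) != 0)
        return out;

    for (const addrinfo* ai = results; ai != nullptr; ai = ai->ai_next) {
        char text[INET6_ADDRSTRLEN] = {};
        const void* raw = nullptr;
        if (ai->ai_family == AF_INET)
            raw = &reinterpret_cast<const sockaddr_in*>(ai->ai_addr)->sin_addr;
        else if (ai->ai_family == AF_INET6)
            raw = &reinterpret_cast<const sockaddr_in6*>(ai->ai_addr)->sin6_addr;
        if (raw != nullptr && inet_ntop(ai->ai_family, raw, text, sizeof text))
            out.push_back(text);
    }
    freeaddrinfo(results);
    return out;
}
```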
Practical Considerations
Real-world network programming involves details that affect performance and correctness.
Localhost and Loopback
The loopback interface (127.0.0.1 or ::1) allows a machine to communicate with itself. Traffic never leaves the host, making it useful for testing and inter-process communication.
Binding to loopback restricts access to local connections only—external hosts cannot connect.
Firewalls and Port Blocking
Firewalls filter network traffic based on addresses and ports. Corporate networks often block all incoming connections and restrict outgoing to specific ports (80, 443). Your application may need to work within these constraints.
Keep-Alive
TCP keep-alive sends periodic probes on idle connections to detect if the peer has disappeared. Without keep-alive, a connection to a crashed host may appear open indefinitely.
Operating systems have configurable keep-alive intervals (often defaulting to hours). Applications needing faster detection should implement application-level heartbeats.
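An application-level heartbeat usually reduces to remembering when the peer was last heard from. The sketch below shows that bookkeeping in isolation, with the current time passed in explicitly so it is easy to test. The `heartbeat_monitor` class is hypothetical and leaves out the actual sending of heartbeat messages.

```cpp
#include <chrono>

// The peer is presumed dead when nothing has been received from it for
// longer than `timeout`.
class heartbeat_monitor
{
public:
    using clock = std::chrono::steady_clock;

    heartbeat_monitor(clock::duration timeout, clock::time_point now)
        : timeout_(timeout), last_seen_(now) {}

    // Call whenever any message, including a heartbeat, arrives.
    void on_message(clock::time_point now) { last_seen_ = now; }

    bool peer_alive(clock::time_point now) const
    {
        return now - last_seen_ <= timeout_;
    }

private:
    clock::duration timeout_;
    clock::time_point last_seen_;
};
```

A connection handler would call `on_message` from its read loop and check `peer_alive` on a timer, closing the socket when it returns false.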
Nagle’s Algorithm
Nagle’s algorithm batches small writes. Instead of sending each small piece immediately, TCP holds it back while earlier data is still unacknowledged, so several small writes are combined into fewer, larger packets.
This improves efficiency for bulk transfers but adds latency for interactive applications (like games or terminals) that send small, frequent messages.
TCP_NODELAY
The TCP_NODELAY socket option disables Nagle’s algorithm, sending data immediately regardless of size. Use this for latency-sensitive applications where you’re sending small packets that shouldn’t be delayed.
Note: Corosio doesn’t currently expose TCP_NODELAY directly.
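At the operating-system level the option is set with `setsockopt()`. The sketch below shows this on a raw POSIX socket descriptor; how to obtain such a descriptor from a Corosio socket is not covered here, and the helper names are made up for the example.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h> // TCP_NODELAY
#include <sys/socket.h>

// Enables or disables TCP_NODELAY on a TCP socket. Returns true on success.
bool set_no_delay(int fd, bool enabled)
{
    const int flag = enabled ? 1 : 0;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof flag) == 0;
}

// Reads the option back. Returns false if it is off or the query fails.
bool no_delay_enabled(int fd)
{
    int flag = 0;
    socklen_t len = sizeof flag;
    return getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, &len) == 0
        && flag != 0;
}
```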
What Corosio Provides
Corosio wraps the complexity of TCP programming in a coroutine-friendly API:
- socket: Connect to servers, send and receive data
- acceptor: Listen for and accept incoming connections
- resolver: Translate hostnames to IP addresses
- endpoint: Represent addresses and ports
All operations are asynchronous and return awaitables. You don’t manage raw socket handles or deal with platform-specific APIs directly.
Common Pitfalls
Partial Reads and Writes
TCP’s byte-stream nature means read_some() may return fewer bytes than you requested, and write_some() may send fewer bytes than you provided. Always loop or use composed operations (read(), write()) when you need exact amounts:
    // Wrong when you need a full buffer: read_some() may return early
    auto [ec1, n1] = co_await sock.read_some(buf);
    // Right: read() keeps reading until the buffer is full or EOF
    auto [ec2, n2] = co_await corosio::read(sock, buf);
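The loop that a composed read performs can be sketched independently of any socket type. Below, `read_exact` accepts any callable that behaves like a "read some" operation, and `trickle_source` is a stand-in that delivers at most three bytes per call, the way a slow peer might. Both names are illustrative and neither is part of Corosio.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Calls `read_some(ptr, max)` repeatedly until `size` bytes have arrived.
// `read_some` returns the number of bytes produced, or 0 at end of stream.
// Returns the total number of bytes actually read.
template <class ReadSome>
std::size_t read_exact(ReadSome&& read_some, char* data, std::size_t size)
{
    std::size_t total = 0;
    while (total < size) {
        const std::size_t n = read_some(data + total, size - total);
        if (n == 0)
            break; // EOF before the buffer was filled
        total += n;
    }
    return total;
}

// Stand-in for a socket: hands out at most three bytes per call.
struct trickle_source
{
    std::string bytes;
    std::size_t pos = 0;

    std::size_t operator()(char* out, std::size_t max)
    {
        const std::size_t n =
            std::min({max, std::size_t{3}, bytes.size() - pos});
        std::memcpy(out, bytes.data() + pos, n);
        pos += n;
        return n;
    }
};
```

Reading 8 bytes from a `trickle_source` holding "HelloWorld" takes three underlying calls (3, 3, and 2 bytes), yet the caller sees a single complete result.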
Connection Refused
If no server is listening on the target port, connect fails with connection_refused. Always handle this error; it’s common during development and when servers restart.
Further Reading
For a deeper understanding of TCP/IP:
- TCP/IP Illustrated, Volume 1 by W. Richard Stevens: the classic reference
- Unix Network Programming by W. Richard Stevens: practical socket programming
Next Steps
- Concurrent Programming: Coroutines and strands
- I/O Context: The event loop
- Sockets: Socket operations in detail