By Jeremy Abram — JeremyAbram.net
Executive summary
When you tap an app or load a page, a vast human-and-machine system hums to life: cables on ocean floors, humming data centers, rooftop antennas, quiet standards committees, volunteer open-source maintainers, and on-call engineers who sleep with a pager. This article maps that hidden workforce and the physical, organizational, and legal scaffolding they operate—so we can finally see the digital world as the engineered public utility it has become.
1) The physical backbone (you can touch it)
Undersea fiber & landing stations.
Most international traffic rides glass fibers inside armored submarine cables. Specialized ships lay and repair them; shore teams run the landing stations where cables surface, power is injected, and signals join terrestrial fiber rings.
Terrestrial fiber & rights-of-way.
City blocks hide “meet-me” vaults, handholes, and conduits. Field techs splice hair-thin fibers in the rain; utility crews negotiate pole attachments; municipalities manage dig permits so backhoes don’t take out half a city’s connectivity.
Internet exchange points (IXPs).
Unassuming rooms full of switches where networks swap traffic. Neutral operators maintain the fabric; participants manage BGP sessions; facilities teams ensure redundant power, cooling, and physical security.
Cell towers & last-mile plant.
Riggers climb towers to swap antennas; radio engineers tune spectrum; outside-plant crews repair drop lines for fiber-to-the-home and keep neighborhood cabinets warm, dry, and powered.
Data centers (the factories of information).
Facility engineers keep the “MEP” stack—mechanical, electrical, plumbing—alive: UPS and battery rooms, diesel generators, switchgear, chillers/CRACs, hot/cold aisles, fire suppression, and access control, while SREs keep the software running on top of it healthy. Uptime depends as much on a correctly torqued breaker as on a perfectly tuned database.
2) The logical backbone (you can’t touch it, but it touches everything)
Routing & addressing (BGP, RPKI, RIRs).
Network engineers advertise and secure routes; regional internet registries (ARIN, RIPE NCC, APNIC, LACNIC, AFRINIC) allocate number resources; operators deploy route filtering and RPKI to prevent hijacks; and MANRS (Mutually Agreed Norms for Routing Security) codifies good operator behavior.
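To make the filtering concrete, here is a minimal sketch of RFC 6811-style route origin validation, the check that RPKI enables; the ROA table, prefixes, and ASNs below are illustrative:

```python
from ipaddress import ip_network

# Hypothetical validated ROA cache: (prefix, maxLength, authorized origin ASN).
ROAS = [
    (ip_network("203.0.113.0/24"), 24, 64500),
    (ip_network("198.51.100.0/22"), 24, 64501),
]

def rov_state(prefix_str: str, origin_asn: int) -> str:
    """Classify a BGP announcement per RFC 6811 route origin validation."""
    prefix = ip_network(prefix_str)
    covered = False
    for roa_prefix, max_len, asn in ROAS:
        # A ROA covers the announcement if the announced prefix
        # falls within the ROA's prefix.
        if prefix.subnet_of(roa_prefix):
            covered = True
            if asn == origin_asn and prefix.prefixlen <= max_len:
                return "valid"
    return "invalid" if covered else "not-found"

print(rov_state("203.0.113.0/24", 64500))  # valid
print(rov_state("203.0.113.0/25", 64500))  # invalid: longer than maxLength
print(rov_state("192.0.2.0/24", 64500))    # not-found: no covering ROA
```

Operators doing origin validation typically drop “invalid” announcements and accept “not-found” ones, since much of the address space still lacks ROAs.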
Naming & time (DNS, ICANN, NTP).
DNS root operators, registries, and registrars keep names resolvable; resolvers cache wisely; DNSSEC chains trust. Meanwhile, NTP servers (and PTP in finance/5G) keep clocks aligned—without accurate time, encryption, logs, and distributed systems wobble.
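The time side rests on simple arithmetic: a single NTP exchange yields four timestamps, from which the client derives its clock offset and the network delay. A minimal sketch with made-up values:

```python
def ntp_offset_delay(t1: float, t2: float, t3: float, t4: float):
    """One NTP exchange: t1 client send, t2 server receive,
    t3 server send, t4 client receive (seconds)."""
    offset = ((t2 - t1) + (t3 - t4)) / 2.0  # how far the client clock is off
    delay = (t4 - t1) - (t3 - t2)           # round trip, minus server hold time
    return offset, delay

# A client running 0.200 s behind the server, over a ~30 ms round trip:
offset, delay = ntp_offset_delay(t1=100.000, t2=100.215, t3=100.216, t4=100.031)
print(f"offset={offset:+.3f}s delay={delay:.3f}s")  # offset=+0.200s delay=0.030s
```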
Standards bodies & protocol stewards.
Volunteer engineers in the IETF write RFCs; W3C evolves the web platform; IEEE standardizes PHY/MAC layers; open governance and rough consensus create the rules everyone quietly follows.
Content delivery & edge.
CDNs place caches near users; edge orchestration schedules workloads across POPs; site reliability teams roll out config safely and revert faster than users can tweet “it’s down.”
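One building block of that cache placement is consistent hashing, which keeps key-to-node assignments stable as POPs join and leave. A minimal sketch (the node names are hypothetical, and real CDNs layer anycast, geo-DNS, and load feedback on top):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Map cache keys to nodes so that adding or removing a node
    remaps only a small fraction of keys."""

    def __init__(self, nodes, vnodes=100):
        self._ring = []  # sorted (hash, node) pairs; vnodes smooth the load
        for node in nodes:
            for i in range(vnodes):
                bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first virtual node at or past the key's hash.
        idx = bisect.bisect(self._ring, (self._hash(key), "")) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["pop-fra", "pop-iad", "pop-sin"])
print(ring.node_for("/assets/logo.png"))  # same node every time, until the ring changes
```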
3) The human layer (often invisible by design)
SREs and on-call engineers.
They define SLOs, watch error budgets, run incident response, and practice chaos engineering. When you sleep, they don’t.
Open-source maintainers.
Small teams (sometimes one person) maintain libraries and kernels the entire economy depends on. They triage issues, merge patches, publish advisories, and endure “bus factor” risk and burnout.
Trust & Safety and content moderators.
Human reviewers and policy teams handle abuse, spam, CSAM, misinformation, and legal takedowns. Their labor is psychological PPE for the rest of us.
Security engineers & responders.
From red/blue/purple teams to SOC analysts and DFIR specialists, they harden systems, hunt intrusions, and coordinate disclosure when zero-days drop.
Facilities, cleaners, and guards.
Physical operations—the people who badge you in, vacuum the raised floor, test the diesel weekly—are part of uptime. Without them, “the cloud” is just a warm room.
4) Economics and incentives (who pays and why)
Peering vs. transit.
Networks barter traffic at IXPs or pay for upstream transit. Contracts, traffic ratios, and business strategy drive who connects to whom, which in turn shapes latency for users.
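Transit itself is commonly billed on the “95/5” convention: sample port utilization every five minutes for a month, discard the top 5% of samples, and bill the highest remaining rate, so short bursts ride free. A minimal sketch (sampling and rounding conventions vary by provider):

```python
def billable_mbps(samples_mbps: list[float]) -> float:
    """95th-percentile ("95/5") billing over a month of 5-minute samples."""
    ordered = sorted(samples_mbps)
    cutoff = int(len(ordered) * 0.95) - 1  # index just below the top 5%
    return ordered[max(cutoff, 0)]

# A mostly quiet port with a handful of bursts: the bursts are free,
# which is why networks schedule backups and cache fills off-peak.
samples = [100.0] * 95 + [900.0] * 5
print(billable_mbps(samples))  # 100.0
```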
SLAs, SLOs, and error budgets.
Providers promise SLAs to customers; internal SLOs keep teams honest about reliability tradeoffs. Every feature ships against a finite reliability budget.
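The budget arithmetic is plain: an availability SLO over a window implies a fixed allowance of downtime, and every risky change spends from it.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime implied by an availability SLO over a window."""
    return window_days * 24 * 60 * (1.0 - slo)

for slo in (0.99, 0.999, 0.9999):
    print(f"{slo:.2%} SLO -> {error_budget_minutes(slo):.1f} min per 30 days")
# 99.00% -> 432.0 min, 99.90% -> 43.2 min, 99.99% -> 4.3 min
```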
The open-source funding paradox.
Critical code is free to use but expensive to maintain. Sponsorships, foundations, and corporate staffing help—but the mismatch between societal dependence and maintainer resources remains a systemic risk.
Externalities.
Grid stress, land use, e-waste, water consumption for cooling—costs not fully priced into bits. Power-purchase agreements (PPAs) for renewables, heat recovery, and demand-response programs are attempts to internalize them.
5) Fragility, failure, and how the internet learns
Route leaks & hijacks.
One misconfigured BGP announcement can black-hole traffic globally. RPKI, IRR hygiene, and max-prefix filters reduce blast radius; postmortems spread lessons.
Single points of failure.
Concentration in a few clouds, a few CDNs, and a few DNS providers creates correlated outages. Multi-homing, multi-region, and game-day drills are the antidote.
Open-source “bus factor.”
Log4j-style incidents showed how dependence on a tiny maintainer pool can escalate into global risk. The remedy: diversified maintainers, paid time, security reviews, and memory-safe rewrites where appropriate.
Supply chain realities.
From fiber amplifiers to NIC firmware to CI/CD pipelines, dependencies stack deep. SBOMs, code signing, reproducible builds, and zero-trust architectures make tampering harder.
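As a small illustration of one link in that chain, here is a sketch of verifying a downloaded artifact against the digest recorded in a manifest. The manifest format is hypothetical; real SBOM formats (SPDX, CycloneDX) carry far more, and real pipelines also sign the manifest itself:

```python
import hashlib
import json

def verify_artifact(artifact_path: str, manifest_path: str) -> bool:
    """Compare an artifact's SHA-256 against a manifest entry like
    {"name": "libexample", "sha256": "<hex digest>"}."""
    with open(manifest_path) as f:
        expected = json.load(f)["sha256"]
    digest = hashlib.sha256()
    with open(artifact_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):  # stream, don't slurp
            digest.update(chunk)
    return digest.hexdigest() == expected
```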
6) Governance, law, and geopolitics
ICANN & multistakeholder governance.
Names and numbers live under globally distributed stewardship to avoid capture by any single government.
Spectrum, rights-of-way, and safety codes.
National regulators allocate spectrum; local authorities grant pole access and trenching permits; building codes dictate battery chemistry, fuel storage, and fire safety.
Data localization & lawful access.
Where data sits and who can demand it varies by jurisdiction. Cloud regions, KMS/HSM controls, and contractual addenda thread the needle between performance, privacy, and compliance.
Critical infrastructure & national security.
Subsea landings, IXPs, and hyperscale campuses are increasingly designated critical infrastructure; operators run joint exercises with governments and publish transparency reports.
7) Sustainability: keeping the lights (and fans) on
Efficiency metrics.
Power usage effectiveness (PUE) and water usage effectiveness (WUE) guide design; rear-door heat exchangers, liquid/immersion cooling, and free-air economization shrink energy and water footprints.
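Both metrics are simple ratios: PUE divides total facility energy by the energy that reaches IT equipment (1.0 is the unreachable ideal), and WUE divides water consumed by IT energy. With illustrative numbers:

```python
def pue(total_facility_kwh: float, it_kwh: float) -> float:
    """Power usage effectiveness; 1.0 means zero overhead."""
    return total_facility_kwh / it_kwh

def wue(water_liters: float, it_kwh: float) -> float:
    """Water usage effectiveness, in liters per IT kWh."""
    return water_liters / it_kwh

# An illustrative month: 10 GWh at the meter, 8 GWh reaching IT gear.
print(f"PUE = {pue(10_000_000, 8_000_000):.2f}")       # 1.25
print(f"WUE = {wue(2_000_000, 8_000_000):.2f} L/kWh")  # 0.25
```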
Grid integration.
Hyperscalers ink long-term renewable PPAs, build on-site storage, and participate in demand response. The future pairs data centers with clean generation and district-heating reuse.
Lifecycle thinking.
Right-to-repair, component harvesting, and e-waste recycling push against linear “buy-use-discard” cycles. Software efficiency matters, too: fewer CPU cycles, fewer emissions.
8) How reliability is actually practiced
- Observability: logs, metrics, traces, and real user monitoring.
- Progressive delivery: canaries, blue-green, feature flags, rapid rollback (a flag-bucketing sketch follows this list).
- Game days & chaos: practice failure to shorten MTTR when it’s real.
- Runbooks & postmortems: repeatable response and blameless learning.
- Defense-in-depth: segregation of duties, least privilege, hardware roots of trust.
- Tabletop governance: legal, PR, and engineering teams drill coordinated crisis playbooks.
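The flag-bucketing sketch promised above: hash the (flag, user) pair to a stable bucket so each user's answer is deterministic and the enabled cohort widens smoothly as the percentage grows. Names are illustrative:

```python
import hashlib

def in_rollout(user_id: str, flag: str, percent: float) -> bool:
    """Deterministic percentage rollout: map (flag, user) into [0, 100)."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64 * 100
    return bucket < percent

# Ramp a risky change: 1% canary, then 10%, then everyone.
for pct in (1, 10, 100):
    enabled = sum(in_rollout(f"user-{i}", "new-cache-path", pct) for i in range(10_000))
    print(f"{pct:>3}% target -> {enabled} of 10000 enabled")
```

Because the hash is keyed by flag, a user's cohorts are independent across experiments; because it is deterministic, dialing back to 1% turns the change off for exactly the users who saw it.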
9) The next frontiers of maintenance
- Constellations & non-terrestrial networks: LEO satellites as backhaul and failover; managing space debris and spectrum sharing becomes part of “ops.”
- Edge compute: thousands of micro-sites at stores, towers, and factories—harder to visit, easier to orchestrate with automation and out-of-band control.
- Post-quantum cryptography: fleet-wide key rotation and protocol upgrades without bricking devices.
- AI for ops (AIOps): anomaly detection and auto-remediation—but humans still write the playbooks and own the consequences (see the detector sketch after this list).
- Memory-safe rewrites: long-term migrations from vulnerable languages to safer runtimes in core infrastructure.
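As a taste of the detector mentioned above, a trailing-window z-score check, the simplest anomaly-detection building block; production systems add seasonality, trend handling, and alert-fatigue controls:

```python
from statistics import mean, stdev

def zscore_anomalies(series, window=30, threshold=3.0):
    """Flag points far outside the trailing window's distribution."""
    flagged = []
    for i in range(window, len(series)):
        hist = series[i - window:i]
        mu, sigma = mean(hist), stdev(hist)
        if sigma == 0:
            if series[i] != mu:  # any jump off a perfectly flat baseline
                flagged.append(i)
        elif abs(series[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

latencies = [20.0] * 60
latencies[45] = 95.0  # the spike a human would want to be paged for
print(zscore_anomalies(latencies))  # [45]
```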
10) What you (and your organization) can do
- Fund what you depend on. Audit your OSS dependencies; sponsor maintainers; join foundations.
- Design for failure. Multi-region, multi-provider, chaos testing, and clear SLOs.
- Harden the edge. Implement RPKI, DNSSEC validation, strong route filtering, and MFA everywhere.
- Plan the human work. Sustainable on-call rotations, mental-health support for moderators and responders, and paid time for maintenance—not just features.
- Measure sustainability. Track PUE/WUE and code-level efficiency; include them in OKRs.
- Practice disclosure. Coordinated vulnerability disclosure and transparent postmortems build trust.
Who maintains the digital world?
- Cable-ship crews and fiber splicers.
- IXP operators and peering coordinators.
- Data-center MEP engineers and security staff.
- Network and SRE teams, on call at 3 a.m.
- Open-source maintainers merging your PR at lunch.
- Trust & Safety professionals reviewing the worst of the internet so you don’t have to.
- Standards volunteers writing the footnotes of civilization.
The digital world runs on people—skilled, undersung, and everywhere.