The Internet as Language: Why Open Source and Distributed Infrastructure Mirror Humanity's Oldest Network

the Institute
Nov 21, 2025

In 1539, King Francis I of France issued the Ordinance of Villers-Cotterêts, mandating that all legal documents be written in French rather than Latin—but more importantly, in Parisian French rather than the hundreds of regional languages spoken across French territory. This began centuries of linguistic consolidation. By the French Revolution, revolutionary authorities actively suppressed regional languages—Breton, Occitan, Basque, Alsatian, Corsican—as threats to national unity. Teachers beat children for speaking their native tongues at school. The state needed citizens who spoke one language, used one system of weights and measures, followed one legal code. Diversity was illegibility. Standardization was control.

The project succeeded only partially. French became the dominant language, and regional languages retreated to private life. But complete elimination failed. Breton speakers still number over 200,000. Occitan persists in southern regions. The state could suppress, stigmatize, and marginalize, but it couldn't fully erase languages that communities continued speaking. Linguistic diversity survived because language's distributed architecture made comprehensive control impractical. You can ban Breton in schools, but you can't monitor every household conversation. You can stigmatize Occitan, but you can't prevent grandparents from teaching grandchildren. Geographic distribution created redundancy. Dialectal variation meant no single campaign could eliminate entire language families.

Now consider the internet. You access it through platforms owned by corporations: search through Google, socialize through Meta, store files with Amazon. These companies don't own the internet through legal decree like French kings mandating Parisian French. Instead, they control the infrastructure—the servers, platforms, and tools—where internet activity happens. The mechanism differs but the pattern matches: consolidation that makes users legible, standardization that enables control, centralization that eliminates alternatives.

The French state spent centuries trying to make its population linguistically uniform using the tools James C. Scott describes in "Seeing Like a State" as making populations legible to central authority. Tech platforms accomplish similar consolidation in a fraction of the time, not through state coercion but through network effects and infrastructure control. But here's what history teaches: distributed architecture resists centralized control. Language survived suppression not through ideology but through structure—every speaker a node, every household a backup system, every dialect a redundant pathway. The question for internet infrastructure is whether we replicate that resilient architecture or accept the centralized vulnerability that made telephone surveillance routine, postal privacy extinct, and platform monopolies inevitable.

Why Distributed Networks Resist Suppression

Before telephone networks, telegraph wires, or paved roads, humans built language. Every language is a network: a system where meaning works only because participants agree—consciously or not—on how symbols map to concepts, how sounds carry ideas, how grammar structures thought. No central server coordinates this. No CEO approves vocabulary. No government allocates bandwidth.

States discovered they could increase administrative capacity by forcing linguistic uniformity. The French example is dramatic but not unique. The Soviet Union suppressed minority languages in favor of Russian. Spain under Franco banned Catalan, Basque, and Galician. The British Empire imposed English across colonized territories. The pattern repeats: states need legible populations to count them, tax them, conscript them, regulate them. Linguistic diversity makes this difficult. Solution: eliminate diversity. Force standardization. Make everyone speak the same language so state bureaucracies can process everyone uniformly.

Despite massive campaigns backed by state violence, linguistic diversity survived. The cost was real: UNESCO lists thousands of endangered languages, many of them dying because suppression worked. But complete elimination proved remarkably difficult. This is distributed consensus operating at civilization scale. Every conversation is a synchronization event. Every parent teaching a child preserves the network. Every community that keeps speaking despite suppression is a node refusing to disconnect. The architecture made language nearly impossible to kill as long as communities wanted preservation.

Compare this to Facebook. When Facebook changes its algorithm, three billion users adapt or leave. They don't fork Facebook and run their own version. They don't create underground networks that preserve old features. The architecture is centralized—one company controls infrastructure, one decision affects billions who have no vote, no voice, no ability to preserve alternatives. This reveals the critical distinction: language could resist suppression because it was already distributed. Proprietary platforms can't be resisted because they're centralized by design. The lesson isn't just descriptive—it's prescriptive. If we want internet infrastructure that resists centralized control, we must build distributed architecture deliberately.

Infrastructure Control: From Writing Systems to Cloud Platforms

Alphabets are infrastructure—technologies for encoding speech into persistent symbols. The Phoenician alphabet emerged around 1050 BCE. Greeks adapted it, adding vowels. Romans modified the Greek version. Each adaptation was a fork: take the existing system, modify for local needs, share improvements. No one asked permission. No one paid licensing fees.

But states recognized that controlling writing systems meant controlling who could participate in governance and commerce. China's imperial examinations required mastering Classical Chinese characters, making literacy a tool of bureaucratic selection. The Catholic Church's Latin literacy monopoly kept biblical interpretation within clerical control. Ottoman authorities restricted printing in Arabic script until 1727 to maintain control over texts.

France provides the clearest example of how infrastructure control compounds linguistic suppression. As the state imposed Parisian French, it simultaneously standardized French orthography through the Académie Française, which published dictionaries declaring official spellings and legitimate words. Regional languages often had no standardized written forms: Occitan existed primarily as a spoken language, and Breton had competing orthographies. When the state made French literacy mandatory for legal documents and government employment, it locked non-French speakers out not just because they spoke different languages, but because their languages lacked the standardized written infrastructure that French possessed.

This same pattern appears when proprietary platforms control digital infrastructure. You can build alternatives, but if Google controls search indexing, Amazon controls web hosting, and Apple controls app distribution, your alternative faces enormous barriers. Not legal prohibition—that would be too obvious—but architectural obstacles that make participation effectively impossible without adopting dominant platforms' standards and paying their taxes: platform fees, API access charges, compliance costs.

Here's the contemporary parallel: when AWS runs proprietary modifications to open-source software, nothing in a permissive license obliges them to share improvements. When Google modifies Android, many of those changes don't flow back to the commons. When Microsoft builds Azure infrastructure on Linux, they aren't required to publish their internal tools. Improvements concentrate in corporate silos. Regional languages died when communities couldn't document them in standardized form. Open-source projects die when improvements get enclosed in proprietary forks that users can't access, inspect, or modify.

AGPL-3 prevents this enclosure. The Affero GPL requires that if you run modified software as a network service, you offer the modified source code to the users who interact with it. This creates standardized infrastructure that anyone can fork and modify, the digital equivalent of the standardized orthography that gave threatened languages preservation tools. Every modification deployed to users stays available to them. No deployed improvement can be locked away. The infrastructure stays open not through goodwill but through legal architecture.

Variation as Survival Strategy

Languages fork. Latin fragmented into Italian, Spanish, Portuguese, French, Romanian, and Catalan: recognizably related but largely mutually unintelligible. States tried to prevent this variation. French linguistic policy aimed specifically at preventing fragmentation. But variation creates resilience. When revolutionary France tried to eliminate Occitan, dozens of regional varieties survived independently. When one valley's dialect faced suppression, another preserved theirs. When urban areas abandoned regional languages, rural communities maintained them. No single campaign could eliminate entire language families because dialectal variation created redundancy.

This is systems resilience through decentralization. Monocultures fail when disease strikes. Language families with dialectal variation survive because suppression must be comprehensive (every valley, every village, every household) to succeed completely.

Software forks operate identically when licensing permits. Linux distributions number in the hundreds: Ubuntu for desktop users, Debian for administrators, Arch for developers, Kali for security researchers. Each optimizes for different use cases while sharing the core kernel. When Ubuntu discovers a security vulnerability and patches it, that fix propagates to other distributions. When Debian improves package management, other distros adopt the innovation. No single entity controls "correct Linux." If authorities tried to ban Linux, they'd face the same problem France faced with Occitan: variations are so numerous, forks so distributed, that comprehensive suppression becomes technically impractical.

Proprietary platforms eliminate this resilience. You can't fork Facebook. When Twitter changed its character limit from 140 to 280, users who preferred the constraint couldn't preserve it. When Instagram switched from chronological to algorithmic feeds, users who wanted time-based sorting had no recourse. Changes propagate from headquarters to billions simultaneously. No variation provides refuge. No forks preserve alternatives.

Mastodon demonstrates the alternative: an open-source social network licensed under AGPL-3. Users can join different instances with different rules, start their own, fork the codebase, or propose changes. If authorities try suppression, they face distributed resistance: shut down servers in one jurisdiction, instances keep running elsewhere; ban the software, forks appear under different names. This isn't hypothetical resilience—it's architectural resilience that matches how linguistic diversity survived centuries of state pressure.

Network Effects: Natural Coordination vs. Enforced Lock-In

Languages exhibit network effects: the value of speaking a language increases with the number of speakers. English dominates international business because network effects created a tipping point. But these effects had help. English dominance emerged from British Empire colonization, American economic power, and deliberate policies that suppressed indigenous languages. When India gained independence, it inherited English as an administrative language because British rule had eliminated opportunities for Hindi, Tamil, and Bengali in government and commerce.

The French consolidation Scott describes reveals how supposedly "natural" monopolies reflect coercive standardization. French didn't become dominant through pure coordination benefits—it became dominant because the state banned regional languages in schools, excluded non-French speakers from government employment, and punished public use of minority languages. This is the critical distinction: natural network effects create coordination benefits but allow alternatives. Enforced network effects eliminate alternatives through deliberate architecture.

Social media platforms exploit this distinction. Facebook's three billion users create enormous network effects—your friends, family, and colleagues are probably there, incentivizing you to join even if you prefer alternatives. But Facebook doesn't just benefit from natural network effects. It deliberately makes interoperability impossible. You can't message between Facebook and Twitter. You can't follow Instagram users from Mastodon. You can't port your social graph without manual reconstruction.

Imagine if speaking English meant you couldn't understand Spanish speakers—not because the languages differ but because English LLC prohibited interoperability with Spanish LLC. This sounds dystopian because linguistic network effects never worked this way, even when states tried suppressing minority languages. A French speaker can learn Occitan. Bilingualism resists institutional pressure because language architecture permits it.

Current social media platforms operate differently. Network effects are partially natural, since large networks provide coordination benefits, but the lack of interoperability is an architectural choice designed to create lock-in. Users can't leave Facebook without losing their social graph, not because migration is technically impossible but because Facebook deliberately prevents it.

This architectural lock-in enables rent extraction—profiting from controlled access rather than from productive innovation. Medieval landlords collected rents by controlling roads; they didn't build better roads, they just owned the only route through. Tech platforms operate identically: Facebook monetizes social graphs users created, charging advertisers for access to attention the platform didn't generate. AWS charges premiums for running open-source software where switching costs—data migration, retraining, architectural changes—make leaving prohibitively expensive. App stores extract 30% fees not for superior technology but for controlling the only access point to users. These are rents collected at digital tollbooths positioned on infrastructure built through collective contribution—open protocols, user-generated content, network effects from mass adoption—then enclosed by corporate control.

ActivityPub, the protocol underlying Mastodon, eliminates tollbooths through interoperability. Users on Mastodon can follow and interact with people on Pleroma, Pixelfed, PeerTube, or any ActivityPub-compatible platform, like speaking English while understanding Spanish speakers. Network effects still exist, but they benefit the protocol rather than any single platform. Users can switch instances, fork platforms, or create new services without losing social connections. ActivityPub itself is an open W3C standard that imposes no license on implementers, but Mastodon and several of its peers are released under AGPL-3, so improvements to those codebases propagate through the ecosystem rather than accumulating in proprietary forks.
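
To make the interoperability concrete, here is a minimal sketch of the kind of message one server sends another when a user follows someone on a different platform. The `@context` URL and the `Follow` type come from the ActivityStreams vocabulary that ActivityPub builds on; the instance domains, user names, and the `build_follow` helper are hypothetical, and real servers also handle delivery and authentication, which this sketch leaves out.

```python
import json

# ActivityStreams 2.0 context used by ActivityPub messages.
AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def build_follow(actor_id, target_id, activity_id):
    """Build a Follow activity as a plain dictionary (hypothetical helper)."""
    return {
        "@context": AS_CONTEXT,
        "id": activity_id,
        "type": "Follow",
        "actor": actor_id,    # who is following
        "object": target_id,  # who is being followed
    }


# Hypothetical accounts on two different servers running different software.
follow = build_follow(
    actor_id="https://mastodon.example/users/alice",
    target_id="https://pixelfed.example/users/bob",
    activity_id="https://mastodon.example/activities/1",
)

print(json.dumps(follow, indent=2))
```

Nothing in the message names the software on either end. Any server that understands this shared vocabulary can take part, which is what lets network effects accrue to the protocol instead of to one platform.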

Grammar, Protocols, and Who Controls Evolution

States tried controlling not just which languages people spoke but how they spoke them. The Académie Française publishes dictionaries declaring "proper" French grammar. The Real Academia Española regulates Spanish across 23 countries. These institutions don't legally own grammar, but they wield enormous influence through schools and governments.

Yet speakers ignore these authorities constantly. The Académie tried banning "email" in favor of "courriel"—French office workers kept writing "email." When French speakers adopted "weekend," the Académie proposed "fin de semaine" to prevent English borrowing. These efforts failed. Language evolved through actual use, not institutional decree. The pattern reveals fundamental tension: states want stable, predictable languages that bureaucracies can process uniformly. Living languages need flexibility to adapt to changing circumstances. When official standards conflict with user needs, users eventually win—but the distributed architecture of language is what makes their victory possible. Speakers could ignore the Académie because grammar operates through distributed usage, not centralized enforcement.

Software protocols are like grammar—rules for how systems communicate. HTTP defines how browsers request pages. SMTP governs email transmission. TCP/IP handles packet routing. These are open standards published by the Internet Engineering Task Force as public specifications. Anyone can implement them, propose improvements, build software that speaks these protocols.
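
As an illustration of how much a protocol resembles grammar, the sketch below builds a simple HTTP/1.1 request by hand and reads the first line of a reply. It opens no network connection; the helper names, the `example.org` host, and the sample response bytes are assumptions made for the example, and a real client would handle many more cases.

```python
def build_get_request(host, path="/"):
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, target, version
        f"Host: {host}",         # HTTP/1.1 requests carry a Host header
        "Connection: close",
        "",                      # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response):
    """Split the first line of a response into version, code, and reason."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


request = build_get_request("example.org", "/index.html")
print(request.decode("ascii"))

# Hand-written sample reply, not captured from a real server.
sample_response = b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"
print(parse_status_line(sample_response))
```

Because the rules are published, anyone can write both halves of this exchange without asking permission, in the same way that anyone can form a sentence in a language they have learned.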

This openness made the internet possible. If HTTP were proprietary, if SMTP required licenses, if TCP/IP belonged to a corporation, the internet would look like CompuServe or AOL—walled gardens where users accessed only content the platform owner approved. The internet succeeded because its foundational protocols operated like language: open standards anyone could use, modify, and extend.

But application-layer software increasingly operates under different rules. When you use Google Docs, the software runs on Google's servers following Google's rules. You can't inspect the code, modify how it works, or fork it to create your own version. The protocol might be open (HTTP), but the application is closed. Here's the architectural difference: the Académie Française can publish official grammar, but they can't prevent French speakers from ignoring it because grammar operates through distributed usage. Google controls Google Docs completely because the software runs on their servers, not users' devices. Language resists central control because it's distributed—every speaker implements it independently. Proprietary software resists user control because it's centralized—one company implements it, and users just access the results.

This is why AGPL-3 matters for infrastructure. Open-source software governed by AGPL-3 evolves like language when the Académie has no power: users encountering bugs or missing features can fork the code and implement fixes, submit improvements to maintainers, run modified versions on their own infrastructure, or propose changes the community evaluates. Software evolves through user-driven development. Bad features get deprecated through disuse. Security vulnerabilities get patched by whoever discovers them first. Changes propagate through adoption, not centralized approval.

Proprietary platforms enforce evolution from above. When Facebook changed its algorithm to deprioritize news content, publishers who depended on Facebook traffic had no recourse. When Twitter eliminated chronological timelines, users who preferred time-based feeds couldn't maintain that feature. Platform changes are technically enforced—when Facebook removes a feature, it's gone. When Twitter changes its interface, everyone sees the new version. The centralized architecture means corporate control succeeds where state linguistic control failed, not because corporate authority is more legitimate, but because the infrastructure enables comprehensive enforcement.

The lesson is prescriptive: if we want users to have the agency that French speakers had relative to the Académie, we must build distributed infrastructure deliberately. AGPL-3 ensures the application layer stays open like the protocol layer. If you run modified document editing software as a service, you share the source code. The entire stack—from TCP/IP routing through application features—remains a commons. Users can inspect, modify, and fork at every level. And critically: when corporate or state authorities try imposing unwanted changes, users can fork and maintain alternatives—but only if the architecture permits forking in the first place.

The Architecture of Control: Postal Networks, Telephone Monopolies, Internet Consolidation

Before telephones, sending a message meant writing a letter and giving it to postal carriers. Messages traveled through infrastructure—postal roads, sorting facilities—but your ability to send mail didn't depend on a single company. The postal system was infrastructure, not platform. Multiple entities participated: national postal services, private carriers, international agreements. Geographic distribution meant suppressing mail communication required controlling vast territories and numerous independent operators.

Telephone systems centralized this. AT&T built the American telephone network and controlled it as a natural monopoly from 1913 to 1984. You couldn't choose providers or run your own switches. Long-distance calls routed through AT&T switching stations, giving AT&T—and government agencies with access—complete visibility into who called whom, when, and for how long. Surveillance became trivially easy. Suppressing communication became possible through pressure on one corporate entity.

This architectural difference explains divergent outcomes: linguistic diversity survived state suppression while telephone privacy didn't survive government surveillance. When France tried eliminating Occitan, they needed teachers in every school, police in every village, inspectors in every household. The distributed nature of linguistic communication made comprehensive suppression practically impossible—too many independent actors, too many private spaces, too much monitoring required. But when the NSA wanted telephone surveillance, they partnered with AT&T and gained access to nearly all American telephone communication through one relationship. Centralized architecture made what was impossible for linguistic suppression routine for telecommunications monitoring.

The internet began as distributed infrastructure. ARPANET connected research institutions through packet-switching networks that routed around failures. If one node went down, traffic found alternate paths. This architecture made the internet resilient against both technical failures and deliberate suppression.
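
The idea of routing around a failure can be shown with a toy model. The sketch below is not how ARPANET or any real router works: it assumes a hypothetical four-node topology and uses a breadth-first search with full knowledge of the network, purely to show that a message still gets through while any alternate path exists.

```python
from collections import deque


def find_route(links, source, target, failed=frozenset()):
    """Breadth-first search for a shortest path that avoids failed nodes."""
    if source in failed or target in failed:
        return None
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbor in links.get(node, ()):
            if neighbor not in visited and neighbor not in failed:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no surviving path


# Hypothetical topology: A and D are connected through either B or C.
LINKS = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}

print(find_route(LINKS, "A", "D"))                      # ['A', 'B', 'D']
print(find_route(LINKS, "A", "D", failed={"B"}))        # ['A', 'C', 'D']
print(find_route(LINKS, "A", "D", failed={"B", "C"}))   # None
```

Losing B only changes the path. Communication stops only when every intermediate node is gone, which is the property the paragraph above attributes to distributed networks.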

Then came consolidation. Most internet traffic now flows through a handful of companies: Google, Amazon, Meta, Microsoft, and Cloudflare. Your email probably routes through Google's servers even if you don't use Gmail. Your web browsing probably passes through Cloudflare's network even if you've never heard of them. Your files probably live on AWS even if you think they're on Dropbox.

This centralization creates the same vulnerability the telephone monopoly created: single points of control that make comprehensive surveillance and suppression practically achievable. When a major AWS region went down in December 2021, Netflix, Disney+, Coinbase, and thousands of other services failed simultaneously. When governments want to monitor internet activity, they can compel cooperation from a handful of corporations rather than monitoring millions of independent operators.

Distributed infrastructure under AGPL-3 returns the internet to resilient architecture. Mesh networks let devices connect directly, routing around failures without central coordination. Federated social media lets servers communicate directly, not through corporate platforms. Distributed storage systems spread data across multiple nodes, eliminating dependence on single providers. Every server becomes a peer, not a client of centralized infrastructure. Every instance can fork and modify. Every community can maintain alternatives. Comprehensive suppression becomes as difficult as linguistic suppression was—requiring monitoring of millions of independent nodes rather than pressuring a handful of corporate gatekeepers.
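
A small sketch can show why spreading data across nodes removes dependence on any single provider. It assumes four hypothetical nodes and places two copies of each file using rendezvous hashing, a placement technique chosen here for illustration and not one the article or any particular storage system prescribes; the node and file names are made up.

```python
import hashlib


def replica_nodes(key, nodes, copies=2):
    """Pick `copies` distinct nodes for a key by rendezvous hashing."""
    def weight(node):
        # Each (node, key) pair gets a stable pseudo-random score.
        return hashlib.sha256(f"{node}:{key}".encode()).hexdigest()
    return sorted(nodes, key=weight, reverse=True)[:copies]


def surviving_replicas(placement, failed_node):
    """Return, per key, the replicas left after one node disappears."""
    return {
        key: [node for node in holders if node != failed_node]
        for key, holders in placement.items()
    }


# Hypothetical cluster and files.
NODES = ["node-a", "node-b", "node-c", "node-d"]
FILES = ["prices.csv", "weather.json", "soil.db"]

placement = {name: replica_nodes(name, NODES) for name in FILES}

for failed in NODES:
    remaining = surviving_replicas(placement, failed)
    # With two distinct copies per file, one failure never removes both.
    print(failed, "down ->", all(remaining[name] for name in FILES))
```

Every line of output ends in `True`: whichever single node fails, each file still has at least one copy elsewhere. Raising `copies` raises the number of simultaneous failures the data can survive.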

What We're Building: Language-Like Architecture for Digital Infrastructure

The internet began as distributed infrastructure, moved toward centralized platforms, and now can return—through AGPL-3 and distributed architecture—toward its peer-to-peer origins. This is how language worked before states tried to standardize it: speakers communicating directly, innovations propagating through use, the system evolving through participation rather than central control. And it's how language survived despite centuries of state suppression: distributed architecture made comprehensive control impossible, preservation happened in spaces authorities couldn't fully monitor, and communities maintained alternatives even when official channels closed.

When NTARI built the Agrinet as distributed agricultural infrastructure with AGPL-3 licensing, we applied the architecture that kept Occitan alive despite French suppression: distributed nodes, peer-to-peer communication, and open standards that anyone can fork and preserve. Every farmer's node broadcasts and receives, creating transparent market information no corporation can monopolize. When governments try restricting agricultural data or corporations try locking farmers into proprietary systems, the distributed architecture makes comprehensive control difficult—farmers can fork the protocol, run independent instances, and maintain alternatives.

When we develop mesh networks for municipal broadband, we're returning internet connectivity to the architecture that made linguistic suppression impractical: device to device, neighbor to neighbor, without corporate intermediaries or centralized switching. When cellular towers fail or corporations restrict access, mesh networks route around the damage.

This is what the internet looks like when it operates with language-like architectural resilience:

Distributed infrastructure creating resilience through geographic dispersion
Open protocols providing transparency that prevents hidden control
Forking and evolution enabling preservation through variation
Peer-to-peer communication offering direct transmission that's harder to monitor
Network effects that benefit users instead of locking them in

Language survived the Académie Française's standardization attempts, French revolutionary suppression of regional languages, Soviet Russification campaigns, Spanish prohibition of Catalan, and British colonial imposition of English. Not everywhere. Not unscathed. Hundreds of languages died. But linguistic diversity survived better than telephone privacy, better than postal confidentiality, better than any centralized infrastructure that states could control through single points of pressure.

The internet can achieve similar resilience—not through wishful thinking but through architecture: AGPL-3 ensuring code stays open, distributed infrastructure ensuring no single entity controls the network, open standards ensuring interoperability across implementations, and geographic distribution making comprehensive suppression as difficult as eliminating languages proved to be.

You already participate in the world's most successful distributed network that resisted suppression: language itself. Every conversation you have is peer-to-peer communication that no central authority fully controls. Every word you coin or borrow is a fork of existing vocabulary that linguistic academies can't prevent. Every grammatical innovation you adopt is a pull request accepted by distributed consensus that official grammars can't block. You navigate this complexity without thinking about it because language architecture makes participation natural and comprehensive suppression impractical.

The internet worked this way once, before consolidation. It can again, if we build with architecture that learned from how distributed networks survive attempts to centralize them. The question is whether we rebuild with language-like resilience or accept centralized platforms imposing the architectural vulnerability that made telephone surveillance easy—extending it from voice calls to every form of digital communication.

Building the Infrastructure

NTARI develops open-source infrastructure—quantum collaboration platforms, agricultural networks, municipal mesh systems, and the .ntari TLD—that makes language-like internet architecture technically viable, legally protected, and resilient against consolidation pressures. If you understand how language survived despite state suppression, you understand why AGPL-3 and federated systems matter. If you've seen how linguistic diversity persisted in private spaces even when banned in public, you understand how distributed infrastructure enables preservation despite hostile conditions. If you know how standardized orthography paradoxically enabled regional language preservation by giving communities tools to document and teach, you understand how open standards enable survival.

The technical infrastructure that provides language-like resilience is under development now. Join the community building it: NTARI Slack workspace. Or support the mission financially: ntari.org/give. Every contribution funds research, development, and policy work that makes language-like internet infrastructure (distributed, open, resilient, and impossible to fully suppress) a reality instead of a historical parallel.