Expanding your Grid


Sun Mar 3 04:08:38 CET 2013, mycroftiv

Plan 9 was designed as a distributed system. After you install the distribution from the CD, you have a self-sufficient one-machine system, a standalone terminal. We will consider this "Level 0". How do you proceed from here to a network of Plan 9 machines and provide Plan 9 services to other clients? Note that this guide does not imply a strict dependence on the previous level: it is entirely possible to set up a Plan 9 DHCP/PXE boot server (described in Level 7) without performing all the steps described in previous levels. This is a rough guide which proceeds in order of increasing number of machines and increasing elaboration and customization of configuration.

LEVEL 1: UPGRADING THE INSTALL TO A CPU SERVER

The traditional first step is following the Configuring a Standalone CPU Server wiki page. The transformation from terminal to cpu server provides many additional capabilities:

Even if you are only planning on making use of a single Plan 9 system, it is highly recommended to configure it as a cpu server. You probably use other operating systems as well as Plan 9, and configuring your Plan 9 system as a cpu server will make it vastly more useful to you by enabling it to share resources easily with non-Plan 9 systems.

It should also be noted that the division between terminals and cpus is mostly a matter of conventional behavior. It is possible to configure all services to run on a machine using the terminal kernel, but there is no particular advantage to this. The cpu kernel is easy to compile and can run a local gui and be used as a combined terminal/cpu server.

LEVEL 2: CPU AND DRAWTERM CONNECTIONS BETWEEN MULTIPLE MACHINES

The core functionality of a cpu server is to provide cpu(1) service to Plan 9 machines and Drawterm clients from other operating systems. The basic use of cpu(1) is similar to ssh in unix or remote desktop in Windows. You connect to the remote machine, and have full use of its resources from your terminal. However, Plan 9 cpu(1) also makes the resources of the terminal available to the cpu at /mnt/term. This sharing of namespace is also the method of control, because the cpu accesses the terminal's devices (screen, keyboard, mouse) by binding them over its own /dev. If you are a new Plan 9 user, you are not expected to understand this.
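A minimal sketch of such a session, started from a Plan 9 terminal. The host name gridcpu is a placeholder and does not come from this guide.

```rc
# connect to the cpu server; an rc prompt on the remote machine follows
# ('gridcpu' is a hypothetical host name)
cpu -h gridcpu

# now running on the cpu: the terminal's namespace is visible under /mnt/term
ls /mnt/term

# these device files are the terminal's, bound over the cpu's own /dev
ls -l /dev/cons /dev/mouse
```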

Drawterm from other operating systems behaves the same way. When you drawterm to a Plan 9 cpu server, your local files will be available at /mnt/term. This means you can freely copy files between Plan 9 and your other os without the use of any additional protocols. In other words, when working with drawterm, your environment is actually a composite of your local os and the Plan 9 system. Technically it is a three-node grid, because the Drawterm program acts as an ultra-minimal independent Plan 9 terminal system, connecting your host os to the Plan 9 cpu server.
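As a rough illustration, a drawterm session might look like the following. The flags are those of classic drawterm and may differ in newer builds; the host names, the user name, and the /tmp paths (which assume a unix host) are placeholders.

```rc
# on the host os: start drawterm
# ('gridauth', 'gridcpu' and the user 'glenda' are placeholders)
drawterm -a gridauth -c gridcpu -u glenda

# inside the drawterm session, on the Plan 9 side:
ls /mnt/term                         # the host os file tree
cp /mnt/term/tmp/notes.txt $home/    # copy a host file into Plan 9
cp $home/lib/profile /mnt/term/tmp/  # and a Plan 9 file out to the host
```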

For many home users, this style of small grid matches their needs. A single Plan 9 cpu/file/auth server functions both as its own terminal, and provides drawterm access to integrate with other operating systems. Some users like to add another light terminal-only Plan 9 system as well. Recently (2013) Raspberry Pis have become popular for this purpose. Another option with surprising benefits is using a virtual machine as the cpu server. Because of Plan 9's network transparency, a virtualized cpu server can export all of its services and its normal working environment through the network.

LEVEL 3: A SEPARATE FILE SERVER AND TCP BOOT CPUS

A standard full-sized Plan 9 installation makes use of a separate file server which is used to tcp boot at least one cpu server and possibly additional cpus and terminals. A tcp booted system loads its kernel and then attaches to a root file server from the network. Some of the strengths of the Plan 9 design become more apparent when multiple machines share the same root fs.

If you have already configured a standalone cpu server, it can also act as a file server if you instruct its fossil(4) to listen on the standard port 564 for 9p file service. Choosing "tcp" at the "root is from" option during bootup allows you to select a network file server. You can use plan9.ini(8) to create a menu with options for local vs tcp booting. A single file server can boot any number of cpu servers and terminals.
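A sketch of the listening step, assuming the fossil console is posted at /srv/fscons (the usual default) and using gridfs as a placeholder host name:

```rc
# ask the running fossil to accept 9p connections on TCP port 564
# (assumes the fossil console is posted at /srv/fscons)
echo 'listen tcp!*!564' >>/srv/fscons

# from another Plan 9 machine, check that the file server answers
# ('gridfs' is a hypothetical host name)
srv tcp!gridfs!564 gridfs
mount /srv/gridfs /n/gridfs
ls /n/gridfs
```

A listen command issued this way lasts only until the next reboot. To make it permanent, it would need to be added to the stored fossil configuration; see fossil/conf in fossil(4).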

The first time you work at a terminal and make use of a cpu server when both terminal and cpu are sharing a root fs from a network file server is usually an "AHA!" moment for understanding the full Plan 9 design. In Plan 9 "everything is a file" and the 9p protocol makes all filesystems network transparent, and applications and system services such as the sam(1) editor and the plumber(4) message passing service are designed with distributed architecture in mind. A terminal and cpu both sharing a root fs and controlled by the user at the terminal can provide a unified namespace in which you can easily forget exactly what software is running on which physical machine.

LEVEL 4: SHARING /SRV /PROC AND /DEV BETWEEN MULTIPLE CPUS

The use of synthetic (non-disk) filesystems to provide system services and the network transparent 9p protocol allow Plan 9 to create tightly coupled grids of machines. This is the point at which Plan 9 surpasses traditional unix - in traditional unix, many resources are not available through the file abstraction, nfs does not provide access to synthetic file systems, and multiple interfaces and abstractions such as sockets and TTYs must be managed in addition to the nfs protocol.

In Plan 9 a grid of cpus simply import(4) resources from other machines, and processes will automatically make use of those resources if they appear at the right path in the namespace(4). The most important file trees for sharing between cpus are /srv, /proc, and some files from /dev and /mnt. In general all 9p services running on a machine will post a file descriptor in /srv, so sharing /srv allows machines to make new attaches to services on remote machines exactly as if they were local. The /proc filesystem provides management and information about running processes so import of /proc allows remote machines to control each other's processes. If machines need to make use of each other's input and output devices (the cpu(1) command does this) access is possible via import of /dev. Cpus can run local processes on the display of remote machines by attaching to the remote rio(4) fileserver and then binding in the correct /mnt and /dev files.
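The imports described above might look like this in practice. This is only a sketch: gridcpu is a placeholder host name, pid 1234 is hypothetical, and /n/gridproc relies on the usual on-demand mount points under /n.

```rc
# 'gridcpu' is a hypothetical host name

# union the remote /srv after the local one, so services posted
# on the remote machine can be mounted as if they were local
import -a gridcpu /srv /srv
ls /srv

# mount the remote process table at a separate place and inspect it
import gridcpu /proc /n/gridproc
cat /n/gridproc/1/status

# control a remote process by writing to its note file
# (pid 1234 is a placeholder)
echo kill >/n/gridproc/1234/note
```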

LEVEL 5: SEPARATE ARCHIVAL DATA SERVERS AND MULTIPLE FILE SERVERS

This guide has been referring to "the file server" and not making a distinction between systems backed by venti(8) and those without. It is possible and recommended to use venti(8) even for a small single machine setup. As grids become larger or the size of data grows, it is useful to make the venti server separate from the file server. Multiple fossil(4) servers can all use the same venti(8) server. Because of data deduplication multiple independent root filesystems may often be stored with only a slight increase in storage capacity used.

Administering a grid should include a system for tracking the rootscores of the daily fossil snapshots and backing up the venti arenas. A venti-backed fossil by default takes one archival snapshot per day and the reference to this snapshot is contained in a single vac: score. See vac(1). Because fossil(4) is really just a temporary buffer for venti(8) data blocks and a means of working with them as a writable fs, fossils can be almost instantaneously reset to use a different rootscore using fossil/flfmt -v.
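A hedged sketch of recording a rootscore and resetting a fossil to it. The partition path and the venti host name gridventi are placeholders, and $rootscore is assumed to hold a previously saved score as bare hex digits.

```rc
# print the score of the most recent archival snapshot
# (the output begins with vac:); save this somewhere safe
fossil/last /dev/sdC0/fossil

# reinitialize a fossil partition from a saved score
# WARNING: this discards whatever the partition currently holds
# ('gridventi', the partition path and $rootscore are placeholders)
fossil/flfmt -h gridventi -v $rootscore /dev/sdC0/fossil
```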

To keep a full grid of machines backed up, all that is necessary is to keep a backup of the venti(8) arena partitions and a record of the fossil(4) rootscores of each machine. The rootscores can be recovered from the raw partition data, but it is more convenient to track them independently for faster and easier recovery. The simplest and best system for keeping a working backup is keeping a second active venti server and using venti/wrarena to progressively back up from one to the other. This makes your backup available on demand simply by formatting a new fossil using a saved rootscore and setting the backup venti as the target. If the data blocks have all been replicated, the same rootscores will be available in both. See venti-backup(8).
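In outline, the replication and recovery steps could look like the following. The host name backupventi and the partition paths are placeholders, and the offsets are deliberately left as variables: venti-backup(8) explains how to obtain them for each arena.

```rc
# copy blocks from one arena of the primary venti to a second venti
# ('backupventi' is hypothetical; $arenaoffset and $clumpoffset must be
# filled in as described in venti-backup(8))
venti/wrarena -h backupventi -o $arenaoffset /dev/sdC0/arenas $clumpoffset

# recovery: format a new fossil from a recorded rootscore,
# with the backup venti as its target
# ($rootscore is a saved score, assumed to be bare hex digits)
fossil/flfmt -h backupventi -v $rootscore /dev/sdC0/fossil
```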

LEVEL 6: MULTI-OS GRIDS USING U9FS, 9PFUSE, INFERNO, PLAN9PORT, 9VX

The 9p protocol was created for Plan 9 but is now supported by software and libraries in many other operating systems. It is possible to provide 9p access to both files and execution resources on non-Plan 9 systems. For instance, Inferno speaks 9p (also called "styx" within Inferno) and can run commands on the host operating system with its "os" command. Thus an instance of Inferno running on Windows can bring those resources into the namespace of a grid. Another example is using plan9port, which includes a unix version of the rc shell, together with 9pfuse to import a hubfs from a Plan 9 machine and attach a local shell to the hubs. This provides persistent unix rc access as a mountable fs to grid nodes.

Please see connecting to other OSes and connecting from other OSes.

LEVEL 7: STANDALONE AUTH SERVER, PLAN 9 DHCP, PXE BOOT, DNS SERVICE, /NET IMPORTS, /NET.ALT

Plan 9 provides mechanisms to manage system roles and bootup from a DHCP/PXE boot server. At the size of grid used by an institution such as Bell Labs itself or a university research department, it is useful to separate system roles as much as possible and to automate their assignment: a Plan 9 server assigns roles and IP addresses to Plan 9 machines via PXE boot and Plan 9 specific DHCP fields. This requires a well-controlled local network and an ndb(8) configuration that lists the ethernet addresses of client machines and which kernel to serve to each.
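A sketch of what such a configuration might contain. Every address, name, and ethernet address below is an example value, and the boot file shown is the classic PXE loader path, which may differ between distributions.

```rc
# append a hypothetical network entry and one client entry
# to the local database (all values are examples)
cat >>/lib/ndb/local <<'EOF'
ipnet=gridnet ip=192.168.1.0 ipmask=255.255.255.0
	ipgw=192.168.1.1
	auth=192.168.1.10
	fs=192.168.1.10
	dns=192.168.1.10
ip=192.168.1.21 ether=00aabbccddee sys=term1
	bootf=/386/9pxeload
EOF

# serve addresses and boot files to the machines listed above
ip/dhcpd
ip/tftpd
```

The attributes on the ipnet line are inherited by every machine on that network, so a per-machine entry only needs what is unique to it.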

One of the most common specialized roles in a mid- to large-sized grid is the standalone auth-only server. Because auth is so important and may be a single point of failure of a grid, as well as for security reasons, it is often a good idea to make the auth server an independent standalone box which runs nothing except auth services, is hardened as much as possible against failure, and presents a minimal attack surface. In an institution with semi-trusted users such as a university, the auth server should be in a physically separate and secure location from user terminals.

A grid of this size will probably also have use for DNS service. For personal users on home networks, variables such as the authentication domain are often set to arbitrary strings. For larger grids which will probably connect to public networks at some nodes, the ndb(8) and authsrv(6) configuration will usually be coordinated with the publicly assigned domain names. It is also at this point (public/private interface) where machines may be connected to multiple networks using /net.alt (simply a conventional place for binding another network interface or connection) and may make use of one of Plan 9's most famous applications of network transparency - the import(4) of /net from another machine. If a process replaces the local /net with a remote /net, it will transparently use the remote /net for outgoing and incoming connections.
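Both uses of a remote /net can be sketched as follows, with gateway as a placeholder for a machine that has an outside interface:

```rc
# 'gateway' is a hypothetical host name

# bind the gateway's network stack at the conventional alternate place
# and dial through it explicitly
import gateway /net /net.alt
telnet /net.alt/tcp!example.com

# or give one shell and its children the remote stack as their /net
rfork n
import gateway /net
telnet tcp!example.com
```

The rfork n in the second half gives the shell its own copy of the namespace, so replacing /net there does not affect other processes on the machine.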

LEVEL 8: RESEARCH AND COMPUTATION GRIDS WITH CUSTOM CONTROL LAYERS AND UNIQUE CAPABILITIES

In its role as a research operating system, the capabilities of Plan 9 as a distributed system are often extended by specific projects for specific purposes or to match specific hardware. Plan 9 does not include any built-in capability for things like task dispatch and load balancing between nodes. The Plan 9 approach is to provide the cleanest possible set of supporting abstractions for the creation of whatever type of high-level clustering you wish to create. Some examples of research grids with custom software and capabilities:

These are examples of projects which are built on 9p and the Plan 9 design and customize or extend the operating system for additional clustering, task management, and specific purposes. The flexibility of Plan 9 is one of its great virtues. Most Plan 9 users customize their setups to a greater or lesser extent with their own scripts or changes to the default configuration. Even if you aren't aspiring to build a 20-node "Manta Ray" swarm to challenge Nemo's Octopus, studying these larger custom systems may help you find useful customizations for your own system, and the Plan 9 modular design means that some of the software tools used by these projects are independently useful.

LEVEL 9: POWER SET GRIDS?

Because existing grids already have the label "9grid" it is theorized that the as-yet unreached Ninth Level of Gridding corresponds to the power set operation. The previously described eight levels of gridding already encompass an infinite Continuum of possibilities. To surpass the existing level requires finding a way to construct the power set of this already infinite set. Level Nine grids are therefore transfinite, not merely infinite, and it is an open question if current physical reality could accommodate such structures. At the present moment, research indicates such hypothetical Level Nine grids would need to post mountable file descriptors from quasars and pulsars and store information by entropic dissipation through black hole event horizons. See astro(7) and scat(7).