[originally taken from http://www.scs.stanford.edu/07wi-cs244b/notes/l13d.txt]

Plan 9

Why are we reading this paper?
  A different kind of distributed system--day-to-day computing infrastructure
  Addresses different kinds of problems:
    Simplicity, ease of use, ease of development
    Heterogeneity of nodes in a computing environment
    Making better use of hardware resources in day-to-day computing
  Takes a "clean slate" approach--re-build everything from the ground up
    Often lets one test new, better, but previously impractical ideas 
    Means maybe less likely to be relevant to other work
      ...but always good to have seen some such ideas just in case
  In particular:  We've talked about Network Objects/RPC
    But here's a distributed system based entirely around network file systems

What is the motivation for this work?
  People dislike overloaded, "bureaucratic" time-sharing systems

What are the system's fundamental principles?
  1. Resources are named and accessed like files
       I.e., like Unix you might have a /dev/cd for the cdrom
       But you have more extreme examples, like /proc, /net, /dump, etc.
  2. Resources are accessed through a standard protocol, 9P
       Basically a network file system protocol.  Why is this important?
       Combined with 1, means you can access all kinds of devices remotely
       (Try this with NFS on UNIX and you will not get the desired results...)
  3. Disjoint hierarchies of services joined into *private* namespaces
       In other words, my /bin != your /bin.  Very different from UNIX.  Why?
       Compare "my office" to "353 Serra Mall #290":
         Former's meaning depends on who says it, but simpler & more intuitive

How do namespaces work?
  Three system calls manipulate the namespace: mount, bind, unmount
  - int mount(int fd, int afd, char *old, int flag, char *aname) 
      attaches file descriptor "speaking" the 9P protocol to namespace
      Think of descriptor as pipe or socket
      Mount regular file?  Kernel just writes attach message to file
      afd is socket to authentication process, if you want authentication
      aname sent to server in attach message so it can have multiple FSes
  - int bind(char *name, char *old, int flag)
      Replicates name at old
  - int unmount(char *name, char *old)
      name can be NULL, else it specifies one component of union to remove
  flags: MREPL, MBEFORE, MAFTER, MCREATE
    When using bind w/o MREPL, creates a union directory
      What's this?  Example: /bin
    Can you create files in a union directory?
      Yes, if MCREATE is set on one or more entries
  Note:  Unlike UNIX, can mount over either a directory, or a file.  Why?
    E.g., might want to replace /dev/kbd
  Can they get rid of the PATH environment variable?  How?
    Paper claims yes, but short answer is almost, by manipulating namespace
    But some people still want "." in PATH, so they kept it
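
The bind flags and union directories described above can be modeled in a few lines. This is a toy sketch, not the kernel's data structures: the Namespace class, its dict-based "directories", and the sample /bin contents are all made up for illustration, and the numeric flag values are chosen for this sketch only.

```python
# Toy model of union directories. Directories are plain dicts mapping
# file name -> contents; everything here is hypothetical illustration.
MREPL, MBEFORE, MAFTER, MCREATE = 0, 1, 2, 4  # values chosen for this sketch


class Namespace:
    def __init__(self):
        # mount point -> ordered list of (directory dict, creation allowed?)
        self.unions = {}

    def bind(self, new, old, flag):
        """Make directory `new` visible at path `old`, ordered by `flag`."""
        entry = (new, bool(flag & MCREATE))
        order = flag & 3
        members = self.unions.setdefault(old, [])
        if order == MREPL:
            self.unions[old] = [entry]      # replace whatever was there
        elif order == MBEFORE:
            members.insert(0, entry)        # searched first
        elif order == MAFTER:
            members.append(entry)           # searched last
        else:
            raise ValueError("bad flag: %r" % flag)

    def lookup(self, old, name):
        """Return the first match, searching union members in order."""
        for directory, _ in self.unions.get(old, []):
            if name in directory:
                return directory[name]
        raise FileNotFoundError(name)

    def create(self, old, name, contents):
        """Create the file in the first member bound with MCREATE."""
        for directory, creatable in self.unions.get(old, []):
            if creatable:
                directory[name] = contents
                return
        raise PermissionError("no member of the union allows creation")


ns = Namespace()
arch_bin = {"ls": "386 ls binary"}
rc_bin = {"ls": "rc script ls", "man": "rc script man"}
home_bin = {}
ns.bind(arch_bin, "/bin", MREPL)
ns.bind(rc_bin, "/bin", MAFTER)
ns.bind(home_bin, "/bin", MBEFORE | MCREATE)
ns.create("/bin", "mytool", "my private tool")
```

With /bin built this way, a lookup of "ls" finds the architecture-specific binary before the script of the same name, which is how a union directory stands in for a search path.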

Serial port example:  /dev/eia1, /dev/eia1ctl - how do these work?
  On Plan 9, eia1 is just for I/O
  eia1ctl (control device) sets baud rate w. text commands like "b1200".  Why?
    In distributed environment, no worries about byte-order
    Easy to use/debug stuff from shell scripts - this is actually a big deal!
  Cf. UNIX /dev/tty00:  set baud rate, etc. w. complicated ioctl calls
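
The byte-order point can be seen by encoding the same baud rate both ways. A minimal sketch: binary_baud stands in for an ioctl-style binary argument, and parse_ctl is a hypothetical parser for the "b1200" style of command, not the real driver code.

```python
import struct


def binary_baud(baud, byte_order):
    """Encode a baud rate as a raw 32-bit integer ('<' little, '>' big endian)."""
    return struct.pack(byte_order + "I", baud)


def text_baud(baud):
    """Encode a baud rate as a text control command such as b'b1200'."""
    return ("b%d" % baud).encode("ascii")


def parse_ctl(message):
    """Hypothetical parser for a 'b<rate>' control message."""
    text = message.decode("ascii")
    if not text.startswith("b"):
        raise ValueError("unsupported control message: %r" % text)
    return int(text[1:])


little = binary_baud(1200, "<")
big = binary_baud(1200, ">")
# A big-endian reader misinterprets the little-endian bytes:
misread = struct.unpack(">I", little)[0]
# The text form is the same bytes no matter which machine wrote it:
same_everywhere = parse_ctl(text_baud(1200))
```

The text form is also what makes the device usable from a shell script, since a command like "b1200" can simply be echoed into the control file.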

How does /dev/eia1 get into your namespace in the first place?
  Each kernel device has a letter -- t for the UART (serial port)
  Access "root" of device with special pathname #t
  Startup scripts during boot run:  bind -a '#t' /dev

What is system like to user?
  Login process - type your name and password to terminal
    Password is used to authenticate you to file & CPU servers
    Terminal doesn't care who you are--you already have console access!
    Just reboot to log in as a different user
  How does the 8½ windowing system work?
    Click right mouse button, menu allows you to create a new window
    8½ then runs new shell
      Binds over /dev/mouse, /dev/bitblt, and /dev/cons with pipes
      Filters input events, so shell only gets them when window selected
    Different from X windows--graphical apps typically take over the window they are run in
    Note:  Can run 8½ recursively inside a window!
 "The text-editing features of 8½ are strong enough to displace
  special features such as history in the shell, paging and
  scrolling, and mail editors." (p. 4) - How does this work?
    Everything is editable
      Can edit current line you are typing
      For history, scroll back, edit old command, highlight, and "send"
      Note many people configure their prompt as no-op alias
      Means you can copy and send old command including prompt (easier)
    Terminal has "hold mode" toggled by ESC key
      In hold mode, even pressing return does not send input to program
      Mail program puts terminal in hold mode by default--easy to edit messages
   
What is the cpu command?
  Opens shell on another machine--Plan9's SSH.  How does it work?
  Runs a shell on a remote machine (authenticates with your password)
    Attempts to replicate your local namespace on remote machine
    Binds /dev/mouse, /dev/bitblt, /dev/cons on server to local devices
    Re-creates your file server mounts (using proxy authentication)
    What about /bin?  Might not want exact same /bin
      Substitutes binaries for CPU server arch, which might be different
  Isn't there a limit to how seamless heterogeneous hardware can be?
    What if CPU server has different endianness?
      That's why almost everything uses text commands
    Won't cc produce wrong output?
      There is no cc command.  Explicitly specify architecture in command
        8a, 8c, 8l - x86 assembler, compiler, linker; object files named .8
        ka, kc, kl - sparc assembler, compiler, linker, use ".k" files

How does 9P network protocol work?
  All operations done in terms of "fids" - fid is 32-bit handle
    fids are chosen by the client
    First fid is set in attach message, will correspond to root directory
  Some operations on fids:
    walk - traverses the namespace, like "cd" for an individual fid
    clone - creates new fid as copy of current fid
    open - pins fid to file (performs access check), can then read/write
           (can no longer walk a fid after opening)
    read/write - as expected
    stat - return attributes:
      type - type of file on server (e.g., 't' for UART device)
      dev - instance of device on server (for devices w. multiple mount points)
      path - 8-byte unique id for file (<type, dev, path> is unique for server)
      version - 32-bit number changed every time file modified
    wstat - set attributes
    remove - deletes the file
    clunk - closes a fid; fid becomes invalid
  Can you have hard links?
    Protocol doesn't support creating hard links
    Might also be hard with remove message?  Which link would you remove?
  How does version field compare to UNIX <mtime, ctime> combo?
    Incremental copies/backups much easier with UNIX model
  How do they implement a file cache (p. 15)?
    Not built into kernel--run user-level proxy if on slow network
    Keep cached contents until version field changes
    What about writes?  They claim it's write through
      But what if someone else is writing, too?
      Maybe flush cache if version field increases by > # of writes you did
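
The fid operations listed above can be sketched as a toy in-memory server. This is an illustration only: ToyServer, its nested-dict file tree, and the method signatures are assumptions made for the sketch, and real 9P messages carry more fields (tags, qids, permissions) than shown here.

```python
class ToyServer:
    """In-memory stand-in for a file server; method names follow the notes."""

    def __init__(self, tree):
        self.tree = tree   # nested dicts are directories, bytes are files
        self.fids = {}     # client-chosen fid -> {"path": [...], "open": bool}

    def _node(self, path):
        node = self.tree
        for name in path:
            node = node[name]
        return node

    def _new(self, fid, path):
        if fid in self.fids:
            raise ValueError("fid %d already in use" % fid)
        self.fids[fid] = {"path": path, "open": False}

    def attach(self, fid):
        self._new(fid, [])                      # fid now names the root

    def clone(self, fid, newfid):
        self._new(newfid, list(self.fids[fid]["path"]))

    def walk(self, fid, name):
        state = self.fids[fid]
        if state["open"]:
            raise ValueError("cannot walk an open fid")
        node = self._node(state["path"])
        if not isinstance(node, dict) or name not in node:
            raise FileNotFoundError(name)
        state["path"].append(name)              # like "cd" for this fid

    def open(self, fid):
        self.fids[fid]["open"] = True           # a real server checks access here

    def read(self, fid, offset, count):
        state = self.fids[fid]
        if not state["open"]:
            raise ValueError("fid is not open")
        node = self._node(state["path"])
        if isinstance(node, dict):
            raise IsADirectoryError("/".join(state["path"]))
        return node[offset:offset + count]

    def clunk(self, fid):
        del self.fids[fid]                      # fid becomes invalid


srv = ToyServer({"net": {"cs": b"connection server"}})
srv.attach(0)
srv.clone(0, 1)        # keep fid 0 at the root, move fid 1 instead
srv.walk(1, "net")
srv.walk(1, "cs")
srv.open(1)
data = srv.read(1, 0, 10)
srv.clunk(1)
```

Cloning before walking is what lets the client keep a handle on the root while exploring; fid 0 is still usable after fid 1 is clunked.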

For comparison, how does 9P differ from NFS?
  NFS does not show opens and closes
    Makes it hard to implement the kinds of user-level file servers they have
    E.g., connection server cs needs to garbage collect client channels
  NFS handles are server-chosen identifiers bound to particular files
    Opaque, but usually 
    fids are easier to pipeline (but doesn't sound like they do that)
  NFS3 WRITE reply tells you file attributes after AND before the operation
    So you know if someone else wrote file since your cached version
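
The version-counting heuristic floated in the 9P cache discussion can be sketched as follows. It is only a model of that speculation, not the real caching proxy: VersionedFile, CachingClient, and patch are hypothetical names, and the server is a local object rather than a network peer. The version returned by a write plays the role of the "after" information; 9P has nothing like NFS3's "before" attributes, so the client has to count its own writes.

```python
def patch(buf, offset, data):
    """Return buf with data written at offset (zero-filled if past the end)."""
    buf = buf.ljust(offset, b"\0")
    return buf[:offset] + data + buf[offset + len(data):]


class VersionedFile:
    """Server-side file whose version goes up by one on every write."""

    def __init__(self, data=b""):
        self.data = data
        self.version = 0

    def write(self, offset, data):
        self.data = patch(self.data, offset, data)
        self.version += 1
        return self.version


class CachingClient:
    """Write-through cache that flushes on version bumps it cannot explain."""

    def __init__(self, server_file):
        self.file = server_file
        self.cached = None
        self.version = None
        self.flushes = 0

    def read(self):
        if self.cached is None:                     # miss: fetch from server
            self.cached, self.version = self.file.data, self.file.version
        return self.cached

    def write(self, offset, data):
        new_version = self.file.write(offset, data)  # write through
        if self.cached is not None and new_version == self.version + 1:
            # Only our own write happened, so patch the cached copy.
            self.cached = patch(self.cached, offset, data)
            self.version = new_version
        else:
            # Version moved further than our writes explain: someone else wrote.
            if self.cached is not None:
                self.flushes += 1
            self.cached = None


server = VersionedFile(b"hello")
client = CachingClient(server)
first = client.read()
server.write(0, b"J")        # another client writes behind our back
client.write(4, b"y")        # our write reveals the extra version bump
second = client.read()
```

Without the flush, the client would have patched its stale copy and gone on serving "helly" while the server held "Jelly".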

Kernel architecture
  What is the channel abstraction in the kernel?  Structure contains:
      - Type of device, which indexes table of function pointers
           (w. eiaread for serial port, procread for proc, etc)
      - Server device number (in case multiple instances of Type)
      - Qid = <path, version>
      - Flags, device-specific information
    Where are channels used?
      File descriptors, text segment, current working directory, mount device
  How does mount device work?
    Mount device instance allocated by mount system call
        int mount(int fd, int afd, char *old, int flag, char *aname) 
    Creates a new channel, of type 'M', with the mount driver's functions
    Associates that channel with target channel corresponding to 'fd'
    Any operations on new channel will invoke, e.g., mountread, mountwrite, ...
      Translate these into 9P messages sent to fd's channel
  What happens if you write to fd after mounting it?
    Would be bad, because could mess up kernel's 9P messages
    Kernel actually sets flag (CMSG) on mounted fd's channel, so can't do this
  Process/thread model
    To create process: cp /bin/date /proc/clone/mem?
      No - e.g., would cause problems when mounting /proc from other arch
    Instead?  rfork - control over namespace, environment, memory, fds, etc.
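
The channel abstraction and the mount device described above can be sketched as a dispatch table. This is a simplified model under stated assumptions: Chan, devtab, chan_read, and toy_file_server are names invented here, the "9P message" is just a tuple, and only the read operation is shown. The function names eiaread and mountread come from the notes.

```python
class Chan:
    """Toy channel: device type letter, device instance, qid, device-specific state."""

    def __init__(self, dev_type, dev, qid, aux=None):
        self.type = dev_type   # indexes the table of functions below
        self.dev = dev
        self.qid = qid         # (path, version) pair
        self.aux = aux         # device-specific information


def eiaread(chan, count, offset):
    # Local device: data comes straight from the (pretend) serial port.
    return b"bytes from the UART"[offset:offset + count]


def mountread(chan, count, offset):
    # Mount device: translate the call into a request on the mounted channel.
    fid, send = chan.aux
    return send(("Tread", fid, offset, count))


# One table of functions per device type; real tables hold open, write, etc.
devtab = {"t": {"read": eiaread}, "M": {"read": mountread}}


def chan_read(chan, count, offset=0):
    """What a read on any channel boils down to: look up the type, call its function."""
    return devtab[chan.type]["read"](chan, count, offset)


REMOTE_FILES = {7: b"contents served over 9P"}


def toy_file_server(request):
    """Stand-in for whatever is on the other end of the mounted descriptor."""
    kind, fid, offset, count = request
    if kind != "Tread":
        raise ValueError("unsupported message: %r" % (kind,))
    return REMOTE_FILES[fid][offset:offset + count]


serial = Chan("t", 0, qid=(1, 0))
mounted = Chan("M", 0, qid=(2, 0), aux=(7, toy_file_server))
local_bytes = chan_read(serial, 5)
remote_bytes = chan_read(mounted, 6, offset=9)
```

The caller of chan_read cannot tell the two cases apart, which is the point: a remote file and a local device sit behind the same interface.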

What does exportfs do?
  Implements 9P protocol in terms of the open/close/read/write/etc. syscalls
  Is this straightforward?
    Actually, ensuring unique type,dev,path is very annoying
      Why do you need to do this?  E.g., so that mount tables work properly
  Particularly bad if multiple instances of mount device
    Proposal: allow streams to be 'popped' off the mount device
    Allow reading and writing 9P messages to fd when CMSG flag set
    I actually implemented this; was not too hard
      Just need to do tag mapping (tags are their version of RPC xids)
  Possible lesson here--always strive for lowest-common-denominator interfaces:
    It's easy to implement system call interface in terms of 9P
      This is what mount device does
    It's harder to implement 9P in terms of system call interface
      Which is what exportfs does
    So 9P is more general--maybe should replace syscall interface
    And Plan 9 syscalls orders of magnitude easier to distribute than UNIX
      Imagine trying to export the functionality of the UNIX syscall interface!
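
One way to picture the uniqueness problem: exportfs sees files from several devices, each with its own path numbers, yet must present them as one server with unique paths. The sketch below shows a hypothetical remapping table; QidMapper is an invented name and this is not claimed to be how exportfs actually does it.

```python
import itertools


class QidMapper:
    """Hand out one fresh exported path per distinct (type, dev, path) triple."""

    def __init__(self):
        self._next = itertools.count(1)
        self._table = {}

    def export_path(self, dev_type, dev, path):
        key = (dev_type, dev, path)
        if key not in self._table:
            self._table[key] = next(self._next)
        return self._table[key]


mapper = QidMapper()
# Two different files that happen to share path number 7 on different devices:
uart_file = mapper.export_path("t", 0, 7)
mount_file = mapper.export_path("M", 3, 7)
# The same file seen again must map to the same exported path:
uart_again = mapper.export_path("t", 0, 7)
```

A table like this grows with every file ever seen, which hints at why the notes call the requirement annoying.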

What is file server architecture?
  Have memory, hard disks, and WORM drive - multi-layer architecture
  View WORM drive as infinite
    Assume hardware will improve faster than group generates data

How do you set up a TCP connection?
  Usually don't care that it's a TCP connection!
  E.g., want to talk SMTP (mail protocol) to host mail.stanford.edu
  Open /net/cs,  Write that you want SMTP channel to mail.stanford.edu
  Cs says:  open file /net/tcp/clone, write "connect 171.67.20.25!25"
  Do that, then read clone file, get back, e.g., "5".
  This means the file you opened as clone is really /net/tcp/5/ctl
  So open /net/tcp/5/data to read and write data to SMTP server
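
The steps above can be sketched as string handling around the files involved. This assumes cs answers with a single line of the form "clone-file address", such as "/net/tcp/clone 171.67.20.25!25"; the helper names are invented, and the sketch only computes which files to open and what to write, since there is no /net to open outside Plan 9.

```python
def parse_cs_reply(reply):
    """Split an assumed cs answer into the clone file and the connect command."""
    clone_path, address = reply.split(None, 1)
    return clone_path, "connect " + address.strip()


def data_path(clone_path, conn_number):
    """Name the data file, given the clone file and the number read back from it."""
    directory = clone_path.rsplit("/", 1)[0]
    return "%s/%s/data" % (directory, conn_number.strip())


def dial_steps(cs_reply, conn_number):
    """List the file operations a client performs, in order."""
    clone_path, command = parse_cs_reply(cs_reply)
    return [
        ("open", clone_path),
        ("write", command),
        ("read", conn_number.strip()),
        ("open", data_path(clone_path, conn_number)),
    ]


steps = dial_steps("/net/tcp/clone 171.67.20.25!25", "5\n")
```

Nothing in the client mentions TCP: if cs had answered with a clone file for another protocol, the same four steps would apply.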

How do gateways work?
  Just import /net from gateway machine to get outside
How does ftp work?
  Just use ls and cp, it shows up as a file system

What is IL and why?
  Need reliability and order (which UDP doesn't have)
  Need message boundaries (which TCP doesn't have)
  In keeping with clean-slate approach, just design new protocol
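
To see what "no message boundaries" costs, here is what a program must do to carry discrete messages over a byte stream such as TCP. This is not IL, only a sketch of length-prefix framing; the 4-byte big-endian prefix and the function names are choices made for the illustration.

```python
import struct


def frame(message):
    """Prefix a message with its length so the receiver can find where it ends."""
    return struct.pack(">I", len(message)) + message


def deframe(stream):
    """Split received bytes into complete messages plus any leftover partial one."""
    messages, offset = [], 0
    while offset + 4 <= len(stream):
        (length,) = struct.unpack_from(">I", stream, offset)
        if offset + 4 + length > len(stream):
            break                      # rest of this message has not arrived yet
        messages.append(stream[offset + 4:offset + 4 + length])
        offset += 4 + length
    return messages, stream[offset:]


stream = frame(b"Tattach") + frame(b"Twalk")
complete, leftover = deframe(stream)
# A stream can deliver any prefix of what was sent, e.g. all but 2 bytes:
partial, pending = deframe(stream[:-2])
```

A protocol that preserves message boundaries hands the receiver whole messages, so none of this buffering is needed.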

Why is Plan9 better than an old time-sharing system?
  Couldn't you have multiple machines people can log into for less overload?
  Use NFS to give a single-system image on a bunch of old-school UNIX machines?
  Point is users have control of their terminals
    Can construct your own namespace even on CPU servers
    On UNIX, can be a pain to install software if you don't have root
    In Plan9 there isn't even really a notion of root

What is the difference between a file system interface and RPC / Net Objects?
  Network Objects have inheritance, while methods fixed for FS protocol
  FS has uniform and familiar access, protection, and naming mechanisms
    "the way things are named has profound influence on the system" (p. 20)

What has been the impact of Plan 9?
  /proc file system now standard
  UTF-8 now standard

Why isn't everyone using Plan 9 today instead of Linux?
  Plan 9 was certainly ready in time
  Licensing issues prevented redistribution
  Maybe would-be early adopters didn't like centralized storage model
    Administrative issues Plan 9 addresses might not matter to basement hackers
    Not fundamental to Plan 9, but large installation was focus of Plan 9 group
  Building from the ground up made for a different user experience
    Goal of Linux was to replicate experience people already had
    E.g., people might want emacs, not sam / acme
  Software portability issues
    Had POSIX environment, but still easier to build UNIX software on Linux