13 Dec 2015
Some time back, a paper authored by Rob Pike titled The Hideous Name convinced me of the value of hierarchical naming conventions, and especially, of kernel-side support for such. When porting STS to run on the Kestrel-3 emulator, I used the opportunity to retrofit hierarchical naming support into the kernel. As the sole user of STS at the moment, breaking backward compatibility with existing software was not a consideration, and it seemed like as good a time as any to explore the concept.
This might have been a mistake, at least from an implementation complexity point of view; while I’ve regained feature parity with the latest Kestrel-2 version of STS, getting the web of interacting data structures and their respective memory management inside the kernel has been much harder to debug.
Looking back on my experience implementing STS, both the flat namespaced versions 1.0 through 1.2 for the Kestrel-2, and of the hierarchically namespaced version 1.5 for the Kestrel-3, it remains unclear to me that hierarchical namespaces offer any significant value over a flat name-space.
Before defending my position, I should probably explain what I mean by hierarchical filenaming and flat filenaming.
The first example comes from IBM, with their System/360 through z/Architecture mainframe family. With the exception of Unix or Linux ports, most operating systems for IBM’s mainframe family identify files using up to 44 characters. (Note: IBM literature refers to them as datasets, not as files; for our purposes, they’re the same thing.) These names are typically formatted as a dot-separated sequence of “qualifiers,” each of which can vary between one to eight characters. Here are a few examples:
As you can see,
the mainframe environment does have a hierarchy of sorts,
albeit one imposed mostly by convention rather than enforced by the kernel.
Many mainframe applications
are built to accept
DEPTNAME.USERNAME as a common prefix
for filenames it uses when processing related sets of data.
z/OS and its predecessors identify volumes
by name, such as
However, z/OS treats volume naming separately from file naming.
This explains why “uncataloged” files
must have their name and volume name
specified separately in job steps or user interfaces.
This is a gross over-simplification, of course. Things become significantly clearer and easier, though, if you catalog your file. This associates all the relevant metadata about how to find the file in a centralized directory service. That way, you need only provide the data set name, and magically, your application knows which volume, which cylinders on that volume, which record format to use, and all sorts of other useful information needed to perform file I/O.
In general, I’m not going to consider cataloging. I’m strictly keeping things simple, by maintaining a two-dimensional space of names: a file with the given name, located on a volume with a given name.
Because volumes were identified by label, and not by a specific device number, the system software can locate the physical device using a lookup of the volume name. Thus, most file references were resolved using two linear lookups: the first to resolve the volume, and the second to resolve the name on that volume.
The second example
comes from my experience
with Commodore Business Machines’ 8-bit computers.
Starting with the Commodore PET,
and lasting all the way through to the Commodore 128,
CBM DOS always implemented a flat naming environment.
Each filename could be no more than 16 characters long.
Unlike IBM mainframes,
no particular convention for hierarchy existed,
either in the system software or in applications.
Nonetheless, prefix wildcards like
supported by external disk units,
could be used to fake it.
For example, from Commodore 64 BASIC,
you can get a directory of all files
starting with the common prefix
FOO.BAR. like this:
then listing the final result. Likewise, the equivalent to removing an entire directory looked like this:
OPEN 15, 8, 15, "S0:FOO.BAR.*":CLOSE 15
instructed the disk controller
to walk its directory,
and for all files matching the given filename pattern,
scratch (delete) the file.
CLOSE command would block until the scratches have completed.
As with mainframe environments, applications needed to explicitly identify volumes on which files were located. However, instead of a volume name lookup, it did so by identifying the ordinal device number of the storage device. It’s device independent in the sense that applications generally didn’t care what kind of disk unit appeared at device 8, only that a disk unit appeared there. Specifics of the hardware only mattered when you had to deal with, among other things, serial I/O accelerators or disk copy protection mechanisms. Thus, CBM DOS and its interactions on the host computer could be said to be device independent as well; but, it clearly was not location independent.
Atari TOS, MS-DOS, Windows, et. al. provide hierarchical naming, but it’s not completely uniform. These operating systems provide two kinds of hierarchy: the device or volume layer, and the filename itself.
To illustrate these two levels of hierarchy, I’ve listed a few such filenames below:
The former example might reference a file on the computer’s startup harddrive,
most typically assigned the device
since the two preceding devices are often reserved for floppy disks.
The latter example might reference a network mount,
and thus might reference files on a dedicated network server.
DOS, Windows, etc. know how to handle
the different access methods
because of the information in front of the colon.
The handler associated with “device C” knows how to access the local harddrive,
while the handler associated with “device U” knows how to use the network.
In both cases,
everything after the colon is treated opaquely
by whatever filesystem is associated with the associated (real or virtual) device.
Now, I’ll provide an example of a pure hierarchical filenaming convention.
Unix is perhaps the prototypical example of such a thing.
Indeed, MS-DOS, Windows, etc. gets their directory syntax from Unix.
(However, a backslash
\ is used instead of a forward slash
because in MS-DOS,
/ was already commonly used
for other command-related syntax.)
Typical filenames appear like these:
/etc/aliases /usr/local/lib/libcurses.so /home/kc5tja/Desktop/lessons-learned.txt
Note that Unix-style hierarchical naming is basically the same as in Windows,
except no indication of a device is readily visible.
Unix determines which filesystem to access based on a prefix of the pathname.
/ might correspond to
C: in Windows,
/usr/local might correspond to a different disk drive (maybe
D: in Windows),
and so forth.
Unix provides something called a mount table to keep track of these associations.
This means that you can replicate everything MVS and Windows does with clever naming conventions in the namespace. For example:
C:\WINDOWS\WIN.EXE /mnt/hd0/c/windows/win.exe U:\USR\LOCAL\LIB\LIBCURSES.SO /usr/local/lib/libcurses.so DSNAME(DEPT.USER.FILENAME) VOLUME(VOL123) /mnt/mainframe/vol123/dept.user.filename
Based on this knowledge, one must wonder why one wouldn’t want to use hierarchical filenaming!
While I do not deny the user benefits of hierarchical naming, I can state with some authority that getting things right inside the OS is, well, hard.
It’s easy enough to get name resolution working correctly;
however, doing this requires a ton more data structures
than if you just used flat namespaces or an abstraction thereof.
For example, to implement the file “/rom/m2” in STS V1.5’s ROM filesystem,
I need a
DirObj structure to represent
DirObj to represent
DirEnt to name it
RomFile to represent
DirEnt to name the
Had I used a flat namespace,
this could have been implemented in only two data structures:
the first being a linked list header,
and the second being a node on that list
describing the file and giving it its name.
Some might say that with hierarchical namespaces you can rely on links (hard and soft) to provide multiple names to files and directories. GoboLinux uses these to excellent effect to make a trivial to implement, and virtually uncorruptable, package manager for the Linux environment. For a long time, I conceived of a package manager for STS and its successors strongly influenced by Gobo’s technology.
First, I’d like to point out that symbolic links are not a feature unique to hierarchical filesystems. Proof of this can be found in any Unix-like OS where you create a plurality of symbolic or hard links in a single directory.
Second, the value of links is questionable. Yes, I use them frequently, and when applied judiciously, they provide excellent value. Nonetheless, in the face of these links, you lose the benefits of hierarchy quickly. Getting something as simple as:
cd .. # change to parent directory pwd # print current working directory name
correct is actually a hard enough problem that a genius-level systems software engineer, Rob Pike himself in fact, can only “solve” it for Plan 9 using a memory-inefficient, brute-force solution (to summarize, storing a complete copy of the filename for every tree node encountered as directory traversal happens. If you get to a specific point in the filesystem through N paths, that’s just too bad: that one point now has N chains of nodes in kernel space, each named for its unique path). This is actually an attack vector against any kernel using Pike’s method: using millions of processes, create a unique mount point or link, with a long name, to a common, well-known directory, then change into that directory via the mount or link. Do this enough times, and you’ll exhaust the kernel’s memory. (This is also why kernels should be as stateless as possible, but that’s a different lesson for a different post.)
Rob Pike is clearly aware of problems with links, as documented by his overall dislike for links in general, and symbolic links in particular. He’s written several papers on the topic, and I suggest you read them. They’re quite fascinating; and, even if you disagree with Pike, you’ll at least find them thought provoking.
flat namespaces have evolved for decades
(albeit in its own niches)
without the need for links.
The closest thing to symbolic links in z/OS is the VSAM catalog.
In other environments, like AmigaOS,
logical device names and environment variables
serve similar roles to symbolic links,
without incurring their expense,
cd .. cannot cross a device or logical name boundary.
Above, I alluded to the GoboLinux package management method, and how I’d like to borrow heavily from it to implement STS’ own package management system.
So, how would I replicate GoboLinux’s package management semantics in a flat namespace without supporting symbolic links? One approach is to borrow an idea from z/OS, and use something called a “partitioned data set”, or on Unix, an “archive file.” A program is considered installed and available for a user to use if the program is present in a well-known archive file. Similar archive files could exist to hold different kinds of program resources; for example, we could store binaries in a file simply called ARCHIVE.PRG, libraries for binaries in ARCHIVE.LIBS, system-wide configurations in ARCHIVE.CONFIG, and so on. However, I think this will get messy over time.
Instead, I think a better solution is to stay closer to Gobo’s example.
Given a future version of the STS command shell,
and given that I type a command named
that shell can simply look for a file
PRG/cmd to run.
We know it’s executable because,
if it appears at all,
it appears with the
PRG/ prefix and the file’s header checks out.
We needn’t concern ourselves with the current version of the command,
since by definition, it is the version the user specifically requested.
Typically, it will be the latest version installed,
but simple tools would be provided to switch versions as well.
This program can depend on various libraries as well;
for example, to provide a text-based user interface,
PRG/cmd might elect to load
Configuration for the program might be stored in a variety
of files, such as
CFG/cmd for system-wide configuration,
username/CFG/cmd for a per-user configuration on a multi-user system,
and so on.
Plug-in modules for
can be stored on the filesystem as
PLUGIN/cmd/whatever, or whatever’s appropriate for the command’s design.
What if the user wants to invoke an older version of
Since all the shell does is prepend the typed filename with
it follows that you can just name the version explicitly,
This results in an attempt to run the executable
NOTE: In STS versions 1.0 through 1.2,
I used the full-stop as a qualifier separator (
In STS V1.5, I switched to using a slash (
since I thought hierarchical namespaces would make
mounting POSIX filesystems easier later on.
I actually prefer using full-stops,
and in all honesty, I’ll probably go back to using them for V2.0.
This has the benefit that someone can make a filesystem driver for Linux or BSD,
and not have to worry about faking a hierarchy to the OS where there is none.
Besides, if I had used slashes,
and my directory contains two files
how do you map that into a Posix-compatible directory listing?
B would have to appear as a file, or as a directory;
it cannot appear as both concurrently.
A common criticism concerning flat namespaces is that they do not support large directories effectively. This was especially true for earlier filesystem implementations, but it was a problem just as poignant with hierarchical filesystems (such as with Unix System V Release 4), particularly with degenerate directory layouts.
Imagine a disk formatted in STS V1.0’s filesystem, SL5. SL5 maintained a single, flat directory, allocated as a single, contiguous chunk of disk space. SL5 can pack information for 8 files in a single disk sector. To support a directory of 256 files, we need at least 32 512-byte sectors. For a typical PC hard drive, this is typically about three tracks of data. To locate any file on the volume, you can reasonably expect to take around 10 to 20 milliseconds using a sequential search.
To support 256 million files, however, is a different matter. That would require 32 million sectors of alloted space, and probably would require close to 20,000 seconds to find a file using the same linear search algorithm.
Obviously, the continued reliance upon sequential searches is the problem. To fix this, one would need a tree of some kind. You could use a hierarchical naming system, where the structure of the tree is exposed to the user and directly under his or her control. Or, you can create an index, managed by the filesystem for you, to help locate arbitrary files quickly.
The problem with the former approach
is that it leads inexorably to suboptimal directory layouts.
when I wrote GCOM
back in the day,
I decided to use the filesystem to store the registry mappings
from COM CLSIDs
(effectively, strings of arbitrary hexadecimal values)
to shared object files.
I knew ahead of time
that the directory in which this would happen
could grow to be rather huge if GCOM ever took off.
I was prepared to introduce a convention where
you relied on subdirectories to improve directory lookup performance.
For instance, if a CLSID had the value of AAABBBCCC…,
then I could place the actual mapping in a file
You also see this approach used in e-mail servers,
and other facilities working with tens of thousands of files, or more.
The most obvious problem with this approach
can be seen when you consider what happens
when a disproportionate number of COM CLSIDs all share a common prefix.
Do you then switch to two levels of this hierarchy over all,
or just for that one subdirectory?
Then, there’s all the text processing that must happen inside the library
to make this magic all come together and work.
In this case, hierarchy gets in the way.
Meanwhile, IBM’s linear filenaming approach for OS/360 (introduced 1964) has remained intact all the way through to today’s latest z/OS release, supporting everything from tape drives, to drum drives, to disk drives, to CD-ROMs, to network storage devices, and who knows what else will come in the future, handling everything from kilobytes to petabytes of data across tens to thousands of different volumes in high load, low response-time situations. It performs this well because, among other tricks of the trade, it relies on a system-managed index. The “hierarchy”, if you will, is computed dynamically as the filesystem and its flat namespace evolves over time.
Flat name-spaces using explicitly named volumes solve 90% of the problems I’d ever use a filesystem to solve, namely locating persistent data. It’s substantially simpler to implement than a Unix-semantics hierarchical namespace, with or without link support.
One legitimate criticism to a flat namespace
is when you must rename a qualifier.
mv oldDir newDir,
oldDir/foo now is known as
To do this with a flat namespace,
you’d have to iterate through the directory
and rename every file with a matching prefix.
Finding the first entry should be fast enough
(as fast as opening the file);
it’s the need to rename subsequent files that becomes the issue.
With a good index data structure,
such as a Skip List,
you can compute both starting and ending locations
to minimize the amount of work needed.
There are more efficient ways of solving this efficiently as well,
but again, is the cost actually justifiable?
Except for this one issue, which I consider minor based both on STS’ own application domain and on my experience managing thousands of Linux servers in data centers, I’m hard-pressed to think of any real problem with flat namespaces today. There’s nothing inherently wrong with Unix-like semantics; I just don’t think it’s worth the expense of putting it into the kernel.
One aside though:
unlike z/OS, I would use longer filenames.
I’d choose, say, minimum of 160 byte filenames,
allowing me to use 9 qualifiers of 16 characters each,
with some slop left over.
I also wouldn’t impose a hard limit on qualifier length.
a.someReallyLongQualifierHere.b would be a valid filename,
despite the middle qualifier being longer than 16 characters.
But, that’s just me.
Software engineer by day. Amateur computer engineer by night. Founded the Kestrel Computer Project as a proof-of-concept back in 2007, with the Kestrel-1 computer built around the 65816 CPU. Since then, he's evolved the design to use a simple stack-architecture CPU with the Kestrel-2, and is now in the process of refining the design once more with a 64-bit RISC-V compatible engine in the Kestrel-3.
Samuel is or was:
Samuel seeks inspirations in many things, but is particularly moved by those things which moved or enabled him as a child. These include all things Commodore, Amiga, Atari, and all those old Radio-Electronics magazines he used to read as a kid.
Today, he lives in the San Francisco Bay Area with his beautiful wife, Steph, and four cats; 13, 6.5, Tabitha, and Panther.