cs357 Lecture Notes
Spring 2000
Week 13, Tuesday

For today you should have skimmed Chapter 10 and read
Chapter 11 in detail.

For next time you should print and read the paper available
from:

http://rocky.colby.edu/cs357/handouts/fast.filesys.ps

I will spend most of the next two lectures talking about
file systems.  The last homework is due on Thursday and
then the final exam is Friday 12 May from 12:30 to 2:30.

The exam will be comprehensive.  The format will be similar
to the other exams, but about 50% longer.

Files
-----

In MacOS and Windows one is tempted to think of files as
application-specific.

From the OS point of view, a file is just a sequence of
bytes.

The format, if any, is managed by applications.

Some environments distinguish

1) text files: all characters between 0-127, or all printable chars

2) binary files: true byte sequence (all values possible)

But again, this is just a statement about what applications
can read a file.


File attributes
---------------

1) name

2) location (set of disk blocks)

3) size

4) access: who can read/write/execute

5) accounting information


Multilevel Abstraction
----------------------

1) Basic disk operations

2) Basic file operations

3) File access patterns


Disk operations
---------------

allocate an available block

free an allocated block

read a block from disk into memory

write data from memory to a disk block

      (partial reads and writes may or may not be possible)


File operations
---------------

Basic operations

open: given a file name, find the location and check the
      permissions.  Return a handle used by other operations.

read: move some number of bytes into memory.  there is a
      current location pointer that is updated

seek: update the current pointer location explicitly

write: add some bytes to the end of a file and update the size.
       allocate new blocks if necessary

close: indicate that we are done reading/writing


Access methods
--------------

1) sequential access  (only open/read/write/close)

2) direct access (use seek with absolute arguments)

3) indexed access (use seek with information from the file)


Directory operations
--------------------

Directories map file names onto locations
	    
    (and organize other file info)

1) search for file

2) create a file

3) delete a file

4) list files

5) rename a file

UNIX introduces the notion of an inode, a data structure
that contains all the OS-level info pertaining to a file.

Directories map names to inodes.

inodes can then be used to access the file.


Consistency semantics
---------------------

What happens if concurrent threads/processes:

1) write the same file

2) read and write the same file

3) create and delete the same file


Should the OS provide implicit synchronization
for the file system or leave it up to the processes?

In shared memory land, the OS provides primitive but
no implicit help.

At the file system level, there are reasons we might
want help from the OS

1) the processes that are contending for the file system
   are not necessarily part of the same program.  they
   might be owned by different users.  harder for programmers
   to coordinate 

2) there are fewer likely access patterns.  more likely
   that the OS can do the right thing.


UNIX provides a simple consistency semantic

1) multiple processes can open a file at the same time

2) abstractly there is a single physical image and all
   file operations are atomic

The OS provides sync primitives, too.

See the man pages for flock and fcntl.


Disk block allocation
---------------------

Contiguous, linked, indexed.

Contiguous

1) not much info in the inodes

2) fast sequential access

3) fast random access

4) very difficult allocation

Linked

1) not much info in the inodes

2) easy allocation

3) a little overhead at the end of each block

4) possibly bad sequential access

5) very bad random access

Indexed

1) same overhead as linked, but collected into index block

2) easy allocation

3) good sequential and random access

4) inodes are now variable size

Multi-level indexed (UNIX inodes)

1) first few links are in the inode itself

2) single indirect block is similar to indexed

3) double indirect block / triple indirect block

4) indirection always comes with a price

   (either cache space or access time)


UNIX inode performance
----------------------

How big is the biggest file in UNIX?

32-bit block numbers, 4K disk blocks.

Can we refer to all of that with a single inode?

12 indices in inode

Indirect block is 4K, each address is 32 bits.

   (How many bytes can the double indirect block span?)

   (If we switch to a 64-bit file system address, can
   a single inode handle the whole thing?)


Free space management
---------------------

FAT = bitmap

Linked list of free blocks

Hybrids.  For example, keep a free inode.

Out of core data structures!


Directory implementation
------------------------

It's a mapping from strings to inodes.  Hash table.


File system caching
--------------------

Use some of physical memory to cache frequently-used
file blocks

1) directory blocks

2) inodes

3) recently read-written data

Limitation: has to be non-volatile or write-through.

Interesting conflict:

1) is it ever a good idea to page out a process's memory
   in order to cache a disk block?

2) how can we unify the page replacement policy and the
   disk cache policy?