Sifter: file system transfer program

Sifter keeps the files on two different computers identical, or mirrors of each other. It scans a directory tree for files that have changed since the last time sifter was run, and writes these files to a disk or file. On the second machine, it reads that disk or file and writes the files to the proper places.

It is called sifter because it will sift out only the files that you specify as important, and send these.

The low-level file I/O for sifter uses the Java Native Interface (JNI) to call C code, so it will probably only compile on a computer running Unix. With some tweaking of the configuration files, it will also compile and run under Cygwin.

The two major features that make this program useful (instead of just using tar or rsync) are:

  1. A configuration file allows you to rename files. For example:
           .bashrc      (on home machine) <-->  .bashrc.home (on work machine)
           .bashrc.work (on home machine) <-->  .bashrc      (on work machine)
           
  2. You can sift out certain files, and not send them. For example, any file named *.class, a.out, *.o, junk.*, or "any file that is executable and has magic type ELF".
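
To make the second feature concrete, here is a minimal sketch of name-based sifting in Java. It is not sifter's actual code or configuration syntax: the class SiftFilter and the method isSiftedOut are hypothetical names, and the patterns are simply the file names listed above, expressed as standard Java glob patterns. The "executable with ELF magic" rule is left out.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical sketch: decides from the file name alone whether a file
// should be sifted out (not sent). Not sifter's real rule engine.
final class SiftFilter {
    // Patterns copied from the examples in the text (assumed glob syntax).
    private static final List<String> SKIP_GLOBS =
            List.of("*.class", "a.out", "*.o", "junk.*");

    // Returns true if the last path component matches any skip pattern.
    static boolean isSiftedOut(Path file) {
        Path name = file.getFileName();
        if (name == null) {
            return false; // e.g. a root path has no file name
        }
        for (String glob : SKIP_GLOBS) {
            PathMatcher matcher =
                    FileSystems.getDefault().getPathMatcher("glob:" + glob);
            if (matcher.matches(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isSiftedOut(Paths.get("src/Main.class"))); // true
        System.out.println(isSiftedOut(Paths.get("src/Main.java")));  // false
    }
}
```

Matching only the last path component keeps a pattern like *.o independent of which directory the file lives in.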

The files may be transferred over the Internet (using Java's Remote Method Invocation), on floppy disks, or on any other removable disk. I use a memory stick since I work behind a firewall.


What The Program Does

Here is a description of how the program behaves. To use sifter, you install it on two different computers and keep the same configuration file on both. I have a computer at work and one at home, so I will describe how I use sifter.

At the end of the work day, I start sifter. It first reads my configuration file and offers to do a transfer with my home computer. After I accept the offer, it reads a file of persistent data related to my home computer. It notices that it had most recently done a receive from the home computer, so it offers to do a send, which I accept.

The program prompts me to insert a floppy disk, which I do. (I can't use an Internet connection because I work behind a firewall.)

It then searches through my home directory and all subdirectories for any file that has changed since the last time sifter was run. When sifter finds a file that has changed recently, it runs through the list of rules in the configuration file and decides what to do with the file. It can send the file with a new name, send it with the same name, or leave it alone. For the most part, I send source code and text files as they are, I don't transfer compiled files, and I rename config files such as .Xdefaults to something like .Xdefaults.work. If it finds a directory that has changed, it sends a listing of all files in the directory.
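
The scan described above can be sketched in a few lines of Java. This is only an illustration under my own assumptions, not sifter's implementation: ChangeScanner and changedSince are hypothetical names, "changed" is taken to mean a last-modified time later than the previous run, and unreadable files are silently skipped.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.FileTime;
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: walk a directory tree and collect the regular
// files whose modification time is later than the previous run.
final class ChangeScanner {
    static List<Path> changedSince(Path root, Instant lastRun) throws IOException {
        FileTime cutoff = FileTime.from(lastRun);
        try (Stream<Path> tree = Files.walk(root)) {
            return tree.filter(Files::isRegularFile)
                       .filter(p -> modifiedAfter(p, cutoff))
                       .sorted()
                       .collect(Collectors.toList());
        }
    }

    private static boolean modifiedAfter(Path p, FileTime cutoff) {
        try {
            return Files.getLastModifiedTime(p).compareTo(cutoff) > 0;
        } catch (IOException e) {
            return false; // assumption: files we cannot stat are skipped
        }
    }

    public static void main(String[] args) throws IOException {
        // Example: everything under the current directory changed in the last day.
        Instant lastRun = Instant.now().minus(Duration.ofDays(1));
        changedSince(Paths.get("."), lastRun).forEach(System.out::println);
    }
}
```

Each path returned here would then be run through the configuration rules to decide whether it is sent as is, renamed, or left alone.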

If it fills the first disk, it asks for another. When it has written everything, it saves the persistent data and quits.

When I get home I start sifter on my home computer. (I usually eat dinner first...) As before, it reads the configuration file and persistent data, and offers to receive data from work. It also goes through and keeps track of any files that have changed since the last time sifter was run, and makes a note of them in the persistent data file. The next time I send data, it will also send these files.

For each file and directory in the data stream, it attempts some sanity checking in case I have accidentally changed files on both machines recently. If an old file is in one directory but not in the other, it will be deleted. If both have changed, it offers to keep the more recent one, or to pass both files to emacs so that I can run ediff on them.

For symmetry, translations are applied when the files are sent, and the inverse translations are applied when they are received. This way the configuration file can be identical on both machines.
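
The idea of a translation and its inverse can be sketched with a one-to-one rename table. This is a hedged illustration, not sifter's real data structure: RenameTable, toWork, and toHome are hypothetical names, and the sketch does not try to model how sifter decides which direction applies on which machine. It only shows that a one-to-one table built from the .bashrc example can be applied in one direction and undone in the other.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a one-to-one rename table that can be applied in
// one direction and inverted in the other.
final class RenameTable {
    private final Map<String, String> forward = new HashMap<>();
    private final Map<String, String> inverse = new HashMap<>();

    // Registers a pair; rejects anything that would make the table ambiguous.
    void add(String homeName, String workName) {
        if (forward.containsKey(homeName) || inverse.containsKey(workName)) {
            throw new IllegalArgumentException("rename rule is not one-to-one");
        }
        forward.put(homeName, workName);
        inverse.put(workName, homeName);
    }

    // Names without a rule pass through unchanged.
    String toWork(String homeName) {
        return forward.getOrDefault(homeName, homeName);
    }

    String toHome(String workName) {
        return inverse.getOrDefault(workName, workName);
    }

    public static void main(String[] args) {
        RenameTable table = new RenameTable();
        table.add(".bashrc", ".bashrc.home");
        table.add(".bashrc.work", ".bashrc");
        System.out.println(table.toWork(".bashrc"));  // .bashrc.home
        System.out.println(table.toHome(".bashrc"));  // .bashrc.work
    }
}
```

Because every rule is stored in both maps, the same table serves both machines, which is the property that lets one configuration file be shared.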


Version Compatibility

At the moment, I intend to make future versions backward compatible, and partially forward compatible.

This means that if you install a new version, the old persistent data will still be valid, or will be converted to the new version with a minimum of hardship.

You may upgrade sifter on one machine and not on the other. As long as the major versions of the two sifters differ by no more than two, they should still be able to talk to each other. For example, version 1.3 can work with 3.5, but not with 4.0. This might end up being impractical, but it is a noble sentiment.
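
The compatibility rule above amounts to a small comparison of major version numbers. The following is a sketch of that rule as stated, assuming version strings of the form "major.minor"; VersionCheck, major, and canTalk are hypothetical names, and sifter's actual handshake may work differently.

```java
// Hypothetical sketch of the stated rule: two versions can talk to each
// other when their major numbers differ by at most two.
final class VersionCheck {
    // Extracts the major number from a string such as "3.5".
    static int major(String version) {
        int dot = version.indexOf('.');
        String head = dot < 0 ? version : version.substring(0, dot);
        return Integer.parseInt(head);
    }

    static boolean canTalk(String a, String b) {
        return Math.abs(major(a) - major(b)) <= 2;
    }

    public static void main(String[] args) {
        System.out.println(canTalk("1.3", "3.5")); // true
        System.out.println(canTalk("1.3", "4.0")); // false
    }
}
```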


The first time you run sifter, it sets a flag in the persistent data file indicating that there is no "previous send" date. It will send any file that is one day old, and send a listing of all directories. This forces the two computers to be in sync the first time.

Unfortunately, this takes a while if you have a large directory structure.


See sifter-install.html for more information about how to install it.

See sifter-configure.html for more information on how to write a configuration file.

See gui.html for more information on the GUI.

See the SourceForge project page for up-to-date information, downloads, and the CVS repository.


PENDING: (to be documented)


Fred Gylys-Colwell Last Revised: January 19, 2005