Python Script for Automatically Copying Modified Files

I had a situation at work where I needed to copy static resources from one directory to another whenever they changed. A simple symlink wouldn’t do, because the destination was on a shared drive and not all users would have access to my machine. Although there are existing solutions for this problem, it seemed simple enough to write my own Python script. Today’s article describes that script, which monitors files for changes and copies them to another location.

How to do it…

The complete script is available as a gist. It is a command-line script that uses Python’s built-in argparse module to parse three arguments: source, destination, and source_map:

import argparse
import os
import time

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Listen for file changes and mirror changed files to a second location.')
    parser.add_argument('-s', '--source', dest='source', action='store',
        default=None, help='The source file', required=False)
    parser.add_argument('-d', '--destination', dest='destination',
        action='store', default=None, help='The destination file',
        required=False)
    parser.add_argument('-m', '--source_map', dest='source_map', action='store',
        default=None, help='A CSV file mapping multiple source files to '
                           'multiple targets', required=False)
    args = parser.parse_args()

    if not (args.source_map or (args.source and args.destination)):
        raise ValueError(
            "You must provide either a source_map of files or "
            "a source and destination file")

Then it creates a list of files that need to be monitored for changes:

    file_list = []

    if args.source and args.destination:
        file_list.append((args.source, args.destination,))

    if args.source_map:
        source_map = os.path.normpath(args.source_map)
        # Each line holds "source_path,destination_path"
        with open(source_map, 'r') as f:
            for line in f:
                line = line.strip()
                if line:  # skip blank lines
                    file_list.append(tuple(line.split(",")))

The meat of the script is an infinite loop that checks whether the modified time of any source file has changed and updates the corresponding destination file when it has:

    last_checked_map = {}

    # currently only ctrl+c will terminate
    while True:
        for t in file_list:
            source_file = os.path.normpath(t[0])
            destination_file = os.path.normpath(t[1])
            try:
                stat = os.stat(source_file)
            except OSError as e:
                print("Encountered an OSError, skipping file:")
                print(e)
                continue
            last_time = last_checked_map.get(source_file)

            # Copy on the first pass, or whenever the source is newer
            if last_time is None or stat.st_mtime > last_time:
                with open(source_file, 'rb') as f:
                    filedata = f.read()
                # Write in binary mode so the bytes are copied unchanged
                with open(destination_file, 'wb') as f:
                    f.write(filedata)
                last_checked_map[source_file] = stat.st_mtime
                print("File %s changed, updated %s" % (
                    source_file, destination_file))

        time.sleep(1)

How it works…

The argparse module defines and parses the arguments, then returns an object whose attributes hold their values. We raise an exception unless either source_map or both source and destination are provided, because these define which files to monitor and where to copy them when they change.
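Here is a minimal sketch of that check in isolation. The helper names build_parser and has_required_args are my own for illustration and are not in the script; passing an explicit list to parse_args makes the rule easy to try without a real command line.

```python
import argparse


def build_parser():
    # Same three options as the script; build_parser is a hypothetical helper.
    parser = argparse.ArgumentParser(
        description='Listen for file changes and mirror changed files.')
    parser.add_argument('-s', '--source', default=None)
    parser.add_argument('-d', '--destination', default=None)
    parser.add_argument('-m', '--source_map', default=None)
    return parser


def has_required_args(args):
    # Valid when a source_map is given, or both source and destination are.
    return bool(args.source_map or (args.source and args.destination))


parser = build_parser()
ok = parser.parse_args(['-s', 'a.css', '-d', 'b.css'])
missing = parser.parse_args(['-s', 'a.css'])  # no destination, no source_map
```

With these inputs, has_required_args(ok) is true and has_required_args(missing) is false, which is the case where the script raises its ValueError.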

The provided files are appended to a list of tuples, where the first value is the source and the second value is the destination. The file specified by source_map allows for many files to be monitored, and should be a CSV with source_path,destination_path on each line.
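To make the source_map format concrete, here is a small sketch that parses a two-line map into the same list of tuples. It uses the standard csv module rather than the script's split(","), the function name read_source_map is hypothetical, and the paths are made up for the example.

```python
import csv
import io


def read_source_map(lines):
    # lines: any iterable of CSV text lines, such as an open file object.
    pairs = []
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or incomplete lines
        pairs.append((row[0].strip(), row[1].strip()))
    return pairs


# Made-up paths; an in-memory file stands in for the real source_map.
sample = io.StringIO(u"css/site.css,/mnt/shared/css/site.css\n"
                     u"\n"
                     u"js/app.js,/mnt/shared/js/app.js\n")
file_list = read_source_map(sample)
```

Each entry in file_list is a (source, destination) tuple, matching what the script builds.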

The infinite while loop iterates over the files, using os.stat to see if the modified time has changed. The last modified time for each source file is stored in the last_checked_map dictionary, with the path to the source file as its key. If os.stat fails for a source file (for example, because the file does not exist), we print the error and continue with the next file. Because last_checked_map starts out empty, every source file is copied to its destination when the script starts.
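The change check can be sketched as a standalone function. The name needs_copy is hypothetical, and unlike the script it records the new time as soon as it reports a change rather than after the copy. A temporary file and os.utime stand in for a real edit.

```python
import os
import tempfile


def needs_copy(source_file, last_checked_map):
    # True the first time a file is seen, or when its mtime has advanced.
    mtime = os.stat(source_file).st_mtime
    last_time = last_checked_map.get(source_file)
    if last_time is None or mtime > last_time:
        last_checked_map[source_file] = mtime
        return True
    return False


# Demonstration with a temporary file.
fd, path = tempfile.mkstemp()
os.close(fd)
seen = {}
first = needs_copy(path, seen)    # first check of a new file
second = needs_copy(path, seen)   # nothing changed in between
os.utime(path, (seen[path] + 10, seen[path] + 10))  # simulate a later edit
third = needs_copy(path, seen)    # mtime moved forward
os.remove(path)
```

Here first and third come out true and second comes out false, which mirrors the copy-on-start behavior described above.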

To copy the source file, we read its data completely and then write it to the destination file, using basic Python file functions. After each pass over the file list, the script sleeps for a second to free up the processor.
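As an alternative to reading the whole file into memory, the standard library's shutil.copyfile does the same job in chunks. This is a sketch of a swap-in, not what the script does; the mirror function name is my own.

```python
import os
import shutil
import tempfile


def mirror(source_file, destination_file):
    # shutil.copyfile copies the file contents byte for byte.
    shutil.copyfile(source_file, destination_file)


# Demonstration in a throwaway directory.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, 'source.bin')
dst = os.path.join(tmp_dir, 'destination.bin')
payload = b'line one\r\nline two\n\x00\xff'
with open(src, 'wb') as f:
    f.write(payload)
mirror(src, dst)
with open(dst, 'rb') as f:
    copied = f.read()
shutil.rmtree(tmp_dir)
```

The payload deliberately mixes line endings and non-text bytes, since those are the cases where a text-mode write could alter the data.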

For now, I just use Ctrl+C to stop the script.
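If the traceback that Ctrl+C produces is unwanted, one option is to catch KeyboardInterrupt around the loop. The sketch below assumes a hypothetical check_files callable that performs one pass over the file list; the fake version raises the exception itself to stand in for a key press.

```python
import time


def run_until_interrupted(check_files, interval=1.0):
    # check_files is a hypothetical callable doing one pass over the files.
    passes = 0
    try:
        while True:
            check_files()
            passes += 1
            time.sleep(interval)
    except KeyboardInterrupt:
        # Ctrl+C raises KeyboardInterrupt in the main thread; exit quietly.
        pass
    return passes


calls = []


def fake_check():
    calls.append(1)
    if len(calls) == 3:
        raise KeyboardInterrupt  # stands in for the user pressing Ctrl+C


completed = run_until_interrupted(fake_check, interval=0)
```

The function returns the number of passes that finished before the interrupt, so the caller can log it or simply exit.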

There’s more…

This is a basic script, and probably a good homework assignment for a beginning Python class, but it is also really useful. There is a lot of room for improvement, such as monitoring all files in an entire directory tree, listening for standard input to stop the script, or supporting the running of a command against the source file before populating the destination file. The source code is on GitHub, so please clone, use, and improve.
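As one possible starting point for the directory-tree idea, os.walk can build the same list of (source, destination) tuples for every file under a root. The function name build_file_list and the 'mirror' destination root are assumptions for the example, and the destination directories would still need to be created before copying.

```python
import os
import shutil
import tempfile


def build_file_list(source_root, destination_root):
    # Pair each file under source_root with the same relative path
    # under destination_root.
    pairs = []
    for dirpath, dirnames, filenames in os.walk(source_root):
        for name in filenames:
            source_file = os.path.join(dirpath, name)
            relative = os.path.relpath(source_file, source_root)
            pairs.append(
                (source_file, os.path.join(destination_root, relative)))
    return sorted(pairs)


# Demonstration with a small temporary tree.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'css'))
for rel in ('index.html', os.path.join('css', 'site.css')):
    open(os.path.join(root, rel), 'w').close()
file_list = build_file_list(root, 'mirror')
shutil.rmtree(root)
```

The resulting file_list can be fed to the same polling loop, although files added after startup would only be picked up if the list is rebuilt periodically.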