Importing files from a hotfolder directory

The Catmandu data processing toolkit facilitates many import, export, and conversion tasks by supporting common APIs (e.g. SRU, OAI-PMH) and databases (e.g. MongoDB, CouchDB, SQL…). But sometimes the best API and database is the file system. In this brief article I’ll show how to use a “hotfolder” to automatically import files into a Catmandu store.

A hotfolder is a directory in which files can be placed so that they get processed automatically. To facilitate the creation of such directories I created the CPAN module File::Hotfolder. Let’s first define a sample importer and store in the catmandu.yml configuration file:

---
importer:
  json:
    package: JSON
    options:
      multiline: 1
store:
  couchdb:
    package: CouchDB
    options:
      default_bag: import
...

We can now manually import JSON files into the import database of a local CouchDB like this:

catmandu import json to couchdb < filename.json

Calling this command manually for each file can be slow and requires access to the command line. How about defining a hotfolder to automatically import all JSON files into CouchDB? Here is an implementation:

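use Catmandu -all;
use File::Hotfolder;

my $hotfolder = "import";       # directory to watch
my $importer  = "json";         # importer name from catmandu.yml
my $store     = "couchdb";      # store name from catmandu.yml
my $suffix    = qr{\.json$};    # only files whose name ends in .json

# use a separate variable instead of redeclaring $store
my $target = store($store);

watch( $hotfolder,
    filter   => $suffix,
    scan     => 1,              # process files that already exist first
    delete   => 1,
    print    => WATCH_DIR | FOUND_FILE | CATCH_ERROR,
    callback => sub {
        # the callback receives the path of the file that was found
        $target->add_many( importer($importer, file => shift) );
    },
    catch    => 1,
)->loop;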

The directory import is first scanned for existing files with the extension .json and then watched for new or modified files. As soon as a file is found, it is imported. The catch option, together with the CATCH_ERROR print flag, ensures that a failed import (for instance because of invalid JSON) is reported instead of killing the program.
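To make the roles of the filter and of error catching more concrete, here is a minimal sketch in plain Perl that simulates one pass over a folder. It does not use File::Hotfolder or Catmandu and says nothing about how they are implemented. The file names and contents are hypothetical, and JSON::PP stands in for the importer only to have a step that can fail. The pattern is anchored with $ in this sketch so that a name such as old.json.bak is not picked up.

```perl
use strict;
use warnings;
use JSON::PP;    # core module, used here only as a parser that can die

# Hypothetical file names and contents standing in for files in the hotfolder
my %files = (
    'good.json'    => '{"title":"ok"}',
    'broken.json'  => '{"title": ',
    'notes.txt'    => 'not JSON at all',
    'old.json.bak' => '{"title":"backup"}',
);

my $suffix = qr{\.json$};    # anchored, so old.json.bak is skipped

my (@imported, @failed, @skipped);
for my $name (sort keys %files) {
    # filter: ignore names that do not match the suffix pattern
    if ($name !~ $suffix) {
        push @skipped, $name;
        next;
    }
    # catch: a dying "import" is trapped with eval instead of ending the loop
    my $ok = eval { decode_json($files{$name}); 1 };
    if ($ok) { push @imported, $name }
    else     { push @failed,   $name }
}

print "imported: @imported\n";
print "failed:   @failed\n";
print "skipped:  @skipped\n";
```

The broken file ends up in the failed list while the loop carries on with the remaining files, which is the behaviour one wants from a long-running hotfolder.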

The current version of File::Hotfolder only works on Unix, but it may be extended to other operating systems.
