WOODS

_images/banner.png

WOODS is a project aimed at investigating Out-of-Distribution generalization in sequential data and its possible solutions. To that end, we offer a DomainBed-like suite to test domain generalization algorithms on our WILDS-like set of sequential data benchmarks, inspired by real-world problems across a wide array of modalities common in modern machine learning.

Quick Installation

WOODS is still under active development, so it is currently only available by cloning the repository to your local machine.

Installing requirements

With Conda

First, have conda installed on your machine (see their installation page if that is not the case). Then create a conda environment with the following command:

conda create --name woods python=3.7

Then activate the environment with the following command:

conda activate woods

With venv

You can use the Python virtual environment manager virtualenv to create a virtual environment for the project. IMPORTANT: Make sure you are using Python 3.7 or later.

virtualenv /path/to/woods/env

Then activate the virtual environment with the following command:

source /path/to/woods/env/bin/activate

Clone locally

Once you’ve created the virtual environment, clone the repository.

git clone https://github.com/jc-audet/WOODS.git
cd WOODS

Then install the requirements with the following command:

pip install -r requirements.txt

Run tests

Run the tests to make sure everything is in order. More tests are coming soon.

pytest

Downloading the data

Before running any training run, we need to make sure we have the data to train on.

Direct Preprocessed Download

The repository offers direct download of the preprocessed data, which is the quickest and most efficient way to get started. To download the preprocessed data, run the download module of the woods.scripts package and specify the dataset you want to download:

python3 -m woods.scripts.download_datasets DATASET\
        --data_path /path/to/data/directory

Source Download and Preprocess

For the sake of transparency, WOODS also provides the preprocessing scripts we used for all datasets in the fetch_and_preprocess module of the woods.scripts package. You can use the same module to download the raw data from the original sources and run the preprocessing yourself. DISCLAIMER: Some of the datasets take a long time to preprocess, especially the EEG datasets.

python3 -m woods.scripts.fetch_and_preprocess DATASET\
        --data_path /path/to/data/directory

Datasets Info

The following table lists the available datasets and their corresponding raw and preprocessed sizes.

Dataset          | Modality  | Requires Download           | Preprocessed Size | Raw Size
-----------------|-----------|-----------------------------|-------------------|---------
Basic_Fourier    | 1D Signal | No                          | -                 | -
Spurious_Fourier | 1D Signal | No                          | -                 | -
TMNIST           | Video     | Yes, but done automatically | 0.11 GB           | -
TCMNIST_seq      | Video     | Yes, but done automatically | 0.11 GB           | -
TCMNIST_step     | Video     | Yes, but done automatically | 0.11 GB           | -
CAP              | EEG       | Yes                         | 8.7 GB            | 40.1 GB
SEDFx            | EEG       | Yes                         | 10.7 GB           | 8.1 GB
PCL              | EEG       | Yes                         | 3.0 GB            | 13.5 GB
LSA64            | Video     | Yes                         | 0.26 GB           | 1.5 GB
HHAR             | Sensor    | Yes                         | 0.16 GB           | 3.1 GB

Running a Sweep

In WOODS, we evaluate the performance of a domain generalization algorithm by running a sweep over the hyperparameter definition space and then performing model selection on the training runs conducted during the sweep.

Running the sweep

Once we have the data, we can start running the sweep. The hparams_sweep module of the woods.scripts package provides the command line interface to create the list of jobs to run, which is then passed to the command launcher that launches all the jobs. The list includes all the training runs necessary to get results for every trial seed and hyperparameter seed for a given algorithm, dataset and test domain.

All datasets have a SWEEP_ENVS attribute that defines which test environments are included in the sweep. For example, the SWEEP_ENVS attribute of the Spurious Fourier dataset contains only one test domain, while for most real datasets SWEEP_ENVS consists of all domains.

In other words, for every combination of (algorithm, dataset, test environment) we train 20 different hyperparameter configurations, each investigated with 3 different trial seeds. This means that for every combination of (algorithm, dataset, test environment) we run 20 * 3 = 60 training runs.
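For illustration, the enumeration of a job list roughly amounts to the following sketch; the names and counts here are only illustrative, and the actual logic lives in the hparams_sweep module:

# Illustrative sketch of how a sweep's job list is enumerated.
import itertools

objectives = ['ERM', 'IRM']                    # algorithms in the sweep
dataset_test_envs = [('Spurious_Fourier', 0)]  # (dataset, test environment) pairs
n_hparams, n_trials = 20, 3                    # hyperparameter seeds and trial seeds

jobs = list(itertools.product(objectives, dataset_test_envs,
                              range(n_hparams), range(n_trials)))
print(len(jobs))  # 2 * 1 * 20 * 3 = 120 training runs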

python3 -m woods.scripts.hparams_sweep \
        --dataset Spurious_Fourier TCMNIST_seq \
        --objective ERM IRM \
        --save_path ./results \
        --launcher local

Here we are using the local launcher, which is the simplest launcher: it runs the jobs serially on the local machine. We also offer other launchers in the command_launchers module, such as slurm_launcher, a parallel job launcher for the SLURM workload manager.
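For intuition, here is a rough sketch of what the dummy and local launchers do. This is a simplified illustration, not the actual code of the command_launchers module, and it assumes each command is a single shell string:

# Simplified sketch of the launcher interface (see woods.command_launchers for the real code).
import subprocess

def dummy_launcher(commands):
    # Only prints the commands; useful to inspect a sweep before launching it.
    for cmd in commands:
        print('Dummy launcher:', cmd)

def local_launcher(commands):
    # Runs the commands one after the other on the local machine.
    for cmd in commands:
        subprocess.run(cmd, shell=True, check=True)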

Compiling the results

Once the sweep is finished, we can compile the results. The compile_results module of the woods.scripts package provides the command line interface to do so. The --latex option generates a LaTeX table.

python3 -m woods.scripts.compile_results \
        --results_dir path/to/results \
        --latex

It is also possible to compile results from multiple directories containing complementary sweep results. This puts all of those results in the same table.

python3 -m woods.scripts.compile_results \
        --results_dir path/to/results/1 path/to/results/2 path/to/results/3 \
        --latex

There are other modes of operation for the compile_results module, such as --mode IID, which takes results from a sweep with no test environment and reports the results for each test environment separately.

python3 -m woods.scripts.compile_results \
        --results_dir path/to/results/1 path/to/results/2 path/to/results/3 \
        --mode IID

There is also --mode summary, which reports the average results of every objective on every dataset in the sweep.

python3 -m woods.scripts.compile_results \
        --results_dir path/to/results/1 path/to/results/2 path/to/results/3 \
        --mode summary

You can also use --mode hparams, which reports the hyperparameters of the model chosen by model selection.

python3 -m woods.scripts.compile_results \
        --results_dir path/to/results/1 path/to/results/2 path/to/results/3 \
        --mode hparams

Advanced usage

If 60 jobs is too many for your available compute, or too few for your experiments, you can change the number of hyperparameter configurations and trial seeds investigated with the --n_hparams and --n_trials arguments. For example, --n_hparams 10 with --n_trials 1 results in 10 * 1 = 10 training runs per (algorithm, dataset, test environment) combination.

python3 -m woods.scripts.hparams_sweep \
        --dataset Spurious_Fourier TCMNIST_seq \
        --objective ERM IRM \
        --save_path ./results \
        --launcher local \
        --n_hparams 10 \
        --n_trials 1

If some test environments of a dataset are not of interest to you, you can specify which test environment to investigate using the --unique_test_env argument.

python3 -m woods.scripts.hparams_sweep \
        --dataset Spurious_Fourier TCMNIST_seq \
        --objective ERM IRM \
        --save_path ./results \
        --launcher local \
        --unique_test_env 0

You can run a sweep with no test environment by specifying the --unique_test_env argument as None.

python3 -m woods.scripts.hparams_sweep \
        --dataset Spurious_Fourier TCMNIST_seq \
        --objective ERM IRM \
        --save_path ./results \
        --launcher local \
        --unique_test_env None

Adding an Algorithm

In this section, we will walk through the process of adding an algorithm to the framework.

Defining the Algorithm

We first define the algorithm by creating a new class in the objectives module. In this example we will add scaled_ERM, which is simply ERM with a random scale factor between 0 and max_scale for each environment in the dataset, where max_scale is a hyperparameter of the objective.

Let's first define the class and its __init__ method to initialize the algorithm.

class scaled_ERM(ERM):
    """
    Scaled Empirical Risk Minimization (scaled ERM)
    """

    def __init__(self, model, dataset, loss_fn, optimizer, hparams):
        super(scaled_ERM, self).__init__(model, dataset, loss_fn, optimizer, hparams)

        self.model = model
        self.loss_fn = loss_fn
        self.optimizer = optimizer

        self.max_scale = hparams['max_scale']
        self.scaling_factor = self.max_scale * torch.rand(len(dataset.train_names)) 

We then need to define the update function, which takes a minibatch of data, computes the loss, and updates the model according to the algorithm definition. Note that we do not need to define the predict function here, as it is already defined in the base class.

    def update(self, minibatches_device, dataset, device):

        ## Group all inputs and send to device
        all_x = torch.cat([x for x,y in minibatches_device]).to(device)
        all_y = torch.cat([y for x,y in minibatches_device]).to(device)
        
        ts = torch.tensor(dataset.PRED_TIME).to(device)
        out = self.predict(all_x, ts, device)

        ## Reshape the data so the first dimension is environments
        out_split, labels_split = dataset.split_data(out, all_y)

        env_losses = torch.zeros(out_split.shape[0]).to(device)
        for i in range(out_split.shape[0]):
            for t_idx in range(out_split.shape[2]):     # Number of time steps
                env_losses[i] += self.scaling_factor[i] * self.loss_fn(out_split[i, :, t_idx, :], labels_split[i,:,t_idx])

        objective = env_losses.mean()

        # Back propagate
        self.optimizer.zero_grad()
        objective.backward()
        self.optimizer.step()
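To see where update fits in, here is a rough sketch of how a training loop could drive the objective. This is not the actual loop of the train module, which additionally handles evaluation, logging and checkpointing; it only assumes a constructed dataset and objective:

# Rough sketch of a training loop driving an objective (not the exact woods code).
def train_loop(objective, dataset, device):
    # One iterator per training environment; InfiniteLoader can be drawn from indefinitely.
    train_iters = [iter(loader) for loader in dataset.train_loaders]
    for step in range(dataset.N_STEPS):
        # One (x, y) minibatch per training environment
        minibatches = [next(it) for it in train_iters]
        objective.update(minibatches, dataset, device)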

Adding necessary pieces

Now that our algorithm is defined, we can add it to the list of algorithms at the top of the objectives module.

OBJECTIVES = [
    'ERM',
    'IRM',
    'VREx',
    'SD',
    'ANDMask',
    'IGA',
    'scaled_ERM',
]

Before being able to use the algorithm, we need to add the hyperparameters related to it in the hyperparams module. Note: the name of the function needs to be the name of the algorithm followed by _hyper.

def scaled_ERM_hyper(sample):
    """ scaled ERM objective hparam definition 
    
    Args:
        sample (bool): If ``True``, hyperparameters are sampled randomly according to their given distributions. Defaults to ``False``, in which case the default values are chosen.
    """
    if sample:
        return {
            'max_scale': lambda r: r.uniform(1.,10.)
        }
    else:
        return {
            'max_scale': lambda r: 2.
        }
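The returned dictionary maps each hyperparameter name to a function of a random state, which lets the sweep draw values reproducibly. As a rough illustration of how such a definition could be evaluated (the helper and seeding scheme below are hypothetical, not the ones used by the hyperparams module):

# Hypothetical helper: draw values from an hparam definition with a seeded random state.
import numpy as np

def draw_hparams(hparam_def, seed):
    r = np.random.RandomState(seed)
    return {name: fn(r) for name, fn in hparam_def(sample=True).items()}

print(draw_hparams(scaled_ERM_hyper, seed=0))              # max_scale drawn uniformly from [1, 10]
print(scaled_ERM_hyper(sample=False)['max_scale'](None))   # default value: 2.0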

Run some tests

We can now run a simple test to check that everything is working as expected

pytest

Try the algorithm

Then we can run a training run to see how the algorithm performs on any dataset

python3 -m woods.scripts.main train \
        --dataset Spurious_Fourier \
        --objective scaled_ERM \
        --test_env 0 \
        --data_path ./data

Run a sweep

Finally, we can run a sweep to see how the algorithm performs on all the datasets

python3 -m woods.scripts.hparams_sweep \
        --objective scaled_ERM \
        --dataset Spurious_Fourier \
        --data_path ./data \
        --launcher dummy

Adding a Dataset

In this section, we will walk through the process of adding a dataset to the framework.

Defining the Dataset

We first define the dataset by creating a new class in the datasets module. In this example we will add flat_MNIST, which is the MNIST dataset where each image is fed to a sequential model pixel by pixel and the environments are different orderings of the pixels.

First, let's define the dataset class and its __init__ method.

class flat_MNIST(Multi_Domain_Dataset):
    """ Class for flat MNIST dataset

    Each sample is a sequence of 784 pixels.
    The task is to predict the digit

    Args:
        flags (argparse.Namespace): argparse of training arguments

    Note:
        The MNIST dataset needs to be downloaded; this is automatically done if the dataset isn't in the given data_path
    """
    ## Dataset parameters
    SETUP = 'seq'
    TASK = 'classification'
    SEQ_LEN = 28*28
    PRED_TIME = [783]
    INPUT_SHAPE = [1]
    OUTPUT_SIZE = 10

    ## Environment parameters
    ENVS = ['forwards', 'backwards', 'scrambled']
    SWEEP_ENVS = list(range(len(ENVS)))

    def __init__(self, flags, training_hparams):
        super().__init__()

        if flags.test_env is not None:
            assert flags.test_env < len(self.ENVS), "Test environment chosen is not valid"
        else:
            warnings.warn("You don't have any test environment")

        # Save stuff
        self.test_env = flags.test_env
        self.class_balance = training_hparams['class_balance']
        self.batch_size = training_hparams['batch_size']

        ## Import original MNIST data
        MNIST_tfrm = transforms.Compose([ transforms.ToTensor() ])

        # Get MNIST data
        train_ds = datasets.MNIST(flags.data_path, train=True, download=True, transform=MNIST_tfrm) 
        test_ds = datasets.MNIST(flags.data_path, train=False, download=True, transform=MNIST_tfrm) 

        # Concatenate all data and labels
        MNIST_images = torch.cat((train_ds.data.float(), test_ds.data.float()))
        MNIST_labels = torch.cat((train_ds.targets, test_ds.targets))

        # Create sequences of 784 pixels
        self.TCMNIST_images = MNIST_images.reshape(-1, 28*28, 1)
        self.MNIST_labels = MNIST_labels.long().unsqueeze(1)

        # Make the environment datasets
        self.train_names, self.train_loaders = [], [] 
        self.val_names, self.val_loaders = [], [] 
        for i, e in enumerate(self.ENVS):

            # Choose data subset
            images = self.TCMNIST_images[i::len(self.ENVS),...]
            labels = self.MNIST_labels[i::len(self.ENVS),...]

            # Apply environment definition
            if e == 'forwards':
                images = images
            elif e == 'backwards':
                images = torch.flip(images, dims=[1])
            elif e == 'scrambled':
                images = images[:, torch.randperm(28*28), :]

            # Make Tensor dataset and the split
            dataset = torch.utils.data.TensorDataset(images, labels)
            in_dataset, out_dataset = make_split(dataset, flags.holdout_fraction)

            if i != self.test_env:
                in_loader = InfiniteLoader(in_dataset, batch_size=training_hparams['batch_size'])
                self.train_names.append(str(e) + '_in')
                self.train_loaders.append(in_loader)
            
            fast_in_loader = torch.utils.data.DataLoader(in_dataset, batch_size=64, shuffle=False, num_workers=self.N_WORKERS, pin_memory=True)
            self.val_names.append(str(e) + '_in')
            self.val_loaders.append(fast_in_loader)
            fast_out_loader = torch.utils.data.DataLoader(out_dataset, batch_size=64, shuffle=False, num_workers=self.N_WORKERS, pin_memory=True)
            self.val_names.append(str(e) + '_out')
            self.val_loaders.append(fast_out_loader)

        # Define loss function
        self.log_prob = nn.LogSoftmax(dim=1)
        self.loss = nn.NLLLoss(weight=self.get_class_weight().to(training_hparams['device']))

Note: you are required to define the following class attributes:

* SETUP
* SEQ_LEN
* PRED_TIME
* INPUT_SHAPE
* OUTPUT_SIZE
* ENVS
* SWEEP_ENVS

You are also encouraged to redefine the following attributes:

* N_STEPS
* N_WORKERS
* CHECKPOINT_FREQ
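As a quick sanity check, you can instantiate the new dataset directly. Below is a minimal sketch, assuming an argparse-style flags object with the fields used in __init__ above and a hand-written dictionary of training hyperparameters (the concrete values are only examples):

# Minimal sketch: instantiate flat_MNIST outside of the training script.
from argparse import Namespace

flags = Namespace(test_env=0, data_path='./data', holdout_fraction=0.2)
training_hparams = {'class_balance': True, 'batch_size': 64, 'device': 'cpu'}

dataset = flat_MNIST(flags, training_hparams)
print(dataset.train_names)  # ['backwards_in', 'scrambled_in'] when test_env is 0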

Adding necessary pieces

Now that our dataset is defined, we can add it to the list of datasets at the top of the datasets module.

DATASETS = [
    # 1D datasets
    'Basic_Fourier',
    'Spurious_Fourier',
    # Small images
    "TMNIST",
    # Small correlation shift dataset
    "TCMNIST_seq",
    "TCMNIST_step",
    ## EEG Dataset
    "CAP_DB",
    "SEDFx_DB",
    ## Financial Dataset
    "StockVolatility",
    ## Sign Recognition
    "LSA64",
    ## Activity Recognition
    "HAR",
    ## Example
    "flat_MNIST",
]

Before being able to use the dataset, we need to add the hyperparameters related to it in the hyperparams module. Note: the names of the functions need to be the name of the dataset followed by _train and _model, respectively.

def flat_MNIST_train(sample):
    """ flat_MNIST model hparam definition 
    
    Args:
        sample (bool): If ''True'', hyper parameters are gonna be sampled randomly according to their given distributions. Defaults to ''False'' where the default value is chosen.
    """
    if sample:
        return {
            'class_balance': lambda r: True,
            'weight_decay': lambda r: 0.,
            'lr': lambda r: 10**r.uniform(-4.5, -2.5),
            'batch_size': lambda r: int(2**r.uniform(3, 9))
        }
    else:
        return {
            'class_balance': lambda r: True,
            'weight_decay': lambda r: 0,
            'lr': lambda r: 1e-3,
            'batch_size': lambda r: 64
        }

def flat_MNIST_model():
    """ flat_MNIST model hparam definition 
    
    Args:
        sample (bool): If ''True'', hyper parameters are gonna be sampled randomly according to their given distributions. Defaults to ''False'' where the default value is chosen.
    """
    return {
        'model': lambda r: 'LSTM',
        'hidden_depth': lambda r: 1, 
        'hidden_width': lambda r: 20,
        'recurrent_layers': lambda r: 2,
        'state_size': lambda r: 32
    }

Run some tests

We can now run a simple test to check that everything is working as expected

pytest

Try the dataset

Then we can run a training run to see how an algorithm performs on your dataset

python3 -m woods.scripts.main train \
        --dataset flat_MNIST \
        --objective ERM \
        --test_env 0 \
        --data_path ./data

Run a sweep

Finally, we can run a sweep to see how algorithms perform on your dataset

python3 -m woods.scripts.hparams_sweep \
        --objective ERM \
        --dataset flat_MNIST \
        --data_path ./data \
        --launcher dummy

Contributing

WOODS is still under development and is open to contributions. Just fork the repository and start coding! When you think you have something to contribute, open an issue or a pull request.

If you have a published algorithm that you want added as a benchmark, please open a pull request and we will be happy to add it to the list of available algorithms.

If you have a sequential dataset that you think poses a generalization problem, please open a pull request and we will be happy to add it to the list of available datasets.

API Documentation

woods

woods.command_launchers module

Set of functions used to launch lists of python scripts

Summary

Functions:

dummy_launcher

Doesn't launch any scripts in commands, it only prints the commands.

local_launcher

Launch all of the scripts in commands on the local machine serially.

slurm_launcher

Parallel job launcher for computational clusters using the SLURM workload manager.

Reference
woods.command_launchers.dummy_launcher(commands)

Doesn’t launch any scripts in commands, it only prints the commands. Useful for testing.

Taken from : https://github.com/facebookresearch/DomainBed/

Parameters

commands (List) – List of commands, each consisting of a python script call

woods.command_launchers.local_launcher(commands)

Launch all of the scripts in commands on the local machine serially. If a GPU is available, it will be used.

Taken from : https://github.com/facebookresearch/DomainBed/

Parameters

commands (List) – List of commands, each consisting of a python script call

woods.command_launchers.slurm_launcher(commands)

Parallel job launcher for computational clusters using the SLURM workload manager.

Launches all the jobs in commands in parallel according to the number of tasks in the SLURM allocation. An example of SBATCH options:

#!/bin/bash
#SBATCH --job-name=<job_name>
#SBATCH --output=<job_name>.out
#SBATCH --error=<job_name>_error.out
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=8
#SBATCH --gres=gpu:4
#SBATCH --time=1-00:00:00
#SBATCH --mem=81Gb

Note

--cpus-per-task should match the N_WORKERS defined in datasets.py (default 4)

Note

there should be an equal number of --ntasks and --gres

Parameters

commands (List) – List of commands, each consisting of a python script call

woods.datasets module

woods.hyperparams module

woods.model_selection module

woods.models module

woods.objectives module

woods.train module

woods.utils module


woods.scripts

woods.scripts.compile_results module
woods.scripts.download module
woods.scripts.fetch_and_preprocess module
woods.scripts.hparams_sweep module
woods.scripts.main module
woods.scripts.visualize_results module
