WOODS

WOODS is a project aimed at investigating the implications of out-of-distribution generalization problems in sequential data, along with their possible solutions. To that end, we offer a DomainBed-like suite to test domain generalization algorithms on our WILDS-like set of sequential data benchmarks, inspired by real-world problems across a wide array of modalities common in modern machine learning.
Quick Installation
WOODS is still under active development, so for now it is only available by cloning the repository to your local machine.
Installing requirements
With Conda
First, have conda installed on your machine (see their installation page if that is not the case). Then create a conda environment with the following command:
conda create --name woods python=3.7
Then activate the environment with the following command:
conda activate woods
With venv
You can use the Python virtual environment manager virtualenv to create a virtual environment for the project. IMPORTANT: Make sure you are using Python 3.7 or later.
virtualenv /path/to/woods/env
Then activate the virtual environment with the following command:
source /path/to/woods/env/bin/activate
Clone locally
Once you’ve created the virtual environment, clone the repository.
git clone https://github.com/jc-audet/WOODS.git
cd WOODS
Then install the requirements with the following command:
pip install -r requirements.txt
Run tests
Run the tests to make sure everything is in order. More tests are coming soon.
pytest
Downloading the data
Before running any training run, we need to make sure we have the data to train on.
Direct Preprocessed Download
The repository offers direct download of the preprocessed data, which is the quickest and most efficient way to get started. To download the preprocessed data, run the download module of the woods.scripts package and specify the dataset you want to download:
python3 -m woods.scripts.download_datasets DATASET \
    --data_path /path/to/data/directory
Source Download and Preprocess
For the sake of transparency, WOODS also offers the preprocessing scripts used for all datasets in the fetch_and_preprocess module of the woods.scripts package. You can use the same module to download the raw data from the original source and run the preprocessing yourself. DISCLAIMER: Some of the datasets take a long time to preprocess, especially the EEG datasets.
python3 -m woods.scripts.fetch_and_preprocess DATASET \
    --data_path /path/to/data/directory
Datasets Info
The following table lists the available datasets and their corresponding raw and preprocessed sizes.
| Datasets | Modality | Requires Download | Preprocessed Size | Raw Size |
|---|---|---|---|---|
| Basic_Fourier | 1D Signal | No | - | - |
| Spurious_Fourier | 1D Signal | No | - | - |
| TMNIST | Video | Yes, but done automatically | 0.11 GB | - |
| TCMNIST_seq | Video | Yes, but done automatically | 0.11 GB | - |
| TCMNIST_step | Video | Yes, but done automatically | 0.11 GB | - |
| CAP | EEG | Yes | 8.7 GB | 40.1 GB |
| SEDFx | EEG | Yes | 10.7 GB | 8.1 GB |
| PCL | EEG | Yes | 3.0 GB | 13.5 GB |
| LSA64 | Video | Yes | 0.26 GB | 1.5 GB |
| HHAR | Sensor | Yes | 0.16 GB | 3.1 GB |
Running a Sweep
In WOODS, we evaluate the performance of a domain generalization algorithm by running a sweep over the hyperparameter definition space and then performing model selection on the training runs conducted during the sweep.
Running the sweep
Once we have the data, we can start running the sweep. The hparams_sweep module of the woods.scripts package provides the command line interface to create the list of jobs to run, which is then passed to the command launcher to launch all jobs. The list of jobs includes all of the training runs necessary to get results from all trial seeds and hyperparameter seeds for a given algorithm, dataset, and test domain.
All datasets have a SWEEP_ENVS attribute that defines which test environments are included in the sweep. For example, the SWEEP_ENVS attribute of the Spurious_Fourier dataset contains only one test domain, while for most real datasets SWEEP_ENVS consists of all domains.
In other words, for every combination of (algorithm, dataset, test environment) we train 20 different hyperparameter configurations, each investigated with 3 different trial seeds. This means that for every combination of (algorithm, dataset, test environment) we run 20 * 3 = 60 training runs.
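To estimate the size of a sweep before launching it, you can count the planned runs from the datasets' SWEEP_ENVS attributes. The snippet below is a rough sketch, assuming the dataset classes are exposed by name in the woods.datasets module and that the defaults of 20 hyperparameter seeds and 3 trial seeds apply:

# Rough sketch (not part of WOODS): estimate how many training runs a sweep launches.
# Assumes woods.datasets exposes the dataset classes by name and that the defaults
# of 20 hyperparameter seeds and 3 trial seeds quoted above are used.
import woods.datasets as datasets

objectives = ['ERM', 'IRM']
dataset_names = ['Spurious_Fourier', 'TCMNIST_seq']
n_hparams, n_trials = 20, 3

total_runs = 0
for name in dataset_names:
    sweep_envs = getattr(datasets, name).SWEEP_ENVS  # test environments included in the sweep
    total_runs += len(objectives) * len(sweep_envs) * n_hparams * n_trials

print(f"This sweep will launch roughly {total_runs} training runs")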
python3 -m woods.scripts.hparams_sweep \
--dataset Spurious_Fourier TCMNIST_seq \
--objective ERM IRM \
--save_path ./results \
--launcher local
Here we are using the local launcher to run the jobs locally, which is the simplest launcher. We also offer other launchers in the command_launchers module, such as slurm_launcher, a parallel job launcher for the SLURM workload manager.
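A launcher is simply a function that receives the list of commands built by hparams_sweep (see the woods.command_launchers reference below). As a hedged illustration, a custom launcher could look like the sketch below; the subprocess-based body is an example, not the launcher shipped with WOODS, and the exact encoding of each command (string vs. argument list) should be checked against woods.command_launchers before relying on it.

# Illustrative custom launcher (a sketch, not part of WOODS).
# Per the command_launchers reference, a launcher receives `commands`,
# where each entry describes one python script call.
import subprocess

def verbose_local_launcher(commands):
    """Run every command serially, echoing it before execution."""
    for cmd in commands:
        print(f"Launching: {cmd}")
        if isinstance(cmd, str):
            subprocess.run(cmd, shell=True, check=True)
        else:
            subprocess.run(list(cmd), check=True)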
Compiling the results
Once the sweep is finished, we can compile the results. The compile_results module of the woods.scripts package provides the command line interface to compile the results. The --latex option is used to generate a LaTeX table.
python3 -m woods.scripts.compile_results \
--results_dir path/to/results \
--latex
It is also possible to compile results from multiple directories containing complementary sweep results. This puts all of those results in the same table.
python3 -m woods.scripts.compile_results \
--results_dir path/to/results/1 path/to/results/2 path/to/results/3 \
--latex
There are other modes of operation for the compile_results module, such as --mode IID, which takes results from a sweep with no test environment and reports the results for each test environment separately.
python3 -m woods.scripts.compile_results \
--results_dir path/to/results/1 path/to/results/2 path/to/results/3 \
--mode IID
There is also --mode summary, which reports the average results of all objectives in the sweep for every dataset.
python3 -m woods.scripts.compile_results \
--results_dir path/to/results/1 path/to/results/2 path/to/results/3 \
--mode summary
You can also use --mode hparams, which reports the hyperparameters of the model chosen by model selection.
python3 -m woods.scripts.compile_results \
--results_dir path/to/results/1 path/to/results/2 path/to/results/3 \
--mode hparams
Advanced usage
If 60 jobs is too many jobs for you available compute, or too few for you experiments you can change the number of seeds investigated, you can call the --n_hparams
and --n_trials
argument.
python3 -m woods.scripts.hparams_sweep \
--dataset Spurious_Fourier TCMNIST_seq \
--objective ERM IRM \
--save_path ./results \
--launcher local \
--n_hparams 10 \
--n_trials 1
If some of the test environments of a dataset are not of interest to you, you can specify which test environment to investigate using the --unique_test_env argument.
python3 -m woods.scripts.hparams_sweep \
--dataset Spurious_Fourier TCMNIST_seq \
--objective ERM IRM \
--save_path ./results \
--launcher local \
--unique_test_env 0
You can run a sweep with no test environment by setting the --unique_test_env argument to None.
python3 -m woods.scripts.hparams_sweep \
--dataset Spurious_Fourier TCMNIST_seq \
--objective ERM IRM \
--save_path ./results \
--launcher local \
--unique_test_env None
Adding an Algorithm
In this section, we will walk through the process of adding an algorithm to the framework.
Defining the Algorithm
We first define the algorithm by creating a new class in the objectives module. In this example we will add scaled_ERM, which is simply ERM with a random scale factor between 0 and max_scale for each environment in a dataset, where max_scale is a hyperparameter of the objective.
Let’s first define the class and its __init__ method to initialize the algorithm.
class scaled_ERM(ERM):
    """
    Scaled Empirical Risk Minimization (scaled ERM)
    """
    def __init__(self, model, dataset, loss_fn, optimizer, hparams):
        super(scaled_ERM, self).__init__(model, dataset, loss_fn, optimizer, hparams)

        self.model = model
        self.loss_fn = loss_fn
        self.optimizer = optimizer
        self.max_scale = hparams['max_scale']
        self.scaling_factor = self.max_scale * torch.rand(len(dataset.train_names))
We then need to define the update function, which takes a minibatch of data, computes the loss, and updates the model according to the algorithm definition. Note that we do not need to define the predict function here, as it is already defined in the base class.
def update(self, minibatches_device, dataset, device):
    ## Group all inputs and send to device
    all_x = torch.cat([x for x, y in minibatches_device]).to(device)
    all_y = torch.cat([y for x, y in minibatches_device]).to(device)
    ts = torch.tensor(dataset.PRED_TIME).to(device)

    out = self.predict(all_x, ts, device)

    ## Reshape the data so that the first dimension is the environments
    out_split, labels_split = dataset.split_data(out, all_y)

    env_losses = torch.zeros(out_split.shape[0]).to(device)
    for i in range(out_split.shape[0]):
        for t_idx in range(out_split.shape[2]):  # Number of time steps
            env_losses[i] += self.scaling_factor[i] * self.loss_fn(out_split[i, :, t_idx, :], labels_split[i, :, t_idx])

    objective = env_losses.mean()

    # Back propagate
    self.optimizer.zero_grad()
    objective.backward()
    self.optimizer.step()
Adding necessary pieces
Now that our algorithm is defined, we can add it to the list of algorithms at the top of the objectives module.
OBJECTIVES = [
    'ERM',
    'IRM',
    'VREx',
    'SD',
    'ANDMask',
    'IGA',
    'scaled_ERM',
]
Before being able to use the algorithm, we need to add the hyperparameters related to this algorithm in the hyperparams module. Note: the name of the function needs to be the name of the algorithm followed by _hyper.
def scaled_ERM_hyper(sample):
    """ scaled ERM objective hparam definition

    Args:
        sample (bool): If ``True``, hyperparameters are sampled randomly according to their given distributions. Defaults to ``False``, where the default value is chosen.
    """
    if sample:
        return {
            'max_scale': lambda r: r.uniform(1., 10.)
        }
    else:
        return {
            'max_scale': lambda r: 2.
        }
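Each entry maps a hyperparameter name to a lambda that receives a random state. As an illustration of how these definitions behave (the use of numpy's RandomState here is an assumption for the example; check woods.hyperparams for how seeds are actually consumed):

# Illustration of the lambda-based hparam definitions (a sketch, not WOODS internals).
import numpy as np
from woods.hyperparams import scaled_ERM_hyper

r = np.random.RandomState(0)
sampled = {name: fn(r) for name, fn in scaled_ERM_hyper(sample=True).items()}
defaults = {name: fn(r) for name, fn in scaled_ERM_hyper(sample=False).items()}
print(sampled)   # e.g. {'max_scale': 5.93...}
print(defaults)  # {'max_scale': 2.0}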
Run some tests
We can now run a simple test to check that everything is working as expected:
pytest
Try the algorithm
Then we can run a training run to see how the algorithm performs on any dataset:
python3 -m woods.scripts.main train \
--dataset Spurious_Fourier \
--objective scaled_ERM \
--test_env 0 \
--data_path ./data
Run a sweep
Finally, we can run a sweep to see how the algorithm performs on all the datasets:
python3 -m woods.scripts.hparams_sweep \
--objective scaled_ERM \
--dataset Spurious_Fourier \
--data_path ./data \
--launcher dummy
Adding a Dataset
In this section, we will walk through the process of adding a dataset to the framework.
Defining the Dataset
We first define the dataset by creating a new class in the datasets module. In this example we will add flat_MNIST, which is the MNIST dataset, but the image is fed to a sequential model pixel by pixel and the environments are different orderings of the pixels.
First, let’s define the dataset class and its __init__ method.
class flat_MNIST(Multi_Domain_Dataset):
    """ Class for the flat MNIST dataset

    Each sample is a sequence of 784 pixels.
    The task is to predict the digit.

    Args:
        flags (argparse.Namespace): argparse of training arguments

    Note:
        The MNIST dataset needs to be downloaded; this is done automatically if the dataset isn't in the given data_path.
    """
    ## Dataset parameters
    SETUP = 'seq'
    TASK = 'classification'
    SEQ_LEN = 28*28
    PRED_TIME = [783]
    INPUT_SHAPE = [1]
    OUTPUT_SIZE = 10

    ## Environment parameters
    ENVS = ['forwards', 'backwards', 'scrambled']
    SWEEP_ENVS = list(range(len(ENVS)))
    def __init__(self, flags, training_hparams):
        super().__init__()

        if flags.test_env is not None:
            assert flags.test_env < len(self.ENVS), "Test environment chosen is not valid"
        else:
            warnings.warn("You don't have any test environment")

        # Save stuff
        self.test_env = flags.test_env
        self.class_balance = training_hparams['class_balance']
        self.batch_size = training_hparams['batch_size']

        ## Import original MNIST data
        MNIST_tfrm = transforms.Compose([transforms.ToTensor()])

        # Get MNIST data
        train_ds = datasets.MNIST(flags.data_path, train=True, download=True, transform=MNIST_tfrm)
        test_ds = datasets.MNIST(flags.data_path, train=False, download=True, transform=MNIST_tfrm)

        # Concatenate all data and labels
        MNIST_images = torch.cat((train_ds.data.float(), test_ds.data.float()))
        MNIST_labels = torch.cat((train_ds.targets, test_ds.targets))

        # Create sequences of 784 pixels
        self.MNIST_images = MNIST_images.reshape(-1, 28*28, 1)
        self.MNIST_labels = MNIST_labels.long().unsqueeze(1)
        # Make the environment datasets
        self.train_names, self.train_loaders = [], []
        self.val_names, self.val_loaders = [], []
        for i, e in enumerate(self.ENVS):

            # Choose data subset
            images = self.MNIST_images[i::len(self.ENVS), ...]
            labels = self.MNIST_labels[i::len(self.ENVS), ...]

            # Apply environment definition
            if e == 'forwards':
                images = images
            elif e == 'backwards':
                images = torch.flip(images, dims=[1])
            elif e == 'scrambled':
                images = images[:, torch.randperm(28*28), :]

            # Make Tensor dataset and the split
            dataset = torch.utils.data.TensorDataset(images, labels)
            in_dataset, out_dataset = make_split(dataset, flags.holdout_fraction)

            if i != self.test_env:
                in_loader = InfiniteLoader(in_dataset, batch_size=training_hparams['batch_size'])
                self.train_names.append(str(e) + '_in')
                self.train_loaders.append(in_loader)

            fast_in_loader = torch.utils.data.DataLoader(in_dataset, batch_size=64, shuffle=False, num_workers=self.N_WORKERS, pin_memory=True)
            self.val_names.append(str(e) + '_in')
            self.val_loaders.append(fast_in_loader)

            fast_out_loader = torch.utils.data.DataLoader(out_dataset, batch_size=64, shuffle=False, num_workers=self.N_WORKERS, pin_memory=True)
            self.val_names.append(str(e) + '_out')
            self.val_loaders.append(fast_out_loader)

        # Define loss function
        self.log_prob = nn.LogSoftmax(dim=1)
        self.loss = nn.NLLLoss(weight=self.get_class_weight().to(training_hparams['device']))
Note: you are required to define the following variables:
* SETUP
* SEQ_LEN
* PRED_TIME
* INPUT_SHAPE
* OUTPUT_SIZE
* ENVS
* SWEEP_ENVS

You are also encouraged to redefine the following variables:
* N_STEPS
* N_WORKERS
* CHECKPOINT_FREQ
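As a quick sanity check before wiring everything up, the new class can be instantiated directly. This is a minimal sketch, assuming a flags namespace carrying only the fields read by __init__ above (test_env, data_path, holdout_fraction) and a small training_hparams dict:

# Minimal smoke test for the new dataset class (a sketch, not part of WOODS).
# The flags namespace and hparams dict carry only what flat_MNIST.__init__ reads.
from argparse import Namespace
from woods.datasets import flat_MNIST

flags = Namespace(test_env=0, data_path='./data', holdout_fraction=0.2)
training_hparams = {'class_balance': True, 'batch_size': 64, 'device': 'cpu'}

dataset = flat_MNIST(flags, training_hparams)
print(dataset.train_names)  # environments used for training
print(dataset.val_names)    # in/out validation splits for every environment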
Adding necessary pieces
Now that our dataset is defined, we can add it to the list of datasets at the top of the datasets module.
DATASETS = [
    # 1D datasets
    'Basic_Fourier',
    'Spurious_Fourier',
    # Small images
    "TMNIST",
    # Small correlation shift dataset
    "TCMNIST_seq",
    "TCMNIST_step",
    ## EEG Dataset
    "CAP_DB",
    "SEDFx_DB",
    ## Financial Dataset
    "StockVolatility",
    ## Sign Recognition
    "LSA64",
    ## Activity Recognition
    "HAR",
    ## Example
    "flat_MNIST",
]
Before being able to use the dataset, we need to add the hyperparameters related to this dataset in the hyperparams module. Note: the names of the functions need to be the name of the dataset followed by _train and _model, respectively.
def flat_MNIST_train(sample):
    """ flat_MNIST training hparam definition

    Args:
        sample (bool): If ``True``, hyperparameters are sampled randomly according to their given distributions. Defaults to ``False``, where the default value is chosen.
    """
    if sample:
        return {
            'class_balance': lambda r: True,
            'weight_decay': lambda r: 0.,
            'lr': lambda r: 10**r.uniform(-4.5, -2.5),
            'batch_size': lambda r: int(2**r.uniform(3, 9))
        }
    else:
        return {
            'class_balance': lambda r: True,
            'weight_decay': lambda r: 0,
            'lr': lambda r: 1e-3,
            'batch_size': lambda r: 64
        }
def flat_MNIST_model():
    """ flat_MNIST model hparam definition

    Returns the model hyperparameters used for this dataset; these are fixed and not sampled.
    """
    return {
        'model': lambda r: 'LSTM',
        'hidden_depth': lambda r: 1,
        'hidden_width': lambda r: 20,
        'recurrent_layers': lambda r: 2,
        'state_size': lambda r: 32
    }
Run some tests
We can now run a simple test to check that everything is working as expected:
pytest
Try the algorithm
Then we can run a training run to see how algorithms perform on your dataset:
python3 -m woods.scripts.main train \
--dataset flat_MNIST \
--objective ERM \
--test_env 0 \
--data_path ./data
Run a sweep
Finally, we can run a sweep to see how the algorithms perform on your dataset:
python3 -m woods.scripts.hparams_sweep \
--objective ERM \
--dataset flat_MNIST \
--data_path ./data \
--launcher dummy
Contributing
WOODS is still under development and is open to contributions. Just fork the repository and start coding! When you think you have something to contribute, open an issue or a pull request.
If you have a published algorithm that you want added as a benchmark, please open a pull request and we will be happy to add it to the list of available algorithms.
If you have a sequential dataset that you think poses a generalization problem, please open a pull request and we will be happy to add it to the list of available datasets.
API Documentation
woods
woods.command_launchers module
Set of functions used to launch lists of python scripts
Summary
Functions:
- dummy_launcher: Doesn't launch any scripts in commands, it only prints the commands.
- local_launcher: Launches all of the scripts in commands on the local machine serially.
- slurm_launcher: Parallel job launcher for computational clusters using the SLURM workload manager.
Reference
- woods.command_launchers.dummy_launcher(commands)
Doesn't launch any scripts in commands; it only prints the commands. Useful for testing.
Taken from: https://github.com/facebookresearch/DomainBed/
- Parameters
commands (List) – List of commands, where each entry is a list of strings that constitutes a python script call
- woods.command_launchers.local_launcher(commands)
Launches all of the scripts in commands on the local machine serially. If a GPU is available, it will be used.
Taken from: https://github.com/facebookresearch/DomainBed/
- Parameters
commands (List) – List of commands, where each entry is a list of strings that constitutes a python script call
- woods.command_launchers.slurm_launcher(commands)
Parallel job launcher for computational clusters using the SLURM workload manager.
Launches all the jobs in commands in parallel according to the number of tasks in the SLURM allocation. An example of SBATCH options:

#!/bin/bash
#SBATCH --job-name=<job_name>
#SBATCH --output=<job_name>.out
#SBATCH --error=<job_name>_error.out
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=8
#SBATCH --gres=gpu:4
#SBATCH --time=1-00:00:00
#SBATCH --mem=81Gb

Note
--cpus-per-task should match the N_WORKERS defined in datasets.py (default 4)
Note
There should be an equal number of --ntasks and --gres
- Parameters
commands (List) – List of commands, where each entry is a list of strings that constitutes a python script call
woods.datasets module
woods.hyperparams module
woods.model_selection module
woods.models module
woods.objectives module
woods.train module
woods.utils module
woods.scripts
woods.scripts.compile_results module
woods.scripts.download module
woods.scripts.fetch_and_preprocess module
woods.scripts.hparams_sweep module
woods.scripts.main module
woods.scripts.visualize_results module