Complex Operations

Enable/Disable CFA Operations

PADOCC uses the Climate Forecast Aggregations as a basis of comparison for any generated cloud formats, to use in the validation process. This base operation can now be disabled using the cfa_enabled property of each project. This can be set to False for any group taking too long to compute in CFA (watch out for this in the scan section as if a sample is taking a very long time it is likely the full dataset will take much longer).

Parallelisation with SLURM

Padocc now supports parallelisation via LOTUS on JASMIN, through the SLURM batch job manager. Use the CLI flag --parallel to deploy sets (arrays) of jobs to the cluster. Typically an array job will constitute processing multiple datasets at once, i.e all datasets in a given group. To deploy a subset of a group, see the documentation around repeat IDs and how to create subsets within the group based on past runs and other conditions.

Here is an example of deploying a set of jobs to SLURM.

$ padocc compute -G my-test-group --parallel -vv

This will deploy all jobs with debug logs, meaning all log outputs will be displayed. All projects covered by this deployment have their own log files that can be accessed using interactive means per project, and the results from a deployment can be most easily viewed using the status special feature (see Extra Details).

Other parameters that can be passed as flags for parallel deployments are listed here:

-t TIME_ALLOWED: Time limit for this set of jobs (default applied per phase unless specified.)
-M MEMORY: Memory limit for each job (default applies per phase.)
-e VENVPATH: Must be supplied for all jobs to have the correct environment, or will use the current active environment.

Lotus 2 configurations

PADOCC is now configured for Lotus 2 deployments on JASMIN. Simply create a Lotus 2 config file which matches the following options.

{
    "lotus_vn": 2,
    "partition": "standard",
    "account": "no-project",
    "qos": "standard"
}

See the JASMIN help docs for guidance on how to configure for Lotus 2 deployments.

This config file is discovered by padocc using the $LOTUS_CFG environment variable, which must be set to the location of this file. After this step, no further actions are required, the normal parallelisation methods listed above apply as standard.