Experimental support for fault tolerance & asymmetric PGAS
Experimental failed-image detection
This feature is experimental and requires an MPI implementation that provides certain experimental, proposed MPIX functions and constants. These are present in MPICH 3.2, which is now the default, officially supported MPI back end. Some or all of these features may also be available in OpenMPI through the ULFM project. If the build system detects that the required features are present, it enables failed-image support by default.
See the src/tests/unit/fail_images subdirectory for demonstrations of the new support for Fortran 2015 features related to fault-tolerance, including the following:
- The iso_fortran_env intrinsic module now contains a new stat_failed_image value that the compiler and runtime library assign to the stat argument of parallel synchronization and communication statements to signal that an image has ceased responding, a scenario considered increasingly likely as computing platforms approach exaflop scalability.
- A new failed_images() function returns an array containing the image numbers of failed images.
Richer support for fault-tolerant execution necessitates the Fortran 2015 team feature. However, this release enables users to start experimenting with fault tolerance in advance of anticipated team support.
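The two items above can be combined into a simple detection pattern. The following is a minimal sketch, not taken from the src/tests/unit/fail_images tests: the program and variable names are hypothetical, and it assumes a compiler and runtime built with failed-image support enabled.

```fortran
program failed_image_check
  ! stat_failed_image is the constant described above
  use, intrinsic :: iso_fortran_env, only : stat_failed_image
  implicit none
  integer :: sync_status   ! hypothetical name for the stat= result

  sync_status = 0
  ! With stat= present, a failed image is reported through sync_status
  ! instead of terminating the program.
  sync all (stat=sync_status)

  if (sync_status == stat_failed_image) then
    ! failed_images() returns the image numbers of the failed images
    print *, "image", this_image(), "detected failed images:", failed_images()
  else if (sync_status /= 0) then
    error stop "sync all reported an error other than a failed image"
  end if
end program failed_image_check
```

When no image has failed, sync_status stays zero and the program finishes silently. What a surviving image should do after detecting a failure is application-specific and is left out of this sketch.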
Additional experimental support for derived-type coarrays with allocatable components:
This release builds on the incomplete implementation in the 1.8.0 release of support for derived-type coarrays with allocatable components. Fortran requires that array coarrays have the same shape and bounds on each image. For coarrays of intrinsic type, this implies memory allocations that are invariant under image-number transformations. With coarrays of derived type, however, one can allocate data that are of varying size and shape across images:
    type foo
      real, allocatable :: bar(:)
    end type
    type(foo) :: foobar[*]
which is a powerful enabler when used judiciously in problems that require the flexibility of distributed, non-uniform memory allocations. This feature requires GCC/GFortran 7.1 because compiler-side interface changes were needed to support it. The feature is still considered experimental and is not yet fully implemented, so we do not yet recommend using it in production.
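As an illustration of the declaration above, the following sketch allocates a component of a different size on each image and then reads a remote component. The program name, the variable me, and the size rule (length equal to the image number) are assumptions made for this example. Because the feature is experimental, behavior may vary with the compiler and OpenCoarrays versions in use.

```fortran
program ragged_coarray_sketch
  implicit none
  type foo
    real, allocatable :: bar(:)
  end type
  type(foo) :: foobar[*]
  integer :: me   ! hypothetical helper holding this image's number

  me = this_image()

  ! Each image allocates its own component; the length differs per image.
  allocate(foobar%bar(me))
  foobar%bar = real(me)

  ! Make sure every image has finished allocating and assigning
  ! before any image reads remote data.
  sync all

  ! Image 1 fetches the component owned by the last image,
  ! whose length generally differs from its own.
  if (me == 1) print *, "bar on last image:", foobar[num_images()]%bar
end program ragged_coarray_sketch
```

The component is allocated locally on each image (foobar%bar with no coindex), since an image can only allocate its own components. Remote access happens only after the sync all.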
Bug fixes
- #309 stop statements with numeric and string arguments were not handled correctly and are now fixed.
- #342 A maintainer flag was added to turn on tests intended only for OpenCoarrays developers. This can be turned on by setting OPENCOARRAYS_DEVELOPER=TRUE as an environment variable or by turning on the CAF_RUN_DEVELOPER_TESTS advanced CMake option.
- #354 sync (all|images) without stat= was not erroring out under certain error conditions. This is now resolved.
- #376 The CI build matrix was expanded for more complete test coverage using GCC 6 and 7 for compiling the library.
- #383 cafrun had a typo (missing space) with the -v flag. Thanks to @LaHaine for pointing this out.
- #384 install.sh does not work on HPC Linux. A new script was added to install OpenCoarrays on HPC Linux.
- #385 install.sh was not correctly reporting the path to the newly installed CMake under certain circumstances. This is now fixed.
- #388 Better build system robustness and diagnostics
- Excessive debug output has been reduced when building the Debug configuration
- Oversubscription in tests is now reduced
Installation
Please see the installation instructions for more details on how to build and install this version of OpenCoarrays.