Failure building fv3-jedi on Orion

Hello,
I followed the advice in "JEDI Modules on selected HPC systems" (JEDI Documentation) with intel-impi, but the build from a fresh clone is not successful.
Can someone advise how to proceed?
Thanks,
Mariusz

/work/noaa/gsd-fv3-dev/pagowski/jedi/code/fv3-bundle/fms/exchange/xgrid.F90(4866): warning #6843: A dummy argument with an explicit INTENT(OUT) declaration is not given an explicit value. [D]
subroutine get_side1_from_xgrid_ug(d, grid_id, x, xmap, complete)
-----------------------------------^
/work/noaa/gsd-fv3-dev/pagowski/jedi/code/fv3-bundle/fms/exchange/xgrid.F90(3346): warning #6843: A dummy argument with an explicit INTENT(OUT) declaration is not given an explicit value. [D]
subroutine get_side1_from_xgrid(d, grid_id, x, xmap, complete)
--------------------------------^
[ 13%] Building Fortran object fms/CMakeFiles/fms.dir/coupler/atmos_ocean_fluxes.F90.o
[ 13%] Linking Fortran shared library ../lib/libfms.so
[ 13%] Built target fms
make: *** [all] Error 2

Hi Mariusz

Are you using the Intel or GNU modules to build fv3-jedi?

Orion is currently down for maintenance; we’ll have a look when it’s back up.

Rick, the failure is with Intel; I haven’t tried GNU.
Mariusz

Mariusz,

I think you are not showing the error message.

for example:

make[2]: *** [bin/soca_ensvariance.x] Error 1
make[1]: *** [soca/src/mains/CMakeFiles/soca_ensvariance.x.dir/all] Error 2
make: *** [all] Error 2

Yes, you are right. I mistakenly grabbed the stdout from before the compilation aborted (because of the parallel make, it showed some unrelated warnings).

@mpagowski - how is it going with this? Are you still having problems? Also - are you building in debug mode? I have seen problems with fms and crtm when built with intel 19 in debug mode.

I haven’t tried it since there were no updates on this topic. The flag is -DNDEBUG. Were there any updates to the modules and should I try again?

The problem I referred to is independent of the modules, so there is no need to rebuild them. Where do you use -DNDEBUG? Do you build with ecbuild --build=Debug, or do you specify the compiler options differently?

If the failure is with a debug build of fms, then one immediate workaround is simply not to build fms in debug mode with intel 19. If that is indeed the problem, we could put a temporary fix in our fork of fms, but a longer-term solution would require a bug report to NOAA-GFDL, where the fms repo originates. Would you mind trying again? After it gets to the build failure, enter make again (with no -j option) to continue with a serial build, and then post the error message here. In the meantime, I’ll see if I can reproduce the error.
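The serial re-run suggested above can be combined with a saved log, so the real failure is not lost among unrelated warnings. This is a minimal sketch, not a prescribed JEDI procedure: the log name build.log and the helper name first_errors are assumptions made for illustration, and the patterns only cover the kinds of messages quoted in this thread.

```shell
#!/usr/bin/env bash
# Print the lines of a saved build log that look like real failures:
# make's "*** [target] Error N" lines, linker "undefined reference"
# lines, and compiler "error:" lines. Warnings are not matched.
# (first_errors is a hypothetical helper name.)
first_errors() {
    grep -n -E '\*\*\* \[.*\] Error [0-9]+|undefined reference|[Ee]rror:' "$1"
}

# Typical use from the build directory (left as comments so that
# sourcing this sketch has no side effects):
#   make 2>&1 | tee build.log    # serial build: no -j option
#   first_errors build.log
```

Because the build is serial, the first matching lines in the log belong to the target that actually failed.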

No, no debug in ecbuild

ecbuild -DMPIEXEC_EXECUTABLE=/opt/slurm/bin/srun -DMPIEXEC_NUMPROC_FLAG="-n" $SRC

That produces the -DNDEBUG flag for the compiler.
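One way to settle the build-type question is to read what CMake recorded in the build directory. This is a minimal sketch, assuming the build directory contains the usual CMakeCache.txt; the helper name build_type is made up for illustration. Release-style build types conventionally define NDEBUG, so seeing -DNDEBUG is consistent with a non-debug build.

```shell
#!/usr/bin/env bash
# Print the build type stored in a CMake cache file.
# Cache entries have the form NAME:TYPE=VALUE, for example
# CMAKE_BUILD_TYPE:STRING=Release.
# (build_type is a hypothetical helper name.)
build_type() {
    sed -n 's/^CMAKE_BUILD_TYPE:[A-Z]*=//p' "$1"
}

# Typical use from the build directory:
#   build_type CMakeCache.txt
```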

Ok, I will try it again


I recloned it and ran ecbuild/make.

Long error message

[ 3%] Linking Fortran executable fckit_test_shared_ptr
../../../lib/libfckit.so: undefined reference to `eckit::BackTrace::dump()'
../../../lib/libfckit.so: undefined reference to `eckit::Main::displayName() const'
../../../lib/libfckit.so: undefined reference to `eckit::Abort::Abort(std::string const&, eckit::CodeLocation const&)'
../../../lib/libfckit.so: undefined reference to `eckit::AssertionFailed::AssertionFailed(std::string const&, eckit::CodeLocation const&)'
/work/noaa/da/grubin/opt/modules/intel-2020/impi-2020/eckit/jcsda-1.11.6.jcsda2/lib64/libeckit_mpi.so: undefined reference to `std::basic_ostream<char, std::char_traits<char> >& std::operator<< <char, std::char_traits<char>, std::allocator<char> >(std::basic_ostream<char, std::char_traits<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)@GLIBCXX_3.4.21'

make[2]: *** [fckit/src/tests/fckit_test_shared_ptr] Error 1
make[1]: *** [fckit/src/tests/CMakeFiles/fckit_test_shared_ptr.dir/all] Error 2
make: *** [all] Error 2
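The @GLIBCXX_3.4.21 suffix on the last undefined reference is a libstdc++ symbol version, so one quick check is whether the libstdc++ found at link time provides that version. The sketch below lists the GLIBCXX version strings embedded in a library file. The helper name glibcxx_versions is hypothetical, and the library path in the usage comment is a placeholder rather than a path taken from Orion.

```shell
#!/usr/bin/env bash
# List the distinct GLIBCXX_x.y.z version strings found in a file.
# grep -a treats the binary as text; -o prints only the matches.
# (glibcxx_versions is a hypothetical helper name.)
glibcxx_versions() {
    grep -a -o 'GLIBCXX_[0-9.]*' "$1" | sort -u
}

# Typical use (placeholder path; substitute the libstdc++ that the
# compiler actually picks up on your system):
#   glibcxx_versions /path/to/libstdc++.so.6 | grep 'GLIBCXX_3.4.21'
```

If the required version is missing from the list, the link is using an older libstdc++ than the one the prebuilt library was compiled against.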

I believe this is solved now, thanks to @mpagowski and @rickgrubin. The problem appeared to be that one or more module files, specifically gcc, were not readable by general users, so the Intel compiler was not able to access compatible gcc headers. This has been corrected for all users. Thanks for the report!
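For anyone who runs into a similar permissions problem, a check along these lines can list files that other users cannot read. This is only a sketch: the helper name unreadable_by_others is made up, and the directory in the usage comment is a placeholder, since the thread does not give the location of the affected gcc module files. Directories along the path also need to be searchable by others, which this sketch does not check.

```shell
#!/usr/bin/env bash
# List regular files under a directory that lack the "other" read
# bit (octal 004), i.e. files that general users cannot read.
# (unreadable_by_others is a hypothetical helper name.)
unreadable_by_others() {
    find "$1" -type f ! -perm -004
}

# Typical use (placeholder path):
#   unreadable_by_others /path/to/modulefiles
```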