Failure of bumpparameters_nicas_gfs_aero at C96

hbo9955 · December 7, 2020, 5:57pm

Hello,

I tried to use “bumpparameters_nicas_gfs_aero.yaml” (which defaults to C12 resolution) to generate BUMP files for aerosols at C96 resolution by setting npx=97 and npy=97. However, the “fv3jedi_parameters.x” program aborted while writing out the BUMP files, with the errors below:

"""
-------------------------------------------------------------------
--- Write NICAS parameters
Write NICAS data of task 1
!!! ABORT in nicas_blk_write on task #0005: dimension nc1b has a different size in file
!!! ABORT in nicas_blk_write on task #0002: dimension nc1b has a different size in file
!!! ABORT in nicas_write on task #0004: NetCDF: HDF error
"""

I also tried “bumpparameters_nicas_gfs.yaml” for [T, ps] at C96 resolution, and that test passed. There seems to be a bug in the BUMP code for aerosols. Does anyone have any comments or suggestions on this?

Thanks in advance.
Bo

TingLei-NOAA · December 8, 2020, 3:08pm

Hi, Bo,
There are two things you can try. First, make sure the number of OpenMP threads is set to 1. Second, clean out the results of any previous run.
These steps resolved a similar error message for me (though I can no longer reproduce that error :)). Hope this helps.
Ting

hbo9955 · December 8, 2020, 5:23pm

Hi Ting,

Thanks for your suggestions.

I ran the program with 6 MPI tasks and 1 OpenMP thread, and cleaned out the output from the previous run. But I still got a similar error message, “dimension nc1b has a different size in file”, during the writing process.

Bo

TingLei-NOAA · December 8, 2020, 6:33pm

Bo,
Thanks for the update.
Then let us see what comments or suggestions other experts have :).

danholdaway · December 8, 2020, 8:50pm

That error usually results from the files already being present and attempting to overwrite them but with different data. Did you remove everything from the Data/bump directory before running again?

hbo9955 · December 9, 2020, 3:29am

@danholdaway

Thanks for your suggestions.

Yes, I cleaned everything in Data/bump. I ran the “fv3jedi_parameters.x” program with no problems using
(1) bumpparameters_nicas_gfs.yaml at C12 and C96 resolution
and (2) bumpparameters_nicas_gfs_aero.yaml at C12 resolution.

I only got these errors from bumpparameters_nicas_gfs_aero.yaml at C96 resolution. I also tried including only two variables (sulf, bc1) in bumpparameters_nicas_gfs_aero.yaml, mirroring the two variables (T, ps) in bumpparameters_nicas_gfs.yaml, but got the same errors. There seems to be a bug in writing out the BUMP files for aerosols.

Bo

benjaminmenetrier · December 10, 2020, 3:44pm

Hi @hbo9955, I agree with the solutions proposed in previous answers. This kind of error happens when you try to write a NICAS dataset in a file where it is already written but with different dimensions (for instance if the resolution or the length-scales are different). Could you send me your yaml files for both aero runs (benjamin.menetrier@irit.fr)? I can check that BUMP parameters are correct.

hbo9955 · December 10, 2020, 4:32pm

@benjaminmenetrier Thanks for your comments. I will send you the yaml files to check.

benjaminmenetrier · January 7, 2021, 9:11am

Hi @hbo9955, I copy/paste here the email exchange we had with @danholdaway last month, which could be useful for other users. I just reordered the messages to make the discussion easy to read.

From @hbo9955:

Hi Benjamin,

I am Bo Huang. Thanks for your comments on my question posted on JEDI forum. Since Mariusz was also testing this function in JEDI, I included him in this email.

Attached are four yaml files and an sbatch job script:
(1) bumpparameters_nicas_gfs_c12.yaml for [T, ps] at C12
(2) bumpparameters_nicas_gfs_c96.yaml for [T, ps] at C96
(3) bumpparameters_nicas_gfs_aero_c12.yaml for [sulf, …, seas5] at C12
(4) bumpparameters_nicas_gfs_aero_c96.yaml for [sulf, …, seas5] at C96
(5) sbatch_bump_gfs_aero.sh

I only have errors in (4) showing “!!! ABORT in nicas_blk_write on task #0005: dimension nc1b has a different size in file”. The other three work fine. If you need more information, please let me know.

In addition, is there an option to set the vertical localization scale in “logpres” units in the BUMP yaml files? By default, this is controlled by “rv=0.3”, in “Sigma-Level” units (I assume). “lgetkf.yaml” uses “logpres” units for the vertical localization length scale, like this:
"""
local ensemble DA:
  solver: GETKF
  vertical localization:
    fraction of retained variance: .5
    lengthscale: 1.5
    lengthscale units: logp
"""

Thanks.
Bo

From @benjaminmenetrier:

Hi Bo, [cc. Mariusz and Dan]

Thanks for the yaml files.

The important point for your issue is the key “prefix” in the “bump” section. For the “bumpparameters_nicas_gfs_aero_c96” run, it is set to “…/bump_aero/fv3jedi_bumpparameters_nicas_gfs_aero”. This means that all the files produced by BUMP will be written as “…/bump_aero/fv3jedi_bumpparameters_nicas_gfs_aero_XXX.nc” where XXX is a suffix depending on the data that are written. So before running this test, you have to make sure that the directory “…/bump_aero” is empty. Can you try again? Unfortunately, I don’t have access to the NOAA machine you are probably using. If it doesn’t work, please send me the output log file, I’ll check it too.
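
A minimal sketch of that clean-up step in Python. It relies only on what is stated above, namely that BUMP writes its files as "<prefix>_XXX.nc". The helper names and the demo prefix and suffixes are hypothetical, so adapt them to your own "prefix" value.

```python
import glob
import os
import tempfile


def stale_bump_files(prefix):
    """Return existing files named <prefix>_XXX.nc (the pattern described above)."""
    return sorted(glob.glob(glob.escape(prefix) + "_*.nc"))


def remove_stale_bump_files(prefix):
    """Delete leftover BUMP files for this prefix and return what was removed."""
    removed = stale_bump_files(prefix)
    for path in removed:
        os.remove(path)
    return removed


if __name__ == "__main__":
    # Demo in a temporary directory; the prefix and suffixes are placeholders.
    with tempfile.TemporaryDirectory() as workdir:
        prefix = os.path.join(workdir, "fv3jedi_bumpparameters_nicas_gfs_aero")
        for suffix in ("demo_a", "demo_b"):
            open(f"{prefix}_{suffix}.nc", "w").close()
        print(len(remove_stale_bump_files(prefix)), "stale files removed")
```

Matching on the prefix rather than wiping the whole directory leaves unrelated files in the same directory untouched.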

Regarding the vertical coordinate, BUMP is not really aware of it since it is provided in the model interface. For instance in FV3-JEDI, a “fake” sigma coordinate is passed (see fv3-jedi/fv3jedi_geom_mod.f90 at commit 6b0b1806c9ac3d9262301465cb4483972e83a33f in JCSDA/fv3-jedi on GitHub). Thus, the unit of “rv” in the yaml file for FV3-JEDI is the “sigma” unit.
If the vertical coordinate was pressure, the unit of “rv” would be pascals, and so on. So if you want to use the logarithm of pressure, you have to change the vertical coordinate in the “fill_atlas_fieldset” subroutine of fv3jedi_geom_mod.f90. You could look at the “getVerticalCoordLogP” subroutine as an example. Let me know if you need some more help about this issue.
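
To illustrate why the meaning of “rv” follows the coordinate, here is a small Python sketch. It is not the fv3-jedi code: the pressure values are made up, the sigma-like coordinate is simply p divided by the first pressure, and ln(p) stands in for a log-pressure coordinate.

```python
import math


def logp_coordinate(pressures_pa):
    """Map pressures (Pa) to a log-pressure coordinate, ln(p)."""
    return [math.log(p) for p in pressures_pa]


def separation(coord, k1, k2):
    """Distance between two levels, in the units of the coordinate passed in."""
    return abs(coord[k1] - coord[k2])


if __name__ == "__main__":
    pressures = [100000.0, 85000.0, 50000.0, 25000.0]  # illustrative values only
    sigma = [p / pressures[0] for p in pressures]  # sigma-like coordinate
    logp = logp_coordinate(pressures)
    # The same pair of levels is a different "distance" in each coordinate,
    # so a value of rv only has meaning relative to the coordinate in use.
    print("sigma units:", separation(sigma, 0, 2))
    print("ln(p) units:", separation(logp, 0, 2))
```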

Have a good day,
Benjamin

From @hbo9955:

Hi Benjamin,

Thanks for your prompt response. I attached the log files from the four runs in my last email. I emptied the directory where the BUMP files are written before each run. Only “bump_gfs_aero_c96.out” has the writing error message, at the end of the file.

I also tried to only include two variables [sulf, bc1] in bump_gfs_aero_c96.yaml like bump_gfs_c96.yaml. It also shows similar errors.

Thanks for your response to the vertical coordinate question. We will look into it and will let you know if we need help from you.

Best,
Bo

From @danholdaway:

Hi Bo,

Can you send your bump_gfs_aero_c96.yaml?

Thanks,
Dan

From @hbo9955:

Hi Dan,

Yaml file is attached,

Thanks,
Bo

From @danholdaway:

new_nicas: 1 is likely wrong in this yaml file. It is telling bump to create a new operator, when you’ve already created it in the prior step and are trying to read it in the prefix line. That means it will try to write to the existing files.

From @hbo9955:

That makes sense.

I see that new_nicas: 1 is also in bumpparameters_nicas_gfs.yaml for the [T, ps] variables. Could this error be caused by the code related to aerosol BUMP? If so, I think I will need to talk to Andrew, who developed the aerosol BUMP capability.

Bo

From @danholdaway:

Bo,

It is correct to have it in that file. There is a two-step process:

(1) Run BUMP parameters to generate the localization and/or covariance model (new_nicas: 1 in bumpparameters_nicas_gfs.yaml).
(2) Run the assimilation, which reads the precomputed BUMP models (new_nicas: 0 in bump_gfs_aero_c96.yaml).

Thanks,
Dan.

From @hbo9955:

Hi Dan,

Thanks for the clarification. I think I did not explain my problem clearly.

My goal is to run Step 1 to generate localization and/or covariance model for aerosols using bumpparameters_nicas_gfs_aero.yaml at C96. So new_nicas: 1 should be fine in this yaml file. But it caused writing errors.

The four log files I uploaded are from running bumpparameters_nicas_gfs.yaml and bumpparameters_nicas_gfs_aero.yaml at C12 and C96. I only got the writing errors when running bumpparameters_nicas_gfs_aero.yaml at C96. The other three worked fine.

Thanks,
Bo

From @benjaminmenetrier:

Hi Bo,

I think I understand your problem now, sorry I didn’t notice the issue earlier.

When running bumpparameters_nicas_gfs_aero.yaml at C12 or C96, you are generating the NICAS operator with fixed length-scales several times (once for each variable) and overwriting each time in the same NetCDF group because of how the “io_keys” and “io_values” keys are set. Even if the NICAS subgrid size (nc1) is the same each time, the random subsampling can lead to different local subgrid sizes for each MPI task when the subsampling process is repeated for each variable. Thus, nc1b is different for each variable (as you can see in the log file of C96), which explains the crash. It does not crash at resolution C12 because the grid is so coarse that all points are kept in the NICAS subsampling (nc1b = 144), so the NICAS variables are overwritten without any problem.
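
A toy Python illustration of this effect. It is not BUMP's subsampling algorithm: points are split into contiguous blocks per task, a random subset of fixed total size is drawn, and the per-task counts play the role of a local subgrid size. The grid sizes are arbitrary.

```python
import random
from collections import Counter


def local_counts(n_points, n_tasks, nc1, seed):
    """Pick nc1 of n_points at random and count the picks owned by each task.

    Toy stand-in for a local subgrid size; this is not BUMP's algorithm.
    """
    rng = random.Random(seed)
    picked = rng.sample(range(n_points), nc1)
    # Block decomposition: point i belongs to task i * n_tasks // n_points.
    owners = Counter(i * n_tasks // n_points for i in picked)
    return [owners.get(task, 0) for task in range(n_tasks)]


if __name__ == "__main__":
    # Finer grid: only a subset is kept, so per-task counts depend on the draw.
    for seed in (1, 2):
        print("subsampled:", local_counts(6000, 6, 600, seed))
    # Coarse grid: every point is kept, so the counts are identical each time.
    for seed in (1, 2):
        print("all kept:  ", local_counts(144, 6, 144, seed))
```

The total stays fixed in both cases. Only the split across tasks can change between draws, which is the kind of mismatch that breaks an overwrite into an existing NetCDF dimension.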

More generally, I think that with the current setup, where only two NICAS operators are generated with fixed length-scales in bumpparameters_nicas_gfs_c12.yaml and bumpparameters_nicas_gfs_c96.yaml (one 3D and one 2D), it is unnecessary to regenerate specific NICAS operators for aerosols. There is no need to run bumpparameters_nicas_gfs_aero_c12.yaml and bumpparameters_nicas_gfs_aero_c96.yaml; you can simply use the NICAS files produced by bumpparameters_nicas_gfs_c12.yaml and bumpparameters_nicas_gfs_c96.yaml when running your variational applications.

So to extend Dan’s answer:

(1) Run BUMP parameters to generate the localization and/or covariance model (bumpparameters_nicas_gfs_c12.yaml and bumpparameters_nicas_gfs_c96.yaml) with “new_nicas: 1”.
(2) Run the assimilation, which reads the precomputed BUMP models with “load_nicas: 1” at the correct resolution (use the correct prefix from bumpparameters_nicas_gfs_c12.yaml or bumpparameters_nicas_gfs_c96.yaml), and specify your own “io_keys” / “io_values” set (the one from your bumpparameters_nicas_gfs_aero_c12.yaml was correct).

Dan: what is done in the current bumpparameters_nicas_gfs_aero.yaml is not really useful and can be misleading as Bo noticed (sorry Bo!), we should update it or remove it.

Benjamin

From @hbo9955:

Hi Benjamin,

Thanks for your detailed reply.

I will try what you suggested to use NICAS files produced by bumpparameters_nicas_gfs_c96.yaml and adjust “io_keys” / “io_values” sets in our aerosol variational update.

Thanks,
Bo

mpagowski · January 25, 2021, 4:14pm

Benjamin,
this is a continuation of the thread that Bo started. Only selected file extensions can be attached, so I am just pasting the yaml below. Please let us know your comments on its content. Many thanks,
Mariusz

geometry:
  nml_file_mpp: Data/fv3files/fmsmpp.nml
  trc_file: Data/fv3files/field_table
  akbk: Data/fv3files/akbk64.nc4
  # input.nml
  layout: [3,8]
  io_layout: [1,1]
  npx: 97
  npy: 97
  npz: 64
  ntiles: 6
  fieldsets:
  - fieldset: Data/fieldsets/dynamics.yaml
  - fieldset: Data/fieldsets/aerosols_gfs.yaml
input variables: &aerovars [sulf,bc1,bc2,oc1,oc2,
                            dust1,dust2,dust3,dust4,dust5,
                            seas1,seas2,seas3,seas4,seas5]
date: '2018-04-15T00:00:00Z'
background:
  filetype: gfs
  datapath: Data/inputs/gfs_aero_c96/bkg/
  filename_core: 20180415.000000.fv_core.res.nc
  filename_trcr: 20180415.000000.fv_tracer.res.nc
  filename_cplr: 20180415.000000.coupler.res
  state variables: [sulf]
bump:
  prefix: Data/bump_aero/fv3jedi_bumpparameters_nicas_gfs_aero
  verbosity: main
  universe_rad: 2500.0e3
  strategy: specific_univariate
  new_nicas: 1
  ntry: 10
  nrep: 2
  resol: 6.0
  mpicom: 2
  # Forced length-scales
  # --------------------
  forced_radii: 1
  rh:
    sulf: [2500.0e3]
    bc1: [2500.0e3]
    bc2: [2500.0e3]
    oc1: [2500.0e3]
    oc2: [2500.0e3]
    dust1: [2500.0e3]
    dust2: [2500.0e3]
    dust3: [2500.0e3]
    dust4: [2500.0e3]
    dust5: [2500.0e3]
    seas1: [2500.0e3]
    seas2: [2500.0e3]
    seas3: [2500.0e3]
    seas4: [2500.0e3]
    seas5: [2500.0e3]
  rv:
    sulf: [1.5]
    bc1: [1.5]
    bc2: [1.5]
    oc1: [1.5]
    oc2: [1.5]
    dust1: [1.5]
    dust2: [1.5]
    dust3: [1.5]
    dust4: [1.5]
    dust5: [1.5]
    seas1: [1.5]
    seas2: [1.5]
    seas3: [1.5]
    seas4: [1.5]
    seas5: [1.5]
  # Write C matrix
  # --------------
  write_cmat: 0
  io_keys:
  - "sulf-sulf"
  io_values:
  - "fixed_2500km_1.5"
  new_var: 1
  ne: 8
  var_filter: 1
  var_niter: 5
  var_rhflt: 2500.0e3
  ensemble:
    date: '2018-04-15T00:00:00Z'
    members:
    - filetype: gfs
      state variables: *aerovars
      datapath: Data/inputs/gfs_aero_c96/mem001/
      filename_trcr: 20180415.000000.fv_tracer.res.nc
      filename_cplr: 20180415.000000.coupler.res
    - filetype: gfs
      state variables: *aerovars
      datapath: Data/inputs/gfs_aero_c96/mem002/
      filename_trcr: 20180415.000000.fv_tracer.res.nc
      filename_cplr: 20180415.000000.coupler.res
    - filetype: gfs
      state variables: *aerovars
      datapath: Data/inputs/gfs_aero_c96/mem003/
      filename_trcr: 20180415.000000.fv_tracer.res.nc
      filename_cplr: 20180415.000000.coupler.res
    - filetype: gfs
      state variables: *aerovars
      datapath: Data/inputs/gfs_aero_c96/mem004/
      filename_trcr: 20180415.000000.fv_tracer.res.nc
      filename_cplr: 20180415.000000.coupler.res
    - filetype: gfs
      state variables: *aerovars
      datapath: Data/inputs/gfs_aero_c96/mem005/
      filename_trcr: 20180415.000000.fv_tracer.res.nc
      filename_cplr: 20180415.000000.coupler.res
    - filetype: gfs
      state variables: *aerovars
      datapath: Data/inputs/gfs_aero_c96/mem006/
      filename_trcr: 20180415.000000.fv_tracer.res.nc
      filename_cplr: 20180415.000000.coupler.res
    - filetype: gfs
      state variables: *aerovars
      datapath: Data/inputs/gfs_aero_c96/mem007/
      filename_trcr: 20180415.000000.fv_tracer.res.nc
      filename_cplr: 20180415.000000.coupler.res
    - filetype: gfs
      state variables: *aerovars
      datapath: Data/inputs/gfs_aero_c96/mem008/
      filename_trcr: 20180415.000000.fv_tracer.res.nc
      filename_cplr: 20180415.000000.coupler.res
  output:
  - parameter: stddev
    exp: stddev
    type: an
    filetype: gfs
    datapath: Data/bump_aero/
    filename_core: bumpparameters_nicas_gfs_aero.stddev.fv_core.res.nc
    filename_trcr: bumpparameters_nicas_gfs_aero.stddev.fv_tracer.res.nc
    filename_cplr: bumpparameters_nicas_gfs_aero.stddev.coupler.res
    date: '2018-04-15T00:00:00Z'
  - parameter: cor_rh
    exp: cor_rh
    type: an
    filetype: gfs
    datapath: Data/bump_aero/
    filename_core: bumpparameters_nicas_gfs_aero.cor_rh.fv_core.res.nc
    filename_trcr: bumpparameters_nicas_gfs_aero.cor_rh.fv_tracer.res.nc
    date: '2018-04-15T00:00:00Z'
  - parameter: cor_rv
    exp: cor_rv
    type: an
    filetype: gfs
    datapath: Data/bump_aero/
    filename_core: bumpparameters_nicas_gfs_aero.cor_rv.fv_core.res.nc
    filename_trcr: bumpparameters_nicas_gfs_aero.cor_rv.fv_tracer.res.nc
    date: '2018-04-15T00:00:00Z'

benjaminmenetrier · February 2, 2021, 6:35am

Hi @mpagowski. I recently updated the BUMP parameters for the test fv3jedi_bumpparameters_nicas_gfs_aero. Please find hereafter some comments about these parameters:

prefix: Data/bump/fv3jedi_bumpparameters_nicas_gfs_aero >>> File path
verbosity: main >>> Verbosity level (here only the root MPI task is writing a log)
universe_rad: 3000000.0 >>> “Universe” radius, defining the area surrounding the domain of each MPI task, with which the MPI task can communicate data (i.e. maximum halo extension). If the NICAS length-scales are increased, this value should be increased too.
strategy: specific_univariate >>> “Strategy” refers to the multivariate strategy. Here, a specific NICAS operator is computed for each diagonal block of the B matrix, no cross-covariances.
new_nicas: 1 >>> Activate the NICAS operators computation.
ntry: 3 >>> Internal parameter in the subsampling generation (number of random trials for each point), not very important.
nrep: 2 >>> Internal parameter in the subsampling generation (number of points that are re-positioned after a first pass), not very important.
resol: 6 >>> NICAS subgrid resolution (very important!). We can think about it in 1D: it corresponds to the number of points used to describe a Gaspari and Cohn function from its top to the end of its support. The computation cost of applying a NICAS operator scales with the square of this resolution. 6 seems a good compromise to start with, and could be increased or reduced depending on the cost and accuracy of the tests.
mpicom: 2 >>> Number of internal communication steps in NICAS, should remain at 2.
forced_radii: 1 >>> Activate the use of yaml-specified length-scales for NICAS (instead of length-scales diagnosed from an ensemble).
rh: >>> Horizontal support radius of the convolution function
sulf: [3000000.0] >>> Profile of support radius for a given variable. If a single value is given, it is used for all levels.
rv: >>> Vertical support radius of NICAS functions
sulf: [0.2] >>> Profile of support radius for a given variable. If a single value is given, it is used for all levels.
write_cmat: 0 >>> Deactivate the writing of interpolated length-scale fields (useless here).
io_keys: ["sulf-sulf"] >>> Keys in an array of “key/value” couples used to write NICAS operator data in NetCDF files. For instance here, the autocovariance block of the sulf variable, internally named sulf-sulf, is written in the NetCDF file in the block fixed_3000km_0.2. When reading this NICAS operator in another application (e.g. 3DVar), several variables can use the same block fixed_3000km_0.2, taking advantage of the array of “key/value” couples.
io_values: ["fixed_3000km_0.2"] >>> Values in the array of “key/value” couples.
new_var: 1 >>> Activate variance computation from the ensemble
ne: 8 >>> Number of ensemble members used to compute the variance.
var_filter: 1 >>> Activate an iterative horizontal filtering of the variance field.
var_niter: 5 >>> Number of iterations for the filtering
var_rhflt: 2500.0e3 >>> Initial filtering length-scale in the iterative process (this looks OK).
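
As a rough sketch of the key/value idea described for io_keys and io_values, the Python below builds a pair of lists that point every aerosol autocovariance block at one shared NICAS block. It assumes the "var-var" naming seen in "sulf-sulf" applies to the other variables, and the helper name is hypothetical. The block name must match whatever is actually stored in your NICAS files.

```python
AEROSOL_VARS = [
    "sulf", "bc1", "bc2", "oc1", "oc2",
    "dust1", "dust2", "dust3", "dust4", "dust5",
    "seas1", "seas2", "seas3", "seas4", "seas5",
]


def shared_block_io(variables, block_name):
    """Build io_keys / io_values mapping each autocovariance block to one NICAS block."""
    io_keys = [f"{var}-{var}" for var in variables]
    io_values = [block_name] * len(variables)
    return io_keys, io_values


if __name__ == "__main__":
    keys, values = shared_block_io(AEROSOL_VARS, "fixed_3000km_0.2")
    for key, value in zip(keys, values):
        print(f"{key} -> {value}")
```

The two lists are positional: entry i of io_keys goes with entry i of io_values, so they need the same length and order when copied into a yaml file.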

andytangborn · February 4, 2021, 7:59pm

I’ve gotten this new version of fv3jedi_bumpparameters_nicas_gfs_aero to work, with an ensemble input and output of error standard deviation. I’m setting up a hybrid assimilation run that will use this output as the bump directory. Do I need a new hyb-3dvar_gfs_aero.yaml file as well? If so, I’d like to see an example of what it looks like.
