parallel version breaks on multi determinant WFN

General discussion of the Cambridge quantum Monte Carlo code CASINO; how to install and setup; how to use it; what it does; applications.

Post by Vladimir_Konjkov » Fri Dec 29, 2017 7:23 pm

Hello CASINO developers. After upgrading from v2.13.639 to v2.13.673, I've found that the newest version fails when running in parallel mode with ANY multi-determinant WFN, with the following message:

--Job's stderr--

[vladimir-Kubuntu-16:17932] *** An error occurred in MPI_Bcast
[vladimir-Kubuntu-16:17932] *** reported by process [234618881,2]
[vladimir-Kubuntu-16:17932] *** on communicator MPI_COMM_WORLD
[vladimir-Kubuntu-16:17932] *** MPI_ERR_TRUNCATE: message truncated
[vladimir-Kubuntu-16:17932] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[vladimir-Kubuntu-16:17932] *** and potentially your MPI job)
[vladimir-Kubuntu-16:17924] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[vladimir-Kubuntu-16:17924] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages


What could be the cause of the error?

Best, Vladimir.

Single-determinant WFNs work fine in both versions.
I compile CASINO with linuxpc-gcc-parallel.openblas.arch.
In Soviet Russia Casino play you.
Vladimir_Konjkov
 
Posts: 115
Joined: Wed Apr 15, 2015 3:14 pm

Re: parallel version breaks on multi determinant WFN

Post by Neil Drummond » Sat Dec 30, 2017 10:07 pm

Dear Vladimir,

Thanks for the report. Do you have an example that fails? (The MDET examples in CASINO/examples/TEST seem to work OK, at least with the gfortran and NAG compilers.)

Best wishes,

Neil.
Neil Drummond
 
Posts: 79
Joined: Fri May 31, 2013 10:42 am
Location: Lancaster

Re: parallel version breaks on multi determinant WFN

Post by Vladimir_Konjkov » Sun Dec 31, 2017 1:33 am

Hello Neil.

My example is in the attachment. I'm still using the old version, which works correctly.

Vladimir.

Happy New Year!!!!
Attachments
MPI_issue.tgz (example, 22.55 KiB)

Re: parallel version breaks on multi determinant WFN

Post by Neil Drummond » Sun Dec 31, 2017 10:55 pm

Dear Vladimir,

Thanks very much for reporting the problem and sorry for any inconvenience. The bug was introduced in 2.13.650. The issue is that mdet_max_mods needs to be broadcast before it is used in READGW in gaussians.f90. I've attached the git patch that I've just sent to Mike.
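
For readers unfamiliar with this failure mode, here is a toy illustration in plain Python. It is not CASINO code and not the attached patch; it only simulates a collective broadcast without any real MPI library. The function names, the default size of 0 on non-root ranks, and the `TruncateError` class are all assumptions made for the sketch. The point it tries to show is that a size known only on the root process must itself be broadcast before it is used as the element count of a later broadcast; otherwise the other ranks expect a shorter message than the root sends, which is the situation MPI reports as MPI_ERR_TRUNCATE.

```python
class TruncateError(Exception):
    """Stands in for MPI_ERR_TRUNCATE in this toy model."""


def bcast(root_data, counts, root=0):
    """Simulate broadcasting root_data from `root` to every rank.

    counts[r] is the element count that rank r passes to the call.
    A rank whose count is smaller than the message fails with truncation.
    """
    sent = root_data[:counts[root]]
    received = []
    for rank, count in enumerate(counts):
        if count < len(sent):
            raise TruncateError(
                f"rank {rank}: expected {count} elements, root sent {len(sent)}"
            )
        received.append(list(sent))
    return received


def read_buggy(nranks, size_on_root, payload):
    # Bug pattern: only the root knows the size (it read the input file).
    # The other ranks still hold a default value (assumed to be 0 here).
    sizes = [size_on_root] + [0] * (nranks - 1)
    return bcast(payload, sizes)


def read_fixed(nranks, size_on_root, payload):
    # Fix pattern: broadcast the size first (one element on every rank),
    # then use the agreed size as the count for the data broadcast.
    size_per_rank = bcast([size_on_root], [1] * nranks)
    sizes = [s[0] for s in size_per_rank]
    return bcast(payload, sizes)
```

In this sketch, `read_buggy(4, 3, [1.0, 2.0, 3.0])` raises `TruncateError` on the first non-root rank, while `read_fixed` with the same arguments delivers the full payload to all four ranks.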

Happy New Year!

Best wishes,

Neil.
Attachments
0001-Fixed-bug-affecting-Gaussian-and-Slater-type-multide.patch.gz (1.24 KiB)

Re: parallel version breaks on multi determinant WFN

Post by Mike Towler » Mon Jan 01, 2018 11:50 am

Neil's fix is now in the public distribution.

Happy New Year to all!

M.
Mike Towler
 
Posts: 233
Joined: Thu May 30, 2013 11:03 pm
Location: Florence
