Equilibration vs statistics accumulation steps in DMC

Post by **Vladimir_Konjkov** » Sun May 31, 2015 9:12 am

Hi, all.

CASINO suggests that a DMC calculation proceeds as follows:
we do n1 equilibration steps, and then n2 statistics accumulation steps.

So, let's assume that the DMC energy E(n) depends on the step number n as follows:
E(n) = E(infinity) * (1 + exp(-alpha*n)) + Noise(n)

where:
(1 + exp(-alpha*n)) describes the equilibration process
and
Noise(n) - Gaussian noise
Mean(Noise(n)) = 0
Var(Noise(n)) is independent of n.

After n1 equilibration steps, accumulating statistics from step n1 up to step n2 (so n2 here counts the final step), we get:

E = SUM[E(n)]<from n1 to n2>/(n2-n1)

Averaging over the ensemble of realizations:

Mean(E) = E(infinity) * (1 + SUM[exp(-alpha*n)]<from n1 to n2>/(n2-n1)) + Mean(Noise(n))

Recall that Mean(Noise(n)) = 0, so we get:

Mean(E) = E(infinity) * (1 + SUM[exp(-alpha*n)]<from n1 to n2>/(n2-n1))

That is, we always have a biased estimate of E(infinity), with:
bias(n1, n2) = E(infinity) * SUM[exp(-alpha*n)]<from n1 to n2>/(n2-n1)

The question is how n1 and n2 should be related in order to keep the bias below the statistical error of the mean:

|bias(n1, n2)| < sqrt(Var(Noise(n))/(n2-n1))

I'm not sure that the noise is Gaussian and that the equilibration is exponential, but who knows, maybe n1/n2 has to be constant?
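The model above can be explored numerically. The following is a minimal sketch, assuming exactly the model E(n) = E(infinity) * (1 + exp(-alpha*n)) + Noise(n) with uncorrelated noise of standard deviation sigma, so that the statistical error of the mean is sigma/sqrt(N). The function names and the values of alpha and sigma are made up for illustration, and the averaging window is taken as n1 <= n < n2.

```python
import math

def bias(n1, n2, alpha, e_inf=1.0):
    """Bias of the mean over steps n1 <= n < n2 for the assumed model
    E(n) = e_inf * (1 + exp(-alpha*n)) + noise (geometric series in closed form)."""
    n_acc = n2 - n1
    series = (math.exp(-alpha * n1) * (1.0 - math.exp(-alpha * n_acc))
              / (1.0 - math.exp(-alpha)))
    return e_inf * series / n_acc

def standard_error(n1, n2, sigma):
    """Statistical error of the mean, assuming uncorrelated noise."""
    return sigma / math.sqrt(n2 - n1)

def min_equilibration(n_acc, alpha, sigma, e_inf=1.0):
    """Smallest n1 for which |bias| drops below the statistical error
    when n_acc accumulation steps follow the equilibration."""
    n1 = 0
    while abs(bias(n1, n1 + n_acc, alpha, e_inf)) >= standard_error(n1, n1 + n_acc, sigma):
        n1 += 1
    return n1

# Hypothetical parameters: relaxation rate 0.01 per step, noise sigma 0.1.
for n_acc in (10**3, 10**4, 10**5, 10**6):
    print(n_acc, min_equilibration(n_acc, alpha=0.01, sigma=0.1))
```

Within this model the bias falls off like 1/N while the statistical error falls off only like 1/sqrt(N), so the required n1 shrinks slowly (logarithmically) as the accumulation run gets longer; a constant n1/n2 ratio is not implied by these assumptions.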

Vladimir

Post by **Mike Towler** » Sun May 31, 2015 4:13 pm

Hi Vladimir,

Just being dragged out of house by family to go out for dinner.

Some aspects of these questions are addressed in my talk from last year's TTI conference:

http://www.tcm.phy.cam.ac.uk/~mdt26/tti ... ti2014.pdf

Have a read of that and let me know what you think - see the stuff about empirical convergence at the end.. I accidentally deleted the last few pages of the talk not long before I gave it so the slides are incomplete.

For the stuff about Gaussian noise, see John Trail's two papers about the central limit theorem etc..

More when I get back.

Mike

Post by **Vladimir_Konjkov** » Mon Jun 01, 2015 9:24 am

Mike Towler wrote:Hi Vladimir,

For the stuff about Gaussian noise, see John Trail's two papers about the central limit theorem etc..

Mike

Hi, Mike

http://journals.aps.org/pre/abstract/10 ... .77.016703
http://journals.aps.org/pre/abstract/10 ... .77.016704

The articles are very helpful; they answered many questions that I had not even asked.

Vladimir.

Post by **Vladimir_Konjkov** » Mon Jun 01, 2015 4:06 pm

Mike Towler wrote:Hi Vladimir,
Some aspects of these questions are addressed in my talk from last year's TTI conference:

http://www.tcm.phy.cam.ac.uk/~mdt26/tti ... ti2014.pdf

Have a read of that and let me know what you think - see the stuff about empirical convergence at the end.. I accidentally deleted the last few pages of the talk not long before I gave it so the slides are incomplete.

Although I have not tried it in practice, the stop_method keyword looks cool, but I would like more control over the process than just setting target error = 0.005.

For example, do you know why some programs cannot converge the f-ane test to the lower-energy state with broken symmetry, while others can?
Programs do not always automatically select the best way to converge, but by hand I can do better, provided of course that one is allowed to do something by hand.
I apologize if my words seem sarcastic.

With great respect, Vladimir.

Post by **Mike Towler** » Thu Jun 04, 2015 11:15 am

Hi Vladimir,

Although I have not tried it in practice, the stop_method keyword looks cool,

Glad to hear it! However, you're unlikely to be able to try it in practice, as it only exists in my private development version of CASINO. I got about 95% of the way through implementing it in preparation for the TTI conference last year, and after that I've just got involved in one thing or another, and I haven't got around to finishing it and putting it in the main distribution yet. Hopefully very soon (certainly within the next two or three weeks..).

but I would like more control over the process than just setting target error = 0.005.

So how would you do it differently? (Seriously, as I'm about to finish the implementation, now would be a good time to tell me..)

For example, do you know why some programs cannot converge the f-ane test to the lower-energy state with broken symmetry, while others can? Programs do not always automatically select the best way to converge, but by hand I can do better, provided of course that one is allowed to do something by hand.

You're not the first person to encounter this problem. I seem to recall Mike Deible having a similar issue with f-ane and the PSI-4 code.

You know, of course, that a converged HF wave function is guaranteed to be at a stationary point in the space of orbital rotations, and this is usually but not necessarily at a minimum. If it is a minimum then all orbital rotations will increase the energy. However, if the 2nd derivative of the energy with respect to orbital rotations is negative, then there exist certain orbital rotations which will lower the energy (often by breaking the point group symmetry, which may or may not be imposed in any given calculation).

You can analyze the stability of your solution by diagonalizing the matrix of second derivatives with respect to orbital rotations. Zero eigenvalues (with possible noise) indicate degeneracies, negative eigenvalues indicate instabilities which can be followed by applying the corresponding orbital rotation. So basically, there are lots of tricks you can do, and each code may or may not be able to do them (automatically or otherwise), and will often require a particular keyword to be set. For example, for the Gaussian09 program, you use the 'Stable' keyword:

http://www.gaussian.com/g_tech/g_ur/k_stable.htm
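As a toy illustration of the stability test described above (not how any particular quantum chemistry code implements it), the sketch below diagonalizes a made-up 2x2 symmetric "orbital-rotation Hessian" and classifies its eigenvalues. The matrix entries, the tolerance and the function names are all hypothetical; a real calculation would build the Hessian from the converged orbitals and use a general eigensolver.

```python
import math

def eig_sym_2x2(a, b, c):
    """Eigenpairs (value, unit vector) of [[a, b], [b, c]], lowest eigenvalue first."""
    if b == 0.0:
        return sorted([(a, (1.0, 0.0)), (c, (0.0, 1.0))])
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    pairs = []
    for lam in (mean - radius, mean + radius):
        # (b, lam - a) solves (H - lam*I) v = 0 whenever b != 0
        vx, vy = b, lam - a
        norm = math.hypot(vx, vy)
        pairs.append((lam, (vx / norm, vy / norm)))
    return pairs

def classify(lam, tol=1e-6):
    """Label an eigenvalue of the second-derivative matrix."""
    if lam < -tol:
        return "unstable"      # following this rotation lowers the energy
    if abs(lam) <= tol:
        return "near-zero"     # possible degeneracy (zero up to noise)
    return "stable"            # this rotation raises the energy

# Made-up Hessian elements, chosen so that one eigenvalue is negative.
hessian = (0.5, 0.8, 0.3)
for lam, vec in eig_sym_2x2(*hessian):
    print(f"eigenvalue {lam:+.4f} ({classify(lam)}), rotation direction {vec}")
```

The eigenvector belonging to a negative eigenvalue gives the direction of the orbital rotation to follow away from the saddle point.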

I think by now you've used many more quantum chemistry codes than I have, so I'm probably not the best person to give you specific advice for individual codes, but of course Google and the code authors are your friends (and Mike D. and Albert D. and their colleagues might be able to help..).

I apologize if my words seem sarcastic.

I think Russian sarcasm must be too subtle for me, I missed that..

Post by **Vladimir_Konjkov** » Thu Jun 04, 2015 12:12 pm

Mike Towler wrote:
For example, do you know why some programs cannot converge the f-ane test to the lower-energy state with broken symmetry, while others can? Programs do not always automatically select the best way to converge, but by hand I can do better, provided of course that one is allowed to do something by hand.
You know, of course, that a converged HF wave function is guaranteed to be at a stationary point in the space of orbital rotations, and this is usually but not necessarily at a minimum. If it is a minimum then all orbital rotations will increase the energy. However, if the 2nd derivative of the energy with respect to orbital rotations is negative, then there exist certain orbital rotations which will lower the energy (often by breaking the point group symmetry, which may or may not be imposed in any given calculation).

You can analyze the stability of your solution by diagonalizing the matrix of second derivatives with respect to orbital rotations. Zero eigenvalues (with possible noise) indicate degeneracies, negative eigenvalues indicate instabilities which can be followed by applying the corresponding orbital rotation. So basically, there are lots of tricks you can do, and each code may or may not be able to do them (automatically or otherwise), and will often require a particular keyword to be set. For example, for the Gaussian09 program, you use the 'Stable' keyword:

Sorry Mike, I need to be more precise. When you use HCORE (the eigenvectors of the one-electron Hamiltonian) as the initial guess in an SCF HF calculation, you always get the wrong energy state in the f-ane test (wrong level ordering), in any program, and the energy is the same in all programs, regardless of whether you use the 'Stable' keyword or not. But in CFOUR you can choose only HCORE or MOREAD (which means reading from a previous calculation). What can I do about this?

but I would like more control over the process than just setting target error = 0.005.
So how would you do it differently? (Seriously, as I'm about to finish the implementation, now would be a good time to tell me..)

I ran a few calculations in CASINO, and everything works fine: Jastrow optimisation, pseudopotentials, DMC, force calculation with VMC.
I don't know what to say about the target error. Let it be.
I think that we can always choose the VMC decorrelation period so that the data will not be autocorrelated. I always do this.
As for DMC, we know the correlation length precisely from VMC, so reblocking is redundant.

PS.
I just read the article Physical Review E 77, 016703 (2008), "Heavy-tailed random error in quantum Monte Carlo", which lists 4 sources of variance in QMC calculations:
1. Nuclear cusp condition
2. e-e cusp condition
3. quality of the nodal surface (backflow, multideterminant, etc)
4. asymptotic behavior of the wave function at infinity (I think it's wrong for a Gaussian basis set).
I'm trying to understand which contributes most and what to do about the 4th item.

PPS.
Perhaps it would be better if the reblock utility saved the reblocked forces to a file, but one can always use

Code:

$reblock | tee filename

With respect, Vladimir.

Post by **Mike Towler** » Sun Jun 07, 2015 10:01 am

But in CFOUR you can choose only HCORE or MOREAD (which means reading from a previous calculation). What can I do about this?

No experience with CFOUR at all, but what about starting with a slightly distorted nuclear framework? Move one of the atoms by a very small amount so it isn't covered by the point group symmetry operations, then the SCF can converge to a broken symmetry state. You could then use this as a starting point with MOREAD in a second calculation where the atoms are in their normal positions?

I think that we can always choose the VMC decorrelation period so that the data will not be autocorrelated. I always do this.

Yeah, but you don't know the decorrelation period in advance, and if you just jack it up to some large value you can increase the computer time unnecessarily. As you know, with current versions of CASINO both VMC and DMC print the energies with error bars automatically corrected for serial correlation (using both on-the-fly reblocking and 'the correlation time method' i.e. multiplying the error bar by the square root of the average correlation time). These two techniques should give approximately the same corrected error bar. For example:

Code:

 VMC energy (au)    Standard error      Correction for serial correlation

 -6.299284243152 +/- 0.002175524045      No correction
 -6.299284243152 +/- 0.003484863685      Correlation time method
 -6.299284243152 +/- 0.003659126872      On-the-fly reblocking method
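The two corrections can be illustrated on synthetic data. This is a rough sketch, not CASINO's implementation: it assumes an AR(1) series as a stand-in for serially correlated local energies, a fixed block length, and a correlation-time sum truncated at the first non-positive autocorrelation. All names and parameters here are hypothetical.

```python
import math
import random

def naive_error(x):
    """Standard error of the mean, ignoring serial correlation."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / (n - 1)
    return math.sqrt(var / n)

def reblocked_error(x, block_len):
    """Standard error computed from the means of consecutive blocks."""
    n_blocks = len(x) // block_len
    blocks = [sum(x[i * block_len:(i + 1) * block_len]) / block_len
              for i in range(n_blocks)]
    return naive_error(blocks)

def correlation_time(x):
    """Integrated correlation time tau = 1 + 2*sum_k rho(k), with the sum
    cut off at the first non-positive rho(k) (an assumed truncation rule)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    c0 = sum(v * v for v in d) / n
    tau = 1.0
    for k in range(1, n // 2):
        rho = sum(d[i] * d[i + k] for i in range(n - k)) / (n * c0)
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return tau

# Synthetic correlated data: x_t = phi * x_{t-1} + Gaussian noise.
rng = random.Random(12345)
phi, value, series = 0.8, 0.0, []
for _ in range(2**14):
    value = phi * value + rng.gauss(0.0, 1.0)
    series.append(value)

naive = naive_error(series)
corr_time_err = naive * math.sqrt(correlation_time(series))
reblock_err = reblocked_error(series, block_len=256)
print("no correction:          ", naive)
print("correlation time method:", corr_time_err)
print("reblocking method:      ", reblock_err)
```

For a series like this the two corrected error bars should come out close to each other and clearly larger than the uncorrected one, which is the pattern in the CASINO output above.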

As for DMC, we know the correlation length precisely from VMC, so reblocking is redundant.

You have to use a much smaller timestep in DMC - in terms of number of steps, the correlation length is much longer.

Perhaps it would be better if the reblock utility saved the reblocked forces to a file

I'll add this to the list..

The CASINO forum
