Write a CASINO interface to your code

For real systems containing atoms CASINO requires input from another program – such as a Hartree-Fock, DFT, or quantum chemistry code – to define a ‘trial wave function’ which the DMC process is then supposed to improve. Quantum Monte Carlo codes therefore require the development and long-term support of standard interfaces with multiple external programs.

The current status of CASINO’s links with all the external codes that we supposedly support can be found here.

This page exists to remind developers that defining such an interface can actually be pretty easy. For someone who knows the internals of the external program, or who has at their disposal a formal description of its output, an interface to CASINO can be constructed in only a day or so of reasonably hard work. We would therefore like to encourage you to do so, and we provide some of the technical information and advice that you will need here.

Essentially one just needs to write a little bit of general information plus the geometry, the basis set, determinant data, and the orbital coefficients to a file – in one of four standard formats. Your choice of format reflects the basis set you wish to use.

The four file formats are pwfn.data (plane-waves), bwfn.data (blips), gwfn.data (Gaussians), and stowfn.data (Slater functions). External codes generally don’t use blip function basis sets; rather, bwfn.data files are produced by a mathematical transformation of the numbers in a pwfn.data file. The blip transformation can also be done internally by the external code (as is done currently by PWSCF). Another format exists in which the orbitals are represented as function values on a grid, but this is only suitable for atoms and diatomics and we shall not consider it further here.

These files can be produced either by modifying the external program directly to write CASINO-formatted output (preferred) or – perhaps if you do not have permission to alter the external code – by writing a utility to convert its standard output into CASINO format. Such utilities can then be kept in the official CASINO distribution for everyone to use.

Here are some (hopefully self-explanatory) examples of the four basis set types with comments. The files are truncated in that, in long sequences of identical data such as plane-wave G vectors or orbital coefficients, only a few examples are shown with the word <snip> indicating where irrelevant data has been removed:

Plane-wave pwfn.data example
Blip bwfn.data example
Gaussian gwfn.data example
Slater stowfn.data example

Full untruncated examples of all the above for various different systems can be found in the CASINO examples directory, and their formal specification is given in the relevant section of the CASINO manual.

Before we get onto specifics, make sure you don’t lose precision in your write statements. So if some internal number a=1.2345678901234 exists as a double precision real, then write it like that with all the significant digits. Don’t attempt to do e.g. write(6,'(f12.8)') to give a=1.23456789 in order to ‘make it look neater’. As CASINO does all sorts of internal checking it will probably conclude that some numbers which are supposed to have some sort of relationship with each other now no longer do so, and it will stop with an error.
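If your interface is a stand-alone converter utility rather than a modification of the Fortran code itself, the same advice applies. The following is a minimal Python sketch, not part of CASINO: the helper name format_double is hypothetical, and the 17-significant-digit exponent format is just one choice that is guaranteed to round-trip an IEEE double.

```python
def format_double(x: float) -> str:
    """Format a double with 17 significant digits (hypothetical helper).

    17 significant decimal digits are enough to recover any IEEE
    double-precision value exactly when the text is read back.
    """
    return f"{x:.16e}"


a = 1.2345678901234

full = format_double(a)   # keeps every significant digit
short = f"{a:12.8f}"      # Python analogue of Fortran's f12.8: digits are lost

# Reading the full-precision string back recovers the original number;
# reading the 'neater' one does not.
assert float(full) == a
assert float(short) != a
```

Whatever language you use, the test is the same: the number read back from the file should be identical to the one held in memory.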

Other helpful advice on interface design is now given for each of the four basis sets in turn.

Plane-waves (pwfn.data) and blips (bwfn.data)

Writing a plane-wave file should be trivial. Here is a tar file of the relevant routines which write the pwfn.data file from the CASTEP, PWSCF, GP, and ABINIT interfaces – you can just pick your favourite and copy the relevant bits into your own code. More details of these can be found in the corresponding directories in CASINO/utils/wfn_converters in the CASINO distribution.

The only issue which cropped up the last time someone did this was as follows:

You can write the Fourier coefficients as a column of pure real numbers (as they must be if the system has inversion symmetry). If the coefficients are complex with a potentially nonzero imaginary part, they must appear in the pwfn.data file in ‘complex form’, i.e. as (1.45389539530495, 2.2349582304982) with brackets and a comma. This is what you get in Fortran if you just do ‘write(6,*)c’ where c is of complex type. Do not write them as two columns of real numbers, as the CASINO read into vectors of complex type will then end up setting the imaginary part to zero (this is a standard feature of Fortran, of course!).
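For a converter written outside Fortran, the bracketed form has to be produced by hand. Here is a minimal Python sketch of one way to do it; the function name format_coefficient is hypothetical, and the full-precision exponent format is an assumption chosen to avoid the loss of digits warned about earlier. It shows only the formatting of a single coefficient, not the layout of the rest of the pwfn.data file.

```python
def format_coefficient(c: complex) -> str:
    """Return c as '(re,im)': brackets and a comma, one complex value.

    Hypothetical helper. Each part is written with 17 significant
    digits so that no precision is lost.
    """
    return f"({c.real:.16e},{c.imag:.16e})"


c = complex(1.45389539530495, 2.2349582304982)
line = format_coefficient(c)

# The result is a single bracketed token, not two columns of reals.
assert line.startswith("(") and line.endswith(")")
assert line.count(",") == 1
```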

You don’t need to bother with blips to establish an interface with a plane-wave code, since a CASINO-provided utility ‘blip’ will convert a pwfn.data from whatever source into a bwfn.data blip file whenever you like. Should you feel like incorporating a blip converter into the external code, you should study how this is achieved in PWSCF (the source code is in CASINO/utils/wfn_converters/pwscf/pwscf_routines in the CASINO distribution).

You should also read question B10 of the FAQ to get a better understanding of how the various codes deal with unoccupied orbitals (this will be unified in a future release of CASINO).

Gaussians (gwfn.data)

The Gaussian converters are necessarily more complicated.

First find out whether your Gaussian code supports writing of the MOLDEN format. If it does then you may be able to use Mike Deible and Vladimir Konjkov’s converter molden2qmc. Note that the original idea for a MOLDEN converter was that it would produce a gwfn.data file for any quantum chemistry package, instead of writing a converter for each individual code. Unfortunately, not all MOLDEN files are created equal, and so this converter sometimes has to be tailored to particular codes. Each MOLDEN file has its own subtleties.

Does your code use Cartesian Gaussians or harmonic Gaussians? CASINO can only evaluate harmonic Gaussians, so if you’re a Cartesian person you need to do the appropriate transformations first. We currently provide no documentation of how to do this though we hope to do so in the future.

Decide what is the highest angular momentum you want to support (remembering that higher ones are not necessarily as useful in QMC as they are in high-order quantum chemistry methods). In principle CASINO will calculate s, p, d, and f functions for periodic systems, and s, p, d, f, and g for finite systems. (Most Gaussian basis set quantum chemistry codes can’t handle periodic systems). Derivatives of higher angular momentum harmonic Gaussians become incredibly complicated very quickly. The conventions for normalization and how to treat the numerical factors in the real solid harmonics are also complicated and easily confused. These issues are discussed with reference to the CRYSTAL code in Mike Towler’s document here.

The above document can also be found in CASINO/examples/generic/gauss_dfg which contains a ‘test suite’ to verify that higher angular momentum functions are being treated correctly. Any implementer of a new Gaussian interface – if the CASINO developers are to acknowledge the external code as ‘supported’ – must produce four sets of files for the methane molecule: one with only d functions in the basis, one with only f functions, and one with only g, together with a mixed s/p/d/f one such as the methane with avtz basis given in the examples (the latter is necessary since the single-l examples only show it is correct up to an l-dependent scaling factor). A ‘set of files’ in this context means an input and output file for the generating program, a gwfn.data CASINO Gaussian wave function file produced with that input, and a VMC-HF output file produced by CASINO using that gwfn.data. To do this, one should simply change the format of the existing CRYSTAL files in this directory which have suitable exponents and contraction coefficients (the provided CASINO input file may be re-used for the VMC calculations). It is required that the CASINO VMC-HF run with no Jastrow factor reproduces the Hartree-Fock energy reported by the generating program. One should also informally check that the interface works for spin-polarized systems and systems with pseudopotentials.
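Since the VMC energy carries a statistical error bar, ‘reproduces’ in practice means agreement to within that error bar. The sketch below is one way to automate the comparison when running the four test cases; it is an illustration only. The function name energies_agree and the three-sigma default are assumptions rather than a criterion defined by CASINO, and the numbers in the example are made up purely to exercise the function.

```python
def energies_agree(e_vmc: float, vmc_error: float, e_hf: float,
                   n_sigma: float = 3.0) -> bool:
    """Return True if the VMC-HF energy matches the HF energy statistically.

    e_vmc     : VMC energy from a run with no Jastrow factor
    vmc_error : its statistical error bar (one standard error)
    e_hf      : Hartree-Fock energy reported by the generating program
    n_sigma   : tolerance in units of the error bar (assumed default: 3)
    """
    if vmc_error <= 0.0:
        raise ValueError("the error bar must be positive")
    return abs(e_vmc - e_hf) <= n_sigma * vmc_error


# Made-up numbers, not real results: a difference of 2 error bars passes,
# a difference of 10 error bars does not.
assert energies_agree(-1.0002, 0.0001, -1.0000)
assert not energies_agree(-1.0010, 0.0001, -1.0000)
```

A failure in only the single-l cases, or only in the mixed case, usually points at the normalization and scaling conventions discussed above rather than at the file format.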

For reference, the second column of the following tables shows the forms of the real solid harmonics as assumed by CASINO. These are taken from p. 170 of the ‘orange book’ of the CRYSTAL program ‘HF ab initio treatment of crystalline systems’ by C. Pisani, R. Dovesi, and C. Roetti 1988, which is freely downloadable from Springer; the relevant appendix is available here.

One final note: in order for CASINO to provide watertight idiot-proofing for the case where the user forgets to supply the x_pp.data pseudopotential file(s) during a QMC calculation, the external code (and/or the converter utility) needs to flag which atoms are pseudoatoms in the gwfn.data file (recall CASINO can treat systems with a mixture of all-electron atoms and pseudoatoms where necessary). When writing your interface you should make sure this is done by adding 200 to the atomic number of pseudoatoms.
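In a converter utility this rule amounts to a one-line mapping. The following Python sketch illustrates it; the function name casino_atomic_number and the range check are hypothetical, and only the offset of 200 comes from the text above.

```python
PSEUDO_OFFSET = 200  # pseudoatoms are flagged by adding 200 to the atomic number


def casino_atomic_number(z: int, is_pseudo: bool) -> int:
    """Return the atomic number to write to gwfn.data (hypothetical helper).

    All-electron atoms keep their true atomic number z; atoms described
    by a pseudopotential are written as z + 200.
    """
    if not 1 <= z < PSEUDO_OFFSET:
        raise ValueError(f"unexpected atomic number: {z}")
    return z + PSEUDO_OFFSET if is_pseudo else z


# A mixed system: a carbon pseudoatom alongside an all-electron hydrogen.
assert casino_atomic_number(6, is_pseudo=True) == 206
assert casino_atomic_number(1, is_pseudo=False) == 1
```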

Slater (stowfn.data)

As the implementation of this feature was done by a transient postdoc, no additional support is currently provided for Slater functions, beyond what you can find in the specification of the stowfn.data file and the discussion of the ADF code in the manual, and in the various files in the directory CASINO/utils/wfn_converters/adf of the CASINO distribution.

We love your help

Thanks to everyone who helps with maintaining CASINO’s interfaces with external codes – we know how boring it is and we gratefully appreciate your assistance.

If anyone has any additional mantras or advice or other material for this page then please send it to Mike Towler (mdt26@cam.ac.uk).
