Error PPOTS

elaheh
Posts: 10
Joined: Mon Jun 03, 2013 10:24 am

Error PPOTS

Post by elaheh »

As you know, running VMC and DMC is time-consuming. My jobs are usually killed because of the time limits on the clusters, and I have to continue running them. But sometimes they stop with "ERROR : READ_PPOTS. Error opening c_pp.data file", even though I do not change anything and just continue running the program. Then, I start running them with copying c_pp.data and pasting it into the folder including input files.

I do not know what is happening or whether it is normal. Is the only way to prevent this kind of error to copy and paste the *.data files every time?

Kind regards,

Elaheh.
Mike Towler
Posts: 239
Joined: Thu May 30, 2013 11:03 pm
Location: Florence

Re: Error PPOTS

Post by Mike Towler »

Hi Elaheh,

I can't answer these two questions properly without knowing how jobs are run on your cluster, or what you're doing with your files. Are you using runqmc to run them on a properly set-up computer (what's the CASINO_ARCH?)? Are there pseudopotential files that are readable by a human in the directory containing your input files?

Is it possible your supposed pseudopotential files are broken symbolic links?

As for time limits, it's reasonable to do a short test calculation first to see how long a block of moves takes. Then make sure you don't set the number of moves in the input file to be greater than can be done in the available time. If you really need to do more moves than that, then note that runqmc has an 'auto-continue' option which will split long runs over imposed time limits. Type 'runqmc --help' or see the manual to find out how it works. You might also find the max_cpu_time input keyword useful (which is used by runqmc in the auto-continue procedure).
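
As a rough illustration of that estimate, the sketch below works out how many blocks fit into a wall-time limit once the time per block has been measured in a short test run. The function name max_blocks and all the numbers (48 hours, 900 seconds per block, a 10% safety margin) are assumptions made up for the example; none of this is a CASINO or runqmc feature.

```shell
# Hypothetical helper (not part of CASINO): number of whole blocks that fit
# in a wall-time limit, keeping a safety margin for start-up and file I/O.
#   $1 = wall-time limit in hours
#   $2 = measured seconds per block (from a short test run)
#   $3 = safety margin in percent
max_blocks() {
    limit_h=$1
    sec_per_block=$2
    margin_pct=$3
    # Usable seconds after subtracting the margin (integer arithmetic).
    usable=$(( limit_h * 3600 * (100 - margin_pct) / 100 ))
    echo $(( usable / sec_per_block ))
}

# Example values only: 48 h limit, 900 s per block, 10% margin.
max_blocks 48 900 10
```

The result is an upper bound on what to ask for in a single job; anything beyond it has to go into a continuation run.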

Mike

Re: Error PPOTS

Post by elaheh »

Hi Mike,
>I can't answer these two questions properly without knowing how jobs are run on your cluster, or what you're doing with your files.
I am running DMC calculations by using dmc_stats runtype and have to do more moves.
>Are you using runqmc to run them on a properly set-up computer (what's the CASINO_ARCH?)?
I am using runqmc -p 512 -T 48h --shmem. Time limitation in the cluster is 48 hours and CASINO_ARCH is "linuxpc-ifort-sge-parallel.polaris". I had the same problem with the CASINO_ARCH "linuxpc-ifort-lsf-parallel.lancaster".
>Are there pseudopotential files that are readable by a human in the directory containing your input files?
Yes. I am using c_pp.data and c_pp.awfn files which are always in the directory and readable.
>Is it possible your supposed pseudopotential files are broken symbolic links?
I do not know. How can I check it?

Thanks.

Elaheh.

Re: Error PPOTS

Post by Mike Towler »

>I am running DMC calculations by using dmc_stats runtype and have to do more moves.
So do more moves. You can do this using the auto-continue feature of runqmc in the way I suggested.
>Yes. I am using c_pp.data and c_pp.awfn files which are always in the directory and readable.
First off, CASINO doesn't read c_pp.awfn files unless you're doing a single atom calculation with the orbitals tabulated on a grid and this is the main wave function file (in which case it should be renamed awfn.data).

As for the c_pp.data, the error message was clear enough: it means that there was an error opening the file. It doesn't mean that the file doesn't exist - since then CASINO would just treat the atom as all-electron. It doesn't mean that the format of the file is wrong, as it hasn't started to formally read it yet. Any pseudopotential files are read once only at the start of the run.

I don't understand what you mean when you say "Then, I start running them with copying c_pp.data and pasting it into the folder including input files." First of all, what's all this 'copying and pasting'? You're not using Windows, are you? At what point are you doing this copying and pasting?

Could you just tell me:

(1) what files are present in the directory before you type the runqmc command?
(2) At this point are you able to e.g. 'vi c_pp.data' and see a correctly formatted pseudopotential file like the one in the manual or in the online pseudopotential library?
(3) Are any of the files symbolic links, and if so where do they point to? (see below)
(4) You then type runqmc, and then what? Sometimes it starts and runs out of time? And sometimes it stops immediately with 'Error opening c_pp.data'? If so, what's the difference in setup between the two cases?
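
Questions (1) and (2) can be partly checked from the shell before typing runqmc. The following is a minimal sketch assuming a POSIX shell; classify_file is a hypothetical helper (not part of CASINO) and the two file names in the loop are only examples.

```shell
# Hypothetical helper (not part of CASINO): report whether a file is
# missing, unreadable by the current user, empty, or apparently fine.
classify_file() {
    if [ ! -e "$1" ]; then
        echo "missing"       # also reported for a broken symbolic link
    elif [ ! -r "$1" ]; then
        echo "unreadable"
    elif [ ! -s "$1" ]; then
        echo "empty"         # exists but has zero length
    else
        echo "ok"
    fi
}

# Example: check two of the files in the run directory.
for f in input c_pp.data; do
    echo "$f: $(classify_file "$f")"
done
```

This only says whether the file can be opened by the user on the machine where the check is run; it says nothing about whether the contents are a correctly formatted pseudopotential.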

Weird file errors are almost always due to (1) users filling up a disk or running over their disk quota (are you? Try 'quota' and 'df'), or (2) Windows-specific newline characters appearing in files which for some reason are in DOS format. However, these days runqmc should automatically convert them back to the correct Unix format using the built-in dos2unix utility. Note such files can also have their origin on Apple machines e.g. there is a bug in the 'Apple Mail' program, which results in the newlines of plain text files sent as attachments being changed to DOS format by some Unix email clients (such as pine/alpine). So just to check: what operating system are you using? Where did you get your pseudopotential files?
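
For the second of those causes, a quick check for DOS line endings can be sketched as follows. count_cr_lines is a hypothetical helper name, and the check assumes a POSIX shell with grep.

```shell
# Hypothetical helper: count the lines of a file that contain a carriage
# return character, which is the signature of DOS-format line endings.
count_cr_lines() {
    cr=$(printf '\r')
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so mask that to keep the function's status clean.
    grep -c "$cr" "$1" || true
}

# Example: only run the check if the file is actually there and readable.
if [ -r c_pp.data ]; then
    echo "lines with carriage returns: $(count_cr_lines c_pp.data)"
fi
```

A count of 0 means the file is in Unix format. For the first cause, 'df -h .' shows the free space on the filesystem holding the current directory.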
>I do not know. How can I check it?
To check if a file is a symbolic link, type 'ls -CF' - links will have an @ character after them e.g. c_pp.data@ . If it is, then type 'ls -l' to see where the link points to e.g. lrwxrwxrwx 1 mdt26 tcm 25 Jun 27 2013 c_pp.data -> /my/arse/c_pp.data
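
The same check can be scripted so that it also says whether the link target still exists. This is a sketch only: check_link is a hypothetical helper name, and it assumes a shell where the readlink utility is available (as it is on typical Linux systems).

```shell
# Hypothetical helper: say whether a path is a symbolic link, and if so
# whether the target it points to actually exists.
check_link() {
    f=$1
    if [ -L "$f" ]; then
        # -e follows the link, so it is false when the target is gone.
        if [ -e "$f" ]; then
            echo "$f: symlink -> $(readlink "$f")"
        else
            echo "$f: BROKEN symlink -> $(readlink "$f")"
        fi
    elif [ -e "$f" ]; then
        echo "$f: not a symlink"
    else
        echo "$f: missing"
    fi
}

# Example: check the pseudopotential file in the current directory.
check_link c_pp.data
```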

Re: Error PPOTS

Post by elaheh »

Thank you for your advice.
> I don't understand what you mean when you say "Then, I start running them with copying c_pp.data and pasting it into the folder including input files." First of all, what's all this 'copying and pasting'? You're not using Windows, are you? At what point are you doing this copying and pasting?
I mean that I copy xx_pp.data from the CASINO library and paste it into my directory. I am using a Linux system, not Windows.
>(1) what files are present in the directory before you type the runqmc command?
input, correlation.data, xx_pp.data, pwfn.data, bwfn.data.bin and config.in (is present for continuing my calculation)
>(2) At this point are you able to e.g. 'vi c_pp.data' and see a correctly formatted pseudopotential file like the one in the manual or in the online pseudopotential library?
Yes, I can read it.
>(3) Are any of the files symbolic links, and if so where do they point to? (see below)
None of them are symbolic links.
>(4) You then type runqmc, and then what? Sometimes it starts and runs out of time? And sometimes it stops immediately with 'Error opening c_pp.data'? If so, what's the difference in setup between the two cases?
I am using runqmc -p 512 -T 48h. When it runs out of time, .err and .runqmc.lock are present in the directory, whereas they are not created when it stops with the error. I do not know if I answered your question.
>Weird file errors are almost always due to (1) users filling up a disk or running over their disk quota (are you? Try 'quota' and 'df'), or (2) Windows-specific newline characters appearing in files which for some reason are in DOS format.
The disk is not full. I checked it before.

Regards,

Elaheh.

Re: Error PPOTS

Post by Mike Towler »

>I am using runqmc -p 512 -T 48h. When it runs out of time, .err and .runqmc.lock are present in the directory, whereas they are not created when it stops with the error. I do not know if I answered your question.
No, maybe I'm being dense, but I don't think you did. Let's try again.

First of all, note that running out of time is not an error. It just means that you've asked CASINO to do too many moves in the time available (possibly exacerbated by e.g. having too many configs). Make sure you do a proper estimate of how long N moves takes before you do a serious calculation. There are also automatic continuation options available in runqmc. Let's not talk about this any more.

More importantly, the pseudopotential error:

When does it stop with the pseudopotential error? At the beginning of the calculation or when you restart after you've run out of time?

Is it the case that a calculation with the same set of input files can sometimes work (i.e. get through the pseudopotential read at the start of the calculation) and sometimes not - this seems to be what you are implying?

Can you attach the contents of your pseudopotential file to your next post?

Mike

Re: Error PPOTS

Post by elaheh »

>When does it stop with the pseudopotential error? At the beginning of the calculation or when you restart after you've run out of time?
It usually stops when I restart after running out of time.
>Is it the case that a calculation with the same set of input files can sometimes work (i.e. get through the pseudopotential read at the start of the calculation) and sometimes not - this seems to be what you are implying?
Yes.
>Can you attach the contents of your pseudopotential file to your next post?
It is attached.

Thanks,

Elaheh.
Attachment: c_pp.data.gz (49.63 KiB)

Re: Error PPOTS

Post by Mike Towler »

>It usually stops when I restart after running out of time.
OK - when you say 'usually', do you mean that sometimes it stops at the beginning of the calculation? If it does, then the whole business of restarts is irrelevant, and your question can be boiled down to 'Why does CASINO sometimes stop when it tries to open my pseudopotential file?'

Is that a correct statement?
>CASINO_ARCH is "linuxpc-ifort-sge-parallel.polaris". I had the same problem with the CASINO_ARCH "linuxpc-ifort-lsf-parallel.lancaster".
These are two physically different machines, correct?

If you see the same error on two different machines then it's likely to be a CASINO problem, or a compiler problem, rather than some weird problem with your computer.

Could you tell me the output of 'ifort --version' on the two machines?

Are there any other compilers available on either machine?
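
One way to look for other compilers is to ask the shell whether their commands are on the PATH. This is only a sketch: list_compilers is a hypothetical helper, the compiler names passed to it are just common examples, and on many clusters compilers are provided through an environment-modules system and will not show up until the relevant module is loaded.

```shell
# Hypothetical helper: for each command name given, report where it is
# found on the PATH, or that it is not found.
list_compilers() {
    for c in "$@"; do
        if command -v "$c" >/dev/null 2>&1; then
            echo "$c: $(command -v "$c")"
        else
            echo "$c: not found"
        fi
    done
}

# Example: a few commonly used Fortran compiler command names.
list_compilers ifort gfortran pgf90
```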

Your pseudopotential file looks fine by the way.

Re: Error PPOTS

Post by elaheh »

> OK - when you say 'usually', do you mean that sometimes it stops at the beginning of the calculation? If it does, then the whole business of restarts is irrelevant, and your question can be boiled down to 'Why does CASINO sometimes stop when it tries to open my pseudopotential file?'
>Is that a correct statement?
Sorry for the confusion. I mean that there is never a problem at the beginning of the calculation, but it occasionally stops when I restart it after running out of time, and sometimes I can restart it without any error.
> CASINO_ARCH is "linuxpc-ifort-sge-parallel.polaris". I had the same problem with the CASINO_ARCH "linuxpc-ifort-lsf-parallel.lancaster".
> These are two physically different machines, correct?
Yes, that is right.
> If you see the same error on two different machines then it's likely to be a CASINO problem, or a compiler problem, rather than some weird problem with your computer.
> Could you tell me the output of 'ifort --version' on the two machines?
For "linuxpc-ifort-lsf-parallel.lancaster", ifort --version is "ifort (IFORT) 12.1.0 20111011" and for "linuxpc-ifort-sge-parallel.polaris" it is "ifort (IFORT) 12.1.5 20120612".
> Are there any other compilers available on either machine?
Sorry, I do not know. How can I check it?

Re: Error PPOTS

Post by Mike Towler »

>Sorry for the confusion. I mean that there is never a problem at the beginning of the calculation, but it occasionally stops when I restart it after running out of time, and sometimes I can restart it without any error.
Describe to me exactly the (presumably manual) process you use to restart the calculation.