Project "Pseudos ABINIT" : global view

Global long-term objective :
G1. To have, on the Web, sets of validated pseudopotentials, for the whole periodic table,
for different exchange-correlation functionals, with different possibilities of semi-core electrons
(e.g. for GW), different cut-off radii (e.g. high pressure application), with an optimal cut-off energy.
G2. To have a Web portal for the generation / validation of new pseudopotentials.


Intermediate objectives (more and more difficult steps):
O1. Robust generation of one pseudopotential, given the atomic number
    (for a given exchange-correlation functional, definition of semi-core electrons, definition of cut-off radii)
    The cut-off energy does not matter at this stage.
O2. Computation of selected physical properties for selected systems, associated with one given pseudopotential
    (automatic computation of the cut-off energy, computation of the total energy, the interatomic distance,
     the lattice parameter of the elemental solid and one oxide, also dimer). Results presented on the Web.
O3. Validation of pseudopotential with respect to a reference
O4. Semi-automatic improvement of the generation of the pseudopotential :
      more accurate physical properties, better energy cut-off
O5. Automatic generation of one set of pseudopotentials, with the associated automatic procedure of calculation and validation.
Also, one should develop tools for 
O6. The transfer to the Web of the different results of the objectives.
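
The "automatic computation of the cut-off energy" in O2 could be sketched as follows.
This is only an illustration: the function name, the tolerance value and the
convergence criterion (total energy within a tolerance of the highest-ecut value)
are assumptions, not something fixed by this project.

```python
def converged_ecut(energies, tol=1.0e-3):
    """Pick a cut-off energy from a scan of total energy versus ecut.

    energies : dict mapping ecut -> total energy (same energy unit as tol).
    tol      : hypothetical convergence tolerance.

    Returns the smallest ecut such that this ecut and all larger ones give a
    total energy within tol of the energy at the largest ecut of the scan.
    """
    ecuts = sorted(energies)
    reference = energies[ecuts[-1]]  # best available estimate of the converged energy
    for ecut in ecuts:
        larger = [e for e in ecuts if e >= ecut]
        if all(abs(energies[e] - reference) <= tol for e in larger):
            return ecut
    return ecuts[-1]


# Example scan (made-up numbers, for illustration only)
scan = {10: -5.0, 15: -5.2, 20: -5.2504, 25: -5.2508, 30: -5.251}
print(converged_ecut(scan))
```

Requiring all larger cut-offs to stay within the tolerance avoids accepting an ecut
whose energy only accidentally crosses the reference value.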

Components (database)
D1. A set of repositories (one for each atomic number), under the Version Control System bazaar, that can be accessed
      by all ABINIT developers as well as buildbot. 
      Access : bzr+ssh://psps@archives.abinit.org/forge/psps/<Z-symbol>
      where <Z-symbol> is the atomic number (three digits) followed by the symbol, with a dash in between, e.g. 001-H,
      or 092-U .
      Each of these repositories is an "ATOM repository", and will contain subdirectories
      of two kinds, see sections D2 and D3.
D2. Structure of one ATOM repository. 
    - A /ReferenceData subdirectory, that is not tied to one pseudopotential.
    - For each pseudopotential :  
      /[PAW|NC|...]/#valence_electrons/<xc_type>/ID , which might be called a "Pseudo-atom Box" or patbox.
    The #valence_electrons might be 4e, or 22e ...
    The <xc_type> might be GGA-PBE or LDA-PW91 or the libxc ID, e.g. X001C012 .
    ID will be a digit, e.g. 1, or 2, etc ...
    These IDs will not have any predefined meaning. Some of the pseudo-atom boxes might be good
    for a specific purpose (e.g. GW or High-pressure), but this will be determined from the
    database of results for each pseudo-atom box, by a script, at the request of a user (or for populating
    a Web page).
D3. Content of the /ReferenceData subdirectory of the "ATOM repository":
      D3a. A (xml ? plist ?) file with the atomic configuration for each possible #valence_electrons, and other data needed for 
           pseudo-atom generators that are not specific to a pseudo-atom generator. Standard name : atomic_config.xml .
      D3b. Possibly, some CIF files for an elemental solid (or more than one), and for oxide(s) or hydride(s), or potassium-based compounds.
      D3c. A set of master data files (xml) containing the description of the different test systems for the specific atom.
           Each test system belongs to a test system class : atom, dimer, elemental, oxide, hydride or potassium-based compound.
           Within each class it is labelled with an ID (number starting from 1). 
           This description contains insulator/metal and magnetism information, and either
           the name of the cif file to be used, or the reference length for the dimer.
           Standard name : <Z-symbol>.description_<class>_ID.xml
D4. Content of one pseudo-atom Box :
      D4a Subdirectories : <name_of_generator>, atom_X, dimer_X, elemental_X, oxide_X, possibly hydride_X or K_X . 
            Where <name_of_generator> might be atompaw, or ape, or fhi98pp, or ...
            And where X is the ID defined in D3c.
      D4b A (xml) summary file containing metadata concerning this pseudo-atom box, obtained by running the applications in the
            different subdirectories, and also describing the validation criteria (this implies a set of runs).
            Standard name : <Z-symbol>.summary.[PAW|NC|...].#valence_electrons.<xc_type>.ID.xml
D5. Content of the <name_of_generator> subdirectory of the pseudo-atom Box
      D5a Optionally, the specific input data needed for the generator (PAW or NC), to complement the content of the atomic_config file.
            Typically cut-off radii.
            Standard name atomic_data_<name_of_generator>.xml, e.g. atomic_data_atompaw.xml
      D5b A pseudo-atom data generator input file (PAW or NC) - might have been automatically generated
            from atomic_config and the file in D5a.
            Standard name <name_of_generator>.in
      D5c A pseudo-atom data file (PAW or NC) - has been automatically generated (output of the atomic generator).
            Standard name <Z-symbol>.pseudoatom.[PAW|NC|...].#valence_electrons.<xc_type>.ID.<standard_postfix_for_the_generator>
            This is the pseudopotential file, or the PAW atomic data file.
            The <standard_postfix_for_the_generator> might be .fhi or .pawps , or other postfix.
D6. Content of the <class>_X (where class is atom, dimer, elemental, oxide, hydride ...)
      D6a Subdirectories abinit_runY and elk_runY, where Y is an integer starting from 1.
D7. Content of the abinit_runY directory
      D7a This is a working directory for one abinit run. It contains an ABINIT input file 
            usually automatically generated from D3c, specialized for the pseudo-atom box and the system.
      D7b For Y=1 : determination of a basic k point grid, using kptrlen and prtkpt. Can be used by elk, see elk_1 .
      D7c For Y=2 : computation of total energy as a function of ecut, for basic k point grid, and, for metals, using the
            tsmear determined by elk_1.
D8. Content of the elk_runY subdirectory 
      D8a This is a working directory for one elk run. It contains an ELK input file 
            usually automatically generated from D3c, specialized for the pseudo-atom box and the system, and
            using the k point grid determined by D7b .
      D8b. For Y=1, determination of the tsmear.
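
The naming conventions of D2, D3c, D4b and D5c can be illustrated by a few small helpers.
This is a minimal sketch: the function names are hypothetical (they are not existing
scripts of psps/script), the patbox path is returned relative to the ATOM repository,
and the generator postfix is assumed to be joined with a single dot.

```python
def patbox_path(kind, n_valence, xc_type, box_id):
    """Relative path of a pseudo-atom box (D2), e.g. PAW/4e/GGA-PBE/1."""
    return "/".join([kind, "%de" % n_valence, xc_type, str(box_id)])


def description_name(z_symbol, test_class, system_id):
    """Standard name of a test system description file (D3c)."""
    return "%s.description_%s_%d.xml" % (z_symbol, test_class, system_id)


def summary_name(z_symbol, kind, n_valence, xc_type, box_id):
    """Standard name of the summary file of a pseudo-atom box (D4b)."""
    return "%s.summary.%s.%de.%s.%d.xml" % (z_symbol, kind, n_valence, xc_type, box_id)


def pseudoatom_name(z_symbol, kind, n_valence, xc_type, box_id, postfix):
    """Standard name of the pseudo-atom data file (D5c).

    Assumption: a leading dot in the postfix (".fhi") is dropped, so that
    the name contains a single dot before the postfix.
    """
    return "%s.pseudoatom.%s.%de.%s.%d.%s" % (
        z_symbol, kind, n_valence, xc_type, box_id, postfix.lstrip("."))


print(patbox_path("PAW", 4, "GGA-PBE", 1))
print(description_name("014-Si", "dimer", 1))
print(summary_name("014-Si", "PAW", 4, "GGA-PBE", 1))
print(pseudoatom_name("014-Si", "NC", 4, "GGA-PBE", 1, ".fhi"))
```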

Components (repositories)
R1. The ABINIT repository contains a subdirectory psps, that has a README file (the present one),
      a script subdirectory, and a share subdirectory, but will also be the natural place
      to create branches of the other repositories described below.
R2. The >100 repositories, described at length in D1-D8
R3. A bzr+ssh://psps@archives.abinit.org/forge/psps/bin repository, that should contain
      the elk executable (for standard Pentium based machines) as well as the spacegroup executable
R4. A bzr+ssh://psps@archives.abinit.org/forge/psps/share repository, that should contain
       different common files, like the oxygen and hydrogen reference files for different (at least LDA and GGA)
       exchange-correlation functionals.
       OTHER SCHEMES have been discussed for this, especially relying upon the D1-D8 repositories for Oxygen, and 
       Potassium, instead of Hydrogen.


Components (software) .
They should be placed inside the ABINIT package, in psps/script, for testing/coherency purposes across the different <Z-symbol> directories.
C0. A "pseudo-atom box" creator (init_patbox.py), to be called inside the psps/<Z-symbol> directory.
       (propose options for the path described in D2, then create the path, 
        and the directories of the pseudo-atom box, and also bzr add the dirs)
C1. A cif2cml translator, to go from D3b to D3c.
C2. A script to initialize the file <name_of_generator>.in mentioned in D5b from D3a atomic_config.xml and D5a atomic_data_<name_of_generator>.xml 
C3. A pseudopotential generator , e.g. ATOMPAW  (already placed inside the ABINIT package)
C4. A driver of abinit : generation of abinit input files, running of abinit, gathering of the data in D4b.
      ACTUALLY NEED A LIST OF TASKS / VALIDATION CRITERIA /  to be defined.
C5. A driver of elk, and a binary for elk.
C6. A validator.
More scripts to be added ...
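
The directory creation step of the "pseudo-atom box" creator (C0) might look like the
sketch below. It is not the actual init_patbox.py: the function name, its arguments and
the default list of test systems are assumptions, and the interactive options and the
"bzr add" step mentioned in C0 are deliberately left out.

```python
import os


def create_patbox(atom_repo, kind, n_valence, xc_type, box_id,
                  generator="atompaw", system_ids=None):
    """Create the directories of one pseudo-atom box (D2, D4a).

    atom_repo  : path of the local branch of the ATOM repository.
    system_ids : dict mapping a test system class to the list of IDs
                 defined in D3c (hypothetical default: one system per class).
    Returns the path of the box and the list of created subdirectory names.
    """
    if system_ids is None:
        system_ids = {"atom": [1], "dimer": [1], "elemental": [1], "oxide": [1]}
    box = os.path.join(atom_repo, kind, "%de" % n_valence, xc_type, str(box_id))
    subdirs = [generator]
    for test_class in sorted(system_ids):
        for system_id in system_ids[test_class]:
            subdirs.append("%s_%d" % (test_class, system_id))
    for name in subdirs:
        # os.makedirs raises an error if the directory already exists,
        # so an existing box is never silently reused.
        os.makedirs(os.path.join(box, name))
    return box, subdirs
```

For example, create_patbox("014-Si", "PAW", 4, "GGA-PBE", 1) would create
014-Si/PAW/4e/GGA-PBE/1 with the subdirectories atompaw, atom_1, dimer_1, elemental_1 and oxide_1.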

Miscellaneous
M.1 Reference oxygen PAW data files for different XC functionals, reference hydrogen PAW data files
      for different XC functionals.  Placed in the abinit/psps/RefPseudoAtoms subdirectory of the ABINIT package.
      And to be copied into the patbox at init time.

Strategy
- Work component by component, by placing these components under version management and automatic testing,
with appropriate hardware
- Define the files and their format (including metadata) in an iterative way, with possibilities to regenerate them in an automatic way
- Gradual understanding of the CPU constraints, memory constraint, and human time needed.
- Adjust the objectives to stay realistic.

----------------------------------------------------------------------------------
TO BE KEPT IN MIND FOR FURTHER SPECIFICATION

- set up of a bot (on the machine nazgul) : given the ABINIT branch, and the pseudopotentials =>
     computation of the physical characteristics of this pseudopotential
- set up of the corresponding "on-demand" mechanism
- set up a new waterfall : the list of files that will be provided will be quite different from the usual bots
- set up of a new Web window to visualize the files (to be discussed).

----------------------------------------------------------------------------------
Quick reference :

bzr+ssh://psps@archives.abinit.org/forge/psps/<Z-symbol>
e.g. 
bzr+ssh://psps@archives.abinit.org/forge/psps/008-O
bzr+ssh://psps@archives.abinit.org/forge/psps/014-Si
as well as
bzr+ssh://psps@archives.abinit.org/forge/psps/bin    ! Contains elk
bzr+ssh://psps@archives.abinit.org/forge/psps/share  ! Contains shared files, like reference O and H pseudopotentials (LDA/GGA)
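
The <Z-symbol> convention and the repository addresses above can be generated as in the
following sketch. The helper names and the (partial) table of symbols are only
illustrative assumptions.

```python
# Partial table of chemical symbols, enough for the examples of this file.
SYMBOLS = {1: "H", 8: "O", 14: "Si", 92: "U"}

BASE_URL = "bzr+ssh://psps@archives.abinit.org/forge/psps"


def z_symbol(z):
    """Atomic number on three digits, a dash, then the symbol, e.g. 001-H."""
    return "%03d-%s" % (z, SYMBOLS[z])


def repository_url(z):
    """Address of the ATOM repository for atomic number z."""
    return "%s/%s" % (BASE_URL, z_symbol(z))


print(repository_url(8))
print(repository_url(14))
```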
