The Materials API

Presented by: John Dagdelen

In this lesson, we cover:

  • The Materials Project API (MAPI) and its documentation, the mapidoc.
  • Getting your Materials Project API key.
  • Using the MPRester to access the MP database.
  • A hands-on example of using the API and pymatgen to screen the MP database for interesting materials.
# This suppresses warnings.
import warnings
warnings.filterwarnings('ignore')

# This is a helper function to shorten lists during the 
# live presentation of this lesson for better readability. 
# You can ignore it. 
def shortlist(long_list, n=5):
    print("First {} of {} items:".format(min(n, len(long_list)), len(long_list)))
    for item in long_list[0:n]:
        print(item)


Section 0: Getting an API key

The first step to getting started with the API is to get an API key. We do this on the Materials Project website (https://materialsproject.org/dashboard).

  1. Click the Generate API key button.
  2. Copy your shiny new key.
  3. Paste your key in the line below and run the cell.
!pmg config --add PMG_MAPI_KEY <your API key>
Existing /Users/jdagdelen/.pmgrc.yaml backed up to /Users/jdagdelen/.pmgrc.yaml.bak
New /Users/jdagdelen/.pmgrc.yaml written!
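
To confirm that the key was stored, the configuration file named in the output above can be read back. The following is a minimal sketch that assumes the file holds flat `KEY: value` lines. The helper names `find_key`, `read_pmgrc_value`, and `mask` are hypothetical and are not part of pymatgen.

```python
from pathlib import Path


def find_key(text, key="PMG_MAPI_KEY"):
    """Return the value stored under `key` in flat 'KEY: value' text, or None."""
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == key:
            # Drop surrounding whitespace and optional quotes.
            return value.strip().strip("'\"") or None
    return None


def read_pmgrc_value(path, key="PMG_MAPI_KEY"):
    """Read `key` from the file at `path`; return None if the file is missing."""
    path = Path(path)
    if not path.exists():
        return None
    return find_key(path.read_text(), key)


def mask(secret, visible=4):
    """Show only the last few characters so the key is not echoed in full."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Calling `read_pmgrc_value(Path.home() / ".pmgrc.yaml")` and passing the result through `mask` shows whether a key is present without printing it in full.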

Section 1: The MAPIDOC

The mapidoc is a key source of information regarding the Materials Project API. It should be the first thing you consult whenever you are having trouble with the API. Let's take a look!


Section 2: Basic Queries In the Web Browser

To request data from the Materials Project, you will need to make requests to our API. To do this, you could simply make a GET request through your web browser, providing your API key as an argument.

For example,

https://www.materialsproject.org/rest/v2/materials/mp-1234/vasp?API_KEY=<your api key>

returns the following JSON document:

{"response": [{"energy": -26.94573468, "energy_per_atom": -4.49095578, "volume": 116.92375473740876, "formation_energy_per_atom": -0.4835973866666663, "nsites": 6, "unit_cell_formula": {"Al": 4.0, "Lu": 2.0}, "pretty_formula": "LuAl2", "is_hubbard": false, "elements": ["Al", "Lu"], "nelements": 2, "e_above_hull": 0, "hubbards": {}, "is_compatible": true, "spacegroup": {"source": "spglib", "symbol": "Fd-3m", "number": 227, "point_group": "m-3m", "crystal_system": "cubic", "hall": "F 4d 2 3 -1d"}, "task_ids": ["mp-1234", "mp-925833", "mp-940234", "mp-940654"], "band_gap": 0.0, "density": 6.502482433523648, "icsd_id": null, "icsd_ids": [608375, 57958, 608376, 608372, 608371, 608370], "cif": "# generated using pymatgen\ndata_LuAl2\n_symmetry_space_group_name_H-M   'P 1'\n_cell_length_a   5.48873905\n_cell_length_b   5.48873905\n_cell_length_c   5.48873905\n_cell_angle_alpha   60.00000005\n_cell_angle_beta   60.00000003\n_cell_angle_gamma   60.00000007\n_symmetry_Int_Tables_number   1\n_chemical_formula_structural   LuAl2\n_chemical_formula_sum   'Lu2 Al4'\n_cell_volume   116.92375474\n_cell_formula_units_Z   2\nloop_\n _symmetry_equiv_pos_site_id\n _symmetry_equiv_pos_as_xyz\n  1  'x, y, z'\nloop_\n _atom_site_type_symbol\n _atom_site_label\n _atom_site_symmetry_multiplicity\n _atom_site_fract_x\n _atom_site_fract_y\n _atom_site_fract_z\n _atom_site_occupancy\n  Al  Al1  1  0.500000  0.500000  0.500000  1\n  Al  Al2  1  0.500000  0.500000  0.000000  1\n  Al  Al3  1  0.000000  0.500000  0.500000  1\n  Al  Al4  1  0.500000  0.000000  0.500000  1\n  Lu  Lu5  1  0.875000  0.875000  0.875000  1\n  Lu  Lu6  1  0.125000  0.125000  0.125000  1\n", "total_magnetization": 0.0012519, "material_id": "mp-1234", "oxide_type": "None", "tags": ["High pressure experimental phase", "Aluminium lutetium (2/1)"], "elasticity": null, "full_formula": "Lu2Al4"}], "valid_response": true, "created_at": "2018-08-08T18:52:53.042666", "version": {"db": "3.0.0", "pymatgen": "2018.7.23", "rest": 
"2.0"}, "copyright": "Materials Project, 2018"}

For obvious reasons, typing these kinds of URLs into your web browser is not an ideal way to request MP data. Instead, we should access the API programmatically with Python. Let's make the same request that we did above using Python's requests library.

Making Requests With Python

import requests

# The API key is sent as a URL query parameter.
response = requests.get("https://www.materialsproject.org/rest/v2/materials/mp-1234/vasp",
                        params={"API_KEY": "<your API key>"})

print(response.text)
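
The text printed above is a JSON string, so individual fields can be pulled out once it is parsed. The sketch below uses an abbreviated copy of the mp-1234 document from Section 2 (most fields omitted) so that it runs without a network connection. With a live request, `response.json()` should give the same kind of dict.

```python
import json

# Abbreviated copy of the mp-1234 response shown above (most fields omitted).
raw = '''{"response": [{"energy_per_atom": -4.49095578,
                         "pretty_formula": "LuAl2",
                         "band_gap": 0.0,
                         "e_above_hull": 0,
                         "spacegroup": {"symbol": "Fd-3m", "number": 227},
                         "material_id": "mp-1234"}],
          "valid_response": true}'''

document = json.loads(raw)

# Check the server's own success flag before trusting the payload.
if not document["valid_response"]:
    raise ValueError("The API reported an invalid response")

# In the example above, "response" is a list holding one dict for the material.
entry = document["response"][0]
print(entry["material_id"], entry["pretty_formula"])
print("Space group:", entry["spacegroup"]["symbol"], entry["spacegroup"]["number"])
print("Band gap (eV):", entry["band_gap"])
```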


Section 3: The MPRester

In this section we will:

  • Open the pymatgen.MPRester web documentation.
  • Create our first instance of an MPRester object.
  • Get our feet wet with calling a few of the MPRester's "specialty" methods.

Background and Documentation

  • Code connects to the MP Database through REST requests.
  • Pymatgen's MPRester class is helpful for accessing our API in python.
  • The documentation for the MPRester is very helpful. Let's take a look!

REST is a widely used type of standardization that allows different computer systems to work together. In RESTful systems, information is organized into resources, each of which is uniquely identified via a uniform resource identifier (URI). Since MAPI is a RESTful system, users can interact with the MP database regardless of their computer system or programming language (as long as it supports basic HTTP requests).
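
To make the idea of a resource URI concrete, here is a small sketch that assembles the URL used in Section 2 from its parts, using only the Python standard library. The function name `material_url` and the environment variable `MP_API_KEY` are hypothetical names chosen for this illustration. Only the URL pattern itself comes from the example above.

```python
import os
from urllib.parse import urlencode

BASE_URL = "https://www.materialsproject.org/rest/v2"


def material_url(material_id, api_key):
    """Build the request URL for one material's calculated data."""
    query = urlencode({"API_KEY": api_key})
    return "{}/materials/{}/vasp?{}".format(BASE_URL, material_id, query)


# MP_API_KEY is a hypothetical environment variable used only in this sketch;
# reading the key from the environment keeps it out of the notebook itself.
api_key = os.environ.get("MP_API_KEY", "<your API key>")
print(material_url("mp-1234", api_key))
```

Changing only the material ID in the path addresses a different resource, which is what lets any HTTP-capable language talk to the same database.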

To make the API easier for researchers to use, we implemented a convenient wrapper for it, the MPRester, in the Python Materials Genomics (pymatgen) library. You can find the relevant pymatgen documentation for it here.

Starting up an instance of the MPRester

We'll import the MPRester and create an instance of it.

Note: You may need to use your API key as an input argument if it has not been pre-configured.

from pymatgen import MPRester

mpr = MPRester()
print(mpr.supported_properties)
('energy', 'energy_per_atom', 'volume', 'formation_energy_per_atom', 'nsites', 'unit_cell_formula', 'pretty_formula', 'is_hubbard', 'elements', 'nelements', 'e_above_hull', 'hubbards', 'is_compatible', 'spacegroup', 'task_ids', 'band_gap', 'density', 'icsd_id', 'icsd_ids', 'cif', 'total_magnetization', 'material_id', 'oxide_type', 'tags', 'elasticity')

However, we recommend that you use the “with” context manager to ensure that sessions are properly closed after usage:

with MPRester() as mpr:
    print(mpr.supported_properties)
('energy', 'energy_per_atom', 'volume', 'formation_energy_per_atom', 'nsites', 'unit_cell_formula', 'pretty_formula', 'is_hubbard', 'elements', 'nelements', 'e_above_hull', 'hubbards', 'is_compatible', 'spacegroup', 'task_ids', 'band_gap', 'density', 'icsd_id', 'icsd_ids', 'cif', 'total_magnetization', 'material_id', 'oxide_type', 'tags', 'elasticity')

MPRester Methods:

The MPRester has many methods that you might want to use in your research. For example, there is a method to get the bandstructure for a material, get_bandstructure_by_material_id.

Let's use this method and the following bandstructure plotting function to get and plot a bandstructure for mp-1234:

### Don't edit this code ####
from pymatgen.electronic_structure.plotter import BSPlotter
# Helpful function for plotting a bandstructure. 
def plot_bandstructure(bs):
    BSPlotter(bs).get_plot().show() 
#############################
# Exercise: Use the MPRester's get_bandstructure_by_material_id method to 
# get a bandstructure from the MP Database and plot it using the
# plot_bandstructure function defined above.
with MPRester() as mpr:
    bs = mpr.get_bandstructure_by_material_id("mp-1234")

plot_bandstructure(bs) 

There's also a method to get MPIDs for a formula or chemical system called get_materials_ids.

with MPRester() as mpr:
    # You can pass in a formula to get_materials_ids
    shortlist(mpr.get_materials_ids("LiFePO4"))
    # Or you can pass in a "chemsys" such as "Li-Fe-P-O"
    shortlist(mpr.get_materials_ids("Li-Fe-P-O"))
First 5 of 67 items:
mp-765593
mp-757182
mp-1662030
mp-772409
mp-765604
First 5 of 908 items:
mp-1245108
mp-1271693
mp-1194030
mp-1271562
mp-136

Using the API to achieve research goals:

Imagine you want to get the structure for the multiferroic material BiFeO3 (mp-23501) and suggest some substrates for growing it.

We can use methods of the MPRester to get this information from the Materials Project API.

Hints:

  • MPRester.get_structure_by_material_id
  • MPRester.get_substrates
# Get the structure for BiFeO3 (mp-23501) and 
# suggest some substrates for growing it.
with MPRester() as mpr:
    structure = mpr.get_structure_by_material_id("mp-23501")
    substrates = mpr.get_substrates("mp-23501")
    print(structure)
    print([s["sub_form"] for s in substrates])
Full Formula (Fe2 Bi2 O6)
Reduced Formula: FeBiO3
abc   :   5.615643   5.615629   5.705140
angles:  60.510834 119.489364 120.001015
Sites (10)
  #  SP           a         b         c    magmom
---  ----  --------  --------  --------  --------
  0  Fe    0.219028  0.780969  0.657065    -4.256
  1  Fe    0.719032  0.28097   0.15707      4.256
  2  Bi    0.498595  0.501425  0.495716    -0.001
  3  Bi    0.998575  0.001404  0.995717     0.001
  4  O     0.436045  0.111857  0.359395     0.034
  5  O     0.035218  0.563986  0.359413     0.034
  6  O     0.888122  0.964774  0.359409     0.034
  7  O     0.388142  0.063955  0.859394    -0.034
  8  O     0.535228  0.61188   0.859409    -0.034
  9  O     0.936013  0.46478   0.859414    -0.034
['AlN', 'LaAlO3', 'LiGaO2', 'WS2', 'MoS2', 'C', 'TbScO3', 'MgF2', 'NdGaO3', 'BaTiO3', 'Ag', 'C', 'DyScO3', 'GdScO3', 'Mg', 'LiAlO2', 'Au', 'BaTiO3', 'AlN', 'TiO2', 'ZnO', 'NaCl', 'MgF2', 'Bi2Te3', 'SrTiO3', 'KTaO3', 'GaN', 'NaCl', 'Al', 'C', 'ZnO', 'TeO2', 'Ni', 'C', 'SrTiO3', 'GaN', 'TiO2', 'DyScO3', 'Te2W', 'GdScO3', 'SiC', 'BaTiO3', 'ZnSe', 'SiC', 'WS2', 'WS2', 'ZnO', 'WS2', 'MoS2', 'MoS2']
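
The substrate list above contains repeated formulas, most likely because the same substrate is reported for more than one orientation (an assumption; the output does not say). A minimal sketch for collapsing the repeats while keeping the original order, using the first 20 formulas copied from the output:

```python
from collections import Counter

# First 20 formulas copied from the output above.
formulas = ['AlN', 'LaAlO3', 'LiGaO2', 'WS2', 'MoS2', 'C', 'TbScO3', 'MgF2',
            'NdGaO3', 'BaTiO3', 'Ag', 'C', 'DyScO3', 'GdScO3', 'Mg', 'LiAlO2',
            'Au', 'BaTiO3', 'AlN', 'TiO2']

# dict.fromkeys keeps the first occurrence of each formula, in order.
unique_formulas = list(dict.fromkeys(formulas))

# Counter records how often each formula appears in the list.
counts = Counter(formulas)

print(unique_formulas)
print({formula: n for formula, n in counts.items() if n > 1})
```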

At this point, you should be comfortable with:

  • Finding documentation on the MPRester.
  • Creating an instance of the MPRester.
  • Using methods of the MPRester.


Section 4: Using the MPRester.query method

The MPRester also has a very powerful method called query, which allows us to perform sophisticated searches on the database. The query method uses MongoDB's query syntax. In this syntax, query submissions have two parts: a set of criteria that you want to base the search on (in the form of a Python dict), and a set of properties that you want the database to return (in the form of either a list or a dict).

You will probably find yourself using the MPRester's query method frequently.

The general structure of an MPRester query is:

                        mpr.query(criteria, properties)
  • criteria is usually a string or a dict.
  • properties is usually a list of strings (a dict is also accepted).

Let's try out some queries to learn how it works!

First, we'll query for \(SiO_2\) compounds by chemical formula through 'pretty_formula'.

with MPRester() as mpr:
    results = mpr.query({'pretty_formula':"SiO2"}, properties=['material_id', 'pretty_formula'])
    print(len(results))

If we investigate the object that the query method returns, we find that it is a list of dicts. Furthermore, we find that the keys of the dictionaries are the very same keywords that we passed to the query method as the properties argument.

print('Results are returned as a {} of {}.\n'.format(type(results), type(results[0])))

for r in results[0:5]:
    print(r)

In fact, if you are just looking for materials based on formula/composition/stoichiometry, there is an easier way to use the query method: just pass in a string as the criteria!

You can even use wildcard characters in your searches. For example, if we want to find all \(ABO_3\) compounds in the Materials Project:

with MPRester() as mpr:
    results = mpr.query('**O3', properties=["material_id", "pretty_formula"])
    shortlist(results)

Putting it into practice:

There are 296 variants of \(SiO_2\) in the MP database, but how many \(Si_xO_y\) compounds are there in the Materials Project?

Hint:

  • Query using a chemsys string instead of a formula.
with MPRester() as mpr:
    print(len(mpr.query("Si-O", ["material_id"])))
331

EXERCISE 1

MongoDB Operators

Above, we specified the chemical formula SiO\(_2\) for our query. This is an example of the "specify" operator. However, MongoDB's syntax also includes other query operators, which allow us to build complex conditionals into our queries. These all start with the "$" character.

Some important MongoDB operators you should be familiar with are:

  • $in (in)
  • $nin (not in)
  • $gt (greater than)
  • $gte (greater than or equal to)
  • $lt (less than)
  • $lte (less than or equal to)
  • $not (is not)

We use these more advanced operators as follows:

{"field_name": {"$op": value}}

For example, "entries with e_above_hull that is less than 0.25 eV/atom" would be:

{"e_above_hull": {"$lt": 0.25}}
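
To get a feel for what these operators mean, the sketch below evaluates a few criteria locally against some fields of the mp-1234 document from the JSON response shown earlier. It is an illustrative re-implementation of a small subset of the operators, written for this lesson. It is not how MongoDB or the MP server evaluates queries, and it ignores edge cases such as missing fields and plain-value matching against list fields.

```python
import operator

# Comparison operators, applied to scalar fields.
COMPARISONS = {"$lt": operator.lt, "$lte": operator.le,
               "$gt": operator.gt, "$gte": operator.ge}


def field_matches(value, condition):
    """Check one field value against a plain value or an {"$op": operand} dict."""
    if not isinstance(condition, dict):
        return value == condition  # plain equality on scalars, e.g. {"nelements": 2}
    values = value if isinstance(value, list) else [value]
    for op, operand in condition.items():
        if op in COMPARISONS:
            ok = COMPARISONS[op](value, operand)
        elif op == "$in":   # at least one value appears in the operand list
            ok = any(v in operand for v in values)
        elif op == "$nin":  # no value appears in the operand list
            ok = not any(v in operand for v in values)
        elif op == "$all":  # the field contains every item of the operand list
            ok = all(item in values for item in operand)
        else:
            raise ValueError("Operator not covered by this sketch: " + op)
        if not ok:
            return False
    return True


def matches(doc, criteria):
    """True if `doc` satisfies every field condition in `criteria`."""
    return all(field in doc and field_matches(doc[field], cond)
               for field, cond in criteria.items())


# A few fields of mp-1234 (LuAl2), copied from the JSON response shown earlier.
lual2 = {"pretty_formula": "LuAl2", "elements": ["Al", "Lu"],
         "nelements": 2, "e_above_hull": 0, "band_gap": 0.0}

print(matches(lual2, {"e_above_hull": {"$lt": 0.25}}))
print(matches(lual2, {"elements": {"$in": ["Li", "Na"], "$all": ["Al"]}}))
print(matches(lual2, {"elements": {"$all": ["Al", "Lu"]}, "nelements": 2}))
```

Several operators on the same field are combined with a logical AND, which is why the second criteria dict fails: LuAl2 contains Al but neither Li nor Na.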

A paper by McEnaney et al. proposes a novel ammonia synthesis process based on the electrochemical cycling of lithium (link). As an exercise, let's use some of MongoDB's operators and ask the database for nitrides of alkali metals.

# Find all binary nitrides of alkali metals.
alkali_metals = ['Li', 'Na', 'K', 'Rb', 'Cs']
criteria = {"elements": {"$in": alkali_metals, "$all": ["N"]}, "nelements": 2}
properties = ['material_id', 'pretty_formula']
with MPRester() as mpr:
    shortlist(mpr.query(criteria, properties))
    # Bonus: a shorter way to do this with wildcards.
    shortlist(mpr.query('{Li,Na,K,Rb,Cs}-N', ['material_id', 'pretty_formula']))

We can also perform the same query but ask the database to return only compounds whose energy above the hull is less than 10 meV/atom, using the "less than" operator, "$lt". (The energy above the convex hull gives us a sense of how stable a compound is relative to other compounds with the same composition.)

criteria = {"elements": {"$in": alkali_metals, "$all": ["N"]}, "nelements": 2,
            "e_above_hull": {"$lt": 0.010}}  # 0.010 eV/atom = 10 meV/atom
properties = ['material_id', 'pretty_formula']
with MPRester() as mpr:
    print(mpr.query(criteria, properties))
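
Criteria dicts like the one above are plain Python data, so they can be built by a small function when the same screen is repeated for other chemistries. The sketch below is one possible way to do that. The function name `binary_criteria` and its parameters are hypothetical, and the stability cutoff is given in meV/atom and converted to the eV/atom used in the query above.

```python
def binary_criteria(cations, anion, max_e_above_hull_mev=None):
    """Build query criteria for binary compounds of `anion` with any of `cations`.

    `max_e_above_hull_mev` is an optional stability cutoff in meV/atom.
    """
    criteria = {
        # Must contain the anion and at least one of the listed cations...
        "elements": {"$in": list(cations), "$all": [anion]},
        # ...and exactly two elements in total.
        "nelements": 2,
    }
    if max_e_above_hull_mev is not None:
        # Convert meV/atom to eV/atom for the e_above_hull field.
        criteria["e_above_hull"] = {"$lt": max_e_above_hull_mev / 1000}
    return criteria


alkali_metals = ['Li', 'Na', 'K', 'Rb', 'Cs']
print(binary_criteria(alkali_metals, "N", max_e_above_hull_mev=10))
```

The returned dict can be passed to `mpr.query` as the criteria argument in place of the hand-written one.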

EXERCISE 2

In this lesson, we have covered:

  • The Materials Project API (MAPI) and its documentation, the mapidoc.
  • Getting your Materials Project API key.
  • Using the MPRester to access the MP database.
  • Hands-on examples of using the API and pymatgen to screen the MP database for interesting materials.