MPContribs

Walkthrough

Exercises

  • make yourself familiar with datasets available on MPContribs
  • use the Search tab to find interesting contributions and their MP detail page(s)
  • scan the example notebooks in the Work tab
  • consider applying for your own project via the Apply tab
  • feel free to ask questions on Slack

Contribute data on Refractive Index

For demonstration purposes, we prepared data from https://refractiveindex.info for you to contribute using the mpcontribs-client.

from mp_workshop.mpcontribs import data
from mpcontribs.client import Client
FW Echo Test: MP Workshop

We set the name of our project and initialize the API client with our API key. Here, we use a dedicated sandbox project; the API key is provided through Slack. If your project application from the previous exercise was approved, feel free to use your own project name and API key instead.

name = 'sandbox' # this should/could be your project
client = Client('api-key-here')
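Pasting the key into a notebook cell makes it easy to leak when the notebook is shared. A minimal sketch of reading it from the environment instead, assuming a variable named MPCONTRIBS_API_KEY; both the variable name and the load_api_key helper are chosen here for illustration and are not part of the workshop material:

```python
import os

def load_api_key(env=os.environ, var="MPCONTRIBS_API_KEY"):
    """Return the API key stored in `var`, or raise if it is missing or empty."""
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"set {var} before starting the notebook")
    return key

# Usage in the notebook (not executed here):
#   client = Client(load_api_key())
```

Failing early with a clear message avoids sending an empty key to the API and debugging an authentication error later.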

Retrieve and update project info

We might have forgotten to add all references when we applied for the project on the portal, so let's update them and also set unique_identifiers to False. This flag indicates that the project can contain multiple contributions (rows in the landing page's overview table) for the same MP material (mp-id).

client.projects.update_entry(pk=name, project={
    'unique_identifiers': False,
    'references': [
        {'label': 'Docs', 'url': 'https://mpcontribs.org'},
        {'label': 'Source', 'url': 'https://refractiveindex.info'}
    ]
}).result()
{'unique_identifiers': False,
 'references': [{'label': 'Docs', 'url': 'https://mpcontribs.org'},
  {'label': 'Source', 'url': 'https://refractiveindex.info'}]}

The response above indicates which fields have been updated in the database. We can confirm this by retrieving the full information stored for the project. Note that the columns key is set and updated automatically when data is later added to the project.

client.get_project(name).pretty()
name                sandbox
is_public           True
title               Sandbox
owner               phuck@lbl.gov
is_approved         True
unique_identifiers  False
long_title          Sandbox for Workshop 2020
authors             P. Huck
description         This project is for testing purposes only. It's being used for the MP Workshop, for instance, and might be reset/deleted without advanced notice.
references
  label   url
  Docs    https://mpcontribs.org
  Source  https://refractiveindex.info
columns

Create contributions

The data dictionary imported above from our workshop library contains a small list of ready-to-go contributions for each attendee. Let's find our entry and explore it a little.

data.keys()
dict_keys(['aggrey', 'saleemaldajani', 'sayoudaq', 'arias', 'baatar', 'bagherzad', 'bavdekar', 'rpuga', 'bogdan', 'bonkowski', 'chalk', 'cuillier', 'dahliah', 'delgado', 'adonakowski', 'anetxeb', 'fallon', 'sfaraji', 'gangopadhyay', 'garcia', 'giammona', 'gibson', 'jglodo', 'tul10053', 'guillen', 'sgup', 'hjzfree', 'hassan', 'hawkins', 'homer', 'hoermann', 'hossain', 'ajh618', 'akrami1', 'cjang', 'jiang', 'jinxing', 'djohnson', 'hjohnson', 'juhasz', 'kaliappan', 'krkaufma', 'kinsley', 'physics979', 'lanetti', 'fall', 'lau', 'xl2778', 'liang', 'lin', 'jlow', 'luyuan', 'amantra', 'lamcrae', 'fanchem', 'mirza', 'tara', 'imonterrubio', 'nawwar', 'oertel', 'hiroko', 'no6', 'spa031', 'paulino', 'gollapalli', 'xqi', 'aramac', 'mreynaud', 'cicenergigune', 'rrivers', 'rom', 'schwietert', 'sebastian', 'shaw', 'psonbell', 'sspr', 'ksteirer', 'stuckner', 'sun', 'atantillo', 'tappan', 'taylor', 'prabalt', 'tiwari', 'tolentino', 'tomczak', 'tsaie', 'vanbever', 'vasileiadis', 'verma', 'wang', 'n01392810', 'wufeifeng', 'yuan', 'hongxing', 'zyzhang', 'xiaojing', 'dwaraknath', 'horton', 'huck', 'persson'])
me = "huck"
data[me]
[{'identifier': 'mp-6134',
  'data': {'type': 'f4',
   'Δλ': {'min': '0.4 µm', 'max': '1.0 µm'},
   'coefficients': {'c0': '1.92155',
    'c1': '0.00494',
    'c2': '0',
    'c3': '0.00617',
    'c4': '1',
    'c5': '0',
    'c6': '0',
    'c7': '0',
    'c8': '1',
    'c9': '-0.00373',
    'c10': '2'}},
  'tables': [            wl         n
   Woods-e                 
   0        0.400  1.397522
   1        0.406  1.397165
   2        0.412  1.396824
   3        0.418  1.396498
   4        0.424  1.396186
   ...        ...       ...
   96       0.976  1.386801
   97       0.982  1.386762
   98       0.988  1.386723
   99       0.994  1.386685
   100      1.000  1.386647

   [101 rows x 2 columns]]},
 {'identifier': 'mp-6134',
  'data': {'type': 'f4',
   'Δλ': {'min': '0.4 µm', 'max': '1.0 µm'},
   'coefficients': {'c0': '1.92552',
    'c1': '0.00492',
    'c2': '0',
    'c3': '0.00569',
    'c4': '1',
    'c5': '0',
    'c6': '0',
    'c7': '0',
    'c8': '1',
    'c9': '-0.00421',
    'c10': '2'}},
  'tables': [            wl         n
   Woods-o                 
   0        0.400  1.398832
   1        0.406  1.398478
   2        0.412  1.398140
   3        0.418  1.397817
   4        0.424  1.397507
   ...        ...       ...
   96       0.976  1.388058
   97       0.982  1.388017
   98       0.988  1.387977
   99       0.994  1.387937
   100      1.000  1.387897

   [101 rows x 2 columns]]}]
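The coefficients and the table carry the same information, which can be checked by hand. The sketch below assumes that type 'f4' refers to dispersion formula 4 of the refractiveindex.info database, n² = c0 + c1·λ^c2 / (λ² - c3^c4) + c5·λ^c6 / (λ² - c7^c8) + c9·λ^c10, with λ in µm. The function name refractive_index_f4 is ours and is not part of the workshop library.

```python
import math

def refractive_index_f4(wl, c):
    """Evaluate n(wl) for the assumed formula-4 form; wl in micrometers."""
    n_squared = (
        c[0]
        + c[1] * wl ** c[2] / (wl ** 2 - c[3] ** c[4])
        + c[5] * wl ** c[6] / (wl ** 2 - c[7] ** c[8])
        + c[9] * wl ** c[10]
    )
    return math.sqrt(n_squared)

# Coefficients of the first contribution (table "Woods-e"), stored as strings.
coefficients = {
    "c0": "1.92155", "c1": "0.00494", "c2": "0", "c3": "0.00617",
    "c4": "1", "c5": "0", "c6": "0", "c7": "0", "c8": "1",
    "c9": "-0.00373", "c10": "2",
}
c = [float(coefficients[f"c{i}"]) for i in range(11)]

for wl in (0.4, 1.0):
    print(f"{wl:.3f} -> {refractive_index_f4(wl, c):.6f}")
# 0.400 -> 1.397522
# 1.000 -> 1.386647
```

The two printed values agree with the first and last rows of the Woods-e table above, which supports the assumed formula for this dataset.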

Notice that a contribution to a specific MP material contains 3 optional components:

  • simple (potentially nested) "key-value" data
  • tables as Pandas DataFrame objects (think spreadsheets and csv files)
  • structures as Pymatgen Structure objects (think CIF, POSCAR, ...)

The only thing left to do here is a quick loop over the contributions to assign each of them to our project and make it public. By default, projects and contributions are private and only visible (and writable) to project owners and their collaborators. The owner needs to explicitly request that a collaborator be added to the project group.

for contrib in data[me]:
    contrib["project"] = name
    contrib["is_public"] = True

len(data[me])
2
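Both contributions share the identifier mp-6134, which is why unique_identifiers was set to False earlier. A local pre-check along these lines shows which identifiers repeat before anything is uploaded. It is an illustrative sketch with a hypothetical helper name, not the client's own duplicate check:

```python
from collections import Counter

def repeated_identifiers(contributions):
    """Map each identifier that occurs more than once to its count."""
    counts = Counter(c["identifier"] for c in contributions)
    return {identifier: n for identifier, n in counts.items() if n > 1}

# Stand-in for data[me], reduced to the one field needed here.
contributions = [{"identifier": "mp-6134"}, {"identifier": "mp-6134"}]
print(repeated_identifiers(contributions))  # {'mp-6134': 2}
```

In the notebook, the same function can be called on data[me] directly.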

Submit contributions

Simply pass your list of contributions as the argument to the client's submit_contributions function to prepare and upload them to MPContribs. Duplicate checking happens automatically if unique_identifiers is set to True for the project (the default). If successful, the client returns the number of contributions submitted.

client.submit_contributions(data[me])


Your contributions should now show up on the project landing page on the MPContribs portal as well as on the corresponding material details pages on MP :)

Retrieve and query contributions

Choose an mp-id from the landing page and retrieve the IDs for its contributions. Use one of the IDs to show a pretty display of the data.

resp = client.contributions.get_entries(
    project=name, identifier="mp-6134", _fields=["id"]
).result()
cids = [d["id"] for d in resp["data"]]
contrib = client.get_contribution(cids[0])
contrib.pretty()
project        sandbox
identifier     mp-6134
formula        LiCaAlF6
is_public      True
last_modified  2020-07-30 01:24:49.263000
data
  type  f4
  Δλ
    min  400.0 nm
    max  1.0 µm
  coefficients
    c0   1.92155
    c1   0.00494
    c2   0
    c3   0.00617
    c4   1
    c5   0
    c6   0
    c7   0
    c8   1
    c9   -0.00373
    c10  2
structures
tables
  id                        name     md5
  5f2221607798578b36163216  Woods-e  30409de9efda60793528098dd33642b4
notebook
  id  5f2221617798578b36163218
id             5f2221607798578b36163217

Grab the table ID and retrieve the table as a Pandas DataFrame. You can plot it interactively through Pandas' integration with Plotly by calling the plot() function.

tid = contrib["tables"][0]["id"]
client.get_table(tid)  # append .plot() for an interactive plot
wl     n
0.4    1.3975215819099
0.406  1.3971648958942
0.412  1.3968239880086002
0.418  1.3964979034667
0.424  1.3961857588595
...    ...
0.976  1.3868008648135
0.982  1.3867619907128999
0.988  1.3867234413885001
0.994  1.386685206585
1.0    1.3866472763569

101 rows × 1 columns

Finally, let's build up a more complicated query to reduce the list of contributions to the ones we might be interested in.

query = {
    "project": name,
    "formula__contains": "Li",
    "data__type__contains": "f4",
    "data__coefficients__c1__value__gte": 4.93e-3,
    "_order_by": "data__coefficients__c1__value",
    "order": "desc",
    "_fields": [
        "id", "identifier", "formula",
        "data.type", "data.coefficients.c0.value",
        "data.coefficients.c1.value"
    ]
}
client.contributions.get_entries(**query).result()
{'data': [{'id': '5f2221607798578b36163217',
   'identifier': 'mp-6134',
   'formula': 'LiCaAlF6',
   'data': {'type': 'f4',
    'coefficients': {'c0': {'value': 1.92155}, 'c1': {'value': 0.00494}}}}],
 'has_more': False,
 'total_count': 1,
 'total_pages': 1}
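The filter keys in the query follow a path__operator pattern: double underscores separate the nested field names, and an optional trailing operator such as gte or contains says how to compare. The sketch below mimics that convention on a local list of dictionaries to make the pattern concrete. It is a client-side illustration under that assumption, supports only the two operators used above plus plain equality, and says nothing about how the server evaluates queries. All helper names are hypothetical.

```python
OPERATORS = {
    "gte": lambda value, target: value >= target,
    "contains": lambda value, target: target in value,
}

def split_filter(key):
    """Split 'a__b__gte' into (['a', 'b'], 'gte'); no operator means equality."""
    parts = key.split("__")
    if parts[-1] in OPERATORS:
        return parts[:-1], parts[-1]
    return parts, None

def lookup(doc, path):
    """Follow a list of keys into nested dictionaries."""
    for name in path:
        doc = doc[name]
    return doc

def matches(doc, filters):
    """Return True if `doc` satisfies every filter."""
    for key, target in filters.items():
        path, op = split_filter(key)
        value = lookup(doc, path)
        if op is None:
            if value != target:
                return False
        elif not OPERATORS[op](value, target):
            return False
    return True

# Local stand-ins for the two contributions, reduced to the queried fields.
docs = [
    {"formula": "LiCaAlF6",
     "data": {"type": "f4", "coefficients": {"c1": {"value": 0.00494}}}},
    {"formula": "LiCaAlF6",
     "data": {"type": "f4", "coefficients": {"c1": {"value": 0.00492}}}},
]
filters = {
    "formula__contains": "Li",
    "data__type__contains": "f4",
    "data__coefficients__c1__value__gte": 4.93e-3,
}
hits = [d for d in docs if matches(d, filters)]
print(len(hits))  # 1
```

Only the first document passes, which mirrors the single hit in the response above: the second contribution's c1 of 0.00492 falls below the 4.93e-3 threshold.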

Exercises

  • check out the columns field in the project information
  • query the carrier_transport dataset or another dataset of your choice
  • retrieve a table, display and plot it interactively
  • if you used your own project, delete the refractive index data and start adding your own :)
  • join the dedicated MPContribs Slack