MPContribs¶
Walkthrough¶
- start with a materials detail page on MP with user contributions
- navigate to the portal and explore
- discover the API documentation
Exercises¶
- make yourself familiar with datasets available on MPContribs
- use the Search tab to find interesting contributions and their MP detail page(s)
- scan the example notebooks in the Work tab
- consider applying for your own project via the Apply tab
- feel free to ask questions on Slack
Contribute data on Refractive Index¶
For demonstration purposes, we prepared data from https://refractiveindex.info for you to contribute using the mpcontribs-client.
from mp_workshop.mpcontribs import data
from mpcontribs.client import Client
We set the name of our project and initialize the API client with our API key. Here, we use a dedicated sandbox project and provide you with the API key through Slack. Feel free to use your own project name and API key instead if you successfully applied for a project (and had it approved) during the previous exercise.
name = 'sandbox' # this should/could be your project
client = Client('api-key-here')
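If you prefer not to paste the key into the notebook, one option is to read it from an environment variable first. The snippet below is a minimal sketch of that idea: the helper `load_api_key` is hypothetical, and the variable name `MPCONTRIBS_API_KEY` is an assumption that you can replace with whatever name you export in your shell.

```python
import os


def load_api_key(var="MPCONTRIBS_API_KEY"):
    """Return the API key stored in the environment variable `var`.

    The variable name is an assumption for this sketch, not a documented default.
    """
    key = os.environ.get(var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var} environment variable to your API key.")
    return key


# Usage with the client from this section (commented out, needs a real key):
# client = Client(load_api_key())
```

Failing early with a clear message avoids confusing authentication errors later in the notebook.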
Retrieve and update project info¶
We might have forgotten to add all references when we applied for the project on the portal. So let's update them and also set unique_identifiers to False. This flag indicates that this project can contain multiple contributions (rows in the landing page's overview table) for the same MP material (mp-id).
client.projects.update_entry(pk=name, project={
    'unique_identifiers': False,
    'references': [
        {'label': 'Docs', 'url': 'https://mpcontribs.org'},
        {'label': 'Source', 'url': 'https://refractiveindex.info'}
    ]
}).result()
The response above indicates which fields have been updated in the database. We can confirm that by retrieving the full information stored for the project. Note that the columns key is set and updated automatically when data is later added to the project.
client.get_project(name).pretty()
Create contributions¶
The data dictionary imported above from our workshop library contains a small list of ready-to-go contributions for each attendee. Let's find our entry and explore it a little.
data.keys()
me = "huck"
data[me]
Notice that a contribution to a specific MP material contains 3 optional components:
- simple (potentially nested) "key-value" data
- tables as Pandas DataFrame objects (think spreadsheets and csv files)
- structures as Pymatgen Structure objects (think CIF, POSCAR, ...)
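As a rough illustration of those three components, a single contribution could be laid out like the dictionary below. This is a minimal sketch and not the exact schema: the key names `identifier`, `data`, `tables` and `structures` are assumptions based on the queries used in this section, all values are made up, and plain Python lists stand in for the Pandas DataFrame and Pymatgen Structure objects that the workshop data holds.

```python
# Hypothetical contribution layout; key names are assumptions, values are made up.
contribution = {
    "identifier": "mp-6134",  # MP material the contribution belongs to
    # 1) simple, potentially nested key-value data
    "data": {
        "type": "example",
        "coefficients": {"c0": 1.0, "c1": 0.005},
    },
    # 2) tables: DataFrame objects in the real data, a list of rows here
    "tables": [
        [{"wavelength": 0.5, "n": 1.5}, {"wavelength": 0.6, "n": 1.49}],
    ],
    # 3) structures: Pymatgen Structure objects in the real data, left empty here
    "structures": [],
}

# All three components are optional, so list the ones that are actually filled.
present = [key for key in ("data", "tables", "structures") if contribution.get(key)]
print(present)
```

Inspecting `data[me][0]` in the same way shows which components your own contributions carry.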
The only thing left to do here is a quick loop over the contributions to assign each of them to our project and set it to public. By default, projects and contributions are private and only visible (and writable) to project owners and their collaborators. The owner needs to explicitly request for a collaborator to be added to the project group.
for contrib in data[me]:
    contrib["project"] = name
    contrib["is_public"] = True
len(data[me])
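The same assignment can be wrapped in a small helper and followed by a sanity check before anything is uploaded. This is only a sketch: `assign_to_project` is a hypothetical helper, not part of the mpcontribs-client, and the sample list is a made-up stand-in for `data[me]`.

```python
def assign_to_project(contribs, project, is_public=True):
    """Set the project name and visibility on every contribution in place."""
    for contrib in contribs:
        contrib["project"] = project
        contrib["is_public"] = is_public
    return contribs


# Made-up stand-ins for the entries in data[me].
sample = [{"identifier": "mp-1"}, {"identifier": "mp-2"}]
assign_to_project(sample, "sandbox")

# Every contribution should carry both keys before it is submitted.
ready = all(c.get("project") == "sandbox" and c.get("is_public") for c in sample)
print(len(sample), ready)
```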
Submit contributions¶
Simply provide your list of contributions as an argument to the client's submit_contributions function to prepare and upload them to MPContribs. Duplicate checking happens automatically if unique_identifiers is set to True for the project (the default). If successful, the client will return the number of contributions submitted.
client.submit_contributions(data[me])
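To see ahead of time whether a list would run into that duplicate check, you can count identifiers locally. The sketch below is a client-side pre-check under the assumption that each contribution stores its mp-id under an `identifier` key; it does not reproduce how MPContribs itself detects duplicates, and the sample entries are made up.

```python
from collections import Counter


def find_duplicate_identifiers(contribs):
    """Return identifiers that occur more than once, in first-seen order."""
    counts = Counter(c["identifier"] for c in contribs)
    return [identifier for identifier, n in counts.items() if n > 1]


# Made-up contributions: mp-1 appears twice.
sample = [
    {"identifier": "mp-1"},
    {"identifier": "mp-2"},
    {"identifier": "mp-1"},
]
print(find_duplicate_identifiers(sample))
```

A non-empty result suggests that the project needs unique_identifiers set to False before all entries can be stored.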
Your contributions should now show up on the project landing page on the MPContribs portal as well as on the corresponding material detail pages on MP :)
Retrieve and query contributions¶
Choose an mp-id from the landing page and retrieve the IDs for its contributions. Use one of the IDs to show a pretty display of the data.
resp = client.contributions.get_entries(
    project=name, identifier="mp-6134", _fields=["id"]
).result()
cids = [d["id"] for d in resp["data"]]
contrib = client.get_contribution(cids[0])
contrib.pretty()
Grab the table ID and retrieve the table as a Pandas DataFrame. You can plot it interactively using Pandas' integration with Plotly through the plot() function.
tid = contrib["tables"][0]["id"]
client.get_table(tid)#.plot()
Finally, let's build up a more complicated query to reduce the list of contributions to the ones we might be interested in.
query = {
    "project": name,
    "formula__contains": "Li",
    "data__type__contains": "f4",
    "data__coefficients__c1__value__gte": 4.93e-3,
    "_order_by": "data__coefficients__c1__value",
    "order": "desc",
    "_fields": [
        "id", "identifier", "formula", "data.type",
        "data.coefficients.c0.value", "data.coefficients.c1.value"
    ]
}
client.contributions.get_entries(**query).result()
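The filter keys in this query chain a nested field path and an operator with double underscores. The sketch below mimics that convention on a local list of dictionaries to show how such keys can be read. It is an illustration under assumptions, not the server's implementation: `get_path` and `matches` are hypothetical helpers, only the `contains` and `gte` operators are covered, keys without a known operator are treated as equality, and the sample documents are made up.

```python
def get_path(doc, path):
    """Follow a list of keys into nested dictionaries; None if a key is missing."""
    for key in path:
        if not isinstance(doc, dict) or key not in doc:
            return None
        doc = doc[key]
    return doc


# Operators assumed for this sketch; the API may support more.
OPERATORS = {
    "contains": lambda value, arg: arg in value,
    "gte": lambda value, arg: value >= arg,
}


def matches(doc, filters):
    """Check a document against filters such as 'data__c1__value__gte'."""
    for key, arg in filters.items():
        parts = key.split("__")
        if parts[-1] in OPERATORS:
            path, op = parts[:-1], OPERATORS[parts[-1]]
        else:
            # No operator suffix: compare for equality.
            path, op = parts, lambda value, expected: value == expected
        value = get_path(doc, path)
        if value is None or not op(value, arg):
            return False
    return True


# Made-up documents, not real refractive index data.
docs = [
    {"formula": "LiF", "data": {"coefficients": {"c1": {"value": 0.006}}}},
    {"formula": "LiCl", "data": {"coefficients": {"c1": {"value": 0.001}}}},
    {"formula": "NaCl", "data": {"coefficients": {"c1": {"value": 0.009}}}},
]
filters = {"formula__contains": "Li", "data__coefficients__c1__value__gte": 4.93e-3}
hits = [d["formula"] for d in docs if matches(d, filters)]
print(hits)
```

Only the first document passes both conditions: the second fails the `gte` threshold and the third does not contain "Li" in its formula.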
Exercises¶
- check out the columns field in the project information
- query the carrier_transport or another dataset of your choice
- retrieve a table, display and plot it interactively
- if you used your own project, delete the refractive index data and start adding your own :)
- join the dedicated MPContribs Slack