
Getting started on Colab

Colab is a hosted Python notebook service, based on Jupyter, that provides free access to computing resources (including GPUs and TPUs). Colab allows for quick and easy sharing with collaborators without requiring downloads or Python environment setup.

Executing Python Code

for i in range(10):
    print(i**2)

The Colab environment comes with a number of pre-installed scientific and machine learning packages, such as numpy, scipy, pandas, tensorflow, and pytorch.

We can check the installed Python packages using pip freeze:

!pip freeze
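As a quick sanity check, you can also import a few of these pre-installed packages directly and print their versions (a minimal sketch; the exact versions depend on the current Colab image):

import numpy as np
import pandas as pd
import tensorflow as tf

# Versions shipped with the Colab image; values will vary over time.
print(np.__version__, pd.__version__, tf.__version__)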

Add code snippets

To add pre-written code snippets (useful for plotting), open the code snippets tab:

  • Insert -> Code Snippets

from vega_datasets import data

stocks = data.stocks()

import altair as alt

alt.Chart(stocks).mark_line().encode(
    x='date:T',
    y='price',
    color='symbol'
).interactive(bind_y=False)

Runtimes

Each Colab instance (runtime) runs on an individual Virtual Machine (VM).
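Because the runtime is an ordinary Linux VM, you can inspect its resources from the notebook with shell commands (a quick sketch; the exact figures vary between sessions):

# Inspect the VM backing this runtime; outputs differ between sessions.
!cat /proc/cpuinfo | grep "model name" | head -1
!free -h   # available memory
!df -h /   # available disk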

To view active sessions, select Runtime > Manage Sessions.

Here you will be able to view all your open notebooks (and terminate them).


By default, Colab instances only have access to CPUs; however, notebooks can also be given access to GPU and TPU resources.

To initialize a runtime with a GPU:

  1. Select 'Runtime' > 'Change Runtime Type'
  2. Select 'GPU' in the drop down menu and click 'Save'
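After the runtime restarts with a GPU attached, you can confirm that the accelerator is visible (assuming an NVIDIA-backed GPU runtime, which is what Colab provides):

# Lists the attached GPU; this fails if the runtime type is still CPU-only.
!nvidia-smi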

We'll test the GPU using the example provided by Colab.

%tensorflow_version 2.x
import tensorflow as tf
import timeit

device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
    print(
        '\n\nThis error most likely means that this notebook is not '
        'configured to use a GPU. Change this in Notebook Settings via the '
        'command palette (cmd/ctrl-shift-P) or the Edit menu.\n\n')
    raise SystemError('GPU device not found')

def cpu():
    with tf.device('/cpu:0'):
        random_image_cpu = tf.random.normal((100, 100, 100, 3))
        net_cpu = tf.keras.layers.Conv2D(32, 7)(random_image_cpu)
        return tf.math.reduce_sum(net_cpu)

def gpu():
    with tf.device('/device:GPU:0'):
        random_image_gpu = tf.random.normal((100, 100, 100, 3))
        net_gpu = tf.keras.layers.Conv2D(32, 7)(random_image_gpu)
        return tf.math.reduce_sum(net_gpu)

We run each op once to warm up; see: https://stackoverflow.com/a/45067900

cpu()
gpu()

Run the op several times.

print('Time (s) to convolve 32x7x7x3 filter over random 100x100x100x3 images '
      '(batch x height x width x channel). Sum of five runs.')
print('CPU (s):')
cpu_time = timeit.timeit('cpu()', number=5, setup="from __main__ import cpu")
print(cpu_time)
print('GPU (s):')
gpu_time = timeit.timeit('gpu()', number=5, setup="from __main__ import gpu")
print(gpu_time)
print('GPU speedup over CPU: {}x'.format(int(cpu_time/gpu_time)))

Install conda using condacolab

Google Colab does not come with a pre-installed version of conda. Conda is a package manager and environment manager.

Installing conda will allow you to use it to install packages. We can confirm that conda is not initially available:

!which conda

To simplify the conda installation, we use condacolab.

!pip install condacolab

import condacolab

condacolab.install()
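Note that condacolab.install() typically restarts the kernel. Once the session comes back, a quick way to verify the installation is condacolab's own check helper (a minimal sketch):

# Run in a fresh cell after the kernel restart; raises if conda is not usable.
import condacolab
condacolab.check()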

After running condacolab.install(), both conda and mamba are installed. Conda and Mamba are functionally equivalent; however, conda is written in Python, whilst Mamba is written in C++.

In this lesson, we use Mamba for its speed compared to Conda. However, note that Mamba can be more prone to bugs, as it is newer and less widely used.

!which conda
!which mamba

For compatibility with Colab, we need to ensure the numpy version remains unchanged when we install packages (such as pymatgen).

To identify the numpy version factory-installed on Colab:

import numpy as np

print(np.__version__)

The factory-installed version of numpy is v1.19.5, so we tell mamba to install this specific version.

!mamba install -q -c conda-forge pymatgen numpy=1.19.5 -y
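Assuming the install solves cleanly, it is worth confirming that numpy still matches the pinned factory version and that pymatgen imports (a minimal check; pymatgen.core here assumes a reasonably recent pymatgen layout):

import numpy as np
from pymatgen.core import Element

# numpy should still report the factory version pinned above (1.19.5).
print(np.__version__)

# Creating a simple pymatgen object confirms the install is usable.
print(Element('Si'))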

Set up the environment using conda constructor


  1. First, we want to remove our previous installations by resetting the runtime to factory defaults: select Runtime > Factory reset runtime. This reverts the environment to its original state, removing any added packages and restoring any deleted system files.
  2. Next, we install condacolab:

!pip install -q condacolab

  3. Finally, we install using the constructor we have built for the workshop:

Link to the release

  • https://github.com/materialsproject/workshop/releases/download/2021.08.09/condacolab-0.1-Linux-x86_64.sh

import condacolab

conda_constructor = 'https://github.com/materialsproject/workshop/releases/download/2021.08.09/condacolab-0.1-Linux-x86_64.sh'

condacolab.install_from_url(conda_constructor)

The environment is now set up with pymatgen, fireworks, custodian, atomate, and matminer.
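As a quick sanity check (the exact versions depend on the constructor build), you can import each of the workshop packages and print their versions:

import pymatgen
import fireworks
import custodian
import atomate
import matminer

# Print each package's version where it exposes one.
for pkg in (pymatgen, fireworks, custodian, atomate, matminer):
    print(pkg.__name__, getattr(pkg, '__version__', 'installed'))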

Building a conda constructor

To build your own environment constructor (to save time when restarting the notebook):

Adapted from this notebook

  1. Download the template constructor files:

!wget -q https://raw.githubusercontent.com/materialsproject/workshop/master/workshop/primer/04_Python_environments/env_installer/construct.yaml
!wget -q https://raw.githubusercontent.com/materialsproject/workshop/master/workshop/primer/04_Python_environments/env_installer/pip-dependencies.sh

  2. Modify the construct.yaml as desired (see the sketch below). Here, we will add:

     • numpy 1.19.5
     • pymatgen
     • custodian
     • fireworks
     • atomate
     • matminer
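For illustration only, the modified file might look roughly like the sketch below, assuming constructor's standard construct.yaml layout. The installer name and version are placeholders, and the downloaded template should be treated as the authoritative starting point; writing the file with %%writefile as shown would overwrite that template.

%%writefile construct.yaml
# Illustrative sketch only -- adapt the downloaded template rather than
# copying this verbatim; installer name and version are placeholders.
name: workshop-env
version: "0.1"
channels:
  - conda-forge
specs:
  - python
  - numpy 1.19.5
  - pymatgen
  - custodian
  - fireworks
  - atomate
  - matminer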
  3. Install condacolab, conda, mamba, and constructor:

!pip install -q condacolab

Install conda and mamba

import condacolab

condacolab.install()

Install constructor

!mamba install -q constructor

  4. Run constructor to build the .sh installer file:

!constructor .

  5. Download the .sh installer and upload it to any file hosting site (GitHub releases can be used):

from google.colab import files

installer = !ls *-Linux-x86_64.sh
files.download(installer[0])