0. Configuring your computer to use Python for scientific computing


Why Python?

As will become readily apparent even at the beginning of our journey into biological circuit design, you will need to use your computer to analyze circuits and understand the principles governing their function. There are plenty of approaches we could take, and many languages we could use for computing as well. Indeed, in addition to Python, Matlab/Octave, Mathematica, R, Julia, Java, JavaScript, C++, and others are widely used. We have chosen to use Python. Though we view this as an unessential choice (we believe language wars are counterproductive and welcome anyone to port the code we use to any language of their choice), we nonetheless feel we should explain our choice.

Python is a flexible programming language that is widely used in many applications. This is in contrast to more domain-specific languages like R and Julia. It is easily extensible, which is in many ways responsible for its breadth of use. We find that there is a decent Python-based tool for many applications we can dream up, certainly in systems biology. The Python-based tool is seldom the very best for the particular task at hand, but it is almost always pretty good. Thus, knowing Python is like having a Swiss Army knife; you can wield it to effectively accomplish myriad tasks. Finally, we find that most students pick it up quickly.

Why not use systems biology packages?

There are packages available to streamline systems biology calculations, such as PySB or Matlab’s SimBiology. While these packages are useful, we find that many applications in systems biology, and in genetic circuits in particular, need, or at least benefit from, bespoke computational analyses. We therefore will build all of our code from scratch, using only packages like NumPy, SciPy, and Bokeh, which contain core numerical and plotting data structures and routines. Of course, code we use in one lesson may be reused in another, but our approach is that we build all of the code we need as we go along. This will provide a greater level of mastery and less reliance on black boxes (though there will inevitably be some).

What to do if you are new to Python

As you proceed through the lessons, we assume that you have a basic introduction to computer programming and the Python programming language. We assume further that you have a working knowledge of NumPy. If this is new to you, there are plenty of great resources to learn Python and to learn the basics quickly. A weeklong intensive course offered by one of the authors and the resources linked to therein provide a good starting point.

Installing a Python distribution

Prior to embarking on your journey into biological circuits, you need to have a functioning Python distribution installed on your computer. There are two main ways people set up Python for scientific computing.

  1. By downloading and installing package by package with tools like pip.

  2. By downloading and installing a Python distribution that contains binaries of many of the scientific packages needed. The major distributions of these are Anaconda and Enthought Canopy. Both contain IDEs.

We will use Anaconda, with its associated package manager, conda. It is pretty much the de facto package manager/distribution for scientific use.

A special note to Mac users

If your machine is a Mac, you will need to install Xcode, which you can get through the App Store, before installing Anaconda. Once you install Xcode, you need to launch it once, because launching it sets up important components under the hood. It will take a while to launch, and it may ask you to install extras, which you should do. After it has launched, you can close it, and you won’t need it again for this course in biological circuits.

Windows users: Chrome or Firefox

To run Jupyter notebooks, you use JupyterLab. It is browser-based, and Chrome, Firefox, and Safari are supported. Internet Explorer is not. Therefore, if you are a Windows user, you need to be sure you have either Chrome or Firefox installed.

Downloading and installing Anaconda

Mac users: Before installing Anaconda, be sure you have Xcode installed.

Downloading and installing Anaconda is simple.

  1. Go to the Anaconda homepage and download the graphical installer.

  2. Install Anaconda with Python 3.7.

  3. You may be prompted for your email address, which you should provide. If you are at a university, you may want to use your university email address because educational users can get some of the non-free goodies in Anaconda.

  4. Follow the on-screen instructions for installation. While doing so, be sure that Anaconda is installed in your home directory, not in root.

That’s it! After you do that, you will have a functioning Python distribution.
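Once you are able to run Python, you may want to confirm that the interpreter you are using is the one installed in your home directory. The following is a minimal sketch using only the standard library. The helper name interpreter_in_home is our own, hypothetical choice, and we assume that "lives somewhere under your home directory" is a reasonable stand-in for "not installed in root."

```python
import sys
from pathlib import Path


def interpreter_in_home(executable=None, home=None):
    """Return True if the Python executable lives under the home directory.

    Both arguments are optional; by default the running interpreter and
    the current user's home directory are used.
    """
    exe = Path(executable or sys.executable).resolve()
    home_dir = Path(home or Path.home()).resolve()
    return home_dir in exe.parents


# Report on the interpreter that is running this code
print("Python version:", sys.version.split()[0])
print("Interpreter:", sys.executable)
print("Installed under home directory:", interpreter_in_home())
```

If the last line reports False, the Python you are running is probably not the Anaconda installation described above, and it is worth checking where Anaconda was installed.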

Launching JupyterLab and a terminal

After installing the Anaconda distribution, you should be able to launch the Anaconda Navigator. If you’re using macOS, this is available in your Applications menu. If you are using Windows, you can do this from the Start menu. Launch Anaconda Navigator.

You should see an option to launch JupyterLab. When you do that, a new browser window or tab will open with JupyterLab running. Within the JupyterLab window, you will have the option to launch a notebook, a console, a terminal, or a text editor. We will use notebooks heavily.

We will be using command line-based tools for package management. You can use your OS’s terminal program, or you can use JupyterLab, in which case you can click on Terminal to launch a terminal. You will get a terminal window (probably black) with a prompt. We refer to this text interface in the terminal as the “command line.” You will use this to install the requisite packages.

The conda package manager

conda is a package manager for keeping all of your packages up-to-date. It has plenty of functionality beyond our basic usage in class, which you can learn more about by reading the docs. We will primarily be using conda to install and update packages.

conda works from the command line. Now that you know how to get a command line prompt, you can start using conda. The first thing we’ll do is update conda itself. To do this, enter the following on the command line:

conda update conda

If conda is out of date and needs to be updated, you will be prompted to perform the update. Just type y, and the update will proceed.

Now that conda is updated, we’ll use it to see what packages are installed. Type the following on the command line:

conda list

This gives a list of all packages and their versions that are installed. Now, we’ll update all packages, so type the following on the command line:

conda update --all

You will be prompted to perform all of the updates. There may even be some downgrades. This happens when there are package conflicts where one package requires an earlier version of another. conda resolves these conflicts for you, so you can almost always say “yes” (or “y”) to conda when it prompts you.

As you work through this course, you will sometimes use packages that are not included in the default Anaconda distribution. In addition, we will reuse code that we develop throughout the course; for convenience, this code is contained in the biocircuits package.

We will take care of these installations now, using conda and pip, and will discuss the packages in much more detail as we use them. To do the installations, enter the following on the command line.

conda install -c pyviz holoviz
conda install nodejs
conda install selenium phantomjs pillow black
pip install biocircuits bokeh-catplot watermark blackcellmagic
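After the installations finish, you can check from Python whether the packages can be found. The following is a minimal sketch, not part of the course code. The helper missing_packages is a hypothetical name of our own, and the import names in the list are assumptions (for example, we assume bokeh-catplot is imported as bokeh_catplot); adjust them if your installation differs.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names in `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names are assumed here; they can differ from the names used
# with conda or pip (e.g., bokeh-catplot vs. bokeh_catplot).
required = ["numpy", "scipy", "bokeh", "biocircuits", "bokeh_catplot", "watermark"]

missing = missing_packages(required)
if missing:
    print("Not found:", ", ".join(missing))
else:
    print("All listed packages were found.")
```

This only checks that each package can be located, without importing it, so it runs quickly. It does not verify that a package works or which version is installed.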

Configuring JupyterLab

Finally, we need to configure JupyterLab to work with Bokeh, which we will use to visualize images.

jupyter labextension install --no-build @pyviz/jupyterlab_pyviz

After installing the extension, you need to rebuild JupyterLab.

jupyter lab build

If you’re using a terminal in JupyterLab, close your JupyterLab session and relaunch it after you have completed the build.

Checking your distribution

We’ll now run a quick test to make sure things are working properly. We will make a quick plot that requires some of the scientific libraries we will use in this course.

Use the JupyterLab launcher (you can get a new launcher by clicking on the + icon on the left pane of your JupyterLab window) to launch a notebook. In the first cell (the box next to the [ ]: prompt), paste the code below. To run the code, press Shift+Enter while the cursor is active inside the cell. You should see a plot that looks like the one below. If you do, you have a functioning Python environment for scientific computing!

[1]:
import numpy as np
import bokeh.io
import bokeh.models
import bokeh.plotting

bokeh.io.output_notebook()

# Generate plotting values
t = np.linspace(0, 2 * np.pi, 200)
x = 16 * np.sin(t) ** 3
y = 13 * np.cos(t) - 5 * np.cos(2 * t) - 2 * np.cos(3 * t) - np.cos(4 * t)

# Make the plot
p = bokeh.plotting.figure(width=400, height=375)
p.line(x, y, line_width=3, color="red")
source = bokeh.models.ColumnDataSource(
    dict(x=[0], y=[0], text=["Biocircuits"])
)
p.text(
    x="x",
    y="y",
    text="text",
    source=source,
    text_align="center",
    text_font_size="18pt",
)

# Display
bokeh.io.show(p)

Computing environment

[2]:
%load_ext watermark
%watermark -v -p numpy,bokeh,jupyterlab
CPython 3.7.7
IPython 7.13.0

numpy 1.18.1
bokeh 2.0.1
jupyterlab 1.2.6
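
If the watermark extension is not available, a rough equivalent can be put together with the standard library. This is a minimal sketch under the assumption that each package exposes a __version__ attribute, which is common but not guaranteed; package_version is a hypothetical helper name of our own.

```python
import importlib
import platform


def package_version(name):
    """Return a package's __version__ string, or None if unavailable."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


# Report the Python version and the versions of a few packages
print("Python", platform.python_version())
for name in ["numpy", "bokeh", "jupyterlab"]:
    print(name, package_version(name) or "not available")
```

Your version numbers will likely differ from those listed above, depending on when you installed Anaconda.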