0. Configuring your computer to use Python for scientific computing¶
Python is a flexible programming language that is widely used in many applications, in contrast to more domain-specific languages like R and Julia. It is easily extensible, which is in many ways responsible for its breadth of use. We find that there is a decent Python-based tool for many applications we can dream up, certainly in systems biology. The Python-based tool is seldom the very best for the particular task at hand, but it is almost always pretty good. Thus, knowing Python is like having a Swiss Army knife; you can wield it to effectively accomplish myriad tasks. Finally, we find that it has a gentle learning curve for most students.
Why not use systems biology packages?¶
There are packages available to streamline systems biology calculations, such as PySB or MATLAB’s SimBiology. While these packages are useful, we find that many applications in systems biology, and in genetic circuits in particular, need, or at least benefit from, bespoke computational analyses. We will therefore build all of our code from scratch, using only packages like NumPy, SciPy, and Bokeh, which contain core numerical and plotting data structures and routines. Of course, code we use in one lesson may be reused in another, but our approach is to build all of the code we need as we go along. This will provide a greater level of mastery and less reliance on black boxes (though there will inevitably be some).
What to do if you are new to Python¶
As you proceed through the lessons, we assume that you have a basic introduction to computer programming and the Python programming language. We assume further that you have a working knowledge of NumPy. If this is new to you, there are plenty of great resources to learn Python and to learn the basics quickly. A weeklong intensive course offered by one of the authors and the resources linked to therein provide a good starting point.
Installing a Python distribution¶
Prior to embarking on your journey into biological circuits, you need to have a functioning Python distribution installed on your computer. There are two main ways people set up Python for scientific computing.
By downloading and installing package by package with tools like pip.
By downloading and installing a Python distribution that contains binaries of many of the scientific packages needed. The major distributions of these are Anaconda and Enthought Canopy. Both contain IDEs.
We will use Anaconda, with its associated package manager, conda. It is pretty much the de facto package manager/distribution for scientific use.
A special note to Mac users¶
If your machine is a Mac, you will need to install Xcode, which you can get through the App Store, before installing Anaconda. Once you install Xcode, you need to launch it so that everything is set up properly. It will take a while to launch, and it may ask you to install extras, which you should do. After it has launched, you can close it, and you won’t need it again for this course in biological circuits. Installing and launching Xcode sets up important components under the hood.
Windows users: Chrome or Firefox¶
To run Jupyter notebooks, you use JupyterLab. It is browser-based, and Chrome, Firefox, and Safari are supported. Internet Explorer is not. Therefore, if you are a Windows user, you need to be sure you have either Chrome or Firefox installed.
Downloading and installing Anaconda¶
Mac users: Before installing Anaconda, be sure you have Xcode installed.
Downloading and installing Anaconda is simple.
Go to the Anaconda homepage and download the graphical installer.
Install Anaconda with Python 3.7.
You may be prompted for your email address, which you should provide. If you are at a university, you may want to use your university email address because educational users can get some of the non-free goodies in Anaconda.
Follow the on-screen instructions for installation. While doing so, be sure that Anaconda is installed in your home directory, not in root.
That’s it! After you do that, you will have a functioning Python distribution.
Launching JupyterLab and a terminal¶
After installing the Anaconda distribution, you should be able to launch the Anaconda Navigator. If you’re using macOS, this is available in your Applications menu. If you are using Windows, you can do this from the Start menu. Launch Anaconda Navigator.
You should see an option to launch JupyterLab. When you do that, a new browser window or tab will open with JupyterLab running. Within the JupyterLab window, you will have the option to launch a notebook, a console, a terminal, or a text editor. We will use notebooks heavily.
We will be using command line-based tools for package management. You can use your OS’s Terminal program, or you can use JupyterLab, in which case you can click on Terminal to launch a terminal. You will get a terminal window (probably black) with a prompt. We refer to this text interface in the terminal as the “command line.” You will use this to install the requisite packages.
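Once JupyterLab is running, one quick way to confirm that your notebooks use the Anaconda interpreter is to inspect it from a notebook cell. The snippet below is a minimal sketch using only the standard library; the helper name `python_is_recent_enough` and the constant `MIN_VERSION` are illustrative, and the minimum version is simply taken from the installation step above.

```python
import sys

# Minimum version assumed from the installation step above (Python 3.7)
MIN_VERSION = (3, 7)


def python_is_recent_enough(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


# The executable path should point inside your Anaconda installation
print("Python executable:", sys.executable)
print("Python version:", sys.version.split()[0])
print("Meets minimum:", python_is_recent_enough())
```

If the executable path points somewhere other than your Anaconda installation, the notebook is probably running a different Python than the one you just installed.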
conda package manager¶
conda is a package manager for keeping all of your packages up-to-date. It has plenty of functionality beyond our basic usage in class, which you can learn more about by reading the docs. We will primarily be using conda to install and update packages.
conda works from the command line. Now that you know how to get a command line prompt, you can start using conda. The first thing we’ll do is update conda itself. To do this, enter the following on the command line:
conda update conda
If conda is out of date and needs to be updated, you will be prompted to perform the update. Just type y, and the update will proceed.
Now that conda is updated, we’ll use it to see what packages are installed. Type the following on the command line:
conda list
This gives a list of all packages and their versions that are installed. Now, we’ll update all packages, so type the following on the command line:
conda update --all
You will be prompted to perform all of the updates. There may even be some downgrades. This happens when there are package conflicts where one package requires an earlier version of another. conda is very smart and figures all of this out for you, so you can almost always say “yes” (or “y”) to conda when it prompts you.
As you work through this course, you will sometimes use packages that are not included in the default Anaconda distribution. As we develop code throughout the course, we will reuse it. For convenience, this is contained in the biocircuits package. You can install these packages with conda and pip. We will take care of these installations now, and will discuss the packages in much more detail as we use them. To do the installations, enter the following on the command line.
conda install -c pyviz holoviz
conda install nodejs
conda install selenium phantomjs pillow black
pip install biocircuits bokeh-catplot watermark blackcellmagic
Finally, we need to configure JupyterLab to work with Bokeh, which we will use to visualize images.
jupyter labextension install --no-build @pyviz/jupyterlab_pyviz
After installing all of these extensions, you can rebuild JupyterLab.
jupyter lab build
If you’re using a terminal in JupyterLab, close your JupyterLab session and relaunch it after you have completed the build.
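After relaunching, it can be worth confirming that Python can actually locate the packages you just installed. The following is a minimal sketch, not part of the course code: the helper `find_missing` and the list `REQUIRED_MODULES` are hypothetical names, and the import names in the list are assumed to correspond to packages mentioned in this lesson, so adjust them to your own setup.

```python
import importlib.util

# Import names assumed to correspond to packages mentioned in this lesson
REQUIRED_MODULES = [
    "numpy",
    "scipy",
    "bokeh",
    "biocircuits",
    "bokeh_catplot",
    "watermark",
]


def find_missing(module_names):
    """Return the names in `module_names` that cannot be located for import."""
    missing = []
    for name in module_names:
        try:
            # find_spec returns None when a top-level module cannot be found
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Not found:", ", ".join(missing))
else:
    print("All listed modules were found.")
```

This only checks that each module can be located; it does not import the modules or verify their versions.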
Checking your distribution¶
We’ll now run a quick test to make sure things are working properly. We will make a quick plot that requires some of the scientific libraries we will use in the course.
Use the JupyterLab launcher (you can get a new launcher by clicking on the + icon on the left pane of your JupyterLab window) to launch a notebook. In the first cell (the box next to the [ ]: prompt), paste the code below. To run the code, press Shift+Enter while the cursor is active inside the cell. You should see a plot that looks like the one below. If you do, you have a functioning Python environment for scientific computing!
import numpy as np

import bokeh.io
import bokeh.models
import bokeh.plotting

bokeh.io.output_notebook()

# Generate plotting values
t = np.linspace(0, 2 * np.pi, 200)
x = 16 * np.sin(t) ** 3
y = 13 * np.cos(t) - 5 * np.cos(2 * t) - 2 * np.cos(3 * t) - np.cos(4 * t)

# Make the plot
p = bokeh.plotting.figure(width=400, height=375)
p.line(x, y, line_width=3, color="red")

# Label anchored at the origin, inside the curve (coordinates assumed)
source = bokeh.models.ColumnDataSource(
    dict(x=[0], y=[0], text=["Biocircuits"])
)
p.text(
    x="x",
    y="y",
    text="text",
    source=source,
    text_align="center",
    text_font_size="18pt",
)

# Display
bokeh.io.show(p)
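If the plot does not render, it can help to check the NumPy part of the test separately from Bokeh. The sketch below recomputes the same curve and reports its bounding box and two simple properties (that it closes on itself and is mirror-symmetric about x = 0). The variable names `closes` and `symmetric` are illustrative only.

```python
import numpy as np

# Same parametrization as the test plot above
t = np.linspace(0, 2 * np.pi, 200)
x = 16 * np.sin(t) ** 3
y = 13 * np.cos(t) - 5 * np.cos(2 * t) - 2 * np.cos(3 * t) - np.cos(4 * t)

# Bounding box of the curve
x_min, x_max = x.min(), x.max()
y_min, y_max = y.min(), y.max()
print(f"x spans [{x_min:.2f}, {x_max:.2f}]")
print(f"y spans [{y_min:.2f}, {y_max:.2f}]")

# The curve should close on itself and be mirror-symmetric about x = 0
closes = np.allclose([x[0], y[0]], [x[-1], y[-1]])
symmetric = np.allclose(x, -x[::-1]) and np.allclose(y, y[::-1])
print("closed:", closes, "symmetric:", symmetric)
```

If this cell runs and reports a closed, symmetric curve but no plot appears, the problem more likely lies with Bokeh or the JupyterLab extension than with NumPy.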
%load_ext watermark
%watermark -v -p numpy,bokeh,jupyterlab
CPython 3.7.7
IPython 7.13.0
numpy 1.18.1
bokeh 2.0.1
jupyterlab 1.2.6
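If the watermark extension is unavailable, similar version information can be gathered with the standard library alone. This is a rough sketch rather than a replacement for watermark: the helper `report_versions` is a hypothetical name, and it assumes each package exposes a `__version__` attribute, falling back to "unknown" when it does not.

```python
import importlib
import platform


def report_versions(module_names):
    """Map each module name to its __version__ string, or a note if unavailable."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = "not installed"
            continue
        # Assumes the package exposes __version__; fall back otherwise
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


print("Python", platform.python_version())
for name, version in report_versions(["numpy", "bokeh", "jupyterlab"]).items():
    print(name, version)
```

Your version numbers will likely differ from those listed above, depending on when you installed and updated your distribution.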