Data Analytics Bootcamp 2021 – Handy Class Info

Some of the environment setup isn’t terribly well documented, so I’m adding it here for our use.

Setting up our Data Bootcamp conda environment:

In our class GitLab repo you’ll find this folder:

uofo-por-virt-data-pt-03-2021-u-c/01-ClassActivities/03-Python/Supplemental/Anaconda Environments

The following sets up our basic PythonData virtual environment:

conda create -n PythonData --file intro_python_requirements_osx.txt python=3.6

Note: choose the .txt file that matches your operating system (the example above uses the macOS one). This also assumes you’re running the command from the same directory as the file.

Activate your virtual environment:

conda activate PythonData
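
To confirm the activation worked, a quick check from Python can help. This is only a minimal sketch: it relies on conda setting the CONDA_DEFAULT_ENV environment variable when an environment is activated, and the function name describe_environment is made up for illustration.

```python
import os
import sys


def describe_environment(environ=None):
    """Return (conda_env_name, python_major_minor) for the running interpreter."""
    environ = os.environ if environ is None else environ
    # conda sets CONDA_DEFAULT_ENV when an environment is activated;
    # the variable is absent when no environment is active.
    env_name = environ.get("CONDA_DEFAULT_ENV", "(none)")
    version = "{}.{}".format(sys.version_info.major, sys.version_info.minor)
    return env_name, version


if __name__ == "__main__":
    name, version = describe_environment()
    print("conda environment:", name)
    print("python version:", version)
```

If the environment name comes back as PythonData and the version as 3.6, the setup above most likely did what it should.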

Other items we need to install, while you’re in the PythonData environment:

conda install -c anaconda nb_conda_kernels

…for our API requests:

conda install requests

…and now the Google Maps tools:

conda install -c conda-forge gmaps
jupyter nbextension enable --py gmaps
jupyter nbextension enable --py widgetsnbextension

…also the census tools:

conda install -c conda-forge census

…and the citipy library:

pip install citipy

…and you’ll need SQLAlchemy:

conda install -c anaconda sqlalchemy

I found I needed this for the SQL homework bonus:

pip install psycopg2-binary

We also need PostgreSQL. At the time of this writing you want version 11.11 for your computing platform. Just accept all the defaults while installing it.
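
Once PostgreSQL is installed, SQLAlchemy reaches it through a connection URL. Here is a rough sketch of building one. It assumes the default port 5432 and a server on localhost; the function name postgres_url, the password, and the database name example_db are placeholders I made up, not anything from the class materials.

```python
from urllib.parse import quote_plus


def postgres_url(user, password, database, host="localhost", port=5432):
    """Build a SQLAlchemy-style connection URL for PostgreSQL."""
    # quote_plus escapes characters such as '@' or ':' that would otherwise
    # break the URL if they appear in the user name or password.
    return "postgresql://{}:{}@{}:{}/{}".format(
        quote_plus(user), quote_plus(password), host, port, database
    )


if __name__ == "__main__":
    # Placeholder credentials; substitute your own.
    print(postgres_url("postgres", "p@ss:word", "example_db"))
```

The resulting string is what you would pass to sqlalchemy.create_engine. With psycopg2-binary installed, SQLAlchemy uses psycopg2 as its driver for postgresql:// URLs.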

For scraping web pages we’ll be using Beautiful Soup:

pip install bs4

We also need to get the web-driver manager and splinter:

pip install webdriver_manager
pip install splinter

MongoDB: if you’re like me (and I know I am) you may run into “issues” getting it to run. If you installed it with Homebrew on macOS, try this:

brew services start mongodb-community@4.2
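
To see whether the service actually came up, you can check whether anything is listening on MongoDB’s default port, 27017. This is a sketch using only the standard library; the function name is_port_open is hypothetical, and an open port only suggests (it does not prove) that the listener is MongoDB.

```python
import socket


def is_port_open(host="localhost", port=27017, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError if nothing is listening
        # or the attempt times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on port 27017 (probably MongoDB).")
    else:
        print("Nothing is listening on port 27017; the service may not be running.")
```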

We’ll be parsing things with lxml, so:

pip3 install lxml
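
After this many installs it is easy to lose track of what made it into the environment. The sketch below checks whether each package can be found for import without actually importing it. The list of module names is my own assumption based on the installs above (note that an import name can differ from the install name), and find_missing is a made-up helper name.

```python
import importlib.util

# Import names for the packages installed above. The import name can differ
# from the install name (psycopg2-binary, for example, imports as psycopg2).
PYTHONDATA_MODULES = [
    "requests", "gmaps", "census", "citipy", "sqlalchemy",
    "psycopg2", "bs4", "webdriver_manager", "splinter", "lxml",
]


def find_missing(module_names):
    """Return the names from module_names that cannot be found for import."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(PYTHONDATA_MODULES)
    if missing:
        print("Not found in this environment:", ", ".join(missing))
    else:
        print("All packages found.")
```

Run it with the PythonData environment active; anything it lists can be installed with the matching command above.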

Install instructions for the machine learning unit

In your shell, cd into the first activity resources folder where you’ll find a file called python_adv.yml. Then execute these commands:

conda deactivate
conda env create --file python_adv.yml
conda activate PythonAdv

And for working with Spark and Hadoop:

pip install MRJob

I will document more items as we install them…

Would you like to have your current Git branch appear on the command line? Here are some blog posts that show how to make this change. Hey, guess what? The Conda installer writes your .bash_profile as “root”, so you’ll need to edit it with “sudo nano ~/.bash_profile” to be able to save changes to your own file!

Have Python speak a line of text through your computer (this uses the macOS say command, so it won’t work on other systems as written):

import os
os.system('say "This is your python script speaking."')
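
Since the say command only ships with macOS, here is a slightly more defensive sketch that falls back to printing when say isn’t available. The function name speak and the fallback behavior are my own choices, not part of the class materials.

```python
import shutil
import subprocess


def speak(text):
    """Speak text with the `say` command if present, else print it.

    Returns True when `say` was used, False when the text was only printed.
    """
    say_path = shutil.which("say")
    if say_path is None:
        # No `say` command on this system (e.g. Windows or most Linux setups).
        print(text)
        return False
    # Passing a list avoids shell quoting problems with quotes in the text.
    subprocess.run([say_path, text], check=False)
    return True


if __name__ == "__main__":
    speak("This is your python script speaking.")
```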