Specialist course Doctoral schools of Ghent University
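To get started, you should have the following elements set up:
The course material, downloaded to your computer
conda, installed on your computer
The Python packages used in the course, installed with conda
In the following sections, more details are provided for each of these steps. When all three are done, you are ready to start coding!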
As the course has been set up as a git repository managed on GitHub, you can clone the entire course to your local machine. Use the command line to clone the repository and go into the course folder:
git clone https://github.com/jorisvandenbossche/DS-python-data-analysis.git
cd DS-python-data-analysis
If you prefer using GitHub Desktop, see this tutorial.
To download the repository to your local machine as a zip-file, click the download ZIP option on the repository page https://github.com/jorisvandenbossche/DS-python-data-analysis (green button “Code”).
After the download, unzip the file in a location you prefer within your user account (e.g. My Documents, not C:\). Watch out for a nested ‘DS-python-data-analysis/DS-python-data-analysis’ folder structure after unzipping and move the inner DS-python-data-analysis folder to your preferred location.
Note: Make sure you know where you stored the course material, e.g. C:/Users/yourusername/Documents/DS-python-data-analysis.
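As a quick sanity check on the unzipped folder, the minimal Python sketch below looks for environment.yml (the file the course folder should contain) and for the nested folder structure mentioned above. The function name find_course_root is hypothetical and not part of the course material.

```python
from pathlib import Path


def find_course_root(folder):
    """Return the folder that directly contains environment.yml, or None.

    Hypothetical helper: it checks the folder itself and, to catch the
    nested layout that unzipping can produce, an inner folder with the
    same name.
    """
    folder = Path(folder)
    candidates = [folder, folder / folder.name]
    for candidate in candidates:
        if (candidate / "environment.yml").is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Example path from the note above; replace it with your own location.
    root = find_course_root("C:/Users/yourusername/Documents/DS-python-data-analysis")
    print(root if root is not None else "environment.yml not found")
```

If the function returns the inner folder, that inner folder is the one to move to your preferred location.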
For scientific work and data analysis, we recommend using conda, a command line tool for package and environment management (https://docs.conda.io/projects/conda/). conda allows us to install a Python distribution with the scientific libraries we will use in this course (this recommendation applies to all platforms: Windows, Linux and Mac).
If you do not have conda installed yet, we recommend using the installer provided by the conda-forge community: https://conda-forge.org/download/.
Follow the instructions on that page, i.e. first download the appropriate installer (depending on your operating system), and then run that installer.
On Windows, this will mean double-clicking the downloaded .exe file, and following the instructions. During installation, choose the options (click checkbox):
On macOS or Linux, you have to open a terminal and run bash Miniforge3-$(uname)-$(uname -m).sh
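To see which installer file name the shell command above expands to on your machine, the following small Python sketch builds the same name. It assumes that platform.system() and platform.machine() report the same values as uname and uname -m, which is normally the case on Linux and macOS; the function name miniforge_installer_name is only illustrative.

```python
import platform


def miniforge_installer_name():
    """Build the file name that Miniforge3-$(uname)-$(uname -m).sh expands to.

    Assumption: platform.system() matches `uname` (e.g. Linux or Darwin)
    and platform.machine() matches `uname -m` (e.g. x86_64 or arm64).
    """
    return f"Miniforge3-{platform.system()}-{platform.machine()}.sh"


if __name__ == "__main__":
    # The printed name should match the installer file you downloaded.
    print(miniforge_installer_name())
```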
If you already have conda, Anaconda or Miniconda installed, make sure you are working with a recent version. If you installed it only a few months ago, this step is probably not needed; otherwise, follow the next steps:
Update conda by typing conda update conda and hitting the ENTER-button (make sure you have an internet connection), and respond with Yes by typing y.
Add the conda-forge channel by typing conda config --add channels conda-forge and hitting the ENTER-button.
Set the channel priority by typing conda config --set channel_priority strict and hitting the ENTER-button.
If you are using Anaconda on Windows, replace “Miniforge Prompt” with “Anaconda Prompt” each time in the following sections.
After the conda installation, we will use conda to install the Python packages we are going to use throughout this course.
As a good practice, we will create a new conda environment to work with. The packages used in the course are listed in an environment.yml file. The file looks as follows:
name: DS-python
channels:
- conda-forge
dependencies:
- python=3.12
- geopandas
- ...
The file contains information on:
name is the name used for the environment
channels defines where to download the packages from
dependencies contains each of the packages
The environment.yml file for this course is included in the course material you downloaded.
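To make the three parts of the file concrete, here is a minimal Python sketch that reads the simple layout shown above. The parser is written for illustration only and handles just top-level keys and "- item" lists; it is not a general YAML reader, and read_environment_file is a hypothetical name. The example text repeats only the entries visible above and leaves out the elided ones.

```python
def read_environment_file(text):
    """Parse top-level 'key: value' lines and '- item' lists.

    Illustration only: nested mappings and other YAML features
    are not supported.
    """
    result = {}
    current_key = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("- "):
            # List item belonging to the most recent key without a value.
            if current_key is not None:
                result[current_key].append(line[2:].strip())
        else:
            key, _, value = line.partition(":")
            value = value.strip()
            if value:
                result[key.strip()] = value
                current_key = None
            else:
                current_key = key.strip()
                result[current_key] = []
    return result


EXAMPLE = """\
name: DS-python
channels:
- conda-forge
dependencies:
- python=3.12
- geopandas
"""

if __name__ == "__main__":
    env = read_environment_file(EXAMPLE)
    print(env["name"])          # name used for the environment
    print(env["channels"])      # where the packages are downloaded from
    print(env["dependencies"])  # packages to install
```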
Now we can create the environment:
Navigate to the directory where you downloaded the course materials (that directory should contain an environment.yml file; double check in your file explorer):
cd FOLDER_PATH_TO_COURSE_MATERIAL
(Make sure to hit the ENTER-button to run the command)
Create the environment by typing the following command and hitting the ENTER-button (make sure you have an internet connection):
conda env create -f environment.yml
! FOLDER_PATH_TO_COURSE_MATERIAL should be replaced by the path to the folder containing the downloaded course materials (e.g. in the example it is C:/Users/yourusername/Documents/DS-python-data-analysis)
! You can safely ignore the warning FutureWarning: 'remote_definition'...
Respond with Yes by typing y when asked. Output will be printed and if no error occurs, you should have the environment configured with all packages installed.
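If you want to confirm that the environment was created, conda can list all environments in machine-readable form with conda env list --json. The sketch below, a rough illustration rather than an official tool, extracts the environment names from that output. The sample paths are made up; on your machine, the text would be the output of the conda command itself.

```python
import json
from pathlib import PureWindowsPath


def environment_names(conda_env_list_json):
    """Return the folder names of the environments in the JSON output.

    The output of `conda env list --json` stores environment paths under
    the "envs" key. PureWindowsPath is used because it splits on both
    "/" and "\\", so Windows and Linux/macOS paths are handled alike.
    """
    paths = json.loads(conda_env_list_json).get("envs", [])
    return [PureWindowsPath(path).name for path in paths]


if __name__ == "__main__":
    # Made-up sample output; replace with the real output of the command.
    sample = json.dumps(
        {"envs": ["/home/user/miniforge3", "/home/user/miniforge3/envs/DS-python"]}
    )
    names = environment_names(sample)
    print("DS-python found" if "DS-python" in names else "DS-python missing")
```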
When finished, keep the terminal window (or “Miniforge Prompt”) open (or reopen it). Execute the following commands to check your installation:
conda activate DS-python
ipython
Within the terminal, a Python session will be started in which you can start writing Python! Type the following commands:
import pandas
import matplotlib
If no message is returned, you’re all set! If a message (probably an error) is returned, contact the instructors and copy-paste the returned message.
To get out of the Python session, type:
quit
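The import check above can also be written as a small script that reports the result per package instead of stopping at the first error. This is a minimal sketch and not the check_environment.py script shipped with the course; the function name check_imports is hypothetical.

```python
import importlib


def check_imports(module_names):
    """Try to import each module; return {name: version or error message}."""
    report = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError as error:
            report[name] = f"FAILED: {error}"
        else:
            # Many packages expose __version__; fall back if they do not.
            report[name] = str(getattr(module, "__version__", "installed"))
    return report


if __name__ == "__main__":
    for name, status in check_imports(["pandas", "matplotlib"]).items():
        print(f"{name}: {status}")
```

A line starting with FAILED contains the message to pass on to the instructors.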
To check if your packages are properly installed, open the Conda Terminal again (see above) and navigate to the course directory:
cd FOLDER_PATH_TO_COURSE_MATERIAL
With FOLDER_PATH_TO_COURSE_MATERIAL replaced by the path to the folder with the downloaded course material (e.g. in the example it is C:/Users/yourusername/Documents/DS-python-data-analysis).
Activate the newly created conda environment:
conda activate DS-python
Then, run the check_environment.py script:
python check_environment.py
When all checkmarks are ok, you’re ready to go!
Each of the course modules is set up as a Jupyter notebook, an interactive environment to write and run code. It is no problem if you have never used Jupyter notebooks before, as an introduction to notebooks is part of the course.
In the terminal (or “Miniforge Prompt”), navigate to the DS-python-data-analysis directory (downloaded or cloned in the previous section):
cd FOLDER_PATH_TO_COURSE_MATERIAL
Ensure that the correct environment is activated.
conda activate DS-python
Start a Jupyter notebook server by typing:
jupyter lab
This will open a browser window automatically. Navigate to the course directory (if not already there) and choose the notebooks folder to access the individual notebooks containing the course material.