More examples on GitHub
If you prefer, you can use any IDE or text editor to work directly on .py files; I will also provide the .py file corresponding to each notebook file. If you have some experience, you can also run Python, PySpark and Spark natively on Windows 10. For the lectures I will focus on Linux and Jupyter Notebook.
Double check that your code matches the notes exactly!
Search Google or Stack Overflow for quick answers on your own.
Contact me or the session community; I will provide a GitHub project with an issue tracker for that purpose.
Tested on Windows 10 Enterprise (or Pro) edition, version 1909 and above.
Use the GUI, or simply open PowerShell as Administrator and run the following commands:
dism.exe /online /enable-feature /featurename:VirtualMachinePlatform /all /norestart
dism.exe /online /enable-feature /featurename:Microsoft-Windows-Subsystem-Linux /all /norestart
Reboot (restart) if needed
Set WSL 2 as the default version (WSL 2 must be installed first; it will become the default in the near future).
"WSL 2 will soon be officially available as part of Windows 10, version 2004!" Upgrade to that version if needed, then run:
wsl --set-default-version 2
Create a user and password: to use the same user as in these notes, choose bdml.
sudo apt install git
Generate the public/private key pair for SSH; we will use it with GitHub:
ssh-keygen
Create your GitHub account, or use the one you already have.
sudo apt install python3
python3 --version
bdml@PF:~$ sudo apt install python3
[sudo] password for bdml:
....
bdml@PF:~$ python3 --version
Python 3.8.5
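The same check can be done from inside Python, which is handy at the top of a script or notebook. This is a minimal sketch: the function name python_is_recent_enough and the threshold (3, 8) are assumptions chosen to match the version shown in the transcript above, not a requirement stated in these notes.

```python
import sys

# Assumed threshold for illustration only: the transcript above shows 3.8.5.
MIN_VERSION = (3, 8)


def python_is_recent_enough(version_info=None, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`.

    `version_info` defaults to the running interpreter's sys.version_info;
    passing a tuple such as (3, 8, 5) makes the function easy to test.
    """
    info = sys.version_info if version_info is None else version_info
    current = tuple(info)[:2]  # keep only (major, minor)
    return current >= tuple(minimum)


if __name__ == "__main__":
    print("Running Python", sys.version.split()[0])
    print("Recent enough:", python_is_recent_enough())
```

Save it to a file and run it with python3 to see the result for your own installation.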
There are a few more packages and development tools to install to ensure that we have a robust setup for our programming environment:
sudo apt install -y python3-pip build-essential libssl-dev libffi-dev python3-dev
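To avoid conflicting packages and to get an isolated environment, we create a virtual environment. The venv module is part of the Python standard library; on Ubuntu it is installed with apt rather than with pip:
sudo apt install -y python3-venv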
pip is the package installer for Python; more information at https://pypi.org/project/pip/
python3 -m venv ~/vBDML
# In Unix, ~ represents the home directory ($HOME), the root of all your files
source ~/vBDML/bin/activate
Example: here the user is pascalfares and the venv is vDMA:
pascalfares@PF:~$ source vDMA/bin/activate
(vDMA) pascalfares@PF:~$
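Besides the (vDMA) prefix in the prompt, you can confirm from Python itself that the interpreter belongs to a virtual environment. A minimal sketch, assuming an environment created with python3 -m venv; the function name in_virtualenv is only illustrative. It relies on the fact that inside a venv sys.prefix points to the environment while sys.base_prefix points to the system installation.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True if the interpreter runs inside a venv.

    Inside a venv, sys.prefix is the environment directory and
    sys.base_prefix is the system Python; outside, both are equal.
    The optional arguments exist only to make the check testable.
    """
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    print("Interpreter prefix:", sys.prefix)
    print("Inside a virtual environment:", in_virtualenv())
```

Run it once before and once after the source command; the answer should change from False to True.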
JupyterLab requires Python 3 as well as a Python package manager (we'll use pip). You installed both in the previous steps in Ubuntu (native or WSL).
We will use the virtual environment created previously. I assume it is named vBDML; if not, replace vBDML in the following commands with the name you chose.
~/vBDML/bin/pip install jupyterlab pandas matplotlib
Along with JupyterLab, we’ll also install pandas and Matplotlib, as these are two popular Python libraries used in conjunction with Jupyter itself.
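To verify that the installation went into the virtual environment, you can ask Python whether the three packages are importable. A minimal sketch: the helper name missing_modules is an assumption for illustration, and jupyterlab, pandas and matplotlib are the import names of the packages installed above. Run it with ~/vBDML/bin/python so that it checks the environment and not the system Python.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found.

    find_spec() looks a module up without importing it, so the check
    is fast and has no side effects.
    """
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    wanted = ["jupyterlab", "pandas", "matplotlib"]
    missing = missing_modules(wanted)
    if missing:
        print("Not installed in this environment:", ", ".join(missing))
    else:
        print("All packages found:", ", ".join(wanted))
```

If something is reported as missing, rerun the pip install command above using the full path to the pip of your environment.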
In my case
(vBDML) bdml@PF:~$ jupyter lab --generate-config
To be safe and make sure we use the Jupyter installed in the virtual environment, call it by its full path:
~/vBDML/bin/jupyter lab --generate-config
Writing default config to: /home/bdml/.jupyter/jupyter_notebook_config.py
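The name of the generated file can vary between Jupyter versions, so it is worth checking what was actually written before editing. A minimal sketch, assuming the default configuration directory ~/.jupyter shown in the output above; the helper name list_jupyter_configs and the file name pattern are illustrative assumptions.

```python
from pathlib import Path


def list_jupyter_configs(config_dir=None):
    """Return the sorted names of Jupyter config files found in `config_dir`.

    Defaults to ~/.jupyter (assumed default location). Returns an empty
    list if the directory does not exist.
    """
    directory = Path(config_dir) if config_dir else Path.home() / ".jupyter"
    if not directory.is_dir():
        return []
    # Matches names such as jupyter_notebook_config.py
    return sorted(p.name for p in directory.glob("jupyter_*config.py"))


if __name__ == "__main__":
    found = list_jupyter_configs()
    if found:
        for name in found:
            print(name)
    else:
        print("No Jupyter config file found; run the --generate-config step first.")
```

Use the file name it prints in the edit command that follows if it differs from the one in these notes.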
Edit the file jupyter_notebook_config.py, with VS Code for example:
code ~/.jupyter/jupyter_notebook_config.py
and change the following line
...
to
....
then try it
jupyter lab
If it does not work, retry, following all the steps again, or ask me.