Machine Learning / Data Science notes
News
Learning
- Virgilio - a learning path for machine learning
- Awesome Machine Learning/Data Science repo -- all the links you could want and more
- Machine Learning Operations - curated list
- MIT Intro to Deep Learning course
- Intro to GeoML
- Earth Lab - online tutorials for earth data science
- YouTube playlist - ARM ML seminar series
- Devis Peressutti - developer for eo-learn
Data Science
- Coding for Data - free online Data Science textbook
- Hands-on Data Visualization -- awesome book
Javascript/browser
Maps
- Vancouver tree map -- lots of stuff to steal here
- Eagle tracking
- Movebank.org -- animal tracking data
Geoscience
- Visualizing LIDAR data in Python
Digital Africa Sandbox Notebooks -- the "Real World Examples" section looks really good
Reference
Machine Learning in Python, Step-By-Step -- from the excellent machinelearningmastery.com site
Neural Networks and Deep Learning - online book, CC-licensed, excellent
Books
Production
- Deep Learning in Production -- a curated set of links
TinyML
- Towards Data Science article
- Microsoft Bonsai -- classifier that can run on Arduino
- Edge Impulse -- neat service for making models easily & deploying to Arduino, Pi, etc
Repos
Machine Learning Deliberate Practice -- great intro to one person's project to get better at ML
GIS/Satellite imagery/etc
Data sets
- Practice ML with small, in-memory datasets from the UCI ML Repository
- Awesome satellite imagery datasets
- Awesome forests dataset
Techniques
- Long short-term memory analysis with Keras and Python
- Good overview of object detection & segmentation, wrapped up in a tutorial about PyTorch Lightning and DET
- minGPT -- A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
Tools
- Cookiecutter data science - the DS equivalent of "rails new"
- GBDX Notebooks - Jupyter notebooks + DigitalGlobe imagery
- EasyImages - Library for easy image display in Jupyter notebooks (mentioned in this article about image segmentation)
- Boilerplate code for machine learning -- excellent
Image labelling
- Awesome data labelling resources
- MakeSense.ai -- cool online, open-source image labeller
Automated optimization
Libraries
Folium - Python + LeafletJS for maps
Categorical Encoding - Scikit-Learn library
Wireservice -- excellent data science libraries. Like Pandas, but much simpler.
Engineering
- OpML Conference 2019 -- USENIX conference
Computing
Projects
Ein-notebook: Emacs interface to Jupyter
Test from ein
This is a notebook explaining how I got ein working.
- Fire up a shell in the project, and activate the venv.
- Run jnb to start up Jupyter.
- Copy the URL, including the token. (Note: this part I'm still figuring out.)
- Run ein:login and give the URL, NOT including the token (but try the token if that doesn't work)
- This will fail. Try again; you should now be able to get a list of notebooks, etc.
- Click on the [Dir] link for a directory, rather than the name of a directory.
- You can click on [New Notebook], or run ein:notebooklist-new-notebook-with-name. Note that you'll need to include the directory with the name:
- my_new_notebook will create my_new_notebook.ipynb at the root of the project.
- notebooks/my_new_notebook will create that notebook in the notebooks directory.
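The URL that Jupyter prints at startup carries the token as a query parameter, while ein:login wants the bare URL first. Here is a minimal Python sketch of splitting the two, assuming the URL has the usual `http://localhost:8888/?token=...` shape. The helper name and the example token are made up for illustration; nothing here is part of ein or Jupyter.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs


def split_jupyter_url(url):
    """Return (base_url, token) for a URL copied from Jupyter's startup output.

    Hypothetical helper, not part of ein or Jupyter. The token is None when
    the URL has no ?token=... query parameter.
    """
    parts = urlsplit(url)
    token = parse_qs(parts.query).get("token", [None])[0]
    # Drop the query string and fragment; keep scheme, host:port and path.
    base_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return base_url, token


# Example URL with a made-up token.
base, token = split_jupyter_url("http://localhost:8888/?token=abc123")
print(base)   # the URL to give ein:login first
print(token)  # the token to try if the bare URL is rejected
```

Keeping the two pieces separate makes it easy to try the bare URL first and fall back to the token.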
Some helpful commands
- ein:notebooklist-open will show you a list of notebooks. Note: click the [Dir] link, not the directory name.
- C-c C-c will execute a cell.
- M-return will execute a cell and go to the next cell.
- C-c C-a will insert a cell above the current one.
- C-c C-b will insert a cell below the current one.
- C-c C-t toggles the cell type (markdown to code & vice-versa). So will C-c C-u.
- C-c C-s will split a cell at point.
- C-c RET will merge a cell with the previous cell.
- C-c C-k will kill/delete a cell.
- C-x C-s will save a notebook.
- C-x C-w will rename a notebook. Note that if you want it saved in a directory, you have to include that in the name. Thus:
- C-x C-w my_notebook will save it as my_notebook.ipynb in the root of the project.
- C-x C-w notebooks/my_notebook will save it with the same name, but in the notebooks directory.
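The naming rule in these notes (a bare name lands in the project root, a name prefixed with a directory lands in that directory) can be sketched in a few lines of Python. This only illustrates the mapping from the name you type to the resulting file. The helper is hypothetical, and it assumes the `.ipynb` extension is simply appended to whatever name is given.

```python
from pathlib import PurePosixPath


def notebook_path(name):
    """Map a name typed at the prompt to a file path relative to the project root.

    Hypothetical helper mirroring the behaviour described in the notes;
    it assumes ".ipynb" is appended to the final path component.
    """
    path = PurePosixPath(name)
    # Append the extension to the last component, leaving any directory intact.
    return str(path.with_name(path.name + ".ipynb"))


print(notebook_path("my_notebook"))            # my_notebook.ipynb
print(notebook_path("notebooks/my_notebook"))  # notebooks/my_notebook.ipynb
```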