Wanna know more about data science? Make sure to check out my events and my webinars "What it's like to be a data scientist" and "What's the best way to become a data scientist"!
Data science is ever-evolving, and with so many things going on it can be difficult to keep track of all the new libraries and algorithms. This is where a reference guide can be really useful. Continuing our series about cheatsheets, in this post I provide some more very useful cheatsheets for data science.
If you want to acquire data science skills, also make sure to check out some of my courses.
Python data science cheatsheets
Python for data science: Python basics
This great cheatsheet from DataCamp is going to be extremely useful for anyone learning Python for data science. All the basic commands, from list manipulation to NumPy arrays, are there.
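To give a flavour of the kind of commands such a cheatsheet covers, here is a minimal sketch of list manipulation and NumPy array basics. The variable names and values are made up for illustration and are not taken from the cheatsheet itself.

```python
import numpy as np

# Plain Python list manipulation
heights = [1.73, 1.68, 1.71]
heights.append(1.89)                            # add an element at the end
first_two = heights[:2]                         # slicing returns a new list
heights_sorted = sorted(heights, reverse=True)  # sorted copy, largest first

# NumPy arrays support element-wise arithmetic, which lists do not
heights_arr = np.array(heights)
weights_arr = np.array([65.4, 59.2, 63.6, 88.4])
bmi = weights_arr / heights_arr ** 2

# Boolean indexing keeps only the elements matching a condition
tall = heights_arr[heights_arr > 1.70]

print(first_two)
print(heights_sorted)
print(bmi.round(1))
print(tall)
```

The main thing to notice is that the division and the power are applied to every element of the arrays at once, with no explicit loop.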
Keras is a great and easy-to-use deep learning library for Python. It is easier to get started with deep neural networks in Keras than it is with TensorFlow directly. This cheatsheet contains some quick recipes to create the most basic neural network types.
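As a rough idea of what such a recipe looks like, here is a minimal sketch of a small fully connected network for binary classification. It assumes Keras is available through the tensorflow package; the input size and layer widths are arbitrary choices for illustration, not values from the cheatsheet.

```python
from tensorflow import keras

# A small feed-forward network: 4 input features (an arbitrary,
# illustrative choice), one hidden layer, one sigmoid output.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Compile with an optimizer, a loss and the metric to report
model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

model.summary()
```

From here, training would be a call to model.fit on your own feature matrix and labels.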
Data visualisation in Python
A great data scientist should also be a great communicator, and quite often there is no better tool for that than a visualisation. This cheatsheet covers some of the basics of visualisation in Python using matplotlib and seaborn.
The scikit-learn flowchart
I don’t think that any data science cheatsheet article is complete without a reference to the famous scikit-learn flowchart for choosing the right machine learning model. This amazing cheatsheet shows you how to choose the right machine learning model depending on your task and the number of rows and features.
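To make this concrete, here is a minimal sketch of trying two candidate classifiers and comparing them with cross-validation. The choice of LinearSVC followed by KNeighborsClassifier is my reading of the classification branch of the flowchart for smaller datasets; the iris dataset, the scaling step and the parameter values are assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# A small labelled dataset, so we are in the classification branch
X, y = load_iris(return_X_y=True)

# Candidate models, each with feature scaling in front
candidates = {
    "LinearSVC": make_pipeline(StandardScaler(), LinearSVC(max_iter=10000)),
    "KNeighborsClassifier": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
}

# 5-fold cross-validated accuracy for each candidate
scores = {}
for name, model in candidates.items():
    scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean accuracy {scores[name]:.3f}")
```

The flowchart only suggests where to start; comparing a few candidates on your own data like this is what settles the choice.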
Text cleaning in Python
Every good data scientist should know how to do natural language processing. This cheatsheet presents some very good tips and tricks for cleaning up text.
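As an illustration of what text cleaning usually involves, here is a minimal sketch using only the Python standard library. The function name clean_text and the particular steps are my own choices for the example and are not necessarily the ones shown in the cheatsheet.

```python
import re
import string


def clean_text(text: str) -> str:
    """Lowercase text and strip HTML tags, URLs, punctuation and extra spaces."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)       # drop HTML tags
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    # Remove ASCII punctuation characters
    text = text.translate(str.maketrans("", "", string.punctuation))
    text = re.sub(r"\s+", " ", text)           # collapse runs of whitespace
    return text.strip()


print(clean_text("<p>Check  out https://example.com, it's GREAT!!</p>"))
# prints: check out its great
```

The order of the steps matters: URLs are removed before punctuation, otherwise the stripped-down address would be left behind as an ordinary word.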
R data science cheatsheets
The R reference card
This is the go-to cheatsheet for all basic R commands. It provides good coverage of the native R commands, from plotting to installing packages to manipulating vectors. It is good for beginners, but even some experienced R users might find it useful.
Data transformation with dplyr
Visualisation with ggplot2
ggplot2 is the best way to produce visually pleasing plots in R. While the traditional plotting capabilities of R are good, the plots produced do not look that great, and they are not very flexible. ggplot2 improves upon all that, but it can be a bit daunting for the uninitiated. This cheatsheet provides a great overview of ggplot2 commands and syntax.
The caret package
The caret package provides an easy way to do machine learning in R. It provides a wrapper over many other machine learning R libraries and has utility functions for running cross-validation and cleaning up data. This cheatsheet is a good way to get started using caret.
R reference card for data mining
This very useful cheatsheet contains a high-level overview of functions and associated packages in R for data mining. From data manipulation to big data and parallel computing, it covers a variety of use cases.