How did I get to machine learning? A couple of months ago, I started listening to the Machine Learning Guide podcast. I found it by chance and highly recommend it as an introduction to machine learning. Tyler Renelle does an amazing job of getting you excited about the topic. I almost feel like I am following him on the same path to learn about machine learning now. Even though I didn't actively plan to learn ML, it was interesting to hear about all those foreign concepts. There it was again: this excitement when everything is unexplored. I felt like a whole new world opened up in front of me. It was the same feeling I had when I finally got my foot in the door of web development.
As I read a couple of machine learning articles, the Machine Learning course by Andrew Ng was by far the most recommended way to get started. I have never taken an online course from start to end before, even though I actively give online courses myself, but I decided to give it a shot this time. Fortunately, the course had started just one week earlier. So I enrolled in it and have finished it by now. It's a blast, and I recommend it to everyone who wants to get into ML, even though enrolling in a 12-week course is a big commitment in the first place. But more about it later.
- Math / Data Analysis: Matlab, Octave, Julia, R
- Data Mining: Scala, Java (e.g. Hadoop, Spark, Deeplearning4j)
- Performance: C/C++ (e.g. GPU accelerated)
Next, you can see why Python makes so much sense in machine learning. It has a suitable set of libraries for the different tasks assigned to the programming languages above, and it offers even more well-fitting solutions:
- Math: NumPy
- Data Analysis: Pandas
- Data Mining: PySpark
- Server: Flask, Django
- TensorFlow (because it is written with a Python API over a C/C++ engine)
- Keras (sits on top of TensorFlow)
- Math: math.js
- Data Analysis: d3.js
- Server: node.js (express, koa, hapi)
- TensorFlow.js (e.g. GPU accelerated via WebGL API in the browser)
TensorFlow.js (previously Deeplearn.js): The library by Google is GPU accelerated via the WebGL API. It is used in the browser for predictions with pre-trained models in inference mode, but also for the training mode itself. It mirrors the API of the popular TensorFlow library.
TensorFire and Keras.js: Another pair of GPU accelerated libraries which are used for pre-trained models in inference mode. They allow you to write your models in Keras or TensorFlow with Python. Afterward, you can deploy them to the web by using TensorFire or Keras.js.
2017 alone brought us these exciting and promising libraries. So I am curious what 2018 will offer us.
I made my own motivation clear at the beginning of this article. However, that's not all there is to the story. There are plenty of reasons and opportunities to dive into machine learning as a web developer.
Last but not least, there is great effort involved on the side of ML open source contributors (e.g. TensorFlow.js, TensorFire, Keras.js, Brain.js) to enable machine learning in the browser. However, most often the documentation is aimed at machine learners entering the browser domain and not the other way around, as I described it in this article. Thus these solutions assume a lot of fundamental machine learning knowledge which isn't taught along the way. In turn, this makes it difficult for web developers to enter the machine learning domain. So there is a great opportunity to pave the way for web developers into machine learning by making those fundamental topics and ported libraries accessible in an educational way. That's the point where I try to tie in my experience in teaching these things. In the future, I want to give you guidance if you are keen to enter the field of machine learning as a web developer. Read more about this in the final paragraphs of this article.
If you are familiar with machine learning, feel free to skip this section. Entering the field of machine learning as a beginner can be a buzzword-heavy experience. Where should you start? There is so much terminology to clarify in the beginning. Is it AI or machine learning? What's all the hype about deep learning? And how does data science fit into this area?
Let's start our journey with AI (artificial intelligence). "It is the intelligence of a machine that could successfully perform any intellectual task that a human being can." There is a great analogy in the Machine Learning Guide podcast to convey the idea of AI: Whereas the goal of the industrial revolution was the simulation of the physical body through machines, it is the goal of AI to simulate the brain for mental tasks through algorithms. So how does machine learning relate to AI? Let's have a look at a couple of subfields of AI:
- searching and planning (e.g. playing a game with possible actions)
- reasoning and knowledge representation (structuring knowledge to come to conclusions)
- perception (vision, touch, hearing)
- ability to move and manipulate objects (goes into robotics)
- natural language processing (NLP)
One more subfield belongs on this list: learning, which is what machine learning represents. As you can see, it is only a subfield of AI. However, it might be the only essential core fragment of AI, because it reaches into the other subfields of AI too, and it has done so even more in recent years. For instance, vision as a subfield is becoming more a part of applied machine learning. Where other techniques, e.g. domain specific algorithms, dominated the domain in the past, machine learning is entering the field now, and deep neural networks are often used for it. So what are applicable domains of AI and therefore most often machine learning? A bunch of domains and examples:
- Image Recognition (see the reference linked above)
- Web (e.g. Search Engines, Spam Filters)
- Art (e.g. Painting)
- Autonomous Vehicles (e.g. Tesla Autopilot; awareness is also growing about robots in warfare)
- Medical Diagnosis
- Playing Games (e.g. Go, StarCraft)
So machine learning is a subfield of AI. Let's dive into the subject itself. There are a couple of great definitions for machine learning, yet when I started out with the subject, I found the one by Arthur Samuel (1959) most memorable: "The field of study that gives computers the ability to learn without being explicitly programmed." How does it work? Basically, machine learning can be grouped into three categories: supervised learning, unsupervised learning and reinforcement learning. It's quite an evolution from the first to the last. Whereas the first is more concrete, the last becomes more abstract (yet exciting and unexplored). Supervised learning gives the best entry point to machine learning and is therefore used in several educational machine learning courses to get you into the field. In supervised learning, an algorithm is trained to recognize a pattern in a given data set. The data set is split up into input (x) and output (y). The algorithm is trained to map input to output by learning the underlying pattern from the given data set (training phase). Afterward, when the algorithm is trained, it can be used to make predictions for future input data points to come up with output data points (inference phase). During the training phase, a cost function estimates the performance of the current algorithm, and the parameters of the algorithm are adjusted based on those outcomes (penalization). The algorithm itself can be simplified into a function that maps an input x to an output y. It's called the hypothesis or model.
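To make the terms hypothesis and cost function a bit more tangible, here is a minimal sketch in TypeScript. It assumes the simplest possible hypothesis, a straight line with two parameters, and a mean squared error cost, which is one common choice among several. The names `Params`, `hypothesis` and `cost` as well as the sample data are made up for illustration.

```typescript
// Hypothetical univariate model: h(x) = theta0 + theta1 * x
type Params = { theta0: number; theta1: number };

// The hypothesis maps an input x to a predicted output y.
function hypothesis(params: Params, x: number): number {
  return params.theta0 + params.theta1 * x;
}

// Mean squared error cost (halved, a common convention):
// the larger the value, the worse the parameters fit the data.
function cost(params: Params, xs: number[], ys: number[]): number {
  const m = xs.length;
  let sum = 0;
  for (let i = 0; i < m; i++) {
    const error = hypothesis(params, xs[i]) - ys[i];
    sum += error * error;
  }
  return sum / (2 * m);
}

// Made-up data that follows y = 2 * x exactly.
const sampleX = [1, 2, 3];
const sampleY = [2, 4, 6];

const goodFit = cost({ theta0: 0, theta1: 2 }, sampleX, sampleY); // 0
const poorFit = cost({ theta0: 0, theta1: 1 }, sampleX, sampleY); // 14 / 6, about 2.33
```

Training then boils down to searching for the parameters with the lowest cost.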
Predicting housing prices in Portland is one popular machine learning problem for supervised learning. Given a data set of houses where each house has a size in square meters (x), the price (y) of the house should be predicted. Thus the data set consists of a list of sizes and prices for houses. It is called a training set. Each row in the training set represents a house. The input x, in this case the size of the house, is called a feature of the house. Since there is only one feature for the houses in the training set, it is called a univariate training set. If there are more features for a house, such as number of bedrooms and size, it becomes a multivariate training set. Increasing the size of the training set (m) and the number of features (n) can lead to an improved prediction of y, where y is called a label, target or simply the output. In a nutshell: A model is trained with a penalizing cost function to predict labels from data points and their features.
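As a rough sketch of how such a univariate training set could be used, the following TypeScript snippet fits a straight line to a handful of houses. The numbers are invented (they are not real Portland data), and gradient descent is my assumption for how the parameters get adjusted; it is one common way to do it. The names `House`, `trainingSet` and `train` are hypothetical.

```typescript
// Made-up training set: size in hundreds of square meters (x),
// price in hundreds of thousands (y). Each row is one house.
interface House {
  size: number;
  price: number;
}

const trainingSet: House[] = [
  { size: 0.5, price: 2 },
  { size: 1.0, price: 3 },
  { size: 1.5, price: 4 },
  { size: 2.0, price: 5 },
];

// Batch gradient descent for the model price = theta0 + theta1 * size.
function train(data: House[], alpha: number, iterations: number): [number, number] {
  const m = data.length; // number of training examples
  let theta0 = 0;
  let theta1 = 0;
  for (let iter = 0; iter < iterations; iter++) {
    let grad0 = 0;
    let grad1 = 0;
    for (let i = 0; i < m; i++) {
      const error = theta0 + theta1 * data[i].size - data[i].price;
      grad0 += error;
      grad1 += error * data[i].size;
    }
    // Move both parameters a small step against the gradient of the cost.
    theta0 -= (alpha / m) * grad0;
    theta1 -= (alpha / m) * grad1;
  }
  return [theta0, theta1];
}

const [learnedTheta0, learnedTheta1] = train(trainingSet, 0.1, 5000);
// The made-up data follows price = 1 + 2 * size, so the result lands close to 1 and 2.
const predictedPrice = learnedTheta0 + learnedTheta1 * 1.2; // close to 3.4
```

With real data you would usually scale the feature first, because raw sizes in square meters make the learning rate much harder to choose.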
Tom Mitchell (1998): "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."
The previous use case of predicting housing prices in Portland is called a regression problem. A linear regression, as explained before, can be used to train the hypothesis to output continuous values (e.g. housing prices). Another problem in the area of supervised learning is called a classification problem, where a logistic regression is used to output categorical values. For instance, imagine you have a training set of T-Shirts. The features, such as width and height, can be used to make predictions for the categorical sizes S, M and L.
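Here is a small, hedged sketch of the T-Shirt idea in TypeScript. To keep it short, I assume only one feature (the width) and only two sizes, M and L, because plain logistic regression is a binary classifier; more sizes would need an extension such as one-vs-all. The data, the scaling constant and all function names are made up for illustration.

```typescript
// Made-up T-Shirts: chest width in cm, label 0 = "M", 1 = "L".
const widths = [46, 48, 50, 54, 56, 58];
const sizeLabels = [0, 0, 0, 1, 1, 1];

// The sigmoid squashes any number into a probability between 0 and 1.
const sigmoid = (z: number): number => 1 / (1 + Math.exp(-z));

// Center the feature on its mean so gradient descent behaves well.
// The divisor 4 is a hand-picked scaling constant (an assumption).
const meanWidth = widths.reduce((a, b) => a + b, 0) / widths.length;
const scale = (width: number): number => (width - meanWidth) / 4;

// Gradient descent on the logistic regression cost.
function trainLogistic(alpha: number, iterations: number): [number, number] {
  const m = widths.length;
  let theta0 = 0;
  let theta1 = 0;
  for (let iter = 0; iter < iterations; iter++) {
    let grad0 = 0;
    let grad1 = 0;
    for (let i = 0; i < m; i++) {
      const x = scale(widths[i]);
      const error = sigmoid(theta0 + theta1 * x) - sizeLabels[i];
      grad0 += error;
      grad1 += error * x;
    }
    theta0 -= (alpha / m) * grad0;
    theta1 -= (alpha / m) * grad1;
  }
  return [theta0, theta1];
}

const [bias, weight] = trainLogistic(0.5, 1000);

// Probability that a shirt with the given width is an "L".
function probabilityOfL(width: number): number {
  return sigmoid(bias + weight * scale(width));
}

// Turn the probability into a categorical value.
function classify(width: number): string {
  return probabilityOfL(width) >= 0.5 ? "L" : "M";
}
```

Calling `classify(47)` yields "M" and `classify(57)` yields "L" for this data set. The only structural difference to the regression case is the sigmoid wrapped around the linear function.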
The previous paragraphs were a first glimpse of supervised learning in machine learning. How does unsupervised learning work? Basically, there is a given training set with features but no labels y. The algorithm is trained without any given output data in the training set. In a clustering problem, the algorithm has to figure out on its own how to group the data points into clusters.
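To illustrate, the following TypeScript sketch groups unlabeled numbers into clusters with k-means, which is one popular clustering algorithm (my choice for this example, there are others). It is simplified to one-dimensional data, and it naively uses the first k points as starting centroids. The function name `kMeans` and the data are made up.

```typescript
// Simplified k-means for one-dimensional data points without labels.
function kMeans(
  points: number[],
  k: number,
  iterations: number
): { centroids: number[]; labels: number[] } {
  // Assumption: the first k points serve as initial centroids.
  let centroids = points.slice(0, k);
  let labels: number[] = points.map(() => 0);

  for (let iter = 0; iter < iterations; iter++) {
    // Assignment step: every point joins its nearest centroid.
    labels = points.map((p) => {
      let best = 0;
      for (let c = 1; c < k; c++) {
        if (Math.abs(p - centroids[c]) < Math.abs(p - centroids[best])) {
          best = c;
        }
      }
      return best;
    });

    // Update step: every centroid moves to the mean of its members.
    centroids = centroids.map((old, c) => {
      const members = points.filter((_, i) => labels[i] === c);
      if (members.length === 0) return old; // keep an empty cluster where it is
      return members.reduce((a, b) => a + b, 0) / members.length;
    });
  }

  return { centroids, labels };
}

const clustering = kMeans([1, 2, 3, 10, 11, 12], 2, 10);
// clustering.centroids → [2, 11]
// clustering.labels    → [0, 0, 0, 1, 1, 1]
```

Nobody told the algorithm which points belong together. It only ever saw the features.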
And last but not least, what about reinforcement learning? In reinforcement learning the algorithm is trained without any given data. It learns from experience by repeating a learning process. For instance, take this Flappy Bird which learns to win the game by using neural networks in reinforcement learning. The algorithm learns by trial and error. The underlying mechanism is a combination of rewards and penalizations to train the bird to fly, similar to how a real bird would learn to fly.
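The Flappy Bird example uses neural networks, which is too much for a short snippet. As a much simpler stand-in, here is a sketch of tabular Q-learning in TypeScript, a classic reinforcement learning technique. The environment is hypothetical: a corridor of five cells in which the agent starts on the far left and only gets a reward for reaching the far right. For brevity it uses a reward only and no penalty. All names and constants are assumptions for illustration.

```typescript
// Hypothetical environment: cells 0..4, the agent starts in cell 0
// and receives a reward of 1 only when it reaches cell 4.
const GOAL = 4;
const MOVES = [-1, 1]; // action 0 = left, action 1 = right

// Small seeded random number generator so that runs are repeatable.
let rngState = 42;
function random(): number {
  rngState = (rngState * 1664525 + 1013904223) % 4294967296;
  return rngState / 4294967296;
}

// Learn a table q[state][action] of expected future rewards by trial and error.
function trainAgent(episodes: number): number[][] {
  const alpha = 0.5; // learning rate
  const gamma = 0.9; // discount for future rewards
  const epsilon = 0.2; // share of random (exploring) moves
  const q: number[][] = [];
  for (let s = 0; s <= GOAL; s++) q.push([0, 0]);

  for (let e = 0; e < episodes; e++) {
    let state = 0;
    for (let step = 0; step < 1000 && state !== GOAL; step++) {
      const explore = random() < epsilon;
      const action = explore
        ? (random() < 0.5 ? 0 : 1)
        : (q[state][1] >= q[state][0] ? 1 : 0);
      const next = Math.min(GOAL, Math.max(0, state + MOVES[action]));
      const reward = next === GOAL ? 1 : 0;
      // Nudge the estimate toward the reward plus the best follow-up value.
      const target = reward + gamma * Math.max(q[next][0], q[next][1]);
      q[state][action] += alpha * (target - q[state][action]);
      state = next;
    }
  }
  return q;
}

// The greedy policy picks the action with the higher learned value.
function greedyPolicy(q: number[][]): string[] {
  return q.slice(0, GOAL).map((values) => (values[1] > values[0] ? "right" : "left"));
}

const learnedPolicy = greedyPolicy(trainAgent(500));
```

After training, `learnedPolicy` should say "right" for every cell, even though the agent was never given a data set, only rewards.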
Last but not least, there might be another question popping up in your head: What's the relationship of data science to machine learning? Data science is often associated with machine learning. So one could argue that machine learning bleeds into both domains: data science and artificial intelligence. However, data science has its own subfields such as data mining and data analysis. It is often coupled with machine learning, because data mining enables an algorithm to learn from mined data and data analysis enables researchers to study the outcomes of algorithms.
There are a bunch of resources that I want to recommend for web developers entering the field of machine learning. As for myself, I wanted to stimulate my senses for at least 12 weeks. That's how long it is said to take to complete Andrew Ng's machine learning course. Keep in mind that it's my personal roadmap and it might not be suited for everyone. But following a strict routine and having enough learning material along the way helped me a lot. So it might help other web developers too.
If you just want to get a feeling for the topic, start by listening to the Machine Learning Guide up to episode 11. Tyler Renelle has done an amazing job giving an introduction to the topic. Since it is a podcast, just give it a shot while you exercise at the gym. That's how I entered the field of ML.
If you start to get excited, the next step would be to enroll in the Machine Learning course by Andrew Ng, which takes 12 weeks to complete. It takes you on a long journey from shallow machine learning algorithms to neural networks, from regression problems to clustering problems and from theoretical knowledge in the field to applied implementations in Octave or Matlab. It is intense and challenging, but you can do it by dedicating a couple of hours each week to the course and the exercises.
The machine learning course goes from linear regression to neural networks in 5 weeks. At the end of week 5, I was left with an overwhelming feeling. It was a combination of "Can week 6 become even more complex?" and "Wow, this course taught me all the building blocks to implement a neural network from scratch". Andrew gives a perfect walkthrough of all these concepts, which build on one another. After all, machine learning has a lot in common with the composition of functions from functional programming. But you will learn about this yourself. I can only say that it was an overwhelming feeling to see my own implementation of a neural network performing in the browser for the first time.
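To hint at what I mean by function composition, here is a minimal sketch in TypeScript. It is not the implementation from the course. It only shows the forward pass of a tiny network in which each layer is a function and the network is the composition of its layers. The weights are hand-picked (not trained) so that the network computes XOR, and the names `dense` and `compose` are made up for this example.

```typescript
type Vector = number[];
type Layer = (input: Vector) => Vector;

const sigmoidActivation = (z: number): number => 1 / (1 + Math.exp(-z));

// A dense layer is just a function built from weights and biases.
function dense(weights: number[][], biases: number[]): Layer {
  return (input) =>
    weights.map((row, j) =>
      sigmoidActivation(row.reduce((sum, w, i) => sum + w * input[i], biases[j]))
    );
}

// Composing the layers from left to right yields the whole network.
function compose(...layers: Layer[]): Layer {
  return (input) => layers.reduce((activation, layer) => layer(activation), input);
}

// Hand-picked weights: the hidden units act like OR and NAND,
// and the output unit combines them like AND, which gives XOR.
const xorNetwork = compose(
  dense([[20, 20], [-20, -20]], [-10, 30]),
  dense([[20, 20]], [-30])
);

const xorOutputs = [[0, 0], [0, 1], [1, 0], [1, 1]].map((x) =>
  Math.round(xorNetwork(x)[0])
);
// → [0, 1, 1, 0]
```

In a real network the weights would be learned during training instead of being written down by hand.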
After you have completed week 5 of the machine learning course, you should have a good feeling for what machine learning is and how to solve problems with it. Afterward, the course continues with shallow algorithms for supervised learning and unsupervised learning. It gives elaborate guidance on how to improve your implemented machine learning algorithms and how to scale them for large data sets. When you have completed week 5, you should also continue with the Machine Learning Guide podcast to learn more about shallow algorithms and neural networks. I listened to it until episode 17, because afterward it goes heavily into natural language processing.
In addition, over the course of those weeks, I read The Master Algorithm by Pedro Domingos to get an overview of the topic, its different perspectives and stakeholders, and its history. After that, I started to read the open source ebook Deep Learning (by Ian Goodfellow, Yoshua Bengio and Aaron Courville). That was after week 5 of the course, and it fit perfectly with all the foundational knowledge I had learned so far. Even though I have found it quite a challenging book so far, I can recommend both books to give you even more guidance along the way. Once I finish the second book, I want to read the free ebooks Neural Networks and Deep Learning by Michael Nielsen and Deep Learning by Adam Gibson and Josh Patterson. Do you have any other book or podcast recommendations? You can leave a comment below!
What else is out there to learn machine learning? Now that I have completed the course by Andrew Ng, I will take some rest to internalize all those learnings. I will likely write more about them on my blog. You can subscribe to the Newsletter if you are interested in hearing about them. However, there are a bunch of other courses out there which I want to check out.
- Machine Learning Engineer Nanodegree on Udacity
- Deep Learning Specialization on Coursera
- Practical Deep Learning for Coders on Fast.ai
These are all courses recommended along with the Machine Learning course by Andrew Ng. Fast.ai has a course on computational linear algebra for the underlying math in ML too. In general, machine learning involves lots of math. If you need a refresher on certain topics, I can highly recommend Khan Academy.
- Pavlov.js (Markov Decision Processes)
- SVM.js (Support Vector Machines)
- Brain.js (Neural Networks)
- Synaptic (Neural Networks)
- Neataptic (Neural Networks, Neuroevolution)
- WebDNN (Neural Networks, Inference Mode)
- Natural (Natural Language Processing)
- Sentiment (Sentiment Analysis)
- OpenCV.js (Computer Vision with OpenCV for the Browser)
- opencv4nodejs (Computer Vision with OpenCV for Node.js)
- face-recognition.js (Face Recognition)
- face-api.js (Face Recognition based on TensorFlow.js)
Many of those libraries are only for machine learning in Node.js. Thus they are not using the computationally efficient WebGL API in the browser.
If you have any other recommendations, please leave a comment below. If you know whether certain libraries are active or not maintained anymore, please reach out as well. I would love to keep this article updated for the future.
If you have made it this far in this article, thank you so much for reading it!