The number of machine learning algorithms and tools is overwhelming and can be intimidating for beginners who want to master machine learning. Machine learning is a long-distance race, and how you start matters: pace yourself so that you don’t exhaust yourself after a few miles and quit. In this post we are going to learn how to write our first machine learning program at a very basic level. The secret to mastering and enjoying machine learning is staying curious about the cool things you have seen in science fiction movies (for those who watch them).

There is no free lunch in mastering machine learning: the only way is to practice, practice, and practice. Practice in this context means reading research papers, learning machine learning concepts, and doing hands-on work. Pick a tool, learn as many algorithms as possible, write code to test them, learn the principles and theory behind each one, study the problem you are trying to solve, and see how different algorithms give different results. These approaches, and many more, will help you apply machine learning to any problem. Machine learning is an iterative process. You will often go back and forth, try different algorithms and techniques, and deepen your understanding of the problem domain before you can provide an end-to-end solution. In this post we are going to write our first hello world machine learning program.

Hello World For Machine Learning Program

There are many things we need to learn in order to write our first machine learning application, but in this post we are not going to cover everything. We will acquire the relevant techniques as we move forward. The best way to master machine learning is to write code and try out ideas in code. There are also many theoretical concepts underpinning machine learning, and they are very important to know. One of the skills you will need to cultivate is the ability to read research papers and implement their findings in code. Let’s start by writing a simple, bare-minimum program that makes a simple prediction. But before we do, let’s see what we need in order to make it happen.

Installing Anaconda Platform

Anaconda is an open source, cross-platform Python distribution for data science. It comes with many libraries for machine learning and data analysis. To install Anaconda, go to the Anaconda website and download the version that is compatible with your operating system. You can visit my previous post on installing Anaconda to learn more. After installing Anaconda, we need to open our favorite IDE to start writing our first machine learning program. Anaconda comes with different IDEs for writing code. The two most commonly used are Jupyter and Spyder. In this post and many others we will be using Jupyter, which lets us create notebooks that we can publish and turn into presentations. However, you can use any other IDE you feel comfortable with.

Machine Learning Libraries

In this post we are going to use scikit-learn. There are many machine learning libraries, such as scikit-learn, PyTorch, TensorFlow, DeepLearning4j, and Theano, and we will be reviewing the others in coming posts. Scikit-learn is one of the most popular and widely used machine learning libraries. It comes preinstalled with Anaconda, so you don’t need to install it separately if you are using Anaconda. Knowledge of Python, Pandas, NumPy and Matplotlib will be an added advantage throughout this series. You can visit my previous posts to learn more about them. However, if you don’t have solid knowledge of Python, Pandas and NumPy, don’t worry: you’ll pick up these skills along the way.
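If you want to confirm that these libraries are available before going further, a quick check like the one below can help. This is only a sketch: the helper name installed_versions is made up for illustration, and the check reports a missing library instead of failing.

```python
import importlib


def installed_versions(names):
    """Return a dict mapping each module name to its version, or None if missing."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            # Most scientific Python packages expose __version__.
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            versions[name] = None
    return versions


# "sklearn" is the import name of scikit-learn.
versions = installed_versions(["sklearn", "numpy", "pandas", "matplotlib"])
for name, version in versions.items():
    print(name, version if version else "not installed")
```

If any line prints "not installed", install that library with your package manager before continuing.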

Data For Machine Learning

Machine learning depends on data. A model learns from examples, so it generally works better when there is a lot of data to learn from. There are many sources of data for machine learning, both structured and unstructured. Open data sets for machine learning projects can be found at data.gov and the UCI Machine Learning Repository, among many other sites, or you can scrape web data from social media or other sites. Scikit-learn also comes with several data sets, such as iris, diabetes and wine, for practice. In this post we are going to create a simple data set and use it to train our machine learning model.
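Although this post does not use them, the bundled data sets mentioned above are easy to peek at. Here is a minimal sketch that loads iris, diabetes and wine with scikit-learn’s loader functions and prints the shape of each one’s features and targets.

```python
from sklearn.datasets import load_diabetes, load_iris, load_wine

# Each loader returns an object with .data (features) and .target (labels/values).
shapes = {}
for loader in (load_iris, load_diabetes, load_wine):
    dataset = loader()
    shapes[loader.__name__] = (dataset.data.shape, dataset.target.shape)
    print(loader.__name__, dataset.data.shape, dataset.target.shape)
```

The first number in each shape is the number of samples and the second is the number of features per sample.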

Our Program

We are going to create a simple machine learning program that predicts whether a drink is milk or wine given its calories, sodium content, fat, and cholesterol content. We will train our model using the decision tree algorithm, which will get a more detailed post of its own later. In supervised machine learning we have labeled data, i.e. each set of features is paired with a target. The features are the attributes that serve as the input, while the target is the label, which is the output. Our training data pairs these four features with a milk or wine label for each drink.

We first represent the features and labels as arrays, then encode the labels as numbers because the algorithm works only with numeric values.
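The step described above can be sketched as follows. The numbers here are made-up placeholder values, not the data set from the original post, and using scikit-learn’s LabelEncoder is one possible way to do the encoding rather than necessarily the one used originally.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical training data (illustrative values only).
# Columns: calories, sodium, fat, cholesterol.
features = [
    [150, 105, 8, 24],  # milk
    [120, 125, 5, 20],  # milk
    [100, 130, 2, 10],  # milk
    [125, 6, 0, 0],     # wine
    [120, 5, 0, 0],     # wine
    [130, 7, 0, 0],     # wine
]
drink_names = ["milk", "milk", "milk", "wine", "wine", "wine"]

# LabelEncoder assigns integers to the class names in sorted order,
# so "milk" becomes 0 and "wine" becomes 1.
encoder = LabelEncoder()
labels = encoder.fit_transform(drink_names)

print(list(encoder.classes_))  # ['milk', 'wine']
print(list(labels))            # [0, 0, 0, 1, 1, 1]
```

Keeping the encoder around is handy because encoder.inverse_transform can turn a numeric prediction back into a readable name.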

In our simple machine learning program we used the decision tree algorithm to train a model on our data, then used the trained model to predict the label of a new input. The prediction is 0, meaning the drink is milk. This kind of machine learning is called classification, and it is a type of supervised machine learning. We first import the decision tree algorithm from scikit-learn using the line from sklearn.tree import DecisionTreeClassifier. Then we create the classifier and train it using the fit() function: classifier.fit(features, labels). Finally, we predict the label of the new input using the predict() function: classifier.predict([[140,130,4,37]]). Pretty easy! These are just the basics of a machine learning program, and we have left out tons of useful concepts that we are going to cover in future posts.
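Putting the pieces together, a complete version of the program might look like the sketch below. The training values are hypothetical placeholders rather than the original data set, and random_state=0 is added only so that repeated runs build the same tree. The import, fit() and predict() calls are the ones described above.

```python
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical training data (illustrative values only).
# Columns: calories, sodium, fat, cholesterol.
features = [
    [150, 105, 8, 24],  # milk
    [120, 125, 5, 20],  # milk
    [100, 130, 2, 10],  # milk
    [125, 6, 0, 0],     # wine
    [120, 5, 0, 0],     # wine
    [130, 7, 0, 0],     # wine
]
drink_names = ["milk", "milk", "milk", "wine", "wine", "wine"]

# Encode the text labels as numbers: milk -> 0, wine -> 1.
encoder = LabelEncoder()
labels = encoder.fit_transform(drink_names)

# Create the decision tree classifier and train it on the labeled data.
classifier = DecisionTreeClassifier(random_state=0)
classifier.fit(features, labels)

# Predict the label of a new, unseen drink.
prediction = classifier.predict([[140, 130, 4, 37]])
predicted_name = encoder.inverse_transform(prediction)[0]
print(prediction[0], predicted_name)  # 0 milk
```

With this placeholder data the new drink has sodium, fat and cholesterol values far above those of the wine examples, so the tree classifies it as milk.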

Conclusion

In this post we have taken the most important step in mastering machine learning: writing our first hello world machine learning program. This is perhaps the hardest step, because the large amount of literature and code snippets out there can be overwhelming and intimidating. There are many important machine learning concepts that we have not covered in this post, but we will cover most of them in coming posts.

What’s Next

We have seen how to create a simple machine learning program. In the next post we are going to learn how to load data for machine learning.
