Here you will get an introduction to machine learning.
Hello there. Many of you must be aware of this term, but some might be wondering what it actually means. Is it just another piece of technical jargon? Let’s make this simple: Machine Learning is made up of two words, Machine and Learning, and literally means “making machines learn”. How is that possible? We will get to that later in this very post. Stay tuned.
If you are eager to know some interesting points about Machine Learning (ML), we’ve got you covered. Let’s dive deeper.
ML is a vast field that is very often associated with AI (Artificial Intelligence), and some people use the two terms interchangeably. According to data scientists, however, the two are distinct: ML is a subset of AI.
Real Life Machine Learning (ML) Examples
We all use Gmail on an almost daily basis, but have you ever wondered why there is a section named ‘SPAM’ and how mails end up in it? This is where ML comes into action: Gmail uses ML to teach its product to differentiate between legitimate and spam mails. Sounds interesting? Let’s see some more examples.
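Before moving on, the spam-folder idea can be sketched in a few lines of Python. This is only an illustration, not how Gmail actually works: it uses a simple naive Bayes word-count approach, and the training messages, the labels and the function names `train` and `classify` are all made up for this example.

```python
import math
from collections import Counter

# Tiny made-up training set: (message text, label). Purely illustrative.
TRAINING_DATA = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("limited offer win cash now", "spam"),
    ("meeting moved to monday morning", "legit"),
    ("please review the attached report", "legit"),
    ("lunch on monday with the team", "legit"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "legit": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes log-score."""
    vocabulary = set(word_counts["spam"]) | set(word_counts["legit"])
    total_messages = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # Start from the prior: how common this label was in training.
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


word_counts, label_counts = train(TRAINING_DATA)
print(classify("claim your free prize", word_counts, label_counts))
print(classify("report for the monday meeting", word_counts, label_counts))
```

The program never receives a rule such as "the word free means spam"; it only counts which words showed up in which kind of mail and lets those counts decide. A real filter would learn from far more mails and far more signals than single words.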
Have you ever noticed that after you browse a shopping site, you start seeing very similar advertisements across the web? Suppose you browsed a clothing site; from that moment on, you will start noticing ads very similar to the products you searched for. Big companies accomplish this with the application of ML.
ML is at work almost everywhere, from Facebook to astronomy to predicting your credit score. Although ML practices have not fully matured yet, it is definitely among the hottest topics of the decade. Given the current scenario, a career in this domain looks like a wise choice.
As we can infer from the image above, the machine is made to learn from ‘experience’, i.e. we feed the machine a bulk of data related to the function or work that we expect it to do. The machine first tries to recognize patterns in the input data and learns them. Later, when the machine comes across a similar pattern, it delivers the intended result.
Let’s understand this with an example. Suppose we want our machine to tell us the breed of a dog when we take a picture of it with a camera. First we need to train the machine with an abundance of dog-related data, i.e. what each breed looks like, what it eats, its height, its friendliness with humans, etc. The machine tries to form patterns from this data and trains itself on these previous experiences. The next time your machine comes across a dog, it will be able to tell you the breed (though not with 100 percent accuracy).
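To make "learning from experience" a little more concrete, here is a minimal sketch that guesses a breed by finding the most similar dog it has already seen (a nearest-neighbour lookup). It is a deliberate simplification: recognizing breeds from real photos works on image pixels and needs much heavier models. The measurements, the breeds and the function name `predict_breed` are all hypothetical.

```python
import math

# Hypothetical "experience": (height in cm, weight in kg) -> breed.
TRAINING_DOGS = [
    ((23.0, 2.5), "Chihuahua"),
    ((20.0, 2.0), "Chihuahua"),
    ((57.0, 30.0), "Labrador"),
    ((60.0, 34.0), "Labrador"),
    ((76.0, 60.0), "Great Dane"),
    ((80.0, 65.0), "Great Dane"),
]


def predict_breed(features, training_data):
    """Return the breed of the training example closest to `features`."""
    closest_features, closest_breed = min(
        training_data,
        # Euclidean distance between the new dog and each known dog.
        key=lambda example: math.dist(example[0], features),
    )
    return closest_breed


# A dog the machine has never seen before.
print(predict_breed((58.0, 31.0), TRAINING_DOGS))
```

The more (and the more varied) examples the machine has seen, the better its guesses tend to be, which is why the training data matters so much.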
Getting Started with Machine Learning (ML)
As we already know, ML is a subset of AI. Artificial Intelligence itself is not a new term. Research on AI is old: scientists have been trying to develop an artificial brain since the 1940s and 50s, which laid the foundation of AI. Coming back to ML, it is an advancement within the AI domain that makes products like humanoid robots and driverless cars possible.
Let’s have a look at the main topics in ML:
- Finding the Dataset
- Which language to opt for ML
- Development Environment (IDE) for ML
- Important Packages & Libraries
- Supervised Learning
- Unsupervised Learning
- ML Models
- Data Mining
- Natural Language Processing
Note: Apart from these, we have some bonus tips and suggestions for our readers, which will be provided during the learning process.
The topics mentioned above cover most of machine learning and are too vast to fit into one, two or three blog posts. So, we will be publishing posts at regular intervals to help our readers get a grasp of ML. We hope you enjoy learning with us. So let’s dive in together.
The very first step in the ML process is finding a relevant dataset for your machine, followed by data cleaning and pre-processing. Datasets contain an abundance of data, as you can see in the example above; they serve as experience for the machine, which tries to develop patterns from them.
You can find a dataset that fits your needs very easily, and most of the time for free. Here are some of the open repositories that we would like to suggest to our readers for finding datasets.
Here we’ve mentioned a few free online repositories where you can find datasets. You just need to visit the websites and download the required dataset in .csv format.
After the dataset has been downloaded successfully, the data cleaning and pre-processing steps come into play, which we will study in later posts.
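As a small preview, here is a rough sketch of what loading a .csv dataset and doing one very basic cleaning step might look like in Python. The column names, the sample rows and the function name `load_clean_rows` are invented for illustration, and dropping incomplete rows is just one possible strategy, not necessarily the right one for your data.

```python
import csv
import io

# Stand-in for a downloaded file. With a real download you would use
# something like open("your_dataset.csv", newline="") instead.
SAMPLE_CSV = """height_cm,weight_kg,breed
23,2.5,Chihuahua
57,,Labrador
 60 ,34,Labrador
76,60,Great Dane
"""


def load_clean_rows(csv_file):
    """Read rows from a CSV file object, skipping rows with missing values."""
    rows = []
    for row in csv.DictReader(csv_file):
        # Trim stray whitespace around every value.
        values = {key: value.strip() for key, value in row.items()}
        if any(value == "" for value in values.values()):
            continue  # Drop incomplete rows; one simple cleaning strategy.
        rows.append({
            "height_cm": float(values["height_cm"]),
            "weight_kg": float(values["weight_kg"]),
            "breed": values["breed"],
        })
    return rows


rows = load_clean_rows(io.StringIO(SAMPLE_CSV))
print(len(rows), "usable rows")
```

Here the row with the missing weight is discarded and the stray spaces around `60` are trimmed before the numbers are converted.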
Which Language for ML?
We can use any language, such as R, Python, Java or Scala. But in this course we will focus on two: R, a language built around statistical computing, and Python, a general-purpose object-oriented language. These two are also among the most popular languages with data scientists.
Let’s compare R and Python and find out what each is good for and what it is not, starting with R:
- In-Depth Statistical Analysis: R is a language designed for statisticians, so there is no denying that it is very well suited for statistical analysis. It adds value whether you are working with sensor data from an IoT device or building predictions in financial models. Another reason data scientists love R is the CRAN repository, which houses thousands of packages that enable more elaborate analysis and visualization tasks.
- High-Quality Imaging: R is well known for producing high-quality graphs and charts. Important packages that add to this capability are ggplot2, googleVis and rCharts. There is also the Shiny framework for R, which can be used to turn visuals into interactive web applications.
Why is R not Good?
- Learning Ease: It is said that if a programmer has a background in mathematics or statistics, R is fairly easy to pick up; otherwise it is likely to appear counterintuitive.
- Processing Large Data: R is not well regarded for processing data at scale or for building large-scale data products. Data scientists instead prefer languages like Python or Java when an actual product is to be built.
- Performance: Compared with other languages, R’s performance is not up to the mark, because R was designed with data scientists in mind, not computers. R is often observed to be slower than Java or Python.
- Smooth Workflow: Python integrates well into broader workflows and is therefore popular among developers and data scientists when statistical techniques need to be applied or when these tasks need to be integrated with web apps or production environments. To manage their entire data-related workflow, data scientists often choose Python first.
- Beneficial in ML: The libraries available for Python, such as scikit-learn, TensorFlow, pandas, NumPy and PyBrain, together with the flexibility of the language, make Python suitable for applying ML techniques and developing sophisticated models and prediction engines.
Why is Python not Good?
- Not suitable for specialized data tasks: Although Python is well known for its flexibility, there are still hundreds of R packages that have no equivalent Python substitute. If very specific tasks have to be done, R is preferred over Python.
So, in this blog post we’ve covered a few topics pertaining to ML, and we will cover the remaining topics in later posts of this course. We hope you are enjoying learning with us. Stay tuned for more blog posts like this.
Comment below if you have any queries related to the above introduction to machine learning.