Description:

The imbalanced-learn library, which integrates with Pandas ML, offers several techniques for addressing class imbalance in data sets used for classification in machine learning (ML). In this course, explore oversampling, undersampling, and combinations of the two. Begin by using Pandas ML to explore a data set in which samples are not evenly distributed across target classes. Then apply oversampling with the RandomOverSampler class in the imbalanced-learn library, build a classification model with the oversampled data, and evaluate its performance. Next, learn how to create a balanced data set with the Synthetic Minority Oversampling Technique (SMOTE) and how to undersample a data set by applying the Near Miss, Cluster Centroids, and Neighbourhood Cleaning Rule techniques. Then look at ensemble classifiers for imbalanced data, apply combination samplers, and find correlations in a data set. Finally, learn how to build a multilabel classification model, explore the use of principal component analysis (PCA), and combine oversampling with PCA when building a classification model. The concluding exercise involves working with imbalanced data sets.
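
The idea behind random oversampling can be sketched in a few lines of plain Python. This is an illustrative stand-in only, not the imbalanced-learn implementation of RandomOverSampler: the function name `random_oversample`, its `seed` parameter, and the toy data are assumptions made for the example.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many samples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows that belong to this class in the original data.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy imbalanced data: six samples of class 0, two of class 1.
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [5.0], [5.5]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

After the call, both classes have six samples each. The added rows are exact copies of existing minority rows, which is the main difference from SMOTE, where new minority samples are synthesized rather than duplicated.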

Target Audience:

Duration: 01:24

Description:

Classification, regression, and clustering are some of the most commonly used machine learning (ML) techniques, and various algorithms are available for each of these tasks. In this 10-video course, learners can explore their application in Pandas ML. First, examine how to load data from a CSV (comma-separated values) file into a Pandas data frame and prepare the data for training a classification model. Then use the scikit-learn library to build and train a LinearSVC classification model and evaluate its performance with the available model evaluation functions. You will explore how to install Pandas ML and how to define and configure a ModelFrame, then compare training and evaluation in Pandas ML with the equivalent tasks in scikit-learn. Learn how to build a linear regression model by using Pandas ML. Then evaluate a regression model by using metrics such as R-squared and mean squared error, and visualize its performance with Matplotlib. Work with ModelFrames for feature extraction and encoding, and configure and build a clustering model with the K-Means algorithm, analyzing the resulting clusters to determine their distinguishing characteristics. Finally, complete an exercise on regression, classification, and clustering.
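
The two regression metrics named above can be written out directly. The sketch below uses plain Python rather than Pandas ML or scikit-learn, and the function names and sample values are assumptions chosen for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """1 minus the ratio of residual variation to total variation."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up actual values and model predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 7.5, 9.0]

mse = mean_squared_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
print(mse, r2)
```

A lower mean squared error and an R-squared closer to 1 both indicate predictions that track the actual values more closely.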

Target Audience:

Duration: 01:04

Description:

In this 8-video course, explore the fundamentals of regression and clustering and discover how to use a confusion matrix to evaluate classification models. Begin by examining the application of a confusion matrix and how it can be used to measure the accuracy, precision, and recall of a classification model. Then study an introduction to regression and how it works. Next, take a look at the characteristics of regression, such as simplicity and versatility, which have led to widespread adoption of this technique in a number of different fields. Learn to distinguish between supervised learning techniques, such as regression and classification, and unsupervised learning methods, such as clustering. You will look at how clustering algorithms find data points with common attributes and thus create logical groupings of data. Recognize the need to reduce large data sets with many features to a handful of principal components with the PCA (Principal Component Analysis) technique. Finally, conclude the course with an exercise recalling concepts such as precision and recall, and use cases for unsupervised learning.
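
As a small illustration of how accuracy, precision, and recall follow from a confusion matrix, here is a plain-Python sketch for a binary classifier. The helper name `confusion_counts` and the label lists are assumptions made for the example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # true positive
        elif predicted == positive:
            fp += 1  # false positive
        elif actual == positive:
            fn += 1  # false negative
        else:
            tn += 1  # true negative
    return tp, fp, fn, tn


# Made-up actual labels and model predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + fp + fn + tn)   # share of all predictions that are correct
precision = tp / (tp + fp)                   # share of predicted positives that are correct
recall = tp / (tp + fn)                      # share of actual positives that are found
print(accuracy, precision, recall)
```

With these labels the counts are 3 true positives, 2 false positives, 1 false negative, and 2 true negatives, giving an accuracy of 0.625, a precision of 0.6, and a recall of 0.75.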

Target Audience:

Duration: 00:49

Description:

Examine the fundamentals of machine learning (ML) and how Pandas ML can be used to build ML models in this 7-video course. The workings of Support Vector Machines for classifying data are also covered. Begin by learning about different kinds of machine learning algorithms, such as regression, classification, and clustering, as well as their specific applications. Then look at the process involved in learning relationships between input and output during the training phase of ML. This leads to an introduction to Pandas ML and the benefits of combining Pandas, scikit-learn, and XGBoost into a single library to ease the task of building and evaluating ML models. You will learn about Support Vector Machines, which are a supervised machine learning algorithm, and how they are used to find a hyperplane that divides data points into categories. Learners then study the concept of overfitting in machine learning, the problems associated with a model overfitted to training data, and how to mitigate the issue. The course concludes with an exercise in machine learning and classification.

Target Audience:

Duration: 00:46