LendingClub is the world’s largest peer-to-peer lending platform. Until recently (through the end of 2018), LendingClub published a public dataset of all loans issued since the company’s launch in 2007.
This post reviews NumPy main components and functionality, with attention to the needs of Data Science and Machine Learning practitioners, and people who aspire to become a data professional.
What it says on the tin.
Although there are an increasing number of commercial AutoML products, the open-source ecosystem has been innovating here as well. In the early days of the AutoML movement, the focus was on those looking to leverage the power of ML models without a background in data science – citizen data scientists. Today, however, AutoML tools have a lot to offer experts too.
One of the most common mistakes data scientists make when training machine learning models is incorrectly splitting data for training and testing. The train/test split involves splitting data during the model training and evaluation process.
Learner makes this simple with a single parameter selection during the model building process. It’s also simple to set the percentage split between training and testing data for each model trained.
Transfer learning is a powerful technique for training deep neural networks that allows one to take knowledge learned about one deep learning problem and apply it to a different, yet similar learning problem.
Using transfer learning can dramatically speed up the rate of deployment for an app you are designing, making both the training and implementation of your deep neural network simpler and easier.
In this tutorial, you will learn how to automatically find learning rates using Keras. This guide provides a Keras implementation of fast.ai’s popular “lr_find” method.
Searching for pulsars is a labor-intensive process that requires experienced astronomers and trained volunteers for their classification. In this article, we implement machine learning techniques to facilitate the process.
Data pipelines are where most of the time is spent for those working with data because the bulk of a machine learning project involves data collection and cleaning. Loominus gives everyone the power to build the data pipelines critical to any machine learning project.
Teraport is a powerful tool within the Loominus product suite that ingests and stages data. In another post, we’ll discuss the data ingestion APIs. For now we’ll focus on building a powerful data pipeline for feature engineering.
The Pattern library is a multipurpose library capable of handling the following tasks:
- Natural Language Processing: Performing tasks such as tokenization, stemming, POS tagging, sentiment analysis, etc.
- Data Mining: It contains APIs to mine data from sites like Twitter, Facebook, Wikipedia, etc.
- Machine Learning: Contains machine learning models such as SVM, KNN, and perceptron, which can be used for classification, regression, and clustering tasks.
In this article, we will see the first two applications of the Pattern library from the above list. We will explore the use of the Pattern Library for NLP by performing tasks such as tokenization, stemming and sentiment analysis. We will also see how the Pattern library can be used for web mining.
Topic Model: In a nutshell, it is a type of statistical model used for tagging abstract “topics” that occur in a collection of documents that best represents the information in them.
Many techniques are used to obtain topic models. This post aims to demonstrate the implementation of LDA: a widely used topic modeling technique.