Panel is an open-source Python library that lets you create custom interactive web apps and dashboards by connecting user-defined widgets to plots, images, tables, or text.
This project refers to Lambda Labs at Lambda School in which students spent the past 5 weeks building production-grade web applications, with some of them utilizing machine learning models as part of their backends.
The pandas library is a powerful tool for multiple phases of the data science workflow, including data cleaning, visualization, and exploratory data analysis. However, the size and complexity of the pandas library makes it challenging to discover the best way to accomplish any given task.
The Pattern library is a multipurpose library capable of handling the following tasks:
- Natural Language Processing: Performing tasks such as tokenization, stemming, POS tagging, sentiment analysis, etc.
- Data Mining: It contains APIs to mine data from sites like Twitter, Facebook, Wikipedia, etc.
- Machine Learning: Contains machine learning models such as SVM, KNN, and perceptron, which can be used for classification, regression, and clustering tasks.
In this article, we will see the first two applications of the Pattern library from the above list. We will explore the use of the Pattern Library for NLP by performing tasks such as tokenization, stemming and sentiment analysis. We will also see how the Pattern library can be used for web mining.
Topic Model: In a nutshell, it is a type of statistical model used for tagging abstract “topics” that occur in a collection of documents that best represents the information in them.
Many techniques are used to obtain topic models. This post aims to demonstrate the implementation of LDA: a widely used topic modeling technique.
String manipulations are an essential part of Data Science. The latest release of Vaex adds incredibly fast and memory efficient support for all common string manipulations. Compared to Pandas, the most popular DataFrame library in the Python ecosystem, string operations are up to ~30–100x faster on your quadcore laptop, and up to a 1000 times faster on a 32 core machine.
In this article, we will explore TextBlob, which is another extremely powerful NLP library for Python. TextBlob is built upon NLTK and provides an easy to use interface to the NLTK library. We will see how TextBlob can be used to perform a variety of NLP tasks ranging from parts-of-speech tagging to sentiment analysis, and language translation to text classification.