Python’s pandas library is one of the things that makes Python a great programming language for data analysis. Pandas makes importing, analyzing, and visualizing data much easier. It builds on packages like NumPy and matplotlib to give you a single, convenient place to do most of your data analysis and visualization work.
Archives for August 2019
The Machine Learning team at commercetools is excited to release the beta version of our new Image Search API.
Image search (sometimes called reverse image search) is a tool that takes an image as a query and returns a duplicate or similar image as a response. The technology driving this search engine is called computer vision, and advancements in this field are giving rise to some compelling product features.
What is Pyjanitor? Before we continue learning how to use Pandas and Pyjanitor to clean our datasets, we will learn about this package. The Python package Pyjanitor extends Pandas with a verb-based API. This easy-to-use API provides us with convenient data cleaning techniques. It started out as a port of the R package janitor and is inspired by the ease of use and expressiveness of the R package dplyr. Note that there are several ways to work with its methods, and this post will not cover all of them (see the documentation).
In this tutorial, you will learn how to implement a simple scene boundary/shot transition detector with OpenCV.
In this post, which can be read as a follow-up to our ultimate web scraping guide, we will cover almost all the tools Python offers for web scraping. We will go from the most basic to the most advanced and cover the pros and cons of each. Of course, we won’t be able to cover every aspect of every tool we discuss, but this post should be enough to give you a good idea of which tool does what, and when to use which.
One of the most common mistakes data scientists make when training machine learning models is incorrectly splitting data for training and testing. A train/test split divides the data into one portion used to fit the model and a separate, held-out portion used to evaluate it.
Learner makes this simple with a single parameter selection during the model building process. It’s also simple to set the percentage split between training and testing data for each model trained.
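As a rough illustration of what a percentage split does, here is a minimal sketch in plain Python. It is not Learner's implementation; the function name `train_test_split`, the `test_fraction` parameter, and the fixed `seed` are assumptions made for this example. The point it shows is that the rows are shuffled once and then cut into two parts that never overlap.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle rows and split them into disjoint train and test lists.

    Hypothetical helper for illustration only.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")

    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    # Hold out at least one row for testing.
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


train_rows, test_rows = train_test_split(range(10), test_fraction=0.3)
```

Because both parts come from one shuffled list, no row can end up in both sets, which is the kind of overlap an incorrect split tends to introduce.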
Systematic trading allows you to test and evaluate your trading ideas before risking your money. By formulating trading ideas as concrete rules, you can evaluate past performance and draw conclusions about the viability of your trading plan.
Following systematic rules provides a consistent approach that gives you some degree of predictability of returns and, perhaps more importantly, takes emotions and second-guessing out of the equation.
At the outset, getting started with professional-grade development and backtesting of systematic strategies can seem daunting. Many resort to simplified software, which limits their potential.
NAG has developed, in collaboration with Xi-FINTIQ, a CVA demonstration code to show how the NAG Library and NAG Algorithmic Differentiation (AD) tool dco/c++ combined with Origami – a Grid/Cloud Task Execution Framework available through NAG – can work together to solve large scale CVA computations.
What Softmax is, how it’s used, and how to implement it in Python.
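For reference, a minimal sketch of softmax using only the standard library follows. It is not taken from the linked article; the function name and the max-subtraction step are choices made for this example. Softmax exponentiates each score and divides by the sum of the exponentials, so the outputs are positive and sum to 1.

```python
import math


def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    # Subtracting the largest score leaves the result unchanged
    # but avoids overflow in math.exp for large inputs.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
```

Larger scores receive larger probabilities, and the ordering of the inputs is preserved in the outputs.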
Transfer learning is a powerful technique for training deep neural networks that allows one to take knowledge learned about one deep learning problem and apply it to a different, yet similar learning problem.
Using transfer learning can dramatically speed up the rate of deployment for an app you are designing, making both the training and implementation of your deep neural network simpler and easier.
In this tutorial, you will learn how to automatically find learning rates using Keras. This guide provides a Keras implementation of fast.ai’s popular “lr_find” method.
This article introduces how to build a Python and Flask based web application for performing text analytics on internet resources such as blog pages. To perform text analytics, I will use Requests for fetching web pages, BeautifulSoup for parsing HTML and extracting the viewable text, and the TextBlob package to calculate a few sentiment scores. The code for this article is hosted on GitHub, so please fork and experiment with it.
With Python code to scrape, extract, transform and load it into an HDF5 data store to please your future self.
In this tutorial, you will learn how to use Cyclical Learning Rates (CLR) and Keras to train your own neural networks. Using Cyclical Learning Rates you can dramatically reduce the number of experiments required to tune and find an optimal learning rate for your model.
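To give a sense of what a cyclical schedule looks like, here is a framework-independent sketch of the triangular policy, in which the learning rate rises linearly from a lower bound to an upper bound and back again. It is not the tutorial's Keras code; the function name, parameter names, and the example bounds are assumptions for illustration. In practice a function like this would be called once per training batch to set the optimizer's learning rate.

```python
import math


def triangular_lr(iteration, base_lr=0.001, max_lr=0.006, step_size=4):
    """Learning rate at a given batch iteration under a triangular cycle.

    step_size is the number of iterations in half a cycle
    (hypothetical default chosen for this example).
    """
    # Which cycle (1-based) the iteration falls into.
    cycle = math.floor(1 + iteration / (2 * step_size))
    # Distance from the peak of the current cycle, scaled to [0, 1].
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)


# One full cycle: up for step_size iterations, then back down.
schedule = [triangular_lr(i) for i in range(9)]
```

With these example values the rate starts at `base_lr`, peaks at `max_lr` after four iterations, and returns to `base_lr` after eight.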
Searching for pulsars is a labor-intensive process that requires experienced astronomers and trained volunteers for their classification. In this article, we implement machine learning techniques to facilitate the process.
Those who work with data spend most of their time on data pipelines, because the bulk of a machine learning project involves data collection and cleaning. Loominus gives everyone the power to build the data pipelines critical to any machine learning project.
Teraport is a powerful tool within the Loominus product suite that ingests and stages data. In another post, we’ll discuss the data ingestion APIs. For now we’ll focus on building a powerful data pipeline for feature engineering.