Bayesian Optimization provides a principled technique based on Bayes' Theorem to direct an efficient and effective search of a global optimization problem. It works by building a probabilistic model of the objective function, called the surrogate function, that is then searched efficiently with an acquisition function before candidate samples are chosen for evaluation on the real objective function.
This document serves as an introduction, crash course, and quick API reference for TensorFlow 2.0.
A large amount of data that is generated today is unstructured, which requires processing to generate insights. Some examples of unstructured data are news articles, posts on social media, and search history. The process of analyzing natural language and making sense out of it falls under the field of Natural Language Processing (NLP). Sentiment analysis is a common NLP task, which involves classifying texts or parts of texts into a pre-defined sentiment. You will use the Natural Language Toolkit (NLTK), a commonly used NLP library in Python, to analyze textual data.
In this post, we are going to work with Pandas iloc and loc. More specifically, we are going to learn slicing and indexing through iloc and loc examples.
Once we have a dataset loaded as a Pandas dataframe, we often want to start accessing specific parts of the data based on some criteria. For instance, if our dataset contains the result of an experiment comparing different experimental groups, we may want to calculate descriptive statistics for each experimental group separately.
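As a minimal sketch of the idea above, the snippet below selects parts of a dataframe with loc and iloc and then computes a statistic per experimental group. The column names, index labels, and values are made up for illustration and do not come from the post.

```python
import pandas as pd

# Hypothetical experiment results: two groups, one score per participant.
df = pd.DataFrame(
    {
        "group": ["control", "control", "treatment", "treatment"],
        "score": [10.0, 12.0, 15.0, 17.0],
    },
    index=["p1", "p2", "p3", "p4"],
)

# loc selects by label and also accepts a boolean mask for the rows.
treatment_scores = df.loc[df["group"] == "treatment", "score"]

# iloc selects by integer position; the end of the slice is excluded.
first_two_rows = df.iloc[0:2]

# Descriptive statistics computed separately for each experimental group.
group_means = df.groupby("group")["score"].mean()
```

Note that label slices with loc, such as `df.loc["p1":"p2"]`, include the end label, whereas positional slices with iloc exclude the end position.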
In this tutorial, we’re going to dig into how to transform data using Python scripts and the command line.
But first, it’s worth asking the question you may be thinking: “How does Python fit into the command line and why would I ever want to interact with Python using the command line when I know I can do all my data science work using IPython notebooks or Jupyter lab?”
Notebooks are great for quick data visualization and exploration, but Python scripts are the way to put anything we learn into production. Let’s say you want to make a website to help people make Hacker News posts with ideal headlines and submission times. To do this, you’ll need scripts.
Logistic regression is the bread-and-butter algorithm for machine learning classification. If you’re a practicing or aspiring data scientist, you’ll want to know the ins and outs of how to use it. Also, Scikit-learn’s LogisticRegression is spitting out warnings about changing the default solver, so this is a great time to learn when to use which solver. 😀
TensorFlow 2 is now live! This tutorial walks you through the process of building a simple CIFAR-10 image classifier using deep learning. In this tutorial, we will:
- Define a model
- Set up a data pipeline
- Train the model
- Accelerate training speed with multiple GPUs
- Add callbacks for monitoring progress/updating learning schedules
The code in this tutorial is available here.
Comparing 5 popular neural net architectures on iOS: VGG16, ResNet50, InceptionV3, GoogleNet, and SqueezeNet using PyTorch.
Since the advent of deep reinforcement learning for game play in 2013, and simulated robotic control shortly after, a multitude of new algorithms have flourished. Most of these are model-free algorithms which can be categorized into three families: deep Q-learning, policy gradients, and Q-value policy gradients.
Onesies with logos of open source software. Your favorite open source software for your favorite munchkin.
Although there are an increasing number of commercial AutoML products, the open-source ecosystem has been innovating here as well. In the early days of the AutoML movement, the focus was on those looking to leverage the power of ML models without a background in data science – citizen data scientists. Today, however, AutoML tools have a lot to offer experts too.
One of the milestones of the investment management application was to implement an end to end solution that starts by fetching company stock prices and builds a set of efficient and optimum portfolios using optimisation routines.
In this article, we’ll use some basic machine learning methods to train a bot to play cards against me. The card game that I’m interested in is called Literature, a game similar to Go Fish.
The version of Literature that we implemented is roughly similar to the rules I linked above. Literature is played in two teams, and the teams compete to collect “sets.” A set is a collection of either A – 6 of a suit or 8 – K of a suit (7’s are not included in the game).
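One way to picture the set structure described above is to enumerate it. The sketch below is only an illustration, not the article's code: the names `SUITS`, `LOW_RANKS`, `HIGH_RANKS`, and `build_sets` are hypothetical, and it assumes the ace counts as the lowest card, as the "A – 6" range suggests.

```python
from itertools import product

SUITS = ["clubs", "diamonds", "hearts", "spades"]
LOW_RANKS = ["A", "2", "3", "4", "5", "6"]     # assumes the ace is low
HIGH_RANKS = ["8", "9", "10", "J", "Q", "K"]   # 7s are left out of the game


def build_sets():
    """Map each (suit, half) pair to the list of cards in that set."""
    halves = [("low", LOW_RANKS), ("high", HIGH_RANKS)]
    sets = {}
    for suit, (half, ranks) in product(SUITS, halves):
        sets[(suit, half)] = [(rank, suit) for rank in ranks]
    return sets


LITERATURE_SETS = build_sets()
```

Under these assumptions there are eight sets of six cards each, so the game uses 48 of the 52 cards in a standard deck.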
The purpose of this article is to introduce the reader to some of the tools used to spot stock market trends.
We will utilize a data set consisting of five years of daily stock market data for Analog Devices. The time period we consider starts on January 1, 2013 and ends on December 31, 2017. We will start analyzing the data using line plots, then introduce candlestick charts. We then introduce patterns that can be seen in the candlestick chart and used to spot changes in the market. We add another level of analysis by overlaying moving averages and discussing how these can help confirm trend changes. Finally, we construct a figure that concisely summarizes the stock price data for any company.
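To make the moving-average overlay concrete, here is a minimal sketch of a trailing simple moving average in plain Python. The function name `simple_moving_average` and the prices are made up for illustration; the article may use a different kind of average or a library routine.

```python
def simple_moving_average(prices, window):
    """Return the trailing simple moving average of `prices`.

    The first `window - 1` entries are None because a full window
    of prices is not yet available at those positions.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running_sum = 0.0
    for i, price in enumerate(prices):
        running_sum += price
        if i >= window:
            # Drop the price that just fell out of the window.
            running_sum -= prices[i - window]
        averages.append(running_sum / window if i >= window - 1 else None)
    return averages


closes = [10.0, 11.0, 12.0, 13.0, 14.0]  # made-up closing prices
sma3 = simple_moving_average(closes, 3)
# sma3 == [None, None, 11.0, 12.0, 13.0]
```

A longer window smooths more but reacts later, which is why a short and a long average are often plotted together when looking for trend changes.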
An introduction to running parallel tasks with Celery, plus how and why we built an API on top of Celery’s Canvas task primitives.
One of the technology goals of Zymergen is to empower biologists to explore genetic edits of microbes in a high throughput and highly automated manner. The Computational Biology team at Zymergen is responsible for building software to help scientists design and execute these genetic edits. (For a brief overview, see our Zymergen 101 tutorial).
In this tutorial you will learn how to use OpenCV to stream video from a webcam to a web browser/HTML page using Flask and Python.
Python’s pandas library is one of the things that makes Python a great programming language for data analysis. Pandas makes importing, analyzing, and visualizing data much easier. It builds on packages like NumPy and matplotlib to give you a single, convenient, place to do most of your data analysis and visualization work.
The Machine Learning team at commercetools is excited to release the beta version of our new Image Search API.
Image search (sometimes called reverse image search) is a tool, where given an image as a query, a duplicate or similar image is returned as a response. The technology driving this search engine is called computer vision, and advancements in this field are giving way to some compelling product features.
What is Pyjanitor? Before we continue learning how to use Pandas and Pyjanitor to clean our datasets, we will learn about this package. The Python package Pyjanitor extends Pandas with a verb-based API. This easy-to-use API provides us with convenient data cleaning techniques. Apparently, it started out as a port of the R package janitor. Furthermore, it is inspired by the ease of use and expressiveness of the R package dplyr. Note that there are several different ways to work with the methods, and this post will not cover all of them (see the documentation).
In this tutorial, you will learn how to implement a simple scene boundary/shot transition detector with OpenCV.
In this post, which can be read as a follow-up to our ultimate web scraping guide, we will cover almost all the tools Python offers you for web scraping. We will go from the most basic to the most advanced and will cover the pros and cons of each. Of course, we won’t be able to cover every aspect of every tool we discuss, but this post should be enough to give you a good idea of which tool does what, and when to use which.
One of the most common mistakes data scientists make when training machine learning models is incorrectly splitting data for training and testing. The train/test split involves splitting data during the model training and evaluation process.
Learner makes this simple with a single parameter selection during the model building process. It’s also simple to set the percentage split between training and testing data for each model trained.
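For readers who want to see what a train/test split does under the hood, here is a generic sketch in plain Python. It is not Learner's API; the function name `train_test_split`, its parameters, and the defaults are assumptions chosen for illustration.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle `rows` and split them into (train, test) lists.

    Shuffling before splitting avoids a test set made only of the
    last rows, which can bias the evaluation when data is ordered.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


train, test = train_test_split(range(10), test_fraction=0.3)
```

The key property is that no row appears in both parts. A random shuffle is not appropriate for every dataset; time series, for example, are usually split by date instead.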
Systematic trading allows you to test and evaluate your trading ideas before risking your money. By formulating trading ideas as concrete rules, you can evaluate past performance and draw conclusions about the viability of your trading plan.
Following systematic rules provides a consistent approach where you will have some degree of predictability of returns, and perhaps more importantly, it takes emotions and second guessing out of the equation.
At the outset, getting started with professional-grade development and backtesting of systematic strategies can seem daunting. Many resort to simplified software, which limits their potential.
NAG has developed, in collaboration with Xi-FINTIQ, a CVA demonstration code to show how the NAG Library and NAG Algorithmic Differentiation (AD) tool dco/c++ combined with Origami – a Grid/Cloud Task Execution Framework available through NAG – can work together to solve large scale CVA computations.
What Softmax is, how it’s used, and how to implement it in Python.
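As a minimal sketch, softmax maps a vector of scores z to probabilities via softmax(z)_i = exp(z_i) / sum_j exp(z_j). The version below uses only the standard library and subtracts the maximum score first, a common trick to avoid overflow; the article's own implementation may differ (for example, it may use NumPy).

```python
import math


def softmax(logits):
    """Convert a list of real-valued scores into probabilities."""
    # Subtracting the maximum does not change the result mathematically,
    # but it keeps math.exp from overflowing on large scores.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
```

The outputs are positive, sum to 1, and preserve the ordering of the inputs, which is why softmax is commonly used as the last layer of a classifier.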
Transfer learning is a powerful technique for training deep neural networks that allows one to take knowledge learned about one deep learning problem and apply it to a different, yet similar learning problem.
Using transfer learning can dramatically speed up the rate of deployment for an app you are designing, making both the training and implementation of your deep neural network simpler and easier.
In this tutorial, you will learn how to automatically find learning rates using Keras. This guide provides a Keras implementation of fast.ai’s popular “lr_find” method.
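The core of a learning-rate range test is a schedule that sweeps the learning rate from a small value to a large one while the loss is recorded at each step. The sketch below shows only such a schedule, assuming exponential growth between two hypothetical bounds; the function name and arguments are made up, and this is not the tutorial's Keras implementation.

```python
def lr_range_test_schedule(start_lr, end_lr, num_steps):
    """Return `num_steps` learning rates growing exponentially
    from `start_lr` to `end_lr` (both inclusive)."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    if start_lr <= 0 or end_lr <= start_lr:
        raise ValueError("need 0 < start_lr < end_lr")
    ratio = end_lr / start_lr
    return [start_lr * ratio ** (i / (num_steps - 1)) for i in range(num_steps)]


# Six steps from 1e-5 to 1.0: each step multiplies the rate by 10.
lrs = lr_range_test_schedule(1e-5, 1.0, 6)
```

In practice one would train for one batch at each rate, plot loss against learning rate on a log scale, and pick a rate from the region where the loss is still falling steeply.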
This article introduces how to build a Python and Flask based web application for performing text analytics on internet resources such as blog pages. To perform text analytics, I will use Requests for fetching web pages, BeautifulSoup for parsing HTML and extracting the viewable text, and the TextBlob package to calculate a few sentiment scores. The code for this article is hosted on GitHub so please fork and experiment with it.
With Python code to scrape, extract, transform and load it into an HDF5 data store to please your future self.
In this tutorial, you will learn how to use Cyclical Learning Rates (CLR) and Keras to train your own neural networks. Using Cyclical Learning Rates you can dramatically reduce the number of experiments required to tune and find an optimal learning rate for your model.
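To illustrate what a cyclical schedule looks like, here is a sketch of the triangular policy, in which the learning rate rises linearly from a lower bound to an upper bound and back again. The function name, parameter names, and example values are assumptions for illustration; the tutorial's Keras callback may expose different options.

```python
import math


def triangular_clr(iteration, base_lr, max_lr, step_size):
    """Learning rate at `iteration` under a triangular cyclical policy.

    `step_size` is the number of iterations in half a cycle, so the
    rate climbs for `step_size` iterations and falls for the next
    `step_size` iterations.
    """
    cycle = math.floor(1 + iteration / (2 * step_size))
    # x is 1 at the start and end of a cycle and 0 at its midpoint.
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


# Hypothetical bounds: one full cycle takes 8 iterations.
schedule = [triangular_clr(i, 0.001, 0.006, 4) for i in range(9)]
```

With these values the rate starts at 0.001, peaks at 0.006 on iteration 4, and returns to 0.001 on iteration 8, after which the pattern repeats.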
Searching for pulsars is a labor-intensive process that requires experienced astronomers and trained volunteers for their classification. In this article, we implement machine learning techniques to facilitate the process.
Data pipelines are where most of the time is spent for those working with data because the bulk of a machine learning project involves data collection and cleaning. Loominus gives everyone the power to build the data pipelines critical to any machine learning project.
Teraport is a powerful tool within the Loominus product suite that ingests and stages data. In another post, we’ll discuss the data ingestion APIs. For now we’ll focus on building a powerful data pipeline for feature engineering.
In this post we will learn how to create a binder so that our data analysis, for instance, can be fully reproduced by other researchers. That is, in this post we will learn how to use binder for reproducible research.
Hugging Face, the NLP startup behind several social AI apps and open source libraries such as PyTorch BERT, just released a new python library called PyTorch Transformers.
Transformers are a new set of techniques used to train high-performing and efficient models for natural language processing (NLP) and natural language understanding (NLU) tasks such as question answering and sentiment analysis. Several of the recent techniques used to improve and advance the performance of NLP models, such as XLNet and BERT, are based on a variation of the Transformer.
There are countless reasons why we should learn Bayesian statistics. In particular, Bayesian statistics is emerging as a powerful framework to express and understand next-generation deep neural networks.
Advanced machine learning everyone can use. Stage data. Build models with no code. Manage models in production.
What it sounds like 🙂
Calculating Black-Scholes implied volatilities is a key part of financial modelling, and is not easy to do efficiently.
The benchmark in this field is the iterative method due to Peter Jaeckel (2015), though some banks have their own methods. NAG have teamed up with Dr Kathrin Glau and her colleagues from Queen Mary University of London to see whether their research in Chebyshev interpolation could be combined with NAG’s expertise in efficient computing to provide a faster way of obtaining implied volatilities.
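To show what "calculating an implied volatility" means, here is a deliberately simple baseline: price a European call with the Black-Scholes formula and invert it by bisection. This is neither Jaeckel's method nor the Chebyshev interpolation approach mentioned above, and it is far slower than either; the function names, bounds, and tolerances are assumptions chosen for illustration.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_price(spot, strike, rate, maturity, vol):
    """Black-Scholes price of a European call with no dividends."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def implied_vol_bisection(price, spot, strike, rate, maturity,
                          lo=1e-6, hi=5.0, tol=1e-10, max_iter=200):
    """Find the volatility whose Black-Scholes call price equals `price`.

    Works because the call price increases monotonically with volatility.
    """
    low_price = bs_call_price(spot, strike, rate, maturity, lo)
    high_price = bs_call_price(spot, strike, rate, maturity, hi)
    if not low_price <= price <= high_price:
        raise ValueError("price is outside the range spanned by [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if bs_call_price(spot, strike, rate, maturity, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip: price an option at 20% volatility, then recover the 20%.
target = bs_call_price(100.0, 100.0, 0.01, 1.0, 0.2)
iv = implied_vol_bisection(target, 100.0, 100.0, 0.01, 1.0)
```

Each bisection step needs a full price evaluation and halves the search interval only once, which hints at why faster dedicated methods matter when millions of volatilities have to be computed.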
This article describes how to use Microsoft Azure’s Cognitive Services Face API and Python to identify, count and classify people in a picture. In addition, it will show how to use the service to compare two face images and tell if they are the same person. We will try it out with several celebrity look-alikes to see if the algorithm can tell the difference between two similar Hollywood actors. By the end of the article, you should be able to use these examples to further explore Azure’s Cognitive Services with Python and incorporate them in your own projects.
In today’s tutorial, you will learn how to use Keras’ ImageDataGenerator class to perform data augmentation. I’ll also dispel common confusions surrounding what data augmentation is, why we use data augmentation, and what it does/does not do.
Machine learning is pretty undeniably the hottest topic in data science right now. It’s also the basic concept that underpins some of the most exciting areas in technology, like self-driving cars and predictive analytics. Searches for Machine Learning on Google hit an all-time high in April of 2019, and the interest hasn’t declined much since.
This tutorial will show you how to develop, completely from scratch, a stand-alone photo editing app to add filters to your photos using Python, Tkinter, and OpenCV!
For roughly $100 USD, you can add deep learning to an embedded system or your next internet-of-things project.
Are you just getting started with machine/deep learning, TensorFlow, or Raspberry Pi? Perfect, this blog series is for you!
But like in most cities, finding a parking space here is always frustrating. Spots get snapped up quickly and even if you have a dedicated parking space for yourself, it’s hard for friends to drop by since they can’t find a place to park.
My solution was to point a camera out the window and use deep learning to have my computer text me when a new parking spot opens up.
In this tutorial, you will learn how to use Keras and Mask R-CNN to perform instance segmentation (both with and without a GPU).
Using Mask R-CNN we can perform both object detection, giving us the (x, y)-bounding box coordinates for each object in an image, and instance segmentation, enabling us to obtain a pixel-wise mask for each individual object in an image.
A Comprehensive Guide to Modeling with H2O.ai and AutoML in Python