Turning to DuckDB when you need to crunch more numbers faster than pandas in your Streamlit app
Data
Holy 🦆uck! Fast Analysis with DuckDB + Pyarrow gerardbentley.com
Datasets from real-world scenarios are important for building and testing machine learning models. You may just want to have some data to experiment with an algorithm. You may also want to evaluate your model by setting up a benchmark or determining its weaknesses using different sets of data. Sometimes, you may also want to create synthetic datasets, where you can test your algorithms under controlled conditions by adding noise, correlations, or redundant information to the data.
Awesome Pandas Tricks youtube.com
Learn these fun, exciting, unusual and just plain awesome pandas tricks to solve problems from the Advent of Code.
Web Scraping 101 with Python daolf.com
In this post, which can be read as a follow-up to our ultimate web scraping guide, we will cover almost all the tools Python offers for web scraping. We will go from the most basic to the most advanced and cover the pros and cons of each. Of course, we won’t be able to cover every aspect of every tool we discuss, but this post should be enough to give you a good idea of which tool does what, and when to use which.
In this post, we will learn how to create a Binder so that our work (a data analysis, for instance) can be fully reproduced by other researchers.