Reddit has been at the epicenter of one of the biggest movements in the world of finance. Although it seemed like an unlikely source of such a movement, it’s hardly surprising in hindsight.
Reddit’s trading-focused subreddits host a huge amount of discussion about what is happening in the markets, so it is only logical to tap into this data source.
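When building a tool to extract data from these subreddits, one of the first things we need to do is identify what the text we’re extracting is actually about. For that, we will be using named entity recognition (NER), which finds the names mentioned in a piece of text and labels each one with a type, such as an organization or a person.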
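
To make the idea concrete, here is a minimal sketch of the kind of output an NER step produces: spans of text tagged with an entity type. It is a toy dictionary lookup, not a trained model. The `GAZETTEER` entries, the `Entity` type, and the `find_entities` function are all hypothetical names made up for this illustration. A real NER model predicts labels from context rather than looking them up in a fixed list.

```python
import re
from typing import Dict, List, NamedTuple


class Entity(NamedTuple):
    """One recognized entity: the matched text, its type, and its character span."""
    text: str
    label: str
    start: int
    end: int


# Hypothetical lookup table, for illustration only. A trained NER model
# would infer these labels from context instead of using a fixed list.
GAZETTEER: Dict[str, str] = {
    "GameStop": "ORG",
    "Tesla": "ORG",
    "Elon Musk": "PERSON",
}


def find_entities(text: str, gazetteer: Dict[str, str] = GAZETTEER) -> List[Entity]:
    """Return every gazetteer name found in `text`, ordered by position."""
    entities = []
    for name, label in gazetteer.items():
        # Word boundaries prevent matching a name inside a longer word.
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text):
            entities.append(Entity(match.group(), label, match.start(), match.end()))
    return sorted(entities, key=lambda e: e.start)


post = "Elon Musk tweeted about GameStop again, and Tesla dipped."
for ent in find_entities(post):
    print(ent.text, ent.label)
```

Run on the sample post, this prints `Elon Musk PERSON`, `GameStop ORG`, and `Tesla ORG`. The shape of the result (text, label, position) is the part that carries over to a real NER model. The lookup itself would miss any name not in the list, which is exactly the limitation a trained model is meant to address.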