About My Journey to Kaggle Expert

Juzef the Koala is trying to determine in which area he is an expert

Yesterday, I reached one more minor achievement of my own – my notebooks received enough upvotes to earn me the status of Kaggle Notebooks Expert. I had set it as a personal goal for the end of this year, but thought that it was too optimistic (I will explain why later), and just continued ‘dancing like nobody’s watching’.

Purple frame adds more vibes!

In one of my previous posts I already shared the background of my relationship with Kaggle – I registered there 9 years ago, but abandoned it almost immediately, as soon as I decided that I was not smart enough to do data science. It’s quite understandable when you look at other people’s notebooks and understand almost nothing – especially when checking competition-related works. Even more demotivating is to speak with different data scientists and find out that some of them consider Kaggle something student-level/beginner-level (“ah, yeah, I remember how I checked those Titanic and Iris projects”), apart from the competitions with real prizes.

At the same time, I cannot agree with such an assessment. Kaggle is a great place for any person working with data, no matter what level you are currently at. With its available datasets and computing power, it is still a very inspiring playground to try something new – and, sometimes, to be recognized for that.

When I rejoined the portal 2 months ago, I immediately started playing with datasets I’m interested in – books, video games, dinosaurs, koalas. In most cases, my choice of datasets was driven just by my personal interests, not by the popularity or hype of the data source. And I could spend hours composing EDAs, playing with dataviz stuff, or trying something completely new for myself – and then see that my notebook was upvoted by 1 person.

I’ve prepared a dataset with Heroes of Might & Magic 4 unit stats, and it was downloaded 8 times – I have no clue why people did that 🙂

It is a sad but obvious truth that on Kaggle, the audience’s attention goes either to competitions or to very trendy datasets, which change every day. It means that outside of those, you need to be very lucky to receive feedback from even a dozen users – and sometimes your work will not be seen by anyone at all.

My attempt to build graphs from Wikipedia articles about video games received 3 upvotes – but who knows how many hours I spent on that…

Of my 5 bronze medal-level notebooks, only one was not related to a competition – it was an EDA based on a synthetic dataset about life science-oriented application usage. It was exactly that kind of lucky moment – as soon as another user published the dataset, I was one of the first to do something with it, and then the dataset became trendy, and significantly more people viewed my work.

I received the other 4 bronze medals in playground competitions. From October to December, I created notebooks with simple approaches, like basic logistic regression or XGBoost, without spending much time on model fine-tuning and feature engineering. At the same time, due to the high volume of visitors to playground competitions, even simple works receive users’ attention – especially if you update them once every several days.

An alternative here is to be as social as possible – from that perspective Kaggle can definitely be considered a social platform – but uh, it was always too challenging for me. I’m really jealous of people who can simply share the results of their work in the Discussions section – and in some cases be recognized by the community. It is especially demotivating when you see highly upvoted notebooks with just .describe() and .value_counts() methods and nothing else – but maybe that is how it should work.

So, anyway, I consider Kaggle Expert status probably a minor achievement for anyone else, but a major one for myself – for me personally, it proves that even a person as asocial as me can receive some portion of recognition. It doesn’t add value for me as an analytics specialist, but since it was something I could not control directly (compared to certifications, where I can just master the learning materials and then pass the exam), I got a lot of satisfaction from preparing the notebooks.

I’m not planning anything more as the next step here – I just hope that I will continue discovering other stuff on Kaggle, and learning something new continuously.