Photo by Ian Schneider on Unsplash

my happy place

After several months of continuous writing, I have published a great number of blog posts, and I feel it is time to organize them in one place. I cannot express how grateful I am to those who viewed, read, clapped for, and responded to my articles. Watching the number of followers…

Photo by MORAN on Unsplash

Data Science, Data Visualization, Programming

my first for-fun data science project

Recently I became obsessed with a Japanese TV show. I found that I could not stop checking Twitter, Instagram, and a Chinese app called Douban for updates and discussions about the show. In the meantime, I ran into an introductory article about the Python library Twint, which is very convenient…

Photo by Cosiela Borta on Unsplash

part 1: preprocessing text data

It is estimated that 80% of the world’s data is unstructured. Deriving information from unstructured data is therefore an essential part of data analysis. Text mining is the process of deriving valuable insights from unstructured text data, and sentiment analysis is one application of text mining. It is using natural…

Photo by David Ballew on Unsplash

Part 1: A beginner's guide to K-means

Clustering is one of the most widely used unsupervised machine learning techniques. You can think of clustering as putting unorganized data points into different categories so that you can learn more about the structure of your data. Clustering has a variety of applications in extracting information from data without labels. For…

Photo by Nadine Shaabana on Unsplash

conduct reliable causal inference with historical data

A causal relationship is a much stronger relationship between two variables than a correlation. Although a causal relationship is hard to establish, it gives meaningful insights and informative guidance once proven. In my previous article, I discussed the what, why, and how of causal inference:

As mentioned in the…

Photo by Clem Onojeghuo on Unsplash

Office Hours

Nail the data science interviews with confidence, part 5

I have listed technical questions to practice in machine learning, statistics, and probability theory in my previous articles on data science interview preparation. I have also discussed strategies for preparing case study questions before and during data science interviews. This article is the fifth…

Zijing Zhu

