Cleaning Movies Data for Tableau Visualization

Project Details
You are a business analyst consultant, and your client is a new movie production company looking to make a new movie. The client wants the movie to succeed so that it helps make a name for the new company. They are relying on you to explain movie trends that can inform their decision-making, and they’ve given you guidance to look into three specific areas:

more ...

Shiny Greenhouse App

"During 2016, Earth's globally averaged surface temperature was 0.94°C higher than the twentieth-century average. All 16 years of the twenty-first century rank among the 17 warmest years on record." - National Oceanic and Atmospheric Administration (NOAA)

If you know me, you know I care deeply about the health of our planet. The biggest threat to our civilization and to life on the planet right now is global warming. According to the Intergovernmental Panel on Climate Change, "Scientific evidence for warming of the climate system is unequivocal". "Majority of Scientists more ...


A/B test Udacity's website

Udacity is considering online experiments to test potential improvements to their website. Two versions of the website are shown to different users - usually the existing website and a potential change. My goal is to design and analyze an A/B test and write up a recommendation on whether Udacity should introduce a new version of the website.

more ...

Identify Fraud From Enron Data

In 2000, Enron was one of the largest companies in the United States. By 2002, it had collapsed into bankruptcy due to widespread corporate fraud. In the resulting federal investigation, a significant amount of typically confidential information entered the public record, including tens of thousands of emails and detailed financial data for top executives. Here, I build a supervised learning algorithm to identify fraudulent employees using the Enron dataset.

more ...

OSM Data Wrangling

OpenStreetMap (OSM) is a free, editable map of the world built entirely through crowdsourcing. To build an intuition for this wiki-like map, consider the recent earthquake in Nepal. Within only 48 hours of the quake, over 2000 volunteer mappers responded to the crisis by quadrupling the road mileage and adding 30% more buildings. OSM is the biggest crowdsourced project ever. However, since OSM data is human-edited, it comes with its own cleaning challenges.

more ...

Titanic Survival Exploration

In 1912, the RMS Titanic, the largest ship afloat at the time, sank after colliding with an iceberg. Of the 2224 passengers and crew aboard, 1502 died.

In this project, we will explore the training dataset (train) from Kaggle. This dataset contains demographic and passenger information about 891 of the 2224 passengers and crew aboard. The most interesting question here is: what features made people more likely to survive the sinking? Based on the available feature information, can we build a classification algorithm that can reasonably predict survival?

more ...

Test a Perceptual Phenomenon

In a Stroop task, participants are presented with a list of words, with each word displayed in a color of ink. The participant’s task is to say out loud the color of the ink in which the word is printed. The task has two conditions: a congruent words condition, and an incongruent words condition. In the congruent words condition, the words being displayed are color words whose names match the colors in which they are printed: for example RED more ...