Analysis: Social Data Analysis | ITP | Instructor: Gilad Lotan
Dataviz: Maps, Lies and Storytelling | ITP | Instructor: Andrew Hill
Technologies: Python, Pandas, Gephi and CartoDB.
At the beginning, I struggled to find a way to approach the Citibike data. This first graph shows the number of trips that departed from three different stations in February. As you can see, there is a deep valley on Feb 13th, when two of those stations show no activity at all. I thought this was interesting and googled it.
It was the worst day ever for the system: the only day under 1000 trips since it started operating. Mhhh, well, that’s not entirely true: Sandy destroyed about 3000 bikes before the inauguration of the system. Anyway, it was the worst winter storm in 2013. So I thought this was kind of interesting, and repeated the experiment for other months.
But the results were not very interesting. In general, people tend to use the system more on Mondays and Tuesdays than near the weekend. Obviously, people use bikes more when the weather is good and, apparently, lunar eclipses tend to eclipse the system, even on a Tuesday with good weather.
Or this graph, showing the ages of the people who use Citibike, in which we can see that the people who use the system the most were born in the 80’s.
I decided to think about types of stations. I identified five main groups of stations: Midtown, Manhattan near the bridges from Brooklyn, Brooklyn, Williamsburg, and big central hubs (in cyan and red, between 14th and 42nd Street). The graph shows their connections and relative “importance” to the system.
I wanted to know which were the top twenty stations for March 2014, meaning the stations that sent and received the most trips.
As you can see, except for W 38 and University Pl, the top start and end stations are the same. That makes sense because most of them are located near big transportation hubs, such as Grand Central and Penn Station.
So I decided to see where people travel to from those stations. In this case, I’m showing Lafayette St & E 8 St. In this graph we can see its top ten destinations.
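The station ranking and the destination lookup described above can be sketched in Pandas roughly like this. This is only a minimal sketch, not the exact code behind the graphs: the trips are made up, the names "Station B" and "Station C" are hypothetical, and the column names ("start station name", "end station name") are assumed to match the headers of the Citibike trip files.

```python
import pandas as pd

# Made-up trips for illustration only; the column names are assumed to
# follow the Citibike trip CSV headers and may need adjusting.
trips = pd.DataFrame({
    "start station name": [
        "Lafayette St & E 8 St", "Lafayette St & E 8 St",
        "Lafayette St & E 8 St", "Station B", "Station B", "Station C",
    ],
    "end station name": [
        "Station B", "Station B", "Station C",
        "Lafayette St & E 8 St", "Lafayette St & E 8 St",
        "Lafayette St & E 8 St",
    ],
})


def top_stations(trips, column, n=20):
    """Return the n most frequent station names in the given column."""
    return trips[column].value_counts().head(n)


def top_destinations(trips, origin, n=10):
    """Return the n most frequent end stations for trips leaving origin."""
    from_origin = trips[trips["start station name"] == origin]
    return from_origin["end station name"].value_counts().head(n)


top_starts = top_stations(trips, "start station name")
top_ends = top_stations(trips, "end station name")
destinations = top_destinations(trips, "Lafayette St & E 8 St")

# Destinations that also appear among the busiest start stations
overlap = set(destinations.index) & set(top_starts.index)
print(destinations)
print(overlap)
```

With a real month of data, an empty overlap set would correspond to the case where none of a hub's top destinations is itself a top station.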
The first interesting finding is that none of those destinations is one of the top 20 stations I identified earlier, which may mean that the system has big central hubs that distribute bikes to scattered, less important stations.
The argument becomes more compelling when we look at this graph, which follows the same 10 destinations through time. As we can see, there aren’t any visible patterns that would allow us to predict behavior.
However, if we place the end stations on a map and enclose the area surrounding them, something interesting emerges. This is just a hypothesis, but it looks like the bikes are used for the last mile of a commute: for example, riding the last mile after taking a train from NJ and a subway from Penn Station. Each of the ten top destinations is less than a mile away from the Lafayette St & E 8 St station.
I believe this can be interesting if you are thinking of ways to rebalance the system, because you could focus your efforts on defining balancing areas for central hubs. Also, at least in the case of Lafayette St & E 8 St, it appears that the system connects the ‘central hub’ to the East and to the West, which makes sense since the subway connects Uptown and Downtown Manhattan. It would be interesting to see if other central hubs of the system show similar patterns.
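One way to test the "less than a mile" observation on other hubs is to compute great-circle distances from the hub to each destination. The sketch below uses the haversine formula. The coordinates are illustrative placeholders, not values from the Citibike data, and the station names and the within_radius helper are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(hub, stations, radius_miles=1.0):
    """Return the names of stations closer than radius_miles to the hub."""
    return [
        name
        for name, (lat, lon) in stations.items()
        if haversine_miles(hub[0], hub[1], lat, lon) < radius_miles
    ]


# Illustrative coordinates only (assumed, not taken from the dataset)
hub = (40.7303, -73.9907)
destinations = {
    "Nearby station": (40.7330, -73.9950),
    "Far station": (40.7600, -73.9900),
}

nearby = within_radius(hub, destinations)
print(nearby)
```

Running the same check for each central hub against its own top ten destinations would show whether the last-mile pattern holds beyond Lafayette St & E 8 St.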