Stencilled

Where do people eat in Austin?

Recently I visited Austin, and many of my friends had mentioned the variety of food options there. So my wife and I decided to search for places to eat on the Foursquare app. Using the standard search filtered by high ratings we ended up at pretty good places, and Foursquare prompted us to check in whenever we reached a place. After the trip I wanted to see how many people check in using this app and how the check-ins correlate with the ratings. The first step is to get the data, so I started playing around with the Foursquare API and working out the URL parameters for the category I wanted (food, places to see, etc.). The authentication process for the Foursquare API was a bit tricky, but with my google-fu (and a special mention to the GIS tribe) I was able to get going. Below is how you get the client ID and client secret when you create a new app.

![This is an image](createfsqapi.png)

The idea was to do this for many places across the country, so I decided to use R to scrape and clean the data. You can find the code below.
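Before running the full script, it's worth confirming the credentials work with a single request. Here is a minimal sketch, assuming the same RJSONIO/RCurl setup as the full script; the hard-coded coordinates are roughly downtown Austin.

library(RJSONIO)
library(RCurl)

options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))

clientid = "ENTER YOUR CLIENT ID"
clientsecret = "ENTER YOUR CLIENT SECRET"

# One test call to the venues/explore endpoint for a single lat/long pair
query = paste("https://api.foursquare.com/v2/venues/explore?client_id=", clientid,
              "&client_secret=", clientsecret,
              "&ll=30.2672,-97.7431&query=food&v=20170131", sep = "")
result = getURL(query)
data <- fromJSON(result)

# The response carries a meta code; 200 means the keys were accepted
data$meta$code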
    
library(RJSONIO)
library(RCurl)

options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))

# Obtained from http://notebook.gaslampmedia.com/download-zip-code-latitude-longitude-city-state-county-csv/
ll = read.csv('zip_codes_states.csv', sep = ",", header = TRUE)
clientid = "ENTER YOUR CLIENT ID"
clientsecret = "ENTER YOUR CLIENT SECRET"

venue_name = c()
venue_lat = c()
venue_long = c()
venue_city = c()
venue_state = c()
venue_country = c()
venue_checkins = c()
venue_users = c()
venue_hasMenu = c()
venue_rating = c()
venue_postalCode = c()
venue_usersCount = c()
venue_formattedAddress = c()



# Loop through the lat/longs in the csv and query the API for each.
  for (i in 1:dim(ll)[1]) {
    lat = ll$latitude[i]
    long = ll$longitude[i]

    # Do query and parse results
     query = paste("https://api.foursquare.com/v2/venues/explore?client_id=",clientid,"&client_secret=",clientsecret,"&ll=",lat,",",long,"&query=food&v=20170131",sep="")
    result = getURL(query)
    data <- fromJSON(result)

    # For each result, save a bunch of fields; tweak this to your liking.
    # Some fields can be missing for a venue, so fall back to NA to keep the vectors aligned.
    if (length(data$response$groups[[1]]$items) > 0) {
      for (r in 1:length(data$response$groups[[1]]$items)) {
        tmp = data$response$groups[[1]]$items[[r]]$venue
        venue_name = c(venue_name, tmp$name)
        venue_lat = c(venue_lat, tmp$location$lat)
        venue_long = c(venue_long, tmp$location$lng)
        venue_city = c(venue_city, if (is.null(tmp$location$city)) NA else tmp$location$city)
        venue_state = c(venue_state, if (is.null(tmp$location$state)) NA else tmp$location$state)
        venue_country = c(venue_country, if (is.null(tmp$location$country)) NA else tmp$location$country)
        venue_checkins = c(venue_checkins, tmp$stats$checkinsCount)
        venue_hasMenu = c(venue_hasMenu, if (is.null(tmp$hasMenu)) NA else tmp$hasMenu)
        venue_rating = c(venue_rating, if (is.null(tmp$rating)) NA else tmp$rating)
        # venue_shortName = c(venue_shortName, tmp$shortName)
      }
    }
  }

  # Save the raw output
  save(venue_name, venue_lat, venue_long, venue_city, venue_state, venue_country, venue_checkins, venue_hasMenu, venue_rating, file = 'venuesResult.RData')


  # Put the collected vectors into a data frame
  data = data.frame(name = venue_name, latitude = venue_lat, longitude = venue_long,
                    city = venue_city, state = venue_state, country = venue_country,
                    checkins = venue_checkins, rating = venue_rating,
                    stringsAsFactors = FALSE)

  # Remove the duplicate results
  dsub = subset(data, !duplicated(data))

  # Export to csv, which is used for the next step.
  write.csv(dsub, file = "Austin_Foursquare.csv", row.names = FALSE)

  
Once this was done, the next part was figuring out how to visualize the data. Since I have been trying my hand at d3.js, I used the cleaned CSV output from R to show how check-ins and ratings vary across these places using a bubble chart.
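As a quick sanity check of the exported file, a short R snippet (a sketch, assuming the column names written out by the script above) can plot check-ins against ratings before wiring up the d3.js chart.

# Quick base-R look at the exported csv: check-ins vs. ratings
dat = read.csv("Austin_Foursquare.csv", header = TRUE)

plot(dat$rating, dat$checkins,
     xlab = "Foursquare rating", ylab = "Check-ins",
     main = "Austin food venues: check-ins vs. ratings")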