1  DataSF

Getting data from DataSF is a matter of copying the relevant URL into one of R’s many read functions, e.g. readr::read_csv, jsonlite::fromJSON, sf::st_read, etc.

library(readr)
library(sf)
Linking to GEOS 3.9.1, GDAL 3.4.3, PROJ 7.2.1; sf_use_s2() is TRUE
reg_businesses <- read_csv("https://data.sfgov.org/resource/g8m3-pdis.csv")
Rows: 1000 Columns: 37
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (22): uniqueid, ttxid, certificate_number, ownership_name, dba_name, fu...
dbl   (7): business_zip, supervisor_district, :@computed_region_6qbp_sg9q, :...
lgl   (2): parking_tax, transient_occupancy_tax
dttm  (6): dba_start_date, dba_end_date, location_start_date, location_end_d...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Warning

The API applies a limit parameter that defaults to 1,000 rows, even if the ‘All data’ radio button is selected, which is why only 1,000 rows were read above. To retrieve all the data, either read the same URL with the RSocrata package:

reg_businesses <- RSocrata::read.socrata("https://data.sfgov.org/resource/g8m3-pdis.csv")

Or append ?$limit=9999999 to the end of the URL:

reg_businesses <- read_csv("https://data.sfgov.org/resource/g8m3-pdis.csv?$limit=9999999")
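For large datasets it may be preferable to fetch the rows in pages rather than in one very large request. The sketch below only builds the page URLs, using the $limit, $offset and $order query parameters. The helper name soda_page_urls, the page size and the total row count are assumptions made for illustration, not values taken from DataSF.

```r
# Hypothetical helper: build one URL per page of a Socrata resource.
# `total_rows` must be supplied by the caller; the value used below is made up.
soda_page_urls <- function(base_url, total_rows, page_size = 50000) {
  offsets <- seq(0, total_rows - 1, by = page_size)
  # format() avoids scientific notation such as "1e+05" in the URL
  fmt <- function(x) format(x, scientific = FALSE, trim = TRUE)
  paste0(
    base_url,
    "?$limit=", fmt(page_size),
    "&$offset=", fmt(offsets),
    "&$order=:id"  # a fixed ordering keeps pages from overlapping
  )
}

urls <- soda_page_urls(
  "https://data.sfgov.org/resource/g8m3-pdis.csv",
  total_rows = 120000,  # assumed row count, for illustration only
  page_size = 50000
)
urls

# Each page could then be read and combined, for example:
# pages <- lapply(urls, readr::read_csv,
#                 col_types = readr::cols(.default = readr::col_character()))
# reg_businesses <- do.call(rbind, pages)
```

Reading every column as character in the commented step is one way to avoid pages being guessed as different column types; the columns can be converted after the pages are combined.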

Read in a ‘spatial’ object by passing the dataset’s GeoJSON export URL to st_read:

sup_dists <- st_read("https://data.sfgov.org/api/geospatial/f2zs-jevy?accessType=DOWNLOAD&method=export&format=GeoJSON")
Reading layer `OGRGeoJSON' from data source 
  `https://data.sfgov.org/api/geospatial/f2zs-jevy?accessType=DOWNLOAD&method=export&format=GeoJSON' 
  using driver `GeoJSON'
Simple feature collection with 11 features and 7 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -123.1738 ymin: 37.63983 xmax: -122.3279 ymax: 37.8632
Geodetic CRS:  WGS 84
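The output above reports a geodetic CRS (WGS 84), so coordinates are in degrees. If distances or areas in metres are needed, the layer can be projected first. The sketch below illustrates this offline on a made-up one-point GeoJSON string rather than the downloaded districts. The choice of EPSG:26910 (NAD83 / UTM zone 10N) is an assumption, and another projected CRS may suit a given analysis better.

```r
library(sf)

# Made-up stand-in for a downloaded layer: a single point in San Francisco.
geojson <- '{"type":"FeatureCollection","features":[{"type":"Feature","properties":{"name":"example"},"geometry":{"type":"Point","coordinates":[-122.4193,37.7793]}}]}'

# st_read() accepts a GeoJSON string as well as a file path or URL
pt <- st_read(geojson, quiet = TRUE)
st_is_longlat(pt)      # is the layer in longitude/latitude?

# Project to a metre-based CRS (assumed choice: NAD83 / UTM zone 10N)
pt_utm <- st_transform(pt, 26910)
st_is_longlat(pt_utm)
```

The same call would apply to the districts, e.g. st_transform(sup_dists, 26910).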