7 Snowflake
WIP
7.1 Setup
You will first need to install Snowflake’s ODBC Driver, then configure the DSN for either Windows or macOS.
7.2 Making the Connection
I wrapped the connection into a function:
connect_to_snowflake <- function() {
  conn <- DBI::dbConnect(
    odbc::odbc(),
    "<data_source_name>",
    uid = "<user_id>",
    pwd = rstudioapi::askForPassword())
  return(conn)
}
con <- connect_to_snowflake()
This works fine for interactive analysis, but for non-interactive workflows, such as {targets} projects, you will need to stash your password as an environment variable (see below).
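As a sketch of the environment-variable approach, here is a non-interactive variant of the function above. The variable name SNOWFLAKE_PWD and the helper name are my assumptions, not part of the original; set the variable in ~/.Renviron or in the scheduled job’s environment.

```r
# Non-interactive variant of connect_to_snowflake().
# Assumption: SNOWFLAKE_PWD is defined in ~/.Renviron or the job's
# environment -- the name is arbitrary.
connect_to_snowflake_batch <- function() {
  pwd <- Sys.getenv("SNOWFLAKE_PWD")
  if (!nzchar(pwd)) stop("SNOWFLAKE_PWD is not set")
  DBI::dbConnect(
    odbc::odbc(),
    "<data_source_name>",
    uid = "<user_id>",
    pwd = pwd)
}
```

Reading the password at call time (rather than baking it into the function) keeps it out of the source file and out of saved workspaces.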
7.3 Using SQL
Like any other database connection, you can pass SQL queries to the connection as a string.
If you set a default schema when configuring your Snowflake data source, you must explicitly reference other schemas when querying tables outside of it.
DBI::dbGetQuery(
  con,
  "select table_name, last_altered
   from information_schema.tables
   where table_name like 'STG%'
   limit 5
")
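For a table outside the default schema, the query must name the schema explicitly, as noted above. A hedged sketch; ANALYTICS and ORDERS are hypothetical names, not objects from this chapter:

```r
# ANALYTICS.ORDERS is a placeholder -- substitute your own
# schema-qualified table name.
DBI::dbGetQuery(
  con,
  "select *
   from ANALYTICS.ORDERS
   limit 5
")
```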
7.4 Using R
library(dplyr)
library(dbplyr)
library(stringr)

is <- tbl(con, in_schema("INFORMATION_SCHEMA", "TABLES"))
is %>%
  select(TABLE_NAME, LAST_ALTERED) %>%
  filter(str_detect(TABLE_NAME, "^STG")) %>%
  head(5) %>%
  collect()
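Because dbplyr builds the query lazily, you can inspect the SQL it will send before pulling any rows. A small sketch reusing the same `is` table reference:

```r
# show_query() prints the translated SQL; nothing is fetched until
# collect() is called.
is %>%
  select(TABLE_NAME, LAST_ALTERED) %>%
  filter(str_detect(TABLE_NAME, "^STG")) %>%
  head(5) %>%
  show_query()
```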
7.5 Snowflake tables as targets
We can use tarchetypes::tar_change() and the LAST_ALTERED field referenced above as a trigger in {targets} pipelines. If the date changes, the target will rerun along with its downstream targets. Here’s a small example of how that might work:
_targets.R
library(targets)
library(tarchetypes)
tar_option_set(
packages = c(
"tibble",
"DBI",
"odbc",
"dplyr",
"dbplyr"
)
)
connect_to_snowflake <- function() {
  conn <- DBI::dbConnect(
    odbc::odbc(),
    "<dsn_name>",
    uid = "<user_id>",
    pwd = "<password>")
  return(conn)
}
tar_plan(
  snowflake_con = connect_to_snowflake(),
  tar_change(
    my_table,
    collect(tbl(snowflake_con, "MY_TABLE")),
    change = dbGetQuery(
      snowflake_con,
      "select last_altered
       from db.information_schema.tables
       where table_name = 'MY_TABLE'
         and table_schema = 'PROD'"
    )
  ),
  my_table_transformed = head(my_table)
)
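Assuming the plan above is saved as _targets.R in the project root, running it might look like this sketch; tar_outdated() is a convenient way to check the change cue without executing anything:

```r
# tar_make() reruns my_table only when the LAST_ALTERED value returned
# by the change query differs from the one stored on the previous run;
# downstream targets such as my_table_transformed follow.
targets::tar_outdated()
targets::tar_make()
```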