A nifty and new (to me) way for Devs and PMs to share the <3
So I’m building a Postgres-based app with a non-technical person, let’s call him Bob. Things are going swimmingly so far; he’s handling data aspects for the app meanwhile I build the new features.
The main protagonist for today’s post is the Foreign Data Wrapper, a slick Postgres feature that lets you e.g. basically SELECT out to external data.
In this case I wanted to point to a Google Sheet that Bob owns. This way it will be easy for him to change data in an environment he’s familiar with, but also stay easy for me to build out a proper API using Postgres — each of us can stay in our respective comfort zones. Win win.
Here’s the SQL (Postgres 9.3 or higher + wget needed):
CREATE EXTENSION IF NOT EXISTS file_fdw; CREATE SERVER fdw_files FOREIGN DATA WRAPPER file_fdw; CREATE FOREIGN TABLE pokemon ( name varchar NOT NULL, type varchar NULL ) SERVER fdw_files OPTIONS (PROGRAM 'wget -q -O - "https://docs.google.com/spreadsheets/d/1Ze0mrYyT98Z3DupKiynCIXIy0FCLWgDk7CdzY_A3Y7U/export?gid=0&format=tsv"', FORMAT 'csv', HEADER 'true', DELIMITER U&'\0009');
I preferred TSV over CSV in case Bob enters a comma one day. The call out to GSheets takes around ~1 second.
SELECT * FROM pokemon; name | type ------------+---------- Pikachu | Electric Snorlax | Sleep Charmander | Red (3 rows)
1 second is going to be too slow for production, plus more importantly GSheets might not always be available. So let’s throw in a materialized view to make things more Robust™ (mm, tasty!):
CREATE MATERIALIZED VIEW pokemon_view AS SELECT * FROM pokemon;
Great! This view is much faster, only a few ms. We might consider setting up a nightly view refresh as a cronjob.
But wait! Enter a new requirement. We now need to add a primary key/id column.
Well, I don’t want to suddenly ask Bob to manually manage a new column. Annoyingly, I can’t add a primary key to a foreign table, which makes sense in hindsight, but doesn’t make life easier.
So it seems I have to accomplish this using our lovely material view.
Here’s the little tweak to the view to deliver what’s needed:
CREATE MATERIALIZED VIEW app_public.food AS SELECT row_number() over (ORDER BY name) AS id, * FROM app_public.fdw_food;
SELECT * FROM pokemon_view ; id | name | type ----+------------+---------- 1 | Charmander | Red 2 | Pikachu | Electric 3 | Snorlax | Sleep (3 rows)
Bob can edit away to his heart’s content, meanwhile the frontend can frontend properly (the id was needed to handle a reconciler quirk with the key attribute if you were wondering).
“But I don’t want to use integer primary keys,” you say. Maybe you don’t want sneaky people to work out how many records you have. No worries, here’s another flavor:
CREATE MATERIALIZED VIEW pokemon_view AS SELECT MD5(TEXTIN(RECORD_OUT(pokemon))) as id, * FROM pokemon;
SELECT * FROM pokemon_view; id | name | type ----------------------------------+------------+--------- 8c9badbad004555213765fa15395cc03 | Pikachu | Electric a508373d7ad63ee9c9a6ea26602428f0 | Snorlax | Sleep de724e47466765f6171a4d50ab2d7145 | Charmander | Red
I realize MD5 is widely considered to be vulnerable, and maybe there’s a better hashing function in Postgres. If so, please let me know!
Regardless, in this case I trust Bob, the data isn’t private, and my backend uses OFFSET anyway. So I wager we’re alright. If you disagree, please let me know, I’ll appreciate hearing your thinking!
Now, Bob can edit away, I can keep using Postgres, and together we can build a great app. Win win indeed!