Reading in and writing a JSON file to a storage bucket from a workflow
Workflows provides several connectors for interacting with various Google Cloud APIs and services. In the past, I’ve used for example the Document AI connector to parse documents like expense receipts, or the Secret Manager connector to store and access secrets like passwords. Another useful connector I was interested in using today was the Google Cloud Storage connector, to store and read files stored in storage buckets.
Those connectors are auto-generated from their API discovery descriptors, but there are some limitations currently that prevent, for example, to download the content of a file. So instead of using the connector, I looked at the JSON API for cloud storage to see what it offered (insert and get methods).
What I wanted to do was to store a JSON document, and to read a JSON document. I haven’t tried with other media types yet, like pictures or other binary files. Anyhow, here’s how to write a JSON file into a cloud storage bucket:
main:
params: [input]
steps:
- assignment:
assign:
- bucket: YOUR_BUCKET_NAME_HERE
- write_to_gcs:
call: http.post
args:
url: ${"https://storage.googleapis.com/upload/storage/v1/b/" + bucket + "/o"}
auth:
type: OAuth2
query:
name: THE_FILE_NAME_HERE
body:
name: Guillaume
age: 99
In the file, I’m storing a JSON document that contains a couple keys, defined in the body of that call. By default, here, a JSON media type is assumed, so the body defined at the bottom in YAML is actually written as JSON in the resulting file. Oh and of course, don’t forget to change the names of the bucket and the object in the example above.
And now, here’s how you can read the content of the file from the bucket:
main:
params: [input]
steps:
- assignment:
assign:
- bucket: YOUR_BUCKET_NAME_HERE
- name: THE_FILE_NAME_HERE
- read_from_gcs:
call: http.get
args:
url: ${"https://storage.googleapis.com/download/storage/v1/b/" + bucket + "/o/" + name}
auth:
type: OAuth2
query:
alt: media
result: data_json_content
- return_content:
return: ${data_json_content.body}
This time we change the GCS URL from upload
to download
,
and we use the alt=media
query parameter to instruct the GCS JSON API that we want to retrieve the content of the file (not just its metadata).
In the end, we return the body of that call, which contains the content.