Pan-tilt camera on a Raspberry Pi

This is part of my series on learning to build an End-to-End Analytics Platform project.

TLDR; We removed the Sense HAT and assembled the Pimoroni pan-tilt hat with a NoIR Pi Camera. We tested the Pimoroni pan-tilt hat library and got the Pi movin’ and shakin’.

What are we trying to achieve though? 🤔

The plan is to be able to use computer vision to detect objects or events, send events based on detection, process the event, then send an event or command back down to the device to perform an action. We’ll start with local development then explore cloud in future posts. Yup, the end-to-end analytics platform project is growing in scope, that’s okay.

Lights, camera, action! 🎥

Now to detect, capture, and even track objects with a computer vision solution we need something that can ‘see’ and ‘move’. Pimoroni has a pan-tilt hat to ‘move’, with servos to pan (x-axis) and tilt (y-axis) the mounted camera. We also have a NoIR Pi Camera to help ‘see’. NoIR means no infrared filter, so the sensor picks up infrared light. Why not the normal one? Paired with an infrared light source, this one basically has night vision. Case closed.

A Raspberry Pi 4 with a disassembled pan-tilt hat and camera.
Compute, uh, vision?

After removing the Sense HAT we used before (we can see it in the background), we can use the Pimoroni guide for assembly to get the new hat set up. One thing we don’t have is the NeoPixel stick (light), which we don’t need.

Interesting point 💡
Pan-Tilt HAT is a two-channel servo driver designed to control a tiny servo-powered Pan/Tilt assembly. - Pimoroni pan-tilt hat GitHub repo
What's a servo? A servomotor (or servo motor) is a rotary actuator or linear actuator that allows for precise control of angular or linear position, velocity and acceleration. It consists of a suitable motor coupled to a sensor for position feedback. It also requires a relatively sophisticated controller, often a dedicated module designed specifically for use with servomotors. - Wikipedia

We’re focused on getting the pan-tilt working, not the camera. We’ll set up the camera when we do the object detection, image capture, etc. The kind maintainers of the Pimoroni pan-tilt hat Github repo have graciously bestowed upon us a curl command to install everything we need.

curl https://get.pimoroni.com/pantilthat | bash
A command terminal with informational messages.
Wait… “may explode”? 💥

Because we already enabled the I2C using the raspi-config tool, we can see the setup noticed that and printed out ‘I2C Already Enabled’.

A command terminal with informational messages.
Fully prepared

We’re going to opt for the full install so that we can grab the examples and docs for the future. You know, just in case.

A command terminal with informational messages.
Joyful exclamation ❗

Installation done, let’s turn the key 🗝️ and see if this beauty starts.

Start up python in the VSCode bash terminal. Import the pantilthat library. The documentation has a few methods we can try out. We’ll start simple using the pan() and tilt() methods passing in the angles within the allowed range.

A command terminal running Python 3 interpreter displaying code with informational messages.
One hop this time 🕺
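
For anyone who can't see the screenshot, here is a minimal sketch of that first test. It assumes the pantilthat library from the installer above and that both servos accept angles from -90 to 90 degrees, as the library documentation describes. The clamp_angle and point_camera helpers are made up for this example; they are not part of the library.

```python
def clamp_angle(angle, low=-90, high=90):
    """Keep an angle inside the range the servos accept (assumed -90 to 90)."""
    return max(low, min(high, angle))


def point_camera(pan_angle, tilt_angle):
    """Pan (x-axis) and tilt (y-axis) the camera. Needs the hat attached."""
    import pantilthat  # imported here so clamp_angle can be tried without the hardware

    pantilthat.pan(clamp_angle(pan_angle))
    pantilthat.tilt(clamp_angle(tilt_angle))


print(clamp_angle(120))  # an out-of-range request is pulled back to 90
print(clamp_angle(-45))  # a valid angle passes through unchanged
```

On the Pi, something like point_camera(30, -20) should swing the camera a little to one side and tilt it slightly. Clamping first is just a precaution against asking the servos for an angle they can't reach.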

Great! It works. Playing around a little we can see the angle changing.

A Raspberry Pi 4 with an assembled pan-tilt hat and camera.
Hi WALL-E! 🤖

Cue music for interpretive machine dancing through numerous function calls… 🤣

All done! Before we close out, the documentation suggests it’s a good idea to turn off the servo drive signal to save power if we don’t need it to actively point somewhere, using the pantilthat.servo_enable(index, state) function. We reset the servos to their original position and used the function to disable the two servos. Now to shut down the Pi and think about the next challenge: getting the camera feed working, then on to object detection and tracking. I’d like to see if we can get the solution working with a Python venv though.
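
A rough sketch of that clean-up step, under a couple of assumptions: the "original position" is taken to be 0 degrees on both axes, and the servos are indexed 1 and 2 as in the library documentation. The park_servos helper is hypothetical. It takes the library module as an argument so it can be tried with a stand-in object when no hat is attached.

```python
import time


def park_servos(hat, settle_seconds=1.0):
    """Centre both servos, give them time to arrive, then cut the drive signal."""
    hat.pan(0)
    hat.tilt(0)
    time.sleep(settle_seconds)  # let the servos finish moving before disabling them
    for index in (1, 2):
        hat.servo_enable(index, False)  # False switches the drive signal off
```

On the Pi this would be called as park_servos(pantilthat) after importing pantilthat. The short pause is a guess at how long the servos need to settle; adjust it to taste.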

sudo halt

For fun 🎈

While perusing the documentation I noticed a note on displaying the Pi in the bash terminal which I thought was nifty. Give it a try…

pinout
A command terminal with a Raspberry Pi 4 drawing and information displayed in the terminal.
Whaaaat! 🤯 Pi in bash!!

Summary

It’s been a quick post. Wrapping it up, we got our new pan-tilt hat installed, working, and dancing, which brought a smile to my face. There’s heaps still to learn: the I2C protocol, pinouts, and so much more. Most of which is new to me too.

Quick note 📝
A massive thank you to the many people who put their time and effort into projects, like the Pimoroni pan-tilt hat repo, which make things significantly easier for all of us.

We’ll work on the vision part in the next post. Cameras, image capture, and even object detection. Once we have that done, we’re going to start working on connecting the device to the cloud.

Until next time.

🐜

Telemetry logging to InfluxDB and Grafana on Raspberry Pi

This is part of my series on learning to build an End-to-End Analytics Platform project.

TLDR; After a suggestion from a friend (Waylon Payne) we start learning about the time series database InfluxDB. We set up InfluxDB on the Raspberry Pi and create a database. We work on getting Grafana installed and running. We write, troubleshoot, and learn a bunch while logging data to InfluxDB. Finally, we create a dashboard in Grafana to display our Sense HAT telemetry.

Begin the Influx 🌌

Last time we tackled writing out SenseHAT readings to a csv on the Pi. Now though, we level up by working on writing that data to a database more suited for streaming log data, in this case InfluxDB.

The InfluxDB documentation has a section for installing on a Raspberry Pi. While I was discussing this with my friend, he suggested that before strolling down the path of flashing a new OS I should read the article Installing InfluxDB & Grafana on Raspberry Pi. Skipping step 0 is the plan in our case. Another post for reference was Datalogger example using Sense Hat, InfluxDB and Grafana. Big thanks to Simon Hearne and Circuits.dk, their posts really helped guide my thinking even though I chose to do things a little differently. 😎

First up, some updates.

sudo apt update
sudo apt upgrade -y
bash terminal apt update
Ah yes.. updates..

Updates ran pretty quickly. The next part is getting the InfluxDB packages.

wget -qO- https://repos.influxdata.com/influxdb.key | sudo apt-key add -
source /etc/os-release
echo "deb https://repos.influxdata.com/debian $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/influxdb.list

sudo apt update && sudo apt install -y influxdb

A few things here to learn from the previous code snippets:

  • apt – A command line package management tool on Debian for tasks like searching for, installing, and removing software (Debian Wiki).
  • etc directory – Holds core configuration files. Found a nice Linux directory structure for beginners post.
  • wget – A command line tool for retrieving files using HTTP, HTTPS and FTP, the most widely-used Internet protocols (Debian Wiki).
  • tee – Reads the standard input and writes it to both the standard output and one or more files (GeeksForGeeks).
  • source – Reads and executes the content of a file in the current shell (GeeksForGeeks).

As best I can understand at the moment, we fetch and store a public key that lets apt authenticate/validate the InfluxDB package when we download it. We then echo a deb record for the InfluxData stable repository into a new file in the /etc/apt/sources.list.d directory, which seems to allow apt to find the package and pick up future updates. Installing the influxdb package with apt then kicks off the install of InfluxDB. How do we know? The console says so..

bash terminal InfluxDB installation
The influx begins!

It installs InfluxDB version 1.8.9 (not 2.0 which is the latest at the moment). Keep that in mind when working with documentation. Upgrading to 2.0 we can leave for the future. Onward!

sudo systemctl unmask influxdb.service
sudo systemctl start influxdb
sudo systemctl enable influxdb.service

More things to learn:

Found the command to check a service status with the --help switch for the systemctl command. These all feel reasonably familiar coming from working a little with PowerShell and the Windows terminal.

sudo systemctl --help
sudo systemctl status influxdb.service
bash terminal InfluxDB service status check
Running, running, running.

The service is up, active, and running. That means we should be able to connect to it. We can do that by logging into the Influx CLI from the terminal. Then creating a database. Creating a user. Finally, granting the user permissions.

influx

CREATE DATABASE <yourdatabase>

USE <yourdatabase>

CREATE USER <yourusername> WITH PASSWORD '<yourpassword>' WITH ALL PRIVILEGES

GRANT ALL PRIVILEGES ON <yourdatabase> TO <yourusername>
bash terminal Influx CLI

Ah familiar territory! A database! Now we have:

  • A database service running.
  • A database created.
  • A user that has more than enough permissions to interact with the database.

Grafana

We want a way to visualise the telemetry that’s going to be written into the database. Grafana gives us the ability to create, explore and share all of your data through beautiful, flexible dashboards and we can run the service on the Pi. We’re taking the same approach as we did for InfluxDB to get Grafana up and going. Getting all the packages, installing them, running updates, and validating the services.

wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
echo "deb https://packages.grafana.com/oss/deb stable main" | sudo tee /etc/apt/sources.list.d/grafana.list

sudo apt update && sudo apt install -y grafana

sudo systemctl unmask grafana-server.service
sudo systemctl start grafana-server
sudo systemctl enable grafana-server.service

sudo systemctl status grafana-server.service
bash terminal Grafana service status check
So much fitness 🏃

Once the service is up we should be able to connect to it on port 3000.

Grafana login page
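
If there is no browser handy, a small script can do the same check. This is a sketch that assumes Grafana's /api/health endpoint and a hostname of raspberrypi.local, which may well differ on your network. The two helper functions are made up for this post.

```python
import json
from urllib.request import urlopen


def grafana_health_url(host, port=3000):
    """Build the URL of Grafana's health endpoint."""
    return f"http://{host}:{port}/api/health"


def grafana_is_up(host, port=3000, timeout=5):
    """Return True if Grafana answers and reports a healthy database."""
    try:
        with urlopen(grafana_health_url(host, port), timeout=timeout) as response:
            return json.load(response).get("database") == "ok"
    except (OSError, ValueError):
        # Connection problems raise OSError subclasses; bad JSON raises ValueError
        return False


print(grafana_health_url("raspberrypi.local"))
```

Calling grafana_is_up("raspberrypi.local") from another machine on the network should return True once the service is running.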

Yay! We’re connected. Let’s log in with the user name and password ‘admin’ then reset the password. After the login process is done, we’ll land on the homepage for our Grafana instance running on the Pi. I’m actually excited about this 😄.

We need a way to connect Grafana to our InfluxDB database. On the Home page is a ‘Data Sources‘ tile which we can follow to add a data source.

Grafana home page
Source of the action 💥

We can use the search box to lookup a connector for InfluxDB. Once we have that we just select it.

Grafana data source search page
Search the unknown 🔮

From there we configure the settings for the connector.

Grafana data source settings
The way through the mountains 🏔️

Credentials to make our way into the database.

Grafana data source credential settings
You shall pass 🧙‍♂️

Finally, we save and test the connector to make sure it’s all working.

Grafana source page connection test
Green.. Green is good 🟢

Good news. The scaffolding is in place. Now we need to get data into the database then configure some dashboards.

Bilbo Loggings 🪵

“If ever you are passing my way,” said Bilbo, “don’t wait to knock! Tea is at four; but any of you are welcome at any time!”

– Bilbo Baggins

Did someone say tea? Time for a spot of IoT! Now to get our IoT device capturing data. A quick swish of our telemetry logging code and we have a starting point. All we need to do is figure out how to log data to InfluxDB not the csv.

There is a library for working with InfluxDB hosted on PyPi. We need to download and install the packages locally. Remembering to use our Python virtual environment. Though this time we are going to give the VS Code Python environments integration the spotlight. Invoke the Command Palette Ctrl + Shift + P. Start typing and select the option for “Python: Select Interpreter“.

VSCode command palette
Lost in translation

Look what VS Code recommends…

VSCode command palette Python interpreter selection
Recommended interpretation

Our virtual environment. It’s so smart. I think it reads my blog drafts 🤣. After we choose the interpreter, VS Code switches to our virtual environment. It even reminds us in the status bar at the bottom of the window. So thoughtful!

VSCode status bar
Our friendship will continue.

🚧 Slight detour 🚧

Slight digression from our regular blog flow...

Now, if you are briskly following along and haven’t switched to the venv in the terminal, then you will probably run into an error like this. Which might lead you on a wild goose chase across fields of learnings and wonder.

bash terminal error installing InfluxDB Client
Goose chase begins 🦢

Naturally, we go looking to see if we find any packages for influxdb-client:

pip search influxdb-client

Which the PyPI XMLRPC API politely, in a crimson message, lets us know things are not as peachy as we hoped:

bash terminal error downloading InfluxDB client
⚠️!Unmanageable load!⚠️

Fault: <Fault -32500: “RuntimeError: PyPI’s XMLRPC API is currently disabled due to unmanageable load and will be deprecated in the near future. See https://status.python.org/ for more information.”>

After updating the pip version in the venv, we check on the status which the error message suggested. After a very interesting read, a smidge of despair, hope emerges… I said to myself, “Self, why does that terminal not have the venv prefix?”. That’s when I realised the true source of the problem. Me. I forgot to activate the venv 😅.

For the terminal we still need to activate the virtual environment. To do this on the Pi we can run:

source <yourvenv>/bin/activate
bash terminal activate Python venv
Activates PEBKAC fix to end goose chase 🦢

Behold!!! It lives!!
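
A quick way to double check from Python itself, rather than squinting at the prompt. This is a small sketch: the in_virtualenv helper is made up for this post, and it relies on sys.prefix differing from sys.base_prefix inside a venv.

```python
import sys


def in_virtualenv():
    """True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


print("venv active:", in_virtualenv())
print("packages install under:", sys.prefix)
```

If the first line prints False, pip is about to install into the global Python rather than the venv.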

When we do that, the terminal actually changes a little, giving us a visual cue that we are in a Python virtual environment. Now to get supporting packages installed so that we can write Python code for InfluxDB. Take a look at the InfluxDB Client Python GitHub repo or the influxdb-client PyPI project for details.

pip install influxdb-client
bash terminal installing InfluxDB client
Goose = chased! 🦢

Sometimes Python can’t catch the programmer being the error.

~ me

🚧 Slight detour ends 🚧

The packages are installed. The logging can begin. To get started we need to import the influxdb_client into our project.

from sense_hat import SenseHat
from datetime import datetime
from influxdb_client import InfluxDBClient, Point

Yet again, we find a pebble in our shoe… While trying to import the sense_hat library modules in the REPL an error presented itself which seemed related to the way the numpy library was installed.

Python terminal numpy error
numpy dumpty had a great fall 🤣

The error message helps a ton! Jumping over to the common reasons and troubleshooting tips helps with options to solve the issue on a Raspberry Pi, if we use our original error “libf77blas.so.3: cannot open shared object file: No such file or directory“. I opted for the first option to install the package with apt-get:

sudo apt-get install libatlas-base-dev

Then tried to enter the python REPL again (just typing “python” while in the terminal) and importing the SenseHat module.

Python terminal module error
No one by that name here..

I am beginning to feel like a module hunter 🏹. I tracked down a RaspberryPi thread which led me to a comment on a GitHub issue for the RTIMU module error. To be clear, this doesn’t seem to be an issue when I am running in the global Python scope. Only an issue in the virtual environment. The folks were kind enough to provide a way to install this with a pip command. Here we go:

pip install rtimulib
Python terminal module successful installation
Sparks joy ✨

Yes! It works!

I tried initially to write a Python function that would write to and query the database. It wasn’t long before I ran into an error trying to connect to the database using the Python function.

Python error message Flux query service disabled.
Flux capacitor disabled ⚡

For v1.8, Flux is disabled by default and we need to enable it in the InfluxDB configuration file. To edit the file we can use nano, a Linux command line text editor.

sudo nano /etc/influxdb/influxdb.conf

Which opens the file in nano for us to edit. In the [http] section, set flux-enabled to true:

[http]

  # ...

  flux-enabled = true

  # ...

Editing influxdb config in nano
A little text editing

The settings are changed. To bring them into effect we need to restart the service.

sudo systemctl restart influxdb.service
bash terminal influx service start
Starting the engine

Not much to go on. I found what seems to be a potential workaround. There is a comment further down in this thread on InfluxDB not starting that talks about adjusting a sleep setting for a start up file. Worth a try. Using nano again, we open the file and make the change.

Editing influxdb systemd start sh file in nano
bash terminal influxdb service status check
Running, running, running 🏃

Time to write the code that will log records to our database. The idea is simple. Run a loop. Every few seconds get a Sense HAT reading. Log the reading to our InfluxDB. Stop the loop when we interrupt the program.


from datetime import datetime
from time import sleep

from influxdb_client import InfluxDBClient, Point
from sense_hat import SenseHat

timestamp = datetime.now()
delay = 15  # seconds between logged readings
sense = SenseHat()
host = "localhost"
port = 8086
username = "grafanabaggins"
password = "<NotMyPrecious>"
database = 'shire'
retention_policy = 'autogen'
# InfluxDB 1.8 has no buckets, so the client expects "<database>/<retention policy>"
bucket = f'{database}/{retention_policy}'

def get_sense_reading():
    sense_reading = []
    sense_reading.append(datetime.now())
    sense_reading.append(sense.get_temperature())
    sense_reading.append(sense.get_pressure())
    sense_reading.append(sense.get_humidity())
    return sense_reading

# Log a Sense HAT reading into InfluxDB
def log_reading_to_influxdb(data, timestamp):
    # First attempt: one point per sensor, with the reading stored as a tag
    point = ([Point("reading").tag("temperature", data[1]).field("device", "raspberrypi").time(timestamp), Point("reading").tag("pressure", data[2]).field("device", "raspberrypi").time(timestamp), Point("reading").tag("humidity", data[3]).field("device", "raspberrypi").time(timestamp)])
    # For InfluxDB 1.8 the token is "<username>:<password>" and org is a placeholder
    with InfluxDBClient(url=f"http://{host}:{port}", token=f"{username}:{password}", org="-") as client:
        with client.write_api() as write_client:
            write_client.write(bucket=bucket, record=point)

# Run and get a reading Forrest
def run_forrest(timestamp):
    try:
        data = get_sense_reading()
        log_reading_to_influxdb(data, timestamp)
        while True:
            data = get_sense_reading()
            difference = data[0] - timestamp
            # total_seconds() also counts any whole days in the gap
            if difference.total_seconds() > delay:
                # Stamp the point with the time of this reading, not the previous write
                log_reading_to_influxdb(data, data[0])
                sense.show_message("OK")
                timestamp = datetime.now()
            sleep(0.1)  # brief pause so the loop does not pin a CPU core
    except KeyboardInterrupt:
        print("Stopped by keyboard interrupt [CTRL+C].")

# Start logging
run_forrest(timestamp)

I struggled for a while trying to map the bucket and token variables onto what I was able to do easily in the 1.8.9 CLI. I revisited the Python client library and noticed a specific callout for v1.8 API compatibility, which has an example that helped me define the token. It wasn’t long before we got the script running and data was being logged to the database.

We’re getting there!

To the Shire

Before we go further with logging data to the database we need to understand some key concepts in InfluxDB. It won’t be the last time I visit that page; these concepts are foreign to me. I learnt to use InfluxQL, which is a SQL-like query language for working with the data. There are some differences between Flux and InfluxQL that you might want to keep in mind. I had a tricky time figuring out how to execute Flux queries initially: I wasn’t getting any data back from my Flux commands in a Python function, but saw that you could invoke a REPL to test queries with. To keep things simple though, I opted for InfluxQL. We can launch the Influx CLI from the terminal and query our data.

influx

SHOW DATABASES

USE <database>

SELECT * FROM <table>
Influx queries returning results
Successfully captured! 🪤

Let’s see if we can build a dashboard to visualise the data we are logging. We can connect to our Grafana server again. Head to the home page. There is an “Explore” menu item that is a quick way for us to query our data and experiment. Once the window opens up we select our data source connection from the drop down box and begin building a query with a wonderfully simple interface.

Visual query building 🤩

It’s at this point we realise that our logging design might not be correct. What I was expecting was that I could use the columns in the SELECT and WHERE clauses. Apparently not. I initially thought that design would work better because I understood that tags were indexed, not fields, so querying the tags would be faster. Good in theory, but I couldn’t reference the tag in the SELECT and WHERE clauses. My initial mental model needed tweaking. A change to the logging function to log to a single point, not three, with multiple fields.

So this:

point = ([Point("reading").tag("temperature", data[1]).field("device", "raspberrypi").time(timestamp), Point("reading").tag("pressure", data[2]).field("device", "raspberrypi").time(timestamp), Point("reading").tag("humidity", data[3]).field("device", "raspberrypi").time(timestamp)])

Changed to this:

point = ([Point("reading").tag("device","raspberrypi").field("temperature", data[1]).field("pressure", data[2]).field("humidity", data[3]).time(timestamp)])
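
To make the difference concrete, here is a sketch of roughly what the single point looks like once it is turned into InfluxDB line protocol: tags (indexed metadata) sit next to the measurement name and fields (the measured values) come after the space. The to_line_protocol helper is hypothetical and skips escaping and non-float types, and the sensor values and timestamp are invented for the example.

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build one line protocol record (float fields only, no escaping)."""
    tag_part = ",".join(f"{key}={value}" for key, value in sorted(tags.items()))
    field_part = ",".join(f"{key}={float(value)}" for key, value in fields.items())
    return f"{measurement},{tag_part} {field_part} {timestamp_ns}"


line = to_line_protocol(
    "reading",
    tags={"device": "raspberrypi"},
    fields={"temperature": 21.5, "pressure": 1013.2, "humidity": 40.1},
    timestamp_ns=1630000000000000000,  # made-up timestamp in nanoseconds
)
print(line)
# reading,device=raspberrypi temperature=21.5,pressure=1013.2,humidity=40.1 1630000000000000000
```

One record now carries all three readings, and the device tag is something we can filter on.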

Minor InfluxDB management needed in future to clean up the database. For now though, we have our ‘frodologgins‘ database which is empty. I ran the logging function against the new database and…

* Chef’s kiss *

It works as expected! A quick update to the Grafana connection settings to switch to the new database. With the updates in place we now get the expected results in the drop down. We can see the fields we want to display and chart.

We can try to reconcile the point, tags, and fields in the Python code with how we are querying it with InfluxQL. Slowly sharpening our mental model and skills. The query reads as follows:

  • From our database
  • Query our readings for the default/autogen retention policy
  • Where the device tag value is raspberrypi
  • Return the last temperature field reading
  • Group by ten second intervals
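
Written out as text, the query described by that list might look something like the string below. This is a reconstruction from the bullet points, not a copy of what Grafana generated. Grafana substitutes its own time filter, so a fixed one-day window is assumed here to make the query usable in the Influx CLI. The last_value_query helper is made up for illustration.

```python
def last_value_query(field, measurement="reading", retention_policy="autogen",
                     device="raspberrypi", window="1d", interval="10s"):
    """Build an InfluxQL query returning the last value of a field per interval."""
    return (
        f'SELECT last("{field}") FROM "{retention_policy}"."{measurement}" '
        f'WHERE "device" = \'{device}\' AND time >= now() - {window} '
        f'GROUP BY time({interval})'
    )


print(last_value_query("temperature"))
```

Note the quoting: identifiers such as the field and tag names take double quotes, while the tag value is a string in single quotes.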

One thing I wasn’t quite sure of is the way that the time range worked in Grafana with the data logged in the database. The query looked correct but no data was returned. I was looking at a window from now-1d to now initially. It seemed logical to me, “find me all the data points from yesterday to now“. The Inspector in Grafana helps get the query and then we can use that to run the query in the Influx CLI to test the queries.

Inspector Clouseau 🕵️

I eventually adjusted that to now to now+1d which in my mind is “back to the future” 🔮🚗, but it worked. I think this comes down to how the dates are stored (i.e. timezone offsets) and how the function is evaluated. I’ll dig into that later; for now this works, and we have data showing on a graph.
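
A possible explanation, offered as an assumption rather than something verified on this setup: the logging script stamps points with datetime.now(), which carries no timezone. If the client library treats such naive values as UTC, then on a Pi whose local time is ahead of UTC the points would land in the future. The sketch below shows the size of that shift and a timezone-aware alternative. The utc_now helper is made up for this post.

```python
from datetime import datetime, timezone


def utc_now():
    """Timezone-aware 'now' in UTC."""
    return datetime.now(timezone.utc)


naive_local = datetime.now()  # what the logging script uses: no tzinfo attached
aware_utc = utc_now()

# If the naive value is read as UTC, it is off by the machine's UTC offset
skew = naive_local.replace(tzinfo=timezone.utc) - aware_utc
print("tzinfo on naive value:", naive_local.tzinfo)
print("apparent shift in hours:", round(skew.total_seconds() / 3600, 1))
```

If the shift printed on the Pi is a positive number of hours, switching the logger to utc_now() would be worth trying before reaching for now+1d.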

Grafana explore graph sample
Graph

Let’s take the learnings and apply it to building the dashboard. Head to the home page. There is a “Dashboards” tile we can use to build our first dashboard.

Grafana home page
Dash lightning! 🌩️

It opens up a new editing window. I chose an empty panel. From there we can edit the panel in a similar way to what we did with the Explore window. In the upper right corner we can choose the type of chart.

Grafana explore chart type selection
Serious time ⏳

For our case that’s “Time series”. There are a bunch of other options too, from changing the charts and adjusting threshold values for the gauges to applying units of measure, and so much more.

That’s it! Use the same approach to build out the other charts. I added “Gauge” visuals as well with the corresponding query.

Grafana final dashboard displaying readings
Ice, Ice, Baby 🧊

Learnings 🏫

We made it! It took a while but we did it. Failure is a pretty good teacher. I failed a bunch and learnt a lot more. That’s not wasted time. It’s worth just getting hands on and trying different things out to build the mental model and skills. I have a long way to go to really understand Python, InfluxDB, Grafana, and Linux but I’ve made progress and learnt new things which is a blessing.

Until next time.

🐜

Logging SenseHAT telemetry on the Raspberry Pi

This is part of my series on learning to build an End-to-End Analytics Platform project.

TLDR; I’ve been working on upping my Python game 🐍. In this post we get started by creating a Python virtual environment, then build a Sense HAT data logger with the Raspberry Pi that writes the readings to a local file on the Pi.

Going virtual 💾

If you are wondering how I am writing code remotely on the Pi, go check out setting up remote development on the Raspberry Pi using VS Code. We are using the same approach here to get connected and working on our Pi.

Part of this journey is growing my skills. I chose Python as a programming language. Not diving into too many details. It just gives a range of capabilities (web through to machine learning) with a single language. No need to switch too much while learning all the techs in this series. Works for me.

While upping my Python game I came across something called virtual environments. A little primer on virtual environments. I think I have a reasonable grasp on how to start using them for better package management.

Not going full virtualenvwrapper yet though. “Hey, I just met you and this is crazy, but here’s my bookmark, browse it later maybe.” #justsayin’

To that note, let’s set up a virtual environment. First, check our Python versions on the Pi:

python3 --version
Snake in eagle shadow 🐍🦅

We have Python3 installed on the Pi. That means we should have the venv capability built-in. Let’s give it a whirl!

python3 -m venv noobenv
Environment cultivated

When we do that a new folder gets created in our repo. It has a bunch of folders and files related to the “inner workings” of how virtual environments work <- science 👩‍🔬.

Logging the things 🪵

The goal here is that we have an IoT device that is capturing data from the sensors. It has a bunch of sensors we are going to use, which is exciting. Honestly, the more I work with it, the more amazing it is to me.

from sense_hat import SenseHat
from datetime import datetime

sense = SenseHat()

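def get_sense_reading():
    sense_reading = []
    sense_reading.append(datetime.now())
    sense_reading.append(sense.get_temperature())
    sense_reading.append(sense.get_pressure())
    sense_reading.append(sense.get_humidity())

    # Orientation comes back as a dict, so unpack it as yaw, pitch, roll
    orientation = sense.get_orientation()
    sense_reading.extend([orientation["yaw"], orientation["pitch"], orientation["roll"]])

    # The raw compass, accelerometer and gyroscope readings are dicts with x, y and z keys
    for raw in (sense.get_compass_raw(), sense.get_accelerometer_raw(), sense.get_gyroscope_raw()):
        sense_reading.extend([raw["x"], raw["y"], raw["z"]])

    return sense_reading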

We create a function (get_sense_reading) that we can call repeatedly. Then use the SenseHat functions (e.g. get_temperature) to get readings from the different sensors. To get them all in a single object/row, we can use a list (sense_reading). Then append each reading to the list. Once we have them, we return the list object.

Witness the quickness ⚡

We add a for loop to our code to call the get_sense_reading function a few times and print the results to the terminal window. We can run the program (main.py) by calling the Python 3 interpreter and passing the file name to it. That loads the code, executes the loop, prints the results.

python3 main.py
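
The loop itself isn't shown in this post, so here is a sketch of what it could look like. To keep it runnable without a Sense HAT it uses a stand-in reader returning made-up values. Both fake_sense_reading and print_readings are names invented for this example.

```python
from datetime import datetime


def fake_sense_reading():
    """Stand-in for get_sense_reading() so the loop runs without a Sense HAT."""
    return [datetime.now(), 21.5, 1013.2, 40.1]  # made-up sensor values


def print_readings(read_fn, count=3):
    """Call read_fn a few times, print each reading, and return them all."""
    readings = []
    for _ in range(count):
        reading = read_fn()
        print(reading)
        readings.append(reading)
    return readings


readings = print_readings(fake_sense_reading)
```

On the Pi, passing the real function with print_readings(get_sense_reading) should print three rows of live sensor values.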

Now to add data to a file on the device. We’ll use a CSV for now, then adapt it later based on our needs. We can use the sense_reading object returned by the get_sense_reading function and write that to the file using the csv library.

from csv import writer

timestamp = datetime.now()
delay = 1

with open("logz.csv", "w", newline="") as f:
    data_writer = writer(f)
    data_writer.writerow(['datetime','temp','pres','hum',
                          'yaw','pitch','roll',
                          'mag_x','mag_y','mag_z',
                          'acc_x','acc_y','acc_z',
                          'gyro_x', 'gyro_y', 'gyro_z'])

    while True:
        data = get_sense_reading()
        difference = data[0] - timestamp
        # total_seconds() also counts any whole days in the gap
        if difference.total_seconds() > delay:
            data_writer.writerow(data)
            timestamp = datetime.now()

We start with a timestamp because we want to calculate a delay interval, say 1 second, between writes to the file. Then we open the file and write a header row (writerow) to the file. We use a while loop to collect readings, then once we exceed the delay interval, write the row to the file. We need to update the timestamp, otherwise we will write a row on every pass after we exceed the delay the first time.

Testing seems to be working and we can log data to a file on the device. VS Code integrated terminal really is fantastic at running multiple and side by side shell/terminal windows.

A tail of two terminals 💻

Awesome! It works. We have a program logging data to a csv file at a defined interval. Tail simply prints end of file content. A few lines is all we need to double check things are working. Last thing left.. shut down the Pi remotely. Usually I would use a shutdown command. I gave a new command a try “Halt”.

sudo halt
Halt! Who goes there? 🛑

Looks like that worked 🙂 The connection got terminated and VS Code detects that, and tries to reconnect. Pretty slick. We managed to start putting new Python skills to use. Learnt how to create a virtual environment for better package management. Then collecting and writing telemetry from the SenseHat to local storage on the Pi.

That’s it for now.

🐜

VS Code Setting Remote Development on Raspberry Pi

This is part of my series on learning to build an End-to-End Analytics Platform project.

TLDR; The goal is to work remotely on the Raspberry Pi. We added the VS Code Remote Development extension pack. Used the Remote – SSH extension which is part of the pack to connect to the Pi remotely over the network. Set up key based/passwordless SSH authentication. Then to remember the host we added an entry to our SSH config file through VS Code. Finally, got started with the Sense HAT and wrote some code to do stuff on the Pi!

Working remote 🧑‍🌾

We want to work remotely on the Pi (the artist formerly known as Raspberry Pi 🕺). Sticking with VS Code, there is an extension to help us do remote development:

VS Code remote extension text.
Remote extensions

This Remote Development extension pack includes three extensions:

  • Remote – SSH – Work with source code in any location by opening folders on a remote machine/VM using SSH. Supports x86_64, ARMv7l (AArch32), and ARMv8l (AArch64) glibc-based Linux, Windows 10/Server (1803+), and macOS 10.14+ (Mojave) SSH hosts.
  • Remote – Containers – Work with a separate toolchain or container based application by opening any folder mounted into or inside a container.
  • Remote – WSL – Get a Linux-powered development experience from the comfort of Windows by opening any folder in the Windows Subsystem for Linux.

Now the one we are after for the time being is the Remote – SSH extension which allows me to connect to the Pi over SSH. It’s not as simple as that though. Look at the architectures supported. We need to make sure our Pi has one of those architectures. Looking at the extension documentation we can see it supports: ARMv7l (AArch32) Raspbian Stretch/9+ (32-bit).

When we SSH’d into the Pi previously, we got a glimpse of the architecture version:

Command line text with Raspberry Pi version.
Version dePi

To get the OS version we can use the cat command to find the release information:

cat /etc/os-release
Command line text with Raspberry Pi version.
Buster Pi
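
The same architecture check can be scripted. This sketch compares what Python reports for the machine against the Linux architectures named in the extension description above. The SUPPORTED_MACHINES set and the is_supported_host helper are made up for this post, and the list is only as current as that description.

```python
import platform

# Linux architectures listed for SSH hosts in the extension description above
SUPPORTED_MACHINES = {"x86_64", "armv7l", "aarch64"}


def is_supported_host(machine=None):
    """Check a machine string (default: this machine) against the supported list."""
    machine = machine or platform.machine()
    return machine.lower() in SUPPORTED_MACHINES


print(platform.machine(), is_supported_host())
```

Run on the Pi itself, this prints the architecture string and whether it is in the list. An older board reporting armv6l, for example, would come back False.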

We have a supported configuration. Let’s try to connect remotely via SSH to our Pi with VS Code using our new extension pack. Open the Command Palette and type “remote-ssh”. Look for the “Connect to Host…” option:

VS Code command prompt with text.
Contact 📡

Then select the platform for the host. In our case Linux. Raspbian, the Pi-specific OS, is a Linux distribution based on Debian. Fill in the login credentials to finish establishing a connection.

One thing I spotted was the prompt in the bottom right corner of VS Code which was connecting via SSH, nice, and “Installing VS Code Server”. Wait, what? Looking at the architecture, that seems related to the way VS Code Remote Development works. A bit more than I am going to dig into here:

VS Code status bar with installation prompt.
Deploying

Eureka! We are connected remotely to the Pi! Take a look at the VS Code status bar in the bottom left. Notice that it says “SSH:<ip of Pi>” and the Terminal window is a bash shell running connected to the Pi.

VS Code terminal with remote SSH connection.
Ssh.. Terminal likes Pi too

Now that we are connected remotely to the Pi, let’s get started with the Sense HAT. First things first, software updates. I am learning about Linux as I go here. Standard users by default aren’t allowed to install applications on a Linux machine. To update the software we need to escalate privileges. The “Run as Administrator” in Linux terms seems to be “sudo”. I’m team “super user do”, which just sounds epic. Then apt-get update/upgrade to invoke the package handling utility to update or install the newest version of packages on the system.

sudo apt-get update
sudo apt-get upgrade
VS Code terminal with installation status messages.
Yep.. updates

I ran both commands (only one shown for brevity). They ran like a charm 🍀. The upgrade pulled all it needed, created a few diversions, seemed to unpack its bags, set itself up, and process what just happened. I don’t know about you, but our Pi seems to have gone through a big phase in its life 🤣. Our software is updated. Now to install the sense HAT software:

sudo apt-get install sense-hat
VS Code terminal with package status messages.
Already the new and shiny

Nice! Our sense HAT software is up to date. Let’s start writing some code. What is awesome though is that there is an online emulator (trinket.io) that you can use to code the sense HAT in your browser if you don’t have one! Next up, we figure out how to set up new directories for the code with mkdir:

mkdir repos
cd repos
mkdir sense-hat-noob
cd sense-hat-noob

Now once that’s done, VS Code can pick up the new folder and we can open it using the regular “Open Folder” option in the “File” menu:

VS Code command prompt with filepath text.
The Pi files

Short detour here. You might keep getting prompted for the user password. That got annoying fast so I set up key based/passwordless SSH authentication. Then to remember the host we can add an entry to our SSH config file through VS Code.

VS Code SSH target extension.
Remember me
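
For reference, an SSH config entry is just a small block of keywords. Here is a rough sketch that builds one as text. Host, HostName, User, and IdentityFile are standard OpenSSH config keywords, but the alias, IP address, user, and key path below are placeholders I made up, not my actual setup.

```python
def ssh_config_entry(alias, host_name, user, identity_file):
    """Build one Host block for an OpenSSH client config file."""
    lines = [
        f"Host {alias}",
        f"    HostName {host_name}",
        f"    User {user}",
        f"    IdentityFile {identity_file}",
    ]
    return "\n".join(lines) + "\n"


# All four values are hypothetical placeholders.
ENTRY = ssh_config_entry("my-pi", "192.168.0.42", "pi", "~/.ssh/id_ed25519")
print(ENTRY)
```

With an entry like that in place the host shows up by its alias, and the key takes care of the password prompts.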

Now to add a file and write some code. To do that we are using the touch command. The moment we do that, VS Code noticed it’s a Python file extension and popped up to ask if we want to install the recommended Python extensions.

VS Code remote Python extension text.
Why did it have to be snakes? 🐍🤠

What is interesting is that it suggests the extension should be installed on the Pi 🤔 Again, this is related to the architecture for remote development. I tried not smiling about this, but apparently this extension has preferences…

VS Code remote Python host extension text.
There not here

Okay. We have reasonably secure remote connectivity, remote extensions, a code file, now we need some code. Thanks to the new remote extension we have IntelliSense (Pylance) running:

VS Code Python Pylance intellisense.
I sense intellisense

I wrote some code to display a message on the LED face. There are a bunch of parameters for the function, let’s keep it simple here:

from sense_hat import SenseHat
sense = SenseHat()
sense.show_message("Hello world")

Voilà! This sparks joy! The Pi displays its elation! I got a snap of the exclamation marks scrolling across the screen (pretty cool catching the one LED off as the characters move across the face).

Raspberry Pi Sense HAT led display with exclamation marks.
Spark Joy
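
About those other parameters: show_message also accepts scroll_speed, text_colour, and back_colour. A minimal sketch of how they could be used is below. The colour and speed values are arbitrary picks of mine, and the try/except fallback is only there so the snippet still runs on a machine without the sense_hat library.

```python
def message_options(red=255, green=255, blue=0, scroll_speed=0.05):
    """Bundle the optional show_message arguments; values are arbitrary picks."""
    return {
        "scroll_speed": scroll_speed,        # seconds between scroll steps
        "text_colour": [red, green, blue],   # RGB, 0-255 each
        "back_colour": [0, 0, 0],            # LEDs off behind the text
    }


OPTIONS = message_options()

try:
    from sense_hat import SenseHat
except ImportError:
    SenseHat = None  # not on a Pi, or the library is not installed

if SenseHat is not None:
    sense = SenseHat()
    sense.show_message("Hello world", **OPTIONS)
else:
    print("sense_hat not available; would have used:", OPTIONS)
```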

That’s it for this one. Tons of new learning experiences. We are making progress! If you have any ideas or suggestions on things that can be done better, let me know.

Until next time, take care.

🐜


End-to-End Analytics Platform – GitHub Actions

This is part of my series on learning to build an End-to-End Analytics Platform project.

TLDR; This post we got started with GitHub Actions. Test drove Excalidraw for diagramming. While building the workflow learnt YAML in y minutes. Started with a simple starter workflow. Worked through deploying Bicep files by using GitHub Actions.

Automate construction 🚧

Now that we have the code set up and we can deploy it manually it’s time to autobot automate. We are going to use GitHub Actions for Azure to work on building our Continuous Integration/Continuous Deployment pipeline.

What we are looking to create is a workflow to automate tasks. Actions are event-driven. An example is “Run my testing workflow when someone creates a pull request.” What makes up a workflow? Cue Intro to GitHub Actions. To start off the thinking I get to test out a pretty fantastic open-source whiteboard tool, with tons of potential, called Excalidraw. *Inner voice screaming… “Excalidraw, I choose you!”* ⚔️

Working the flow.
  • A workflow is an automated procedure that you add to your repository.
  • A job is a set of steps, triggered by an event/webhook or a schedule, that execute on the same runner.
  • A step is an individual task that can run commands in a job.
  • Actions are standalone commands that are combined into steps to create a job.
  • A runner is a server that has the GitHub Actions runner application installed.

Now that we know some basics, let’s dive in and give it a go. Here is the plan for what we want to achieve with an accompanying artistic depiction:

  1. Develop some code, commit to our branch, push the changes
  2. Then complete a pull request and merge the changes
  3. The GitHub action triggers the workflow
  4. The workflow runs our jobs and steps to deploy the resources
  5. Validate the resources got deployed to Azure
Hard at work.

Let’s create our first GitHub Actions workflow. Navigate to the ‘Actions‘ tab in our repo. We get a few workflow templates we can use. Let’s use the ‘Simple workflow‘ by using the ‘Set up this workflow‘ button.

And Action ๐ŸŽฌ

GitHub Actions uses YAML syntax for defining events, jobs, and steps. When we created the workflow, GitHub added a new .github/workflows directory to our repo. It created a new .yml file which I renamed to build.yml in that directory. For those of us that don’t speak YAML fluently, we can learn x(yaml) in y minutes 🧪.

Yay YAML!

A brief summary of what is going on here:

  1. Our new YAML file and path were added to our repo
  2. We can see our workflow is triggered on push or pull request events to our main branch
  3. We have one job that runs on an Ubuntu runner server
  4. We have three demo steps
    1. Uses a packaged action from https://github.com/actions/ called checkout to check out our repo
    2. Runs a single line script in the runner’s shell
    3. Runs a multi-line script in the runner’s shell
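
To make that summary a bit more concrete, here is a rough sketch of the same structure as a Python dictionary. It is reconstructed from the list above, not copied from my build.yml, so treat the workflow name, job name, step names, echo commands, and the checkout version tag as assumptions. Printing it as JSON is a cheap trick, since JSON is also valid YAML.

```python
import json

# Shape of the starter workflow: triggers -> one job -> three steps.
WORKFLOW = {
    "name": "CI",  # assumed name
    "on": {
        "push": {"branches": ["main"]},
        "pull_request": {"branches": ["main"]},
    },
    "jobs": {
        "build": {  # assumed job name
            "runs-on": "ubuntu-latest",
            "steps": [
                {"uses": "actions/checkout@v2"},  # version tag is an assumption
                {"name": "Run a one-line script", "run": "echo Hello, world!"},
                {
                    "name": "Run a multi-line script",
                    "run": "echo First line\necho Second line",
                },
            ],
        }
    },
}

print(json.dumps(WORKFLOW, indent=2))
```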

That’s a good enough starter template. From this point on, for brevity, I followed the documentation to deploy Bicep files by using GitHub Actions. That covers setting a deployment service principal (My choice, Windows Terminal Prettified), configuring GitHub secrets, and the sample workflow to deploy Bicep files to Azure. Once that is all set up, we have a workflow that looks like this:

Note: Though not the recommended practice, I adjusted the scope of my service principal to a subscription level. For my testing I would like GitHub Actions to create resource groups using my Bicep file definitions, not the CLI. So my command was a little different:

az ad sp create-for-rbac --name {myApp} --role contributor --scopes /subscriptions/{subscription-id} --sdk-auth

After reading the Azure/arm-deploy documentation and the exampleGuide

Adjusted

To commit our changes, just use the ‘Start commit’ button in the top right corner. I created a new branch from here for the change, then just finished the pull request from there, merging our changes to the main branch.

Committing to it

Remember our triggers? On pull_request to main? That kicks off our pipeline 😎

Behold! It lives!

Full disclosure: It failed. lol.

  • I got an error “missing ‘region’ parameter” for the Azure/arm-deploy action
  • I adjusted the file path for the ‘deployAnalyticsPlatform.bicep’ file from root to the src/bicep directory.
  • I also modified the trigger to only fire when a push is done to the main branch. That prevents us running the pipeline twice, once for the pull request, then again when the merge is run.
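
The double run is easier to see with a toy model. This is a deliberately simplified sketch, not how GitHub evaluates triggers internally: it just treats a completed pull request into main as producing one pull_request event and one push event, and counts how many of them match the workflow triggers. The function name is made up.

```python
def runs_for(triggers, events):
    """Return the (event, branch) pairs that would start the workflow.

    triggers maps an event name to the branches it listens on;
    events is a list of (event name, branch) tuples.
    """
    return [(event, branch) for event, branch in events
            if branch in triggers.get(event, [])]


# Opening and then merging one pull request into main (simplified).
EVENTS = [("pull_request", "main"), ("push", "main")]

BEFORE = {"push": ["main"], "pull_request": ["main"]}  # starter template triggers
AFTER = {"push": ["main"]}                             # trimmed to push only

print("runs before:", len(runs_for(BEFORE, EVENTS)))
print("runs after:", len(runs_for(AFTER, EVENTS)))
```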

So we learn 😉

Re-calibrating

Quick update to the code. Run through the GitHub flow again and we are back in business. When we navigate into the Action we can see a bunch of information. Why the workflow was triggered. What’s the status. Which jobs are running.

In flight

When we click on the job, we can drill into the runner logs as well. This helps a bunch in debugging workflows. An example is that we have a property for the storage account which is read-only:

Only reading errors

The deployment succeeded though which I think is great progress!

Deployed with action!

That’s it! All done. Explored GitHub actions. Created service principals and GitHub secrets. Learnt some YAML. Broke and fixed our workflows. Then successfully deployed resources from Bicep code.

🐜

P.S. A really simple video that also helped me quickly establish some key points was: GitHub Actions Tutorial – Basic Concepts and CI/CD Pipeline with Docker

End-to-End Analytics Platform – IoT with Pi 🥧

This is part of my series on learning to build an End-to-End Analytics Platform project.

TLDR; My very first Pi and sense HAT was graciously gifted to me by Jonathan Wade (LinkedIn). I assembled it. Tried to go headless. Ended up adding a head because reasons. Ran through initial setup and updates. Configured OpenSSH on Windows and SSH on the Pi to get to the headless state.

I got gifted something amazing! Yup, my first Raspberry Pi. Not only that, a Sense HAT too! Now for many people that might not mean much, but this is a pretty big moment for me. To save you from another unboxing experience I took the liberty of doing that privately and cutting to the end. Behold! The unboxed product:

Raspberry Pi device parts on a table.
Bare metal

Looking through what we have here: the Raspberry Pi 4 Model B (top left), the Raspberry Pi Sense HAT (bottom middle), a SanDisk 32GB microSD card, some spacers, screws, power cable, and an HDMI to mini HDMI cable.

Raspberry Pi device and Sense HAT assembled
Fresh off the factory floor

Assembling the unit was really simple. Just a quick look at the Sense Hat board and we can see some amazing things:

  • Air Pressure sensor
  • Temperature and humidity sensor
  • Accelerometer, gyroscope, and magnetometer
  • 8×8 LED matrix display
  • Even a small joystick!

This device is pretty EPIC and it’s not even powered on yet. So many things I haven’t ever worked with but can’t wait to try and figure them out.

Raspberry Pi Sense HAT LED lights on
Like a diamond 💎

Next up power and networking. The moment I connected the power a rainbow 🌈 filled the room. A sign of a pot o’learnings 🪙 to be found at the end of this experience.

Nice! Now we have the whole unit assembled. What’s the plan? Well, the thinking is to use this to deploy and run Azure SQL Edge on it. Why? A few reasons:

  • I have never worked with a Raspberry Pi
  • I haven’t really worked on Linux at all
  • I have never done any work with Azure IoT solutions, or IoT at all for that matter
  • I do know Azure SQL reasonably well, though not Azure SQL Edge
  • Azure SQL Edge has a bunch of interesting things: data streaming, time series analysis, and ONNX AI/ML capabilities. None of which I have worked with.

I didn’t connect any screen at this stage. It’s known as “headless”. We need a way to connect to the Pi though. We have the Windows Terminal and found that we can use SSH to connect to the Pi from a Windows 10+ machine.

Terminal with OpenSSH installation
SSHHHHHH…

That didn’t work. Apparently SSH is disabled by default. Considering I don’t have a microSD card reader, it’s time to put a “head” on our nearly headless Pi and connect a screen 🖥️. The HDMI cable, a keyboard, and a mouse later and we are connected. I ran through the setup, updated the password, downloaded the latest updates, then set up SSH. There are other security best practices that I am going to follow as well after this post. Then tried to connect again and…success!

Terminal with successful SSH connection message
Connection suck seeds! 🌱

Next I shut the Pi down. Disconnected the screen, mouse, and keyboard. I’m going to try to work on this device remotely so I don’t need those peripherals right now.

Now that we have an IoT device I am going to start exploring if there are any open data sets that I can start using and feed some of the device telemetry into the end-to-end analytics solution as a cohesive project. We are going to set up additional services in our solution to support IoT devices which will be fun.

Until next time.

🐜

End-to-End Analytics Platform – Bicep What-If deployment

This is part of my series on learning to build an End-to-End Analytics Platform project.

TLDR; After I refactored my code to use modules I found that Bicep supports ‘What-If’ operations which explain what the code is going to do before deploying it. This post I do a short test on that. Found an issue not showing Azure Synapse resource creation. Then browsed the Bicep GitHub repo to search issues related to What-If operations. Didn’t find what I was hoping for, so logged my first public GitHub issue.

Update: The issue we encountered seems to be related to another preflight improvement which is being worked on but is a “…bit of a gnarly, low level issue so please be patient 🙂”. I was amazed to see how quickly the Bicep team responded on this.

What happens when I push this button? 🤔

So after my previous post on factoring in some Bicep best practices for code reuse I noticed that Bicep supports ‘What-If’ operations.

az deployment sub create --name '<name of deployment>' --location '<location name>' --template-file '<path to bicep file>' --confirm-with-what-if
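
If you end up scripting this, one option is to build the same command as an argument list in Python. This is only a sketch: the flags are the ones from the command above, while the deployment name and region are placeholders. The file path is the one from the GitHub Actions post and may not match your repo. Nothing gets executed here, it just prints the command.

```python
import shlex


def what_if_command(name, location, template_file):
    """Assemble the subscription-scope deployment command with the what-if prompt."""
    return [
        "az", "deployment", "sub", "create",
        "--name", name,
        "--location", location,
        "--template-file", template_file,
        "--confirm-with-what-if",
    ]


# Placeholder values, swap in your own.
CMD = what_if_command(
    "demo-deployment",
    "westeurope",
    "src/bicep/deployAnalyticsPlatform.bicep",
)
print(shlex.join(CMD))
# To actually run it you could pass CMD to subprocess.run(CMD, check=True).
```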

Side note: I had to change the VS Code theme to save us all from the agony of lime green on light grey background reading.

What’s nice is we get a breakdown of changes that we are about to apply to our environment. I think that is awesome.

Terminal output of a Bicep deployment what-if operation.
I have one question. Explosions? 🧨

Yes, for the eagle-eyed reader, I realised my storage account name is an Azure Region name hahaha 😂

Looking at the terminal output, reading top to bottom, I can see:

  • We are about to deploy at the subscription scope.
  • We are deploying an Azure Data Lake Gen2 Storage Account with blob container and all their configuration goodness.
  • We are deploying an Azure Synapse… wait a minute…

What was weird was that I didn’t see the Synapse Workspace. I checked the deployment details/output and it was there.

Azure portal deployments screen.
Deployed

I wondered if the reason it didn’t output the Azure Synapse Resource during the What-If was because I didn’t define any output variables for it which I did for the storage account.

Bicep output code.
Putting more out.

I updated my variables, added output variables for my synapse.bicep module, then ran the What-If again. Aaaaand…. nothing changed. Considering Bicep is an Open Source project on GitHub we get to search for issues with ‘What-If’ operations. So, we get to create an issue. Taking the things learnt over the past few posts on

GitHub issue summary.
de bug 🐛

That’s it. Done. Created our first public issue: what-if operation doesn’t seem to include all bicep defined or created resources · Issue #3682 · Azure/bicep (github.com).

The what-if behaviour doesn’t block us at this stage. The deployment works so at this point I think we are set for the next section to work on getting this into a GitHub Actions pipeline.

🐜

Infrastructure as Code (IaC) reuse

Photo by RODNAE Productions from Pexels

This is part of my series on learning to build an End-to-End Analytics Platform project.

TLDR; I made improvements to the Infrastructure as Code from the previous post by following best practices and promoting code reuse. Continued with parameters, but extended the code with scopes, modules, variables, functions, operators, and outputs. There is a list of Bicep best practices that is worth looking into.

Divide and conquer

We can use modules to group a set of one or more resources to be deployed together. Modules give us better readability and reuse. They basically get converted to nested ARM templates from what I understand.

The first part that I want to move into a module is the data lake storage account and resolve dependencies. When that’s done, repeat the process for the other resources that we want to deploy.

Moving day.

Next up, update modules to use parameters and variables where possible to avoid hard coded values. We should be in a position where the module is a bit of code that can be called with a set of parameters. Note that when the resources are in the same file, you can reference them directly. An example from my previous post was where I referenced the datalake resource.

Same file resource references.

A module only exposes parameters and outputs to other Bicep files. When we move the data lake resource creation to a module, we need to leverage outputs which can then be passed between modules. The idea is to call a module -> deploy the resource -> output important things -> pass those things to another module as input parameters. So, the same property I referenced before now becomes an output in the module of the storage account:

Output for output.

Output variables can now be used in the main script as inputs to another module, etc. We just reference them using the module.output syntax.

Outputs as inputs.

We use operators in our deployments for things like conditional deployments.

On one condition.

Expanding on the use of parameters and variables, functions are a great way to drive flexibility and reuse into your deployments. Getting runtime details, resource references, resource information, arrays, dates, and more. Just remember most work at all scopes, some don’t. When they don’t you will probably figure that out with errors. One way to use them is to inherit the resource group location during resource deployment. In our case, setting variables with the resource group location, appending a deterministic hash string suffix for the storage account name from the resource group, or even enforcing lower case of names then using the variables for deployment.

Variables and functioning captain 👩‍✈️
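
To show the idea behind the deterministic suffix outside of Bicep, here is a small Python sketch. Big caveat: this is not Bicep’s own hashing function, I am just using SHA-256 from the standard library to illustrate “same input, same suffix”. The function name, prefix, suffix length, and resource group ID are all made up. The one hard rule baked in is the storage account naming constraint: 3 to 24 characters, lowercase letters and numbers only.

```python
import hashlib
import re


def storage_account_name(prefix, resource_group_id, suffix_length=13):
    """Derive a repeatable, lowercase storage account name from a resource group ID."""
    digest = hashlib.sha256(resource_group_id.encode("utf-8")).hexdigest()
    candidate = (prefix + digest[:suffix_length]).lower()
    candidate = re.sub(r"[^a-z0-9]", "", candidate)  # drop anything not allowed
    return candidate[:24]                            # respect the length limit


# Hypothetical resource group ID, for illustration only.
RG_ID = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/example-rg"
NAME = storage_account_name("DataLake", RG_ID)
print(NAME)
```

Run it twice and you get the same name, change the resource group and you get a different one. That is the property we are after when naming globally unique resources.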

FYI, the weird looking string notation ‘${var}’… that’s called ‘string interpolation’. Pretty simple compared to other ways I’ve had to write parameterised strings before with all kinds of place holders, parameters, and functions. I like!

As a good practice we use parameter decorators to control parameter constraints or metadata. Things like allowed values, lengths, secure strings, etc.

Prettier.

What we do next in our main deployment file is to change the scope. That way we can deploy at the subscription level which lets us create resource groups in Bicep instead of the Azure CLI which we did in the previous post.

Scoping things out 🔭

Note: It’s preferred in most cases to put all parameters/variables at the top of the file.

One other point of interest is that when we change the scope, our modules that deploy resources error because they can’t be deployed at the subscription level, only at the resource group level. Makes sense. So we need to change their scope in the deployment.

Scope inception.

Polishing up the current solution with these practices was good learning. I continued with the approach across all modules and files. Then ran a few tests to make sure the resources deploy as expected.

That covers it off for this post. What I think we will do next is work on setting up a CI/CD pipeline in GitHub to build and deploy these resources into Azure.

🐜

End-to-End Analytics Platform – Infrastructure as Code (IaC)

Photo by John Nail from Pexels

This is part of my series on learning to build an End-to-End Analytics Platform project.

TLDR; I set up a GitHub milestone with two issues. Started working with the Bicep language to build Azure resources. It’s basically a language that simplifies building Azure Resource Manager (ARM) templates. I installed the Bicep tooling. Defined resources using things like parameters, modules, and others using an ARM template guide. Used Bicep build to generate an ARM template from the Bicep file. Experimented with Bicep decompile to generate a Bicep file from an ARM template. Created my first gist to share some code. Lastly, used the Azure CLI to deploy the Bicep resource. Also… found a Bicep playground 🤸‍♂️ Just saying..

Prepping for Dev 👨‍💻

We are using the development flow from my previous post. Not enough time? Check out the GitHub Flow.

We need a starting point to build out our end-to-end analytics platform. We are going to attempt to deploy an Azure Synapse Analytics workspace and required services with Bicep templates. This gives us two key capabilities:

  • Data Lake Storage to store our data
  • Pipelines to support orchestration and batch ingestion of data

Let’s get started. We created a new issue, updated the project board, set up our new branch in GitHub. Pulled the updates locally. Then checked out that new branch.

Getting the hang of issues.

In this post though I wanted to learn about GitHub Milestones. They make it easy to track a bunch of related issues. They also have convenient progress tracking built in. So I added another issue. Then made my way to the milestone page from the issues tab:

More issues.

Used the ‘New milestone’ button to create a new milestone. Gave it a name and filled in the details.

A milestone. Yep.

After that jump back to the issue and assign it to the milestone we just created. Notice that the milestone has a progress bar.

I will walk 500 miles.

Nice. Issues, milestones, branches, in the flow. Time to get to building things.

Building Biceps 💪

To work with Bicep files we need to install the Bicep tools (Azure CLI + Bicep install, VS Code Bicep Extension). Once that’s done, we add our first .bicep file 🦾 to the project. Remember to check which branch you are on locally.

Flexing our first Bicep file.

Stepping back, according to the documentation, Bicep is a domain-specific language (DSL) that uses declarative syntax to deploy Azure resources. We are not covering every area of Bicep here. The documentation does a good job of that. There are a few things that we are going to use in this post:

There are a bunch of other capabilities that you can explore from loops, functions, and more. So much goodness, so little time… someday maybe.

Now to start building out our resources. Let’s start with some parameters:

Intellisense integration for Bicep development.

Yes please! Intellisense for the win! I also added a comment which is being kind to my future self. Now let’s add a resource. The intellisense really helps a bunch to expedite development. We can get to the resource, the API version, and more using the Tab key. Another nice thing is using Ctrl+Space to expand more options, properties, and more:

Building a Bicep resource.

We have some basic building blocks for figuring out how to create a resource. Next I expanded the declaration with more resources, parameters, comments, and properties using the template documentation and Azure-Samples.

Note: You could try creating a Synapse Analytics Workspace using the portal and grab the template just before you create it as well.

Notice that you can reference the parameters in the resource declaration which helps with code reuse. I added a deployment condition to control what services get deployed. The code is in an intermediate state to showcase that I can use strings or parameters to assign values. I’ll gist show you what I did 😉:

/*Global parameters*/
param resLocation string = resourceGroup().location
/*This controls if we deploy the resource or not*/
param deployDataLake bool = true
param deploySynapse bool = true
/*Resource specific parameters – Synapse Analytics*/
param synapsWorkspaceName string = 'fancy-name'
param synapseSqlAdministratorLogin string = 'majestic-username'
param synapseSqlAdministratorLoginPassword string = 'your-complex-password'
/*Create a data lake storage account which we use as the Synapse Analytics default data lake*/
resource datalake 'Microsoft.Storage/storageAccounts@2021-04-01' = if (deployDataLake == true) {
  name: 'fancy-name'
  location: resLocation
  sku: {
    name: 'Standard_LRS'
    tier: 'Standard'
  }
  kind: 'StorageV2'
  properties: {
    isHnsEnabled: true
    supportsHttpsTrafficOnly: true
    accessTier: 'Hot'
    networkAcls: {
      defaultAction: 'Allow'
      bypass: 'AzureServices'
      virtualNetworkRules: []
      ipRules: []
    }
    encryption: {
      services: {
        blob: {
          enabled: true
        }
        file: {
          enabled: true
        }
      }
      keySource: 'Microsoft.Storage'
    }
  }
}
/*
I built this child resource by working my way back through these templates: https://github.com/Azure-Samples/Synapse/tree/main/Manage/DeployWorkspace/storage
It gets a little tricky, but we are building a dependency chain of parent-child resources. e.g. Storage account -> Blob -> Container
*/
resource blobService 'Microsoft.Storage/storageAccounts/blobServices@2021-04-01' = if (deployDataLake == true) {
  parent: datalake
  name: 'default'
  properties: {
    cors: {
      corsRules: []
    }
    deleteRetentionPolicy: {
      enabled: false
    }
  }
}
resource container 'Microsoft.Storage/storageAccounts/blobServices/containers@2021-04-01' = if (deployDataLake == true) {
  parent: blobService
  name: 'workspace'
  properties: {
    publicAccess: 'None'
  }
}
/*Create a Synapse Analytics workspace*/
resource synapseWorkspace 'Microsoft.Synapse/workspaces@2021-04-01-preview' = if (deploySynapse == true) {
  name: synapsWorkspaceName
  location: resLocation
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    defaultDataLakeStorage: {
      /*I used the datalake resource and can use dot notation to reference information about it. This establishes a dependency.*/
      accountUrl: datalake.properties.primaryEndpoints.dfs
      filesystem: container.name
    }
    sqlAdministratorLogin: synapseSqlAdministratorLogin
    sqlAdministratorLoginPassword: synapseSqlAdministratorLoginPassword
  }
  resource workspaceFirewall 'firewallRules@2021-04-01-preview' = {
    name: 'allowAll'
    properties: {
      startIpAddress: '0.0.0.0'
      endIpAddress: '255.255.255.255'
    }
  }
}

It’s a basic Synapse deployment. The goal is to start deploying using Bicep. We can add things like RBAC assignment for storage access, network configurations on the storage firewalls, and others.

To ship it 🚢 we can use the Azure CLI in the VS Code integrated terminal. The deployment is pretty simple. Log in to your Azure subscription with the Azure CLI. Set your subscription context.

az login
az account list
az account set --subscription 'your-subscription-name-or-id'

Create a resource group in which we want to deploy the resources defined in the .bicep file. Bicep can do this which we will get to another day.

az group create --resource-group 'your-resource-group' --location 'azure-region'
Success.

Deploy the resources at a resource group level specifying the Bicep file path as our template file. Once you submit the terminal will indicate that the deployment is running. We should see a JSON summary output when it’s done similar to our resource group deployment.

az deployment group create --resource-group 'your-resource-group' --template-file 'path-to-your-bicep-file'
Deploying robots.

Checking the deployment in the Azure Portal is simple. Navigate to the resource group. On the ‘Overview’ page, there is a ‘Deployments’

Deployments are deploying.

If you keep following the trail, you end up at the deployment detail screen:

It’s working… It’s working! 🚀

We can validate the resources are deployed in the Azure Portal:

Deployed!

Let’s close off one of our issues and see what it does to the milestone:

Milestone achieved ✅

Awesome! To clean up, just delete the resource group 👍

az group delete --resource-group 'your-resource-group'

Interesting finds

A nice capability for users coming from an ARM background is that you can use Bicep build to have it build the ARM template 😉.

az bicep build --file 'path-to-your-bicep-file'
Building ARMs from Biceps lol

If you have ARM templates, you can try out the Bicep decompile functionality to TRY (it’s not perfect, so no guarantees) to convert your ARM templates to Bicep files.

az bicep decompile --file 'path-to-your-arm-template-file'

Wow! Another lengthy post. Thanks for sticking around. We covered some serious ground. We learnt a bunch and kept building a foundation for our future work. In future posts we can tackle things like modules, advanced resource deployments, and deploying using GitHub actions which should be fun.

🐜

End-to-End Analytics Platform – GitHub Setup

TLDR; I set up a GitHub repo for my learning project to build an End-to-End Analytics Platform. Used a GitHub project as reference (Atom). Cloned my repo with VS Code. Closed the loop with the GitHub Flow by creating an issue (GitHub Guides – Mastering Issues), then a branch, changed/committed stuff, pushed changes, created a pull request, merged the changes. Found an exciting learning tool (GitHub Lab | How it works), project (GitHub Training Kit), and VS Code GitHub integration along the way.

As part of my series on learning to build an End-to-End Analytics Platform project, here are a few things I want to get set up to get started:

  • README.md – project on a page 📄.
  • .gitignore file – ignores files starting with VS Code using the gitignore extension.
  • docs folder – using this to store documents related to the project.
  • src folder – planning to use this as the root folder for code in the repo.
  • test folder – planning to use this as the root folder for unit tests.
  • actions folder – as far as I can tell at the moment for GitHub action-related files.
  • GitHub Project – using this to track my work issues over time.

How did I figure out that’s how I wanted to setup the repo you ask? I took my inspiration from the atom/github: Git and GitHub integration for Atom repository. Why? Well, I am starting my learning for GitHub through a LinkedIn Learning course Learning GitHub and that was the repo showcased in intro videos.

When I grow up I’ll work on using an approach outlined by Rob Sewell who has a fabulous beard, and some next-level skills: How to fork a GitHub repository and contribute to an open source project – Rob Sewell – SQLDBAWithABeard. For now, no forking the repo. The plan is to use VS Code with my development environment setup to clone the repo locally, set up a new remote/local branch to get into the swing of feature branching, then make the changes and complete a very simple pull request.

Creating and cloning the repo was easy enough using VS Code. Now I didn’t do everything in VS Code purely because I want to get used to a few things first. So I manually grabbed the Clone URL from the GitHub repo.

Screen clip of GitHub clone URL.
Grabbing the repository URL for cloning.

All by the book so far. I opened up the Command Palette with Ctrl+Shift+P. Started typing ‘git clone’ and selected the ‘Git: Clone’ operation.

Using VS Code command palette to start Git clone.
Use the VS Code command palette to start Git Clone.

Pasted in the repo URL.

Passing in the repository URL to clone in VS Code.
Paste the URL we copied earlier.

I was prompted to choose a repository location to clone the repo to. I chose to go with a folder that I can use for future repos, something like <your path>\github\<your repo>. Once that’s done, we get prompted to open up the repo.

Open repository pop-up in VS Code.
VS Code picks up I did something.

Great! Repo has been cloned locally and the files are there. Local repository has been initialised and we are on the local main branch (see bottom left).

Enhance. We have our files.

Side note: I set the local repo user configs so that Git uses the right user name and email when committing to my remote GitHub repo:

git config --local user.name 'example'
git config --local user.email 'example@domain.com'

Before creating feature branches I want to track my work for updating the project structure and the documentation. To do that we create a ‘Project Board‘:

Create a new project from the Projects tab.

Give it a name, description, and if you want to use a template go for it. I opted for a ‘basic kanban’ project template. As I go along learning how all the triggers and things work this will evolve into something more sophisticated.

First look at the new spiffy project board.

It’s beginning to take shape, nice! Once that is done I need to add a work item to track my work against, to do that we create an ‘Issue‘:

Adding a new issue from the project Issues tab.

Fill it with information. Assign it to myself. Then label it with a ‘documentation’ label. Then hit the ‘Submit new issue’ button. Awesome. We now have a new issue that we can discuss, subscribe to for notifications, assign to projects, milestones, and more.

Posted new issue.

Once the issue is logged, I jumped over to the Project Board and added the ‘card’ to the ‘To Do’ swim lane. Now it automatically links up to this project.

Adding a card to the Project Board.

Switching over to the Code tab, we create a branch from the main branch in GitHub so that we can make our changes to that branch.

Creating a new branch from the main branch.

Once the branch is created we need to sync our local repo with a Git Pull to get the latest changes, one of which is the addition of the new branch. The next step is to switch to, or ‘check out’, the branch we just created. Open up the Command Palette with Ctrl+Shift+P, start typing ‘git checkout’, and select the ‘Git: Checkout to..’ operation. Then pick the feature branch you want to switch to.
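
For anyone who prefers the terminal, the pull and checkout steps above can be sketched roughly as follows. To keep it runnable without touching GitHub, a local bare repository stands in for the GitHub remote, and the branch name skeleton-demo is hypothetical. Both are assumptions for illustration only.

```shell
set -e
work="$(mktemp -d)"

# --- Scaffolding: a local bare repo stands in for GitHub (assumption) ---
git init --quiet --bare "$work/remote.git"
git --git-dir="$work/remote.git" symbolic-ref HEAD refs/heads/main

# Seed the stand-in remote with one commit on main.
git init --quiet "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/main
git config --local user.name 'example'
git config --local user.email 'example@domain.com'
printf '# demo\n' > README.md
git add README.md
git commit --quiet -m 'Initial commit'
git remote add origin "$work/remote.git"
git push --quiet origin main

# Our working clone, made before the feature branch exists.
git clone --quiet "$work/remote.git" "$work/local"

# Mimic 'create a branch in the GitHub UI': a new branch appears on the remote.
git push --quiet origin main:refs/heads/skeleton-demo

# --- The steps from the post, on the command line ---
cd "$work/local"
git pull --quiet                     # sync; also fetches the new remote branch
git branch --remotes                 # origin/skeleton-demo should now be listed
git checkout --quiet skeleton-demo   # creates a local branch tracking origin/skeleton-demo
git rev-parse --abbrev-ref HEAD      # prints the current branch name
```

The plain `git checkout skeleton-demo` works here because exactly one remote has a branch with that name, so Git sets up the local tracking branch for us.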

Switching to the feature branch.

Next up. Make changes.

I added all the files and folders that I mentioned earlier. A noteworthy mention though was adding a .gitignore file using the gitignore extension which pulls .gitignore templates from the https://github.com/github/gitignore repository. Awesome. Didn’t know there was a repo full of .gitignore templates.

Installing the gitignore extension.

Open up the Command Palette with Ctrl+Shift+P, start typing ‘add gitignore’, and select the ‘Add gitignore’ operation.

Using the Command Palette to add a gitignore file.

Then just choose a template. In my case that’s ‘VisualStudioCode’.

Adding a gitignore file for Visual Studio Code.

Nice! That was pretty easy.

gitignore file added with definition.

Next up, push it to the remote. Using the menu on the left, I switched to the Source Control view (Ctrl+Shift+G), added a commit message, then hit the commit button. In my case I went for the ‘stage all changes and commit them directly‘ option.

Committing the changes.

Now notice at the bottom of the screen in the status bar we can see we have a change waiting to be pushed to the remote repository in GitHub.

Awaiting the big push to the cloud.

Just click on that. Follow the prompts and VS Code will push your changes up to the remote repository. In future with branches we will follow this up with a pull request.
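
The same stage, commit, and push sequence can be sketched on the command line. As an assumption to keep the example self-contained, a local bare repository stands in for the GitHub remote, and the branch name skeleton-demo and the commit message are hypothetical.

```shell
set -e
work="$(mktemp -d)"

# Stand-in for the GitHub remote: a local bare repository (assumption).
git init --quiet --bare "$work/remote.git"

# A local repo sitting on a hypothetical feature branch.
git init --quiet "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/skeleton-demo
git config --local user.name 'example'
git config --local user.email 'example@domain.com'
git remote add origin "$work/remote.git"

# Make a change, then stage everything and commit it
# (the CLI version of 'stage all changes and commit them directly').
printf '# End-to-End Analytics Platform\n' > README.md
git add --all
git commit --quiet -m 'Add project skeleton'

# Push the branch and set its upstream so later pushes need no arguments.
git push --quiet --set-upstream origin skeleton-demo

# The remote should now list the branch with the same commit as the local one.
git ls-remote --heads origin skeleton-demo
```

Setting the upstream on the first push is what lets a later plain `git push` or `git pull` know which remote branch to talk to.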

Once that’s done, we jump back to GitHub and we can see our branch is updated with the files we added. The next step is to finish off the flow by creating a Pull Request and merging our changes back into the main branch. To start that process we click on the ‘Compare & pull request’ button.

Starting the review and pull request process.

As part of the pull request we fill in as much information as we can and link any issues that we want to close automatically using keywords (for example, writing ‘Closes #<issue number>’ in the description). Once we have filled everything out and we are happy there are no conflicts, we can click on the ‘Create pull request’ button.

Filling out the pull request and closing issues.

We get a valuable amount of rich history related to the work we have done here. Looking through the details I see I have no conflicts either. Let the merge begin.

Merging.

The merge is successful. In this case the work I have done on the ‘anthonyfourie-skeleton’ branch is complete. I don’t need that branch anymore so I am going to delete it.

Deleting the branch after a successful merge.

That’s it! Flipping back to the Issues tab we can see our issue has been closed. Whoop there it is!

Issue closed. Work complete.

All in all I think that was a pretty good start. Looking forward to the next post where I think I will tackle spinning up some basic infrastructure by using infrastructure as code.

๐Ÿœ