
How to send emails with an SMTP server in Azure Databricks

A client asked us to provide a simple form of monitoring for part of a solution we had delivered. The data platform we developed for them ingested a source that was subsequently used by a business team and by our client’s own clients. For this particular source, our client asked us to send a simple email with record counts to a mailing list. No problem! Let’s get to work.

SMTP and Databricks

There are plenty of options you can explore to get this working. In this particular case, we were working with Azure components – mostly Azure Data Factory and Azure Databricks with a Python cluster – and we were looking for a quick solution with some flexibility. We opted to use SendGrid’s SMTP server in our Python Databricks scripts. Given that it’s a free, third-party service, we’re of course not going to be sending company secrets over it. A simple email with record counts, however, is not a problem.

Three easy steps

1. Set up your SMTP server

The first step is setting up your SMTP server. With SendGrid this was very easy. We created an account, set up a sender email address, created a login and generated an API key (which serves as the SMTP password later on). The process is largely self-evident and takes maybe five minutes.

2. Install a library on your Databricks cluster

Next, check whether the libraries you need are available on your Databricks cluster. One note here: smtplib and the email classes used below are part of Python’s standard library, so on a Databricks Python cluster they’re available out of the box and nothing needs to be installed. If you do need a third-party library, download the whl file from PyPI.org. Then, in Databricks, click ‘Clusters’ in the sidebar on the left, click on your cluster and finally ‘Install New’ under ‘Libraries’. Upload the whl library while making sure you’ve selected the correct library type and you’re good to go.
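If you want to verify this from a notebook first, here’s a minimal sketch: it confirms that smtplib imports without any installation, and the commented %pip line shows the notebook-scoped alternative for packages that genuinely are missing (the package name is just a placeholder):

# smtplib ships with Python's standard library, so this import should succeed without any install
import smtplib
print(smtplib.SMTP)

# For genuinely missing third-party packages, recent Databricks runtimes
# also support notebook-scoped installs, e.g.:
# %pip install some-package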

3. Create the right function

To actually get the mail sent, you need to create a function to send emails and call it where needed. You’ll find the code for the function and an example call below. Make sure to set the SMTP server and port to the correct settings for your provider, and don’t forget to fill in the proper names of the Azure Key Vault secrets you need (we’ve redacted them for obvious reasons). Of course, this implies that these secrets exist in the first place, so create those as well if you haven’t already. That’s it, nothing more to it. You can now send emails through an SMTP server from Databricks.
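Before wiring everything up, you can check from a notebook that the secret scope and its secrets actually exist. A small sketch using the standard Databricks secrets utilities; the scope name matches the one used in the function below:

# List the secret scopes available to this workspace
print(dbutils.secrets.listScopes())

# List the secret names inside the scope used by the function (the values themselves stay redacted)
for secret in dbutils.secrets.list("key-vault-secrets"):
  print(secret.key)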

The function

We’d like to think the code is quite readable, but in short: we import the smtplib library and the MIME classes covered in step 2. Then we define our function; I’d suggest putting this in a separate notebook that you can call on when needed. Finally, we make use of the function in any notebook we want.

Defining the function

# Send an email through SendGrid

import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def SendEmail(recipient, subject, message):
  server = smtplib.SMTP('smtp.sendgrid.net', 587) # check server and port with your provider
  server.ehlo()
  server.starttls() # upgrade the connection to TLS before logging in
  server.login("apikey", dbutils.secrets.get(scope = "key-vault-secrets", key = "")) # insert secret name
  sender = dbutils.secrets.get(scope = "key-vault-secrets", key = "") # insert secret name

  # Build the message from its parts
  msg = MIMEMultipart()
  msg['Subject'] = subject
  msg['From'] = sender
  msg['To'] = recipient
  msg.attach(MIMEText(message))

  server.sendmail(sender, recipient, msg.as_string())
  server.close()
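One note if you want to mail an actual list of recipients, as in our original request: smtplib.SMTP.sendmail accepts a list of addresses as well as a single string. A minimal variation of the function above, assuming you pass the addresses in as a Python list:

# Variation for a mailing list: 'recipients' is a list of address strings
def SendEmailToList(recipients, subject, message):
  server = smtplib.SMTP('smtp.sendgrid.net', 587) # check server and port with your provider
  server.ehlo()
  server.starttls()
  server.login("apikey", dbutils.secrets.get(scope = "key-vault-secrets", key = "")) # insert secret name
  sender = dbutils.secrets.get(scope = "key-vault-secrets", key = "") # insert secret name

  msg = MIMEMultipart()
  msg['Subject'] = subject
  msg['From'] = sender
  msg['To'] = ", ".join(recipients) # the header expects one comma-separated string
  msg.attach(MIMEText(message))

  server.sendmail(sender, recipients, msg.as_string()) # the envelope takes the list itself
  server.close()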

Calling the notebook that defines the function, then calling the function itself

# Change the path according to your Databricks setup
%run /Shared/YourFolder/NotebookHoldingFunction

recipient = dbutils.secrets.get(scope = "key-vault-secrets", key = "") # insert secret name
message = "Your message here"
subject = "Your subject here"

SendEmail(recipient, subject, message)
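And to tie it back to the original request, here is how the record-count email could be assembled. A sketch under assumptions: the table name ingested_source_table is hypothetical, and the count comes straight from a Spark DataFrame:

# Hypothetical table name: replace it with the source you want to monitor
df = spark.table("ingested_source_table")
record_count = df.count()

subject = "Daily ingest: record counts"
message = f"The source was ingested with {record_count} records."

SendEmail(recipient, subject, message)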

Meeting business demands quickly

Now, Spider-Man’s Uncle Ben told us that with great power comes great responsibility. So, having developed this power of sending emails through Databricks, we must ask ourselves: is it the right way to go? We discussed this part of the project with Competence Leader Ronny. He raised the valid point that sending these kinds of emails is typically something you’d do from the controlling process or component. In our case, this would be Azure Data Factory.

It’s something to discuss with our client. That alternative would take a bit longer, though, both in planning and execution. But we needed to tide the business over as soon as possible, so we chose this quick and flexible solution. We’re not trying to milk a cow with our hands in our pants: the show can go on. And should we decide to go in the direction suggested by our dear colleague, we now have the time to set it up properly. Great!
