Beginner SQL Tutorial for Marketers

Share on facebook
Share on twitter
Share on linkedin

Marketers can struggle with accessing the data regarding their customers. After this SQL tutorial, you can learn more about your customers first hand and not rely on analysts or the development team to pull information.

There are multiple types of SQL databases, but we will be using PostgreSQL as it is the one I’m most well versed in.

Inspect All the Tables

When you are starting to understand a database you are not aware of, it is key to inspect the tables involved. Without knowing what the tables are in the database, you won’t be able to inspect the individual columns of each table.

I always recommend inspecting the tables first so that you can then select all their data and start understanding the structure of the database.

In PostgreSQL, you can look at all the tables by executing the following statement. This line of code will select all of the table names except for the metadata about the entire database. It is incredibly useful when starting to get the lay of the land of your database.

SELECT * FROM pg_catalog.pg_tables WHERE schemaname != 'pg_catalog' AND schemaname != 'information_schema';

Source: https://www.postgresqltutorial.com/postgresql-show-tables/

We can now start looking into each individual table and create tables that let us learn more about our database.

Select All Data from a Table

At the core of SQL is the SELECT statement. SELECT allows you to pull specific columns out of a database.

By starting a query with the SELECT keyword, you can now specify what columns you want to receive followed by the FROM keyword and your database name.

In order to pull all of the data from our users table, we will pass the star/asterisk in between the SELECT keyword and FROM keyword.

SELECT * FROM users;

This quick query will return all of your data from that specific table. You can try it on any table in your database by simply using this template.

SELECT * FROM [table_name]

Select Specific Column from a Table

If you want to keep your query manageable, it is a best practice to limit your columns rather than using the asterisk. If a table has more than twenty columns, it can become difficult to look at the results of your query.

As we join more and more tables, it can become overwhelming if you include all of the columns without filtering them.

Thankfully, you can reduce the number of columns by simply specifying the exact columns you are looking for.

Assuming your user has an email, address, and phone number store in your database you can access those columns in your user table by simply replacing the asterisk with those column names.

SELECT email, address, phone_number FROM users;

This allows you to get simply those three columns instead of all the columns in the database. By reducing the amount of columns, it makes it that much easier when you combine two tables together to see only the information you want to see.

Filter By a Column Value

If you wanted to find a specific user by email address or a group of users, then you can use the WHERE keyword. The WHERE keyword allows you to only see the values in a database where a column has a specific value or contains a specific value.

Let’s say we wanted to find one specific user using an email. We could call the following SQL query to pull all of their information.

SELECT * FROM users WHERE email = "kevin@scriptsformarketers.com";

This would return all of the columns of data in the users table that is associated with my user object.

Alternatively, you could query for all users with an address that includes a specific state. We can do this by using the ILIKE keyword.

The ILIKE keyword is used with the WHERE keyword instead of an equal sign to look for substrings within the column value. This means that if you are looking for all users with AZ as the state in their address you could execute the following query.

SELECT * FROM users WHERE address ILIKE "AZ";

This will return multiple values in your query of everyone with the Arizona State symbol in their address. This is helpful for user-inputted strings where they are not all exact values.

Overall, the WHERE keyword is very useful to filter down your query by a specific value or find values associated with a specific user.

Joining Tables Together

Let’s assume we have two tables that are our users table and our transactions table. In order to know which transactions are assigned to which users, most database tables have a foreign key and primary key.

A primary key is a unique identifier that each table record has to refer to that unique record. For a user, they may have a unique user id.

This can be referenced within another table record using a foreign key. A foreign key is a unique identifier of another table. Each transaction record may have a column called “user_id” where it holds the value of the user’s primary key that is associated with.

This is critical in order to ensure that you can accurate join SQL tables.

SELECT * from transactions JOIN users ON transactions.user_id = users.user_id;

By using the JOIN keyword, we will connect the two database tables of users and transactions together for every single value that their user_id is exactly the same. This means that each transaction can also have the address and email column from users if needed.

Aggregate Function in SQL

Now that we have a merged table of both users and their transactions, it may be helpful to find the total revenue generated, average cart value, and average order size.

SUM(cost) FROM transactions;

Assuming the cost column is the revenue from the transaction, this query will return the total amount of revenue generated by all transactions. We could filter it using the WHERE keyword to see the sum of revenue for specific segments.

In order to find the average revenue per transaction, we can use the AVG keyword.

AVG(cost) FROM transactions;

This query will return the average amount of revenue gained per transaction. If you wanted to see the average order size, you could run the following query:

AVG(quantity) FROM transactions;

Finally, you can count the total amount of items purchases across all transactions by using the COUNT keyword.

COUNT(quantity) FROM transactions;

You now can aggregate the values of your database to gain insights on your customers. That being said, taking aggregates of the entire database can only be so helpful. In the next step, we’re going to show you how to segment your database to identify aggregates for certain segments.

Group Data By a Column

In order to create segments in SQL, we need to group our data by a specific column. This is where we can start developing interesting analyses about the purchasing behavior of our users.

Let’s say that we wanted to see if different age groups were buying more or less of our product per purchase.

We would first merge our tables using the JOIN keyword.

SELECT * from transactions JOIN users ON transactions.user_id = users.user_id;

From there, we would use the GROUP BY keyword to group the transactions by the user’s age. We will need to include the values that we group by in the query after the SELECT statement followed by any aggregate method like COUNT, AVG, or SUM. This will return the aggregates for those transactions grouped by the user’s age. This query could give you insights into what age group best fits your product.

SELECT users.age, COUNT(transactions.quantity) AS Total_Item_Count, AVG(transactions.quantity) AS Average_Transaction_Size, SUM(transactions.cost) AS Total_Revenue FROM transactions JOIN users ON transactions.user_id = users.user_id GROUP BY users.age;

Age Total Item Count Average Transaction Size Total Revenue
20-30 1,287 $26 $33,462
30-40 873 $5 $4,365
40-50 752 $15 $11,280
60+ 501 $20 $10,020
An example of the Group By Query above

Order By a Column Value

Now that you have Grouped your transactions by age group, it may be helpful if you were able to see the top 2 age groups by total revenue. While this isn’t an issue on the example above, it may be a problem for a query that has 1,000 rows.

You can simply use the ORDER BY keyword to order the entire query by the Revenue Column.

SELECT users.age, COUNT(transactions.quantity) AS Total_Item_Count, AVG(transactions.quantity) AS Average_Transaction_Size, SUM(transactions.cost) AS Total_Revenue FROM transactions JOIN users ON transactions.user_id = users.user_id GROUP BY users.age ORDER BY Total_Revenue;

By adding the “ORDER BY Total_Revenue” to the end of the query we get a table that looks more like this.

Age Total Item Count Average Transaction Size Total Revenue
20-30 1,287 $26 $33,462
40-50 752 $15 $11,280
60+ 501 $20 $10,020
30-40 873 $5 $4,365
An example of the Order By Query above

Notice that we now have the age groups in order of their total revenue.

If you only wanted to return the top two of this query, you can add LIMIT 2 at the end of the query and only return the first two rows.

SELECT users.age, COUNT(transactions.quantity) AS Total_Item_Count, AVG(transactions.quantity) AS Average_Transaction_Size, SUM(transactions.cost) AS Total_Revenue FROM transactions JOIN users ON transactions.user_id = users.user_id GROUP BY users.age ORDER BY Total_Revenue LIMIT 2;

Age Total Item Count Average Transaction Size Total Revenue
20-30 1,287 $26 $33,462
40-50 752 $15 $11,280
An example of the LIMIT 2 Query above

Setting Up a Database in PyCharm

In order to actually use SQL, I recommend you set up your Database on PyCharm. It is an IDE specific to Python which this blog focuses on.

Once you have PyCharm installed, you will want to look to the right hand and see an orange icon that says “Database”.

After clicking on that, a blank pane will show. You want to click the blue plus button.

After clicking on that icon, you will see a menu with the first option being the Red “Data Source”.

Hover over it and click PostgreSQL or whatever data source you have.

You will see the following page:

 

Adding a Database in PyCharm

When you ask your developer team for read only access to your database, they will either send you a URL that you would put in the URL at the bottom of the screen or they will give you all the pieces you need including the following:

  • User
  • Password
  • Database
  • Host
  • Port

Simply put those values into the values seen here and you will be able to test the connection. Once it shows that you have access with a green checkmark, you can open a new SQL command screen and start querying for data.

This is a beginners overview of SQL; however, you can dive deeper by looking up more resources of SQL. I will do a deeper dive soon to focus on the intermediate ways to use SQL. Happy querying!