Introduction
When you first create a table in an SQL database, you must define its overall structure by listing out each column you want the table to hold and what kind of data those tables will store. Then, when you add data to the table, the values you insert must align with the data types you defined for each respective column. By forcing you to enter values that align with the table’s predefined structure, an SQL database can help you ensure that you don’t enter any values incorrectly. However, this rigid structure can make things difficult when you’re trying to compare two values that have different data types or when you’re trying to combine values from multiple columns into a single resulting value.
In this tutorial, you’ll learn how to manipulate data with CAST
functions to change the data type of a value or set of values into another, and use the concatenation expression to string character and numerical data values together. You’ll also practice running the CAST
function and concatenation expression in the same query to result in a complete statement.
Prerequisites
To complete this tutorial, you will need:
- A server running Ubuntu 20.04, with a non-root user with
sudo
administrative privileges and firewall enabled. Follow our Initial Server Setup with Ubuntu 20.04 to get started. - MySQL installed and secured on the server. Follow our How To Install MySQL on Ubuntu 20.04 guide to set this up. This guide assumes you’ve also set up a non-root MySQL user, as outlined in Step 3 of this guide.
Note: Please note that many relational database management systems use their own unique implementations of SQL. Although the commands outlined in this tutorial will work on most RDBMSs, the exact syntax or output may differ if you test them on a system other than MySQL.
To practice using CAST
functions and concatenation expressions in this tutorial, you’ll need a database and table loaded with sample data. If you do not have one ready to insert, you can read the following Connecting to MySQL and Setting up a Sample Database section to learn how to create a database and table. This tutorial will refer to this sample database and table throughout.
Connecting to MySQL and Setting up a Sample Database
If your SQL database runs on a remote server, SSH into your server from your local machine:
Next, open the MySQL prompt, replacing sammy
with your MySQL user account information:
Create a database named castconDB
:
If the database was created successfully, you’ll receive the following output:
Query OK, 1 row affected (0.01 sec)
To select the castconDB
database run the following USE
statement:
Database changed
After selecting the database, create a table within it. For this tutorial’s example, we’ll create a table that holds the grades for Mr. Frank’s sixth-grade fall semester class. This table will hold the following 15 columns:
student_id
: displays the values of theint
data type and will serve as the table’s primary key, meaning each value in this column will function as a unique identifier for its respective row.first_name
: uses thevarchar
data type with a maximum of 20 characters to hold the first names of the students.last_name
: holds the last names of the students with thevarchar
data type, again at a maximum of 20 characters.email_address
: uses thevarchar
data type with a maximum of 30 characters to store each student’s email.participation_grade
: shows each student’s total participation grade with theint
data type.attendance_grade
: uses theint
data type to display each student’s attendance grades.midterm_deadline
: uses theTIMESTAMP
data type to represent the deadline each student must submit their midterm exams by. This data type combines both date and time in one string and uses the following format: (YYYY-MM-DD HH:MM:SS
).midterm_submitted
: records the exact day and time students submitted their midterm exam with theTIMESTAMP
data type.midterm_grade
: uses thedecimal
data type to specify each student’s grade for their midterm exam. This column declaration specifies a precision of four with a scale of one, which means values in this column can have four digits, with one of those digits being to the right of the decimal point.essay_deadline
: displays the time and date students must submit their essays using theTIMESTAMP
data type.essay_submitted
: uses theTIMESTAMP
data type to track when students submitted their essay assignment.essay_grade
: holds each student’s essay grade with thedecimal
data type and a precision of four and a scale of one, with the latter being to the right of the decimal point.finalexam_deadline
: stores the deadline information for the final exam using theTIMESTAMP
data type.finalexam_submitted
: uses theTIMESTAMP
data type to record students’ actual time and date submissions for their final exam.finalexam_grade
: contains each student’s grade for their final exam with thedecimal
data type that has a precision of four stored digits and a scale of one digit to the right of the decimal point.
Create a table named fall_grades
that has each of these columns by running the following CREATE TABLE
command:
Next insert some sample data into the empty table:
Query OK, 8 rows affected (0.01 sec)
Records: 8 Duplicates: 0 Warnings: 0
Once you’ve inserted the data, you’re ready to start learning how to use CAST
functions and concatenation expressions in SQL.
Using CAST Functions
The CAST
function allows you to convert a literal value or the values held within a column into a specific data type. Using CAST
functions can be helpful for ensuring that the data types for values in a given expression are compatible.
To use the CAST
command, you must specify the column(s) holding existing data values that you want to convert and then write your preferred data values and or length in the expression. Here’s an example of the syntax:
. . .CAST(existing data value AS desired data value) . . .
It’s important to note, however, that CAST
functions adhere to certain rules in order to run correctly. For instance, it’s important to ensure that the data type you want to convert is compatible with the one you want to convert it to. Referring to the sample data, you wouldn’t be able to convert the values in the finalexam_grade
from a numerical data value (in this case the decimal
data type) directly into a string value such as a letter grade using the CAST
function. Similarly, you cannot convert data types that have a maximum length (such as the example columns which hold varchar(30)
values) to a longer length (for instance, varchar(35)
).
It’s also important to note that different SQL implementations will behave differently when running queries that use CAST
functions to convert data types. Running a query with a CAST
function in MySQL could produce different results than running the same query in PostgreSQL, for example.
To better understand how you can use CAST
functions, imagine the following scenario based on the sample data you inserted in the previous step. Mr. Frank, a sixth-grade teacher at PS 118, is getting ready to submit his grades for the fall semester. He’s been diligently tracking each student’s progress and is especially concerned about their grades for the midterm exam, essay, and final exam. Let’s say you’re helping out Mr. Frank as a teaching assistant, and he’s asked you to provide him with the information for those graded assignments.
You’ll retrieve this data by querying the fall_grades
table to retrieve the relevant information, such as the student’s first_name
, last_name
, and their grades for each assignment:
+------------+-----------+---------------+-------------+-----------------+
| first_name | last_name | midterm_grade | essay_grade | finalexam_grade |
+------------+-----------+---------------+-------------+-----------------+
| Arnold | Shortman | 85.8 | 90.1 | 82.5 |
| Helga | Pataki | 88.4 | 72.5 | 90.0 |
| Gerald | Johanssen | 94.2 | 95.8 | 88.1 |
| Phoebe | Heyerdahl | 98.8 | 90.4 | 100.0 |
| Harold | Berman | 75.7 | 67.5 | 90.9 |
| Eugene | Horowitz | 100.0 | 89.9 | 98.2 |
| Rhonda | Lloyd | 90.4 | 81.3 | 95.5 |
| Stinky | Peterson | 70.6 | 93.1 | 73.2 |
+------------+-----------+---------------+-------------+-----------------+
8 rows in set (0.00 sec)
After bringing this result set to Mr. Frank, he explains that the system he’s using only allows him to input grades as whole numbers and needs these decimal data values to be converted. You decide to use a CAST
function to convert the number of character values specified (in this case the four in the decimal
data type), into two character values instead.
For this query, use the syntax from the previous example but include a CAST
function to convert the decimal
data type to two characters for each of the graded assignments. You’ll implement a CAST
function in three different expressions (for midterm_grade
, essay_grade
, and finalexam_grade
) to convert them to only 2 character data values:
+------------+-----------+---------+-------+-----------+
| first_name | last_name | midterm | essay | finalexam |
+------------+-----------+---------+-------+-----------+
| Arnold | Shortman | 85 | 90 | 82 |
| Helga | Pataki | 88 | 72 | 90 |
| Gerald | Johanssen | 94 | 95 | 88 |
| Phoebe | Heyerdahl | 98 | 90 | 10 |
| Harold | Berman | 75 | 67 | 90 |
| Eugene | Horowitz | 10 | 89 | 98 |
| Rhonda | Lloyd | 90 | 81 | 95 |
| Stinky | Peterson | 70 | 93 | 73 |
+------------+-----------+---------+-------+-----------+
8 rows in set, 24 warnings (0.00 sec)
After reviewing each student’s grades, Mr. Frank asks if you can pull information about the exact dates and times each student submitted their assignments.
To retrieve this data, run the following SELECT
statement:
+------------+-----------+---------------------+---------------------+---------------------+
| first_name | last_name | midterm_deadline | essay_deadline | finalexam_deadline |
+------------+-----------+---------------------+---------------------+---------------------+
| Arnold | Shortman | 2022-10-16 12:00:00 | 2022-11-20 12:00:00 | 2022-12-11 12:00:00 |
| Helga | Pataki | 2022-10-16 12:00:00 | 2022-11-20 12:00:00 | 2022-12-11 12:00:00 |
| Gerald | Johanssen | 2022-10-16 12:00:00 | 2022-11-20 12:00:00 | 2022-12-11 12:00:00 |
| Phoebe | Heyerdahl | 2022-10-16 12:00:00 | 2022-11-20 12:00:00 | 2022-12-11 12:00:00 |
| Harold | Berman | 2022-10-16 12:00:00 | 2022-11-20 12:00:00 | 2022-12-11 12:00:00 |
| Eugene | Horowitz | 2022-10-16 12:00:00 | 2022-11-20 12:00:00 | 2022-12-11 12:00:00 |
| Rhonda | Lloyd | 2022-10-16 12:00:00 | 2022-11-20 12:00:00 | 2022-12-11 12:00:00 |
| Stinky | Peterson | 2022-10-16 12:00:00 | 2022-11-20 12:00:00 | 2022-12-11 12:00:00 |
+------------+-----------+---------------------+---------------------+---------------------+
8 rows in set (0.00 sec)
After reviewing this output, Mr. Frank sighs and tells you this information is too difficult to parse through. These columns are all set to store TIMESTAMP
values, which is why they’re extremely long. You decide to use the CAST
function to convert the output into something more digestible and also to break up the query into two: one for the dates and one for the times.
To query only for the time students submitted their assignments, run CAST
and specify that you want data values from that particular column to be converted into time
values:
+------------+-----------+----------+----------+-----------+
| first_name | last_name | midterm | essay | finalexam |
+------------+-----------+----------+----------+-----------+
| Arnold | Shortman | 06:30:00 | 03:00:00 | 03:00:00 |
| Helga | Pataki | 10:00:00 | 03:15:00 | 05:00:00 |
| Gerald | Johanssen | 02:00:00 | 02:45:00 | 11:00:00 |
| Phoebe | Heyerdahl | 11:00:00 | 11:15:00 | 11:40:00 |
| Harold | Berman | 08:00:00 | 09:15:00 | 09:15:00 |
| Eugene | Horowitz | 01:00:00 | 01:22:00 | 07:55:00 |
| Rhonda | Lloyd | 06:00:00 | 06:09:00 | 06:45:00 |
| Stinky | Peterson | 03:00:00 | 05:55:00 | 10:11:00 |
+------------+-----------+----------+----------+-----------+
8 rows in set (0.00 sec)
This output provides an overview of the time frames each student finished their assignments. You recall that every assignment is due on a Sunday at midnight. Several students were consistent with their timing, while others finished earlier, or finished slightly before the midnight deadline.
However, this is only half of the information Mr. Frank asked for, so let’s work on the next query which will use the CAST
function to convert those same TIMESTAMP
values to date
values. You’ll run the same query as before, but this time, replace time
with date
:
+------------+-----------+------------+------------+------------+
| first_name | last_name | midterm | essay | finalexam |
+------------+-----------+------------+------------+------------+
| Arnold | Shortman | 2022-10-16 | 2022-11-20 | 2022-12-11 |
| Helga | Pataki | 2022-10-16 | 2022-11-21 | 2022-12-11 |
| Gerald | Johanssen | 2022-10-16 | 2022-11-20 | 2022-12-11 |
| Phoebe | Heyerdahl | 2022-10-16 | 2022-11-20 | 2022-12-11 |
| Harold | Berman | 2022-10-16 | 2022-11-22 | 2022-12-11 |
| Eugene | Horowitz | 2022-10-16 | 2022-11-20 | 2022-12-11 |
| Rhonda | Lloyd | 2022-10-16 | 2022-11-20 | 2022-12-11 |
| Stinky | Peterson | 2022-10-16 | 2022-11-20 | 2022-12-11 |
+------------+-----------+------------+------------+------------+
8 rows in set (0.00 sec)
Based on this output, you can identify which students turned in their assignments after the deadline and why their grades may have been impacted due to fewer points given for tardiness. For example, Helga turned in her assignment a day late (2022-11-21
) and Harold turned in his assignment two days late (2022-11-22
) after the 2022-11-20
deadline for the essay assignment.
Although Mr. Frank is pleased with these cleaned-up results, he needs a bit more help articulating the data further for his grades report. In the next section, you’ll practice using concatenation expressions that can combine multiple literal values or column values into a single string value, helping to make the information clearer to interpret as a complete statement or sentence.
Using Concatenation Expressions
With the concatenation expression, CONCAT
, you can manipulate data by bringing character or numerical values together from different columns into a single result.
Generally, SQL databases return the data values in result sets separately in their respective columns. For example, if you were to query for the first_name
and last_name
of the PS 118 students, your output would display as the following:
+------------+-----------+
| first_name | last_name |
+------------+-----------+
| Arnold | Shortman |
| Helga | Pataki |
| Gerald | Johanssen |
| Phoebe | Heyerdahl |
| Harold | Berman |
| Eugene | Horowitz |
| Rhonda | Lloyd |
| Stinky | Peterson |
+------------+-----------+
8 rows in set (0.00 sec)
However, this information isn’t formatted in the way Mr. Frank prefers for the report. Let’s execute another query using concatenation to combine the first and last names of the students into a single string. The following query does this using the CONCAT
keyword and also provides the resulting column with the alias full_names
:
+-----------------+
| full_names |
+-----------------+
| ArnoldShortman |
| HelgaPataki |
| GeraldJohanssen |
| PhoebeHeyerdahl |
| HaroldBerman |
| EugeneHorowitz |
| RhondaLloyd |
| StinkyPeterson |
+-----------------+
8 rows in set (0.00 sec)
Concatenation expressions generally work on all data types, but if you don’t specify details such as spacing between data values, the output will appear as a run-on in a single string as evidenced by this output. To fix this, you can add a pair of single quotes with a space between them (' '
) in between the two first_name
and last_name
columns so that the values will output as a single string, but this time with a space between them to make it more readable:
+------------------+
| full_name |
+------------------+
| Arnold Shortman |
| Helga Pataki |
| Gerald Johanssen |
| Phoebe Heyerdahl |
| Harold Berman |
| Eugene Horowitz |
| Rhonda Lloyd |
| Stinky Peterson |
+------------------+
8 rows in set (0.00 sec)
By including a space between the single quotes in the query, the output now shows the students’ names clearly as two separate words, rather than one combined word.
Note: Most modern relational DBMSs use the syntax outlined in this section for concatenating values. However, this syntax (the CONCAT
keyword) is not the traditional syntax defined by the SQL standard.
The traditional way to concatenate values in SQL is to place a pair of vertical bars between the data values you want to concatenate. MySQL doesn’t allow you to use this syntax at all, but some DBMSs, like PostgreSQL, allow you to use either method. The following example query (run on a PostgreSQL database) produces the same result as the previous query, but this time using vertical bars:
full_name
------------------
Arnold Shortman
Helga Pataki
Gerald Johanssen
Phoebe Heyerdahl
Harold Berman
Eugene Horowitz
Rhonda Lloyd
Stinky Peterson
(8 rows)
Now let’s try another example where we’ll retrieve more information about each student. This time, we want to concatenate the first_name
, last_name
, e-mail_address
, finalexam_grade
, and finalexam_submitted
data values into a single column using CONCAT
. Don’t forget to add single quotes between each column you want to add a space between as in the following example:
+-----------------------------------------------------------------+
| student_info |
+-----------------------------------------------------------------+
| Arnold Shortman [email protected] 82.5 2022-12-11 03:00:00 |
| Helga Pataki [email protected] 90.0 2022-12-11 05:00:00 |
| Gerald Johanssen [email protected] 88.1 2022-12-11 11:00:00 |
| Phoebe Heyerdahl [email protected] 100.0 2022-12-11 11:40:00 |
| Harold Berman [email protected] 90.9 2022-12-11 09:15:00 |
| Eugene Horowitz [email protected] 98.2 2022-12-11 07:55:00 |
| Rhonda Lloyd [email protected] 95.5 2022-12-11 06:45:00 |
| Stinky Peterson [email protected] 73.2 2022-12-11 10:11:00 |
+-----------------------------------------------------------------+
8 rows in set (0.00 sec)
Mr. Frank is pleased with these results but would like to make it even more concise for his grades report by converting some of the data values. In this scenario, you’ll use the CAST
function to convert the finalexam_grade
data type into a whole number and the finalexam_submitted
, which uses the TIMESTAMP
data type, into a date
value as in the following:
+-----------------------------------------------------+
| student_info |
+-----------------------------------------------------+
| Arnold Shortman [email protected] 82 2022-12-11 |
| Helga Pataki [email protected] 90 2022-12-11 |
| Gerald Johanssen [email protected] 88 2022-12-11 |
| Phoebe Heyerdahl [email protected] 10 2022-12-11 |
| Harold Berman [email protected] 90 2022-12-11 |
| Eugene Horowitz [email protected] 98 2022-12-11 |
| Rhonda Lloyd [email protected] 95 2022-12-11 |
| Stinky Peterson [email protected] 73 2022-12-11 |
+-----------------------------------------------------+
8 rows in set, 8 warnings (0.00 sec)
Let’s take it a step further and write a query that uses the CAST
function and the concatenation expression to output full sentences. You can do this by writing a short statement between the single quotes. Be sure to maintain a space between each column by adding one space before and after the end of the written statement(s):
+------------------------------------------------------------------------------------------------------------------------------------+
| student_info |
+------------------------------------------------------------------------------------------------------------------------------------+
| Arnold Shortman can be contacted at [email protected] and received a grade of 82 after submitting the final exam on 2022-12-11 |
| Helga Pataki can be contacted at [email protected] and received a grade of 90 after submitting the final exam on 2022-12-11 |
| Gerald Johanssen can be contacted at [email protected] and received a grade of 88 after submitting the final exam on 2022-12-11 |
| Phoebe Heyerdahl can be contacted at [email protected] and received a grade of 10 after submitting the final exam on 2022-12-11 |
| Harold Berman can be contacted at [email protected] and received a grade of 90 after submitting the final exam on 2022-12-11 |
| Eugene Horowitz can be contacted at [email protected] and received a grade of 98 after submitting the final exam on 2022-12-11 |
| Rhonda Lloyd can be contacted at [email protected] and received a grade of 95 after submitting the final exam on 2022-12-11 |
| Stinky Peterson can be contacted at [email protected] and received a grade of 73 after submitting the final exam on 2022-12-11 |
+------------------------------------------------------------------------------------------------------------------------------------+
8 rows in set, 8 warnings (0.00 sec)
This output results in full sentences about each student in Mr. Frank’s class. These minor additions in between the single quotes make a big difference in making it clear who this information is about and their respective data. This is because of the specific columns you retrieved data from in your query. Mr. Frank cannot thank you enough for your incredible work. He’s extremely pleased that you’ve even saved him time by writing out full sentences that he can quickly input into his report.