Data Validation In SQL: 5 Powerful Data Validation Techniques
From Regular Expressions to OnValidate Events: Using Powerful SQL Data Validation Techniques In Five
Data validation is an essential part of web application development. Clean data, usually collected through forms, is an essential backbone of enterprise IT. And whether you are a developer building a form, or an end-user filling in a form: I'm sure everyone has had their fair share of form frustrations.
Poorly designed forms and inadequate data validation at the source are probably the reason that in any given data science project, one-quarter of the time is spent on data cleansing. Incorrect date formats, invalid email addresses, or unclear questions: poorly captured data are a major drain on time for back-end developers, DBAs, or data and business analysts.
And for those building forms, it is rarely a joy: even though there are form libraries in React, or Form Helpers in Ruby on Rails, it is still hard and tedious work to develop good forms.
With easy-to-use online database builders, such as Five, developers can rapidly build & deploy web applications on a MySQL database and add powerful data validation techniques to their applications.
In this blog post, we will delve into the world of data validation in SQL, exploring various techniques, best practices, and examples. Whether you're a database administrator, a data engineer, or a SQL enthusiast, this guide will equip you with the knowledge and tools needed to validate data effectively within your databases.
By the end of this blog post, you'll have a solid understanding of data validation in SQL, along with practical insights and techniques to ensure the reliability and accuracy of your data. So, let's dive in and unlock the power of SQL for data validation!
What is Data Validation in SQL?
In today's data-driven world, the importance of reliable and accurate data cannot be overstated. As organizations rely heavily on data for decision-making and analysis, it becomes crucial to establish robust mechanisms to validate the integrity and quality of data. This is where data validation in SQL comes into play.
Data validation refers to the process of examining and verifying data to ensure it meets specific criteria, such as data type, range, format, or any other predefined rules. While SQL (Structured Query Language) is commonly associated with querying and manipulating data, it also offers powerful features and techniques to validate data, making it a valuable tool for maintaining data integrity and accuracy.
Data Storage
Let's start by looking at the most fundamental layer of data capture first: data storage and infrastructure.
The most fundamental question of any application development project is this: Where is my data stored?
Entry-level form builders, such as Google Forms store data in a web-hosted spreadsheet. This is the ideal solution for small research projects, surveys, or questionnaires in your local football club: Should Johnny or Dave be the new captain? Do we prefer pink or blue jerseys?
But spreadsheets are not a solution that can support businesses, let alone large-scale data capture.
The best place to store data is open-source, relational databases. The two most popular databases are MySQL and PostgreSQL, according to StackOverflow's Annual Developer Survey.
Unlike spreadsheets, relational databases store data in a consistent, highly-efficient, safe, and secure manner. They are also scalable and portable, meaning they can support millions of read/write operations, and data can be moved, copied, or transferred easily from one database, storage, or IT environment to another.
Just like a spreadsheet, a database needs to be hosted (or stored) somewhere. Nowadays, this is usually on infrastructure provided by one of the three big cloud providers: AWS, Azure, or GCP. All three cloud providers offer different flavors of essentially the same service, i.e. a web-hosted and managed relational database (AWS RDS, Azure Database Server, or GCP Cloud SQL).
This setup - an open-source, relational database hosted by one of the big three cloud providers - gives your application development project a tried-and-tested, scalable application backend. In case you're wondering, just how "tried-and-tested and scalable" this setup is, here's a list of companies using MySQL on StackShare. The list includes Uber, Netflix, and Twitter.
And if now you are thinking: "We're just a small business and nowhere near the scale of Netflix", don't underestimate how quickly data is growing over time, especially if you are in a transactional business. Data can also grow to due to unforeseen circumstances, such as the Covid-pandemic, when Excel reached breaking point at health care providers.
How Five Stores Data: Build on an AWS-Hosted MySQL Database
Every application built with Five can be launched onto AWS, using a cloud-hosted MySQL database.
Sign up for Five's free download to develop and test applications locally free of charge. When applications are deployed to production, the application and its database are moved from local storage onto AWS. Note that deploying applications is a paid feature of Five.
SQL Data Validation: Ensuring Data Integrity and Accuracy
There are multiple ways to validate data in SQL. We will explore four data validation techniques in this blog post:
Using Built-In SQL Constraints such as
UNIQUE
orNOT NULL
.Through Data & Display Types, such as strings, texts, booleans, or ratings.
Through Regular Expressions (RegEx).
Through Events and JavaScript functions.
Through CRUD permissions, which govern data access.
1. Using Built-In Constraints for SQL Data Validation
SQL constraints are a simple and effective way to put constraints on the data that your database permits end-users to store. They are the most fundamental layer of data validation in SQL. Through constraints, developers can ensure that fields are properly populated or prevent duplicate or redundant data.
PRIMARY KEY
, FOREIGN KEY
, UNIQUE
and NOT NULL
are some of the most frequently used built-in SQL constraints.
These constraints are applied to a database field inside a CREATE
or ALTER
statement. For example:
CREATE TABLE Customer (
ID int NOT NULL,
Name varchar(255) NOT NULL,
Age int,
UNIQUE (ID)
);
Note how the ID field is both NOT NULL
and UNIQUE
.
The same outcome can easily be accomplished in Five, using Five's table wizard for MySQL.
To create a field with a NOT NULL
constraint, follow these four steps:
Inside one of your application, or a newly created application, go to Data \> Tables
Click on the New Table Wizard
Give your table a name, and click the Plus icon to add fields to it.
As you create your field, note how there is a checkbox for Req. By checking or unchecking this box, you can define whether the field requires an input (and can thus be
NULL
orNOT NULL
)
Primary Keys
Five automatically creates primary keys when a table is created. As Primary Keys are always unique and can never be empty, Five also automatically applies the UNIQUE
and NOT NULL
constraints to the PRIMARY KEY
field. For those database aficionados, Five's Primary Keys are Globally Unique Identifiers (GUID), or a 128-bit globally unique ID.
To understand how to check the UNIQUE constraint in Five, follow these steps:
Go to Data \> Tables
Select any of your tables, and then click on Indices. You will now see the primary and (if applicable) foreign keys inside your database table.
Click on the row that says Primary inside the Key Type column.
You can now see the
UNIQUE
constraint applied to yourPRIMARY KEY
.
Note that you can select multiple fields with a UNIQUE
constraint. If you choose multiple fields, the combined values in these fields must be unique: individually they may not be unique, but together they must.
2. Data & Display Types: Defining How Data is Stored and Displayed
As we created our NOT NULL field above, you might have noticed that Five's Form Wizard presented us with two drop-down menus: one for data type, and one for default display type.
Data and display types define how data is stored inside the database and displayed to end users. Five supports all commonly used SQL data types, such as strings, integers, floats, etc.
Data types aren't visible to the end user. They simply tell the database how to treat data: is something a string, an integer, or a password?
Display types, on the other hand, are visible to the end user, and can range from something as simple as a form field that accepts text to a date picker or a five-star rating input. The display type defines what the end-user can submit to the database using the application's user interface.
Custom Display Types
In addition, developers can create custom display types in Five.
For example, if your end-user is asked to submit satisfaction with a service on a scale from 0 to 100, this can be created by enforcing minimum and maximum values of 0 and 100 respectively. If an end-user inserts any different input, a pop-up with an error message will appear. The same can be achieved by defining the accepted length of the input. Say you're asking for a postcode and only accept four-digit postcodes. Then a display type with a required length of four characters can be created and applied to a field.
To see custom display types in action, watch this video on validating a field's input with a custom display type.
3. RegularExpressions (RegEx): Validating Data Patterns Using Pattern Matching
A third way to validate data is regular expressions (RegExs). Regular Expressions are a powerful tool to describe data patterns. If an end user provides an input that does not correspond to the pattern, the input will not be accepted.
By defining a permitted input, Regular Expressions are more granular than data or display type. For example, a database field can be defined as a string, accepting any text input. But what if you would only like to accept a text input that is an email address? That's where Regular Expressions come in.
To ensure that users can only enter email addresses, developers can define a custom display type inside Five. To do so,
Click on Setup.
Click on Display Types and then on the Plus icon.
Give your custom display type a Name and select Display Type text.
Scroll down and drag the sliding button for Regular Expression to right.
Now provide the Regular Expression inside the Mask field.
The regular expression that checks for a valid email address looks like this:
^(([^<>()[\]\\.,;:\s@”]+(\.[^<>()[\]\\.,;:\s@”]+)*)|(“.+”))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$
This probably looks a bit intimidating. However, Regular Expressions can be easily found or created online with helpful tools, such as this RegEx builder.
4. Functions and OnValidate Events
JavaScript or TypeScript functions and OnValidate events can be used in Five to validate data.
Using a function goes beyond the more general validations described above. For example, a function could check if there is enough stock to fulfill an order: if yes, the order is processed. If no, an error occurs. Functions incorporate logic, rather than just patterns as Regular Expressions do. Through the use of functions, developers can write highly business-specific validations.
To write a function in Five, follow these steps:
Click on Logic.
Click on Code Editor.
Click on the Plus icon on the right side of the screen.
Once you have named your function and selected your favorite programming language (JavaScript or TypeScript), you can now write functions in Five.
These functions can be attached to events almost anywhere in Five. Server- or client-side events can be used to execute functions, for example, when someone exits a form, or when someone completes a purchase by clicking a button.
Five also handles errors gracefully: developers can write custom error message messages which appear when the validation fails.
To better understand how to tie your function to an OnValidate event, watch this video on how to validate a field's input in Five:
5. Permissions & Access Control
One last advantage of relational databases is that they can come with very granular user access and permissions.
An acronym that is often used in this context is CRUD: Create, Read, Update, and Delete. These are the four basic database operations.
Inside Five, CRUD permissions can be granted on the table level, meaning a particular user role might only be able to create new records into a table, but cannot read, update or delete any data stored inside the table.
For example, say real estate agents are only allowed to create new listings in a property-for-sale database application. This can easily be accomplished using Five's Roles.
Click on Setup.
Then click on Roles.
Add a new user role by clicking the yellow Plus button.
Give the user role a descriptive name, such as Agent.
Next click on Permissions and on the Plus icon to the right.
You can now assign CRUD permission by selecting a Table from the drop-down and ticking the Create, Read, Update, and Delete boxes as required.
Permissions can be even more granular than just on the table level. By assigning a User Table Key Field, users can also be given access to particular records inside a table.
For example, you are developing a real estate business with four agents. Each agent is in charge of one area of town: North, South, East, or West. Even though all data is stored in one table, each user can only be given access to their territory's data by using the User Table Key Field.
Lastly, permissions can be on the UI / UX level. This can be accomplished by giving a particular role access to a certain Menu only. Sticking with the real estate example, you might want interested property buyers to have access to your application and browse your listings. This could be accomplished by creating a separate menu for buyers that only contains listings in it.
Next Steps: Go from SQL Data Validation to SQL Data Visualization 📊
By using the five powerful SQL data validation techniques above, you can use Five to create user-friendly web applications that store and collect clean data.
The next step after having created clean data is usually data visualization. Read our blog post on SQL Dashboards and SQL Data Visualization to learn how to build reports, charts and dashboards in Five.