Getting a good Apdex Score largely depends on making sure your application returns content at a sufficient speed. While many factors contribute to the speed of your site, one often overlooked factor is how the application handles data that lives in a database. If handled incorrectly, database access can severely slow down your site.
A basic rule in working with data is to render unto the database what rightfully belongs to the database. In other words, databases are best at handling data: manipulations and calculations on stored data are usually performed more efficiently inside the database itself. The application layer should then be used for presenting the information the database returns.
These days, it has become common for many developers to convert tables and data structures into objects and manipulate them with code. From a developer’s perspective, this makes sense; it is easy to work with objects, and there is little need to work in a different language (SQL).
However, SQL is built for working with data and typically handles it more efficiently than general-purpose languages.
In this article, we will cover a few basic techniques for optimizing your database queries and point out a few pitfalls that developers can fall into when requesting data.
Efficient indexing is essential when you work with multiple tables. If you need to search for specific fields across tables, it makes sense to index any field used to retrieve information, including primary keys and foreign keys.
You should consider creating an index on each of these keys; doing so will help speed up the process of retrieving data.
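As a quick sketch of what this could look like (the index names here are illustrative, and the user table with its business_id and state_id columns is the same one used in the examples later in this article):

-- Index the foreign key used to join user to business
CREATE INDEX idx_user_business_id ON user (business_id);

-- Index a field that is frequently used to filter lookups
CREATE INDEX idx_user_state_id ON user (state_id);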
As a developer, you might be tempted to run several queries against a database by calling selects (or updates, etc.) in a loop. This is a fairly common mistake.
Here is an example of such a mistake: imagine you loop over each row of your data and submit every row to the database with its own insert:
for (int i = 0; i < 1000; i++)
{
    // Each iteration sends its own INSERT statement to the database
    db.query("INSERT INTO user (name, address, email) VALUES (...)");
}
For each pass through the loop, a separate query runs independently, and starting, committing, and finishing each query takes time. If you only need to insert one or two rows, this isn't much of an issue; however, if you are submitting, say, 1,000 rows (as in the example above), 1,000 separate queries have to run, which can have a significant negative impact on the performance of your database.
A better approach is to handle all of the inserts at once using SQL: create one query that inserts all of the data into the database. If you need to loop through multiple rows, do that first to build up the values (for example, in an array), and then use them to construct a single query.
For example, once you have your data, you can submit it like so:
INSERT INTO user (name, address, email) VALUES (1, 2, 3), (4, 5, 6), (7, 8, 9)
Likewise, be careful to only update rows where changes have actually been made. Do not update the entire table if only a few hundred rows have changed. Keeping writes this targeted leads to faster queries and, in turn, can improve your Apdex score.
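As a minimal sketch (the id value and new email address are placeholders, and an id primary key on the user table is assumed), a targeted update scoped with a WHERE clause touches only the rows that actually need to change:

-- Update a single user's email, skipping the write if the value is already current
UPDATE user
SET email = 'new@example.com'
WHERE id = 42
  AND email <> 'new@example.com';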
A common practice that many developers use when querying tables is to use a wildcard to bring back every column in the table.
For example:
SELECT * FROM user
This is not an efficient approach, particularly when a table is large and has many columns. Instead, it is better to select only the columns you need, like so:
SELECT name, email FROM user
Because the database only has to read and return the columns you actually use, this can speed up your application considerably.
A common mistake made while querying for data is to use correlated subqueries. As you'll see from the example below, this piece of code forces the database to run a separate query against the business table for every row returned from the user table.
SELECT u.name,
u.email,
(SELECT business_name FROM business WHERE id = u.business_id) AS business
FROM user u
Instead of using a sub-select, you can obtain the same data by joining the tables on a common key.
SELECT u.name, u.email, b.business_name
FROM user u
LEFT JOIN business b
ON u.business_id = b.id
The "LEFT join" will bring back all data in the table, regardless of whether there is a matching record or not. What you end up with is all the data at once. But, when you use the id on the business table and link it to the "business_id" on the user table, you can easily bring in the business names.
This runs considerably faster because it is only one query, as opposed to running a subquery for each and every row on the user table.
While temp tables can be handy for organizing and joining data into meaningful subgroups, they also have drawbacks. Creating temporary tables in SQL consumes memory and can therefore slow down processing time and your application's performance.
Here is an example of how a temp table might be used:
SELECT * INTO #temp FROM user WHERE state_id = 'IL'
SELECT a.state_name, t.name FROM area a
JOIN #temp t
ON t.state_id = a.state_id
One way of dealing with this is to use subqueries instead, which allow you to obtain the same data with fewer resources.
Here is an example of this same temp table coded in a more efficient way:
SELECT a.state_name, t.name
FROM area a
JOIN (SELECT name, state_id FROM user WHERE state_id = 'IL') AS t
ON t.state_id = a.state_id
Written this way, the query returns the same results as the temp table version above, but often in a fraction of the time.
A slow database can have a significant and negative impact on your Apdex Score if it is not optimized correctly. But how do you know if your application is running slow or that you've done enough to fix it? Investing in performance monitoring can help.
Stay up-to-date with how your application is running with Airbrake Performance Monitoring. Our version of Performance Monitoring includes tools developers need to understand their application's health, such as Apdex Score, detailed route information, request and response times, and so much more. See for yourself with a free 14-day Airbrake trial or log into your existing Airbrake account.