We've got another video in our series on common webapp performance problems. Today we're going to use TraceView to investigate ORMs, and their performance, within a PHP application.
Our PHP application is built on the CodeIgniter framework, and we're using CodeIgniter's built-in ORM. ORMs can be great: they simplify your code, allow cleaner abstractions, and even help prevent some common security issues, like SQL injection.
Latency breakdown of our PHP application
The application we're looking at here is a DB-heavy app -- the largest bar in the chart above is the PDO layer. We're looking at about 3,000 requests in the last 24 hours and, on average, each request spends about 300ms in the database, and only 100ms in PHP itelf. We can see more about what the DB is doing by clicking on that layer.
Average query time in PDO. These times are per query, not per request, as above.
In addition to removing the other layers from our app, we've also changed what we're looking at slightly. Instead of seeing that 300ms time, we're now only seeing an average of less than 1ms. The reason for that is we're now looking at individual calls to the database, instead of the total time per request. This change is also reflected at the top of the page -- instead of just 3,500 requests, we actually see nearly 800k database queries being made over the same time period.
Distribution of PDO query times
We can also look at this data through the heatmap, which confirms what we were seeing in the average. The upper bound, which defaults to the 98th percentile, is just around 50ms. The vast majority of calls to our database are fast. To understand why our requests are still relatively slow, let's switch over to the list of traces, and look at an individual request.
A single request, making nearly 250 calls to the DB
The request view here shows not only the timings of layers within out data, but also the structure. The Trace Notes below the main visualization actually shows us the (fingerprinted) query we made, and how often we made it. The time we saw spent in the database isn't from slow queries; it's from looping over the same fast query more than 250 times.
One of 247 nearly identical queries we made to the database
To see the full query, let's drill down on one of these segments. Along with a backtrace, this exposes the full SQL query. The literals here are the limit and offset parameters generated by the ORM, and we're only selecting a single row on each of these calls! This is a classic issue with ORMs, known as the "N+1" problem. Instead of doing a join or a finite number of queries to each table we need information from, we're selecting a number of objects from the DB, then accessing a property on each object that requires a fetch from a separate table.
We could make this much faster by rewriting this data access to batch up these calls. That would reduce this down to merely 2 database queries, avoiding the overhead of those 250 trips, and give us a much more responsive website.