How can I optimize a Mysql query that searches for rows in a certain date range

De openkb
Aller à : Navigation, rechercher

Sommaire

Questions

Here is the query:

select timespans.id as timespan_id, count(*) as num
 from reports, timespans
 where  timespans.after_date >=  2011-04-13 22:08:38  and
        timespans.after_date <= reports.authored_at and
        reports.authored_at < timespans.before_date
 group by timespans.id;

Here are the table defs:

CREATE TABLE `reports` (
  `id` int(11) NOT NULL auto_increment,
  `source_id` int(11) default NULL,
  `url` varchar(255) default NULL,
  `lat` decimal(20,15) default NULL,
  `lng` decimal(20,15) default NULL,
  `content` text,
  `notes` text,
  `authored_at` datetime default NULL,
  `created_at` datetime default NULL,
  `updated_at` datetime default NULL,
  `data` text,
  `title` varchar(255) default NULL,
  `author_id` int(11) default NULL,
  `orig_id` varchar(255) default NULL,
  PRIMARY KEY  (`id`),
  KEY `index_reports_on_title` (`title`),
  KEY `index_content_on_reports` (`content`(128))

CREATE TABLE `timespans` (
  `id` int(11) NOT NULL auto_increment,
  `after_date` datetime default NULL,
  `before_date` datetime default NULL,
  `after_offset` int(11) default NULL,
  `before_offset` int(11) default NULL,
  `is_common` tinyint(1) default NULL,
  `created_at` datetime default NULL,
  `updated_at` datetime default NULL,
  `is_search_chunk` tinyint(1) default NULL,
  `is_day` tinyint(1) default NULL,
  PRIMARY KEY  (`id`),
  KEY `index_timespans_on_after_date` (`after_date`),
  KEY `index_timespans_on_before_date` (`before_date`)

And here is the explain:

+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+----------------------------------------------+
| id | select_type | table     | type  | possible_keys                                                | key                           | key_len | ref  | rows   | Extra                                        |
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+----------------------------------------------+
|  1 | SIMPLE      | timespans | range | index_timespans_on_after_date,index_timespans_on_before_date | index_timespans_on_after_date | 9       | NULL |     84 | Using where; Using temporary; Using filesort | 
|  1 | SIMPLE      | reports   | ALL   | NULL                                                         | NULL                          | NULL    | NULL | 183297 | Using where                                  | 
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+----------------------------------------------+

And here is the explain after I create an index on authored_at. As you can see, the index is not actually getting used (I think...)

+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+------------------------------------------------+
| id | select_type | table     | type  | possible_keys                                                | key                           | key_len | ref  | rows   | Extra                                          |
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+------------------------------------------------+
|  1 | SIMPLE      | timespans | range | index_timespans_on_after_date,index_timespans_on_before_date | index_timespans_on_after_date | 9       | NULL |     86 | Using where; Using temporary; Using filesort   | 
|  1 | SIMPLE      | reports   | ALL   | index_reports_on_authored_at                                 | NULL                          | NULL    | NULL | 183317 | Range checked for each record (index map: 0x8) | 
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+------------------------------------------------+

There are about 142k rows in the reports table, and far fewer in the timespans table.

The query is taking about 3 seconds now.

The strange thing is that if I add an index on reports.authored_at, it actually makes the query far slower, about 20 seconds. I would have thought it would do the opposite, since it would make it easy to find the reports at either end of the range, and throw the rest away, rather than having to examine all of them.

Can someone clarify? I m stumped.

Answers

Instead of two separate indexes for the timespan table, try merging them into a single multi-column index with before_date and after_date in a single index. Then add that index to authored_at as well.

Source

License : cc by-sa 3.0

http://stackoverflow.com/questions/5658639/how-can-i-optimize-a-mysql-query-that-searches-for-rows-in-a-certain-date-range

Related

Outils personnels
Espaces de noms

Variantes
Actions
Navigation
Outils