Home IT Linux Windows Database Network Programming Server Mobile  
           
  Home \ Database \ Oracle 11g statistics collection - collection of multi-column statistics     - fcntl file locking function add (Programming)

- Linux System Getting Started Learning: modify environment variables in Linux PATH (Linux)

- Linux dd command make U disk boot disk (Linux)

- Hadoop 2.6.0 stand-alone / pseudo-distributed installation (Server)

- How to configure MariaDB replication in CentOS Linux (Database)

- To install Git on Ubuntu systems (Server)

- Setting the RedHat9 Intrusion Detection System (Linux)

- C ++ precision performance test function (Programming)

- Use FFmpeg processing high quality GIF (Programming)

- Report generation CPU, memory, and input and output ports with the sar command (Linux)

- Ubuntu development Nodejs (Linux)

- Linux Oracle delete archived logs (Database)

- MySQL Statistics (Database)

- How to install Kernel 4.0.2 on CentOS 7 (Linux)

- Installation on Ubuntu class Winamp audio player Qmmp 0.9.0 (Linux)

- Linux in order to make NMAP hide and seek with the firewall (Linux)

- Linux Detailed instructions alias settings (Linux)

- Spring REST Exception Handling (Programming)

- See how --nand flash timing diagram of a read operation Comments (Programming)

- Advanced network security tips Linux backdoor Technology and Practice (Linux)

 
         
  Oracle 11g statistics collection - collection of multi-column statistics
     
  Add Date : 2018-11-21      
         
       
         
  We write SQL statements when sometimes encounter where clause followed by a plurality of conditions, that is, according to the conditions screened multiple columns of data. By default, Oracle will multicolumn selectivity (selectivity) multiplying the selectivity to give where statements such selective variations may make Oracle is not accurate enough, causing the optimizer to make wrong judgments. For example, for automobile manufacturers and car models, in fact, there is a relationship, once you know the car's model, which can determine a car manufacturer. Another example star hotels and hotel price level has a similar correspondence. In order to allow the optimizer to make accurate judgments, to generate accurate execution plans, oracle introduced the concept of multi-column statistics in the 11g database.

Optional: In this case, 1 / unique value

We have a table BOOKS, two columns hotel_id, rate_category, we look at the distribution of data in these two columns:
SQL> select hotel_id, rate_category, count (1) from books
2 group by hotel_id, rate_category
3 order by hotel_id;

HOTEL_ID RATE_CATEGORY COUNT (1)
---------- ------------- ----------
101119943
101239385
101320036
20215106
202210041
20235039

6 rows selected.

Check the data carefully: rate_category hotel_id 10 columns contain only 11, 12 and 13, and hotel_id 20 of this column contains only 21, 22 and 23 (11, 12 and 13 do not contain a). why

What? May be due to the star about the hotel. Hotel 20 is a high priced hotel, and rental grades 11, 12 and 13 is a lower class, and therefore they are not suitable for a high fee hotel. Similarly,

21, 22 and 23 are higher rent levels, so they are not available for this hotel 10 hotels. Moreover, the hotel room is more than 10 hotels in 20 predetermined number.

Create an index on two columns of the table books, and collect statistics on the table.
SQL> create index book_idx1 on books (hotel_id);
Index created.

SQL> create index book_idx2 on books (rate_category);
Index created.

SQL> analyze table books compute statistics;
Table analyzed.

If we want to find a table that meet the conditions of 20 hotels price level is a record 21, the execution plan look like?
SQL> set autotrace trace exp
SQL> select hotel_id, rate_category from books where hotel_id = 20 and rate_category = 21;

Execution Plan
-------------------------------------------------- --------
Plan hash value: 2688610195

-------------------------------------------------- -------------------------
| Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time |
-------------------------------------------------- -------------------------
| 0 | SELECT STATEMENT | | 8296 | 33184 | 47 (3) | 00:00:01 |
| * 1 | TABLE ACCESS FULL | BOOKS | 8296 | 33184 | 47 (3) | 00:00:01 |
-------------------------------------------------- -------------------------

Predicate Information (identified by operation id):
-------------------------------------------------- -

1 - filter ( "RATE_CATEGORY" = 21 AND "HOTEL_ID" = 20)

SQL> set autotrace off

SQL> select count (1) from books;

COUNT (1)
----------
99550

SQL> select 99550/8296 from dual;

99550/8296
----------
11.9997589

As can be seen from the above example, oracle chose to go full table scan, determines the number of records is 8296, and I watch it in the real data is 5106, for the entire table 99550 records, it should be used to index. But the oracle did not, because oracle two columns will be considered separately, and the selectivity is calculated hotel_id 1/2, rate_category 1/6, resulting in a statement selectivity is 1/12, which also

We see 8296 (1 * 99550/12) records in the execution plan causes.

In order to allow the implementation of oracle accurate records, we can take two ways
1. Use the package dbms_stats new function create_extended_stats create a virtual column, then on the table to collect statistics.
As follows:
dbms_stats.create_extended_stats ( 'SCOTT', 'BOOKS', '(HOTEL_ID, RATE_CATEGORY)')
The next time you re-collect the statistics table will automatically collect multi-column statistics for your column groups.

2. Direct method_opt specified in the package dbms_stats, when the collection of statistical information, the combination of columns as a single column uses

Here we use the second method
SQL> begin
2 dbms_stats.gather_table_stats (
3 ownname => 'SCOTT',
4 tabname => 'BOOKS',
5 estimate_percent => 100,
6 method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (HOTEL_ID, RATE_CATEGORY)',
7 cascade => TRUE
8 );
9 end;
10 /

PL / SQL procedure successfully completed.

After collecting the Plan of Implementation End column group statistics, look at the statement
SQL> set autotrace trace exp
SQL> select hotel_id, rate_category from books where hotel_id = 20 and rate_category = 21;

Execution Plan
-------------------------------------------------- --------
Plan hash value: 1484887743

-------------------------------------------------- ---------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time |
-------------------------------------------------- ---------------------------------------
| 0 | SELECT STATEMENT | | 5106 | 30636 | 19 (0) | 00:00:01 |
| * 1 | TABLE ACCESS BY INDEX ROWID | BOOKS | 5106 | 30636 | 19 (0) | 00:00:01 |
| * 2 | INDEX RANGE SCAN | BOOK_IDX2 | 5106 | | 11 (0) | 00:00:01 |
-------------------------------------------------- ---------------------------------------

Predicate Information (identified by operation id):
-------------------------------------------------- -

1 - filter ( "HOTEL_ID" = 20)
2 - access ( "RATE_CATEGORY" = 21)

The output clearly shows the index BOOK_IDX2 used. Why now use the index? Note that "Rows" column value (5106) below. Optimization program correctly identified an estimate of the number of combinations of values ​​of the line, rather than the estimated value of the number of separate individual values ​​row.

Of course, for other conditions, oracle can also make accurate judgments

SQL> set autotrace trace exp
SQL> select hotel_id, rate_category from books where hotel_id = 10 and rate_category = 12;

Execution Plan
-------------------------------------------------- --------
Plan hash value: 2688610195

-------------------------------------------------- -------------------------
| Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time |
-------------------------------------------------- -------------------------
| 0 | SELECT STATEMENT | | 39385 | 230K | 47 (3) | 00:00:01 |
| * 1 | TABLE ACCESS FULL | BOOKS | 39385 | 230K | 47 (3) | 00:00:01 |
-------------------------------------------------- -------------------------

Predicate Information (identified by operation id):
-------------------------------------------------- -

1 - filter ( "RATE_CATEGORY" = 12 AND "HOTEL_ID" = 10)

In view USER_STAT_EXTENSIONS, you can see the extended statistics defined in the database:
SQL> select extension_name, extension
2 from user_stat_extensions
3 where table_name = 'BOOKS';

EXTENSION_NAME EXTENSION
------------------------------ -------------------- -------------------------------------------------- ----------
SYS_STUW3MXAI1XLZHCHDYKJ9E4K90 ( "HOTEL_ID", "RATE_CATEGORY")
     
         
       
         
  More:      
 
- Ubuntu compiler installation R Full Record (Linux)
- Python class of operator overloading (Programming)
- CentOS / Linux kernel upgrade (Linux)
- Virtual Judge structures under Ubuntu 14.04 (Server)
- Create a custom pixel format based on an existing image data BufferedImage (Programming)
- Linux System Getting Started Learning: hard disk partition, and to deal with traps (Linux)
- Node.js simple interface server (Server)
- CRF ++ Linux use (Linux)
- RHEL5 multipath configuration (Linux)
- Kickstart automated installation and deployment RHEL 7.0 (Linux)
- Why do you need close contact Rust 1.0 (Programming)
- pdf.js Tutorial (Linux)
- Python programmers most often committed ten errors (Programming)
- Java String and StringBuffer and StringBuilder Comments (Programming)
- MySQL Tutorial: Using tpcc-mysql pressure measurement (Database)
- Docker: Mirroring and container operations (Server)
- Let OpenCV face detection score output codes (Programming)
- Build ASP.NET 5 development environment in Ubuntu (Server)
- Oracle restrict certain IP, the malicious user actions on important table (Database)
- Mounting kit under Fedora Linux (Linux)
     
           
     
  CopyRight 2002-2016 newfreesoft.com, All Rights Reserved.