  Elasticsearch 2.20 for Beginners: Aggregations
     
  Add Date : 2018-11-21      
         
         
         
Aggregations provide the ability to group documents and compute statistics over them. They are similar to the GROUP BY clause and the aggregate functions of a relational database. In Elasticsearch, the results of one aggregation can themselves be aggregated again, which is a very useful feature. A single request can return the results of several aggregations at once, which avoids repeated requests and reduces the load on the network and the servers.

Data preparation: first, we insert a few documents:

Request: POST localhost:9200/customer/external/?pretty

Parameters (one document per request):

{ "Name": "secisland", "age": 25, "state": "open", "gender": "woman", "balance": 87}

{ "Name": "zhangsan", "age": 32, "state": "close", "gender": "man", "balance": 95}

{ "Name": "zhangsan1", "age": 33, "state": "close", "gender": "man", "balance": 91}

{ "Name": "lisi", "age": 34, "state": "open", "gender": "woman", "balance": 99}

{ "Name": "wangwu", "age": 46, "state": "close", "gender": "woman", "balance": 78}

These five documents serve as the test data.
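
As a rough illustration of the insert step, the following Python sketch builds one POST request per test document using only the standard library. The URL is the one given above; the helper names build_index_request and index_all are hypothetical, and the Content-Type header is an assumption (newer Elasticsearch versions require it, older ones accept it).

```python
import json
import urllib.request

# Assumption: Elasticsearch is reachable at the address used in the article.
INDEX_URL = "http://localhost:9200/customer/external/?pretty"

# The five test documents from the article.
CUSTOMERS = [
    {"Name": "secisland", "age": 25, "state": "open", "gender": "woman", "balance": 87},
    {"Name": "zhangsan", "age": 32, "state": "close", "gender": "man", "balance": 95},
    {"Name": "zhangsan1", "age": 33, "state": "close", "gender": "man", "balance": 91},
    {"Name": "lisi", "age": 34, "state": "open", "gender": "woman", "balance": 99},
    {"Name": "wangwu", "age": 46, "state": "close", "gender": "woman", "balance": 78},
]


def build_index_request(doc, url=INDEX_URL):
    """Build (but do not send) a POST request that indexes one document."""
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def index_all(docs):
    """Send one POST per document; needs a running Elasticsearch node."""
    for doc in docs:
        with urllib.request.urlopen(build_index_request(doc)) as resp:
            print(resp.status)
```

Calling index_all(CUSTOMERS) against a running node would send the five requests; building the requests separately makes it possible to inspect them without a server.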

With the data in place, we can test aggregations:

Example: group all customers by state and return the top 10 buckets (the default), sorted by document count in descending order (also the default):

Request: POST http://localhost:9200/customer/_search?pretty

Parameters:

{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state"
      }
    }
  }
}

This is similar to a GROUP BY query in a relational database:
SELECT state, COUNT(*) FROM customer GROUP BY state ORDER BY COUNT(*) DESC

Return:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [{
        "key": "close",
        "doc_count": 3
      }, {
        "key": "open",
        "doc_count": 2
      }]
    }
  }
}

As we can see, there are three customers in the close state and two in the open state.
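
To make the grouping concrete, here is a minimal Python sketch that reproduces the same counts locally, without Elasticsearch. It only imitates the idea of a terms aggregation (count each distinct value, keep the largest buckets); the function name terms_buckets is hypothetical, and the handling of ties may differ from what Elasticsearch does.

```python
from collections import Counter

# The "state" values of the five test documents, in insertion order.
STATES = ["open", "close", "close", "open", "close"]


def terms_buckets(values, size=10):
    """Count each distinct value and return the top `size` buckets,
    ordered by document count in descending order."""
    counts = Counter(values)
    return [{"key": key, "doc_count": n} for key, n in counts.most_common(size)]


buckets = terms_buckets(STATES)
print(buckets)
```

The two buckets it produces carry the same keys and counts as the buckets in the response above.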

Next, we build on the previous example: in addition to grouping by state, we calculate the average balance for each state.

The request is the same as before, but the parameters change as follows:

{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state"
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}

The results are as follows:

{
  "took": 16,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [{
        "key": "close",
        "doc_count": 3,
        "average_balance": {
          "value": 88.0
        }
      }, {
        "key": "open",
        "doc_count": 2,
        "average_balance": {
          "value": 93.0
        }
      }]
    }
  }
}

Notice how the average_balance aggregation is nested inside the group_by_state aggregation. This is a common pattern for aggregations: the buckets produced by one aggregation can be aggregated again on any field to get the results we want.
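
The nesting can be checked by hand with a short Python sketch: group the documents by state first, then average the balance inside each group. This is only a local imitation of the idea using the five test documents; average_by is a hypothetical helper, not part of any Elasticsearch client.

```python
from collections import defaultdict

# Only the fields needed here, taken from the five test documents.
DOCS = [
    {"state": "open", "balance": 87},
    {"state": "close", "balance": 95},
    {"state": "close", "balance": 91},
    {"state": "open", "balance": 99},
    {"state": "close", "balance": 78},
]


def average_by(docs, group_field, value_field):
    """Outer step: bucket documents by `group_field`.
    Inner step: average `value_field` within each bucket."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[group_field]].append(doc[value_field])
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


averages = average_by(DOCS, "state", "balance")
print(averages)
```

The averages agree with the response: (95 + 91 + 78) / 3 = 88.0 for close and (87 + 99) / 2 = 93.0 for open.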

In the following example, we build on the previous aggregation and sort the buckets by average balance in descending order:

The request is the same as before.

Parameters:

{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state",
        "order": {
          "average_balance": "desc"
        }
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}
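
Because the last two request bodies differ only in the order clause, they can be generated from one small function. The sketch below is an illustration only: terms_with_avg is a hypothetical helper, and it assumes the lowercase keys that Elasticsearch expects in a request body.

```python
import json


def terms_with_avg(group_field, avg_field, order=None):
    """Build a terms aggregation with a nested avg sub-aggregation.
    If `order` is "asc" or "desc", buckets are sorted by the nested average."""
    terms = {"field": group_field}
    if order is not None:
        terms["order"] = {"average_balance": order}
    return {
        "size": 0,  # return no hits, only the aggregation results
        "aggs": {
            f"group_by_{group_field}": {
                "terms": terms,
                "aggs": {"average_balance": {"avg": {"field": avg_field}}},
            }
        },
    }


body = terms_with_avg("state", "balance", order="desc")
print(json.dumps(body, indent=2))
```

Leaving out the order argument yields the body of the previous example, which falls back to the default sort by document count.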

The results:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [{
        "key": "open",
        "doc_count": 2,
        "average_balance": {
          "value": 93.0
        }
      }, {
        "key": "close",
        "doc_count": 3,
        "average_balance": {
          "value": 88.0
        }
      }]
    }
  }
}
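
On the client side, a response like this is usually flattened into rows before further use. The following Python sketch does that for a trimmed copy of the response and checks that the buckets really arrive in descending order of average balance. The helper bucket_rows is hypothetical, and the sketch assumes the lowercase key names that Elasticsearch returns.

```python
# Trimmed copy of the response, keeping only the aggregation part.
RESPONSE = {
    "aggregations": {
        "group_by_state": {
            "buckets": [
                {"key": "open", "doc_count": 2, "average_balance": {"value": 93.0}},
                {"key": "close", "doc_count": 3, "average_balance": {"value": 88.0}},
            ]
        }
    }
}


def bucket_rows(response, agg_name, metric_name):
    """Flatten the buckets of one aggregation into (key, count, metric) tuples."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [(b["key"], b["doc_count"], b[metric_name]["value"]) for b in buckets]


rows = bucket_rows(RESPONSE, "group_by_state", "average_balance")
# True when every bucket's average is >= the average of the bucket after it.
is_descending = all(a[2] >= b[2] for a, b in zip(rows, rows[1:]))
print(rows, is_descending)
```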

This is an original article by Mosaic Rand (secisland). If you reproduce it, please credit the author and the source.

The following example is more complex. It shows how to group by age bracket (ages 20-29, 30-39, and 40-49), then by gender, and finally obtain the average account balance for each gender within each age bracket:

{
  "size": 0,
  "aggs": {
    "group_by_age": {
      "range": {
        "field": "age",
        "ranges": [
          {
            "from": 20,
            "to": 30
          },
          {
            "from": 30,
            "to": 40
          },
          {
            "from": 40,
            "to": 50
          }
        ]
      },
      "aggs": {
        "group_by_gender": {
          "terms": {
            "field": "gender"
          },
          "aggs": {
            "average_balance": {
              "avg": {
                "field": "balance"
              }
            }
          }
        }
      }
    }
  }
}

The response:

{
  "took": 15,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_age": {
      "buckets": [{
        "key": "20.0-30.0",
        "from": 20.0,
        "from_as_string": "20.0",
        "to": 30.0,
        "to_as_string": "30.0",
        "doc_count": 1,
        "group_by_gender": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [{
            "key": "woman",
            "doc_count": 1,
            "average_balance": {
              "value": 87.0
            }
          }]
        }
      }, {
        "key": "30.0-40.0",
        "from": 30.0,
        "from_as_string": "30.0",
        "to": 40.0,
        "to_as_string": "40.0",
        "doc_count": 3,
        "group_by_gender": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [{
            "key": "man",
            "doc_count": 2,
            "average_balance": {
              "value": 93.0
            }
          }, {
            "key": "woman",
            "doc_count": 1,
            "average_balance": {
              "value": 99.0
            }
          }]
        }
      }, {
        "key": "40.0-50.0",
        "from": 40.0,
        "from_as_string": "40.0",
        "to": 50.0,
        "to_as_string": "50.0",
        "doc_count": 1,
        "group_by_gender": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [{
            "key": "woman",
            "doc_count": 1,
            "average_balance": {
              "value": 78.0
            }
          }]
        }
      }]
    }
  }
}
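
The three levels of this example (age range, then gender, then average) can be traced with a small Python sketch over the five test documents. It is a local imitation only, with a hypothetical function name. One detail it makes explicit: in a range aggregation the from value is included and the to value is excluded, which is why 20 to 30 corresponds to ages 20-29.

```python
# Only the fields needed here, taken from the five test documents.
DOCS = [
    {"age": 25, "gender": "woman", "balance": 87},
    {"age": 32, "gender": "man", "balance": 95},
    {"age": 33, "gender": "man", "balance": 91},
    {"age": 34, "gender": "woman", "balance": 99},
    {"age": 46, "gender": "woman", "balance": 78},
]

# (from, to) pairs; "from" is inclusive and "to" is exclusive.
RANGES = [(20, 30), (30, 40), (40, 50)]


def average_balance_by_age_and_gender(docs, ranges):
    """For each age range, group the matching documents by gender
    and average the balance within each gender group."""
    result = {}
    for lo, hi in ranges:
        balances = {}
        for doc in docs:
            if lo <= doc["age"] < hi:
                balances.setdefault(doc["gender"], []).append(doc["balance"])
        result[f"{lo}-{hi}"] = {g: sum(v) / len(v) for g, v in balances.items()}
    return result


summary = average_balance_by_age_and_gender(DOCS, RANGES)
print(summary)
```

The values match the response: 87.0 for women aged 20-29, 93.0 for men and 99.0 for women aged 30-39, and 78.0 for women aged 40-49.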

As the examples above show, Elasticsearch's aggregation capabilities are very powerful.
     
         
         
         
  CopyRight 2002-2022 newfreesoft.com, All Rights Reserved.