Elasticsearch 2.20 for Beginners: Aggregations

Add Date: 2018-11-21

Aggregations provide the ability to group documents and compute statistics over them. They are similar to GROUP BY and the aggregate functions of a relational database. In Elasticsearch, the result of one aggregation can itself be aggregated again within the same query, which is a very useful feature. A single request can return several aggregation results at once, which avoids repeated requests and reduces the load on the network and the server.

Data preparation: first, insert a few documents.

Request: POST localhost:9200/customer/external/?pretty

Parameters (one document per request):

{ "name": "secisland", "age": 25, "state": "open", "gender": "woman", "balance": 87 }

{ "name": "zhangsan", "age": 32, "state": "close", "gender": "man", "balance": 95 }

{ "name": "zhangsan1", "age": 33, "state": "close", "gender": "man", "balance": 91 }

{ "name": "lisi", "age": 34, "state": "open", "gender": "woman", "balance": 99 }

{ "name": "wangwu", "age": 46, "state": "close", "gender": "woman", "balance": 78 }

These five documents serve as the test data.
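The insertion step above could be scripted roughly as follows. This is a minimal sketch using only the Python standard library; the helper names build_index_request and index_all are hypothetical, the field name "name" is assumed to be lowercase, and the base URL assumes a local node on port 9200 as in the request line above.

```python
import json
import urllib.request

# The five sample documents from the article (lowercase "name" is an assumption).
CUSTOMERS = [
    {"name": "secisland", "age": 25, "state": "open", "gender": "woman", "balance": 87},
    {"name": "zhangsan", "age": 32, "state": "close", "gender": "man", "balance": 95},
    {"name": "zhangsan1", "age": 33, "state": "close", "gender": "man", "balance": 91},
    {"name": "lisi", "age": 34, "state": "open", "gender": "woman", "balance": 99},
    {"name": "wangwu", "age": 46, "state": "close", "gender": "woman", "balance": 78},
]


def build_index_request(doc, base_url="http://localhost:9200"):
    """Build (but do not send) a POST request that indexes one document."""
    return urllib.request.Request(
        url=f"{base_url}/customer/external/?pretty",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_all(docs, base_url="http://localhost:9200"):
    """Send one request per document; needs a reachable Elasticsearch node."""
    for doc in docs:
        with urllib.request.urlopen(build_index_request(doc, base_url)) as resp:
            resp.read()


# Building the requests has no side effects; call index_all(CUSTOMERS) to send them.
requests_to_send = [build_index_request(doc) for doc in CUSTOMERS]
print(len(requests_to_send), "requests prepared")
```

Separating request construction from sending keeps the sketch safe to run without a cluster; only index_all touches the network.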

With the data in place, we can test aggregations.

Example: group all customers by state and return the top 10 groups (the default), sorted by document count in descending order (also the default):

Request: POST http://localhost:9200/customer/_search?pretty

Parameters:

{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state"
      }
    }
  }
}

This is similar to a GROUP BY query in a relational database:
SELECT state, COUNT(*) FROM customer GROUP BY state ORDER BY COUNT(*) DESC

Response:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [{
        "key": "close",
        "doc_count": 3
      }, {
        "key": "open",
        "doc_count": 2
      }]
    }
  }
}

As the response shows, three customers are in the close state and two are in the open state.
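To make the semantics of the terms aggregation concrete, here is a minimal pure-Python sketch that reproduces the same grouping locally on the five sample documents. It does not talk to Elasticsearch; the function name terms_aggregation and the lowercase "name" field are assumptions made for illustration.

```python
from collections import Counter

# The five sample documents from the article (lowercase "name" is an assumption).
CUSTOMERS = [
    {"name": "secisland", "age": 25, "state": "open", "gender": "woman", "balance": 87},
    {"name": "zhangsan", "age": 32, "state": "close", "gender": "man", "balance": 95},
    {"name": "zhangsan1", "age": 33, "state": "close", "gender": "man", "balance": 91},
    {"name": "lisi", "age": 34, "state": "open", "gender": "woman", "balance": 99},
    {"name": "wangwu", "age": 46, "state": "close", "gender": "woman", "balance": 78},
]


def terms_aggregation(docs, field, size=10):
    """Count documents per distinct value of `field`, most frequent first."""
    counts = Counter(doc[field] for doc in docs)
    return [{"key": key, "doc_count": n} for key, n in counts.most_common(size)]


buckets = terms_aggregation(CUSTOMERS, "state")
print(buckets)
```

The size parameter mirrors the default of 10 buckets mentioned above, and most_common gives the same count-descending order as the default sort.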

Next, building on the previous aggregation, we additionally compute the average balance for each state.

The request is the same as before, but the parameters change:

{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state"
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}

The response is as follows:

{
  "took": 16,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [{
        "key": "close",
        "doc_count": 3,
        "average_balance": {
          "value": 88.0
        }
      }, {
        "key": "open",
        "doc_count": 2,
        "average_balance": {
          "value": 93.0
        }
      }]
    }
  }
}

Note how the average_balance aggregation is nested inside the group_by_state aggregation. This is a common pattern for aggregations: the buckets produced by one aggregation can be aggregated again on any field to get the results we want.
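The nesting can be pictured as two steps: first bucket the documents by state, then compute a metric inside each bucket. The following pure-Python sketch reproduces that locally on the sample data, without contacting Elasticsearch; the function name average_balance_by_state and the flat result shape are illustrative assumptions, not the Elasticsearch response format.

```python
from collections import defaultdict

# The five sample documents from the article (lowercase "name" is an assumption).
CUSTOMERS = [
    {"name": "secisland", "age": 25, "state": "open", "gender": "woman", "balance": 87},
    {"name": "zhangsan", "age": 32, "state": "close", "gender": "man", "balance": 95},
    {"name": "zhangsan1", "age": 33, "state": "close", "gender": "man", "balance": 91},
    {"name": "lisi", "age": 34, "state": "open", "gender": "woman", "balance": 99},
    {"name": "wangwu", "age": 46, "state": "close", "gender": "woman", "balance": 78},
]


def average_balance_by_state(docs):
    """Outer step: bucket by state. Inner step: average the balances per bucket."""
    balances = defaultdict(list)
    for doc in docs:
        balances[doc["state"]].append(doc["balance"])
    return {
        state: {"doc_count": len(values), "average_balance": sum(values) / len(values)}
        for state, values in balances.items()
    }


summary = average_balance_by_state(CUSTOMERS)
for state, stats in summary.items():
    print(state, stats)
```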

Building on the previous example, the following query sorts the buckets by average balance in descending order.

The request is the same as before.

Parameters:

{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state",
        "order": {
          "average_balance": "desc"
        }
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}

Response:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [{
        "key": "open",
        "doc_count": 2,
        "average_balance": {
          "value": 93.0
        }
      }, {
        "key": "close",
        "doc_count": 3,
        "average_balance": {
          "value": 88.0
        }
      }]
    }
  }
}
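A client typically needs to walk the buckets in a response like this one. The sketch below parses a trimmed copy of the response (only the aggregations part, with the lowercase key names Elasticsearch uses) and extracts one row per bucket. The helper name read_buckets and its parameters are hypothetical.

```python
import json

# Trimmed copy of the response above: only the "aggregations" part is kept.
RESPONSE_TEXT = """
{
  "aggregations": {
    "group_by_state": {
      "buckets": [
        {"key": "open", "doc_count": 2, "average_balance": {"value": 93.0}},
        {"key": "close", "doc_count": 3, "average_balance": {"value": 88.0}}
      ]
    }
  }
}
"""


def read_buckets(response_text, agg_name="group_by_state", metric="average_balance"):
    """Return (key, doc_count, metric value) tuples in the order the server sent them."""
    response = json.loads(response_text)
    buckets = response["aggregations"][agg_name]["buckets"]
    return [(b["key"], b["doc_count"], b[metric]["value"]) for b in buckets]


rows = read_buckets(RESPONSE_TEXT)
for key, count, average in rows:
    print(f"{key}: {count} customers, average balance {average}")
```

Because the ordering is done by the server, the client only has to preserve the bucket order it receives.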

This article is an original work by secisland. If you reproduce it, please credit the author and the source.

The following example is more complex. It groups customers by age bracket (20-29, 30-39, 40-49), then by gender, and finally computes the average account balance for each gender within each age bracket:

{
  "size": 0,
  "aggs": {
    "group_by_age": {
      "range": {
        "field": "age",
        "ranges": [
          {
            "from": 20,
            "to": 30
          },
          {
            "from": 30,
            "to": 40
          },
          {
            "from": 40,
            "to": 50
          }
        ]
      },
      "aggs": {
        "group_by_gender": {
          "terms": {
            "field": "gender"
          },
          "aggs": {
            "average_balance": {
              "avg": {
                "field": "balance"
              }
            }
          }
        }
      }
    }
  }
}

The response:

{
  "took": 15,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_age": {
      "buckets": [{
        "key": "20.0-30.0",
        "from": 20.0,
        "from_as_string": "20.0",
        "to": 30.0,
        "to_as_string": "30.0",
        "doc_count": 1,
        "group_by_gender": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [{
            "key": "woman",
            "doc_count": 1,
            "average_balance": {
              "value": 87.0
            }
          }]
        }
      }, {
        "key": "30.0-40.0",
        "from": 30.0,
        "from_as_string": "30.0",
        "to": 40.0,
        "to_as_string": "40.0",
        "doc_count": 3,
        "group_by_gender": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [{
            "key": "man",
            "doc_count": 2,
            "average_balance": {
              "value": 93.0
            }
          }, {
            "key": "woman",
            "doc_count": 1,
            "average_balance": {
              "value": 99.0
            }
          }]
        }
      }, {
        "key": "40.0-50.0",
        "from": 40.0,
        "from_as_string": "40.0",
        "to": 50.0,
        "to_as_string": "50.0",
        "doc_count": 1,
        "group_by_gender": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [{
            "key": "woman",
            "doc_count": 1,
            "average_balance": {
              "value": 78.0
            }
          }]
        }
      }]
    }
  }
}
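The three-level nesting (age range, then gender, then average balance) can be checked locally with a short pure-Python sketch on the sample data. It does not call Elasticsearch. The function name, the simplified "20-30" style keys, and the lowercase "name" field are assumptions for illustration; as in the range aggregation, the lower bound of each range is inclusive and the upper bound is exclusive.

```python
from collections import defaultdict

# The five sample documents from the article (lowercase "name" is an assumption).
CUSTOMERS = [
    {"name": "secisland", "age": 25, "state": "open", "gender": "woman", "balance": 87},
    {"name": "zhangsan", "age": 32, "state": "close", "gender": "man", "balance": 95},
    {"name": "zhangsan1", "age": 33, "state": "close", "gender": "man", "balance": 91},
    {"name": "lisi", "age": 34, "state": "open", "gender": "woman", "balance": 99},
    {"name": "wangwu", "age": 46, "state": "close", "gender": "woman", "balance": 78},
]

# Each pair is (from, to): "from" is inclusive, "to" is exclusive.
AGE_RANGES = [(20, 30), (30, 40), (40, 50)]


def average_balance_by_age_and_gender(docs, ranges):
    """Bucket by age range, then by gender, then average the balances."""
    result = {}
    for low, high in ranges:
        balances = defaultdict(list)
        for doc in docs:
            if low <= doc["age"] < high:
                balances[doc["gender"]].append(doc["balance"])
        result[f"{low}-{high}"] = {
            gender: sum(values) / len(values) for gender, values in balances.items()
        }
    return result


summary = average_balance_by_age_and_gender(CUSTOMERS, AGE_RANGES)
for age_range, by_gender in summary.items():
    print(age_range, by_gender)
```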

As the examples above show, Elasticsearch's aggregation capabilities are very powerful: a single request can group, nest, and sort statistics in one pass.
     
         
         
         
  CopyRight 2002-2022 newfreesoft.com, All Rights Reserved.