  Elasticsearch 2.2.0 Highlight
     
  Add Date : 2018-11-21      
         
         
         
 

Elasticsearch's highlight feature comes from Lucene. It lets you highlight the matching content of one or more fields in the search results. Lucene offers three highlighters: highlighter, fast-vector-highlighter and postings-highlighter. The first is the default (standard) one. The example below first adds a document and then searches it.

Request: PUT http://localhost:9200/secilog/log/10?pretty

Parameters:

{
"type": "file",
"message": "secilog is a log real-time analyse software, it's full text search is based on Elasticsearch"
}
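
As a rough illustration, the indexing request above could be prepared from Python with the standard library alone. This is only a sketch: the helper name build_index_request is made up for this example, the host and path are the ones used in the request above, and the request is built but not sent.

```python
import json
import urllib.request


def build_index_request(doc_id, document, base_url="http://localhost:9200"):
    """Prepare (but do not send) a PUT request that indexes one document."""
    # Index "secilog" and type "log" are the ones used in the article.
    url = "{}/secilog/log/{}?pretty".format(base_url, doc_id)
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_index_request(10, {
    "type": "file",
    "message": "secilog is a log real-time analyse software, "
               "it's full text search is based on Elasticsearch",
})
# With a node running locally, urllib.request.urlopen(request) would send it.
```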

Once the document has been created, we ask for highlighting in the search:

Request: POST http://localhost:9200/secilog/log/_search?pretty

Parameters:


{
    "query": {
        "term": {
            "message": "analyse"
       }
   },
    "highlight": {
        "fields": {
            "message": {}
       }
   }
}

The following results are returned:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total" : 1,
    "successful": 1,
    "failed": 0
 },
  "hits": {
    "total": 1,
    "max_score": 0.4232868,
    "hits": [{
      "_index": "secilog",
      "_type": "log",
      "_id": "10",
      "_score": 0.4232868,
      "_source": {
        "type": "file",
        "message": "secilog is a log real-time analyse software, it's full text search is based on elasticsearch"
     },
      "highlight": {
        "message": [ "secilog is a log real-time <em>analyse</em> software, it's full text search is based on elasticsearch"]
     }
   }]
 }
}
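
On the client side, the highlighted fragments sit under each hit's highlight key, as the response above shows. A minimal sketch of pulling them out of a parsed response follows. The function name extract_highlights and the trimmed sample dictionary are assumptions made for illustration only.

```python
def extract_highlights(response, field):
    """Return (document id, fragments) pairs for hits highlighted on `field`."""
    results = []
    for hit in response.get("hits", {}).get("hits", []):
        # A hit carries a "highlight" entry only for fields that matched.
        fragments = hit.get("highlight", {}).get(field, [])
        if fragments:
            results.append((hit["_id"], fragments))
    return results


# Trimmed copy of the response shown above.
sample_response = {
    "hits": {
        "total": 1,
        "hits": [{
            "_index": "secilog",
            "_type": "log",
            "_id": "10",
            "highlight": {
                "message": ["secilog is a log real-time <em>analyse</em> software"]
            },
        }],
    }
}

highlights = extract_highlights(sample_response, "message")
```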
 


As the results show, the matching content is highlighted: <em>analyse</em>. Highlighting needs the actual content of the field. If the field is stored (store is set to true in the field mapping), the stored value is used. Otherwise the system automatically loads _source and extracts the relevant field from it. Field names support wildcards: for example, "message*": {} highlights every field whose name begins with message.
 
fast-vector-highlighter

The highlighting shown above uses the ordinary (plain) highlighter. Lucene also provides the fast-vector-highlighter, which has the following characteristics:

• It is fast, especially for large fields (for example, larger than 1 MB).

• It can be customized with boundary_chars, boundary_max_scan and fragment_offset.

• It requires term_vector to be set to with_positions_offsets, which increases the size of the index.

• It can combine matches from multiple fields into one result.

• It can assign different weights to matches at different positions.

To use the fast-vector-highlighter, the field must be mapped accordingly when the index is created. For example, to enable it for the content field:

{
    "type_name": {
        "content": { "type": "string", "term_vector": "with_positions_offsets"}
   }
}
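
The mapping fragment above is abbreviated. In a complete index-creation body for Elasticsearch 2.x, field definitions sit under mappings, then the type name, then properties. The sketch below builds such a body in Python. The function name fvh_index_body is hypothetical, and type_name and content are just the placeholder names from the fragment above.

```python
import json


def fvh_index_body(type_name, field_name):
    """Build an index-creation body whose field keeps term vectors
    with positions and offsets, as the fast-vector-highlighter needs."""
    return {
        "mappings": {
            type_name: {
                "properties": {
                    field_name: {
                        "type": "string",
                        "term_vector": "with_positions_offsets",
                    }
                }
            }
        }
    }


body = fvh_index_body("type_name", "content")
print(json.dumps(body, indent=4))
```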
 

 
postings-highlighter

Lucene also provides the postings-highlighter, which has the following characteristics:

• It is fast because it does not need to re-analyze the text being highlighted. The larger the document, the greater the performance gain.

• It takes up less disk space than the term vectors needed by the fast-vector-highlighter.

• It breaks the text into sentences and highlights those, which is easier for people to read.

• It treats the document as if it were the whole corpus and scores individual sentences with the BM25 algorithm, as if each sentence were a document being searched.

To use the postings-highlighter, the field must be mapped accordingly when the index is created. For example, to enable it for the content field:

{
    "type_name": {
        "content": { "type": "string", "index_options": "offsets"}
   }
}
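
The link between a field's mapping and the highlighter used when no type is requested can be summarised in a few lines. The sketch below only restates the rules given in the text. The function name default_highlighter is made up, and the case where both options are set is deliberately left undecided because the article does not cover it.

```python
def default_highlighter(field_mapping):
    """Name the highlighter the article associates with a field mapping."""
    has_vectors = field_mapping.get("term_vector") == "with_positions_offsets"
    has_offsets = field_mapping.get("index_options") == "offsets"
    if has_vectors and has_offsets:
        # The article does not say which one wins; this sketch refuses to guess.
        raise ValueError("both term_vector and index_options offsets are set")
    if has_vectors:
        return "fvh"       # fast-vector-highlighter
    if has_offsets:
        return "postings"  # postings-highlighter
    return "plain"         # ordinary highlighter


kind = default_highlighter({"type": "string", "index_options": "offsets"})
```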
 

Note: the postings highlighter does not support some complex queries, such as a match query whose type is set to match_phrase_prefix.

The latter two highlighters are special in that they increase the size of the index, but they reduce the time needed to execute highlighting.

The type field can be used to force a specific highlighter. This is useful, for example, when a field has term_vectors enabled but you want the ordinary highlighter. Only three values are allowed, plain, postings and fvh, corresponding to the three highlighters. For example:

{
    "query": {...},
    "highlight": {
        "fields" : {
            "content": { "type": "plain"}
       }
   }}
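
When request bodies are assembled in code, a small guard can catch a mistyped highlighter name before the request is sent. The following is a sketch under that assumption. The helper highlight_with_type and the term query passed to it are illustrative only.

```python
ALLOWED_TYPES = ("plain", "postings", "fvh")


def highlight_with_type(query, field, highlighter_type):
    """Build a search body that forces one highlighter for `field`."""
    if highlighter_type not in ALLOWED_TYPES:
        raise ValueError(
            "type must be one of {}, got {!r}".format(ALLOWED_TYPES, highlighter_type)
        )
    return {
        "query": query,
        "highlight": {"fields": {field: {"type": highlighter_type}}},
    }


search_body = highlight_with_type({"term": {"content": "analyse"}}, "content", "plain")
```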
 

Default HTML tags for highlighting

By default, highlighted text is wrapped in <em> and </em>. This can be changed by setting pre_tags and post_tags, for example:

{
    "query": {...},
    "highlight": {
        "pre_tags" : [ "<b>"],
        "post_tags": [ "</b>"],
        "fields": {
            "_all": {}
       }
   }}


With the fast vector highlighter, several tags can be specified, ordered by importance. For example:

{
    "query": {...},
    "highlight": {
        "pre_tags" : [ "<tag1>", "<tag2>"],
        "post_tags": [ "</tag1>", "</tag2>"],
        "fields": {
            "_all": {}
       }
   }
}

The system also has a built-in set of multiple tags, selected by setting tags_schema to styled. In this schema post_tags defaults to </em>, and pre_tags defaults to:

<em class="hlt1">, <em class="hlt2">, <em class="hlt3">, <em class="hlt4">, <em class="hlt5">, <em class="hlt6">, <em class="hlt7">, <em class="hlt8">, <em class="hlt9">, <em class="hlt10">

To use this built-in set of tags, set tags_schema as follows:

{
    "query": {...},
    "highlight": {
        "tags_schema" : "styled",
        "fields": {
            "content": {}
       }
   }
}
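
For reference, the tag list that the styled schema stands for can be generated rather than typed out. The sketch below spells out pre_tags and post_tags explicitly in a request body that should behave roughly like tags_schema set to styled. The names styled_pre_tags and explicit_styled_highlight are assumptions for this example.

```python
def styled_pre_tags(count=10):
    """Return the opening tags <em class="hlt1"> ... <em class="hltN">."""
    return ['<em class="hlt{}">'.format(i) for i in range(1, count + 1)]


def explicit_styled_highlight(field):
    """Write the styled schema out as explicit pre_tags and post_tags."""
    return {
        "pre_tags": styled_pre_tags(),
        "post_tags": ["</em>"],
        "fields": {field: {}},
    }


highlight = explicit_styled_highlight("content")
```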

For each field you can set fragment_size, the size of a highlighted fragment in characters (default 100), and number_of_fragments, the maximum number of fragments returned (default 5). If number_of_fragments is set to 0, no fragments are produced and the whole content of the field is returned, highlighted. When order is set to score, the fragments are sorted by score. For example:


{
    "query": {...},
    "highlight": {
        "order": "score",
        "fields": {
            "content": { "fragment_size" : 150, "number_of_fragments": 3}
       }
   }
}
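
The same options can be set from code. The sketch below builds the highlight section with the defaults described above and adds order only when it is asked for. The function name fragment_highlight is hypothetical, and the check on number_of_fragments is an assumption of this example, not a documented server rule.

```python
def fragment_highlight(field, fragment_size=100, number_of_fragments=5, order=None):
    """Build the "highlight" section with per-field fragment settings."""
    if number_of_fragments < 0:
        raise ValueError("number_of_fragments must be 0 or greater")
    highlight = {
        "fields": {
            field: {
                "fragment_size": fragment_size,
                "number_of_fragments": number_of_fragments,
            }
        }
    }
    if order is not None:
        # "score" sorts the returned fragments by score.
        highlight["order"] = order
    return highlight


# Same settings as the example above.
by_score = fragment_highlight("content", fragment_size=150, number_of_fragments=3, order="score")
# number_of_fragments=0 asks for the whole field content, highlighted.
whole_field = fragment_highlight("content", number_of_fragments=0)
```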

     
         
         
         
           
     
  CopyRight 2002-2022 newfreesoft.com, All Rights Reserved.