  Lucene Getting Started Tutorial
     
  Add Date : 2018-11-21      
         
         
         
  First, Introduction to Lucene

Lucene is a high-performance, full-featured text search engine library from Apache, written entirely in Java. It suits almost any application that requires full-text search, especially cross-platform ones. Lucene is a free, open source project that provides powerful features through a simple API. Its main features are as follows:

Indexing speed of more than 150GB/hour on modern hardware
Small memory requirements: only 1MB of heap space
Incremental indexing as fast as batch indexing
Index size of roughly 20%-30% of the size of the text indexed

Lucene download address: http://lucene.apache.org/

The example project in this article is built with Maven and uses Lucene 5.2.1. The relevant dependencies are as follows:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.shh</groupId>
    <artifactId>lucene</artifactId>
    <packaging>war</packaging>
    <version>0.0.1-SNAPSHOT</version>
    <name>lucene Maven Webapp</name>
    <url>http://maven.apache.org</url>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <lucene.version>5.2.1</lucene.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-core</artifactId>
            <version>${lucene.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-queryparser</artifactId>
            <version>${lucene.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-analyzers-common</artifactId>
            <version>${lucene.version}</version>
        </dependency>

        <!-- Word segmenter (Chinese) -->
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-analyzers-smartcn</artifactId>
            <version>${lucene.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-highlighter</artifactId>
            <version>${lucene.version}</version>
        </dependency>
    </dependencies>

    <build>
        <finalName>lucene</finalName>
    </build>
</project>

Second, Examples

1, Index creation

The relevant code is as follows:

package com.test.lucene;

import java.io.IOException;
import java.nio.file.Paths;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.IntField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

/**
 * Create an index
 */
public class IndexCreate {

    public static void main(String[] args) {
        // Specify the analyzer (word segmentation); the standard analyzer is used here
        Analyzer analyzer = new StandardAnalyzer();

        // Configuration for the IndexWriter
        IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);

        // Open mode: create the index if it does not exist, otherwise open the existing one
        indexWriterConfig.setOpenMode(OpenMode.CREATE_OR_APPEND);

        // Create the first document
        Document doc1 = new Document();
        doc1.add(new StringField("id", "abcde", Store.YES));
        doc1.add(new TextField("content", "Guangzhou, China", Store.YES));
        doc1.add(new IntField("num", 1, Store.YES));

        // Create the second document
        Document doc2 = new Document();
        doc2.add(new StringField("id", "asdff", Store.YES));
        doc2.add(new TextField("content", "Shanghai China", Store.YES));
        doc2.add(new IntField("num", 2, Store.YES));

        // try-with-resources closes the writer and then the directory, even if
        // opening or writing fails (the original code could throw a
        // NullPointerException when the writer failed to open)
        try (
            // Path on the hard disk where the index is stored
            Directory directory = FSDirectory.open(Paths.get("D:/index/test"));
            // The IndexWriter creates the index files
            IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig)
        ) {
            // Add the documents to the index
            indexWriter.addDocument(doc1);
            indexWriter.addDocument(doc2);

            // Commit the IndexWriter's operations; until they are committed, the
            // preceding changes are not saved to the hard disk.
            // This step consumes a lot of system resources, so a real indexer
            // needs some strategy for when to perform it.
            indexWriter.commit();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
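
The comment on commit in the code above says that committing is expensive and needs a strategy, without naming one. One possible strategy is to commit once per batch of documents instead of after every document. The sketch below only illustrates that idea in plain Java: BatchCommitter and CommitAction are hypothetical names, not Lucene classes, and the commit action stands in for a call such as indexWriter.commit() (which throws IOException in Lucene and would need to be wrapped).

```java
/** Hypothetical helper (not part of Lucene): commits once per batch of added documents. */
class BatchCommitter {

    /** Stand-in for the expensive commit step, e.g. a wrapped call to indexWriter.commit(). */
    interface CommitAction {
        void commit();
    }

    private final int batchSize;
    private final CommitAction action;
    private int pending = 0; // documents added since the last commit

    BatchCommitter(int batchSize, CommitAction action) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.action = action;
    }

    /** Call after each added document; triggers a commit when the batch is full. */
    void documentAdded() {
        pending++;
        if (pending >= batchSize) {
            flush();
        }
    }

    /** Commits any pending documents; call once more before closing the writer. */
    void flush() {
        if (pending > 0) {
            action.commit();
            pending = 0;
        }
    }

    public static void main(String[] args) {
        int[] commits = {0};
        BatchCommitter committer = new BatchCommitter(1000, () -> commits[0]++);
        for (int i = 0; i < 2500; i++) {
            committer.documentAdded(); // in real code this would follow addDocument(...)
        }
        committer.flush(); // commit the final, partially filled batch
        System.out.println("commits performed: " + commits[0]);
    }
}
```

The batch size of 1000 is an arbitrary assumption; a suitable value depends on how much uncommitted work the application can afford to lose if it crashes.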

2, Search

The relevant code is as follows:

package com.test.lucene;

import java.io.IOException;
import java.nio.file.Paths;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

/**
 * Search the index
 */
public class IndexSearch {

    public static void main(String[] args) {
        // try-with-resources closes the reader first and then the directory
        try (
            // Path on the hard disk where the index is stored
            Directory directory = FSDirectory.open(Paths.get("D:/index/test"));
            // Read the index
            DirectoryReader directoryReader = DirectoryReader.open(directory)
        ) {
            // Create the object used to search the index
            IndexSearcher searcher = new IndexSearcher(directoryReader);
            // Analyzer (word segmentation); must match the one used for indexing
            Analyzer analyzer = new StandardAnalyzer();
            // Create the Query against the "content" field
            QueryParser parser = new QueryParser("content", analyzer);
            Query query = parser.parse("Guangzhou"); // search for content containing "Guangzhou"
            // Search the index and retrieve the top 10 matching records
            TopDocs topDocs = searcher.search(query, 10);
            System.out.println("Number of matching records: " + topDocs.totalHits);
            for (int i = 0; i < topDocs.scoreDocs.length; i++) {
                Document doc = searcher.doc(topDocs.scoreDocs[i].doc);
                System.out.println("id = " + doc.get("id"));
                System.out.println("content = " + doc.get("content"));
                System.out.println("num = " + doc.get("num"));
            }
        } catch (IOException e) {
            e.printStackTrace();
        } catch (ParseException e) {
            e.printStackTrace();
        }
    }
}

Third, How Lucene Works

Lucene full-text search is divided into two steps:

Index creation: extract information from the source data (databases, files, and so on) and create the index files.

Index search: given the user's search request, search the index that was created and return the results to the user.
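
The two steps above can be illustrated with a toy inverted index in plain Java. This is only a sketch of the general idea, not Lucene's actual implementation: the class ToyInvertedIndex, its methods, and the crude tokenizer are all assumptions made for illustration, and real Lucene adds scoring, on-disk segments, and much more.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

/** Toy inverted index (hypothetical, not Lucene's implementation). */
class ToyInvertedIndex {

    // term -> sorted ids of the documents that contain the term
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();
    // document id -> original text
    private final List<String> documents = new ArrayList<>();

    /** Crude analyzer: lower-case the text and split on anything that is not a letter or digit. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Step 1, index creation: store the text and record which terms it contains. */
    int add(String text) {
        int docId = documents.size();
        documents.add(text);
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    /** Step 2, index search: look the term up in the index instead of scanning every document. */
    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        TreeSet<Integer> docIds = postings.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
        for (int docId : docIds) {
            hits.add(documents.get(docId));
        }
        return hits;
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add("Guangzhou, China");
        index.add("Shanghai China");
        System.out.println("Guangzhou matches " + index.search("Guangzhou").size() + " document(s)");
        System.out.println("china matches " + index.search("china").size() + " document(s)");
    }
}
```

Because both the indexed text and the query term are lower-cased, a search for "china" finds both sample documents; this mirrors why the same analyzer should be used for indexing and for searching.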
     
         
         
         
     
           
     
  CopyRight 2002-2020 newfreesoft.com, All Rights Reserved.