  Paging through HBase table data
     
  Add Date : 2018-11-21      
         
       
         
  HBase is a key technology in the Hadoop big data ecosystem: a distributed, column-oriented database for storing large data sets. More detailed descriptions and technical details of HBase are easy to find on the web, and I plan to write a separate technical introduction to HBase in the near future for readers who are interested. This article focuses only on paging through HBase table data and does not cover anything else.

First, a metric that paging usually cannot avoid: the total number of records. In a relational database the total is easy to obtain, but in HBase it is a major problem. At least for now, do not expect to count the rows of a table with anything like "SELECT COUNT(*) FROM TABLE". HBase itself provides a MapReduce job for counting table rows, but it is extremely time-consuming, so when paging through HBase table data we can simply leave the total record count out.

If the total number of records is unknown, then the total number of pages is unknown too, as is whether a next page exists, along with other knock-on problems. These all need special attention when paging HBase table data.
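One common way to answer "is there a next page?" without a total count is to ask for one row more than the page size: if the extra row exists, so does another page. The sketch below illustrates the idea with an in-memory TreeMap standing in for an HBase table. The class PeekAheadPaging and its methods are hypothetical names for this illustration and are not part of the HBase API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class PeekAheadPaging {

    /** One page of row keys plus a flag saying whether more rows follow. */
    public static final class Page {
        public final List<String> rowKeys;
        public final boolean hasNext;

        Page(List<String> rowKeys, boolean hasNext) {
            this.rowKeys = rowKeys;
            this.hasNext = hasNext;
        }
    }

    /**
     * Reads up to pageSize rows starting at startKey (inclusive; null means
     * the first row) and peeks at one extra row to decide hasNext.
     */
    public static Page readPage(NavigableMap<String, String> table, String startKey, int pageSize) {
        NavigableMap<String, String> tail =
                (startKey == null) ? table : table.tailMap(startKey, true);
        List<String> keys = new ArrayList<>();
        boolean hasNext = false;
        for (String key : tail.keySet()) {
            if (keys.size() == pageSize) {
                hasNext = true; // a (pageSize + 1)-th row exists
                break;
            }
            keys.add(key);
        }
        return new Page(keys, hasNext);
    }

    public static void main(String[] args) {
        // In-memory stand-in for a table sorted by row key.
        NavigableMap<String, String> table = new TreeMap<>();
        for (int i = 1; i <= 4; i++) {
            table.put("row" + i, "value" + i);
        }
        Page first = readPage(table, null, 2);
        Page second = readPage(table, "row3", 2);
        System.out.println(first.rowKeys + " hasNext=" + first.hasNext);   // [row1, row2] hasNext=true
        System.out.println(second.rowKeys + " hasNext=" + second.hasNext); // [row3, row4] hasNext=false
    }
}
```

Note that the second page is exactly full and is still reported as the last one, which a plain "page is full" check cannot tell.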

1. Paging model class for HBase table data

import java.io.Serializable;
import java.text.DecimalFormat;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.client.Result;

/**
 * Description: paging model class for HBase table data.
 * An object of this class can manage multiple HBaseQualifierModel objects.
 * Copyright: Copyright (c) 2014
 * Company: Henan Electric Power Research Institute Smart Grid
 * @author Shangbingbing, written 2014-01-01
 * @version 1.0
 */
public class HBasePageModel implements Serializable {
    private static final long serialVersionUID = 330410716100946538L;
    private int pageSize = 100;
    private int pageIndex = 0;
    private int prevPageIndex = 1;
    private int nextPageIndex = 1;
    private int pageCount = 0;
    private int pageFirstRowIndex = 1;
    private byte[] pageStartRowKey = null;
    private byte[] pageEndRowKey = null;
    private boolean hasNextPage = true;
    private int queryTotalCount = 0;
    private long startTime = System.currentTimeMillis();
    private long endTime = System.currentTimeMillis();
    private List<Result> resultList = new ArrayList<Result>();

    public HBasePageModel(int pageSize) {
        this.pageSize = pageSize;
    }

    /**
     * Gets the number of records per page.
     */
    public int getPageSize() {
        return pageSize;
    }

    /**
     * Sets the number of records per page.
     * @param pageSize
     */
    public void setPageSize(int pageSize) {
        this.pageSize = pageSize;
    }

    /**
     * Gets the current page number.
     */
    public int getPageIndex() {
        return pageIndex;
    }

    /**
     * Sets the current page number.
     * @param pageIndex
     */
    public void setPageIndex(int pageIndex) {
        this.pageIndex = pageIndex;
    }

    /**
     * Gets the total number of pages.
     */
    public int getPageCount() {
        return pageCount;
    }

    /**
     * Sets the total number of pages.
     * @param pageCount
     */
    public void setPageCount(int pageCount) {
        this.pageCount = pageCount;
    }

    /**
     * Gets the row number of the first row on the current page.
     */
    public int getPageFirstRowIndex() {
        this.pageFirstRowIndex = (this.getPageIndex() - 1) * this.getPageSize() + 1;
        return pageFirstRowIndex;
    }

    /**
     * Gets the row key of the first row on the page.
     */
    public byte[] getPageStartRowKey() {
        return pageStartRowKey;
    }

    /**
     * Sets the row key of the first row on the page.
     * @param pageStartRowKey
     */
    public void setPageStartRowKey(byte[] pageStartRowKey) {
        this.pageStartRowKey = pageStartRowKey;
    }

    /**
     * Gets the row key of the last row on the page.
     */
    public byte[] getPageEndRowKey() {
        return pageEndRowKey;
    }

    /**
     * Sets the row key of the last row on the page.
     * @param pageEndRowKey
     */
    public void setPageEndRowKey(byte[] pageEndRowKey) {
        this.pageEndRowKey = pageEndRowKey;
    }

    /**
     * Gets the previous page number.
     */
    public int getPrevPageIndex() {
        if (this.getPageIndex() > 1) {
            this.prevPageIndex = this.getPageIndex() - 1;
        } else {
            this.prevPageIndex = 1;
        }
        return prevPageIndex;
    }

    /**
     * Gets the next page number.
     */
    public int getNextPageIndex() {
        this.nextPageIndex = this.getPageIndex() + 1;
        return nextPageIndex;
    }

    /**
     * Reports whether there is a next page.
     */
    public boolean isHasNextPage() {
        // This check is not rigorous: the remaining data may happen to fill exactly one page.
        if (this.getResultList().size() == this.getPageSize()) {
            this.hasNextPage = true;
        } else {
            this.hasNextPage = false;
        }
        return hasNextPage;
    }

    /**
     * Gets the total number of records retrieved so far.
     */
    public int getQueryTotalCount() {
        return queryTotalCount;
    }

    /**
     * Sets the total number of records retrieved so far.
     * @param queryTotalCount
     */
    public void setQueryTotalCount(int queryTotalCount) {
        this.queryTotalCount = queryTotalCount;
    }

    /**
     * Initializes the start time (ms).
     */
    public void initStartTime() {
        this.startTime = System.currentTimeMillis();
    }

    /**
     * Initializes the end time (ms).
     */
    public void initEndTime() {
        this.endTime = System.currentTimeMillis();
    }

    /**
     * Gets the elapsed time formatted in milliseconds.
     */
    public String getTimeIntervalByMilli() {
        return String.valueOf(this.endTime - this.startTime) + " milliseconds";
    }

    /**
     * Gets the elapsed time formatted in seconds.
     */
    public String getTimeIntervalBySecond() {
        double interval = (this.endTime - this.startTime) / 1000.0;
        DecimalFormat df = new DecimalFormat("#.##");
        return df.format(interval) + " seconds";
    }

    /**
     * Prints the timing information.
     * LogInfoUtil is not shown in this article; it is presumably the author's own logging helper.
     */
    public void printTimeInfo() {
        LogInfoUtil.printLog("Start time: " + this.startTime);
        LogInfoUtil.printLog("End time: " + this.endTime);
        LogInfoUtil.printLog("Elapsed time: " + this.getTimeIntervalBySecond());
    }

    /**
     * Gets the result set retrieved from HBase.
     */
    public List<Result> getResultList() {
        return resultList;
    }

    /**
     * Sets the result set retrieved from HBase.
     * @param resultList
     */
    public void setResultList(List<Result> resultList) {
        this.resultList = resultList;
    }
}

In summary, the model does not compute the total number of records or the total number of pages; it reports "the number of records retrieved so far" in place of "the total number of records". It also records how long each retrieval takes, so that developers can check performance while debugging.
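The bookkeeping behind these values is simple arithmetic. As a small self-contained illustration (the class PageArithmetic and its methods are made up for this example and only mirror the formulas used in the model), this is how the 1-based number of the first row on a page and the running count of retrieved records are derived:

```java
public class PageArithmetic {

    /** 1-based row number of the first row on the given 1-based page. */
    public static int firstRowIndex(int pageIndex, int pageSize) {
        return (pageIndex - 1) * pageSize + 1;
    }

    /** Running total after another page containing rowsOnPage rows has been read. */
    public static int addToTotal(int queryTotalCount, int rowsOnPage) {
        return queryTotalCount + rowsOnPage;
    }

    public static void main(String[] args) {
        int pageSize = 100;
        System.out.println(firstRowIndex(1, pageSize)); // 1
        System.out.println(firstRowIndex(3, pageSize)); // 201

        int total = 0;
        total = addToTotal(total, 100); // a full first page
        total = addToTotal(total, 37);  // a partial second page
        System.out.println(total);      // 137
    }
}
```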

2. Paged retrieval method for HBase table data

As with a relational database such as Oracle, where queries usually carry many search conditions, retrieving data from an HBase table is no exception. The usual conditions are: a row key range (the whole table if no range is given), filters, and the number of data versions. A general-purpose paged retrieval method therefore has to take all of these conditions into account.
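Two details of the row key range are worth keeping in mind: HBase orders row keys as unsigned bytes, and by default a scan's start row is inclusive while its stop row is exclusive. The sketch below imitates a range scan with an optional value filter on an in-memory sorted map. It only illustrates those semantics: the class RangeScanSketch and its methods are hypothetical, data versions are left out, and Arrays.compareUnsigned requires Java 9 or later.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.function.Predicate;

public class RangeScanSketch {

    /** Creates an empty in-memory "table" ordered like HBase row keys (unsigned bytes). */
    public static NavigableMap<byte[], String> newTable() {
        Comparator<byte[]> unsignedOrder = Arrays::compareUnsigned;
        return new TreeMap<>(unsignedOrder);
    }

    public static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /**
     * Returns the values whose keys lie in [startRow, stopRow) and that pass the filter.
     * A null bound means "unbounded"; a null filter accepts every row.
     * Assumes startRow does not sort after stopRow.
     */
    public static List<String> scan(NavigableMap<byte[], String> table,
                                    byte[] startRow, byte[] stopRow,
                                    Predicate<String> filter) {
        NavigableMap<byte[], String> range = table;
        if (startRow != null) {
            range = range.tailMap(startRow, true);  // start row is inclusive
        }
        if (stopRow != null) {
            range = range.headMap(stopRow, false);  // stop row is exclusive
        }
        List<String> values = new ArrayList<>();
        for (String value : range.values()) {
            if (filter == null || filter.test(value)) {
                values.add(value);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        NavigableMap<byte[], String> table = newTable();
        for (String k : new String[] {"a", "b", "c", "d"}) {
            table.put(key(k), k.toUpperCase());
        }
        System.out.println(scan(table, key("b"), key("d"), null));                 // [B, C]
        System.out.println(scan(table, key("b"), key("d"), v -> !v.equals("C"))); // [B]
        System.out.println(scan(table, null, null, null));                         // [A, B, C, D]
    }
}
```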

// Imports needed by this method (older HTable-based HBase client API).
// StringUtils is assumed to be the Apache Commons Lang class.
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.PageFilter;

/**
 * Retrieves table data one page at a time.
 * (If the table was created in a non-default namespace, the table name must be
 * prefixed with the namespace, in the form [namespace:tablename].)
 * @param tableName table name (required).
 * @param startRowKey start row key (may be null; if null, retrieval starts from the first row of the table).
 * @param endRowKey end row key (may be null).
 * @param filterList set of search filters (not including the paging filter; may be null).
 * @param maxVersions maximum number of versions: Integer.MAX_VALUE retrieves all versions,
 *        Integer.MIN_VALUE retrieves only the latest version, any other value retrieves that many versions.
 * @param pageModel paging model (required).
 * @return the HBasePageModel paging object.
 */
public static HBasePageModel scanResultByPageFilter(String tableName, byte[] startRowKey, byte[] endRowKey, FilterList filterList, int maxVersions, HBasePageModel pageModel) {
    if (pageModel == null) {
        pageModel = new HBasePageModel(10);
    }
    if (maxVersions <= 0) {
        // By default, retrieve only the latest version of the data.
        maxVersions = Integer.MIN_VALUE;
    }
    pageModel.initStartTime();
    pageModel.initEndTime();
    if (StringUtils.isBlank(tableName)) {
        return pageModel;
    }
    HTable table = null;

    try {
        // Get the HTable object for the table name; HBaseTableManageUtil is the author's own table management class.
        table = HBaseTableManageUtil.getHBaseTable(tableName);
        int tempPageSize = pageModel.getPageSize();
        boolean isEmptyStartRowKey = false;
        if (startRowKey == null) {
            // Read the first row of the table; HBaseTableDataUtil is the author's own table data class.
            Result firstResult = HBaseTableDataUtil.selectFirstResultRow(tableName, filterList);
            if (firstResult == null || firstResult.isEmpty()) {
                return pageModel;
            }
            startRowKey = firstResult.getRow();
        }
        if (pageModel.getPageStartRowKey() == null) {
            isEmptyStartRowKey = true;
            pageModel.setPageStartRowKey(startRowKey);
        } else {
            if (pageModel.getPageEndRowKey() != null) {
                pageModel.setPageStartRowKey(pageModel.getPageEndRowKey());
            }
            // From the second page on, fetch one extra row each time, because the first row
            // (the last row of the previous page) is discarded.
            tempPageSize += 1;
        }

        Scan scan = new Scan();
        scan.setStartRow(pageModel.getPageStartRowKey());
        if (endRowKey != null) {
            scan.setStopRow(endRowKey);
        }
        PageFilter pageFilter = new PageFilter(pageModel.getPageSize() + 1);
        if (filterList != null) {
            // Wrap the caller's filters instead of adding to them, so the caller's list is not
            // modified on every call and its own operator (AND/OR) keeps its meaning.
            FilterList pagedFilters = new FilterList(FilterList.Operator.MUST_PASS_ALL);
            pagedFilters.addFilter(filterList);
            pagedFilters.addFilter(pageFilter);
            scan.setFilter(pagedFilters);
        } else {
            scan.setFilter(pageFilter);
        }
        if (maxVersions == Integer.MAX_VALUE) {
            scan.setMaxVersions();
        } else if (maxVersions != Integer.MIN_VALUE) {
            scan.setMaxVersions(maxVersions);
        }
        // Integer.MIN_VALUE: keep the Scan default, which returns only the latest version.
        ResultScanner scanner = table.getScanner(scan);
        List<Result> resultList = new ArrayList<Result>();
        int index = 0;
        for (Result rs : scanner.next(tempPageSize)) {
            if (!isEmptyStartRowKey && index == 0) {
                // Skip the first row: it was already returned as the last row of the previous page.
                index += 1;
                continue;
            }
            if (!rs.isEmpty()) {
                resultList.add(rs);
            }
            index += 1;
        }
        scanner.close();
        pageModel.setResultList(resultList);
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        if (table != null) {
            try {
                table.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    int pageIndex = pageModel.getPageIndex() + 1;
    pageModel.setPageIndex(pageIndex);
    List<Result> results = pageModel.getResultList();
    if (results.size() > 0) {
        // Record the row keys of the first and last rows of this page.
        byte[] pageStartRowKey = results.get(0).getRow();
        byte[] pageEndRowKey = results.get(results.size() - 1).getRow();
        pageModel.setPageStartRowKey(pageStartRowKey);
        pageModel.setPageEndRowKey(pageEndRowKey);
    }
    int queryTotalCount = pageModel.getQueryTotalCount() + results.size();
    pageModel.setQueryTotalCount(queryTotalCount);
    pageModel.initEndTime();
    pageModel.printTimeInfo();
    return pageModel;
}
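The method above avoids repeating a row by starting the next scan at the previous page's last row key, reading one extra row, and discarding the first. A possible alternative, shown here only as a sketch and not as the author's approach, is to start the next scan at the smallest key that sorts strictly after the previous last row key, which is that key with a single 0x00 byte appended. The helper nextStartKey is a hypothetical name, and Arrays.compareUnsigned requires Java 9 or later.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class NextStartKey {

    /**
     * Returns the smallest row key that sorts strictly after lastRowKey
     * in unsigned byte order: lastRowKey followed by one 0x00 byte.
     */
    public static byte[] nextStartKey(byte[] lastRowKey) {
        // Arrays.copyOf pads the added position with 0x00.
        return Arrays.copyOf(lastRowKey, lastRowKey.length + 1);
    }

    public static void main(String[] args) {
        byte[] last = "row0010".getBytes(StandardCharsets.UTF_8);
        byte[] following = "row0011".getBytes(StandardCharsets.UTF_8);
        byte[] next = nextStartKey(last);

        // The new start key sorts after the last key of the previous page ...
        System.out.println(Arrays.compareUnsigned(last, next) < 0);      // true
        // ... and before any longer or larger key, so no row is skipped.
        System.out.println(Arrays.compareUnsigned(next, following) < 0); // true
        System.out.println(next.length);                                 // 8
    }
}
```

With a start key built this way, an inclusive start row behaves like an exclusive one, so each page can be read with exactly the page size and no row needs to be dropped.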

For completeness, here is the "get the first row of an HBase table" method used above.

/**
 * Retrieves the first row of the specified table.
 * (If the table was created in a non-default namespace, the table name must be
 * prefixed with the namespace, in the form namespace:tablename.)
 * @param tableName table name (required).
 * @param filterList set of filters; may be null.
 * @return the first matching row, or null if there is none or an error occurs.
 */
public static Result selectFirstResultRow(String tableName, FilterList filterList) {
    if (StringUtils.isBlank(tableName)) return null;
    HTable table = null;
    try {
        table = HBaseTableManageUtil.getHBaseTable(tableName);
        Scan scan = new Scan();
        if (filterList != null) {
            scan.setFilter(filterList);
        }
        ResultScanner scanner = table.getScanner(scan);
        try {
            // next() returns the first matching row, or null if the scan finds nothing.
            return scanner.next();
        } finally {
            scanner.close();
        }
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        if (table != null) {
            try {
                table.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
    return null;
}

3. Example of paged retrieval of HBase table data

HBasePageModel pageModel = new HBasePageModel(pageSize);
// Integer.MIN_VALUE for maxVersions: retrieve only the latest version of each cell.
pageModel = scanResultByPageFilter("DLQX:SZYB_DATA", null, null, null, Integer.MIN_VALUE, pageModel);
if (pageModel.getResultList().size() == 0) {
    // This page has no data, which means the previous page was the last one.
    return;
}
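To walk a whole table, the same call is repeated with the same paging model until a page comes back empty. The loop below shows only that control flow. It runs against an in-memory sorted set instead of HBase, and PagingLoop, fetchAfter and readAllPages are hypothetical names introduced for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

public class PagingLoop {

    /** Returns up to pageSize row keys that sort strictly after lastKey (null means from the start). */
    public static List<String> fetchAfter(NavigableSet<String> rows, String lastKey, int pageSize) {
        NavigableSet<String> tail = (lastKey == null) ? rows : rows.tailSet(lastKey, false);
        List<String> page = new ArrayList<>();
        for (String key : tail) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(key);
        }
        return page;
    }

    /** Reads page after page, remembering the last row key, until an empty page is returned. */
    public static List<List<String>> readAllPages(NavigableSet<String> rows, int pageSize) {
        List<List<String>> pages = new ArrayList<>();
        String lastKey = null;
        while (true) {
            List<String> page = fetchAfter(rows, lastKey, pageSize);
            if (page.isEmpty()) {
                break; // an empty page means the previous page was the last one
            }
            pages.add(page);
            lastKey = page.get(page.size() - 1);
        }
        return pages;
    }

    public static void main(String[] args) {
        NavigableSet<String> rows = new TreeSet<>(Arrays.asList("r1", "r2", "r3", "r4", "r5"));
        System.out.println(readAllPages(rows, 2)); // [[r1, r2], [r3, r4], [r5]]
    }
}
```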
     
         
       
         
  CopyRight 2002-2016 newfreesoft.com, All Rights Reserved.