  HBase Table Data Paging
     
  Add Date : 2018-11-21      
         
         
         
HBase is a key technology in the Hadoop big data ecosystem: a distributed, column-oriented database for storing large volumes of data. More detailed descriptions and technical details of HBase can be found on the web, and I plan to write a separate technical introduction to HBase later. This article focuses only on paging through HBase table data.

First, an indicator that paging can hardly avoid: the total number of records. In a relational database it is easy to count the total number of records, but in HBase this is a major problem. At least for now, do not expect to get the row count of a table with something like "SELECT COUNT(*) FROM TABLE". HBase itself provides a MapReduce job for counting table rows, but it is extremely time-consuming, so when paging through HBase table data we can ignore the total record count.

If the total number of records is unknown, then the total number of pages is unknown, and so is whether a next page exists, along with other resulting problems. These points need special attention when paging through HBase table data.
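
One way to cope with an unknown total is to ask the store for one row more than the page needs: if the extra row comes back, a next page exists. The sketch below shows the idea in plain Java, with a TreeMap standing in for an HBase table sorted by row key. The names LookAheadPaging, Page and fetchPage are made up for illustration and are not part of HBase.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Illustrative only: a TreeMap stands in for an HBase table sorted by row key. */
final class LookAheadPaging {

    /** One page of row keys plus a flag saying whether more rows follow. */
    static final class Page {
        final List<String> rowKeys;
        final boolean hasNext;

        Page(List<String> rowKeys, boolean hasNext) {
            this.rowKeys = rowKeys;
            this.hasNext = hasNext;
        }
    }

    /**
     * Reads up to pageSize rows starting at startKey (inclusive; null means the first row).
     * One extra row is examined as a look-ahead and is never returned.
     */
    static Page fetchPage(NavigableMap<String, String> table, String startKey, int pageSize) {
        NavigableMap<String, String> tail =
                (startKey == null) ? table : table.tailMap(startKey, true);
        List<String> keys = new ArrayList<>();
        boolean hasNext = false;
        for (String key : tail.keySet()) {
            if (keys.size() == pageSize) {
                hasNext = true; // the look-ahead row exists, so there is another page
                break;
            }
            keys.add(key);
        }
        return new Page(keys, hasNext);
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        for (int i = 1; i <= 4; i++) {
            table.put("row" + i, "value" + i);
        }
        Page first = fetchPage(table, null, 2);
        Page second = fetchPage(table, "row3", 2);
        System.out.println(first.rowKeys + " hasNext=" + first.hasNext);
        System.out.println(second.rowKeys + " hasNext=" + second.hasNext);
    }
}
```

With this approach the "is there a next page" answer stays correct even when the remaining rows fill the last page exactly.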

1. HBase table data paging model class

import java.io.Serializable;
import java.text.DecimalFormat;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.client.Result;

/**
 * Description: HBase table data paging model class.
 * Objects of this class can be used to manage multiple HBaseQualifierModel instances.
 * Copyright: Copyright (c) 2014
 * Company: Henan Electric Power Research Institute Smart Grid
 * @author Shangbingbing, written 2014-01-01
 * @version 1.0
 */
public class HBasePageModel implements Serializable {
    private static final long serialVersionUID = 330410716100946538L;
    private int pageSize = 100;
    private int pageIndex = 0;
    private int prevPageIndex = 1;
    private int nextPageIndex = 1;
    private int pageCount = 0;
    private int pageFirstRowIndex = 1;
    private byte[] pageStartRowKey = null;
    private byte[] pageEndRowKey = null;
    private boolean hasNextPage = true;
    private int queryTotalCount = 0;
    private long startTime = System.currentTimeMillis();
    private long endTime = System.currentTimeMillis();
    private List<Result> resultList = new ArrayList<Result>();

    public HBasePageModel(int pageSize) {
        this.pageSize = pageSize;
    }
    /**
     * Gets the number of records per page.
     * @return the page size
     */
    public int getPageSize() {
        return pageSize;
    }
    /**
     * Sets the number of records per page.
     * @param pageSize the page size
     */
    public void setPageSize(int pageSize) {
        this.pageSize = pageSize;
    }
    /**
     * Gets the current page number.
     * @return the current page number
     */
    public int getPageIndex() {
        return pageIndex;
    }
    /**
     * Sets the current page number.
     * @param pageIndex the current page number
     */
    public void setPageIndex(int pageIndex) {
        this.pageIndex = pageIndex;
    }
    /**
     * Gets the total number of pages.
     * @return the total number of pages
     */
    public int getPageCount() {
        return pageCount;
    }
    /**
     * Sets the total number of pages.
     * @param pageCount the total number of pages
     */
    public void setPageCount(int pageCount) {
        this.pageCount = pageCount;
    }
    /**
     * Gets the sequence number of the first row on the current page.
     * @return the 1-based index of the page's first row
     */
    public int getPageFirstRowIndex() {
        this.pageFirstRowIndex = (this.getPageIndex() - 1) * this.getPageSize() + 1;
        return pageFirstRowIndex;
    }
    /**
     * Gets the row key of the first row on the page.
     * @return the page's start row key
     */
    public byte[] getPageStartRowKey() {
        return pageStartRowKey;
    }
    /**
     * Sets the row key of the first row on the page.
     * @param pageStartRowKey the page's start row key
     */
    public void setPageStartRowKey(byte[] pageStartRowKey) {
        this.pageStartRowKey = pageStartRowKey;
    }
    /**
     * Gets the row key of the last row on the page.
     * @return the page's end row key
     */
    public byte[] getPageEndRowKey() {
        return pageEndRowKey;
    }
    /**
     * Sets the row key of the last row on the page.
     * @param pageEndRowKey the page's end row key
     */
    public void setPageEndRowKey(byte[] pageEndRowKey) {
        this.pageEndRowKey = pageEndRowKey;
    }
    /**
     * Gets the previous page number.
     * @return the previous page number (never less than 1)
     */
    public int getPrevPageIndex() {
        if (this.getPageIndex() > 1) {
            this.prevPageIndex = this.getPageIndex() - 1;
        } else {
            this.prevPageIndex = 1;
        }
        return prevPageIndex;
    }
    /**
     * Gets the next page number.
     * @return the next page number
     */
    public int getNextPageIndex() {
        this.nextPageIndex = this.getPageIndex() + 1;
        return nextPageIndex;
    }
    /**
     * Gets whether there is a next page.
     * @return true if the current page is full
     */
    public boolean isHasNextPage() {
        // This check is not rigorous: the remaining data may fill the last page exactly,
        // in which case a full page is reported as having a next page when it has none.
        if (this.getResultList().size() == this.getPageSize()) {
            this.hasNextPage = true;
        } else {
            this.hasNextPage = false;
        }
        return hasNextPage;
    }
    /**
     * Gets the total number of records retrieved so far.
     */
    public int getQueryTotalCount() {
        return queryTotalCount;
    }
    /**
     * Sets the total number of records retrieved so far.
     * @param queryTotalCount the number of records retrieved so far
     */
    public void setQueryTotalCount(int queryTotalCount) {
        this.queryTotalCount = queryTotalCount;
    }
    /**
     * Initializes the start time (ms).
     */
    public void initStartTime() {
        this.startTime = System.currentTimeMillis();
    }
    /**
     * Initializes the end time (ms).
     */
    public void initEndTime() {
        this.endTime = System.currentTimeMillis();
    }
    /**
     * Gets the elapsed time formatted in milliseconds.
     * @return the elapsed time, for example "15 milliseconds"
     */
    public String getTimeIntervalByMilli() {
        return String.valueOf(this.endTime - this.startTime) + " milliseconds";
    }
    /**
     * Gets the elapsed time formatted in seconds.
     * @return the elapsed time in seconds
     */
    public String getTimeIntervalBySecond() {
        double interval = (this.endTime - this.startTime) / 1000.0;
        // Format the elapsed seconds with at most two decimal places.
        DecimalFormat df = new DecimalFormat("#.##");
        return df.format(interval) + " seconds";
    }
    /**
     * Prints the timing information.
     * LogInfoUtil is the author's own logging helper; it is not listed in this article.
     */
    public void printTimeInfo() {
        LogInfoUtil.printLog("Start time: " + this.startTime);
        LogInfoUtil.printLog("End time: " + this.endTime);
        LogInfoUtil.printLog("Elapsed time: " + this.getTimeIntervalBySecond());
    }
    /**
     * Gets the HBase result set of the retrieval.
     * @return the rows on the current page
     */
    public List<Result> getResultList() {
        return resultList;
    }
    /**
     * Sets the HBase result set of the retrieval.
     * @param resultList the rows on the current page
     */
    public void setResultList(List<Result> resultList) {
        this.resultList = resultList;
    }
}

In summary, we do not compute the total number of records or the total number of pages; in place of "the total number of records" we track "the number of records retrieved so far". In addition, the elapsed time of each retrieval is recorded, which helps developers debug and measure efficiency.
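
As a small worked example of this bookkeeping: the 1-based index of a page's first row follows from the page number and the page size, and the "retrieved so far" counter is simply a running sum of the rows on each page fetched. The class name PageArithmetic is hypothetical; the formula is the one used in getPageFirstRowIndex.

```java
/** Illustrative only: the page bookkeeping of the model class, as plain arithmetic. */
final class PageArithmetic {

    /** 1-based index of the first row on a 1-based page, as in getPageFirstRowIndex(). */
    static int firstRowIndex(int pageIndex, int pageSize) {
        return (pageIndex - 1) * pageSize + 1;
    }

    /** Running total of rows retrieved so far, given the row count of each page fetched. */
    static int retrievedSoFar(int[] pageRowCounts) {
        int total = 0;
        for (int count : pageRowCounts) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        // Page 3 with 100 rows per page starts at row 201.
        System.out.println(firstRowIndex(3, 100));
        // Two full pages and a partial third page: 100 + 100 + 37 = 237 rows retrieved.
        System.out.println(retrievedSoFar(new int[] {100, 100, 37}));
    }
}
```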

2. HBase table data paging retrieval method

Just as queries against a relational database such as Oracle usually carry many search conditions, HBase table data retrieval is no exception. The usual conditions for retrieving HBase table data are: the RowKey range (if no range is given, the whole table is scanned), filters, and the data version. So when designing a more generic paged retrieval interface method, these conditions have to be taken into account.
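
To make the RowKey range condition concrete: HBase keeps rows sorted by row key in byte-wise lexicographic order, and a scan covers the range from the start row (inclusive) up to the stop row (exclusive). The following is a minimal sketch of those semantics in plain Java, with a TreeMap standing in for the table. RowKeyRange, scan and bytes are hypothetical names, and Arrays.compareUnsigned requires Java 9 or later.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Illustrative only: a TreeMap ordered like HBase row keys (unsigned, byte by byte). */
final class RowKeyRange {

    /** Unsigned lexicographic byte order. */
    static final Comparator<byte[]> ROW_KEY_ORDER = Arrays::compareUnsigned;

    /**
     * Returns the row keys in [startRow, stopRow): start inclusive, stop exclusive.
     * A null bound means "unbounded". Assumes startRow does not sort after stopRow.
     */
    static List<String> scan(NavigableMap<byte[], String> table, byte[] startRow, byte[] stopRow) {
        NavigableMap<byte[], String> view = table;
        if (startRow != null) {
            view = view.tailMap(startRow, true);
        }
        if (stopRow != null) {
            view = view.headMap(stopRow, false);
        }
        List<String> keys = new ArrayList<>();
        for (byte[] key : view.keySet()) {
            keys.add(new String(key, StandardCharsets.UTF_8));
        }
        return keys;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        NavigableMap<byte[], String> table = new TreeMap<>(ROW_KEY_ORDER);
        for (String key : new String[] {"a", "b", "c", "d"}) {
            table.put(bytes(key), "value-" + key);
        }
        System.out.println(scan(table, bytes("b"), bytes("d"))); // rows b and c
        System.out.println(scan(table, null, null));             // whole table
    }
}
```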

// Classes used by the methods in this section: java.io.IOException, java.util.ArrayList, java.util.List,
// org.apache.hadoop.hbase.client.{HTable, Result, ResultScanner, Scan},
// org.apache.hadoop.hbase.filter.{FilterList, PageFilter}.
// StringUtils is assumed to come from Apache Commons Lang.
/**
 * Retrieves table data page by page.
 * (If the table was created in a non-default namespace, the table name must include
 * the namespace, in the format [namespace:tablename].)
 * @param tableName table name (required).
 * @param startRowKey start row key (may be null; if null, retrieval starts from the first row of the table).
 * @param endRowKey end row key (may be null).
 * @param filterList set of search-condition filters (without a paging filter; may be null).
 * @param maxVersions maximum number of versions: Integer.MAX_VALUE retrieves all versions; Integer.MIN_VALUE retrieves only the latest version; any other value retrieves up to that many versions.
 * @param pageModel paging model (required).
 * @return the HBasePageModel paging object.
 */
public static HBasePageModel scanResultByPageFilter(String tableName, byte[] startRowKey, byte[] endRowKey, FilterList filterList, int maxVersions, HBasePageModel pageModel) {
    if (pageModel == null) {
        pageModel = new HBasePageModel(10);
    }
    if (maxVersions <= 0) {
        // Default: retrieve only the latest version of the data.
        maxVersions = Integer.MIN_VALUE;
    }
    pageModel.initStartTime();
    pageModel.initEndTime();
    if (StringUtils.isBlank(tableName)) {
        return pageModel;
    }
    HTable table = null;
    List<Result> resultList = new ArrayList<Result>();

    try {
        // Get the HTable object for the table name; HBaseTableManageUtil is the author's own table management class.
        table = HBaseTableManageUtil.getHBaseTable(tableName);
        int tempPageSize = pageModel.getPageSize();
        boolean isEmptyStartRowKey = false;
        if (startRowKey == null) {
            // Read the first row of the table; HBaseTableDataUtil is the author's own table data class.
            Result firstResult = HBaseTableDataUtil.selectFirstResultRow(tableName, filterList);
            if (firstResult == null || firstResult.isEmpty()) {
                return pageModel;
            }
            startRowKey = firstResult.getRow();
        }
        if (pageModel.getPageStartRowKey() == null) {
            isEmptyStartRowKey = true;
            pageModel.setPageStartRowKey(startRowKey);
        } else {
            if (pageModel.getPageEndRowKey() != null) {
                pageModel.setPageStartRowKey(pageModel.getPageEndRowKey());
            }
            // From the second page on, fetch one extra record each time, because the first
            // record (the last row of the previous page) is discarded.
            tempPageSize += 1;
        }

        Scan scan = new Scan();
        scan.setStartRow(pageModel.getPageStartRowKey());
        if (endRowKey != null) {
            scan.setStopRow(endRowKey);
        }
        PageFilter pageFilter = new PageFilter(pageModel.getPageSize() + 1);
        if (filterList != null) {
            // Wrap the caller's filters instead of adding to them, so that repeated calls
            // do not pile up paging filters in the caller's FilterList.
            FilterList scanFilters = new FilterList(FilterList.Operator.MUST_PASS_ALL);
            scanFilters.addFilter(filterList);
            scanFilters.addFilter(pageFilter);
            scan.setFilter(scanFilters);
        } else {
            scan.setFilter(pageFilter);
        }
        // With Integer.MIN_VALUE nothing is set, so the scan returns only the latest version.
        if (maxVersions == Integer.MAX_VALUE) {
            scan.setMaxVersions();
        } else if (maxVersions != Integer.MIN_VALUE) {
            scan.setMaxVersions(maxVersions);
        }
        ResultScanner scanner = table.getScanner(scan);
        try {
            int index = 0;
            for (Result rs : scanner.next(tempPageSize)) {
                if (!isEmptyStartRowKey && index == 0) {
                    index += 1;
                    continue;
                }
                if (!rs.isEmpty()) {
                    resultList.add(rs);
                }
                index += 1;
            }
        } finally {
            scanner.close();
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        if (table != null) {
            try {
                table.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    // Store the rows of this call (empty if the scan failed) so the model never keeps stale data.
    pageModel.setResultList(resultList);
    int pageIndex = pageModel.getPageIndex() + 1;
    pageModel.setPageIndex(pageIndex);
    if (resultList.size() > 0) {
        // Record the row keys of the first and last rows of this page.
        byte[] pageStartRowKey = resultList.get(0).getRow();
        byte[] pageEndRowKey = resultList.get(resultList.size() - 1).getRow();
        pageModel.setPageStartRowKey(pageStartRowKey);
        pageModel.setPageEndRowKey(pageEndRowKey);
    }
    int queryTotalCount = pageModel.getQueryTotalCount() + resultList.size();
    pageModel.setQueryTotalCount(queryTotalCount);
    pageModel.initEndTime();
    pageModel.printTimeInfo();
    return pageModel;
}
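
The method above resumes each page at the previous page's last row and then discards that row. A commonly used alternative, sketched here only for comparison, is to start the next scan at the last row key with a single zero byte appended: in byte-wise lexicographic order that is the smallest key sorting after the last row, so nothing has to be skipped. This is a different technique from the one used in this article, and NextRowKey and after are hypothetical names.

```java
import java.util.Arrays;

/** Illustrative only: computing an exclusive start key for the next page. */
final class NextRowKey {

    /** Returns the smallest key that sorts strictly after lastRowKey in byte order. */
    static byte[] after(byte[] lastRowKey) {
        // Arrays.copyOf pads the new array with zeros, which appends a single 0x00 byte.
        return Arrays.copyOf(lastRowKey, lastRowKey.length + 1);
    }

    public static void main(String[] args) {
        byte[] last = {'r', 'o', 'w', '7'};
        byte[] next = after(last);
        System.out.println(Arrays.toString(next));
        // A positive comparison means next sorts after last (Arrays.compareUnsigned needs Java 9+).
        System.out.println(Arrays.compareUnsigned(next, last) > 0);
    }
}
```

The resulting key could then be passed as the start row of the next scan, with the page size left unchanged.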

Incidentally, here is the interface method for getting the first row of data in an HBase table.

/**
 * Retrieves the first row of the specified table.
 * (If the table was created in a non-default namespace, the table name must include
 * the namespace, in the format namespace:tablename.)
 * @param tableName table name (required).
 * @param filterList filter set; may be null.
 * @return the first row, or null if the table name is blank, no row matches, or an error occurs.
 */
public static Result selectFirstResultRow(String tableName, FilterList filterList) {
    if (StringUtils.isBlank(tableName)) return null;
    HTable table = null;
    try {
        table = HBaseTableManageUtil.getHBaseTable(tableName);
        Scan scan = new Scan();
        if (filterList != null) {
            scan.setFilter(filterList);
        }
        ResultScanner scanner = table.getScanner(scan);
        try {
            // next() returns the first matching row, or null if there is none.
            return scanner.next();
        } finally {
            scanner.close();
        }
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        if (table != null) {
            try {
                table.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
    return null;
}

3. HBase table data paging retrieval: application example

HBasePageModel pageModel = new HBasePageModel(pageSize);
// maxVersions = 0 falls back to the default (latest version only).
pageModel = scanResultByPageFilter("DLQX:SZYB_DATA", null, null, null, 0, pageModel);
if (pageModel.getResultList().size() == 0) {
    // This page has no data, which means the previous page was the last one.
    return;
}
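
To read a whole table, the same call is repeated with the same paging model until an empty page comes back. The loop below sketches that control flow in plain Java; a TreeMap stands in for the HBase table, and PageLoopDemo, nextPage and readAllPages are hypothetical names rather than the author's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Illustrative only: drives a pager until an empty page comes back. */
final class PageLoopDemo {

    /** Returns up to pageSize row keys that follow lastKey (null means start at the first row). */
    static List<String> nextPage(NavigableMap<String, String> table, String lastKey, int pageSize) {
        Iterable<String> keys = (lastKey == null)
                ? table.keySet()
                : table.tailMap(lastKey, true).keySet();
        List<String> page = new ArrayList<>();
        for (String key : keys) {
            if (key.equals(lastKey)) {
                continue; // the scan restarts at the previous page's last row; drop it
            }
            if (page.size() == pageSize) {
                break;
            }
            page.add(key);
        }
        return page;
    }

    /** Reads every page in order, stopping at the first empty page. */
    static List<List<String>> readAllPages(NavigableMap<String, String> table, int pageSize) {
        List<List<String>> pages = new ArrayList<>();
        String lastKey = null;
        while (true) {
            List<String> page = nextPage(table, lastKey, pageSize);
            if (page.isEmpty()) {
                break; // no data on this page: the previous page was the last one
            }
            pages.add(page);
            lastKey = page.get(page.size() - 1);
        }
        return pages;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        for (int i = 1; i <= 5; i++) {
            table.put("r" + i, "value" + i);
        }
        for (List<String> page : readAllPages(table, 2)) {
            System.out.println(page);
        }
    }
}
```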
     
         
         
         
  CopyRight 2002-2022 newfreesoft.com, All Rights Reserved.