  A Practical Python Implementation of the KNN Algorithm
     
  Add Date : 2017-01-08      
         
         
         
  Implementing the K-nearest neighbor classification algorithm (KNN) in Python is a well-worn topic, and plenty of material is already available online, but I decided to record my own learning experience here.

1. Configure the numpy library

numpy is a third-party Python library for matrix operations, and most math libraries depend on it. For instructions on configuring numpy, see the article "The tortuous road of configuring Python third-party libraries: Numpy and matplotlib". Once the configuration is complete, import the whole numpy library into the current project.
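As a quick check that the configuration worked, the sketch below tests whether numpy can be found before the rest of the code relies on it. The helper name module_available is made up for this illustration; importlib.util.find_spec is part of the Python 3 standard library.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module called `name` can be found."""
    # find_spec returns None when the module is not installed.
    return importlib.util.find_spec(name) is not None


if module_available("numpy"):
    import numpy
    print("numpy version:", numpy.__version__)
else:
    print("numpy is not installed; install it first, for example with pip")
```

If the second message appears, the configuration step above has not taken effect for the interpreter that runs the script.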

2. Prepare the training samples

Here, four points and their corresponding labels form a simple set of KNN training samples:

# ==================== Create training samples ====================
def createdataset():
    group = array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
    labels = ['A', 'B', 'C', 'D']
    return group, labels

There is one small detail: when the array() function builds a numpy array object, the data must be passed as a single argument, so the points need to be wrapped in an outer pair of brackets. A call like this is not valid:

group = array([1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1])
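
To see the difference in practice, the following sketch calls array() both ways. It assumes numpy is installed. The exact exception type and message for the invalid call may differ between numpy versions, so the code catches the error generically and only reports its type.

```python
from numpy import array

# Valid: the four points are wrapped in one outer list, so array()
# receives a single data argument and builds a 4x2 array.
group = array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
print(group.shape)

# Invalid: the four lists are passed as four separate arguments.
try:
    array([1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1])
    call_failed = False
except Exception as err:
    call_failed = True
    print("invalid call raised:", type(err).__name__)
```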

3. Create a classification function

K-nearest neighbor classification usually relies on Euclidean distance: the input point is subtracted from each training point in every dimension, the differences are squared and summed, and the square root of the sum is taken, as follows:

# ==================== Euclidean distance classifier ====================
def classify(Inx, Dataset, labels, k):
    DataSetSize = Dataset.shape[0]  # Number of rows of data; shape[1] is the number of columns
    diffmat = tile(Inx, (DataSetSize, 1)) - Dataset
    SqDiffMat = diffmat ** 2
    SqDistances = SqDiffMat.sum(axis=1)
    Distance = SqDistances ** 0.5
    SortedDistanceIndicies = Distance.argsort()
    ClassCount = {}

Here tile() is numpy's function for repeating an array. The training set in this example holds four two-dimensional points, so the input sample (a single two-dimensional point) has to be expanded into a matrix of 4 rows and 2 columns. The two matrices are then subtracted, the differences are squared and summed along each row, and the square root is taken to obtain the distances. Once the distances are computed, the array's member function argsort() returns the indices that sort them in ascending order.

A PyCharm tip for viewing source code: suppose that while writing this program we are not sure whether argsort() is a member function of the array object. Select the function name, then right-click -> Go to -> Declaration to jump to the declaration of argsort(). Reading that code confirms that the array class does include this member function, so the call is fine.
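
The intermediate values can be inspected with a short sketch like the one below. It assumes numpy is installed and reuses the four training points and the input point [0, 0] from this article; the lowercase variable names are chosen only for this illustration.

```python
from numpy import array, tile

group = array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
inx = [0, 0]

# Repeat the input point once per training row: shape (4, 2).
tiled = tile(inx, (group.shape[0], 1))

# Row-wise Euclidean distance from the input to every training point.
distances = (((tiled - group) ** 2).sum(axis=1)) ** 0.5

# argsort returns the indices that would sort the distances ascending.
order = distances.argsort()

print(tiled.shape)
print(distances)
print(order)
```

The third training point, [0, 0], coincides with the input, so its index 2 comes first in the sorted order, followed by 3, 1 and 0.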

With the distances sorted, the next step uses the labels of the K smallest distances to decide which category the current sample belongs to:

    for i in range(k):
        VoteiLabel = labels[SortedDistanceIndicies[i]]
        ClassCount[VoteiLabel] = ClassCount.get(VoteiLabel, 0) + 1
    SortedClassCount = sorted(ClassCount.items(), key=operator.itemgetter(1), reverse=True)

One thing to note: in Python 2 the dictionary's elements are obtained with the member function dict.iteritems(), whereas Python 3 uses dict.items() instead. "key=operator.itemgetter(1)" tells sorted() to order the dictionary's elements by their second component, which is the vote count; remember to import the operator library beforehand. The code counts how often each label appears among the K smallest distances, and the test sample is assigned to the label with the highest count.
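
The sorting step can be tried on its own with the sketch below. The vote tally is invented for illustration and does not come from the training samples in this article.

```python
import operator

# Hypothetical tally: label -> number of appearances among the K nearest points.
class_count = {"A": 1, "B": 3, "C": 2}

# items() yields (label, count) pairs; itemgetter(1) selects the count.
sorted_class_count = sorted(class_count.items(),
                            key=operator.itemgetter(1),
                            reverse=True)
print(sorted_class_count)  # [('B', 3), ('C', 2), ('A', 1)]

# The first pair holds the winning label.
winner = sorted_class_count[0][0]
print(winner)  # B
```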

4. Test

Here is the complete KNN test code:

# coding: utf-8
from numpy import *
import operator


# ==================== Create training samples ====================
def createdataset():
    group = array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
    labels = ['A', 'B', 'C', 'D']
    return group, labels


# ==================== Euclidean distance classifier ====================
def classify(Inx, Dataset, labels, k):
    DataSetSize = Dataset.shape[0]  # Number of rows of data; shape[1] is the number of columns
    diffmat = tile(Inx, (DataSetSize, 1)) - Dataset
    SqDiffMat = diffmat ** 2
    SqDistances = SqDiffMat.sum(axis=1)
    Distance = SqDistances ** 0.5
    SortedDistanceIndicies = Distance.argsort()
    ClassCount = {}
    for i in range(k):
        VoteiLabel = labels[SortedDistanceIndicies[i]]
        ClassCount[VoteiLabel] = ClassCount.get(VoteiLabel, 0) + 1
    SortedClassCount = sorted(ClassCount.items(), key=operator.itemgetter(1), reverse=True)
    return SortedClassCount[0][0]


Groups, Labels = createdataset()
Result = classify([0, 0], Groups, Labels, 1)
print(Result)

Run the code and the program prints the result "C". Note that in this example each class has only one training sample, so the K value of KNN should be set to 1: with a larger K, every label among the neighbors would receive exactly one vote, and the vote would carry no information.
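
For contrast, here is a minimal pure-Python sketch of a case where a K larger than 1 is meaningful. It is not the code from this article: the function knn_classify and the dataset, with three samples per class, are made up for illustration, and the vote uses collections.Counter from the standard library instead of a sorted dictionary.

```python
import math
from collections import Counter


def knn_classify(point, samples, labels, k):
    """Classify `point` by majority vote among its k nearest samples."""
    # Pair every Euclidean distance with its label, then sort by distance.
    by_distance = sorted(
        (math.sqrt(sum((a - b) ** 2 for a, b in zip(point, s))), label)
        for s, label in zip(samples, labels)
    )
    # Count the labels of the k closest samples and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Made-up data: three samples per class.
samples = [(1.0, 1.1), (1.0, 1.0), (0.9, 1.0),
           (0.0, 0.0), (0.0, 0.1), (0.1, 0.0)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_classify((0.2, 0.1), samples, labels, 3))
print(knn_classify((0.8, 0.9), samples, labels, 3))
```

With several samples per class, the three nearest neighbors can agree or disagree, so the majority vote actually decides the outcome.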
     
         
         
         
     
           
     
  CopyRight 2002-2020 newfreesoft.com, All Rights Reserved.