  Writing a Youdao Dictionary word-lookup script in Java and Python
  Add Date : 2017-08-31      
  On a whim today I wanted to build a word-lookup tool, so I took a look at the Youdao Dictionary website. The word to look up is embedded in a URL that is sent to Youdao, and the page that comes back contains the meanings of that word. So the only technical knowledge required is:

Regular Expressions

All we have to do is extract the word's meanings from the page source we fetched, so this section only covers the regular expressions used for that extraction.
Looking at the page source, we can see that the word's meanings sit inside a div tag.

The first goal is to capture that part of the page. The regular expression can be written as:

(?s)<div class="trans-container">.*?<ul>.*?</div>
// (?s) makes '.' match newlines as well; by default it does not
// .*? matches any number of characters in non-greedy mode
Having captured this section, what we need next is the meanings inside it, so we can write:

(?m)<li>(.*?)</li>
// (?m) turns on multiline mode, in which ^ and $ match at the start and end of every line; this pattern has no anchors, so the flag does not change what it matches
// .*? is wrapped in parentheses so that the meaning itself is captured directly, without the surrounding tags
The specific code is below.
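Before the full programs, the two patterns can be tried offline on a small sample. The following is a minimal Python sketch; the HTML fragment and the names `SAMPLE_PAGE`, `BLOCK_RE`, `ITEM_RE` and `extract_meanings` are made up for illustration and only mimic the structure described above, not the real Youdao page.

```python
import re

# Hypothetical page fragment shaped like the structure described above.
SAMPLE_PAGE = """<html><body>
<div class="trans-container">
<ul>
<li>n. example one</li>
<li>v. example two</li>
</ul>
</div>
<div class="other"><ul><li>unrelated</li></ul></div>
</body></html>"""

# Step 1: (?s) lets '.' cross line breaks, and .*? stops at the first </div>.
BLOCK_RE = re.compile(r'(?s)<div class="trans-container">.*?<ul>.*?</div>')
# Step 2: the parentheses capture only the text between <li> and </li>.
ITEM_RE = re.compile(r"(?m)<li>(.*?)</li>")


def extract_meanings(page):
    """Return the <li> texts found inside the trans-container block."""
    block = BLOCK_RE.search(page)
    if block is None:
        return []
    # With a single group, findall returns the captured strings directly.
    return ITEM_RE.findall(block.group())


print(extract_meanings(SAMPLE_PAGE))
```

Because the first pattern is non-greedy, the match ends at the first closing `</div>`, so list items in later blocks are not picked up.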

1. Java code
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

import java.io.IOException;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) throws IOException {
        CloseableHttpClient httpClient = HttpClients.createDefault();

        System.out.print("Please enter the word you want to look up: ");
        Scanner s = new Scanner(System.in);
        String word = s.nextLine();
        word = word.replaceAll(" ", "+"); // spaces in a phrase become '+' in the query string

        // Build the lookup URL from the word
        HttpGet getWordMean = new HttpGet("http://dict.youdao.com/search?q=" + word + "&keyfrom=dict.index");
        CloseableHttpResponse response = httpClient.execute(getWordMean); // fetch the result page source

        String result = EntityUtils.toString(response.getEntity(), "UTF-8");
        response.close();
        httpClient.close();
        // Note (?s): it lets '.' match newlines, which it does not do by default
        Pattern searchMeanPattern = Pattern.compile("(?s)<div class=\"trans-container\">.*?<ul>.*?</div>");
        Matcher m1 = searchMeanPattern.matcher(result); // m1 matches the whole translation <div>

        if (m1.find()) {
            String means = m1.group(); // all the meanings, including the HTML tags
            Pattern getChinese = Pattern.compile("(?m)<li>(.*?)</li>"); // (?m) turns on multiline mode
            Matcher m2 = getChinese.matcher(means);

            System.out.println("Meanings:");
            while (m2.find()) {
                // In Java, (.*?) is capture group 1, so use group(1)
                System.out.println("\t" + m2.group(1));
            }
        } else {
            System.out.println("No meaning found.");
            System.exit(0);
        }
    }
}

2. Python code
#!/usr/bin/env python3
# coding: utf-8
import re
import sys
import urllib.parse
import urllib.request

if len(sys.argv) == 1:  # no word given: print a usage hint
    print("Usage: ./dict.py <word to look up>")
    sys.exit()

# The query may be a phrase containing spaces, such as "join in", so join the arguments
word = " ".join(sys.argv[1:])
print("word: " + word)

# Lookup URL; quote_plus encodes spaces as '+'
searchUrl = "http://dict.youdao.com/search?q=" + urllib.parse.quote_plus(word) + "&keyfrom=dict.index"
response = urllib.request.urlopen(searchUrl).read().decode("utf-8")  # fetch the result page source

# Extract the part of the page source that holds the word's meanings
searchSuccess = re.search(r'(?s)<div class="trans-container">.*?<ul>.*?</div>', response)

if searchSuccess:
    # The pattern has only one group, so findall returns a list of that group's strings
    means = re.findall(r"(?m)<li>(.*?)</li>", searchSuccess.group())
    print("Meanings:")
    for mean in means:
        print("\t" + mean)  # print each meaning
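Both programs embed the word in the query string of the lookup URL. As a small sketch of that step in isolation, the function below builds the URL with `urllib.parse.urlencode`, which takes care of spaces and non-ASCII characters. The name `build_search_url` is hypothetical; the base address and the `q` and `keyfrom` parameters are the ones used in the scripts above, and whether the site still answers at that address is not something this example checks.

```python
from urllib.parse import urlencode

# Base address and parameters taken from the scripts above.
BASE_URL = "http://dict.youdao.com/search"


def build_search_url(word):
    """Return the lookup URL for a word or phrase (no network access)."""
    # urlencode percent-encodes each value and turns spaces into '+'.
    query = urlencode({"q": word, "keyfrom": "dict.index"})
    return BASE_URL + "?" + query


print(build_search_url("join in"))
```

Letting the library encode the query avoids building a malformed URL by hand when the phrase contains spaces, ampersands, or Chinese characters.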
  CopyRight 2002-2022 newfreesoft.com, All Rights Reserved.