  Writing a Youdao Dictionary word lookup script in Java and Python
  Add Date : 2017-08-31      
  On a whim today I wanted to build a word lookup tool, so I went to the Youdao Dictionary website and had a look. The word being queried is embedded in the URL of a page, and the page that comes back contains the interpretation of the word. So the only technical knowledge needed is:

Regular Expressions

What we have to do is extract the interpretation of the word from the fetched page source, so only that use of regular expressions is covered here.
Analyzing the page source, we can see that the interpretation of the word sits inside a div tag.

The first goal is to grab this part of the page. The regular expression can be written as:

(?s)<div class=\"trans-container\">.*?<ul>.*?</div>
// (?s) makes '.' match newlines as well; by default it does not
// .*? matches any number of characters in non-greedy mode

Once this part has been obtained, what we need next is the interpretation of the word inside it, so we can write:

(?m)<li>(.*?)</li>
// (?m) enables multi-line mode, in which ^ and $ match at every line boundary; this pattern has no anchors, so the flag is harmless but not required
// (.*?) is wrapped in parentheses so that the meaning of the word is captured directly, without the surrounding tags

The full code follows.
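Before the full programs, the two patterns can be tried on a small hand-written fragment. This is only a minimal Python 3 sketch: the fragment `SAMPLE_HTML` and the helper name `extract_meanings` are assumptions made for illustration, and the markup of the real page may differ.

```python
import re

# Hypothetical fragment imitating the structure described above;
# the real page is larger and its markup may differ.
SAMPLE_HTML = """<div id="results">
<div class="trans-container">
<ul>
<li>n. test; trial</li>
<li>v. to examine</li>
</ul>
</div>
<div class="other"><ul><li>unrelated</li></ul></div>
</div>"""

# Step 1: (?s) lets '.' cross newlines; the non-greedy '.*?' stops at the first </div>.
BLOCK_RE = re.compile(r'(?s)<div class="trans-container">.*?<ul>.*?</div>')
# Step 2: capture only the text between <li> and </li>.
ITEM_RE = re.compile(r"<li>(.*?)</li>")


def extract_meanings(html):
    """Return the <li> texts inside the first trans-container block."""
    block = BLOCK_RE.search(html)
    if block is None:
        return []
    return ITEM_RE.findall(block.group())


print(extract_meanings(SAMPLE_HTML))  # ['n. test; trial', 'v. to examine']
```

Because the first pattern is non-greedy, the match ends at the first closing div, so the unrelated list that follows is not picked up. Without `(?s)` the first pattern finds nothing in this fragment, since the opening tag and `<ul>` are on different lines.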

1. Java code
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

import java.io.IOException;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) throws IOException {
        CloseableHttpClient httpClient = HttpClients.createDefault();

        System.out.print("Please enter the word you want to look up: ");
        Scanner s = new Scanner(System.in);
        String word = s.nextLine();
        word = word.replaceAll(" ", "+"); // spaces in a phrase become '+' in the query string

        // Build the lookup URL from the word
        HttpGet getWordMean = new HttpGet("http://dict.youdao.com/search?q=" + word + "&keyfrom=dict.index");
        CloseableHttpResponse response = httpClient.execute(getWordMean); // fetch the source of the result page

        String result = EntityUtils.toString(response.getEntity());
        response.close();
        // Note the (?s): it lets '.' match newlines, which it does not do by default
        Pattern searchMeanPattern = Pattern.compile("(?s)<div class=\"trans-container\">.*?<ul>.*?</div>");
        Matcher m1 = searchMeanPattern.matcher(result); // m1 matches the whole translation <div>

        if (m1.find()) {
            String means = m1.group(); // all the explanations, including the page tags
            Pattern getChinese = Pattern.compile("(?m)<li>(.*?)</li>"); // (?m) enables multi-line mode
            Matcher m2 = getChinese.matcher(means);

            System.out.println("Interpretation:");
            while (m2.find()) {
                // (.*?) is group 1, so use group(1)
                System.out.println("\t" + m2.group(1));
            }
        } else {
            System.out.println("No interpretation found.");
            System.exit(0);
        }
    }
}

2. Python code
#!/usr/bin/python
#coding: utf-8
# Python 2 script: it relies on the print statement and urllib.urlopen
import urllib
import sys
import re

if len(sys.argv) == 1:  # no word given: print a usage hint
    print "Usage: ./dict.py <word to look up>"
    sys.exit()

word = ""
for x in range(len(sys.argv) - 1):  # the query may be a phrase with spaces, such as "join in", so join the arguments here
    word += " " + sys.argv[x + 1]
print "word:" + word

searchUrl = "http://dict.youdao.com/search?q=" + word + "&keyfrom=dict.index"  # lookup URL
response = urllib.urlopen(searchUrl).read()  # fetch the source of the result page

# Extract the part of the page source that holds the interpretation of the word
searchSuccess = re.search(r"(?s)<div class=\"trans-container\">.*?<ul>.*?</div>", response)

if searchSuccess:
    # Extract the interpretations themselves; with only one group, findall returns a list of that group's strings
    means = re.findall(r"(?m)<li>(.*?)</li>", searchSuccess.group())
    print "Interpretation:"
    for mean in means:
        print "\t" + mean  # print each interpretation
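
The script above is written for Python 2. A minimal Python 3 sketch of the same flow might look like the following. The function names `build_search_url`, `extract_meanings`, and `look_up` are hypothetical, the patterns assume the page still uses the `trans-container` markup described earlier, and the UTF-8 decoding is an assumption that the source does not confirm.

```python
import re
import sys
import urllib.parse
import urllib.request

BLOCK_RE = re.compile(r'(?s)<div class="trans-container">.*?<ul>.*?</div>')
ITEM_RE = re.compile(r"<li>(.*?)</li>")


def build_search_url(word):
    """Embed the (possibly multi-word) query in the search URL."""
    # urlencode turns spaces into '+', as the Java version does by hand.
    query = urllib.parse.urlencode({"q": word, "keyfrom": "dict.index"})
    return "http://dict.youdao.com/search?" + query


def extract_meanings(html):
    """Return the <li> texts inside the first trans-container block."""
    block = BLOCK_RE.search(html)
    return ITEM_RE.findall(block.group()) if block else []


def look_up(word):
    """Fetch the result page and return its meanings (needs network access)."""
    with urllib.request.urlopen(build_search_url(word)) as response:
        # Assumption: the page is UTF-8; undecodable bytes are replaced.
        html = response.read().decode("utf-8", errors="replace")
    return extract_meanings(html)


if __name__ == "__main__":
    if len(sys.argv) > 1:
        print("Interpretation:")
        for meaning in look_up(" ".join(sys.argv[1:])):
            print("\t" + meaning)
    else:
        print("Usage: ./dict.py <word to look up>")
```

Keeping URL construction and extraction in separate functions means both can be checked without touching the network; only `look_up` depends on the site being reachable and unchanged.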
  CopyRight 2002-2022 newfreesoft.com, All Rights Reserved.