  Linux Fundamentals: Text and Data Stream Processing Commands
     
  Add Date : 2017-08-31      
         
         
         
  1 awk: a text and data processing tool

-------------------------------------------------- ------------------------------
awk is good at analyzing data and generating reports. Put simply, awk reads a file line by line, splits each line into fields using whitespace as the default delimiter, and then analyzes and processes the fields you select.

Usage: awk 'pattern {action}' filenames

Here pattern is what awk looks for in the data, and action is the list of commands executed when a match is found. Both parts are optional: with no pattern the action runs on every line, and with no action the matching lines are printed. Curly braces ({}) group a series of instructions under a particular pattern. A regular-expression pattern is written between slashes. Prepare the example file: netstat -t >> netstat.txt
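As a minimal sketch of the pattern/action form, the snippet below runs awk on a small made-up file. The file name staff.txt and its contents are assumptions for illustration only; they are not real netstat output.

```shell
# Hypothetical sample data: name, department, salary.
workdir=$(mktemp -d)
printf '%s\n' 'alice dev 300' 'bob ops 200' 'carol dev 500' > "$workdir/staff.txt"

# Pattern only: print every line whose second field is "dev".
dev_lines=$(awk '$2 == "dev"' "$workdir/staff.txt")

# Action only: print the first field of every line.
names=$(awk '{print $1}' "$workdir/staff.txt")

# Pattern plus action: for lines matching the regex /dev/, print name and salary.
dev_pay=$(awk '/dev/ {print $1, $3}' "$workdir/staff.txt")

printf '%s\n' "$dev_lines" "$names" "$dev_pay"
rm -rf "$workdir"
```

The three calls differ only in which half of 'pattern {action}' is present, which is usually the quickest way to see what each half contributes.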

-------------------------------------------------- ------------------------------
1.1 Print output: print; formatted output: printf

-------------------------------------------------- ------------------------------
awk '{print $1, $4}' netstat.txt
awk '{printf "%-8s %-8s %-8s %-18s %-22s %-15s\n", $1, $2, $3, $4, $5, $6}' netstat.txt


1.2 Filtering records: awk '$3 == 0 && $6 == "LISTEN"' netstat.txt

-------------------------------------------------- ------------------------------
Here "==" is the comparison operator. The other comparison operators are !=, >, <, >=, <=
awk '$3 > 0 {print $0}' netstat.txt
To keep the header line, use the built-in variable NR:
awk '$3 == 0 && $6 == "LISTEN" || NR == 1' netstat.txt
With formatted output:

awk '$3 == 0 && $6 == "LISTEN" || NR == 1 {printf "%-20s %-20s %s\n", $4, $5, $6}' netstat.txt
PS: awk built-in variables
$0
 The current record (this variable holds the entire line)
 
$1 ~ $n
 The nth field of the current record; fields are separated by FS
 
FS
 The input field separator; the default is a space or a tab
 
NF
 The number of fields in the current record, that is, how many columns it has
 
NR
 The number of records read so far, that is, the line number, starting from 1; with multiple input files the value keeps accumulating across files
 
FNR
 The record number within the current file; unlike NR, it restarts at 1 for each file
 
RS
 The input record separator; the default is a newline
 
OFS
 The output field separator; the default is a space
 
ORS
 The output record separator; the default is a newline
 
FILENAME
 The name of the current input file
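To make the difference between NR and FNR concrete, here is a small sketch that feeds awk two made-up files (one.txt and two.txt are hypothetical names) and prints several built-in variables for every line. A second call shows OFS and ORS shaping the output.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'a b c' 'd e' > "$workdir/one.txt"
printf '%s\n' 'f g h i' > "$workdir/two.txt"

# NR keeps counting across files, FNR restarts for each file,
# and NF is the number of fields on the current line.
report=$(cd "$workdir" && awk '{print FILENAME, NR, FNR, NF}' one.txt two.txt)
printf '%s\n' "$report"

# OFS is placed between the arguments of print; ORS ends each record.
joined=$(printf '%s\n' 'a b' 'c d' | awk 'BEGIN {OFS = "-"; ORS = ";"} {print $1, $2}')
printf '%s\n' "$joined"

rm -rf "$workdir"
```

In the first report the second column (NR) runs 1, 2, 3 while the third column (FNR) drops back to 1 when two.txt starts.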
 
Output line numbers: awk '$3 == 0 && $6 == "ESTABLISHED" || NR == 1 {printf "%02s %s %-20s %-20s %s\n", NR, FNR, $4, $5, $6}' netstat.txt


Specify the delimiter: awk 'BEGIN {FS = ":"} {print $1, $3, $6}' /etc/passwd
Or: awk -F: '{print $1, $3, $6}' /etc/passwd
Use \t as the output delimiter: awk -F: '{print $1, $3, $6}' OFS="\t" /etc/passwd


1.3 String matching: ~ tests the field on its left against the regular expression on its right.

-------------------------------------------------- ------------------------------
awk '$6 ~ /TIME/ || NR == 1 {print NR, $4, $5, $6}' OFS="\t" netstat.txt
awk '$6 ~ /ESTABLISHED/ || NR == 1 {print NR, $4, $5, $6}' OFS="\t" netstat.txt
awk '/LISTEN/' netstat.txt
Use "/FIN|TIME/" to match FIN or TIME:
awk '$6 ~ /FIN|TIME/ || NR == 1 {print NR, $4, $5, $6}' OFS="\t" netstat.txt
Negated match: !~
awk '$6 !~ /TIME/ || NR == 1 {print NR, $4, $5, $6}' OFS="\t" netstat.txt
Or: awk '!/WAIT/' netstat.txt


1.4 Splitting a file: use output redirection >

-------------------------------------------------- ------------------------------
awk 'NR != 1 {print > $6}' netstat.txt    (NR != 1 skips the header line)
Write only the specified columns to the files: awk 'NR != 1 {print $4, $5 > $6}' netstat.txt
Use control flow for conditional splitting: if / else
awk 'NR != 1 {if ($6 ~ /TIME|ESTABLISHED/) print > "1.txt";
else if ($6 ~ /LISTEN/) print > "2.txt";
else print > "3.txt"}' netstat.txt
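The splitting technique can be tried safely on throwaway data. In this sketch the file conn.txt and its two columns are made up for illustration; the output file name is built from the second field, and the parentheses around it are a precaution because some awk implementations parse an unparenthesized expression after > differently.

```shell
workdir=$(mktemp -d)
# Hypothetical sample: a header line, then "host state" pairs.
printf '%s\n' 'Host State' 'h1 LISTEN' 'h2 ESTABLISHED' 'h3 LISTEN' > "$workdir/conn.txt"

# Skip the header (NR != 1) and write field 1 to a file named after field 2.
(cd "$workdir" && awk 'NR != 1 {print $1 > ($2 ".txt")}' conn.txt)

listen_hosts=$(cat "$workdir/LISTEN.txt")
established_hosts=$(cat "$workdir/ESTABLISHED.txt")
printf '%s\n' "$listen_hosts" "$established_hosts"

rm -rf "$workdir"
```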


1.5 Statistics

-------------------------------------------------- ------------------------------
Calculate the total size of all .c, .cpp, and .h files:
ls -l *.cpp *.c *.h | awk '{sum += $5} END {print sum}'
Count the connections in each state, using an array:
awk 'NR != 1 {a[$6]++;} END {for (i in a) print i ", " a[i];}' netstat.txt
Sum the memory used by each user's processes:
ps aux | awk 'NR != 1 {a[$1] += $6;} END {for (i in a) print i ", " a[i] "KB";}'


Arrays: because awk array subscripts can be numbers or strings, they are usually called keys. Keys and values are stored in an internal hash table. A hash table is not stored sequentially, so the contents of an array are not displayed in the order you might expect. Like variables, arrays are created automatically on first use, and awk decides automatically whether a value is stored as a number or as a string. In general, awk arrays are used to collect information from records: computing sums, counting words, tracking how many times a pattern was matched, and so on.
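A short sketch of both common array uses, counting and summing, on made-up input. Since for (key in array) visits keys in no fixed order, the output is piped through sort to make it stable; the variable names are illustrative.

```shell
# Count how often each state appears (key = state, value = counter).
state_counts=$(printf '%s\n' 'LISTEN' 'ESTABLISHED' 'LISTEN' 'TIME_WAIT' 'LISTEN' |
  awk '{count[$1]++} END {for (state in count) print state, count[state]}' |
  LC_ALL=C sort)
printf '%s\n' "$state_counts"

# Sum a numeric column per key (key = user, value = running total).
user_totals=$(printf '%s\n' 'root 10' 'alice 5' 'root 7' |
  awk '{mem[$1] += $2} END {for (user in mem) print user, mem[user]}' |
  LC_ALL=C sort)
printf '%s\n' "$user_totals"
```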


1.6 Using a script for text and data processing

-------------------------------------------------- ------------------------------
BEGIN and END keywords: BEGIN marks statements that run before any line is processed, and END marks statements that run after all lines have been processed. The syntax:
BEGIN {statements to run before processing starts}
END {statements to run after all lines have been processed}
{statements to run for each line}
Example script: cat cal.awk
#!/bin/awk -f
# Before processing
BEGIN {
  math = 0
  english = 0
  computer = 0

  printf "NAME   NO.    MATH  ENGLISH COMPUTER    TOTAL\n"
  printf "---------------------------------------------\n"
}
# For each line
{
  math += $3
  english += $4
  computer += $5
  printf "%-6s %-6s %4d %8d %8d %8d\n", $1, $2, $3, $4, $5, $3 + $4 + $5
}
# After processing
END {
  printf "---------------------------------------------\n"
  printf "  TOTAL:%10d %8d %8d\n", math, english, computer
  printf "AVERAGE:%10.2f %8.2f %8.2f\n", math / NR, english / NR, computer / NR
}
Run it: awk -f cal.awk score.txt


Variable declarations and environment variables: use the -v option to pass a variable into awk, and the ENVIRON array to read environment variables

-------------------------------------------------- ------------------------------
$ x=5
$ y=10
$ export y    # y is exported as an environment variable

$ echo $x $y
5 10
$ awk -v val=$x '{print $1, $2, $3, $4 + val, $5 + ENVIRON["y"]}' OFS="\t" score.txt
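The same two mechanisms can be checked without score.txt. In this sketch the variable names bonus and penalty and the two input lines are made up; only the exported variable is reachable through ENVIRON, while the other one has to be passed with -v.

```shell
bonus=5
penalty=2
export penalty   # only exported variables are visible through ENVIRON

# add is an awk variable set from the shell; ENVIRON["penalty"] reads the environment.
adjusted=$(printf '%s\n' '10 20' '30 40' |
  awk -v add="$bonus" '{print $1 + add, $2 - ENVIRON["penalty"]}')
printf '%s\n' "$adjusted"
```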

2 sed: stream editor


-------------------------------------------------- ------------------------------
sed is a stream editor: it edits text as it streams through and matches it with regular-expression patterns. sed works well in a pipeline. It can read standard input, and it can replace, delete, insert, and extract specific lines.
Sample text: cat pets.txt
This is my cat
My cat's name is betty
This is my dog
My dog's name is frank
This is my fish
My fish's name is george
This is my goat
My goat's name is adam
Usage: sed [-nefr] [action]
Options: -i modifies the file in place instead of printing to the screen; -r enables extended regular expression syntax.
Action format: [n1[,n2]]function, where n1 and n2 select the range of lines that the function applies to. The functions include:
a (append), c (change), d (delete), i (insert), p (print), s (substitute; performs the replacement directly, e.g. 1,20s/old/new/g)

-------------------------------------------------- ------------------------------
2.1 Use the s command to replace

-------------------------------------------------- ------------------------------
Replace the string my with Rango Chen's:

sed "s/my/Rango Chen's/g" pets.txt
PS: Double quotes are used here because the replacement text contains a single quote, and a backslash cannot escape a single quote inside a single-quoted string. The command does not change the file; it only prints the processed content. To save the result, use redirection: sed "s/my/Rango Chen's/g" pets.txt > Chen_pets.txt, or use the -i option: sed -i "s/my/Rango Chen's/g" pets.txt
s stands for substitute, and /g replaces every match on a line.
Put # in front of each line: sed 's/^/#/g' pets.txt
Append --- to the end of each line: sed 's/$/ ---/g' pets.txt
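The effect of the g flag and the fact that sed leaves the input file alone are easy to verify on a scratch copy. This sketch writes a two-line stand-in for pets.txt into a temporary directory (the contents are made up) and compares the results.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'This is my cat' 'my cat likes my dog' > "$workdir/pets.txt"

# Without g only the first match on each line is replaced; with g every match is.
first_only=$(sed 's/my/your/' "$workdir/pets.txt")
all_matches=$(sed 's/my/your/g' "$workdir/pets.txt")

# The input file is untouched unless -i or a redirection is used.
unchanged=$(cat "$workdir/pets.txt")

# ^ anchors the replacement to the start of each line.
commented=$(sed 's/^/#/' "$workdir/pets.txt")

printf '%s\n' "$first_only" "$all_matches" "$commented"
rm -rf "$workdir"
```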


Basic regular expression special characters:
^ matches the beginning of a line. For example, /^#/ matches lines that begin with #.
$ matches the end of a line. For example, /}$/ matches lines that end with }.
\< matches the beginning of a word. For example, \<abc matches words that begin with abc.
\> matches the end of a word. For example, abc\> matches words that end with abc.
\ escapes a special character and restores its literal meaning: grep -n \' pets.txt finds the lines that contain a single quote.
. matches any single character.
* matches the preceding character zero or more times.
[] is a character set. For example, [abc] matches a, b, or c, and [a-zA-Z] matches any upper- or lowercase letter. A leading ^ negates the set: [^a] matches any character other than a.
\{n,m\} matches the preceding expression n to m times in a row. grep -n 'go\{2,3\}g' 1.txt finds strings with 2 to 3 o's between the two g's, that is, goog and gooog.
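A few of these special characters can be exercised with grep on a handful of made-up words. The sketch sticks to the anchors and the \{n,m\} interval, which are part of basic regular expressions; the word list is an assumption for illustration.

```shell
words=$(printf '%s\n' 'gog' 'goog' 'gooog' 'goooog' '#comment' 'end}')

# \{2,3\}: exactly two or three o's between the two g's.
two_or_three=$(printf '%s\n' "$words" | grep 'go\{2,3\}g')

# ^ and $ anchor the match to the start and end of the line.
starts_hash=$(printf '%s\n' "$words" | grep '^#')
ends_brace=$(printf '%s\n' "$words" | grep '}$')

printf '%s\n' "$two_or_three" "$starts_hash" "$ends_brace"
```

gog has too few o's and goooog has too many, so neither is selected by the interval expression.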


Remove the HTML tags from html.txt:
  <b>This</b> is what <span style="text-decoration: underline;">I</span> meant. Understand?
sed 's/<[^>]*>//g' html.txt
Replace only in lines 3 to 6: sed "3,6s/my/your/g" pets.txt
Replace only in line 3: sed "3s/my/your/g" pets.txt
Replace only the first s on each line: sed 's/s/S/1' my.txt
Replace only the second s on each line: sed 's/s/S/2' my.txt
Replace the third s and every s after it on each line: sed 's/s/S/3g' my.txt
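The tag-stripping expression, the numeric occurrence flag, and a line address can each be checked on inline input. The sample strings here are made up. The combined 3g form is left out of the sketch because it is a GNU sed extension and may not behave the same elsewhere.

```shell
# [^>]* stops at the first >, so each tag is removed separately.
html='<b>This</b> is what <span style="text-decoration: underline;">I</span> meant.'
plain=$(printf '%s\n' "$html" | sed 's/<[^>]*>//g')

# A number after the final slash selects which match on the line is replaced.
second_s=$(printf '%s\n' 'this is his list' | sed 's/s/S/2')

# A line address restricts the substitution to that line.
line_two=$(printf '%s\n' 'my a' 'my b' 'my c' | sed '2s/my/your/')

printf '%s\n' "$plain" "$second_s" "$line_two"
```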


2.2 Multiple matches

-------------------------------------------------- ------------------------------
To apply several patterns at once, separate them with semicolons: sed '1,3s/my/your/g; 3,$s/This/That/g' my.txt
The command above is equivalent to: sed -e '1,3s/my/your/g' -e '3,$s/This/That/g' my.txt
Use & to refer to the matched text, so that you can add characters around it: sed 's/my/[&]/g' my.txt
This command wraps every my in brackets, producing [my].
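As a quick check that the semicolon form and the -e form behave the same, and of what & expands to, here is a sketch on three made-up lines (the line ranges are shortened to fit the sample).

```shell
text=$(printf '%s\n' 'This is my cat' 'This is my dog' 'This is my fish')

# Two substitutions with different line ranges, written both ways.
with_semicolon=$(printf '%s\n' "$text" | sed '1,2s/my/your/; 2,$s/This/That/')
with_e=$(printf '%s\n' "$text" | sed -e '1,2s/my/your/' -e '2,$s/This/That/')

# & stands for whatever the pattern matched.
bracketed=$(printf '%s\n' "$text" | sed 's/my/[&]/g')

printf '%s\n' "$with_semicolon" "$bracketed"
```

Line 2 falls inside both ranges, so it receives both replacements.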


2.3 Parenthesized groups

-------------------------------------------------- ------------------------------
The text matched by each parenthesized group in the regular expression can be used as a variable; in sed these are written \1, \2, ... The example assumes that each line of my.txt has the form: This is my cat, my cat's name is betty

sed 's/This is my \([^,]*\),.*is \(.*\)/\1:\2/g' my.txt
cat:betty
dog:frank
fish:george
goat:adam
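The group example can be reproduced without my.txt by piping in lines of the assumed shape. The two input lines below are made up to match the pattern; the first group captures everything up to the comma and the second captures what follows the last "is ".

```shell
pairs=$(printf '%s\n' \
  "This is my cat, my cat's name is betty" \
  "This is my dog, my dog's name is frank" |
  sed 's/This is my \([^,]*\),.*is \(.*\)/\1:\2/')
printf '%s\n' "$pairs"

# Groups can also be emitted in a different order.
swapped=$(printf '%s\n' 'key=value' | sed 's/\(.*\)=\(.*\)/\2=\1/')
printf '%s\n' "$swapped"
```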


2.4 Basic concepts

-------------------------------------------------- ------------------------------
1) Pattern space: the -n option suppresses the default output; it is equivalent to --quiet and --silent. When sed processes a file, each line is stored in a temporary buffer called the pattern space, and every processed line is printed to the screen unless the line is deleted or the output is suppressed. The pattern space is then cleared and the next line is read in for processing.
2) Address: [address[,address]][!]{cmd}, where "!" means the command runs on the lines that do not match the address. An address can be a line number or a pattern, and two addresses separated by a comma select a range of lines.
3) Command grouping: cmd can be more than one command, separated by semicolons, and braces can be used to nest commands.
On lines 3 to 6, match /This/; if that succeeds, match /fish/; if that also succeeds, run the d command:
sed '3,6 {/This/{/fish/d}}' pets.txt
From the first line to the last line, delete the line if it matches This; otherwise remove any leading spaces:
sed '1,${/This/d; s/^ *//g}' pets.txt
4) Hold space:
g: copy the contents of the hold space to the pattern space, overwriting the pattern space
G: append a \n and then the contents of the hold space to the pattern space
h: copy the contents of the pattern space to the hold space, overwriting the hold space
H: append a \n and then the contents of the pattern space to the hold space
x: swap the contents of the pattern space and the hold space
Example: sed -e '/test/h' -e '$G' file. When a line matching test is found, it is in the pattern space, and the h command copies it into the hold space. The second expression means that when the last line is reached, the G command takes the line kept in the hold space and appends it to the line currently in the pattern space, which here is the last line. Put simply, the last line that contains test is copied and appended to the end of the file.
Example: swap the contents of the pattern space and the hold space. sed -e '/test/h' -e '/check/x' file swaps each line containing check with the line containing test that was saved in the hold space.
5) Running a sed script: sed -f test.sed file
sed is very strict about the commands in a script file. There must be no trailing whitespace or text after a command. If several commands share a line, separate them with semicolons. A comment line starts with # and cannot span multiple lines.
PS: remove blank lines: sed '/^ *$/d' file
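The hold-space commands are easier to follow on three-line inputs. The sketch below uses made-up lines: the first call copies a matching line to the end of the output with h and G, the second reverses the line order (a well-known sed idiom, shown here only as an illustration), and the third shows -n with p.

```shell
# h saves the line matching /test/; on the last line ($), G appends the hold space.
appended=$(printf '%s\n' 'alpha' 'test one' 'beta' | sed -e '/test/h' -e '$G')

# Reverse the input: on every line but the first, append the hold space (1!G),
# save the result (h), and print only at the end ($p).
reversed=$(printf '%s\n' 'a' 'b' 'c' | sed -n '1!G;h;$p')

# -n turns off the default output, so only lines printed by p appear.
only_match=$(printf '%s\n' 'alpha' 'test one' 'beta' | sed -n '/test/p')

printf '%s\n' "$appended" "$reversed" "$only_match"
```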

3 sort: sorting text content


-------------------------------------------------- ------------------------------
Syntax: sort [-bcdfimMnr] [-o <output file>] [-t <separator character>] [+<start field> -<end field>] [--help] [--version]
Parameters:
-b Ignore leading blanks on each line.
-c Check whether the file is already sorted.
-d Dictionary order: consider only letters, digits, and blanks, and ignore all other characters.
-f Treat lowercase letters as uppercase when sorting.
-i Consider only the ASCII characters from 040 to 176 (octal) and ignore all other characters.
-m Merge several already sorted files.
-M Sort by month-name abbreviation (the first three letters).
-n Sort by numeric value.
-o <output file> Write the sorted result to the specified file.
-r Sort in reverse order.
-t <separator character> Specify the field separator used for sorting.
+<start field> -<end field> Sort by the specified fields, from the start field up to the field before the end field.
--help Display help.
--version Display version information.
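A small sketch of field-based sorting on made-up name:number records. It uses the -k option to select the sort key, on the assumption that your sort accepts it; current versions of sort generally expect -k instead of the older +<start field> -<end field> form. LC_ALL=C is set only to make the ordering predictable.

```shell
data=$(printf '%s\n' 'carol:30' 'alice:4' 'bob:100')

# Default: compare whole lines as text.
by_name=$(printf '%s\n' "$data" | LC_ALL=C sort)

# -t: sets the separator, -k2,2 restricts the key to field 2, n compares numerically.
by_number=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -k2,2n)

# Adding r reverses the order for that key.
by_number_desc=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -k2,2nr)

printf '%s\n' "$by_name" "$by_number" "$by_number_desc"
```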


4 uniq: filter adjacent matching lines from INPUT (or standard input), writing to OUTPUT (or standard output)

-------------------------------------------------- ------------------------------
Shows unique lines: consecutive repeated lines are shown only once, and the repeats can be counted.

uniq: with no options, prints one copy of each run of adjacent identical lines
uniq -c: prefixes each line with the number of times it occurred
uniq -u: prints only the lines that are not repeated
uniq -d: prints only the lines that are repeated
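Because uniq only compares adjacent lines, it is usually combined with sort. The sketch below uses six made-up lines to show the difference; the awk step after uniq -c only normalizes the padded count column so the result is easy to read.

```shell
lines=$(printf '%s\n' 'a' 'a' 'b' 'a' 'c' 'c')

# Unsorted input: only adjacent repeats collapse, so a appears twice.
adjacent=$(printf '%s\n' "$lines" | uniq)

# Sorted input: all repeats become adjacent.
unique_only=$(printf '%s\n' "$lines" | sort | uniq -u)
repeated_only=$(printf '%s\n' "$lines" | sort | uniq -d)
counted=$(printf '%s\n' "$lines" | sort | uniq -c | awk '{print $2, $1}')

printf '%s\n' "$adjacent" "$unique_only" "$repeated_only" "$counted"
```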


5 cut: an extraction tool that pulls columns of text out of a file or text stream

-------------------------------------------------- ------------------------------
5.1 Command Usage

-------------------------------------------------- ------------------------------
cut -b list [-n] [file ...]
cut -c list [file ...]
cut -f list [-d delim] [-s] [file ...]
-b, -c, and -f select bytes, characters, and fields respectively, and list is the range to extract. -d delim sets the field delimiter, which is TAB by default; -s skips lines that contain no delimiter (which helps drop comments and titles); -n does not split multibyte characters; file is the name of the text file to operate on. --output-delimiter=string uses the given string as the output delimiter; by default the input delimiter is used.
Range of LIST:

N

Only the Nth item

N-

From item N up to the end of the line

N-M

From item N to item M (including M)

-M

From the beginning of a line to the Mth entry (including M)

-

All items from the beginning to the end of a line

 
5.2 Usage Examples

-------------------------------------------------- ------------------------------
Show the first fifteen usernames in /etc/passwd:
cut -f1 -d: /etc/passwd | head -15
Extract the first ten bytes of each line of /etc/passwd:
cut -b 1-10 /etc/passwd
Extract bytes 1, 4, and 7 of each line:
cut -b 1,4,7 /etc/passwd
Replace the delimiter in /etc/passwd with |:
cut -d: -f 1- -s --output-delimiter="|" /etc/passwd
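The LIST range forms can be tried on a single passwd-style line without reading /etc/passwd. The sample line below is made up for illustration.

```shell
line='root:x:0:0:root:/root:/bin/bash'

user=$(printf '%s\n' "$line" | cut -d: -f1)            # N: only field 1
user_shell=$(printf '%s\n' "$line" | cut -d: -f1,7)    # a list of fields
from_fifth=$(printf '%s\n' "$line" | cut -d: -f5-)     # N-: field 5 to the end
up_to_two=$(printf '%s\n' "$line" | cut -d: -f-2)      # -M: start to field 2
first_four=$(printf '%s\n' "$line" | cut -c1-4)        # N-M: characters 1 to 4

# -s drops lines that do not contain the delimiter at all.
with_delim=$(printf '%s\n' 'no delimiter here' 'a:b' | cut -d: -f1 -s)

printf '%s\n' "$user" "$user_shell" "$from_fifth" "$up_to_two" "$first_four" "$with_delim"
```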

6 grep: extracts the lines that contain specific information, working line by line

-------------------------------------------------- ------------------------------
Usage: grep [-acinv] [--color=auto] 'search string' filename
-a: search a binary file as if it were a text file
-c: count the lines on which 'search string' was found
-i: ignore case, so upper- and lowercase are treated as the same
-n: also print the line number
-v: invert the selection, that is, show the lines that do not contain 'search string'
--color=auto: highlight the matched keyword in color
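Each option can be seen on a four-line made-up list. Note that -c reports the number of matching lines, not the number of individual matches.

```shell
text=$(printf '%s\n' 'Apple pie' 'banana' 'apple tart' 'cherry')

ignore_case=$(printf '%s\n' "$text" | grep -i 'apple')   # matches Apple and apple
match_count=$(printf '%s\n' "$text" | grep -c 'apple')   # case-sensitive line count
numbered=$(printf '%s\n' "$text" | grep -n 'an')         # prefix with line number
inverted=$(printf '%s\n' "$text" | grep -v -i 'apple')   # lines without the string

printf '%s\n' "$ignore_case" "$match_count" "$numbered" "$inverted"
```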


7 wc: word count; print newline, word, and byte counts for each file

-------------------------------------------------- ------------------------------
Usage: wc [-lwmc]
-l: count only lines;
-w: count only words;

-m: count characters;
-c: count bytes
With no options, wc prints the line, word, and byte counts.
Example: count the lines, words, and bytes in /etc/man.config:
cat /etc/man.config | wc
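Since /etc/man.config may not exist on every system, here is a sketch that counts a two-line made-up sample instead. The tr step strips the padding that some wc implementations put in front of the number.

```shell
sample=$(printf '%s\n' 'one two' 'three')

line_count=$(printf '%s\n' "$sample" | wc -l | tr -d ' ')
word_count=$(printf '%s\n' "$sample" | wc -w | tr -d ' ')
byte_count=$(printf '%s\n' "$sample" | wc -c | tr -d ' ')   # includes both newlines

printf '%s\n' "$line_count" "$word_count" "$byte_count"
```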
     
         
         
         
  CopyRight 2002-2022 newfreesoft.com, All Rights Reserved.