Summary of the invention
In view of this, embodiments of the present invention provide an access log merging method, a log processing server and a system, so as to automatically collect and merge the access logs recorded by multiple web servers.
To achieve the above object, embodiments of the present invention provide the following technical solutions:
An access log merging method, applied to a log processing server, the access log merging method comprising:
Copying the access logs recorded by each web server, wherein the suffix of the log name of each access log carries the server identifier of the corresponding web server;
Determining, from the copied access logs, access log sets whose log names are identical but whose log name suffixes carry different server identifiers;
Merging the access logs in each access log set with one another to obtain the merged access log corresponding to each access log set.
Optionally, copying the access logs recorded by each web server comprises:
Determining the server identifier of each web server, the log names of the access logs to be copied, and the login password of each web server;
Remotely logging in to each web server according to its server identifier and login password, and copying the access logs recorded by each web server that correspond to the log names of the access logs to be copied;
Adding the server identifier of the corresponding web server to the suffix of the log name of each copied access log.
Optionally, merging the access logs in each access log set with one another to obtain the merged access log corresponding to each access log set comprises:
For each access log set, outputting the log content of each access log in the access log set to a set download directory, deduplicating and sorting the output log content of the access logs to obtain merged log content, and renaming the file containing the merged log content to obtain the merged access log.
Optionally, determining the server identifier of each web server, the log names of the access logs to be copied, and the login password of each web server comprises:
Executing a first script and calling a second script through the first script, the first script passing the server identifier of each web server, the log names of the access logs to be copied, and the login password of each web server to the second script;
Remotely logging in to each web server according to its server identifier and login password, and copying the access logs recorded by each web server that correspond to the log names of the access logs to be copied, comprises:
Remotely logging in to each web server through the second script according to the server identifier and login password of each web server, and executing a remote copy command through the second script to copy the access logs recorded by each web server that correspond to the log names of the access logs to be copied;
Adding the server identifier of the corresponding web server to the suffix of the log name of each copied access log comprises:
Adding, through the second script, the server identifier of the corresponding web server to the suffix of the log name of each copied access log.
Optionally, the first script determines, from the copied access logs, the access log sets whose log names are identical but whose log name suffixes carry different server identifiers; and the first script merges the access logs in each access log set with one another to obtain the merged access log corresponding to each access log set.
An embodiment of the present invention also provides a log processing server, comprising:
A first script, configured to call a second script to copy the access logs recorded by each web server, wherein the suffix of the log name of each access log carries the server identifier of the corresponding web server; and to determine, from the copied access logs, access log sets whose log names are identical but whose log name suffixes carry different server identifiers, and merge the access logs in each access log set with one another to obtain the merged access log corresponding to each access log set;
A second script, configured to be called by the first script to copy the access logs recorded by each web server, wherein the suffix of the log name of each access log carries the server identifier of the corresponding web server.
Optionally, the first script comprises:
A transfer unit, configured to pass the server identifier of each web server, the log names of the access logs to be copied, and the login password of each web server to the second script;
The second script comprises:
A remote login and copy unit, configured to remotely log in to each web server according to its server identifier and login password, and to execute a remote copy command to copy the access logs recorded by each web server that correspond to the log names of the access logs to be copied;
An identifier adding unit, configured to add the server identifier of the corresponding web server to the suffix of the log name of each copied access log.
Optionally, the first script comprises:
A merging unit, configured to, for each access log set, output the log content of each access log in the access log set to a set download directory, deduplicate and sort the output log content of the access logs to obtain merged log content, and rename the file containing the merged log content to obtain the merged access log.
An embodiment of the present invention also provides an access log merging system, comprising: a proxy server, multiple web servers, and a log processing server;
Wherein, the proxy server is configured to distribute user access to the multiple web servers;
The web servers are configured to record the access logs corresponding to user access;
The log processing server is configured to copy the access logs recorded by each web server, wherein the suffix of the log name of each access log carries the server identifier of the corresponding web server; to determine, from the copied access logs, access log sets whose log names are identical but whose log name suffixes carry different server identifiers; and to merge the access logs in each access log set with one another to obtain the merged access log corresponding to each access log set.
Optionally, the log processing server has a first script and a second script;
The first script is configured to call the second script to copy the access logs recorded by each web server, wherein the suffix of the log name of each access log carries the server identifier of the corresponding web server; and to determine, from the copied access logs, access log sets whose log names are identical but whose log name suffixes carry different server identifiers, and merge the access logs in each access log set with one another to obtain the merged access log corresponding to each access log set;
The second script is configured to be called by the first script to copy the access logs recorded by each web server, wherein the suffix of the log name of each access log carries the server identifier of the corresponding web server.
Based on the above technical solutions, the access log merging method provided by embodiments of the present invention can be applied in a log processing server. The log processing server can copy the access logs recorded by each web server, wherein the suffix of the log name of each access log carries the server identifier of the corresponding web server; determine, from the copied access logs, access log sets whose log names are identical but whose log name suffixes carry different server identifiers; and then merge the access logs in each access log set with one another to obtain the merged access log corresponding to each access log set. It can be seen that the method carries the server identifier of the corresponding web server in the suffix of the log name of each access log collected from the web servers, and can thereby determine that access logs with identical log names but different server identifiers in their suffixes need to be merged, which makes automatic merging of access logs possible. The access log merging method provided by embodiments of the present invention can thus automatically collect and merge the access logs recorded by multiple web servers.
Specific embodiment
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Fig. 2 is a structural block diagram of the access log merging system provided by an embodiment of the present invention; the access log merging method provided by an embodiment of the present invention may be based on the system shown in Fig. 2. Referring to Fig. 2, the access log merging system may include: a proxy server 1, multiple web servers 2, and a log processing server 3;
Wherein, the proxy server 1 is arranged at the front end of the website and can distribute user access to the multiple web servers 2;
Each web server 2 is arranged at the back end of the website, can process user access, and records the access log corresponding to the user access;
The log processing server 3 is a server set up by the embodiment of the present invention for merging access logs; the log processing server 3 collects and copies the access logs recorded by each web server 2 and merges them to obtain merged access logs, so that the merged access logs can be downloaded and analyzed;
Optionally, the log processing server 3 may belong to the same cluster as the multiple web servers 2 and may communicate with them over the intranet.
With reference to the access log merging system shown in Fig. 2, the access log merging method provided by the embodiment of the present invention is described below from the perspective of the log processing server.
Fig. 3 is a flow chart of the access log merging method provided by an embodiment of the present invention; the method may be applied to the log processing server. Referring to Fig. 3, the method may include:
Step S100, copying the access logs recorded by each web server, wherein the suffix of the log name of each access log carries the server identifier of the corresponding web server;
An access log is the log information that a web server records when a user accesses the website; its content includes but is not limited to: the IP of the accessing user, the accessed link address, the mode of access (the browser used), the web server response time, the response time of the user's entire browsing process, and so on. Under normal circumstances, access logs are rotated, that is, split and stored by date;
The embodiment of the present invention distributes user access across multiple web servers. Therefore, for a given distributed user access, the access logs recorded by the multiple web servers have the same log name and differ only in which web server processed and recorded them. The access logs collected from the web servers thus have the same log name and log format; to distinguish the source of each access log, the embodiment of the present invention may, after collecting the access logs recorded by each web server, have the suffix of the log name of each access log carry the server identifier (such as the server IP) of the corresponding web server;
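The naming convention described above can be sketched as follows. This is a minimal illustration in Python rather than the scripts of the embodiment; the underscore separator and the helper names `suffixed_log_name` and `split_suffixed_name` are assumptions introduced here, since the text only states that the suffix of the log name carries the server identifier.

```python
def suffixed_log_name(log_name: str, server_id: str, sep: str = "_") -> str:
    """Return the local name of a copied log: original name + separator + server id."""
    if not log_name or not server_id:
        raise ValueError("log name and server identifier must be non-empty")
    return f"{log_name}{sep}{server_id}"


def split_suffixed_name(file_name: str, sep: str = "_") -> tuple[str, str]:
    """Split a suffixed file name back into (log name, server identifier)."""
    # Split on the LAST separator so that log names containing it still work.
    log_name, found, server_id = file_name.rpartition(sep)
    if not found or not log_name or not server_id:
        raise ValueError(f"no server identifier suffix in {file_name!r}")
    return log_name, server_id
```

Splitting on the last separator keeps the scheme reversible even when the original log name itself contains an underscore, provided the server identifier (for example an IP address) does not.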
Optionally, when copying the access logs recorded by each web server, the embodiment of the present invention may determine information such as the server identifier of each web server and the log names of the access logs to be copied, communicate with each web server by means of its server identifier, and collect from each web server the access logs corresponding to the log names of the access logs to be copied;
Optionally, the copied access logs may be stored in a specified storage area.
Step S110, determining, from the copied access logs, access log sets whose log names are identical but whose log name suffixes carry different server identifiers;
Optionally, for a given distributed user access, the copied access logs that have the same log name but carry different server identifiers in their suffixes together constitute the complete access log of that distributed user access and need to be merged. The embodiment of the present invention may therefore gather, among the copied access logs, the multiple access logs whose log names are identical but whose log name suffixes carry different server identifiers into an access log set; the access logs in an access log set are regarded as needing to be merged.
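The grouping in step S110 might be sketched as below. This is an illustrative Python sketch under stated assumptions: copied files are named "<log name>_<server identifier>" (the underscore is an assumption), and `build_access_log_sets` is a hypothetical helper name. Only log names copied from at least two different servers are returned, matching the wording "identical log names but different server identifiers"; how to treat a log present on a single server is left to the caller.

```python
from collections import defaultdict


def build_access_log_sets(file_names, sep="_"):
    """Group copied files into access log sets keyed by their shared log name."""
    by_log_name = defaultdict(dict)
    for name in file_names:
        log_name, found, server_id = name.rpartition(sep)
        if not found or not log_name or not server_id:
            continue  # not a suffixed copy; ignore it
        by_log_name[log_name][server_id] = name
    # Keep only log names that were recorded by two or more different servers.
    return {
        log_name: [copies[sid] for sid in sorted(copies)]
        for log_name, copies in sorted(by_log_name.items())
        if len(copies) >= 2
    }
```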
Step S120, merging the access logs in each access log set with one another to obtain the merged access log corresponding to each access log set.
For each access log set, the embodiment of the present invention may deduplicate and sort the content of the access logs in the set to obtain merged log content, and rename the file containing the merged log content to obtain the merged access log;
The purpose of deduplicating the content of the access logs in an access log set is to retain the useful information, prevent duplicate information from appearing repeatedly, and reduce the data volume; the content of the access logs may be sorted according to the specific time recorded in the log content;
Optionally, the embodiment of the present invention may place the merged access log in a download directory so that staff can download it and carry out subsequent work such as analysis.
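The deduplication and time-ordered sort of step S120 can be illustrated with a short Python sketch. It assumes, as a labeled assumption not stated in the text, that each log line carries a bracketed timestamp in the common web server form `[10/Oct/2023:13:55:36 +0800]`; the names `timestamp_of` and `merge_log_lines` are hypothetical.

```python
import re
from datetime import datetime

# Assumed timestamp layout: [day/Mon/year:HH:MM:SS +zone]
_TIMESTAMP = re.compile(r"\[([^\]]+)\]")


def timestamp_of(line: str) -> datetime:
    """Extract the recorded time from one access log line."""
    match = _TIMESTAMP.search(line)
    if match is None:
        raise ValueError(f"no timestamp found in line: {line!r}")
    return datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")


def merge_log_lines(*logs):
    """Deduplicate the lines of several logs and sort them by recorded time."""
    unique = {line for log in logs for line in log if line.strip()}
    # The line itself is the tie-breaker so that the order is deterministic.
    return sorted(unique, key=lambda line: (timestamp_of(line), line))
```

Sorting on the parsed time rather than on the raw text avoids mis-ordering across month boundaries, where a purely lexical sort of this timestamp format would not follow chronological order.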
The access log merging method provided by the embodiment of the present invention can be applied in a log processing server. The log processing server can copy the access logs recorded by each web server, wherein the suffix of the log name of each access log carries the server identifier of the corresponding web server; determine, from the copied access logs, access log sets whose log names are identical but whose log name suffixes carry different server identifiers; and then merge the access logs in each access log set with one another to obtain the merged access log corresponding to each access log set. It can be seen that the method carries the server identifier of the corresponding web server in the suffix of the log name of each access log collected from the web servers, and can thereby determine that access logs with identical log names but different server identifiers in their suffixes need to be merged, which makes automatic merging of access logs possible. The access log merging method provided by the embodiment of the present invention can thus automatically collect and merge the access logs recorded by multiple web servers.
Optionally, the log processing server may determine the server identifier and login password of each web server in order to remotely log in to each web server, and then copy the corresponding access logs from each web server based on the log names of the access logs to be copied;
Correspondingly, Fig. 4 shows another flow of the access log merging method provided by an embodiment of the present invention; the method may be applied to the log processing server. Referring to Fig. 4, the method may include:
Step S200, determining the server identifier of each web server, the log names of the access logs to be copied, and the login password of each web server;
Step S210, remotely logging in to each web server according to its server identifier and login password, and copying the access logs recorded by each web server that correspond to the log names of the access logs to be copied;
Step S220, adding the server identifier of the corresponding web server to the suffix of the log name of each copied access log;
Step S230, determining, from the copied access logs, access log sets whose log names are identical but whose log name suffixes carry different server identifiers;
Step S240, merging the access logs in each access log set with one another to obtain the merged access log corresponding to each access log set.
Optionally, the embodiment of the present invention may output the log content of each access log in an access log set to a set download directory, then deduplicate and sort the output log content of the access logs to obtain merged log content, and rename the file containing the merged log content to obtain the merged access log.
Optionally, the embodiment of the present invention may merge access logs by means of a first script and a second script: the second script is called by the first script, can remotely log in to each web server, and copies the access logs from each web server; the first script can merge the copied access logs;
Optionally, in the method shown in Fig. 3, the second script may be called by the first script to perform the flow of copying the access logs recorded by each web server, wherein the suffix of the log name of each access log carries the server identifier of the corresponding web server; the first script may perform the flow of determining, from the copied access logs, access log sets whose log names are identical but whose log name suffixes carry different server identifiers, and merging the access logs in each access log set with one another to obtain the merged access log corresponding to each access log set;
Correspondingly, Fig. 5 shows yet another flow of the access log merging method provided by an embodiment of the present invention; the method may be applied to the log processing server. Referring to Fig. 5, the method may include:
Step S300, executing the first script and calling the second script through the first script, the first script passing the server identifier of each web server, the log names of the access logs to be copied, and the login password of each web server to the second script;
Step S310, remotely logging in to each web server through the second script according to the server identifier and login password of each web server, and executing a remote copy command through the second script to copy the access logs recorded by each web server that correspond to the log names of the access logs to be copied;
Step S320, adding, through the second script, the server identifier of the corresponding web server to the suffix of the log name of each copied access log;
Step S330, determining, through the first script, from the copied access logs, access log sets whose log names are identical but whose log name suffixes carry different server identifiers;
Step S340, for each access log set, outputting, through the first script, the log content of each access log in the access log set to a set download directory, deduplicating and sorting the output log content of the access logs to obtain merged log content, and renaming the file containing the merged log content to obtain the merged access log.
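The division of work between the two scripts in steps S300 to S340 can be sketched end to end as follows. This is an illustrative Python sketch, not the shell scripts of the embodiment: the function names `first_script` and `second_script`, the underscore separator, the ".merged" output suffix, and the injected `fetch` callable (which stands in for remote login plus remote copy) are all assumptions made so that the flow can be run without real servers. Lines are sorted lexically here for brevity.

```python
from collections import defaultdict
from pathlib import Path


def second_script(servers, log_names, store_dir, fetch):
    """S310-S320: copy each requested log from each server, suffixing the server id."""
    for server_id, password in servers.items():
        for log_name in log_names:
            dest = Path(store_dir) / f"{log_name}_{server_id}"
            fetch(server_id, password, log_name, dest)  # stand-in for login + copy


def first_script(servers, log_names, store_dir, download_dir, fetch):
    """S300, S330, S340: pass parameters on, then group and merge the copies."""
    second_script(servers, log_names, store_dir, fetch)

    # S330: group copied files by the log name in front of the server suffix.
    sets = defaultdict(list)
    for path in sorted(Path(store_dir).iterdir()):
        log_name, found, _server_id = path.name.rpartition("_")
        if found:
            sets[log_name].append(path)

    # S340: deduplicate, sort, and write one merged file per access log set.
    merged_files = []
    for log_name, paths in sorted(sets.items()):
        unique_lines = set()
        for path in paths:
            unique_lines.update(path.read_text().splitlines())
        out = Path(download_dir) / f"{log_name}.merged"
        out.write_text("".join(line + "\n" for line in sorted(unique_lines)))
        merged_files.append(out)
    return merged_files
```

Passing the copy operation in as a parameter keeps the merge logic testable on its own; in a real deployment the callable would wrap the remote copy command.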
Optionally, the first script may serve as the main code; its file name may be log_collect.sh, and the code of the first script may be as follows:
Optionally, the second script may serve as the auxiliary code, used to remotely log in to the web servers and copy the access logs; its file name may be scp_expect.sh, and the code of the second script may be as follows:
Optionally, the access log merging method provided by the embodiment of the present invention may be executed on a schedule; for example, the first script and the second script may be executed once daily at about 4:00 AM to copy and merge the logs and place the result in a directory available for web download. A scheduled-task entry may be, for example:
1 4 * * * sh /home/shell/log_dispose/log_collect.sh
The execution flow of the first script (the log_collect.sh script) and the second script (the scp_expect.sh script) may be as follows:
When the log_collect.sh script is executed, log_collect.sh calls the scp_expect.sh script and passes to it variables such as the web server list (which records the server identifier of each web server), the log list (which records the log names of the access logs to be copied), the date of the rotated logs, the location where the copied logs are to be stored, and the password of the web servers;
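The set of variables handed from the first script to the second can be pictured as a small record. The Python sketch below is only an illustration: the class name `CopyParams`, the function `params_for`, the "YYYYMMDD" date format, and the choice of the previous day as the rotated date are assumptions not stated in the text; the storage location /data/nginx_log/ is the one named in this description.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class CopyParams:
    """Variables the first script passes to the second script."""
    servers: tuple        # server identifiers, e.g. IP addresses
    log_names: tuple      # log names of the access logs to copy
    rotation_date: str    # date of the rotated logs (assumed YYYYMMDD)
    store_dir: str        # where copied logs are stored
    password: str         # login password of the web servers


def params_for(run_day: date, servers, log_names, password,
               store_dir: str = "/data/nginx_log/") -> CopyParams:
    """Build the parameters for one scheduled run."""
    # Assumption: a run in the early morning collects the previous day's logs.
    rotated = run_day - timedelta(days=1)
    return CopyParams(tuple(servers), tuple(log_names),
                      rotated.strftime("%Y%m%d"), store_dir, password)
```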
The scp_expect.sh script uses the remote copy command scp to copy the logs to the local machine, according to the received server identifiers of the web servers and the log names of the logs to be copied. Normally, on a first connection the prompt "Are you sure you want to continue connecting (yes/no)?" appears and yes must be entered to continue; scp_expect.sh handles this prompt with expect, which ensures that the script runs automatically;
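The remote copy and prompt handling described above might be modeled as below. This Python sketch only builds the scp argument list and decides what to answer to each prompt; it does not run scp or expect. The login user "root", the underscore separator, and the function names `build_scp_command` and `reply_to_prompt` are assumptions for illustration, and the password prompt case is an assumed counterpart of the login password that the text says is passed to the second script.

```python
def build_scp_command(server_id, remote_path, local_dir, user="root"):
    """Return the scp argument list that copies one log and suffixes the server id."""
    log_name = remote_path.rsplit("/", 1)[-1]
    local_path = f"{local_dir.rstrip('/')}/{log_name}_{server_id}"
    return ["scp", f"{user}@{server_id}:{remote_path}", local_path]


def reply_to_prompt(prompt, password):
    """Decide the automated answer to an interactive prompt, or None if none is needed."""
    text = prompt.lower()
    if "continue connecting" in text:
        return "yes"       # first-connection host key confirmation
    if "password:" in text:
        return password    # login password of the web server
    return None
```

Matching on "continue connecting" rather than on the exact "(yes/no)" text is a deliberate choice here, since the wording of the bracketed options can differ between SSH client versions.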
After scp_expect.sh finishes, the log files have been downloaded to the /data/nginx_log/ directory and stored separately under changed file names (the server IP of the corresponding web server has been added to the suffix of the log name);
For access logs whose log names are identical but whose suffixes carry different server identifiers, the log_collect.sh script views the log content with the Linux cat command; all of the viewed log content goes to standard output, the content is deduplicated and sorted, and the deduplicated and sorted standard output is then written in full, using the redirection symbol ">", to a file in the directory used for web download; that file is renamed to obtain the merged access log. Optionally, deduplication and sorting are performed after all the log content has been output to standard output and before the content is written to the new file (that is, the merged access log).
Finally, permissions are set on the merged access log file so that other users have permission to access it.
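The final permission step can be sketched in Python as follows. The text does not say which permissions are granted; this sketch assumes read access for group and others, and `make_world_readable` is a hypothetical helper name.

```python
import os
import stat


def make_world_readable(path):
    """Add read permission for group and others, keeping the existing mode bits."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, current | stat.S_IRGRP | stat.S_IROTH)
```

Adding bits to the existing mode, rather than overwriting it, leaves the owner's permissions untouched.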
It should be noted that, in the embodiment of the present invention, users access the website in a one-to-many access mode: a group of users accesses one IP and is distributed across multiple web servers. That IP is the front-end proxy server, and multiple web servers at the back end share the user access. The back-end web servers run the same program, so for a given distributed user access, the access logs they record are not heterogeneous: they have the same log name and the same format. Therefore, the only change needed to the access logs collected by the embodiment of the present invention is to add the server IP to the suffix of the log name as an identifier; at merge time, logs with the same log name and different server IP suffixes are merged into one file, which ensures the accuracy of the merge.
Meanwhile, although the embodiment of the present invention uses remote copy, the log processing server and the web servers are in the same cluster and the copy goes over the intranet, so the stability and rate of copying access logs can be assured.
In addition, the processing logic of the access log merging method provided by the embodiment of the present invention is executed serially. The log processing server is set to merge logs within a certain time period (for example, 4:00 AM as described above); although resource consumption rises momentarily, overall consumption stays relatively stable, so the operating efficiency and speed of the log processing server are not greatly affected.
The log processing server provided by an embodiment of the present invention is introduced below; the log processing server described below and the access log merging method described above may be referred to in correspondence with each other.
Fig. 6 is a structural block diagram of the log processing server provided by an embodiment of the present invention. Referring to Fig. 6, the log processing server may include:
A first script 100, configured to call a second script to copy the access logs recorded by each web server, wherein the suffix of the log name of each access log carries the server identifier of the corresponding web server; and to determine, from the copied access logs, access log sets whose log names are identical but whose log name suffixes carry different server identifiers, and merge the access logs in each access log set with one another to obtain the merged access log corresponding to each access log set;
A second script 200, configured to be called by the first script to copy the access logs recorded by each web server, wherein the suffix of the log name of each access log carries the server identifier of the corresponding web server.
Optionally, Fig. 7 shows a functional architecture diagram of the first script provided by an embodiment of the present invention. Referring to Fig. 7, the first script may include:
A transfer unit 110, configured to pass the server identifier of each web server, the log names of the access logs to be copied, and the login password of each web server to the second script;
A set determination unit 120, configured to determine, from the copied access logs, access log sets whose log names are identical but whose log name suffixes carry different server identifiers;
A merging unit 130, configured to, for each access log set, output the log content of each access log in the access log set to a set download directory, deduplicate and sort the output log content of the access logs to obtain merged log content, and rename the file containing the merged log content to obtain the merged access log.
Correspondingly, Fig. 8 shows a functional architecture diagram of the second script provided by an embodiment of the present invention. Referring to Fig. 8, the second script may include:
A remote login and copy unit 210, configured to remotely log in to each web server according to its server identifier and login password, and to execute a remote copy command to copy the access logs recorded by each web server that correspond to the log names of the access logs to be copied;
An identifier adding unit 220, configured to add the server identifier of the corresponding web server to the suffix of the log name of each copied access log.
An embodiment of the present invention also provides an access log merging system, whose structure may be as shown in Fig. 2, comprising: a proxy server, multiple web servers, and a log processing server;
Wherein, the proxy server is configured to distribute user access to the multiple web servers;
The web servers are configured to record the access logs corresponding to user access;
The log processing server is configured to copy the access logs recorded by each web server, wherein the suffix of the log name of each access log carries the server identifier of the corresponding web server; to determine, from the copied access logs, access log sets whose log names are identical but whose log name suffixes carry different server identifiers; and to merge the access logs in each access log set with one another to obtain the merged access log corresponding to each access log set.
Optionally, the log processing server may have a first script and a second script;
The first script is configured to call the second script to copy the access logs recorded by each web server, wherein the suffix of the log name of each access log carries the server identifier of the corresponding web server; and to determine, from the copied access logs, access log sets whose log names are identical but whose log name suffixes carry different server identifiers, and merge the access logs in each access log set with one another to obtain the merged access log corresponding to each access log set;
The second script is configured to be called by the first script to copy the access logs recorded by each web server, wherein the suffix of the log name of each access log carries the server identifier of the corresponding web server.
Embodiments of the present invention can automatically collect and merge the access logs recorded by multiple web servers, so that access logs can be merged simply and efficiently.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, see the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate clearly the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be realized in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.