
CN116127945B - Network link processing method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN116127945B
Authority
CN
China
Prior art keywords
link
data
data request
information source
accessed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211674049.6A
Other languages
Chinese (zh)
Other versions
CN116127945A (en
Inventor
陈志群
刘双
唐圆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Zhonghong Online Co ltd
Original Assignee
Shenzhen Zhonghong Online Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Zhonghong Online Co ltd
Priority to CN202211674049.6A
Publication of CN116127945A
Application granted
Publication of CN116127945B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/205: Parsing
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/166: Editing, e.g. inserting or deleting
    • G06F40/186: Templates

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The embodiments of this application provide a network link processing method and apparatus, an electronic device, and a storage medium, relating to the field of Internet technology. The method includes: obtaining a link to be accessed and a keyword field; screening a preset information source database according to the link to be accessed and the keyword field to obtain a selected information source; screening a preset parsing template database according to the selected information source to obtain a selected parsing template; parsing the link to be accessed according to the selected parsing template to obtain a target data request; and sending the target data request to the link to be accessed to obtain target link data. In the embodiments of this application, the obtained target data request is paired with an information source, which improves the completeness of the acquired link data.

Description

Network link processing method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of internet technologies, and in particular, to a network link processing method and apparatus, an electronic device, and a storage medium.
Background
In general, to acquire the link data of an article from its network link, the network link must be accessed. The link data includes the title of the article, its release time, its author, its body text, and the like. However, owing to restrictions imposed by the network link, directly accessing the network link is highly likely to return incomplete link data. Therefore, how to provide a network link processing method that improves the integrity of the acquired link data is a technical problem to be solved.
Disclosure of Invention
The embodiment of the application mainly aims to provide a network link processing method and device, electronic equipment and storage medium, which can improve the integrity of acquired link data.
To achieve the above object, a first aspect of an embodiment of the present application provides a network link processing method, where the method includes:
acquiring a link to be accessed and a keyword field;
screening a preset information source database according to the link to be accessed and the keyword field to obtain a selected information source;
screening a preset parsing template database according to the selected information source to obtain a selected parsing template;
parsing the link to be accessed according to the selected parsing template to obtain a target data request;
and sending the target data request to the link to be accessed to obtain target link data.
In some embodiments, screening a preset information source database according to the link to be accessed and the keyword field to obtain a selected information source, including:
if the keyword field is an empty field, performing domain name resolution on the link to be accessed to obtain a link domain name;
and screening the information source database according to the link domain name to obtain the selected information source, wherein the information source database comprises matching information of the link domain name and the selected information source.
In some embodiments, screening a preset information source database according to the link to be accessed and the keyword field to obtain a selected information source, including:
if the keyword field is a non-null field, extracting keywords from the keyword field;
Performing similarity calculation according to the keywords and the names of each information source of the information source database to obtain similarity;
and obtaining the selected information source according to the information sources with the similarity meeting the preset similarity condition.
In some embodiments, parsing the link to be accessed according to the selected parsing template to obtain a target data request includes:
Identifying the request parameters of the link to be accessed to obtain an information identification code, wherein the information identification code is used for representing the identification of the target link data corresponding to the link to be accessed;
screening a preset data request database according to the information identification code to obtain the target data request, wherein the data request database comprises matching information of the information identification code and the target data request, and the target data request is used for requesting to acquire the target link data.
In some embodiments, before the screening the preset data request database according to the information identification code to obtain the target data request, the method further includes:
creating the data request database, which specifically comprises:
Acquiring at least one initial data request of a preset information source;
Carrying out parameter modification on the initial data request to obtain at least one intermediate data request, wherein the parameter modification comprises deleting parameters or modifying values of parameters;
sending the intermediate data request to the link to be accessed to obtain intermediate link data;
screening the intermediate data request of which the integrity of the intermediate link data meets a preset integrity condition to obtain the target data request;
And obtaining the data request database according to the target data request.
In some embodiments, before the screening the intermediate data request with the integrity of the intermediate link data meeting a preset integrity condition, the method further includes:
filtering the intermediate data request, specifically including:
acquiring any two pieces of intermediate link data as first intermediate link data and second intermediate link data;
comparing the data volume of the first intermediate link data with the data volume of the second intermediate link data to obtain a comparison result;
Deleting the intermediate data request corresponding to the first intermediate link data or deleting the intermediate data request corresponding to the second intermediate link data according to the comparison result.
In some embodiments, deleting the intermediate data request corresponding to the first intermediate link data or deleting the intermediate data request corresponding to the second intermediate link data according to the comparison result includes:
If the comparison result is that the data volume of the first intermediate link data is larger than or equal to the data volume of the second intermediate link data, deleting the intermediate data request corresponding to the first intermediate link data;
And if the comparison result is that the data volume of the first intermediate link data is smaller than the data volume of the second intermediate link data, deleting the intermediate data request corresponding to the second intermediate link data.
To achieve the above object, a second aspect of an embodiment of the present application provides a network link processing apparatus, including:
an acquisition module, used for acquiring the link to be accessed and the keyword field;
an information source screening module, used for screening a preset information source database according to the link to be accessed and the keyword field to obtain a selected information source;
a target screening module, used for screening a preset parsing template database according to the selected information source to obtain a selected parsing template;
a parsing module, used for parsing the link to be accessed according to the selected parsing template to obtain a target data request;
and a sending module, used for sending the target data request to the link to be accessed to obtain target link data.
To achieve the above object, a third aspect of the embodiments of the present application proposes an electronic device, including a memory storing a computer program and a processor implementing the method according to the first aspect when the processor executes the computer program.
To achieve the above object, a fourth aspect of the embodiments of the present application proposes a storage medium, which is a computer-readable storage medium storing a computer program that, when executed by a processor, implements the method according to the first aspect.
According to the network link processing method and apparatus, the electronic device, and the storage medium provided by the application, a matching information source is determined according to the link to be accessed and the keyword field, a parsing template is selected according to the information source, the link to be accessed is parsed according to the parsing template to obtain a target data request, and the target data request is then sent to the link to be accessed, so that the server to which the link to be accessed belongs returns link data according to the target data request, and the target link data is finally obtained. In the embodiments of the application, the obtained target data request is paired with the information source, so that the integrity of the acquired link data can be improved.
Drawings
FIG. 1 is a flowchart of a network link processing method provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of one embodiment of step S101 of FIG. 1;
FIG. 3 is a flowchart of step S102 in FIG. 1 in one embodiment;
FIG. 4 is a flowchart of step S102 in FIG. 1 in another embodiment;
FIG. 5 is a flowchart of step S104 in FIG. 1;
FIG. 6 is a flowchart of a network link processing method according to another embodiment of the present application;
FIG. 7 is a flowchart of a network link processing method according to another embodiment of the present application;
FIG. 8 is a block diagram of a network link processing apparatus according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
It should be noted that although functional block division is performed in a device diagram and a logic sequence is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the block division in the device, or in the flowchart. The terms first, second and the like in the description and in the claims and in the above-described figures, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the application only and is not intended to be limiting of the application.
First, the terms used in the present application are explained:
Uniform Resource Locator (URL): each web page on the network (Internet) has a unique name identifier, commonly called its URL. A URL is the address of a web page, colloquially known as a "website address". In the embodiments of the application, a network link refers to a URL, also called a link address or web page address.
A URL typically starts with http:// or https://, and a basic URL contains, for example, a protocol type, a domain name, a path, and parameters. The protocol type is, for example, http or https, where http is the Hypertext Transfer Protocol and https is the Hypertext Transfer Protocol over Secure Socket Layer, that is, the hypertext transfer protocol carried over a secure socket layer. HTTP header parameters carry the operating parameters of HTTP requests and responses. The parameters of the URL include the parameters of the requested data.
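As a minimal illustration of the URL components named above, the following sketch splits a URL with Python's standard urllib.parse module. The URL and its parameter names are made up for the example and do not come from the application.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL, used only for illustration.
url = "https://news.example.com/article/detail?id=12345&from=share"

parts = urlsplit(url)
components = {
    "protocol": parts.scheme,             # protocol type, e.g. http or https
    "domain": parts.hostname,             # domain name
    "path": parts.path,                   # path
    "parameters": parse_qs(parts.query),  # parameters of the requested data
}
print(components)
```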
In general, a user who wants the link data of an article, such as its title, release time, author, and body text, has to open the network link in a browser to check whether the link is valid and then manually look up the link data of the linked article. Some of the link data obtained in this way may lack the body text of the article. Therefore, how to provide a network link processing method that improves the integrity of the acquired link data is a technical problem to be solved.
In addition, after a network link is obtained, its validity is usually verified manually and, if the link is valid, the article data is then checked. When different information sources (websites and apps) are inspected, this requires a great deal of manual work, and the inspection efficiency is low. Therefore, besides improving the integrity of the acquired link data, the embodiments of the application can also improve the efficiency of verifying the validity of network links.
The technical scheme of the embodiments of the application is mainly suitable for a communication network with a client-server architecture, in which the client can send a request to the server and receive the data returned by the server. The client may be, for example, an app installed on user equipment. The user equipment can be a smart phone, a tablet computer, or the like. The server may be, for example, a computer server.
The network link processing method provided by the embodiment of the application can be used in a plurality of general or special computer system environments or configurations. Such as a server computer, multiprocessor system, microprocessor-based system, set top box, programmable consumer electronics, network PC, distributed computing environment including any of the above systems or devices, and so forth. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiment of the application provides a network link processing method, a network link processing device, an electronic device and a storage medium, and specifically, the following embodiment is used for explaining, and first describes the network link processing method in the embodiment of the application.
According to one embodiment of the present application, a network link processing method is provided.
The network link specifically refers to a URL, and the corresponding link data can be obtained by accessing the network link. Accessing the network link comprises assembling one or more data requests from the network link, sending the data requests to the server corresponding to the network link, and receiving the link data that the server returns in response to the data requests, which completes the access to the network link.
Fig. 1 is an optional flowchart of a network link processing method according to an embodiment of the present application, which may include, but is not limited to, steps S101 to S105.
Step S101, obtaining a link to be accessed and a keyword field;
Step S102, screening a preset information source database according to the link to be accessed and the keyword field to obtain a selected information source;
Step S103, screening a preset parsing template database according to the selected information source to obtain a selected parsing template;
Step S104, parsing the link to be accessed according to the selected parsing template to obtain a target data request;
Step S105, sending the target data request to the link to be accessed to obtain target link data.
In steps S101 to S105 of the embodiments of the application, a matching information source is first determined according to the link to be accessed and the keyword field; a parsing template is then selected according to the information source; the link to be accessed is parsed according to the parsing template to obtain a target data request; and the target data request is sent to the link to be accessed, so that the server to which the link to be accessed belongs returns link data according to the target data request, and the target link data is finally obtained. In the embodiments of the application, determining the parsing template from the information source can improve the efficiency of parsing the link, and the obtained target data request is paired with the information source, so that the integrity of the acquired link data can be improved.
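The flow of steps S101 to S105 can be sketched roughly as follows. This is only an illustrative outline under stated assumptions: the two preset databases are modelled as in-memory dictionaries, a parsing template is modelled as a function that turns a URL into a request description, and the names SOURCE_DB, TEMPLATE_DB, build_example_request, and process_link are hypothetical. Sending is injected as a callable so that the sketch does not touch the network.

```python
from typing import Callable, Dict, Optional
from urllib.parse import urlsplit

# Hypothetical stand-in for the preset information source database
# (link domain name -> information source).
SOURCE_DB: Dict[str, str] = {"news.example.com": "ExampleNews"}


def build_example_request(url: str) -> dict:
    """Hypothetical parsing template: turns a URL into a data request."""
    return {"url": url, "headers": {"User-Agent": "sketch/0.1"}}


# Hypothetical stand-in for the preset parsing template database
# (information source -> parsing template).
TEMPLATE_DB: Dict[str, Callable[[str], dict]] = {"ExampleNews": build_example_request}


def process_link(url: str, keyword: Optional[str],
                 send: Callable[[dict], dict]) -> dict:
    # S101: the link to be accessed and the optional keyword field are the inputs.
    # S102: select the information source; here a non-empty keyword is taken
    # to name the source directly, otherwise the link domain name is looked up.
    source = keyword if keyword else SOURCE_DB[urlsplit(url).hostname]
    # S103: select the parsing template paired with that source.
    template = TEMPLATE_DB[source]
    # S104: parse the link into the target data request.
    request = template(url)
    # S105: send the request and return the target link data.
    return send(request)
```

A caller would pass a real HTTP sender as send; in a test, a stub that returns canned link data is enough.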
Steps S101 to S105 are described in detail below.
In step S101 of some embodiments, the link to be accessed is specifically a URL, which may also be referred to as a network link, a network address, a web page address, or the like. The keyword field is an optional field, which may be a null field or a non-null field. A null keyword field indicates that no keyword was provided; a non-null keyword field indicates that a keyword can be extracted from it.
The different scenarios for obtaining the links to be accessed and the keyword fields are described below in connection with fig. 2.
In one embodiment, the user enters a URL into the input box of the verification system, but does not enter a keyword, at which point the verification system will obtain the network link to be accessed and a keyword field, but the keyword field is a null field.
In another embodiment, the user enters a URL and a keyword into an input box of the verification system, at which point the verification system will obtain the network link to be accessed and the keyword field, and the keyword field is a non-null field.
When the user enters a keyword into the verification system and the cursor moves to the input box for the keyword, a plurality of candidate keywords preset by the verification system are displayed. The information sources referred to in the embodiments of the application include websites and apps, and in response to a touch or click by the user, one of the candidate keywords is selected as the keyword field input.
In step S102 of some embodiments, a preset information source database is screened according to the links to be accessed and the keyword fields, so as to obtain a selected information source.
Specifically, the information source to which the link belongs can be determined from the domain name of the URL: domain name resolution is performed on the URL to obtain the link domain name, and the information sources are then screened by the link domain name to obtain the selected information source. Referring to fig. 3, step S102 may include, but is not limited to, steps S201 to S202:
step S201, if the keyword field is a null field, carrying out domain name resolution on the link to be accessed to obtain a link domain name;
Step S202, screening an information source database according to the link domain name to obtain a selected information source, wherein the information source database comprises matching information of the link domain name and the selected information source.
In the steps S201 to S202 shown in the embodiment of the present application, the URL includes a domain name, so that the domain name of the link to be accessed can be resolved, and then the information source matching is performed on the domain name of the link obtained by the resolution, so as to obtain the selected information source.
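Steps S201 to S202 might look like the following sketch. It assumes the information source database can be modelled as a dictionary from domain names to source names, and it adds a fallback to parent domains (for example from m.news.example.com to example.com); that fallback is an assumption of this sketch and is not described in the application. The function name select_source_by_domain is hypothetical.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def select_source_by_domain(url: str, source_db: Dict[str, str]) -> Optional[str]:
    # S201: extract the link domain name from the link to be accessed.
    host = urlsplit(url).hostname
    if host is None:
        return None
    labels = host.lower().split(".")
    # S202: look the domain up in the database, trying the full host first
    # and then progressively shorter parent domains (assumed behaviour).
    for i in range(len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate in source_db:
            return source_db[candidate]
    return None
```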
In another embodiment, because the time consumed by domain name resolution and information source matching affects processing efficiency, a keyword field is introduced: if the keyword field is a non-null field, a keyword is extracted from the keyword field, and the information source corresponding to the keyword is used as the selected information source.
It should be noted that, if the keyword field is a candidate keyword selected by the user by touch or click, no information source matching is required, and the selected keyword is used directly as the selected information source. Alternatively, referring to fig. 4, step S102 further includes, but is not limited to, steps S301 to S303:
Step S301, extracting keywords from the keyword field if the keyword field is a non-null field;
step S302, similarity calculation is carried out according to the keywords and the names of each information source of the information source database, and similarity is obtained;
step S303, obtaining the selected information source according to the information sources with the similarity meeting the preset similarity condition.
In the steps S301 to S303 shown in the embodiment of the application, the mode of selecting the information source is determined by the similarity between the keyword and the information source, and domain name resolution is not needed, so that the processing efficiency is greatly improved.
In step S302 of some embodiments, the keywords are vectorized to obtain a first vector, names of the information sources are vectorized to obtain a second vector, and cosine similarity or Euclidean distance is calculated according to the first vector and the second vector to obtain similarity.
In step S303 of some embodiments, the selected information source is obtained according to the information source with the greatest similarity.
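A minimal sketch of steps S301 to S303 with cosine similarity is given below. The application does not say how keywords and source names are vectorized, so character-bigram count vectors are used here purely as an assumption; the threshold of 0.3 standing in for the preset similarity condition and the function names are likewise hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, Optional


def bigram_vector(text: str) -> Counter:
    """Vectorize text as character-bigram counts (assumed vectorization)."""
    t = text.lower()
    return Counter(t[i:i + 2] for i in range(len(t) - 1))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def select_source_by_keyword(keyword: str, names: Iterable[str],
                             threshold: float = 0.3) -> Optional[str]:
    # S302: similarity between the keyword and each information source name.
    first_vector = bigram_vector(keyword)
    best_name, best_score = None, threshold
    for name in names:
        score = cosine(first_vector, bigram_vector(name))
        # S303: keep the most similar source that meets the threshold.
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

If Euclidean distance were used instead, the comparison would be reversed, since a smaller distance means a closer match.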
In step S103 of some embodiments, a preset parsing template database is screened according to the selected information source to obtain a selected parsing template. Specifically, the parsing template database includes a plurality of candidate parsing templates, and matching information of each candidate parsing template and an information source. After the selected information source is obtained in step S102, the parsing template database is screened according to the selected information source, which is actually searching for matching information, and finding out the candidate parsing template paired with the selected information source, thereby obtaining the selected parsing template.
In step S104 of some embodiments, the link to be accessed is parsed according to the selected parsing template, resulting in a target data request.
Referring to fig. 5, step S104 includes, but is not limited to, steps S401 to S402:
Step S401, identifying request parameters of the link to be accessed to obtain an information identification code, wherein the information identification code is used for representing the identification of the target link data corresponding to the link to be accessed;
Step S402, screening a preset data request database according to the information identification code to obtain a target data request, wherein the data request database comprises matching information of the information identification code and the target data request, and the target data request is used for requesting to acquire target link data.
In the embodiments of the application, each URL to be accessed has corresponding request parameters, which indicate the information identification code of the content to be acquired; the information identification code indicates which target link data is to be acquired. For example, the unique identifier in the URL, that is, the information identification code referred to in the embodiments of the application, can be obtained from the URL to be accessed, so that it is known which article or which video is to be acquired. The data request database comprises a plurality of candidate data requests and matching information between information identification codes and candidate data requests. The candidate data requests in the data request database are screened according to the information identification code to obtain the target data request. The target link data can then be obtained by sending the target data request.
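Steps S401 to S402 can be sketched as follows, assuming the information identification code is carried in a query parameter whose name is supplied by the selected parsing template, and assuming the data request database is a dictionary keyed by that code. The parameter name id_param, the dictionary layout, and the function names are hypothetical.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit, parse_qs


def extract_info_id(url: str, id_param: str) -> Optional[str]:
    """S401: read the information identification code from the request parameters."""
    values = parse_qs(urlsplit(url).query).get(id_param)
    return values[0] if values else None


def lookup_target_request(url: str, id_param: str,
                          request_db: Dict[str, dict]) -> Optional[dict]:
    """S402: screen the data request database by the information identification code."""
    info_id = extract_info_id(url, id_param)
    if info_id is None:
        return None
    return request_db.get(info_id)
```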
Referring to fig. 6, before step S402, the network link processing method provided in the embodiment of the present application further includes:
the data request database is created, specifically including steps S501 to S505:
Step S501, at least one initial data request of a preset information source is obtained;
Step S502, carrying out parameter modification on the initial data request to obtain at least one intermediate data request, wherein the parameter modification comprises deleting parameters or modifying values of the parameters;
step S503, the intermediate data request is sent to the link to be accessed to obtain intermediate link data;
Step S504, screening the intermediate data request with the integrity of the intermediate link data conforming to the preset integrity condition to obtain a target data request;
step S505, a data request database is obtained according to the target data request.
In steps S501 to S505 of the embodiments of the application, at least one initial data request of a preset information source is captured with a packet capture tool; accessing the designated link with the initial data request returns the corresponding link data. However, an initial data request on its own may contain redundant parameters, or parameters whose values change dynamically, so parameters are deleted or their values are modified to obtain intermediate data requests. The resulting intermediate data requests are sent to the link to be accessed, and the returned intermediate link data is received. The target data request is then determined according to the integrity of the intermediate link data: the higher the integrity of the intermediate link data, the more likely the corresponding intermediate data request is to be selected as the target data request. Finally, the target data request is matched with the information type, and the matched result is stored in a database to obtain the data request database.
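The database-building loop of steps S502 to S504 could be sketched as below. Several things here are assumptions of the sketch rather than details from the application: a data request is reduced to its parameter dictionary, only single-parameter deletions are generated (modifying values is omitted), and the preset integrity condition is taken to be "all of the listed fields are present and non-empty". The field names and function names are hypothetical, and sending is injected as a callable.

```python
from typing import Callable, Dict, List

# Assumed integrity condition: these fields must all be present and non-empty.
REQUIRED_FIELDS = ("title", "publish_time", "author", "body")


def deletion_variants(params: Dict[str, str]) -> List[Dict[str, str]]:
    """S502: intermediate requests obtained by deleting one parameter each."""
    return [{k: v for k, v in params.items() if k != removed} for removed in params]


def is_complete(link_data: Dict[str, str]) -> bool:
    return all(link_data.get(field) for field in REQUIRED_FIELDS)


def build_target_requests(initial_params: Dict[str, str],
                          send: Callable[[Dict[str, str]], Dict[str, str]]
                          ) -> List[Dict[str, str]]:
    targets = []
    for candidate in deletion_variants(initial_params):
        link_data = send(candidate)      # S503: obtain intermediate link data
        if is_complete(link_data):       # S504: keep requests meeting the condition
            targets.append(candidate)
    return targets
```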
It should be noted that, in step S504, if the integrity of the intermediate link data meets the preset integrity condition, the corresponding intermediate data request may be added to the target data request. In this way, there may be two identical intermediate data requests, both of which can acquire intermediate link data with consistent data, resulting in redundant data requests in the target data request.
Therefore, referring to fig. 7, before step S504, the network link processing method according to the embodiment of the present application further includes:
filtering the intermediate data request specifically comprises:
step S601, acquiring any two pieces of intermediate link data to obtain first intermediate link data and second intermediate link data;
Step S602, comparing the data volume of the first intermediate link data with the data volume of the second intermediate link data to obtain a comparison result;
step S603, deleting the intermediate data request corresponding to the first intermediate link data or deleting the intermediate data request corresponding to the second intermediate link data according to the comparison result.
In steps S601 to S603 of the embodiments of the application, considering that identical intermediate data requests return identical intermediate link data, the intermediate data requests are filtered, which reduces the number of requests sent to the link to be accessed and improves communication efficiency.
In some embodiments, in step S603, if the comparison result shows that the data volume of the first intermediate link data is greater than or equal to the data volume of the second intermediate link data, the intermediate data request corresponding to the first intermediate link data is deleted;
and if the comparison result shows that the data volume of the first intermediate link data is smaller than the data volume of the second intermediate link data, the intermediate data request corresponding to the second intermediate link data is deleted.
Specifically, if the comparison result shows that the two data volumes are equal, either the intermediate data request corresponding to the first intermediate link data or the intermediate data request corresponding to the second intermediate link data may be deleted.
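The pairwise filtering of steps S601 to S603 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the embodiment: the `Probe` record, its field names, and the decision to apply the rule only to pairs whose link data is byte-for-byte identical are hypothetical, and the data volume is taken to be the length in bytes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    """One intermediate data request together with the link data it returned."""
    request_id: str   # hypothetical identifier of the intermediate data request
    link_data: bytes  # intermediate link data returned for that request


def request_to_delete(first: Probe, second: Probe) -> str:
    """Steps S602 and S603: compare data volumes and pick the request to delete."""
    if len(first.link_data) >= len(second.link_data):
        return first.request_id
    return second.request_id


def filter_redundant(probes: list[Probe]) -> list[Probe]:
    """Apply the rule to every pair that returned identical data (assumption)."""
    deleted: set[str] = set()
    for i, first in enumerate(probes):
        for second in probes[i + 1:]:
            if first.request_id in deleted or second.request_id in deleted:
                continue  # one side of the pair has already been removed
            if first.link_data == second.link_data:
                deleted.add(request_to_delete(first, second))
    return [p for p in probes if p.request_id not in deleted]
```

When two requests return identical data their data volumes are equal, so the sketch deletes the first request of the pair; deleting the second instead would be equally consistent with the description.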
In some embodiments, the number of parameters is taken into account, and the intermediate data request with the larger number of parameters is deleted. This may include: obtaining the intermediate data request corresponding to the first intermediate link data as a first intermediate data request; obtaining the intermediate data request corresponding to the second intermediate link data as a second intermediate data request; comparing the number of parameters of the first intermediate data request with the number of parameters of the second intermediate data request to obtain a number comparison result; deleting the first intermediate data request if its number of parameters is greater than or equal to that of the second intermediate data request; and deleting the second intermediate data request if the number of parameters of the first intermediate data request is less than that of the second intermediate data request. Through these steps, the intermediate data request with the larger number of parameters is deleted, so that, in addition to reducing the number of accesses, less parameter parsing is required, the parameter parsing speed is improved, and the speed of acquiring the link data is further improved.
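The parameter-count rule can be illustrated with a short sketch. It assumes, purely for illustration, that each intermediate data request is represented as a dictionary of its parameters and that all requests in the list are already known to return the same intermediate link data; the function name is hypothetical.

```python
def keep_fewest_parameters(requests: list[dict]) -> dict:
    """Reduce a group of equivalent requests to the one with the fewest parameters.

    The current survivor plays the role of the "first" intermediate data
    request and each candidate plays the role of the "second".
    """
    if not requests:
        raise ValueError("at least one intermediate data request is required")
    survivor = requests[0]
    for candidate in requests[1:]:
        if len(survivor) >= len(candidate):
            # The first request has at least as many parameters: delete it.
            survivor = candidate
        # Otherwise the second request has more parameters and is deleted.
    return survivor
```

Because a tie deletes the first request, the later of two requests with the same number of parameters is the one that survives in this sketch.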
In step S105 of some embodiments, a target data request is sent to the link to be accessed to obtain target link data.
It should be noted that the target data request may include one data request to be sent, or may include a plurality of data requests to be sent. Each data request to be sent may include a request header or may include a plurality of request headers. The request header may be a User-Agent, cookie, token, host, etc.
1. The target link data is obtained by sending a single data request to be sent that obtains all data information in the link, which specifically includes the following cases:
1) A data request carrying only the most basic User-Agent request header can access the link to be accessed and obtain all required data fields of the data information, thereby obtaining the target link data.
2) A plurality of request headers need to be added, and the parameters of some of the request headers are time-sensitive; the link to be accessed is therefore accessed with a data request whose parameters are updated dynamically in real time, thereby obtaining the target link data.
2. The target link data is obtained by sending a plurality of data requests to be sent to obtain the link data information. Considering that the most complete link data may not be obtainable from the link to be accessed in a single access, the link to be accessed needs to be accessed multiple times, which is equivalent to sending multiple data requests to the link to be accessed, so as to obtain the most complete data information and thus the target link data. Each of the multiple accesses is performed in the same way as the single access described above and will not be described again.
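The two single-request cases above can be sketched with Python's standard `urllib.request` module. This is a hedged illustration only: the header values, the `Token` header name, and the `make_token` helper are hypothetical stand-ins for whatever time-sensitive parameters a given information source actually requires.

```python
import time
import urllib.request
from typing import Optional

# Case 1): only the most basic User-Agent header is needed (value is illustrative).
BASIC_HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; example-client)"}


def make_token(now: float) -> str:
    """Hypothetical stand-in for a time-sensitive header value."""
    return f"token-{int(now)}"


def build_request(url: str, time_sensitive: bool = False,
                  now: Optional[float] = None) -> urllib.request.Request:
    """Build a data request to be sent to the link to be accessed."""
    headers = dict(BASIC_HEADERS)
    if time_sensitive:
        # Case 2): recompute the expiring parameter each time a request is built.
        current = time.time() if now is None else now
        headers["Token"] = make_token(current)
    return urllib.request.Request(url, headers=headers)
```

Passing the returned object to `urllib.request.urlopen` would actually send it; for the multi-request case, one such request would be built per access.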
In one embodiment, after step S105, the network link processing method further includes:
determining a data format analysis method according to the selected information source;
analyzing the target link data according to the data format analysis method to obtain target analysis data;
returning the target analysis data to the front end.
It should be noted that the link data obtained by sending requests as in the above embodiments has a different data structure for each information source (website, APP): some sources return HTML, some return JSON, some return XML, and some return encrypted data that must be decrypted to obtain the final data. The data is therefore analyzed with the method that matches its source, and the result is returned to the front end.
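Choosing the analysis method by information source can be sketched as a simple dispatch table. The source names in `SOURCE_FORMATS`, the parser functions, and the shape of the returned data are assumptions made for illustration; the encrypted case is omitted because the description does not specify the scheme.

```python
import json
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the non-empty text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def parse_json(raw: str) -> dict:
    return json.loads(raw)


def parse_xml(raw: str) -> dict:
    # Map each direct child element of the root to its text content.
    root = ET.fromstring(raw)
    return {child.tag: (child.text or "").strip() for child in root}


def parse_html(raw: str) -> dict:
    collector = _TextCollector()
    collector.feed(raw)
    collector.close()
    return {"text": collector.chunks}


PARSERS = {"json": parse_json, "xml": parse_xml, "html": parse_html}

# Hypothetical mapping from selected information source to its data format.
SOURCE_FORMATS = {"example-news": "json", "example-forum": "xml", "example-blog": "html"}


def parse_target_link_data(source: str, raw: str) -> dict:
    """Pick the analysis method for the selected source and apply it."""
    return PARSERS[SOURCE_FORMATS[source]](raw)
```

A table keyed by information source keeps the format decision in one place, so supporting a new source only requires a new entry.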
Referring to fig. 8, an embodiment of the present application further provides a network link processing apparatus that can implement the above network link processing method. Fig. 8 is a block diagram of the module structure of the network link processing apparatus provided in the embodiment of the present application. The apparatus includes an obtaining module 701, an information source screening module 702, a target screening module 703, an analysis module 704, and a sending module 705. The obtaining module 701 is used for acquiring the link to be accessed and the keyword field; the information source screening module 702 is used for screening a preset information source database according to the link to be accessed and the keyword field to obtain a selected information source; the target screening module 703 is used for screening a preset analysis template database according to the selected information source to obtain a selected analysis template; the analysis module 704 is used for analyzing the link to be accessed according to the selected analysis template to obtain a target data request; and the sending module 705 is used for sending the target data request to the link to be accessed to obtain target link data.
It should be noted that, the specific implementation of the network link processing apparatus is substantially the same as the specific embodiment of the network link processing method described above, and will not be described herein again.
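The way the five modules cooperate can be sketched as a small pipeline. This is an assumption-laden illustration rather than the apparatus itself: the class name `LinkProcessor`, the callable signatures, and the use of a dictionary for the target data request are all hypothetical, and the obtaining module 701 is represented simply by the arguments of `process`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LinkProcessor:
    """Chains stand-ins for modules 702 to 705; each is injected as a callable."""
    select_source: Callable[[str, str], str]    # information source screening module 702
    select_template: Callable[[str], str]       # target screening module 703
    build_request: Callable[[str, str], dict]   # analysis module 704
    send: Callable[[dict], bytes]               # sending module 705

    def process(self, link: str, keyword_field: str) -> bytes:
        # Obtaining module 701: the link to be accessed and the keyword field.
        source = self.select_source(link, keyword_field)
        template = self.select_template(source)
        request = self.build_request(link, template)
        return self.send(request)
```

Injecting each stage as a callable keeps the sketch independent of any particular database or network library, which matches the statement that the module division is only a logical one.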
The embodiment of the application also provides an electronic device, which comprises a memory, a processor, a program stored in the memory and executable on the processor, and a data bus for implementing connection and communication between the processor and the memory, wherein the program, when executed by the processor, implements the network link processing method described above. The electronic device may be any intelligent terminal, including a tablet computer, a vehicle-mounted computer, and the like.
Referring to fig. 9, fig. 9 illustrates a hardware structure of an electronic device according to another embodiment, the electronic device includes:
The processor 801 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, etc., and is used for executing related programs to implement the technical solution provided by the embodiments of the present application;
The memory 802 may be implemented in the form of a Read-Only Memory (ROM), a static storage device, a dynamic storage device, or a Random Access Memory (RAM). The memory 802 may store an operating system and other application programs; when the technical solutions provided in the embodiments of the present application are implemented by software or firmware, the relevant program code is stored in the memory 802 and invoked by the processor 801 to execute the network link processing method of the embodiments of the present application;
an input/output interface 803 for implementing information input and output;
The communication interface 804 is configured to implement communication interaction between the device and other devices, and may implement communication in a wired manner (e.g., USB, network cable, etc.) or in a wireless manner (e.g., mobile network, Wi-Fi, Bluetooth, etc.);
A bus 805 that transfers information between the various components of the device (e.g., the processor 801, the memory 802, the input/output interface 803, and the communication interface 804);
Wherein the processor 801, the memory 802, the input/output interface 803, and the communication interface 804 implement communication connection between each other inside the device through a bus 805.
The embodiment of the application also provides a storage medium, which is a computer readable storage medium. The storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the network link processing method described above.
The memory, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
According to the network link processing method, the network link processing device, the electronic equipment and the storage medium, the matched information source is determined according to the link to be accessed and the keyword field, the analysis template is selected according to the information source, and the link to be accessed is analyzed according to the analysis template to obtain the target data request. The target data request is then sent to the link to be accessed, so that the server to which the link belongs returns link data according to the target data request, and the target link data is finally obtained. In the embodiment of the application, determining the analysis template from the information source can improve the efficiency of analyzing the link, and because the obtained target data request matches the information source, the integrity of the acquired link data can be improved.
The embodiments described in the embodiments of the present application are for more clearly describing the technical solutions of the embodiments of the present application, and do not constitute a limitation on the technical solutions provided by the embodiments of the present application, and those skilled in the art can know that, with the evolution of technology and the appearance of new application scenarios, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.
It will be appreciated by those skilled in the art that the solutions shown in fig. 1, 4-8 are not limiting on the embodiments of the application and may include more or fewer steps than shown, or may combine certain steps, or different steps.
The above described apparatus embodiments are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.
The terms "first," "second," "third," "fourth," and the like in the description of the application and in the above figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present application, "at least one (item)" means one or more, and "a plurality" means two or more. "and/or" is used to describe an association relationship of an associated object, and indicates that three relationships may exist, for example, "a and/or B" may indicate that only a exists, only B exists, and three cases of a and B exist simultaneously, where a and B may be singular or plural. The character "/" generally indicates that the context-dependent object is an "or" relationship. "at least one of" or the like means any combination of these items, including any combination of single item(s) or plural items(s). For example, at least one of a, b or c may represent a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including multiple instructions for causing an electronic device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The storage medium includes various media capable of storing programs, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The preferred embodiments of the present application have been described above with reference to the accompanying drawings, and are not thereby limiting the scope of the claims of the embodiments of the present application. Any modifications, equivalent substitutions and improvements made by those skilled in the art without departing from the scope and spirit of the embodiments of the present application shall fall within the scope of the claims of the embodiments of the present application.

Claims (10)

1. A network link processing method, characterized in that the method comprises:
obtaining a link to be accessed and a keyword field;
filtering a preset information source database according to the link to be accessed and the keyword field to obtain a selected information source;
filtering a preset parsing template database according to the selected information source to obtain a selected parsing template;
parsing the link to be accessed according to the selected parsing template to obtain a target data request;
sending the target data request to the link to be accessed to obtain target link data.
2. The method according to claim 1, characterized in that filtering the preset information source database according to the link to be accessed and the keyword field to obtain the selected information source comprises:
if the keyword field is an empty field, performing domain name resolution on the link to be accessed to obtain a link domain name;
filtering the information source database according to the link domain name to obtain the selected information source; wherein the information source database includes matching information between link domain names and selected information sources.
3. The method according to claim 2, characterized in that filtering the preset information source database according to the link to be accessed and the keyword field to obtain the selected information source comprises:
if the keyword field is a non-empty field, extracting a keyword from the keyword field;
performing a similarity calculation between the keyword and the name of each information source in the information source database to obtain a similarity;
obtaining the selected information source from the information sources whose similarity meets a preset similarity condition.
4. The method according to claim 1, characterized in that parsing the link to be accessed according to the selected parsing template to obtain the target data request comprises:
identifying request parameters of the link to be accessed to obtain an information identification code; wherein the information identification code is used to represent the identifier of the target link data corresponding to the link to be accessed;
filtering a preset data request database according to the information identification code to obtain the target data request; wherein the data request database includes matching information between information identification codes and target data requests, and the target data request is used to request acquisition of the target link data.
5. The method according to claim 4, characterized in that, before filtering the preset data request database according to the information identification code to obtain the target data request, the method further comprises:
creating the data request database, which specifically comprises:
obtaining at least one initial data request of a preset information source;
modifying parameters of the initial data request to obtain at least one intermediate data request, wherein the parameter modification includes deleting a parameter or modifying the value of a parameter;
sending the intermediate data request to the link to be accessed to obtain intermediate link data;
screening the intermediate data requests whose intermediate link data has a completeness that meets a preset completeness condition to obtain the target data request;
obtaining the data request database according to the target data request.
6. The method according to claim 5, characterized in that, before screening the intermediate data requests whose intermediate link data has a completeness that meets the preset completeness condition to obtain the target data request, the method further comprises:
filtering the intermediate data requests, which specifically comprises:
obtaining any two pieces of intermediate link data to obtain first intermediate link data and second intermediate link data;
comparing the first intermediate link data with the second intermediate link data to obtain a comparison result;
deleting, according to the comparison result, the intermediate data request corresponding to the first intermediate link data, or deleting the intermediate data request corresponding to the second intermediate link data.
7. The method according to claim 6, characterized in that deleting, according to the comparison result, the intermediate data request corresponding to the first intermediate link data, or deleting the intermediate data request corresponding to the second intermediate link data, comprises:
if the comparison result is that the data volume of the first intermediate link data is equal to the data volume of the second intermediate link data, deleting the intermediate data request corresponding to the first intermediate link data;
or, if the comparison result is that the data volume of the first intermediate link data is equal to the data volume of the second intermediate link data, deleting the intermediate data request corresponding to the second intermediate link data.
8. A network link processing apparatus, characterized in that the apparatus comprises:
an obtaining module, used to obtain a link to be accessed and a keyword field;
an information source filtering module, used to filter a preset information source database according to the link to be accessed and the keyword field to obtain a selected information source;
a target filtering module, used to filter a preset parsing template database according to the selected information source to obtain a selected parsing template;
a parsing module, used to parse the link to be accessed according to the selected parsing template to obtain a target data request;
a sending module, used to send the target data request to the link to be accessed to obtain target link data.
9. An electronic device, characterized in that the electronic device includes a memory and a processor, the memory stores a computer program, and the processor, when executing the computer program, implements the method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the method according to any one of claims 1 to 7.
CN202211674049.6A 2022-12-26 2022-12-26 Network link processing method and device, electronic equipment and storage medium Active CN116127945B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211674049.6A CN116127945B (en) 2022-12-26 2022-12-26 Network link processing method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN116127945A CN116127945A (en) 2023-05-16
CN116127945B true CN116127945B (en) 2025-11-21

Family

ID=86303821

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211674049.6A Active CN116127945B (en) 2022-12-26 2022-12-26 Network link processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116127945B (en)

Also Published As

Publication number Publication date
CN116127945A (en) 2023-05-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant