WO2009097710A1 - Method for organizing and retrieving files, module and system for organizing files, and associated storage medium - Google Patents
- Publication number
- WO2009097710A1 (PCT/CN2008/071908)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- file
- module
- index
- retrieving
- storage node
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
Definitions
- the present invention relates to the field of communications, and in particular, to a file organization method, a file retrieval method, a file organization module, a file retrieval system, and a computer readable storage medium.
- before a bill is converted by a tool for printing or display, it is stored on the storage module in the form of a file. This type of file is called a bill file.
- the billing file has the following characteristics:
- the number of files is too large. On the one hand, this wastes the storage nodes (storage space) of the file system; on the other hand, searching for and locating a file among a large number of files is inefficient. In a specific application such as the Business Operation Support System (BOSS), inefficient searching and locating of bill files slows down the printing (presentation) and reprinting of bill files, which reduces users' satisfaction with the billing service.
- the technical problem to be solved by the embodiments of the present invention is to provide a file organization method, a file retrieval method, a file organization module, a file retrieval system, and a computer readable storage medium that can solve the problems of wasted storage space and low file location efficiency caused by an excessive number of files.
- an embodiment of the present invention provides a method for organizing files, including: obtaining at least one file set; merging all the files in each of the file sets under one storage node; and establishing an index for retrieving the merged files under the storage node.
- the embodiment of the invention further provides a file retrieval method, including:
- a file corresponding to the index in the file set is output.
- an embodiment of the present invention further provides a file organization module, including:
- a merging submodule configured to merge all the files in each of the file sets under one storage node;
- an index creation submodule configured to establish an index for retrieving the merged files under the storage node.
- an embodiment of the present invention further provides a file retrieval system, including:
- the storage module corresponds to at least one storage node, configured to store an index for retrieving files in the merged file set under the storage node, and the file set;
- a master control module configured to receive a read request for reading a file in a file set merged under the storage node, and output corresponding control information according to the read request;
- a file retrieval module configured to obtain from the storage module, according to the control information of the master control module, the file set and the index for retrieving files in the file set merged under the storage node, and to output the index and the file set;
- a file output module configured to output a file corresponding to the index in the file set.
- an embodiment of the present invention further provides a computer readable storage medium, where the computer readable storage medium stores a plurality of executable instructions, where the executable instructions are used to:
- an index for retrieving the merged files under the storage node is established, thereby solving the problem of wasted storage space caused by an excessive number of files; and, based on the file and index storage relationship constructed above, a read request for reading a file in the file set merged under the storage node is received, and the file corresponding to the read request is searched for and output according to the index for retrieving files in the file set merged under the storage node, thereby solving the problem of low file location efficiency.
- FIG. 1 is a schematic diagram of a method of organizing a file according to an embodiment of the present invention.
- FIG. 2 is a schematic diagram of a file storage structure established in an embodiment of the present invention.
- FIG. 3 is a schematic diagram of a file organization module according to an embodiment of the present invention.
- FIG. 4 is a schematic diagram of a file retrieval system according to an embodiment of the present invention.
- FIG. 5 is a schematic diagram of a method of retrieving a file according to an embodiment of the present invention.
- Detailed description
- FIG. 1 is a schematic diagram of a method for organizing a file according to an embodiment of the present invention, where the method includes:
- each file set corresponds to one classification result, that is, it forms a collection of files of one or more classes; in a specific implementation, a hash algorithm may be used to classify the files, but the classification is not limited to this;
- a hash algorithm may be used to establish a directory structure for each file set, but is not limited thereto;
- the index organization may adopt a B+ tree form, and the above description is equally applicable to other embodiments of the present invention.
- the file storage structure established above may be as shown in FIG. 2, including the overall directory structure f(y), the directory files f(x) under that directory structure, the index (idx file), the packaged and compressed node (.tar.gz), and the merged files (numbered from 1.256900 to 1.659900, from 1.348699 to 1.648699, and so on).
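As an illustration of the hash-based classification and directory mapping described above, the following is a minimal sketch and not the patented implementation. The CRC32 hash, the bucket count, the `bills` root directory, and all function names are assumptions introduced here; the source only states that a hash algorithm may be used.

```python
import zlib


def classify(file_name: str, buckets: int = 16) -> int:
    """Map a file name to a class (bucket) using a stable hash (assumed: CRC32)."""
    return zlib.crc32(file_name.encode("utf-8")) % buckets


def directory_for(file_name: str, root: str = "bills", buckets: int = 16) -> str:
    """Directory that holds the file set to which the file belongs."""
    return f"{root}/{classify(file_name, buckets):02x}"


def group_into_file_sets(file_names, buckets: int = 16) -> dict:
    """Group file names into file sets, one set per classification result."""
    file_sets = {}
    for name in file_names:
        file_sets.setdefault(classify(name, buckets), []).append(name)
    return file_sets
```

Because the hash is deterministic, the directory of a file can be recomputed from its name at retrieval time without scanning the file system.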
- the method may further include:
- the files under each merged storage node are packaged and compressed.
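The merging, packaging, and indexing steps can be sketched as follows. This is a hedged example under stated assumptions: the JSON index layout, the function name `merge_file_set`, and the choice to pass file contents as an in-memory dictionary are hypothetical and are not taken from the source, which only names a `.tar.gz` node and an idx file.

```python
import io
import json
import tarfile


def merge_file_set(files: dict, archive_path: str, index_path: str) -> dict:
    """Pack every file of one file set into a single .tar.gz and write an index.

    `files` maps a file name to its content (bytes). The index maps each file
    name to its member name and size inside the archive (assumed layout).
    """
    index = {}
    with tarfile.open(archive_path, "w:gz") as archive:
        for name in sorted(files):
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
            index[name] = {"member": name, "size": len(data)}
    # Persist the index next to the archive so it can be loaded on retrieval.
    with open(index_path, "w", encoding="utf-8") as fh:
        json.dump(index, fh)
    return index
```

With this layout, many small bill files occupy one archive and one index file on the file system instead of one storage node each.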
- 101 and 102 are optional and can be applied according to the actual situation.
- an index corresponding to the file may be updated according to an operation of adding, deleting, or modifying the file.
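Index maintenance on add, delete, and modify operations might look like the sketch below. Note that it uses a sorted key list with binary search as a simplified stand-in for the B+ tree form mentioned in the text; it is not a B+ tree. The class name `SortedIndex` and its methods are hypothetical.

```python
import bisect


class SortedIndex:
    """Ordered key -> location index; a simplified stand-in for a B+ tree."""

    def __init__(self):
        self._keys = []      # kept sorted so lookups can use binary search
        self._values = {}    # key -> location of the merged file

    def add(self, key, location):
        """Add a key; adding an existing key acts as a modify operation."""
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = location

    def delete(self, key):
        """Remove a key if present."""
        if key in self._values:
            del self._values[key]
            self._keys.pop(bisect.bisect_left(self._keys, key))

    def lookup(self, key):
        """Return the location for a key, or None when it is not indexed."""
        return self._values.get(key)

    def range(self, low, high):
        """Return the keys k with low <= k <= high, in sorted order."""
        lo = bisect.bisect_left(self._keys, low)
        hi = bisect.bisect_right(self._keys, high)
        return self._keys[lo:hi]
```

Like a B+ tree, the ordered keys support range queries; unlike one, insertion into the list is linear in the number of keys, so this form only suits small indexes.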
- the above file may be a bill file, or other type of file.
- the method for organizing files in the embodiment of the present invention as shown in FIG. 1 can solve the problem of wasted storage space caused by an excessive number of files, by classifying the files, establishing a directory structure, merging the files under a storage node, and establishing an index.
- the file organization module includes a classification sub-module 41, a directory establishment sub-module 42, an acquisition sub-module 43, a merge sub-module 44, and an index establishment sub-module 45.
- the function of each submodule is as follows:
- a classification sub-module 41 configured to classify files to obtain at least one file set, where each file set corresponds to one classification result, that is, forms a collection of files of one or more classes;
- the directory creation sub-module 42 is configured to establish a directory structure for each type of file set obtained by the classification sub-module 41, where each directory file in the directory structure corresponds to one type of file set;
- the algorithm used by the classification sub-module 41 for classification, or by the directory creation sub-module 42 for establishing the directory structure, may be a hash algorithm.
- the obtaining sub-module 43 is configured to obtain various types of file sets obtained by the classification processing of the classification sub-module 41.
- the merging sub-module 44 is configured to merge all the files in each type of file set obtained by the obtaining sub-module 43 under one storage node;
- the index establishing sub-module 45 establishes an index for retrieving the merged file under the storage node.
- the index organization may adopt the form of a B+ tree, and the above description is also applicable to other embodiments of the present invention.
- each of the above functional units performs a corresponding function, and the created file storage structure can still be as shown in FIG. 2.
- all the files in each file set merged by the merging sub-module 44 may be packaged and compressed by a package compression sub-module.
- the classification sub-module 41 and the directory creation sub-module 42 may be selected according to actual conditions.
- the index (idx file) corresponding to a file may be updated by an index maintenance sub-module according to an add, delete, or modify operation on the file, so that the index is maintained.
- the above file may be a bill file, or other type of file.
- the file organization module of the embodiment of the present invention as shown in FIG. 3 classifies files, establishes a directory structure, merges files under a storage node, and establishes an index by means of its different sub-modules, and can thereby solve the problem of wasted storage space caused by an excessive number of files.
- the embodiment of the present invention further provides a storage module, which stores the storage structure of the file constructed as described above.
- FIG. 4 is a schematic diagram of a file retrieval system according to an embodiment of the present invention.
- the system mainly includes a storage module 51, a master control module 52, a file retrieval module 53, a file generation module 54, a file organization module 55, a file output module 56, and a bill presentation processing module 57, which have the following functions:
- a storage module 51, corresponding to at least one storage node, configured to store the file set and an index for retrieving the bill files in the file set merged under the storage node; the storage structure of the bill files in the storage module 51 may still be as shown in FIG. 2, and details are not described herein again;
- the receiving submodule in the master control module 52 receives, from the bill presentation processing module 57, a read request for reading a bill file in the file set merged under the storage node in the storage module 51; the read request may be a print request or a reprint request for the bill file;
- the control submodule in the master control module 52 determines whether the bill file corresponding to the read request is in the request queue. If so, it outputs first control information, carrying a request queue number, for controlling the file output module 56 to transmit the bill file corresponding to the read request directly to the bill presentation processing module 57 in the form of a file stream; the bill presentation processing module 57 may include a bill presentation processing program. Otherwise, it outputs second control information for controlling the file retrieval module 53 to obtain, from the storage module 51, the file set and the index for retrieving the bill files in the file set merged under the storage node; the second control information includes the read request and an index key value, and the file retrieval module 53 may retrieve the index and the file set according to the index key value;
- the file retrieval module 53 obtains, from the storage module 51 and according to the second control information, the file set and the index for retrieving the bill files in the file set merged under the storage node, and transmits the obtained index and file set to the file output module 56;
- the file output module 56 reads the file set and index sent by the file retrieval module 53, selects the corresponding bill file in the file set according to the index, and transmits the bill file to the bill presentation processing module 57 as a file stream, whereby the bill presentation processing module 57 triggers a print or reprint operation on the bill file to complete the bill presentation;
- after obtaining result information returned by the file retrieval module 53 indicating that the file retrieval module 53 did not obtain the index and the file set from the storage module 51, the control submodule in the master control module 52 sends, to the file generation module 54, third control information for controlling generation of the user's bill file;
- the file generating module 54 generates a bill file of the user according to the third control information of the master control module 52, and the file generating module 54 may include a file generating program;
- the file organization module 55 organizes the bill file generated by the file generating module 54 into the storage module 51; the file organization module 55 can process the generated bill file using the functions of the file organization module shown in FIG. 3, and sends the processed bill file to the storage module 51 for storage;
- the file retrieval module 53 can then obtain the generated bill file from the storage module 51 using the index sent by the master control module 52; that is, the file retrieval module 53 obtains the file set and the index for retrieving the bill files in the file set merged under the storage node, and transfers them to the file output module 56 to complete the bill presentation.
- the file output module 56 specifically includes:
- a compressed stream processing module which, after reading the file set and index sent by the file retrieval module 53 (the bill files in the file set exist in the form of a compressed package), decompresses the bill file corresponding to the index in the file set;
- an output module which transmits the bill file decompressed by the compressed stream processing module to the bill presentation processing module 57 in the form of a file stream.
- the above file retrieval system can be applied to a bill presentation subsystem in BOSS.
- the file generating module 54 and the file organization module 55 are optional and may be applied according to the actual situation.
- the bill file may be organized in a fixed format.
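The three control paths of the master control module described above (request queue hit, retrieval through the index, and generation when nothing is found) can be sketched as a single dispatch function. This is an illustrative sketch only: the function name, the callable parameters, and the use of a dictionary as the request queue are assumptions, as is caching the retrieved bill file in the queue at the end.

```python
def handle_read_request(key, request_queue, retrieve, generate, organize):
    """Return the bill file for `key`, following the three control paths.

    request_queue: dict caching recently used bill files (key -> bytes)
    retrieve(key): returns bytes, or None when index and file set are not found
    generate(key): creates a new bill file and returns its bytes
    organize(key, data): stores the generated bill file
    """
    # First control information: the bill file is already in the request queue.
    if key in request_queue:
        return request_queue[key]
    # Second control information: look the bill file up through the index.
    data = retrieve(key)
    if data is None:
        # Third control information: generate, organize, then retrieve again.
        organize(key, generate(key))
        data = retrieve(key)
    # Assumption: recently used bill files are cached for later requests.
    request_queue[key] = data
    return data
```

Passing the modules in as callables keeps the control logic independent of how storage, generation, and organization are actually implemented.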
- FIG. 5 is a schematic diagram of a file retrieval method according to an embodiment of the present invention. The method is based on the file storage structure shown in FIG. 2. Referring to FIG. 5, and in conjunction with the system shown in FIG. 4, the method includes:
- the master control module receives, from the bill presentation processing module, a read request for reading a bill file in the file set merged under the storage node; the read request may be a print request or a reprint request for the bill file, but is not limited to this;
- the master control module determines whether the read request is in an existing request queue, and if so, directly notifies the file output module of the request queue number, and the file output module then transmits the bill file directly to the bill presentation processing module as a file stream.
- the request queue provided in the embodiment of the present invention caches recently used bill files, so that a bill file can be dispatched quickly to the bill presentation processing module for processing. Bill files in the request queue that have not been used for a long time can be cleared periodically: a time limit can be set, and when the set time is exceeded, the bill file is cleared to save request queue resources;
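The request queue with timed clearing described above can be illustrated with the following sketch. The class name `RequestQueue`, its methods, and the injectable `clock` parameter are hypothetical; the source does not specify how idle time is measured or when clearing runs.

```python
import time


class RequestQueue:
    """Cache of recently used bill files with idle-time expiry."""

    def __init__(self, max_idle_seconds: float, clock=time.monotonic):
        self._max_idle = max_idle_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # key -> (bill_bytes, last_used)

    def put(self, key, bill):
        """Cache a bill file and record the time of use."""
        self._entries[key] = (bill, self._clock())

    def get(self, key):
        """Return a cached bill file and refresh its last-used time."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        self._entries[key] = (entry[0], self._clock())
        return entry[0]

    def purge(self):
        """Drop entries idle for longer than the limit; return removed keys."""
        now = self._clock()
        stale = [k for k, (_, used) in self._entries.items()
                 if now - used > self._max_idle]
        for k in stale:
            del self._entries[k]
        return stale
```

A periodic task would call `purge()` at a fixed interval; entries that keep being read are never cleared because `get` refreshes their last-used time.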
- the master control module, according to the read request, controls the file retrieval module to obtain, from the storage module, the file set and the index for retrieving the bill files in the file set merged under the storage node.
- the control information that the master control module generates for the file retrieval module includes the read request and an index key value, and the file retrieval module may retrieve the index and the file set according to the index key value;
- the file retrieval module transfers the obtained index and file set to the file output module;
- the file output module reads the file collection and index sent by the file retrieval module, and selects a corresponding bill file in the file collection according to the index;
- the file output module transmits the bill file to the bill presentation processing module in a file stream, so that the bill presentation processing module can trigger a printing or reprinting operation on the bill file to complete the bill presentation.
- the master control module obtains result information returned by the file retrieval module, where the result information indicates that the file retrieval module did not obtain the index and the file set from the storage module;
- the master control module controls the file generating module to generate the user's bill file;
- the master control module simultaneously controls the file organization module to organize the generated bill file;
- the master control module sends the index corresponding to the generated bill file to the file retrieval module;
- the file generation module sends the generated bill file of the user to the file organization module, and the file organization module can process the generated bill file according to the process of the organization method of the file shown in FIG. 1;
- the file organization module sends the processed bill file to the storage module for storage.
- the file retrieval module can obtain the generated bill file from the storage module using the index sent by the master control module in 310, and can then perform the corresponding function of 304 to complete the bill presentation.
- the above 305 is specifically:
- after the file output module reads the file set and the index sent by the file retrieval module (the bill files in the file set exist in the form of a compressed package), it decompresses the bill file corresponding to the index in the file set, and then performs the processing of 306 on the decompressed bill file. To improve efficiency, only the corresponding bill file may be decompressed according to the index, without decompressing the whole file set.
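Reading a single bill file out of a compressed file set, rather than unpacking the whole set, can be sketched as follows. The function name `read_bill` and the index shape (key mapped to archive member name) are assumptions made for illustration. Note also that with gzip the compressed stream is still read sequentially up to the requested member; only that member is materialized.

```python
import io
import tarfile


def read_bill(archive_bytes: bytes, index: dict, key: str) -> bytes:
    """Return one bill file from a .tar.gz file set without unpacking the rest.

    `index` maps an index key to the member name inside the archive
    (hypothetical layout).
    """
    member_name = index[key]
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        extracted = archive.extractfile(member_name)
        if extracted is None:
            # extractfile returns None for members that are not regular files.
            raise KeyError(key)
        return extracted.read()
```

The returned bytes can then be handed to the bill presentation processing module as a file stream.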
- the embodiment of the present invention further provides a computer readable storage medium, where the computer readable storage medium stores a plurality of executable instructions, where the executable instructions are used to:
- the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a method for organizing files. The method comprises the steps of: obtaining at least one file set; merging all the files in each of the file sets under one storage node; and establishing an index for retrieving the merged files under the storage node. The invention also relates to a method for retrieving files, a module for organizing files, a system for retrieving files, and a computer readable storage medium.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN2008100262348A CN101226546B (zh) | 2008-02-01 | 2008-02-01 | 文件的检索方法 |
| CN200810026234.8 | 2008-02-01 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2009097710A1 true WO2009097710A1 (fr) | 2009-08-13 |
Family
ID=39858541
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2008/071908 WO2009097710A1 (fr) | 2008-02-01 | 2008-08-07 | Method for organizing and retrieving files, module and system for organizing files, and associated storage medium |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN101226546B (fr) |
| WO (1) | WO2009097710A1 (fr) |
Families Citing this family (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101226546B (zh) * | 2008-02-01 | 2011-12-21 | 华为技术有限公司 | 文件的检索方法 |
| CN102148859A (zh) * | 2011-01-20 | 2011-08-10 | 南京烽火星空通信发展有限公司 | 网络文件服务方法 |
| CN102880677B (zh) * | 2012-09-11 | 2016-04-13 | 珠海金山网络游戏科技有限公司 | 一种基于哈希的文件打包和读取方法 |
| CN103853791A (zh) * | 2012-12-07 | 2014-06-11 | 腾讯科技(深圳)有限公司 | 快速检索文件的实现方法及装置 |
| CN104978330A (zh) * | 2014-04-04 | 2015-10-14 | 西南大学 | 一种数据存储的方法及装置 |
| CN107092604B (zh) * | 2016-02-18 | 2020-03-20 | 中国移动通信集团河北有限公司 | 一种文件处理方法和装置 |
| CN106547911B (zh) * | 2016-11-25 | 2020-07-10 | 长城计算机软件与系统有限公司 | 一种海量小文件的存取方法和系统 |
| CN108549545A (zh) * | 2018-04-20 | 2018-09-18 | 武汉极意网络科技有限公司 | 一种基于tornado框架的项目组织方法与系统 |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5729743A (en) * | 1995-11-17 | 1998-03-17 | Deltatech Research, Inc. | Computer apparatus and method for merging system deltas |
| CN1113304C (zh) * | 1998-09-18 | 2003-07-02 | 英业达股份有限公司 | 合并档案文件的方法和开启或执行集合档案文件的方法 |
| CN101226546A (zh) * | 2008-02-01 | 2008-07-23 | 华为技术有限公司 | 文件的组织、检索方法 |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN100454308C (zh) * | 2006-08-30 | 2009-01-21 | 华为技术有限公司 | 文件发布和检索的方法及其系统 |
-
2008
- 2008-02-01 CN CN2008100262348A patent/CN101226546B/zh not_active Expired - Fee Related
- 2008-08-07 WO PCT/CN2008/071908 patent/WO2009097710A1/fr active Application Filing
Also Published As
| Publication number | Publication date |
|---|---|
| CN101226546A (zh) | 2008-07-23 |
| CN101226546B (zh) | 2011-12-21 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2009097710A1 (fr) | Method for organizing and retrieving files, module and system for organizing files, and associated storage medium | |
| KR102007070B1 (ko) | 메모리 관리 시의 중복 제거를 위해서 기준 세트로 기준 블록을 취합하는 기법 | |
| CN102831127B (zh) | 重复数据处理方法、装置及系统 | |
| JP5316711B2 (ja) | ファイル記憶装置、ファイル記憶方法およびプログラム | |
| WO2014015488A1 (fr) | Procédé et appareil de stockage et d'interrogation de données | |
| RU2007143551A (ru) | Способ, система и читаемый компьютером носитель информации для синхронизации поддающихся изменению документов для множества клиентов | |
| US20130067237A1 (en) | Providing random access to archives with block maps | |
| CN102024005B (zh) | 信息项目集合目录的发布系统和方法 | |
| US9355108B2 (en) | Storing data files in a file system | |
| US20110004601A1 (en) | Multi-streamed method for optimizing data transfer through parallelized interlacing of data based upon sorted characteristics to minimize latencies inherent in the system | |
| CN105975495A (zh) | 大数据的存储、搜索方法及装置 | |
| WO2022082891A1 (fr) | Procédé et système d'acquisition de mégadonnées, dispositif informatique et support associé de stockage | |
| US20240220456A1 (en) | Efficient database query evaluation | |
| CN108090186A (zh) | 一种大数据平台上的电力数据去重方法 | |
| CN102750391A (zh) | 一种基于Hadoop分布式文件预览方法及其系统 | |
| CN101415029B (zh) | 文件分发方法及装置 | |
| US8346970B2 (en) | Data relay device, data receiving device and communication system | |
| CN113672307B (zh) | 数据处理方法、电子设备和计算机可读存储介质 | |
| WO2010015172A1 (fr) | Procédé de conversion et d'obtention de courrier, serveur de courrier, client et système correspondants | |
| JP5194936B2 (ja) | ファイル変換装置、ファイル変換方法及びプログラム | |
| CN110226153A (zh) | 垃圾收集系统和过程 | |
| WO2017167101A1 (fr) | Procédé et dispositif de gestion de pièces jointes de message multimédia, système de communication, et support de stockage informatique | |
| US12222821B2 (en) | Server-side inline generation of virtual synthetic backups using group fingerprints | |
| JP2015032281A (ja) | 文書管理装置及び文書管理プログラム | |
| JP7377915B2 (ja) | 個別データ検索サービスを提供する方法、コンピュータ装置、およびコンピュータプログラム |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 08783901; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: PCT application non-entry in European phase | Ref document number: 08783901; Country of ref document: EP; Kind code of ref document: A1 |