A GPU-based MapReduce platform for massive multimedia data
Technical Field
The present invention relates to the technical field of massive data processing and high-performance computing, and in particular to a GPU-based MapReduce platform for massive multimedia data.
Background Art
Since the information age entered Web 2.0, interactive multimedia systems for original content have emerged, new media such as network multimedia and mobile multimedia have become popular, and portable smart devices (e.g. iPhone, iPad, notebook computers) have become widespread. As a result, the amount of multimedia on the Internet (video, images, etc.) is growing explosively, on a massive scale. Massive numbers of pictures and videos are transmitted over the Internet, and searching for and viewing this rich picture and video content online has become an important way for many Internet users to obtain information. How to organize, manage, and search massive multimedia data effectively has become an urgent task, and it is a research hotspot in fields such as multimedia, search engines, and data mining. This requires not only advanced algorithms for content-based analysis and understanding of video data; the huge amount of computation involved in that analysis also calls for the support of cloud computing platforms, GPUs (Graphics Processing Units), and similar resources in order to process massive multimedia data.
Cloud computing is an emerging Internet-based computing model that aims to provide individual and enterprise users with on-demand computing through heterogeneous, autonomous services on the Internet. MapReduce is a distributed computing framework, proposed by Google, for realizing cloud computing. Cloud computing distributes computing tasks over a resource pool formed by a large number of computers, so that application systems can obtain computing power, storage space, and software services as needed.
In recent years, with the rapid growth of the integrated circuit and semiconductor industries, the computing performance of GPUs has advanced quickly. Meanwhile, the emergence of GPGPU (general-purpose computing on graphics processing units) means that GPUs are no longer confined to traditional graphics and image processing and display; they can also serve as high-performance general-purpose computing devices. CUDA, proposed by NVIDIA, is one such software architecture for parallel computation on GPUs. Because GPU hardware has been advancing much faster than CPU hardware, GPU performance has multiplied, and GPUs are attracting growing attention from researchers and application engineers.
Video differs from traditional documents: its complex data must be characterized by extracting a massive number of features, especially local feature points, which demands a great deal of computation. Analyzing and processing video data therefore places a heavy burden on ordinary computer systems. Given the exponential growth of video information, and of Internet video in particular, traditional computing and storage models struggle to analyze and process such massive data. With its large scale, scalability, and support for unstructured data processing, cloud computing is an excellent platform and solution for this problem.
Summary of the Invention
The object of the present invention is to overcome the defects of the prior art described above by providing a GPU-based MapReduce platform for massive multimedia data.
The object of the present invention can be achieved through the following technical solution: a GPU-based MapReduce platform for massive multimedia data, which uses a computer cluster to carry out the computation of image/video retrieval tasks, each computer cluster being provided with a plurality of CPUs and GPUs, characterized in that the platform is built on CUDA and HDFS and comprises a platform driver and a work submodule. The platform driver adopts the MapReduce computation model: it schedules the master control program on the cluster nodes and divides the image/video retrieval task into a number of Map tasks, whose data are stored in HDFS. When each Map task starts, it obtains its task data from the file list passed in by the platform driver and assigns the specific computing tasks to the work submodule. The task scheduler in the work submodule distributes the tasks to a GPU or a CPU for processing. During computation, the required data are obtained through libhdfs.so, the native library of HDFS, and after computation the results are written directly to HDFS.
Protocol Buffers serialization is used as the transport protocol between the platform driver and the work submodule, to simplify data exchange between the two; at the same time, JNI is used for their interaction, to keep that interaction efficient.
The platform driver is written in Java and is an implementation and extension of the Hadoop framework for this specific application.
The work submodule is built on CUDA and is written in C/C++ and CUDA-C.
The work submodule uses distributed caching during computation, to reduce HDFS network transfers when implementing image/video retrieval algorithms whose data are immutable, and thereby to improve the performance of the whole cluster.
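The caching idea can be sketched as follows. This is a minimal illustration in Python, not the platform's code (the work submodule is described as C/C++ and CUDA-C). The class name ImmutableDataCache and the fetch callback, which stands in for a remote read through libhdfs, are assumptions made for the example.

```python
from typing import Callable, Dict


class ImmutableDataCache:
    """Node-local cache for data that never changes during a job (hypothetical)."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch          # stand-in for a remote HDFS read
        self._store: Dict[str, bytes] = {}
        self.remote_reads = 0        # counts how often the network is used

    def get(self, path: str) -> bytes:
        # Immutable data needs no invalidation: fetch once, then serve locally.
        if path not in self._store:
            self._store[path] = self._fetch(path)
            self.remote_reads += 1
        return self._store[path]
```

Because the cached data never change, repeated reads of the same path cost one network transfer in this sketch, which is the saving the paragraph above describes.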
The platform driver manages the software and hardware resources of the platform and controls its workflow; its main work includes starting, splitting, and scheduling tasks, fault-tolerant handling, and so on. The work submodule mainly implements image and video retrieval algorithms such as feature point extraction and clustering, and carries the heaviest computing load in the platform. Under the management of the platform driver, different work submodules cooperate to complete a given task while remaining independent of one another, which makes the platform easier to maintain and extend.
Compared with the prior art, the present invention provides a complete body of theory and technology for analyzing massive multimedia data. The platform achieves high-performance processing of massive multimedia data to meet service demands such as video content analysis, video retrieval, image retrieval, and event detection; it greatly increases computing speed while also preserving computational accuracy.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the architecture of the present invention;
Fig. 2 is a functional block diagram of the platform driver of the present invention;
Fig. 3 is a functional block diagram of the work submodule of the present invention.
Detailed Description
The present invention is described in detail below with reference to the drawings and a specific embodiment.
As shown in Figs. 1-3, a GPU-based MapReduce platform for massive multimedia data uses a computer cluster to carry out the computation of image/video retrieval tasks, each computer cluster being provided with a plurality of CPUs and GPUs, characterized in that the platform is built on CUDA and HDFS and comprises a platform driver 1 and a work submodule 2. The platform driver 1 manages the software and hardware resources of the platform and controls its workflow; its main work includes starting, splitting, and scheduling tasks, fault-tolerant handling, and so on. The work submodule 2 mainly implements image and video retrieval algorithms such as feature point extraction and clustering, and carries the heaviest computing load in the platform. Under the management of the platform driver 1, different work submodules 2 cooperate to complete a given task while remaining independent of one another, which makes the platform easier to maintain and extend. Protocol Buffers serialization is used as the transport protocol between the platform driver 1 and the work submodule 2, to simplify data exchange between the two; at the same time, JNI is used for their interaction, to keep that interaction efficient. The platform driver 1 is written in Java and is an implementation and extension of the Hadoop framework for this specific application. The work submodule 2 is built on CUDA and is written in C/C++ and CUDA-C.
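To illustrate what a Protocol Buffers message exchanged between the driver and a work submodule might look like on the wire, the sketch below hand-encodes the standard Protocol Buffers wire format in Python. The message layout (field 1 as a task id, field 2 as a repeated file path) is hypothetical and is not taken from the invention; a real system would use classes generated by the Protocol Buffers compiler for Java and C++ rather than manual encoding.

```python
from typing import List


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a Protocol Buffers base-128 varint."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Encode one length-delimited (wire type 2) string field."""
    payload = text.encode("utf-8")
    key = (field_number << 3) | 2
    return encode_varint(key) + encode_varint(len(payload)) + payload


def encode_task(task_id: int, files: List[str]) -> bytes:
    """Hypothetical task message: field 1 = task id, field 2 = repeated file path."""
    body = encode_varint((1 << 3) | 0) + encode_varint(task_id)  # wire type 0
    for path in files:
        body += encode_string_field(2, path)
    return body
```

The compact, language-neutral byte layout is what lets a Java driver and a C/C++ work submodule exchange structured messages without sharing an in-memory representation.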
The platform driver 1 adopts the MapReduce computation model: it schedules the master control program on the cluster nodes and divides the image/video retrieval task into a number of Map tasks, whose data are stored in HDFS. When each Map task starts, it obtains its task data from the file list passed in by the platform driver 1 and assigns the specific computing tasks to the work submodule 2. The task scheduler in the work submodule 2 distributes the tasks to a GPU or a CPU for processing. During computation, the required data are obtained through libhdfs.so, the native library of HDFS, and after computation the results are written directly to HDFS. The work submodule 2 uses distributed caching during computation, to reduce HDFS network transfers when implementing image/video retrieval algorithms whose data are immutable, and thereby to improve the performance of the whole cluster.
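The flow described above, splitting a job into Map tasks by file list and then dispatching work to GPUs or CPUs, can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function names, the fixed chunk size, and the round-robin dispatch policy are all hypothetical, since the text does not specify how the task scheduler chooses a device.

```python
from typing import Dict, List


def split_into_map_tasks(files: List[str], files_per_task: int) -> List[List[str]]:
    """Cut the full file list into fixed-size chunks, one chunk per Map task."""
    if files_per_task < 1:
        raise ValueError("files_per_task must be at least 1")
    return [files[i:i + files_per_task] for i in range(0, len(files), files_per_task)]


def dispatch(units: List[str], num_gpus: int, num_cpus: int) -> Dict[str, List[str]]:
    """Assign work units round-robin across all devices (GPUs listed before CPUs).

    Round-robin is an assumed policy chosen for simplicity, not the
    invention's scheduler.
    """
    devices = [f"gpu{i}" for i in range(num_gpus)] + [f"cpu{i}" for i in range(num_cpus)]
    if not devices:
        raise ValueError("at least one device is required")
    plan: Dict[str, List[str]] = {device: [] for device in devices}
    for index, unit in enumerate(units):
        plan[devices[index % len(devices)]].append(unit)
    return plan
```

For example, a node with two GPUs and one CPU, as in the embodiment's cluster, would call dispatch(units, 2, 1) for the file list of each Map task. A production scheduler would more plausibly weight devices by throughput or assign work dynamically as devices become idle.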
Embodiment: A large number of image/video retrieval experiments were carried out on a computer cluster of 12 host nodes, each node containing one CPU and two GPUs. The experiments show that the platform of the present invention not only greatly accelerates processing (by up to nearly 1500 times) but also greatly improves algorithm accuracy. The cluster configuration is shown in Table 1:
Table 1. Configuration of the computer cluster
As Table 1 shows, the platform of the present invention runs on a cluster of ordinary, inexpensive PCs and does not require a dedicated, expensive server cluster, while its performance is no lower than that of the latter. In this embodiment, several data sets were tested on the platform, including MSR-Bing, Flickr100k, CCVideo, and Oxford; the number of images reached the million level and the number of feature points exceeded the hundred-million level. The speed-up ratio obtained when running the clustering algorithm on the Flickr100k image set is shown in Table 2:
Table 2. Speed-up ratio when running the clustering algorithm on the Flickr100k image set
where:
S: stand-alone, single-threaded program
C: platform of the present invention without GPU acceleration
C+G: platform of the present invention with GPU acceleration
Throughout the experiments the platform ran smoothly, with essentially no need for human intervention or supervision. As Table 2 shows, without GPU acceleration the speed-up ratio of the platform is proportional to the number of hosts; with GPU acceleration enabled, the speed-up ratio of the whole cluster increases greatly, mainly owing to the superior acceleration provided by the GPUs.
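The speed-up ratio discussed here can be made concrete with a short sketch. It assumes the usual definition, namely the run time of the stand-alone single-threaded program S divided by the run time on the platform; the text does not state the formula explicitly, and the function names are hypothetical.

```python
def speedup(baseline_seconds: float, platform_seconds: float) -> float:
    """Speed-up ratio: single-threaded run time divided by platform run time."""
    if baseline_seconds <= 0 or platform_seconds <= 0:
        raise ValueError("run times must be positive")
    return baseline_seconds / platform_seconds


def linear_speedup(hosts: int, per_host_speedup: float = 1.0) -> float:
    """Linear-scaling model: speed-up grows in proportion to the host count."""
    if hosts < 1:
        raise ValueError("hosts must be at least 1")
    return hosts * per_host_speedup
```

Under this model, the C configuration approaches linear_speedup(12) on the 12-host cluster when scaling is ideal, while the additional gain of C+G comes from the per-device acceleration of the GPUs.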
The precision of different image retrieval algorithms is shown in Table 3:
Table 3. Precision of different image retrieval algorithms implemented on the present invention
In Table 3, the values 20K, 200K, and 1M in the first row give the number of cluster centers, and the values 0 and 1M in the second row give the number of distractor images added to the reference set, namely 0 and 1,000,000. Baseline (Inv), HE, and WGC denote three common image retrieval methods. As Table 3 shows, the precision of the algorithms implemented on the platform of the present invention is also considerably improved. This is mainly because the platform can handle a larger volume of data at once, and can therefore process large data sets that other implementations cannot.