
WO2006112843A1 - Distributed acoustic fingerprint identification - Google Patents

Distributed acoustic fingerprint identification

Info

Publication number
WO2006112843A1
WO2006112843A1 PCT/US2005/013267 US2005013267W WO2006112843A1 WO 2006112843 A1 WO2006112843 A1 WO 2006112843A1 US 2005013267 W US2005013267 W US 2005013267W WO 2006112843 A1 WO2006112843 A1 WO 2006112843A1
Authority
WO
WIPO (PCT)
Prior art keywords
fingerprint
record
database
cluster
validating
Prior art date
Application number
PCT/US2005/013267
Other languages
English (en)
Inventor
Sean Ward
Isaac J. Richards
Original Assignee
Sean Ward
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sean Ward
Priority to PCT/US2005/013267
Publication of WO2006112843A1

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/19 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
    • G11B27/28 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63 Querying
    • G06F16/632 Query formulation
    • G06F16/634 Query by example, e.g. query by humming
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/683 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content

Definitions

  • The present invention relates to a method for the distributed recognition of media files, such as digital audio and video files. More specifically, it relates to the recognition of such files using a combination of acoustic and bit-based fingerprints and metadata.
  • Media file identification has been based on metadata, a bit-based hash, or an acoustic fingerprint, and has relied on a centralized server architecture. This can be effective in small-scale systems, or in large-scale ones that use a less robust acoustic fingerprint and a less exhaustive metadata matching technique.
  • One example of the former is the Napster system that was deployed in 2001 for large-scale acoustic fingerprint and bit-based hash identification.
  • An example of a hybrid metadata and acoustic fingerprint based system is MusicBrainz, which uses a single central repository of metadata and fingerprints, moderated by users, to serve client requests.
  • This system for distributed media file recognition comprises four major parts: the media file analysis component, the edge node component, the index node component, and the central fingerprint analysis and issuer (server) component.
  • Fingerprints are built by the media file analysis component from a sound stream, which may be sourced from a compressed audio file, a CD, a radio broadcast, a microphone, or any other available digital audio source. Fingerprints are formed by subdividing an audio stream into discrete frames, from which acoustic features, such as zero crossing rates, spectral residuals, Haar wavelet residuals, Mel cepstral coefficients, and trailing spectral power deltas, are extracted, summarized, and organized into frame feature vectors. These feature frames can be sampled continuously, or fixed windows within the audio stream can be used.
  • The edge node component will optimally be located with the media analysis component in each end user client, and is responsible for distributed signed fingerprint database storage, as well as fingerprint resolution.
  • A certain percentage of edge nodes can be promoted to index nodes, which are registered with the central server and maintain the neighborhood set of edge nodes that contain a single image of the fingerprint database, for fingerprint resolution.
  • The central fingerprint analysis and issuer component is responsible for maintaining the central authoritative fingerprint database, as well as providing a last-resort search location in the event of network disruption, data aggregation for automatic database growth, and, optionally, query logging for system analysis and content population measurement.
  • FIG. 1 is a block diagram, showing the components of the distributed media file recognition system.
  • FIG. 2 is a logic flow diagram, showing the process of fingerprinting a media file, extracting metadata, forming a media file recognition packet, and performing a recognition query.
  • FIG. 3 is a logic flow diagram, showing the process of adding and populating a new index or edge node to the recognition system.
  • FIG. 4 is a logic flow diagram, showing in detail the process of resolving a media file recognition packet once it hits an index node.
  • FIG. 5 is a block diagram, showing the components of an index node.
  • FIG. 6 is a block diagram, showing the components of an edge node.
  • FIG. 7 is a block diagram, showing the components of the central server.
  • FIG. 8 is a block diagram, showing the components of a signed fingerprint record.
  • FIG. 9 is a logic flow diagram, showing the process of propagating a signed fingerprint record update from the central server.
  • FIG. 10 is a logic flow diagram, showing the process of adding new entries to the confirmed identification database in the central server.
  • The preferred embodiment of the present invention places the recognition client 40, edge node 30, and index node 10 components of the system within a peer-to-peer client.
  • The central fingerprint analyzer and issuer component 20 is ideally centrally hosted, although aspects of it (such as failover copies of the current database) can be geographically dispersed.
  • Upon being tasked to identify a piece of content, the recognition client will proceed to decode (100), fingerprint with one or more bit-based and acoustic-based fingerprints (110), and summarize any available file metadata (120) into a media recognition request packet (130), as described in FIG. 2.
  • This packet can also contain information on the request originator, allowing replies to be sent directly from the edge nodes 30 to the source recognition client 40, and may also optionally be posted to the central server 20 for logging purposes.
  • An available index node is selected from the recognition client index node list. This list may be obtained from the central server, synchronized from another peer, or hard coded as a set of generally available "master" index nodes. Similar methods for synchronizing the master index node list, such as DNS requests and broadcast-based discovery requests, are also contemplated by this invention.
  • The media recognition packet is then propagated to the index node for a resolution request (140).
  • The index node 10 then performs the fingerprint hash function on the media recognition packet to select a db partition subset list (420).
  • Upon selecting a db partition subset list (420), the index node routes the media recognition packet to one or more currently available edge nodes (430).
  • Upon receiving a media recognition request, the edge node 30 performs a brute force fingerprint comparison between the incoming media recognition request and any signed fingerprint records 800 (as described in FIG. 8) stored on said edge node 30. Additionally, if sufficient signed fingerprint records are stored on an edge node 30 to warrant indexing the local database, such an index can be used to further select a subset of the edge node records to search against. If one or more signed fingerprint records are an apparent match to the incoming fingerprint request, the complete set of matching signed fingerprint records is returned to the index node 10, or, optionally, routed directly back to the originating recognition client 40.
  • At the index node 10, any matching signed records are aggregated (440) and returned to the recognition client.
  • The metadata records 820 can be used to select the best matching record using a fuzzy text matching algorithm. If metadata matching also fails to produce a sufficiently distinct match, the available matches and overall confidence scores can be displayed to the user at the recognition client 40, to allow a final determination of an appropriate match.
  • In the event that no candidate signed records are returned from the index node 10, the user selects no match from the record set at the recognition client 40, or all signed records fail to validate against the central issuer certificate, the media recognition query can be routed directly to the central server 20 for a last attempt at recognition and, in the event of no match, to add the media recognition record to the pending table 740 at the central server 20.
  • A check is performed to see if one or more fingerprints in the record resolve against the reference fingerprint database 750. If a match is found (1020), the master fingerprint database 700 is updated with a new fingerprint to metadata association. If no reference fingerprint is found (1030), then the current unverified fingerprint database 730 is queried. If no matching fingerprint clusters are found (1050), then a new cluster is added to the unverified fingerprint database 730, and the record is left in the pending table 740.
  • If a matching cluster is found in the unverified fingerprint database 730, then a test can be performed (1040) to check whether the associated metadata records for that cluster correlate. Additionally, manual review, or matching against the master confirmed metadata database 710, can be used to confirm a cluster.
  • The master fingerprint database 700 is updated with a new record from the confirmed cluster (1070), and the record and confirmed cluster are removed from the pending table 740 and the unverified fingerprint database 730, respectively.
  • Upon inserting or updating signed fingerprint records in the central server database, the update mechanism described in FIG. 9 is followed. Specifically, the version and expiration fields of the signed fingerprint record (910) are updated to expire any existing record that has been distributed to the edge nodes. Next, the updated record is signed with the central server issuer key 760, certifying the source of the record. The master index node list 720 is then accessed, and the new signed fingerprint record, together with the database partition number where the record should be stored, is sent to each available index node (940). Upon receiving a signed fingerprint record, the index node 10 contacts each edge node 30 stored in the edge node table 520 associated with the database partition number suggested by the central server. Finally, each contacted edge node 30 updates the local database 610 with the new signed fingerprint record, and validates the signature on the record.
  • An edge node is initialized (300), and proceeds to contact one or more index nodes 10 to check whether a promotion to an index node is needed (320).
  • If the edge node is to become an index node, it registers itself with the central server 20, and one or more peer index nodes are returned. The edge node database 520 and state table 530 of one or more of these peer index nodes are then synchronized directly with the new index node (340).
  • In the event that the edge node 30 is to stay an edge node, the contacted index node 10 will assign a database partition number to the new edge node 30. This assignment can be done in a variety of fashions, including selecting the database partition with the smallest number of active edge nodes, selecting the partition with the highest query throughput, or selecting based on geographic location (mirroring the most "distant" peer).
  • A list of currently known peers from that partition group is also provided.
  • The edge node then directly contacts its peer group (380) to synchronize the current database image for that database partition.
  • The central server is contacted directly to receive the database partition image (370), and the edge node registers itself as being ready for queries with one or more index nodes 10 from the index node db 620.
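The framing step described above for the media file analysis component can be illustrated with a minimal Python sketch. It is not the patented extractor: it computes only a zero crossing rate, a mean frame power, and a trailing power delta per frame, and the function names, the fixed non-overlapping framing, and the default frame size are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def mean_power(frame: Sequence[float]) -> float:
    """Average squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def frame_feature_vectors(
    samples: Sequence[float], frame_size: int = 1024
) -> List[Tuple[float, float, float]]:
    """Split samples into non-overlapping frames and summarize each one.

    Each vector is (zero crossing rate, mean power, power delta versus the
    previous frame). Trailing samples that do not fill a frame are dropped.
    """
    vectors = []
    previous_power = None
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        power = mean_power(frame)
        delta = 0.0 if previous_power is None else power - previous_power
        vectors.append((zero_crossing_rate(frame), power, delta))
        previous_power = power
    return vectors
```

A real fingerprint would add spectral features such as the Mel cepstral or Haar wavelet terms named in the description; the sketch only shows how per-frame summaries are collected into feature vectors.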
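The index node routing (420, 430), the brute force comparison at the edge nodes, and the aggregation step (440) can be sketched as below. The description does not specify the fingerprint hash function or the distance measure, so the coarse mean-value bucketing in `partition_subset`, the Euclidean distance, the threshold, and the record layout are all hypothetical stand-ins, with fingerprint values assumed to lie in [0, 1).

```python
import math
from typing import Dict, List, Sequence


def partition_subset(fingerprint: Sequence[float], partition_count: int,
                     spread: int = 1) -> List[int]:
    """Pick a home partition from the fingerprint mean, plus its neighbours."""
    mean = sum(fingerprint) / len(fingerprint)
    home = min(int(mean * partition_count), partition_count - 1)
    return [p for p in range(home - spread, home + spread + 1)
            if 0 <= p < partition_count]


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length fingerprints."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_matches(query: Sequence[float], records: List[dict],
                        threshold: float) -> List[dict]:
    """Edge node step: compare the query against every stored record."""
    return [r for r in records if distance(query, r["fingerprint"]) <= threshold]


def resolve(query: Sequence[float], edge_table: Dict[int, List[List[dict]]],
            partition_count: int, threshold: float = 0.05) -> List[dict]:
    """Index node step: route to edge nodes, then aggregate unique matches."""
    matches = {}
    for partition in partition_subset(query, partition_count):
        for edge_records in edge_table.get(partition, []):
            for record in brute_force_matches(query, edge_records, threshold):
                matches[record["id"]] = record  # de-duplicate across edge nodes
    return sorted(matches.values(),
                  key=lambda r: distance(query, r["fingerprint"]))
```

Here `edge_table` maps a partition number to the record lists held by the edge nodes serving that partition. Searching neighbouring partitions as well as the home partition is one way to tolerate small fingerprint differences at a bucket boundary.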
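The description says a fuzzy text matching algorithm can pick the best record from the metadata records 820 without naming the algorithm. The sketch below assumes one possible choice, the similarity ratio from Python's standard `difflib.SequenceMatcher`; the field names, the averaging, and the `min_margin` rule for deciding that a match is "sufficiently distinct" are illustrative assumptions.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Sequence


def metadata_similarity(a: dict, b: dict,
                        fields: Sequence[str] = ("artist", "title")) -> float:
    """Average case-insensitive similarity ratio over the chosen fields."""
    scores = [
        SequenceMatcher(None, a.get(f, "").lower(), b.get(f, "").lower()).ratio()
        for f in fields
    ]
    return sum(scores) / len(scores)


def best_metadata_match(query: dict, candidates: List[dict],
                        min_margin: float = 0.1) -> Optional[dict]:
    """Return the clear winner, or None when the user should decide."""
    ranked = sorted(candidates,
                    key=lambda c: metadata_similarity(query, c), reverse=True)
    if not ranked:
        return None
    if len(ranked) == 1:
        return ranked[0]
    top = metadata_similarity(query, ranked[0])
    runner_up = metadata_similarity(query, ranked[1])
    # Only accept the top candidate if it beats the runner-up by the margin.
    return ranked[0] if top - runner_up >= min_margin else None
```

Returning `None` corresponds to the case where the available matches and confidence scores are shown to the user at the recognition client for a final decision.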
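The central server handling of a pending record (the flow of FIG. 10) can be sketched as a small state machine. This is a simplification under stated assumptions: fingerprints are reduced to exact-match keys, metadata is a single string, and a cluster counts as confirmed when at least `min_votes` records agree exactly. None of these details come from the description, which also allows manual review or matching against the confirmed metadata database 710.

```python
def process_record(record: dict, state: dict, min_votes: int = 2) -> str:
    """Resolve one submitted record against the central databases.

    `state` holds four hypothetical stores: "reference" (750), "master" (700),
    "unverified" (730) and "pending" (740).
    """
    key = record["fingerprint_key"]

    # Match against the reference fingerprint database (1020).
    if key in state["reference"]:
        state["master"][key] = state["reference"][key]
        return "matched_reference"

    # No reference match (1030): consult the unverified cluster database.
    state["pending"].append(record)
    cluster = state["unverified"].setdefault(key, [])
    is_new_cluster = not cluster
    cluster.append(record["metadata"])
    if is_new_cluster:
        return "new_cluster"  # (1050) record stays in the pending table

    # Existing cluster: test whether its metadata records correlate (1040).
    if len(cluster) >= min_votes and len(set(cluster)) == 1:
        state["master"][key] = cluster[0]  # (1070)
        del state["unverified"][key]
        state["pending"] = [r for r in state["pending"]
                            if r["fingerprint_key"] != key]
        return "cluster_confirmed"
    return "still_pending"
```

Requiring independent submissions to agree before a cluster is promoted is what lets the database grow automatically without trusting any single client.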
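The signed record update of FIG. 9 relies on version and expiration fields plus a signature from the central server issuer key 760. The description refers to an issuer certificate, which suggests public-key signatures; because the Python standard library has no asymmetric signing, the sketch below substitutes an HMAC-SHA256 tag as a stand-in. The record layout and helper names are hypothetical, and unlike the ordering in the description the sketch validates the signature before storing the record.

```python
import hashlib
import hmac
import json


def _payload(record: dict) -> bytes:
    """Canonical bytes of every field except the signature itself."""
    body = {k: v for k, v in record.items() if k != "signature"}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_record(record: dict, issuer_key: bytes) -> dict:
    """Central server step: return a copy of the record with a signature."""
    signed = dict(record)
    signed["signature"] = hmac.new(issuer_key, _payload(record),
                                   hashlib.sha256).hexdigest()
    return signed


def validate_record(record: dict, issuer_key: bytes, now: int) -> bool:
    """True if the signature matches and the record has not expired."""
    expected = hmac.new(issuer_key, _payload(record), hashlib.sha256).hexdigest()
    return (hmac.compare_digest(expected, record.get("signature", ""))
            and record["expiration"] > now)


def apply_update(local_db: dict, record: dict, issuer_key: bytes, now: int) -> bool:
    """Edge node step: store a valid record only if it is a newer version."""
    if not validate_record(record, issuer_key, now):
        return False
    current = local_db.get(record["id"])
    if current is not None and current["version"] >= record["version"]:
        return False
    local_db[record["id"]] = record
    return True
```

With a shared HMAC key every edge node could also forge records, so a deployment following the description would use an asymmetric scheme in which edge nodes hold only the verification key.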
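Of the partition assignment strategies listed for a new edge node, the simplest is to pick the database partition with the fewest active edge nodes. The sketch below shows that strategy only. The table layout and function names are hypothetical, and the rule that the node synchronizes from the central server when the chosen partition has no known peers is an assumption about when step 370 applies rather than step 380.

```python
from typing import Dict, List, Tuple


def assign_partition(edge_table: Dict[int, List[str]]) -> int:
    """Return the partition with the fewest active edge nodes.

    Ties are broken in favour of the lowest partition number.
    """
    return min(sorted(edge_table), key=lambda p: len(edge_table[p]))


def join_as_edge_node(edge_table: Dict[int, List[str]],
                      node_id: str) -> Tuple[int, List[str], str]:
    """Index node step: assign a partition and report where to sync from."""
    partition = assign_partition(edge_table)
    peers = list(edge_table[partition])       # peers known before joining
    edge_table[partition].append(node_id)
    sync_source = "peers" if peers else "central_server"
    return partition, peers, sync_source
```

For example, with `{0: ["a", "b"], 1: ["c"], 2: []}` a new node is placed in partition 2 and, having no peers there, would fetch the partition image from the central server.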

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a system and method for identifying media files, both commercially available and user-created, the system enabling accurate and scalable identification through fingerprint analysis of the media files. The system consists of four essential parts: the media file analysis component, or recognition client (40); the edge node component (30); the index node component (10); and the central fingerprint analysis and issuer component, or central server (20).
PCT/US2005/013267 2005-04-19 2005-04-19 Distributed acoustic fingerprint identification WO2006112843A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2005/013267 WO2006112843A1 (fr) 2005-04-19 2005-04-19 Distributed acoustic fingerprint identification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2005/013267 WO2006112843A1 (fr) 2005-04-19 2005-04-19 Distributed acoustic fingerprint identification

Publications (1)

Publication Number Publication Date
WO2006112843A1 (fr) 2006-10-26

Family

ID=37115441

Family Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2005/013267 WO2006112843A1 (fr) 2005-04-19 2005-04-19 Distributed acoustic fingerprint identification

Country Status (1)

Country Link
WO (1) WO2006112843A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021047664A1 (fr) * 2019-09-12 2021-03-18 华为技术有限公司 Biometric feature recognition method and related device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020083060A1 (en) * 2000-07-31 2002-06-27 Wang Avery Li-Chun System and methods for recognizing sound and music signals in high noise and distortion
US20020161741A1 (en) * 2001-03-02 2002-10-31 Shazam Entertainment Ltd. Method and apparatus for automatically creating database for use in automated media recognition system
US20030028796A1 (en) * 2001-07-31 2003-02-06 Gracenote, Inc. Multiple step identification of recordings
US20040006701A1 (en) * 2002-04-13 2004-01-08 Advanced Decisions Inc. Method and apparatus for authentication of recorded audio

Similar Documents

Publication Publication Date Title
US11483622B2 (en) Hybrid blockchains and streamchains using non-crypto hashes for securing audio-, video-, image-, and speech-based transactions and contracts
US7765192B2 (en) System and method for archiving a media collection
US7523312B2 (en) Fingerprint database updating method, client and server
US8447120B2 (en) Incremental feature indexing for scalable location recognition
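JP4398242B2 (ja) Multiple step identification method for recordings
US20200301961A1 (en) Image retrieval method and apparatus, system, server, and storage medium
CN1881210A (zh) Method and apparatus for searching
US20100161614A1 (en) Distributed index system and method based on multi-length signature files
MXPA05010665A (es) System and method for accelerating database searches for multiple synchronized data streams.
EP2668619A1 (fr) Method and apparatus for identifying similar images
KR20040081191A (ko) Fast hash-based multimedia object metadata retrieval
CN113767390B (zh) Attribute grouping for change detection in a distributed storage system
CN102365640A (zh) Structuring and searching data in a hierarchical trust-based configuration
CN103973810A (zh) Data processing method and apparatus based on Internet Protocol (IP) disk
CN111026709A (zh) Data processing method and apparatus based on cluster access
CN108647266A (zh) Method for fast distributed storage and interaction of heterogeneous data
CN112507026B (zh) Distributed high-speed storage method based on key-value, document, and graph models
US20080097971A1 (en) Peer-to-peer based secondary key search method and system for cluster database
US20160085795A1 (en) Grouping equivalent content items
CN113728304A (zh) Granular change detection in a distributed storage system
CN119829571A (zh) Data storage method, apparatus, device, and medium
CN114553885A (zh) DHT network based storage method and apparatus, electronic device, and storage medium
WO2006112843A1 (fr) Distributed acoustic fingerprint identification
CN111352938A (zh) Data processing method, computer device, and storage medium
CN113129855A (zh) Audio fingerprint extraction and database building method, and audio recognition and retrieval method and system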

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
NENP Non-entry into the national phase (Ref country code: DE)
NENP Non-entry into the national phase (Ref country code: RU)
122 EP: PCT application non-entry in European phase (Ref document number: 05739852; Country of ref document: EP; Kind code of ref document: A1)