WO2014068364A1 - Method and systems for chunk-level peer-to-peer (p2p) file sharing - Google Patents
Method and systems for chunk-level peer-to-peer (p2p) file sharing
- Publication number
- WO2014068364A1 (PCT/IB2012/056000)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- chunk
- identification
- digest
- chunks
- size
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/104—Peer-to-peer [P2P] networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
Definitions
- A system, i.e. a p2p storage service, that implements p2p request message methods to request a chunk by Chunk ID for both local and remote clients.
- A content-aware method of segmenting content (video content into video chunks, audio content into audio chunks, and subtitle content into subtitle chunks) complements the requirements for a method of chunk-level p2p file sharing and solves the issue of duplication associated with the present methods of p2p based file sharing.
- P2P request message method to initiate the upload of a chunk to a local p2p storage service by the Chunk ID.
- P2P request message method to get the Peer list from a Tracker by the Chunk ID.
- FIG. 01 is a diagrammatic representation of the present method of segmentation of content, in general.
- FIG. 01A shows segmenting a given file and identifying segments or chunks of the file as an offset relative to the start of the file; only a metadata file is generated as a result of the process.
- FIG. 01B shows creating segments or chunks of the main file as separate individual smaller files, i.e. Chunks, and identifying such chunks uniquely but relative to the main input file (e.g. Chunk 1 of file X.mp4).
- FIG. 01C shows encoding a main file into multiple resolution content collections and identifying chunks uniquely but relative to the content collection.
- FIG. 02 is the disclosed segmentation method, in general, where the segmentation program/system fetches unique chunk identifications from an external chunk identification generation server.
- FIG. 03 shows the present p2p file sharing methodology.
- Step 1 shows the Tracker Announcement after segmentation at Peer 1 (P1).
- Step 2 also shows a Tracker Announcement, by a p2p storage server, Peer 2 (P2).
- Step 3 shows a peer (P3) attempting to get the peer list from the Tracker.
- Step 4 shows the Tracker sending the peer list to P3.
- Step 5 shows P3, now aware of who is sharing the file, making a request for a chunk from P1.
- Step 6 shows P1 sending the requested chunk to P3.
- Steps 7 & 8 show P3 similarly downloading chunks from P2.
- FIG. 04 shows the big picture of the disclosed chunk-level p2p file sharing.
- Stage 1 shows segmenting and posting chunks to the main peer P1's local p2p storage service.
- Stage 2 shows the Tracker Announcement from the main peer (P1), announcing the availability of new chunks.
- Stage 3 shows that, once the peer P2 obtains the metadata file, it requests chunks from its local p2p storage service, and the peer P2's local p2p storage service contacts the Tracker to get peer lists for the chunks.
- Stage 4 shows peer P2 downloading chunks from the main peer P1.
- FIG. 05 shows the big picture of the method of generating a unique chunk identification by providing one cryptographic digest and the size of the chunk as parameters.
- FIG. 05A shows the communication steps involved and the parameters supplied.
- FIG. 05B shows the table of rules for the process.
- FIG. 06 shows the big picture of the method of generating a unique chunk identification by providing two cryptographic digests and the size of the chunk as parameters.
- FIG. 06A shows the communication steps involved and the parameters supplied.
- FIG. 06B shows the table of rules for the process.
- FIG. 07 shows the big picture of the method of generating a unique chunk identification (generated ID based) by providing one digest and the size of the chunk as parameters.
- FIG. 07A shows the communication steps involved and the parameters supplied.
- FIG. 07B shows the table of rules for the process.
- FIG. 08 shows the big picture of the method of generating a unique chunk identification (generated ID based) by providing two digests and the size of the chunk as parameters.
- FIG. 08A shows the communication steps involved and the parameters supplied.
- FIG. 08B shows the table of rules for the process.
- The Chunk Identification Generation Server maintains a table with three (3) columns: the first column is "Digest", the second column is "Size" and the third column is "Chunk ID". The chunk identification request includes as parameters the cryptographic digest of the chunk and the size of the chunk in bytes. The returned chunk identification is unique. The SHA-2 (SHA-256 or higher) or SHA-3 (256-bit digest or higher) cryptographic hash functions are recommended to generate the cryptographic digest of the chunk.
- If the received digest is not in the table, the server returns the received digest as the chunk identification and inserts an entry into the table. If the received digest exists in the table but the size is different, the server returns the received digest appended with one or more alphanumeric characters to make it a unique identification, as the chunk identification, and inserts an entry into the table; e.g. digest+X or digest+XX, where X is an alphanumeric character (0 to 9, a to z and A to Z), giving 62 possible values per appended character.
- If there are one or more entries in the table matching the received digest and the size, the Chunk Identification Generation Server returns the chunk identifications of those entries in a response message with a status code followed by a list of one or more chunk identifications, to indicate that the client should download the given chunks and binary compare them with the chunk requiring an identification. If the client finds that the chunk requiring an identification is identical to one of the given chunks, the client should use the identification of the matching chunk as the identification for the chunk requiring an identification. If the client finds that the chunk requiring an identification is not identical to any one of the given chunks, the client should issue a different chunk identification request method to the Chunk Identification Generation Server, which indicates that the client is making this request after binary comparison and includes as parameters the cryptographic digest of the chunk, the size of the chunk in bytes, and the list of one or more chunk identifications received for the binary comparison; the Chunk Identification Generation Server verifies the given list of one or more chunk identifications against the supplied digest and the size and Ch
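The table rules for the one-digest method can be illustrated with a minimal in-memory sketch. The class name `ChunkIdTable`, the helper `digest_and_size`, and the status strings `"ASSIGNED"` and `"COMPARE"` are hypothetical names chosen for this illustration; the source does not define status codes. The suffix is taken as the first unused one- or two-character suffix, which is only one possible choice, and the request made after binary comparison is not modelled because its description is incomplete above.

```python
import hashlib
import itertools
import string

# 0-9, a-z, A-Z: the 62 characters allowed in an appended suffix.
ALPHANUMERIC = string.digits + string.ascii_lowercase + string.ascii_uppercase


def digest_and_size(chunk: bytes) -> tuple[str, int]:
    """Return the SHA-256 hex digest and the size in bytes of a chunk."""
    return hashlib.sha256(chunk).hexdigest(), len(chunk)


class ChunkIdTable:
    """In-memory stand-in for the three-column (Digest, Size, Chunk ID) table."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, int, str]] = []

    def _suffixed_id(self, digest: str) -> str:
        """Return digest+X or digest+XX, using the first suffix not yet issued."""
        issued = {chunk_id for _, _, chunk_id in self.rows}
        for length in (1, 2):
            for suffix in itertools.product(ALPHANUMERIC, repeat=length):
                candidate = digest + "".join(suffix)
                if candidate not in issued:
                    return candidate
        raise RuntimeError("suffix space exhausted")

    def request_id(self, digest: str, size: int) -> tuple[str, list[str]]:
        same_digest = [row for row in self.rows if row[0] == digest]
        if not same_digest:
            # Unknown digest: the digest itself becomes the Chunk ID.
            self.rows.append((digest, size, digest))
            return "ASSIGNED", [digest]
        candidates = [cid for _, row_size, cid in same_digest if row_size == size]
        if candidates:
            # Same digest and size: the client must binary compare these chunks.
            return "COMPARE", candidates
        # Same digest but different size: append a suffix to make the ID unique.
        chunk_id = self._suffixed_id(digest)
        self.rows.append((digest, size, chunk_id))
        return "ASSIGNED", [chunk_id]
```

A second request with the same digest and size returns the existing identification as a comparison candidate instead of creating a new entry, which is how identical chunks end up sharing one Chunk ID.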
- The Chunk Identification Generation Server maintains a table with four (4) columns: the first column is "Short Digest", the second column is "Long Digest", the third column is "Size" and the fourth column is "Chunk ID".
- The chunk identification request includes as parameters the short cryptographic digest of the chunk (e.g. 256-bit digest), the long cryptographic digest of the chunk (e.g. 512-bit digest) and the size of the chunk in bytes.
- The returned chunk identification is unique.
- The SHA-2 (SHA-256 or higher) or SHA-3 (256-bit digest or higher) cryptographic hash functions are recommended to generate the cryptographic digests of the chunk.
- SHA-256 is recommended to generate the short cryptographic digest and SHA-3 (512-bit digest) is recommended to generate the long cryptographic digest of the chunk.
- If the received short digest is not in the table, the server returns the received short digest as the chunk identification and inserts an entry into the table. If the received short digest exists in the table but the related long digest is different, the server returns the received short digest appended with one or more alphanumeric characters to make it a unique identification, as the chunk identification, and inserts an entry into the table; e.g. digest+X or digest+XX, where X is an alphanumeric character (0 to 9, a to z and A to Z), giving 62 possible values per appended character.
- If the received short and long digests exist together in a single entry in the table but the size is different, the server returns the received short digest appended with one or more alphanumeric characters to make it a unique identification, as the chunk identification, and inserts an entry into the table.
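As a rough illustration of the two-digest variant, the sketch below computes the recommended short (SHA-256) and long (SHA3-512) digests and decides which of the three rules stated above applies. The function names and the returned rule labels are hypothetical; the case where short digest, long digest and size all match is not described in the rules above, so the sketch only flags it.

```python
import hashlib


def short_and_long_digest(chunk: bytes) -> tuple[str, str, int]:
    """Short digest (SHA-256), long digest (SHA3-512) and size of a chunk."""
    return (
        hashlib.sha256(chunk).hexdigest(),
        hashlib.sha3_512(chunk).hexdigest(),
        len(chunk),
    )


def applicable_rule(rows, short: str, long_: str, size: int) -> str:
    """Pick the rule for a request.

    rows holds (short_digest, long_digest, size, chunk_id) tuples.
    """
    same_short = [r for r in rows if r[0] == short]
    if not same_short:
        return "use-short-digest"   # short digest not in the table
    same_long = [r for r in same_short if r[1] == long_]
    if not same_long:
        return "append-suffix"      # short digest known, long digest differs
    if all(r[2] != size for r in same_long):
        return "append-suffix"      # both digests known, size differs
    return "not-covered-here"       # full match: outside the rules quoted above
```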
- This method comprises one or more external Chunk Identification Requests Processing Servers and a Chunk Identification Generation Server. Requests for new chunk identifications are received by the Chunk Identification Requests Processing Server, and when the Chunk Identification Requests Processing Server requires a new unique chunk identification, it fetches one from the Chunk Identification Generation Server.
- The Chunk Identification Requests Processing Server maintains a table with three (3) columns: the first column is "Digest", the second column is "Size" and the third column is "Chunk ID". The chunk identification request includes as parameters the cryptographic digest of the chunk and the size of the chunk in bytes. The returned chunk identification is unique. The SHA-2 (SHA-256 or higher) or SHA-3 (256-bit digest or higher) cryptographic hash functions are recommended to generate the cryptographic digest of the chunk.
- A chunk identification generation service or an external server may pre-generate a large number of contiguous chunk identifications, but pick and return one of those generated chunk identifications, either randomly or based on a criterion, when a new chunk identification is requested.
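One way to picture the pre-generated pool is the sketch below, which assumes the identifications are consecutive integers and that the selection criterion is a uniform random pick. Both are assumptions made for illustration only, and `ChunkIdPool` is a hypothetical name.

```python
import random


class ChunkIdPool:
    """Pool of pre-generated contiguous IDs handed out in random order."""

    def __init__(self, start: int, count: int, seed=None) -> None:
        # Pre-generate a contiguous block of identifications.
        self._available = list(range(start, start + count))
        self._rng = random.Random(seed)

    def take(self) -> int:
        """Remove and return one unused identification, chosen at random."""
        if not self._available:
            raise RuntimeError("pool exhausted")
        index = self._rng.randrange(len(self._available))
        # Swap the pick with the last element so the removal is O(1).
        last = len(self._available) - 1
        self._available[index], self._available[last] = (
            self._available[last],
            self._available[index],
        )
        return self._available.pop()
```

Handing out identifications in a non-sequential order means a client cannot infer neighbouring identifications from the one it received.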
- This method comprises one or more external Chunk Identification Requests Processing Servers and a Chunk Identification Generation Server. Requests for new chunk identifications are received by the Chunk Identification Requests Processing Server, and when the Chunk Identification Requests Processing Server requires a new unique chunk identification, it fetches one from the Chunk Identification Generation Server.
- Chunk Identification Requests Processing Server maintains a table with four(4) columns, first column is “Short Digest”, second column is “Long Digest”, third column is “Size” and fourth column is "Chunk ID”; Chunk identification request includes parameters, short cryptographic digest of the chunk (eg. 256-bit digest), long cryptographic digest of the chunk (eg. 512-bit digest) and the size of the chunk in bytes; The returned chunk identification is unique;
- the SHA-2 (SHA-256 or higher) or SHA-3 (256-bit digest or higher) cryptographic hash functions are recommended to generate the cryptographic digests of the chunk.
- the SHA-256 is
- the Chunk Identification Requests Processing Server fetches a new identification from the Chunk Identification Generation Server and returns the new identification as the chunk identification and inserts an entry into the table; If the received short digest exists in the table but the related long digest is different, the Chunk Identification Requests Processing Server fetches a new identification from the Chunk Identification Generation Server and returns the new identification as the chunk identification and inserts an entry into the table.
- the Chunk Identification Requests Processing Server fetches a new identification from the Chunk Identification Generation Server and returns the new identification as the chunk identification and inserts an entry into the table;
- A chunk identification generation service or an external server may pre-generate a large number of contiguous chunk identifications, but pick and return one of those generated chunk identifications, either randomly or based on a criterion, when a new chunk identification is requested.
- The P2P request message method format is as follows:
- The p2p service version number may optionally also be added to the above message.
- The method name GET_UNIQUE_CHUNK_ID, digest and size may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
- The "digest" is the cryptographic digest of the chunk, in text format.
- The "size" is the size of the chunk in bytes, in text format.
- The trailing \r\n is optional, but at least a trailing \n is recommended.
- \r is Carriage Return and \n is New Line or Line Feed.
- The GET_UNIQUE_CHUNK_ID method returns:
- The P2P request message method format is as follows:
- The p2p service version number may optionally also be added to the above message.
- The method name GET_UNIQUE_CHUNK_ID, short digest, long digest and size may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
- The "short_digest" is a cryptographic digest of the chunk, in text format, e.g. SHA256.
- The "long_digest" is a cryptographic digest of the chunk, in text format, e.g. SHA512.
- The "size" is the size of the chunk in bytes, in text format.
- The trailing \r\n is optional, but at least a trailing \n is recommended.
- \r is Carriage Return and \n is New Line or Line Feed.
- The GET_UNIQUE_CHUNK_ID method returns:
- The P2P request message method format is as follows:
- The p2p service version number may optionally also be added to the above message.
- The method name GET_UNIQUE_CHUNK_ID_AFTER_BINARY_COMPARISON, digest, size and chunk list may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
- The "digest" is the cryptographic digest of the chunk, in text format.
- The "size" is the size of the chunk in bytes, in text format.
- The "chunk_list" is the list of one or more chunk identifications given to binary compare, in text format.
- The trailing \r\n is optional, but at least a trailing \n is recommended.
- \r is Carriage Return and \n is New Line or Line Feed.
- Chunk_ID\r\n; a response message with status code and description for failure.
- The P2P request message method format is as follows:
- The p2p service version number may optionally also be added to the above message.
- The method name GET_UNIQUE_CHUNK_ID_AFTER_BINARY_COMPARISON, short digest, long digest, size and chunk_list may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
- The "short_digest" is a cryptographic digest of the chunk, in text format, e.g. SHA256.
- The "long_digest" is a cryptographic digest of the chunk, in text format, e.g. SHA512.
- The "size" is the size of the chunk in bytes, in text format.
- The "chunk_list" is the list of one or more chunk identifications given to binary compare, in text format.
- The trailing \r\n is optional, but at least a trailing \n is recommended.
- \r is Carriage Return and \n is New Line or Line Feed.
- The P2P request message method format is as follows:
- The p2p service version number may optionally also be added to the above message.
- The method name GET_CHUNK_REMOTE and Chunk-ID may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
- The "Chunk-ID" is the unique identification of the chunk, in text format.
- The trailing \r\n is optional, but at least a trailing \n is recommended.
- \r is Carriage Return and \n is New Line or Line Feed.
- The GET_CHUNK_REMOTE method returns:
- The GET_CHUNK_REMOTE method does not download the chunk.
- The P2P request message method format is as follows:
- The p2p service version number may optionally also be added to the above message.
- The method name GET_CHUNK_PART_REMOTE, Chunk-ID, Start and offset may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
- The "Chunk-ID" is the unique identification of the chunk, in text format.
- The "Start" is the starting byte position from the beginning of the chunk, in text format. The first byte is zero.
- The "offset" is the length or number of bytes to read from the starting position, in text format.
- The trailing \r\n is optional, but at least a trailing \n is recommended.
- \r is Carriage Return and \n is New Line or Line Feed.
- The GET_CHUNK_PART_REMOTE method does not download the chunk.
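The Start and offset semantics described above (zero-based start, offset as a byte count rather than an end position) can be sketched as follows. The field order Chunk-ID, Start, offset and the use of single spaces as delimiter are assumptions drawn from the parameter list above; `parse_part_request` and `read_part` are hypothetical helper names.

```python
def parse_part_request(message: str) -> tuple[str, int, int]:
    """Split a GET_CHUNK_PART_REMOTE request line into its parameters."""
    fields = message.rstrip("\r\n").split()
    if len(fields) != 4 or fields[0] != "GET_CHUNK_PART_REMOTE":
        raise ValueError("not a GET_CHUNK_PART_REMOTE request")
    _, chunk_id, start, length = fields
    return chunk_id, int(start), int(length)


def read_part(chunk: bytes, start: int, length: int) -> bytes:
    """Return `length` bytes beginning at zero-based position `start`."""
    if start < 0 or length < 0:
        raise ValueError("start and length must be non-negative")
    # Slicing stops at the end of the chunk if the range runs past it.
    return chunk[start:start + length]
```

For a ten-byte chunk `b"0123456789"`, a request with Start 2 and offset 3 selects the bytes `b"234"`.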
- The P2P request message method format is as follows:
- The p2p service version number may optionally also be added to the above message.
- The method name UPLOAD_CHUNK, Chunk-ID, Chunk-length and digest may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
- The "Chunk-ID" is the unique identification of the chunk, in text format.
- The "Chunk-length" is the length of the chunk in bytes, in text format.
- The "digest" is a cryptographic digest of the chunk, in text format, e.g. SHA256.
- The trailing \r\n is optional, but at least a trailing \n is recommended.
- \r is Carriage Return and \n is New Line or Line Feed.
- If the client receives the response message with the status code to continue sending the chunk, it then sends the chunk to the p2p storage service as a binary byte stream.
- The p2p storage service sends either a response message with a status code for success or a response message with a status code and description for failure.
- As a security measure, a p2p storage service is not allowed to upload a chunk to another p2p storage service.
- The P2P request message method format is as follows:
- The p2p service version number may optionally also be added to the above message.
- The method name GET_CHUNK_LOCAL, Chunk-ID and digest may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
- The "Chunk-ID" is the unique identification of the chunk, in text format.
- The "digest" is a cryptographic digest of the chunk, in text format, e.g. SHA256.
- The trailing \r\n is optional, but at least a trailing \n is recommended.
- \r is Carriage Return and \n is New Line or Line Feed.
- For the GET_CHUNK_LOCAL method, the local p2p storage service downloads the requested chunk before returning the full file system path of the chunk to the caller.
- The GET_CHUNK_LOCAL method returns:
- The P2P request message method format is as follows:
- The p2p service version number may optionally also be added to the above message.
- The method name CHECK_CHUNK and Chunk-ID may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
- The "Chunk-ID" is the unique identification of the chunk, in text format.
- The trailing \r\n is optional, but at least a trailing \n is recommended.
- \r is Carriage Return and \n is New Line or Line Feed.
- The CHECK_CHUNK method returns:
- A response message with a status code for chunk does not exist, if the requested chunk does not exist at the p2p storage service.
- The CHECK_CHUNK method does not download the chunk.
- The p2p storage service is also known as a seeder, which manages a chunk store.
- The P2P storage service further issues a Tracker Announcement for a chunk immediately after the chunk is uploaded to the P2P storage service's chunk store.
- The P2P request message method format is as follows:
- The p2p service version number may optionally also be added to the above message.
- The method name GET_PEER_LIST, Chunk-ID and digest may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
- The "Chunk-ID" is the unique identification of the chunk, in text format.
- The "digest" is a cryptographic digest of the chunk, in text format, e.g. SHA256.
- The implementation of this method specifies which cryptographic hash function is accepted for the digest.
- Providing the cryptographic digest of the chunk in the GET_PEER_LIST request allows load-balancing for Trackers and integrity checking.
- A load-balancer could use the first one or more characters of the digest to decide which Tracker should serve the request.
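A minimal sketch of the digest-prefix load-balancing idea follows. It assumes the digest is a hexadecimal string and maps its first characters onto a list of Trackers by a modulo operation; the function name `pick_tracker`, the prefix length and the Tracker names are illustrative assumptions, not part of the disclosure.

```python
def pick_tracker(digest: str, trackers: list[str], prefix_len: int = 2) -> str:
    """Choose a Tracker from the leading characters of a hex digest."""
    if not trackers:
        raise ValueError("no trackers configured")
    # Interpret the leading hex characters as a number and map it onto
    # the available Trackers; the same digest always maps to the same one.
    bucket = int(digest[:prefix_len], 16)
    return trackers[bucket % len(trackers)]
```

Because the choice depends only on the digest, the announcement and the later peer-list request for the same chunk reach the same Tracker, provided both carry the same digest and the Tracker list is unchanged.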
- The trailing \r\n is optional, but at least a trailing \n is recommended.
- \r is Carriage Return and \n is New Line or Line Feed.
- The GET_PEER_LIST method returns:
- the Tracker Announcement in this regard refers to the announcement of the availability of a complete chunk ready to share by a peer to the Tracker, so that the Tracker could update its registry to reflect the IP address of the peer regarding the chunk.
- The P2P request message method format is as follows:
- TRACKER_ANNOUNCE Chunk-ID digest[\r\n]
- The p2p service version number may optionally also be added to the above message.
- The method name TRACKER_ANNOUNCE, Chunk-ID and digest may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
- The "Chunk-ID" is the unique identification of the chunk, in text format.
- The "digest" is a cryptographic digest of the chunk, in text format, e.g. SHA256.
- The implementation of this method specifies which cryptographic hash function is accepted for the digest.
- Providing the cryptographic digest of the chunk in the TRACKER_ANNOUNCE request further allows load-balancing for Trackers and integrity checking.
- A load-balancer could use the first one or more characters of the digest to decide which Tracker should serve the request.
- The trailing \r\n is optional, but at least a trailing \n is recommended.
- \r is Carriage Return and \n is New Line or Line Feed.
- In this method, the Tracker further uses the chunk existence check to connect back to the peer to verify the existence of the announced chunk and the peer's ability to share it. If the Tracker finds that the announced chunk does not exist or cannot be shared, e.g. because the peer is behind a firewall or on a private IP address, then the Tracker does not accept this TRACKER_ANNOUNCE request.
- The TRACKER_ANNOUNCE method returns:
- The Tracker maintains a registry with three (3) essential columns: the first column is "Chunk ID", the second column is "Digest" and the third column is "Peer List".
- The "Digest" is the cryptographic digest of the chunk, provided with the Tracker Announcement.
- The "Peer List" consists of the IP addresses of peers who share the chunk.
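The three-column registry can be pictured as a mapping from Chunk ID to a digest and a peer list. The sketch below is an in-memory illustration only: the class and method names are hypothetical, rejecting an announcement whose digest disagrees with the stored one is an assumed form of the integrity checking mentioned above, and the connect-back existence check is left out.

```python
class TrackerRegistry:
    """In-memory registry keyed by Chunk ID: (Digest, Peer List)."""

    def __init__(self) -> None:
        self._rows: dict[str, tuple[str, list[str]]] = {}

    def announce(self, chunk_id: str, digest: str, peer_ip: str) -> bool:
        """Record that peer_ip shares chunk_id; return False if rejected."""
        known = self._rows.get(chunk_id)
        if known is None:
            self._rows[chunk_id] = (digest, [peer_ip])
            return True
        known_digest, peers = known
        if known_digest != digest:
            # Assumed integrity rule: the digest must match the stored one.
            return False
        if peer_ip not in peers:
            peers.append(peer_ip)
        return True

    def peer_list(self, chunk_id: str, digest: str) -> list[str]:
        """Return the IP addresses of peers sharing the chunk, if any."""
        known = self._rows.get(chunk_id)
        if known is None or known[0] != digest:
            return []
        return list(known[1])
```

Note that the registry never stores the name of a main file; every lookup is keyed by the Chunk ID alone.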
- At the end of the segmentation process, the segmentation program generates the metadata file with chunk identifications, sizes and cryptographic digests for each and every chunk.
- The remote peer obtains the metadata file and reads through the chunk list.
- The player (or the chunk consumption program) at the remote peer connects to its local p2p storage service on the same peer and issues the method "GET_CHUNK_LOCAL Chunk-ID digest[\r\n]" for the required chunk.
- The local p2p storage service at the remote peer then connects to the configured Tracker and issues the method "GET_PEER_LIST Chunk-ID digest[\r\n]" for the required chunk to obtain the peer list.
- The local p2p storage service at the remote peer then connects to a remote peer, issues the method "GET_CHUNK_REMOTE Chunk-ID[\r\n]" and obtains the chunk. For every chunk downloaded, it is required to compare the computed cryptographic digest with the digest from the metadata file for the chunk.
- The local p2p storage service at the remote peer now completes the GET_CHUNK_LOCAL by returning the local full file system path of the chunk.
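The download workflow above can be sketched in a few lines. The request lines follow the formats quoted in the workflow, with a single space as delimiter. The transport is abstracted behind two caller-supplied callables, `get_peer_list` and `get_chunk_remote`, which are hypothetical stand-ins for the network exchange; SHA-256 is assumed as the digest function, and trying the next peer after a digest mismatch is an assumed policy rather than something the text specifies.

```python
import hashlib


def request_line(method: str, *params, terminator: str = "\r\n") -> str:
    """Build a space-delimited request line such as 'GET_PEER_LIST id digest'."""
    return " ".join((method, *map(str, params))) + terminator


def verify_chunk(data: bytes, expected_digest: str) -> bool:
    """Compare the computed SHA-256 digest with the one from the metadata file."""
    return hashlib.sha256(data).hexdigest() == expected_digest.lower()


def fetch_chunk(chunk_id: str, digest: str, get_peer_list, get_chunk_remote) -> bytes:
    """Ask the Tracker for peers, then download and verify the chunk."""
    peers = get_peer_list(request_line("GET_PEER_LIST", chunk_id, digest))
    for peer in peers:
        data = get_chunk_remote(peer, request_line("GET_CHUNK_REMOTE", chunk_id))
        # Accept only data whose digest matches the metadata file entry.
        if data is not None and verify_chunk(data, digest):
            return data
    raise LookupError(f"no peer delivered a valid copy of chunk {chunk_id}")
```

A local p2p storage service would store the returned bytes in its chunk store and answer the pending GET_CHUNK_LOCAL request with the resulting file system path.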
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Transfer Between Computers (AREA)
Abstract
Methods and systems disclosed here implement a granular chunk-level peer-to-peer (p2p) file sharing system. File sharing at granular chunk-level avoids content duplication and thereby increases potential seeders in the peer-to-peer (p2p) overlay network. According to this innovation, chunks or pieces are uniquely identified; therefore, the Tracker does not need to know the name of the main file being shared. The Tracker maintains its registry not at main file level, but at the granular level of chunks of the main file. Chunks of the main file are produced by a segmentation system with unique chunk identifications assigned, and all chunks are separate individual files. The name of the main file to be shared is also not required for the Tracker Announcement. Tracker Announcements are made at the granular level of pieces or chunks of the main file. P2P Storage services, also known as Seeders, are also not required to know the name of the main file being shared. Seeders operate at the granular level of pieces or chunks of the main file, know only of the existence of such chunks with them, respond to queries at the chunk-level and transfer chunks. According to the present practice, the downloading program or system sends a query to a peer to check the existence of a chunk to download by providing both the name of the main file and the identification of the required chunk or byte range, but according to the innovation disclosed here, the downloading program or system sends only the identification of the required chunk. As per the innovation disclosed, the downloading system or the P2P Storage service, also known as the seeder or the caching daemon, does not reassemble the main file; it maintains the local cache at chunk-level. The systems are software programs designed and configured to offer services such as operating as a Tracker, and as a P2P Storage service that issues Tracker Announcements, downloads chunks and seeds according to the methods disclosed.
Description
METHOD AND SYSTEMS FOR CHUNK-LEVEL PEER-TO-PEER (P2P) FILE SHARING
DESCRIPTION
The invention is described here, by way of example only, with reference to the accompanying drawings. In this regard, no attempt is made to show structural details of the invention in more detail than is necessary for a fundamental understanding of the invention.
[D1] FIELD AND BACKGROUND OF THE INVENTION
This invention is generally in the field of peer-to-peer (p2p) file sharing.
The present method of sharing files over a p2p overlay network is:
(1) Segment a given file and identify segments or chunks of the file as an offset relative to the start of the file (FIG. 01A), or create segments or chunks of the main file as separate individual smaller files and identify such smaller files uniquely but relative to the main input file (e.g. Chunk 1 of file X.mp4) (FIG. 01B & FIG. 01C).
(2) Maintain a Tracker or a Registry to identify who is sharing the main file.
(3) The first Peer, who holds the file, announces to the Tracker the availability of the main file, and the Tracker updates its registry with the IP address of the first Peer.
(4) The second Peer, who wants to download the file, asks the Tracker who holds the file, and the Tracker sends the IP address list for the file so that the second Peer knows who holds the file.
(5) The second Peer downloads chunks of the requested file by sending requests to the other Peers it learned of from the IP address list sent by the Tracker. The format of this chunk request is either (main file, offset of chunk) or (main file, Chunk ID), depending on how chunks are organised. The Chunk ID is the identification of the chunk. This process is depicted in the diagram FIG. 03.
The present method requires the main file to be supplied together with a chunk identification (Chunk ID) to fetch or download a chunk from a Peer, in the form of either (main file, offset of chunk) or (main file, Chunk ID), depending on how chunks are organised.
Some systems encode a main file into multiple resolution chunk collections. That is, for example, a full HD main file is segmented into a 1080p chunk collection, then the main file is re-encoded into a 720p chunk collection, then the main file is re-encoded into a 480p chunk collection, etc. In such a case, to fetch or download a chunk it is necessary to provide the Content Collection ID together with the Chunk ID. The Content Collection ID could be the name of the main file.
That is, according to the present methods of p2p based file sharing, it is not possible to fetch or download a chunk by the Chunk ID alone without providing its main file or Content Collection ID.
One of the issues with present methods of p2p based file sharing is the duplication of contents, which as a further effect reduces the number of seeders for a given content over a p2p overlay network. Seeders are those who share the complete file over the p2p overlay network.
If the main file is copied, renamed as another file and shared over the p2p network, the contents are duplicated as if they were two different files, because two entries reach the Tracker: one for the main file and one for the copied file. Some p2p systems detect this and share only one file, thereby avoiding the content duplication.
The vast majority of a video file is the video content; the audio content is a smaller part compared to the video content. The audio content of a 3D full high definition video with left and right views is a very small part compared to its video content. If the audio part of such a video file is replaced with a different language and the file is shared using the present methods of p2p based file sharing, the video part, which is in fact the same, still gets duplicated as if it were completely different content.
A true chunk-level p2p file sharing avoids this duplication.
[D2] SUMMARY OF THE INVENTION
The methods and systems disclosed here implement a granular chunk-level peer-to-peer (p2p) file sharing system.
Chunks are parts or pieces of a larger file or stream, produced by a segmentation process. Chunks stored as separate individual files are efficient and suitable for peer-to-peer (p2p) file sharing at granular chunk-level.
File sharing at granular chunk-level avoids content duplication and thereby increases potential seeders in the p2p overlay network.
To implement file sharing at granular chunk-level, the following requirements have to be met:
1. Chunks must be assigned with unique chunk identifications so that different chunks are assigned with different chunk identifications (Chunk IDs) and identical chunks are assigned with identical Chunk IDs.
2. A method to generate different Chunk IDs for different chunks and identical Chunk IDs for identical chunks.
3. A p2p request message method to get a unique Chunk ID by providing one or more cryptographic digests.
4. In response to a request for a new unique Chunk ID, a p2p response message with a status code followed by a list of one or more chunk identifications, so that the given chunks can be binary compared when the passed cryptographic digests already exist in a registry.
5. A p2p request message method to get a unique Chunk ID after a binary comparison of given chunks when all chunks to compare are different.
6. A p2p request message method to get a chunk from a remote p2p storage service by the Chunk ID.
7. A system, i.e., a p2p storage service, which implements p2p request message methods to request a chunk by Chunk ID for both local and remote clients.
8. A p2p request message method to get a local chunk within the same Peer by the Chunk ID.
9. A p2p request message method to check existence of a chunk at a local or remote p2p storage service by the Chunk ID.
A content-aware method of segmenting content, with video content into video chunks, audio content into audio chunks, and subtitle content into subtitle chunks, complements the requirements for a method of chunk-level p2p file sharing and solves the issue of duplication associated with the present methods of p2p based file sharing.
To satisfy the requirements for chunk-level peer-to-peer (p2p) file sharing, we disclose the following methods and systems:
1. Four (4) methods to generate different Chunk IDs for different chunks and identical Chunk IDs for identical chunks.
2. Two (2) p2p request message methods to get a unique Chunk ID by providing one or more cryptographic digests.
3. A p2p response message with a status code followed by a list of one or more chunk identifications to binary compare given one or more chunks.
4. Two (2) p2p request message methods to get a unique Chunk ID after a binary comparison of given chunks when all chunks to compare are different.
5. A p2p request message method to download a chunk from a remote p2p storage service by the Chunk ID.
6. A p2p request message method to download part of a chunk from a remote p2p storage service by the Chunk ID.
7. P2P request message method to initiate the upload of a chunk to a local p2p storage service by the Chunk ID.
8. P2P request message method to get a local chunk within the same Peer by the Chunk ID.
9. P2P request message method to check existence of a chunk at a local or remote p2p storage service by the Chunk ID.
10. A software system, chunk-level p2p storage service.
11. P2P request message method to get Peer list from a Tracker by the Chunk ID.
12. P2P request message method to issue a Tracker Announcement by the Chunk ID.
13. A software system, chunk-level Tracker service.
[D3] BRIEF DESCRIPTION OF THE DRAWINGS
With specific reference now to the drawings in detail, it is stressed that the particulars shown are by way of example and for purposes of illustrative discussion of the preferred embodiments of the present invention, and are presented in the context of the most useful and readily understood description of the principles and conceptual aspects of the invention.
FIG. 01 is a diagrammatic representation of the present method of segmentation of content, in general. FIG. 01A shows segmenting a given file and identifying segments or chunks of the file as an offset relative to the start of the file, where only a metadata file is generated as a result of the process; FIG. 01B shows creating segments or chunks of the main file as separate individual smaller files, i.e. Chunks, and identifying such chunks uniquely but relative to the main input file (eg. Chunk 1 of file X.mp4); and FIG. 01C shows encoding a main file into multiple resolution content collections and identifying chunks uniquely but relative to the content collection.
FIG. 02 is the disclosed segmentation method, in general, where the segmentation program/system fetches unique chunk identifications from an external chunk identification generation server.
FIG. 03 shows the present p2p file sharing methodology. Step 1 shows the Tracker Announcement after segmentation at Peer 1 (P1); Step 2 shows a Tracker Announcement by a p2p storage server, Peer 2 (P2); Step 3 shows a peer (P3) attempting to get the peer list from the Tracker; Step 4 shows the Tracker sending the peer list to P3; Step 5 shows P3, now aware of who is sharing the file, making a request for a chunk from P1; Step 6 shows P1 sending the requested chunk to P3; Steps 7 & 8 show P3 similarly downloading chunks from P2.
FIG. 04 shows the big picture of the disclosed chunk-level p2p file sharing. Stage 1 shows segmentation and posting of chunks to the main peer P1's local p2p storage service. Stage 2 shows the Tracker Announcement from the main peer (P1), announcing the availability of new chunks. Stage 3 shows that once the peer P2 obtains the metadata file, it requests chunks from its local p2p storage service, and the peer P2's local p2p storage service contacts the Tracker to get peer lists for chunks. Stage 4 shows peer P2 downloading chunks from the main peer P1.
FIG. 05 shows the big picture of the method of generating a unique chunk identification by providing one cryptographic digest and size of the chunk as parameters. FIG. 05A shows the communication steps involved, and parameters supplied. FIG. 05B shows the table of rules for the process.
FIG. 06 shows the big picture of the method of generating a unique chunk identification by providing two cryptographic digests and size of the chunk as parameters. FIG. 06A shows the communication steps involved, and parameters supplied. FIG. 06B shows the table of rules for the process.
FIG. 07 shows the big picture of the method of generating a unique chunk identification (generated ID based) by providing one digest and size of the chunk as parameters. FIG. 07A shows the communication steps involved, and parameters supplied. FIG. 07B shows the table of rules for the process.
FIG. 08 shows the big picture of the method of generating a unique chunk identification (generated ID based) by providing two digests and size of the chunk as parameters. FIG. 08A shows the communication steps involved, and parameters supplied. FIG. 08B shows the table of rules for the process.
[D4] DETAILED DESCRIPTION
[D4.1] Methods to generate different Chunk IDs for different chunks and identical Chunk IDs for identical chunks
Downloading chunks by chunk identification (Chunk ID) alone, i.e., without knowing the main file that a chunk belongs to, is a fundamental requirement of a chunk-level peer-to-peer (p2p) file sharing system.
Assigning a cryptographic digest of a chunk, such as SHA-512, as the chunk's chunk identification comes close to satisfying this requirement. However, although it is rare, it is mathematically possible for different chunks to have the same cryptographic digest, so assigning a cryptographic digest of a chunk as the chunk identification is not suitable for large content collections.
Therefore, four (4) methods (D4.1.1, D4.1.2, D4.1.3 and D4.1.4) are disclosed for the purpose.
[D4.1.1] A method of generating a unique chunk identification by providing one cryptographic digest and size of the chunk as parameters
An external Chunk Identification Generation Server maintains a table with three (3) columns: the first column is "Digest", the second column is "Size" and the third column is "Chunk ID". The chunk identification request includes as parameters the cryptographic digest of the chunk and the size of the chunk in bytes. The returned chunk identification is unique. The SHA-2 (SHA-256 or higher) or SHA-3 (256-bit digest or higher) cryptographic hash functions are recommended to generate the cryptographic digest of the chunk.
If the received digest is not in the table, the server returns the received digest as the chunk identification and inserts an entry into the table. If the received digest exists in the table but the size is different, the server returns the received digest appended with one or more alphanumeric characters to make it a unique identification, as the chunk identification, and inserts an entry into the table. Eg. digest+X or digest+XX, where X is an alphanumeric character from 0 to 9, a to z and A to Z, giving 62 possible values per alphanumeric character.
If there are one or more entries in the table matching the received digest and the size, the server returns the chunk identifications of those entries in a response message with a status code followed by a list of one or more chunk identifications, to indicate that the client should download the given chunks and binary compare them with the chunk requiring an identification. If the client finds that the chunk requiring an identification is identical to one of the given chunks, the client should use the identification of the matching chunk as the identification for the chunk requiring an identification. If the client finds that the chunk requiring an identification is not identical to any of the given chunks, the client should issue a different chunk identification request method to the Chunk Identification Generation Server, which indicates that the client is making this request after binary comparison and includes as parameters the cryptographic digest of the chunk, the size of the chunk in bytes, and the list of one or more chunk identifications received for the binary comparison. The Chunk Identification Generation Server verifies the given list of chunk identifications against the supplied digest and size, returns the received digest appended with one or more alphanumeric characters to make it a unique identification, as the chunk identification, and inserts an entry into the table.
Eg. Status-code Binary compare\r\n
Chunk_IDx Chunk_IDy ... \r\n
This process is depicted in the diagram FIG. 05.
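The rules of this method can be illustrated with a minimal in-memory sketch. It is not the disclosed server implementation: the class name `ChunkIdServer`, the status strings `SUCCESS`, `BINARY_COMPARE` and `FAILURE`, and the rule that the list supplied after binary comparison must match the registry exactly are all assumptions made for illustration.

```python
import itertools
import string

# 0-9, a-z, A-Z: the 62 suffix characters named in the text.
ALPHANUM = string.digits + string.ascii_lowercase + string.ascii_uppercase


class ChunkIdServer:
    """Hypothetical in-memory stand-in for the Chunk Identification Generation Server."""

    def __init__(self):
        self.rows = []  # each row is (digest, size, chunk_id)

    def _new_id(self, digest):
        used = {chunk_id for _, _, chunk_id in self.rows}
        if digest not in used:
            return digest
        # Try digest+X, then digest+XX, ... until an unused ID is found.
        for length in itertools.count(1):
            for suffix in itertools.product(ALPHANUM, repeat=length):
                candidate = digest + "".join(suffix)
                if candidate not in used:
                    return candidate

    def _matches(self, digest, size):
        return [cid for d, s, cid in self.rows if d == digest and s == size]

    def get_unique_chunk_id(self, digest, size):
        matches = self._matches(digest, size)
        if matches:
            # Same digest and size already registered: ask the client to compare.
            return ("BINARY_COMPARE", matches)
        chunk_id = self._new_id(digest)
        self.rows.append((digest, size, chunk_id))
        return ("SUCCESS", chunk_id)

    def get_id_after_binary_comparison(self, digest, size, compared_ids):
        # Assumed verification rule: the list must equal the registered matches.
        if sorted(compared_ids) != sorted(self._matches(digest, size)):
            return ("FAILURE", "chunk list does not match the registry")
        chunk_id = self._new_id(digest)
        self.rows.append((digest, size, chunk_id))
        return ("SUCCESS", chunk_id)
```

In this sketch a first request for a digest returns the digest itself, a later request with the same digest but a different size returns the digest with a one-character suffix, and a request repeating a registered digest and size returns the list of identifications to compare.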
[D4.1.2] A method of generating a unique chunk identification by providing two cryptographic digests and size of the chunk as parameters
An external Chunk Identification Generation Server maintains a table with four (4) columns: the first column is "Short Digest", the second column is "Long Digest", the third column is "Size" and the fourth column is "Chunk ID". The chunk identification request includes as parameters the short cryptographic digest of the chunk (eg. 256-bit digest), the long cryptographic digest of the chunk (eg. 512-bit digest) and the size of the chunk in bytes. The returned chunk identification is unique. The SHA-2 (SHA-256 or higher) or SHA-3 (256-bit digest or higher) cryptographic hash functions are recommended to generate the cryptographic digests of the chunk. SHA-256 is recommended to generate the short cryptographic digest and SHA-3 (512-bit digest) is recommended to generate the long cryptographic digest of the chunk.
If the received short digest is not in the table, the server returns the received short digest as the chunk identification and inserts an entry into the table. If the received short digest exists in the table but the related long digest is different, the server returns the received short digest appended with one or more alphanumeric characters to make it a unique identification, as the chunk identification, and inserts an entry into the table. Eg. digest+X or digest+XX, where X is an alphanumeric character from 0 to 9, a to z and A to Z, giving 62 possible values per alphanumeric character.
If the received short and long digests exist together in a single entry in the table but the size is different, the server returns the received short digest appended with one or more alphanumeric characters to make it a unique identification, as the chunk identification, and inserts an entry into the table.
If there are one or more entries in the table matching the received short digest, long digest and the size (FIG. 06B, entries 1 and 4), the server returns the chunk identifications of those entries in a response message with a status code followed by a list of one or more chunk identifications, to indicate that the client should download the given chunks and binary compare them with the chunk requiring an identification. If the client finds that the chunk requiring an identification is identical to one of the given chunks, the client should use the identification of the matching chunk as the identification for the chunk requiring an identification. If the client finds that the chunk requiring an identification is not identical to any of the given chunks, the client should issue a different chunk identification request method to the Chunk Identification Generation Server, which indicates that the client is making this request after binary comparison and includes as parameters the short cryptographic digest of the chunk, the long digest of the chunk, the size of the chunk in bytes, and the list of one or more chunk identifications received for the binary comparison. The Chunk Identification Generation Server verifies the given list of chunk identifications against the supplied short digest, long digest and size, returns the received short digest appended with one or more alphanumeric characters to make it a unique identification, as the chunk identification, and inserts an entry into the table.
Eg. Status-code Binary compare\r\n
Chunk_IDx Chunk_IDy ... \r\n
This process is depicted in the diagram FIG. 06.
An advantage of this method is that it reduces the number of potential binary comparisons of chunks.
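As a rough sketch of why the second digest reduces binary comparisons, the following computes the two recommended digests with Python's standard `hashlib` module and classifies a request against the four-column table. The function names `chunk_digests` and `classify_request` and the returned rule labels are hypothetical and only serve to name the cases described above.

```python
import hashlib


def chunk_digests(chunk: bytes):
    """Return (short digest, long digest, size) as recommended in the text:
    SHA-256 for the short digest, SHA-3 with a 512-bit digest for the long one."""
    short = hashlib.sha256(chunk).hexdigest()
    long_ = hashlib.sha3_512(chunk).hexdigest()
    return short, long_, len(chunk)


def classify_request(rows, short, long_, size):
    """rows holds (short_digest, long_digest, size, chunk_id) tuples.

    Only a match on all three parameters leads to a binary comparison;
    every other case yields a new identification straight away."""
    matches = [cid for s, l, z, cid in rows if (s, l, z) == (short, long_, size)]
    if matches:
        return ("BINARY_COMPARE", matches)
    if any(s == short for s, _, _, _ in rows):
        # Short digest known, but long digest or size differs.
        return ("NEW_ID_WITH_SUFFIX", None)
    return ("NEW_ID_IS_SHORT_DIGEST", None)
```

With a single digest, any two chunks sharing that digest and size must be downloaded and compared; here they must additionally share a second, independent digest before a comparison is requested.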
[D4.1.3] A method of generating a unique chunk identification (generated ID based) by providing one digest and size of the chunk as parameters
This method comprises one or more external Chunk Identification Requests Processing Servers and a Chunk Identification Generation Server. Requests for new chunk identifications are received by the Chunk Identification Requests Processing Server, and when the Chunk Identification Requests Processing Server requires a new unique chunk identification, it fetches one from the Chunk Identification Generation Server.
The Chunk Identification Requests Processing Server maintains a table with three (3) columns: the first column is "Digest", the second column is "Size" and the third column is "Chunk ID". The chunk identification request includes as parameters the cryptographic digest of the chunk and the size of the chunk in bytes. The returned chunk identification is unique. The SHA-2 (SHA-256 or higher) or SHA-3 (256-bit digest or higher) cryptographic hash functions are recommended to generate the cryptographic digest of the chunk.
If the received digest is not in the table, the server fetches a new identification from the Chunk Identification Generation Server, returns the new identification as the chunk identification and inserts an entry into the table. If the received digest exists in the table but the size is different, the server fetches a new identification from the Chunk Identification Generation Server, returns the new identification as the chunk identification and inserts an entry into the table.
If there are one or more entries in the table matching the received digest and the size, the server returns the chunk identifications of those entries in a response message with a status code followed by a list of one or more chunk identifications, to indicate that the client should download the given chunks and binary compare them with the chunk requiring an identification. If the client finds that the chunk requiring an identification is identical to one of the given chunks, the client should use the identification of the matching chunk as the identification for the chunk requiring an identification. If the client finds that the chunk requiring an identification is not identical to any of the given chunks, the client should issue a different chunk identification request method to the Chunk Identification Requests Processing Server, which indicates that the client is making this request after binary comparison and includes as parameters the cryptographic digest of the chunk, the size of the chunk in bytes, and the list of one or more chunk identifications received for the binary comparison. The Chunk Identification Requests Processing Server verifies the given list of chunk identifications against the supplied digest and size, fetches a new identification from the Chunk Identification Generation Server, returns the new identification as the chunk identification and inserts an entry into the table.
Eg. Status-code Binary compare\r\n
Chunk_IDx Chunk_IDy ... \r\n
A chunk identification generation service, or an external server, may pre-generate a large number of contiguous chunk identifications but pick and return one of those generated chunk identifications either randomly or based on a criterion when a new chunk identification is requested.
This process is depicted in the diagram FIG. 07.
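The pre-generation option can be sketched as a pool of contiguous numeric identifications from which one is drawn at random per request. The class name `IdPool`, the use of decimal strings as identifications and the behaviour on exhaustion are assumptions; the text does not fix the identification format.

```python
import random


class IdPool:
    """Hypothetical pool of pre-generated contiguous chunk identifications."""

    def __init__(self, start, count, rng=None):
        self._free = list(range(start, start + count))
        self._rng = rng or random.Random()

    def next_id(self):
        if not self._free:
            raise RuntimeError("identification pool exhausted")
        # Pick a random free entry, swap it to the end and pop it, so that
        # each identification is handed out exactly once.
        i = self._rng.randrange(len(self._free))
        self._free[i], self._free[-1] = self._free[-1], self._free[i]
        return str(self._free.pop())
```

Drawing at random rather than in sequence means consecutive requests do not receive consecutive identifications, while uniqueness is still guaranteed because each entry leaves the pool when it is returned.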
[D4.1.4] A method of generating a unique chunk identification (generated ID based) by providing two digests and size of the chunk as parameters
This method comprises one or more external Chunk Identification Requests Processing Servers and a Chunk Identification Generation Server. Requests for new chunk identifications are received by the Chunk Identification Requests Processing Server, and when the Chunk Identification Requests Processing Server requires a new unique chunk identification, it fetches one from the Chunk Identification Generation Server.
The Chunk Identification Requests Processing Server maintains a table with four (4) columns: the first column is "Short Digest", the second column is "Long Digest", the third column is "Size" and the fourth column is "Chunk ID". The chunk identification request includes as parameters the short cryptographic digest of the chunk (eg. 256-bit digest), the long cryptographic digest of the chunk (eg. 512-bit digest) and the size of the chunk in bytes. The returned chunk identification is unique. The SHA-2 (SHA-256 or higher) or SHA-3 (256-bit digest or higher) cryptographic hash functions are recommended to generate the cryptographic digests of the chunk. SHA-256 is recommended to generate the short cryptographic digest and SHA-3 (512-bit digest) is recommended to generate the long cryptographic digest of the chunk.
If the received short digest is not in the table, the Chunk Identification Requests Processing Server fetches a new identification from the Chunk Identification Generation Server, returns the new identification as the chunk identification and inserts an entry into the table. If the received short digest exists in the table but the related long digest is different, the Chunk Identification Requests Processing Server fetches a new identification from the Chunk Identification Generation Server, returns the new identification as the chunk identification and inserts an entry into the table.
If the received short and long digests exist together in a single entry in the table but the size is different, the Chunk Identification Requests Processing Server fetches a new identification from the Chunk Identification Generation Server, returns the new identification as the chunk identification and inserts an entry into the table.
If there are one or more entries in the table matching the received short digest, long digest and the size (FIG. 08B, entries 1 and 4), the server returns the chunk identifications of those entries in a response message with a status code followed by a list of one or more chunk identifications, to indicate that the client should download the given chunks and binary compare them with the chunk requiring an identification. If the client finds that the chunk requiring an identification is identical to one of the given chunks, the client should use the identification of the matching chunk as the identification for the chunk requiring an identification. If the client finds that the chunk requiring an identification is not identical to any of the given chunks, the client should issue a different chunk identification request method to the Chunk Identification Requests Processing Server, which indicates that the client is making this request after binary comparison and includes as parameters the short cryptographic digest of the chunk, the long digest of the chunk, the size of the chunk in bytes, and the list of one or more chunk identifications received for the binary comparison. The Chunk Identification Requests Processing Server verifies the given list of chunk identifications against the supplied short digest, long digest and size, fetches a new identification from the Chunk Identification Generation Server, returns the new identification as the chunk identification and inserts an entry into the table.
Eg. Status-code Binary compare\r\n
Chunk_IDx Chunk_IDy ... \r\n
A chunk identification generation service, or an external server, may pre-generate a large number of contiguous chunk identifications but pick and return one of those generated chunk identifications either randomly or based on a criterion when a new chunk identification is requested.
This process is depicted in the diagram FIG. 08.
An advantage of this method is that it reduces the number of potential binary comparisons of chunks.
[D4.2] P2P request message methods to get a unique Chunk ID by providing one or more cryptographic digests
Two (2) methods (D4.2.1 and D4.2.2) are provided for the purpose.
[D4.2.1] P2P request message method to get a unique Chunk ID, which takes a cryptographic digest and size of the chunk as parameters.
The P2P request message method format is as follows:
GET_UNIQUE_CHUNK_ID digest size[\r\n]
The p2p service version number may optionally also be added to the above message.
The method name GET_UNIQUE_CHUNK_ID, digest and size may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
The "digest" is the cryptographic digest of the chunk, in text format.
The "size" is the size of the chunk in bytes, in text format.
The trailing \r\n is optional, but at least a trailing \n is recommended. The \r is Carriage Return and \n is new line or Line Feed.
The GET_UNIQUE_CHUNK_ID method returns:
1. A response message with status code for success followed by a chunk identification.
Eg. Status-code Success\r\n
Chunk_ID\r\n
2. A response message with status code and description for failure.
3. A response message with status code to binary compare given chunk identifications, followed by a list of one or more chunk identifications separated by either a space or a delimiter.
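A minimal sketch of composing this request and splitting the three kinds of response is given below. It assumes a single space as the delimiter and that the status code is the first space-separated token of the first response line; the numeric codes used in the usage note are placeholders, since the text does not define status code values.

```python
def build_get_unique_chunk_id(digest: str, size: int) -> bytes:
    """Compose the request line with a single space as delimiter."""
    return f"GET_UNIQUE_CHUNK_ID {digest} {size}\r\n".encode("ascii")


def parse_response(raw: bytes):
    """Split a response into (status code, description, chunk identifications).

    Assumption: the first token of the first line is the status code and the
    optional second line carries space-separated chunk identifications."""
    lines = raw.decode("ascii").splitlines()
    code, _, description = lines[0].partition(" ")
    chunk_ids = lines[1].split() if len(lines) > 1 else []
    return code, description, chunk_ids
```

For example, `parse_response(b"300 Binary compare\r\nid1 id2\r\n")` yields the code `"300"`, the description `"Binary compare"` and the list `["id1", "id2"]`, where `300` is an invented placeholder code.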
[D4.2.2] P2P request message method to get a unique Chunk ID, which takes two different cryptographic digests and size of the chunk as parameters.
The P2P request message method format is as follows:
GET_UNIQUE_CHUNK_ID short_digest long_digest size[\r\n]
The p2p service version number may optionally also be added to the above message.
The method name GET_UNIQUE_CHUNK_ID, short_digest, long_digest and size may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
The "short_digest" is a cryptographic digest of the chunk, in text format. Eg. SHA256
The "long_digest" is a cryptographic digest of the chunk, in text format. Eg. SHA512
The "size" is the size of the chunk in bytes, in text format.
The trailing \r\n is optional, but at least a trailing \n is recommended. The \r is Carriage Return and \n is new line or Line Feed.
The GET_UNIQUE_CHUNK_ID method returns:
1. A response message with status code for success followed by a chunk identification.
Eg. Status-code Success\r\n
Chunk_ID\r\n
2. A response message with status code and description for failure.
3. A response message with status code to binary compare given chunk identifications, followed by a list of one or more chunk identifications separated by either a space or a delimiter.
[D4.3] P2P request message methods to get a unique Chunk ID after a binary comparison of given chunks when all chunks to compare are different.
Two (2) methods (D4.3.1 and D4.3.2) are provided for the purpose.
[D4.3.1] P2P request message method to get a unique Chunk ID after a binary comparison of chunks, which takes a cryptographic digest of the chunk, the size of the chunk, and the list of one or more chunk identifications given to binary compare, as parameters.
The P2P request message method format is as follows:
GET_UNIQUE_CHUNK_ID_AFTER_BINARY_COMPARISON digest size chunk_list[\r\n]
The p2p service version number may optionally also be added to the above message.
The method name GET_UNIQUE_CHUNK_ID_AFTER_BINARY_COMPARISON, digest, size and chunk_list may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
The "digest" is the cryptographic digest of the chunk, in text format.
The "size" is the size of the chunk in bytes, in text format.
The "chunk_list" is the list of one or more chunk identifications given to binary compare, in text format.
The trailing \r\n is optional, but at least a trailing \n is recommended. The \r is Carriage Return and \n is new line or Line Feed.
The GET_UNIQUE_CHUNK_ID_AFTER_BINARY_COMPARISON method returns:
1. A response message with status code for success followed by a chunk identification.
Eg. Status-code Success\r\n
Chunk_ID\r\n
2. A response message with status code and description for failure.
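The client side of this exchange can be sketched as follows: compare the local chunk with each chunk named in the list, reuse a matching identification if one is found, and otherwise compose the request above. The function name, the `fetch_chunk` callable standing in for a download, the choice of SHA-256 for the digest and the space-separated chunk list are assumptions made for illustration.

```python
import hashlib


def resolve_after_binary_compare(chunk: bytes, candidate_ids, fetch_chunk):
    """Return (matching_id, None) if an identical chunk exists, otherwise
    (None, request_bytes) where request_bytes asks for a new identification.

    fetch_chunk is a hypothetical callable mapping a chunk ID to its bytes."""
    for chunk_id in candidate_ids:
        if fetch_chunk(chunk_id) == chunk:  # byte-for-byte comparison
            return chunk_id, None
    digest = hashlib.sha256(chunk).hexdigest()
    request = "GET_UNIQUE_CHUNK_ID_AFTER_BINARY_COMPARISON {} {} {}\r\n".format(
        digest, len(chunk), " ".join(candidate_ids)
    )
    return None, request.encode("ascii")
```

Because the chunk list is the last parameter, it can share the space delimiter with the other parameters without ambiguity in this sketch.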
[D4.3.2] P2P request message method to get a unique Chunk ID after a binary comparison of chunks, which takes two different cryptographic digests of the chunk, the size of the chunk, and the list of one or more chunk identifications given to binary compare, as parameters.
The P2P request message method format is as follows:
GET_UNIQUE_CHUNK_ID_AFTER_BINARY_COMPARISON short_digest long_digest size chunk_list[\r\n]
The p2p service version number may optionally also be added to the above message.
The method name GET_UNIQUE_CHUNK_ID_AFTER_BINARY_COMPARISON, short_digest, long_digest, size and chunk_list may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
The "short_digest" is a cryptographic digest of the chunk, in text format. Eg. SHA256
The "long_digest" is a cryptographic digest of the chunk, in text format. Eg. SHA512
The "size" is the size of the chunk in bytes, in text format.
The "chunk_list" is the list of one or more chunk identifications given to binary compare, in text format.
The trailing \r\n is optional, but at least a trailing \n is recommended. The \r is Carriage Return and \n is new line or Line Feed.
The GET_UNIQUE_CHUNK_ID_AFTER_BINARY_COMPARISON method returns:
1. A response message with status code for success followed by a chunk identification.
Eg. Status-code Success\r\n
Chunk_ID\r\n
2. A response message with status code and description for failure.
[D4.4] A p2p request message method to download a chunk from a remote p2p storage service by the Chunk ID.
Once the connection is established with the remote p2p storage service, issue the GET_CHUNK_REMOTE method to download the chunk.
The P2P request message method format is as follows:
GET_CHUNK_REMOTE Chunk-ID[\r\n]
The p2p service version number may optionally also be added to the above message.
The method name GET_CHUNK_REMOTE and Chunk-ID may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
The "Chunk-ID" is the unique identification of the chunk, in text format.
The trailing \r\n is optional, but at least a trailing \n is recommended. The \r is Carriage Return and \n is new line or Line Feed.
The GET_CHUNK_REMOTE method returns:
1. A response message with status code for success followed by a binary byte stream. Note that the calling client knows the length of the chunk in bytes before issuing this request message.
Eg. Status-code Success\r\n
byte stream
2. A response message with status code and description for failure.
Note, if the requested chunk does not exist at the p2p storage service, the GET_CHUNK_REMOTE method does not download the chunk.
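Since the response carries no length field, the client relies on the chunk length it already knows. The sketch below reads a status line and then exactly that many bytes from a file-like stream; the function name, the default success code `"200"` and the use of `OSError` to signal failure are assumptions for illustration.

```python
import io


def read_chunk_response(stream, expected_length, success_code="200"):
    """Read 'Status-code Description\\r\\n' followed by the raw chunk bytes.

    stream is any binary file-like object, for example a buffered socket file.
    success_code is a placeholder; the text does not define code values."""
    status_line = stream.readline().decode("ascii").strip()
    code, _, description = status_line.partition(" ")
    if code != success_code:
        raise OSError(f"chunk request failed: {status_line}")
    data = stream.read(expected_length)
    if len(data) != expected_length:
        raise OSError("byte stream ended before the expected chunk length")
    return data
```

An in-memory `io.BytesIO` object can stand in for the network stream when trying this out.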
[D4.5] A p2p request message method to download part of a chunk from a remote p2p storage service by the Chunk ID.
Once the connection is established with the remote p2p storage service, issue this method
GET_CHU NK_PART_REMOTE to download part of the chunk.
The P2P request message method format is as follows:
GET CHU NK PART REMOTE Chunk-ID Start offset[\r\n]
The p2p service version number optiona lly may also be added to the above message.
The method name G ET_CH U N K_PART_RE M OTE, Chunk-ID, Start and offset may be separated by one or more spaces or any other delim iter such as a colon (' : '), semi-colon (';'), etc.
The "Chunk-ID" is the unique identification of the chunk, in text format.
The " Start" is the starting byte position from the beginn ing of the chunk, in text format. First byte in zero.
The "offset" is the length or number of bytes to read from the starting position, in text format.
The trailing \r\n is optiona l, but at least a trailing \n is recommended. The \r is Carriage Return and \n is new line or Line Feed.
The G ET_C HUN K_PART_RE M OTE method returns:
1. A response message with status code for success, followed by the cryptographic digest of the portion to be downloaded, and followed by a binary byte stream. Note, the calling client knows the length of the part of the chunk in byes before issue this request message. The SHA256 is recommended for the returned digest.
Eg. Status-code Success\r\n
digest\r\n
byte stream
2. A response message with status code and description for failure.
Note, if the requested chunk does not exist at the p2p storage service, the GET_CHUNK_PART_REMOTE method does not download the chunk.
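The partial download exchange described above can be sketched from the client side as follows. This is only an illustration under stated assumptions: fields are separated by single spaces, the digest is a hex-encoded SHA256 string, and the success status code "200" is a hypothetical value, since the text names the outcomes but not their numeric codes. The function names are likewise illustrative.

```python
import hashlib


def build_part_request(chunk_id, start, length):
    """Format the request line; `length` is the field the text calls "offset"."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    return f"GET_CHUNK_PART_REMOTE {chunk_id} {start} {length}\r\n".encode("ascii")


def read_part_response(stream, length, success_code="200"):
    """Read status line, digest line, then `length` raw bytes, and verify them.

    `stream` is any binary file-like object, e.g. socket.makefile("rb").
    `success_code` is a hypothetical value, not taken from the source.
    """
    status = stream.readline().decode("ascii").strip()
    if status.split(" ", 1)[0] != success_code:
        raise RuntimeError(f"request failed: {status}")
    digest = stream.readline().decode("ascii").strip().lower()
    data = stream.read(length)
    # The returned digest covers only the requested portion of the chunk.
    if len(data) != length or hashlib.sha256(data).hexdigest() != digest:
        raise RuntimeError("partial chunk failed digest verification")
    return data
```

Because the client already knows how many bytes it asked for, it can read exactly that many bytes after the digest line without any further framing.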
[D4.6] P2P request message method to initiate upload a chunk to a local p2p storage service by the Chunk ID
Once a chunk is created and assigned a unique identification by a segmentation process, the chunk must be uploaded to the local p2p storage service to make it available for download by other peers, including local clients on the same peer.
Once a connection is established with the local p2p storage service, issue this method UPLOAD_CHUNK to initiate the upload of a chunk to the local p2p storage service.
The P2P request message method format is as follows:
UPLOAD_CHUNK Chunk-ID Chunk-length digest[\r\n]
The p2p service version number optionally may also be added to the above message.
The method name UPLOAD_CHUNK, Chunk-ID, Chunk-length and digest may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
The "Chunk-ID" is the unique identification of the chunk, in text format.
The "Chunk-length" is the length of the chunk in bytes, in text format.
The "digest" is a cryptographic digest of the chunk, in text format. Eg. SHA256
The trailing \r\n is optional, but at least a trailing \n is recommended. The \r is Carriage Return and \n is new line or Line Feed.
The process of the UPLOAD_CHUNK method:
1. The UPLOAD_CHUNK method returns:
(a) A response message with status code for continue to send the chunk.
Eg. Status-code Continue to send\r\n
(b) A response message with status code for the chunk already exists in the local p2p storage service's chunk store, therefore, do not send the chunk.
(c) A response message with status code and description for failure, therefore, do not send the chunk.
2. If the client receives the response message with status code for continue to send the chunk, then send the chunk to the p2p storage service as a binary byte stream.
3. Finally, the p2p storage service sends either a response message with status code for success or a response message with status code and description for failure.
Note, a p2p storage service is not allowed to upload a chunk to another p2p storage service as a security measure.
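The two-phase upload described above (announce the chunk, wait for permission, then stream the bytes) can be sketched as a client routine. This is a minimal sketch, not a definitive implementation: the three status code values are hypothetical placeholders, the digest is assumed to be hex-encoded SHA256, and `conn` is assumed to be a duplex binary file-like object such as the result of socket.makefile("rwb").

```python
import hashlib

# Hypothetical status codes; the source names the outcomes but not their values.
CONTINUE, EXISTS, SUCCESS = "100", "208", "200"


def upload_chunk(conn, chunk_id, data):
    """Run the UPLOAD_CHUNK exchange over `conn`.

    Returns True if the chunk is in the store after the call, either because
    it was uploaded now or because the service reported it already exists.
    """
    digest = hashlib.sha256(data).hexdigest()
    conn.write(f"UPLOAD_CHUNK {chunk_id} {len(data)} {digest}\r\n".encode("ascii"))
    conn.flush()
    code = conn.readline().decode("ascii").split(" ", 1)[0].strip()
    if code == EXISTS:
        return True   # step 1(b): do not send the chunk
    if code != CONTINUE:
        return False  # step 1(c): failure, do not send the chunk
    conn.write(data)  # step 2: send the chunk as a binary byte stream
    conn.flush()
    final = conn.readline().decode("ascii").split(" ", 1)[0].strip()
    return final == SUCCESS  # step 3: final success or failure
```

Checking for an existing chunk before any bytes are sent is what lets identical chunks from different files be stored once.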
[D4.7] P2P request message method to get a local chunk within the same Peer by the Chunk ID
Once the connection is established with the local p2p storage service, issue this method GET_CHUNK_LOCAL to download the chunk.
The P2P request message method format is as follows:
GET_CHUNK_LOCAL Chunk-ID digest[\r\n]
The p2p service version number optionally may also be added to the above message.
The method name GET_CHUNK_LOCAL, Chunk-ID and digest may be separated by one or more spaces or any other delimiter such as a colon (': '), semi-colon ( '; '), etc.
The "Chunk-ID" is the unique identification of the chunk, in text format.
The "digest" is a cryptographic digest of the chunk, in text format. Eg. SHA256
The trailing \r\n is optional, but at least a trailing \n is recommended. The \r is Carriage Return and \n is new line or Line Feed.
If the requested chunk is not available in the local p2p storage service's chunk store, the local p2p storage service downloads the requested chunk for the GET_CHUNK_LOCAL method before returning the full file system path of the chunk to the caller.
The GET_CHUNK_LOCAL method returns:
1. A response message with status code for success followed by the full file system path of the chunk.
Eg. Status-code Success\r\n
full file system path of the chunk\r\n
2. A response message with status code and description for failure.
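The cache-or-fetch behaviour of GET_CHUNK_LOCAL might look like the following on the service side. The sketch assumes that each chunk is stored as a file named after its Chunk ID, that Chunk IDs are plain alphanumeric strings safe to use as file names, and that the digest is hex-encoded SHA256. The `fetch_remote` callable is hypothetical and stands in for the Tracker lookup and remote download.

```python
import hashlib
import os
import tempfile


def get_chunk_local(store_dir, chunk_id, digest, fetch_remote):
    """Return the full path of a chunk, fetching and verifying it on a cache miss.

    `fetch_remote` is a hypothetical callable (chunk_id -> bytes) standing in
    for the GET_PEER_LIST / GET_CHUNK_REMOTE round trip.
    """
    path = os.path.abspath(os.path.join(store_dir, chunk_id))
    if not os.path.exists(path):
        data = fetch_remote(chunk_id)
        if hashlib.sha256(data).hexdigest() != digest.lower():
            raise ValueError("downloaded chunk does not match the requested digest")
        # Write to a temporary file first so a partial chunk is never visible
        # under its final name.
        fd, tmp = tempfile.mkstemp(dir=store_dir)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, path)
    return path
```

Returning a path rather than the bytes lets a local player open the chunk directly from the chunk store without a second copy.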
[D4.8] P2P request message method to check existence of a chunk of a p2p storage service's chunk store by the Chunk ID
Once the connection is established with a p2p storage service, issue this method CHECK_CHUNK to check the existence of a chunk in the p2p storage service.
The P2P request message method format is as follows:
CHECK_CHUNK Chunk-ID[\r\n]
The p2p service version number optionally may also be added to the above message.
The method name CHECK_CHUNK, and Chunk-ID may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
The "Chunk-ID" is the unique identification of the chunk, in text format.
The trailing \r\n is optional, but at least a trailing \n is recommended. The \r is Carriage Return and \n is new line or Line Feed.
The CHECK_CHUNK method returns:
1. A response message with status code for success if the requested chunk exists at the p2p storage service.
Eg. Status-code Success\r\n
2. A response message with status code for chunk does not exist if the requested chunk does not exist at the p2p storage service.
Eg. Status-code Chunk does not exist\r\n
3. A response message with status code and description for failure.
Note, if the requested chunk does not exist at the p2p storage service, the CHECK_CHUNK method does not download the chunk.
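A service-side handler for the three CHECK_CHUNK outcomes could be sketched as below. The numeric status codes are hypothetical, the request is assumed to be space-delimited, and chunks are assumed to be stored as files named after an alphanumeric Chunk ID.

```python
import os

# Hypothetical status codes for the three outcomes described above.
SUCCESS, NOT_FOUND, FAILURE = "200", "404", "500"


def handle_check_chunk(request_line, store_dir):
    """Answer a CHECK_CHUNK request without ever triggering a download."""
    parts = request_line.strip().split()
    # Rejecting non-alphanumeric IDs also keeps path separators out of the lookup.
    if len(parts) < 2 or parts[0] != "CHECK_CHUNK" or not parts[1].isalnum():
        return f"{FAILURE} Malformed request\r\n"
    if os.path.isfile(os.path.join(store_dir, parts[1])):
        return f"{SUCCESS} Success\r\n"
    return f"{NOT_FOUND} Chunk does not exist\r\n"
```

Any tokens after the Chunk ID, such as an optional version number, are ignored by this sketch.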
[D4.9] A chunk-level p2p storage service
The p2p storage service is also known as a seeder, which manages a chunk store.
Features of the p2p storage service:
1. Listen to a specific TCP port for connections.
2. Run as a local chunk caching service.
3. Download chunks from remote p2p storage services and store them under its management without reassembling the main file.
4. Store chunks in secondary storage such as hard disks, Solid State Drives (SSD), etc. for the long term, so that they persist between reboots or shutdowns.
5. Remove old chunks according to specified criteria when the allocated storage exceeds its capacity, the storage device runs out of space, or the used capacity exceeds a certain threshold.
6. Implements chunk existence checking under its management by Chunk ID. That is, implements the CHECK_CHUNK method.
7. Implements get chunk by Chunk ID for remote clients over the Internet. That is, implements the GET_CHUNK_REMOTE method.
8. Implements get part of a chunk by Chunk ID for remote clients over the Internet. That is, implements the GET_CHUNK_PART_REMOTE method.
9. Implements get chunk by Chunk ID for local clients within the same peer. That is, implements the GET_CHUNK_LOCAL method.
10. Implements a method to upload chunks to a local or remote p2p storage service. That is, implements the UPLOAD_CHUNK method.
11. Implements periodic Tracker Announcement of chunks. That is, implements the TRACKER_ANNOUNCE method. Chunks are Tracker Announced to the configured master Tracker.
12. Issues a Tracker Announcement for a chunk immediately after the chunk is uploaded to the p2p storage service's chunk store.
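Feature 5 leaves the removal criteria open. As one possible reading, the sketch below removes the least recently modified chunk files until the store fits within a byte budget. The choice of modification time as the criterion, the one-file-per-chunk layout and the function name are all assumptions made for illustration.

```python
import os


def evict_old_chunks(store_dir, max_bytes):
    """Delete least recently modified chunks until the store fits in max_bytes.

    Returns the list of removed chunk IDs (file names), oldest first.
    """
    entries = []
    for name in os.listdir(store_dir):
        path = os.path.join(store_dir, name)
        if os.path.isfile(path):
            st = os.stat(path)
            entries.append((st.st_mtime, name, st.st_size))
    total = sum(size for _, _, size in entries)
    removed = []
    for _, name, size in sorted(entries):  # oldest modification time first
        if total <= max_bytes:
            break
        os.remove(os.path.join(store_dir, name))
        total -= size
        removed.append(name)
    return removed
```

A real service would presumably also withdraw removed chunks from the Tracker, which this sketch does not attempt.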
[D4.10] P2P request message method to get Peer list from a Tracker by the Chunk ID
Once the connection is established with the Tracker configured for the p2p storage service, issue this method GET_PEER_LIST to get the list of peers that share the chunk.
The P2P request message method format is as follows:
GET_PEER_LIST Chunk-ID digest[\r\n]
The p2p service version number optionally may also be added to the above message.
The method name GET_PEER_LIST, Chunk-ID and digest may be separated by one or more spaces or any other delimiter such as a colon (': '), semi-colon ('; '), etc.
The "Chunk-ID" is the unique identification of the chunk, in text format.
The "digest" is a cryptographic digest of the chunk, in text format. Eg. SHA256. The implementation of this method specifies which cryptographic hash function is accepted for the digest.
Providing the cryptographic digest of the chunk in the GET_PEER_LIST request allows load-balancing for Trackers and integrity checking. A load-balancer could use the first one or more characters of the digest to decide which Tracker should serve the request.
The trailing \r\n is optional, but at least a trailing \n is recommended. The \r is Carriage Return and \n is new line or Line Feed.
The GET_PEER_LIST method returns:
1. A response message with status code for success followed by one or more Internet Protocol (IP) addresses of peers sharing the requested chunk.
Eg. Status-code Success\r\n
IP1 IP2 ...\r\n
2. A response message with status code and description for failure.
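The digest-prefix load balancing and the peer list response described above can be illustrated as follows. The sketch assumes a hex-encoded digest, a space-separated list of addresses, and a hypothetical success code "200". The modulo mapping from prefix to Tracker is one simple choice among many, not something the text prescribes.

```python
def pick_tracker(digest, trackers, prefix_len=2):
    """Map a hex digest to one Tracker using its leading characters."""
    if not trackers:
        raise ValueError("no trackers configured")
    bucket = int(digest[:prefix_len], 16)
    return trackers[bucket % len(trackers)]


def parse_peer_list(response, success_code="200"):
    """Split 'Status-code Success\\r\\nIP1 IP2 ...\\r\\n' into a list of addresses.

    `success_code` is a hypothetical value, not taken from the source.
    """
    status, _, rest = response.partition("\n")
    if status.split(" ", 1)[0].strip() != success_code:
        raise RuntimeError(f"GET_PEER_LIST failed: {status.strip()}")
    return rest.split()
```

Because a cryptographic digest is close to uniformly distributed, its first characters spread requests roughly evenly across Trackers, and the same chunk always reaches the same Tracker.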
[D4.11] P2P request message method for Tracker Announcement by the Chunk ID
The Tracker Announcement in this regard refers to the announcement of the availability of a complete chunk ready to share by a peer to the Tracker, so that the Tracker could update its registry to reflect the IP address of the peer regarding the chunk.
Once the connection is established with the Tracker, issue this method TRACKER_ANNOUNCE to announce the availability of a chunk to share.
The P2P request message method format is as follows:
TRACKER_ANNOUNCE Chunk-ID digest[\r\n]
The p2p service version number optionally may also be added to the above message.
The method name TRACKER_ANNOUNCE, Chunk-ID and digest may be separated by one or more spaces or any other delimiter such as a colon (':'), semi-colon (';'), etc.
The "Chunk-ID" is the unique identification of the chunk, in text format.
The "digest" is a cryptographic digest of the chunk, in text format. Eg. SHA256. The implementation of this method specifies which cryptographic hash function is accepted for the digest.
Providing the cryptographic digest of the chunk in the TRACKER_ANNOUNCE request further allows load-balancing for Trackers and integrity checking. A load-balancer could use the first one or more characters of the digest to decide which Tracker should serve the request.
The trailing \r\n is optional, but at least a trailing \n is recommended. The \r is Carriage Return and \n is new line or Line Feed.
In addition, the Tracker uses the chunk existence check to connect back to the peer and verify the existence of the announced chunk and the peer's ability to share it. If the Tracker finds that the announced chunk does not exist or cannot be shared, i.e. the peer is behind a firewall or on a private IP address, then the Tracker does not accept this TRACKER_ANNOUNCE request.
The TRACKER_ANNOUNCE method returns:
1. A response message with status code for success if the request accepted by the Tracker.
Eg. Status-code Success\r\n
2. A response message with status code and description for failure.
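The connect-back verification performed by the Tracker can be sketched as an acceptance function over an in-memory registry. The `check_chunk` callable is hypothetical and stands in for opening a connection to the announcing peer and issuing CHECK_CHUNK. Rejecting an announcement whose digest conflicts with the registered one is an added assumption about how the integrity check might be applied.

```python
import ipaddress


def handle_announce(registry, peer_ip, chunk_id, digest, check_chunk):
    """Accept an announcement only if the peer is reachable and holds the chunk.

    registry: dict mapping chunk_id -> {"digest": str, "peers": set of IP strings}
    check_chunk: hypothetical callable (peer_ip, chunk_id) -> bool that connects
                 back to the peer and issues CHECK_CHUNK.
    """
    if ipaddress.ip_address(peer_ip).is_private:
        return False  # a private address cannot be reached by other peers
    if not check_chunk(peer_ip, chunk_id):
        return False  # chunk missing, or the peer is behind a firewall
    entry = registry.setdefault(chunk_id, {"digest": digest, "peers": set()})
    if entry["digest"] != digest:
        return False  # assumption: conflicting digests are rejected
    entry["peers"].add(peer_ip)
    return True
```

The registry shape mirrors the three columns of the Tracker service: Chunk ID, Digest and Peer List.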
[D4.12] A chunk-level Tracker service
Features of the Tracker service:
1. Listen to a specific TCP port for connections.
2. Maintains a registry with three (3) essential columns: the first column is "Chunk ID", the second column is "Digest" and the third column is "Peer List". The "Digest" is the cryptographic digest of the chunk, provided with the Tracker Announcement. The "Peer List" consists of the IP addresses of peers who share the chunk.
3. Implements "P2P request message method to get Peer list from a Tracker by the Chunk ID" and "P2P request message method to make Tracker Announcement by the Chunk ID".
[D5] DETAILS OF CARRY OUT THE INNOVATION
The simplest form of carrying out the innovation is explained below:
1. Implement the Chunk Identification Generation Server as per [D4.1.1]. The SHA-256 digest is expected.
2. Implement the chunk-level p2p storage service as per [D4.9].
3. Implement the chunk-level Tracker service as per [D4.12].
4. From the segmentation program, connect to the Chunk Identification Generation Server, issue the method "GET_UNIQUE_CHUNK_ID digest size [\r\n]" to get a unique chunk identification, and assign it to a chunk. Note, the Chunk Identification Generation Server may request a binary comparison against one or more given chunks.
5. Post the newly created chunk to the p2p storage service using "UPLOAD_CHUNK Chunk-ID Chunk-length digest[\r\n]".
6. Repeat from step 4 for each chunk.
7. At the end of the segmentation process, the segmentation program generates the metadata file with chunk identifications, sizes and cryptographic digests for each and every chunk.
8. The remote peer obtains the metadata file and reads through the chunk list.
9. The player (or the chunk consumption program) at the remote peer connects to its local p2p storage service on the same peer and issues the method "GET_CHUNK_LOCAL Chunk-ID digest[\r\n]" for the required chunk.
10. The local p2p storage service at the remote peer then connects to the configured Tracker and issues the method "GET_PEER_LIST Chunk-ID digest[\r\n]" for the required chunk to obtain the peer list.
11. The local p2p storage service at the remote peer then connects to a remote peer, issues the method "GET_CHUNK_REMOTE Chunk-ID [\r\n]" and obtains the chunk. For every chunk downloaded, the computed cryptographic digest must be compared with the digest from the metadata file for the chunk.
12. The local p2p storage service at the remote peer now completes the GET_CHUNK_LOCAL by returning the local full file system path of the chunk.
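The download side of this walkthrough (read the metadata file, look up peers, fetch each chunk and verify its digest) can be sketched as below. The metadata layout of one "chunk_id size digest" line per chunk is an assumption, since the text lists the fields but not the file format. The two callables are hypothetical stand-ins for GET_PEER_LIST and GET_CHUNK_REMOTE.

```python
import hashlib


def parse_metadata(text):
    """Parse lines of 'chunk_id size digest' (a hypothetical metadata layout)."""
    chunks = []
    for line in text.splitlines():
        if line.strip():
            chunk_id, size, digest = line.split()
            chunks.append((chunk_id, int(size), digest.lower()))
    return chunks


def fetch_all(metadata_text, get_peer_list, get_chunk_remote):
    """Yield verified chunk bytes in metadata order, trying peers in turn.

    get_peer_list(chunk_id, digest) -> list of peer addresses
    get_chunk_remote(peer, chunk_id, size) -> bytes, or None on failure
    """
    for chunk_id, size, digest in parse_metadata(metadata_text):
        for peer in get_peer_list(chunk_id, digest):
            data = get_chunk_remote(peer, chunk_id, size)
            # Compare the computed digest with the digest from the metadata file.
            if (data is not None and len(data) == size
                    and hashlib.sha256(data).hexdigest() == digest):
                yield data
                break
        else:
            raise RuntimeError(f"no peer served a valid copy of chunk {chunk_id}")
```

Verifying against the metadata digest rather than trusting the serving peer means a corrupted or malicious copy is simply skipped in favour of the next peer.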
Claims
1. Chunks are assigned with unique chunk identifications so that different chunks are assigned with different chunk identifications (Chunk IDs) and identical chunks are assigned with identical Chunk IDs.
2. A method of generating an unique chunk identification (Chunk ID) by providing one cryptographic digest and size of the chunk as parameters:
The method further comprising following steps:
(a) An external Chunk Identification Generation Server maintains a table with three(3) columns, first column is "Digest", second column is "Size" and third column is "Chunk ID"; Chunk identification request includes parameters cryptographic digest of the chunk and the size of the chunk in bytes;
(b) If the received digest not in the table, returns the received digest as the chunk identification and inserts an entry into the table; If the received digest exists in the table but the size is different, returns the received digest appended with one or more alphanumeric characters to make it an unique identification, as the chunk identification and inserts an entry into the table;
(c) If there are one or more entries in the table matching the received digest and the size, returns the chunk identifications of entries in the table matching the received digest and the size in a response message with a status code followed by a list of one or more chunk identifications to indicate the client should download and binary compare the given chunks with the chunk requiring an identification; If the client finds the chunk requiring an identification is identical to one of those given chunks to binary compare, then the client should use the identification of the matching chunk as the identification for the chunk requiring an identification; If the client finds the chunk requiring an identification is not identical to any one of those given chunks, then the client should issue a different chunk identification request method to the Chunk Identification Generation Server which indicates the client is making this request after binary comparison and includes parameters cryptographic digest of the chunk, the size of the chunk in bytes, and the list of one or more chunk identifications received for the binary comparison and the Chunk Identification Generation Server verifies the list of one or more chunk identifications given with regards to the supplied digest and the size and returns the received digest appended with one or more alphanumeric characters to make it an unique identification, as the chunk identification and inserts an entry into the table.
3. A method of generating an unique chunk identification by providing two cryptographic digests and size of the chunk as parameters:
The method further comprising following steps:
(a) An external Chunk Identification Generation Server maintains a table with four(4) columns, first column is "Short Digest", second column is "Long Digest", third column is "Size" and fourth column is "Chunk ID"; Chunk identification request includes parameters, short cryptographic digest of the chunk, long cryptographic digest of the chunk and the size of the chunk in bytes;
(b) If the received short digest not in the table, returns the received short digest as the chunk identification and inserts an entry into the table; If the received short digest exists in the table but the related long digest is different, returns the received short digest appended with one or more alphanumeric characters to make it an unique identification, as the chunk identification and inserts an entry into the table;
If the received short and long digests exists together in a single entry in the table but the size is different, returns the received short digest appended with one or more alphanumeric characters to make it an unique identification, as the chunk identification and inserts an entry into the table;
(c) If there are one or more entries in the table matching the received short digest, long digest and the size, returns the chunk identifications of entries in the table matching the received short digest, long digest and the size in a response message with a status code followed by a list of one or more chunk identifications to indicate the client should download and binary compare the given chunks with the chunk requiring an identification; If the client finds the chunk requiring an identification is identical to one of those given chunks to binary compare, then the client should use the identification of the matching chunk as the identification for the chunk requiring an identification; If the client finds the chunk requiring an identification is not identical to any one of those given chunks, then the client should issue a different chunk identification request method to the Chunk Identification Generation Server which indicates the client is making this request after binary comparison and includes parameters cryptographic short digest of the chunk, long digest of the chunk, the size of the chunk in bytes, and the list of one or more chunk identifications received for the binary comparison and the Chunk Identification Generation Server verifies the list of one or more chunk identifications given with regards to the supplied short digest, long digest and the size and Chunk Identification Generation Server returns the received short digest appended with one or more alphanumeric characters to make it an unique identification, as the chunk identification and inserts an entry into the table.
4. A method of generating an unique chunk identification (generated ID based) by providing one digest and size of the chunk as parameters:
The method further comprising following steps:
(a) An external Chunk Identification Requests Processing Server maintains a table with three(3) columns, first column is "Digest", second column is "Size" and third column is "Chunk ID"; Chunk identification request includes parameters cryptographic digest of the chunk and the size of the chunk in bytes;
(b) If the received digest not in the table, the Chunk Identification Requests Processing Server fetches a new identification from the Chunk Identification Generation Server and returns the new identification as the chunk identification and inserts an entry into the table; If the received digest exists in the table but the size is different, the Chunk Identification Requests Processing Server fetches a new identification from the Chunk Identification Generation Server and returns the new identification as the chunk identification and inserts an entry into the table.
(c) If there are one or more entries in the table matching the received digest and the size, returns the chunk identifications of entries in the table matching the received digest and the size in a response message with a status code followed by a list of one or more chunk identifications to indicate the client should download and binary compare the given chunks with the chunk requiring an identification; If the client finds the chunk requiring an identification is identical to one of those given chunks to binary compare, then the client should use the identification of the matching chunk as the identification for the chunk requiring an identification; If the client finds the chunk requiring an identification is not identical to any one of those given chunks, then the client should issue a different chunk identification request method to the Chunk Identification Requests Processing Server which indicates the client is making this request after binary comparison and includes parameters cryptographic digest of the chunk, the size of the chunk in bytes, and the list of one or more chunk identifications received for the binary comparison and the Chunk Identification Requests Processing Server verifies the list of one or more chunk identifications given with regards to the supplied digest and the size and the Chunk Identification Requests Processing Server fetches a new identification from the Chunk Identification Generation Server and returns the new identification as the chunk identification and inserts an entry into the table.
5. A method of generating an unique chunk identification (generated ID based) by providing two digests and size of the chunk as parameters:
The method further comprising following steps:
(a) An external Chunk Identification Requests Processing Server maintains a table with four(4) columns, first column is "Short Digest", second column is "Long Digest", third column is "Size" and fourth column is "Chunk ID"; Chunk identification request includes parameters, short cryptographic digest of the chunk, long cryptographic digest of the chunk and the size of the chunk in bytes;
(b) If the received short digest not in the table, the Chunk Identification Requests Processing Server fetches a new identification from the Chunk Identification Generation Server and returns the new identification as the chunk identification and inserts an entry into the table; If the received short digest exists in the table but the related long digest is different, the Chunk Identification Requests Processing Server fetches a new identification from the Chunk Identification Generation Server and returns the new identification as the chunk identification and inserts an entry into the table.
If the received short and long digests exists together in a single entry in the table but the size is different, the Chunk Identification Requests Processing Server fetches a new identification from the Chunk Identification Generation Server and returns the new identification as the chunk identification and inserts an entry into the table;
(c) If there are one or more entries in the table matching the received short digest, long digest and the size, returns the chunk identifications of entries in the table matching the received short digest, long digest and the size in a response message with a status code followed by a list of one or more chunk identifications to indicate the client should download and binary compare the given chunks with the chunk requiring an identification; If the client finds the chunk requiring an identification is identical to one of those given chunks to binary compare, then the client should use the identification of the matching chunk as the identification for the chunk requiring an identification; If the client finds the chunk requiring an identification is not identical to any one of those given chunks, then the client should issue a different chunk identification request method to the Chunk Identification Requests Processing Server which indicates the client is making this request after binary comparison and includes parameters cryptographic short digest of the chunk, long digest of the chunk, the size of the chunk in bytes, and the list of one or more chunk identifications received for the binary comparison and the Chunk Identification Requests Processing Server verifies the list of one or more chunk identifications given with regards to the supplied short digest, long digest and the size and the Chunk Identification Requests Processing Server fetches a new identification from the Chunk Identification Generation Server and returns the new identification as the chunk identification and inserts an entry into the table.
6. P2P request message method to get an unique Chunk ID from an external server, which takes a cryptographic digest of the chunk and size of the chunk as parameters; And the method returns: (a) A response message with status code for success followed by a chunk identification or (b) A response message with status code and description for failure or (c) A response message with status code to binary compare given chunk identifications, and followed by a list of one or more chunk identifications separated by either space or a delimiter.
7. P2P request message method to get an unique Chunk ID from an external server, which takes two different cryptographic digests of the chunk and size of the chunk as parameters; And the method returns: (a) A response message with status code for success followed by a chunk identification or (b) A response message with status code and description for failure or (c) A response message with status code to binary compare given chunk identifications, and followed by a list of one or more chunk identifications separated by either space or a delimiter.
8. P2P request message method to get an unique Chunk ID after a binary comparison of chunks, which takes a cryptographic digest of the chunk, size of the chunk, and the list of one or more chunk identifications received for binary comparison, as parameters; And the method returns: (a) A response message with status code for success followed by a chunk identification or (b) A response message with status code and description for failure.
9. P2P request message method to get an unique Chunk ID after a binary comparison of chunks, which takes two different cryptographic digests of the chunk, size of the chunk, and the list of one or more chunk identifications received for binary comparison, as parameters; And the method returns: (a) A response message with status code for success followed by a chunk identification or (b) A response message with status code and description for failure.
10. P2P request message method to download a chunk from a remote p2p storage service by the Chunk ID; And the method returns: (a) A response message with status code for success followed by a binary byte stream or (b) A response message with status code and description for failure. This method further comprising, if the requested chunk does not exists at the remote p2p storage service's chunk store, this method does not download the chunk.
11. P2P request message method to download part of a chunk from a remote Peer by the Chunk ID, starting byte position from the beginning of the chunk, and the length or number of bytes to read from the starting position, as parameters; And the method returns: (a) A response message with status code for success, followed by the cryptographic digest of the portion to be downloaded, and followed by a binary byte stream or (b) A response message with status code and description for failure. This method further comprising, if the requested chunk does not exists at the remote p2p storage service's chunk store, this method does not download the chunk.
12. P2P request message method to initiate upload a chunk to a local p2p storage service by the Chunk ID, the length of the chunk in bytes, and a cryptographic digest of the chunk, as parameters;
The method further comprising following steps:
(a) Upon receive the initiate chunk upload request, the p2p storage service checks the existence of the specified chunk by Chunk ID in its chunk store; The method returns: (1) A response message with status code for continue to send the chunk if the specified chunk does not exists in its chunk store or (2) A response message with status code if the specified chunk already exists in the p2p storage service's chunk store, therefore, do not send the chunk or (3) A response message with status code and description for failure.
(b) If the client receives the response message with status code for continue to send the chunk, then send the chunk to the local p2p storage service as a binary byte stream.
(c) The p2p storage service send either a response message with status code for success or a response message with status code and description for failure.
13. P2P request message method to get a local chunk within the same Peer by the Chunk ID, and a cryptographic digest of the chunk, as parameters;
The method further comprising following steps:
(a) If the requested chunk available in the local p2p storage service's chunk store, sends a response message with status code for success followed by the full file system path of the chunk.
(b) If the requested chunk not available in the local p2p storage service's chunk store, the local p2p storage service downloads the requested chunk over Local Area Network (LAN) or over the Internet, verify the given cryptographic digest, and sends a response message with status code for success followed by the full file system path of the chunk.
(c) If any error, sends a response message with status code and description for failure.
14. P2P request message method to check existence of a chunk of a p2p storage service's chunk store by the Chunk ID; And the method returns: (a) A response message with status code for success if the requested chunk exists at the p2p storage service's chunk store, or (b) A response message with status code for chunk does not exist if the requested chunk does not exists at the p2p storage service's chunk store, or (c) A response message with status code and description for failure; This method further comprising, if the requested chunk does not exists at the p2p storage service's chunk store, the method does not download the chunk.
15. A p2p storage service which maintains a long-term local chunk cache on a peer so that chunks persist between reboots or shutdowns; The p2p storage service keeps chunks as separate individual files without reassemble the main file; The p2p storage service implements Claims 10, 11, 12, 13, and 14; P2P storage service further comprising issue a Tracker Announcement for a chunk immediately after a chunk is uploaded to the P2P storage service's chunk store.
16. P2P request message method to get Peer list from a Tracker by the Chunk ID and cryptographic digest of the chunk, as parameters; And the method returns: (a) A response message with status code for success followed by one or more Internet Protocol (IP) addresses of peers sharing the requested chunk or (b) A response message with status code and description for failure.
17. P2P request message method for Tracker Announcement of a chunk by the Chunk ID and cryptographic digest of the chunk, as parameters; And the method returns: (a) A response message with status code for success if the Tracker Announcement request was accepted by the Tracker or (b) A response message with status code and description for failure; This method further comprising the Tracker uses the chunk existence check to connect back to the peer to verify the existence of the announced chunk and the ability to share. If the Tracker finds the announced chunk does not exists or not possible to share, then the Tracker does not accept this Tracker Announcement request.
18. A Tracker service implements Claims 16 and 17. Tracker service further comprising a registry with three(3) essential columns, first column is "Chunk ID", second column is "Digest" and third column is "Peer List". The "Digest" is the cryptographic digest of the chunk provided with Tracker Announcement. The "Peer List" consists IP addresses of peers who share the chunk.
19. A chunk according to Claim 1, wherein:
A chunk is a part or a piece of an input file or input source, and produced by a segmentation system .
20. The input sources according to Claim 19, wherein:
Input sources can be of (1) files or (2) transport streams, or (3) read raw audio data from an audio capturing device/system and/or raw video data from a video capturing device/system, or (4) any combination (1), (2) and (3).
21. The unique chunk identification according to Claim 1, wherein:
A chunk that differs from another chunk in binary terms is assigned a different chunk identification.
22. The cryptographic digest according to Claim 2, wherein:
The cryptographic digest is produced by a cryptographic hash function on its input. The cryptographic digest of a chunk is produced by a cryptographic hash function taking the chunk as its input.
23. The binary compare according to Claim 2, wherein:
The binary compare of chunks means comparing two chunks bitwise. If a single bit is different, those two chunks are considered different.
24. The alphanumeric characters according to Claim 2, wherein:
An alphanumeric character takes only the values of the digits zero (0) to nine (9), the lowercase letters 'a' to 'z', and the capital letters 'A' to 'Z'.
25. The Tracker according to Claim 16, wherein:
A Tracker is a software server system which maintains a registry that maps files to peer IP addresses. The "file" is a file being shared over a p2p network, and the "IP address" is the Internet Protocol address of a device on the p2p network which holds the file of interest. A p2p file downloading program finds out who holds the file of interest on a p2p network by querying the relevant Tracker. Once the relevant Tracker is queried with the name of the file of interest, the Tracker sends a list of peer IP addresses where the file can be found on the p2p network.
26. The Tracker Announcement according to Claim 17, wherein:
Tracker Announcement refers here to a peer announcing its existence to a Tracker regarding the availability of a complete chunk with that peer, ready to share, so that the Tracker can update its registry to include the announcing peer.
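As an illustration only, the Tracker behaviour described in Claims 16 to 18 and Claim 22 might be sketched as follows. This is a minimal in-memory sketch under stated assumptions, not the implementation claimed: the names `chunk_digest`, `Response`, `Tracker`, `announce` and `get_peers` are hypothetical, SHA-256 stands in for the unspecified cryptographic hash function, the status codes 200 and 404 are arbitrary placeholders for "success" and "failure", and the chunk existence check is modelled as a caller-supplied function rather than a real connection back to the peer.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


def chunk_digest(chunk: bytes) -> str:
    """Digest of a chunk, taking the chunk itself as input (Claim 22).

    SHA-256 is an assumption; the claims do not name a hash function.
    """
    return hashlib.sha256(chunk).hexdigest()


@dataclass
class Response:
    """Response message: a status code, plus a description or a peer list."""
    status: int                      # 200 / 404 are placeholder codes
    description: str = ""
    peers: List[str] = field(default_factory=list)


class Tracker:
    """Registry with the columns "Chunk ID", "Digest" and "Peer List"."""

    def __init__(self, existence_check: Callable[[str, str, str], bool]):
        # (chunk_id, digest) -> list of peer IP addresses sharing the chunk
        self.registry: Dict[Tuple[str, str], List[str]] = {}
        # Hypothetical stand-in for the Tracker connecting back to the peer.
        self.existence_check = existence_check

    def announce(self, peer_ip: str, chunk_id: str, digest: str) -> Response:
        """Tracker Announcement (Claim 17): accept only if the check passes."""
        if not self.existence_check(peer_ip, chunk_id, digest):
            return Response(404, "chunk does not exist on peer or cannot be shared")
        peers = self.registry.setdefault((chunk_id, digest), [])
        if peer_ip not in peers:
            peers.append(peer_ip)
        return Response(200)

    def get_peers(self, chunk_id: str, digest: str) -> Response:
        """Peer list request (Claim 16), keyed by Chunk ID and digest."""
        peers = self.registry.get((chunk_id, digest))
        if not peers:
            return Response(404, "no peer is sharing the requested chunk")
        return Response(200, peers=list(peers))
```

In this sketch a peer would call `announce` with its IP address, the Chunk ID and `chunk_digest(chunk)` after storing a chunk, and a downloading peer would call `get_peers` with the same two parameters. Keying the registry on the pair (Chunk ID, digest) is one possible design choice: a request that supplies the right Chunk ID but a mismatched digest simply finds no peers.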
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/IB2012/056000 WO2014068364A1 (en) | 2012-10-30 | 2012-10-30 | Method and systems for chunk-level peer-to-peer (p2p) file sharing |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2014068364A1 (en) | 2014-05-08 |
Family
ID=47470045
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/IB2012/056000 Ceased WO2014068364A1 (en) | 2012-10-30 | 2012-10-30 | Method and systems for chunk-level peer-to-peer (p2p) file sharing |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2014068364A1 (en) |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10536275B2 (en) | 2017-05-10 | 2020-01-14 | Microsoft Technology Licensing, Llc | Verification of downloaded subsets of content |
| CN112532728A (en) * | 2020-11-30 | 2021-03-19 | 中国航空工业集团公司西安航空计算技术研究所 | Deterministic airborne high-performance file transmission method and system |
| CN112527515A (en) * | 2020-12-02 | 2021-03-19 | 厦门亿联网络技术股份有限公司 | State synchronization method, device, equipment and storage medium |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20090113145A1 (en) * | 2007-10-25 | 2009-04-30 | Alastair Slater | Data transfer |
| US20120254370A1 (en) * | 2011-04-01 | 2012-10-04 | International Business Machines Corporation | Method for distributing a plurality of data portions |
- 2012-10-30: WO PCT/IB2012/056000, patent/WO2014068364A1/en, not active (Ceased)
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| CN104980486B (en) | Method and apparatus for providing set synchronization using equivalent matching network names | |
| CA3145582C (en) | Extension for targeted invalidation of cached assets | |
| US9509784B2 (en) | Manifest chunking in content delivery in a network | |
| JP2016053951A (en) | System and method for maintaining distributed and fault-tolerant state over information centric network | |
| US11943273B2 (en) | System and method for data stream fragmentation with scalability | |
| CN112035422B (en) | Distributed real-time data synchronization method, node equipment and system based on IPFS | |
| US8725807B2 (en) | Decentralized data casting in an interest aware peer network | |
| EP2930903B1 (en) | Secure synchronization using matched network names | |
| US10187460B2 (en) | Peer-to-peer sharing in a content centric network | |
| US8559445B2 (en) | Systems and methods for distributing data | |
| US9253143B2 (en) | Reverse subscriptions | |
| KR101371202B1 (en) | Distributed file system having multi MDS architecture and method for processing data using the same | |
| WO2014068364A1 (en) | Method and systems for chunk-level peer-to-peer (p2p) file sharing | |
| JP2016048920A (en) | Method and system for comparing media assets | |
| AU2015281798B2 (en) | System of shared secure data storage and management | |
| US20150012745A1 (en) | Precalculating hashes to support data distribution | |
| EP2624523B1 (en) | System and method for data stream fragmentation with scalability | |
| AU2015101745A4 (en) | System of shared secure data storage and management | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 12808891; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 12808891; Country of ref document: EP; Kind code of ref document: A1 |