WO2007004992A1

WO2007004992A1 - System and method for encrypting/decrypting a coded audio signal, system and method for generating a set of cryptographic keys and computer program products

Info

Publication number: WO2007004992A1
Application number: PCT/SG2006/000181
Authority: WO
Inventors: Lakshminarayanan Anatharaman; Rongshan Yu; Waqas Ahmad; Ti Eu Chan; Hwee Hwa Pang; Susanto Rahardja
Original assignee: Agency for Science Technology and Research Singapore
Current assignee: Agency for Science Technology and Research Singapore
Priority date: 2005-07-05
Filing date: 2006-06-28
Publication date: 2007-01-11
Anticipated expiration: 2008-01-05

Abstract

A method for encrypting a coded audio signal comprising a plurality of quality layers is described, wherein a first cryptographic key is generated for the audio signal; for each of at least one of the plurality of quality layers, a second cryptographic key is generated by applying a concatenation of one-way functions to the first cryptographic key; and each of the at least one of the plurality of quality layers is encrypted using the second cryptographic key generated for the respective quality layer.

Description

System and method for encrypting/decrypting a coded audio signal, system and method for generating a set of cryptographic keys and computer program products

Technical Field

The invention relates to a system and method for encrypting/decrypting a coded audio signal and a system and method for generating a set of cryptographic keys and computer program products.

Background of the Invention

Recently, with the advances in computers, networking and communications, streaming audio content over networks such as the Internet, wireless local area networks, home networks and commercial cellular phone systems is becoming a mainstream means of audio service delivery. It is believed that with the progress of broadband network infrastructures, including xDSL, fiber optics, and broadband wireless access, bit rates for these channels are quickly approaching those needed for delivering high sampling-rate, high amplitude resolution (e.g. 96 kHz, 24 bit/sample) lossless audio signals. On the other hand, there are still application areas where high-compression digital audio formats, such as MPEG-4 AAC, are required. As a result, interoperable solutions that bridge the current channels and the rapidly emerging broadband channels are in high demand.

In addition, even when broadband channels are widely available and the bandwidth constraint is ultimately removed, a bit-rate-scalable coding system that is capable of producing a hierarchical bit-stream whose bit-rate can be dynamically changed during transmission is still highly favorable. For example, for applications where packet loss occurs occasionally due to accidents or resource sharing requirements, the current broadband waveform representations such as PCM (Pulse Code Modulation) and lossless coding formats may suffer serious distortions in a streaming situation. However, this problem can be solved if packet priorities can be set when network resources are dynamically changing. Finally, a bit-rate-scalable coding system is also advantageous for the server in audio streaming services, where graceful QoS degradation can be achieved if an excessive number of requests arrive from client sites.

In this context, Advanced Audio Zip (AAZ) was developed. The Advanced Audio Zip (AAZ) bit stream format consists of two parts:

(i) The core Advanced Audio Coding (AAC) stream, and

(ii) AAZ (MPEG-4 SLS) enhancement stream

The purpose of the AAC stream is to act as the base line that provides the basic quality of the audio. The AAZ enhancement stream provides the fine grain scalable bit-stream that scales from the basic quality provided by the AAC stream up to lossless coding. In the absence of the AAZ enhancement stream, the AAC quality music will still be available. The AAZ enhancement stream can be further broken down into multiple streams of different rates, and the transmission system has the freedom to drop several streams of lower significance when bandwidth is insufficient.

The AAZ enhancement stream (described in [1] ) consists of two types of bit streams:

(i) LLE Main Elementary Stream, and

(ii) LLE Extension Stream.

The LLE_MAIN Elementary stream comprises an LLE_HEADER_ELEMENT and an LLE_DATA_ELEMENT. However, LLE_Extension streams only comprise the LLE_DATA_ELEMENT. LLE_Extension streams can be added or dropped for large step scalability. Depending upon the channel configuration, a sub-frame of the audio signal comprises all the available channel elements that correspond to different physical audio channels, e.g., a sub-frame comprises one SCE (single channel element, e.g. center channel) and two CPEs (channel pair elements, e.g., L/R channel). In order to reduce the transmission overhead, several sub-frames are grouped into one super-frame in the enhancement stream, where the number of sub-frames in a super-frame is specified in the file (stream) configuration header.

An audio signal encoded according to AAZ can be contained in a file according to the MP4 format. The MP4 file format is designed to encapsulate the MPEG-4 presentations, defined by ISO/IEC MPEG, as well as other media types, in a flexible, manageable and extensible manner. The MP4 file format is based on Apple's QuickTime Format. The basic data structure in the MP4 file format is the Atom. A unique tag and length identify each atom. A collection of atoms describes a hierarchy of metadata giving information such as bit and frame rates, duration of the media, and pointers to the media data. Such a group of atoms related to a particular media presentation is referred to as Movie Atom (in very technical terms), stream or track. The actual media data may be located elsewhere; it may be in the MP4 file or located outside the MP4 file and referenced via URLs (Uniform Resource Locators). The MP4 file format allows storing any number of data streams in a single file. Any number of the streams can be accessed, synchronized (e.g. a video stream synchronized with an audio stream), and played concurrently.

The following three issues require particular attention in presenting audio signals encoded according to AAZ (e.g. AAZ-encoded music) to users:

• Both AAC and AAZ streams have to be in a single file,

• A reliable mechanism to synchronize the AAC and AAZ data packets, and

• Support for the streams over different packet switched networks such as the Internet and 3rd Generation mobile communication networks.

The AAC and AAZ bit streams can be stored as separate tracks in a single MP4 file, such that each frame of the AAC bit stream or the AAZ bit stream is stored as one Access Unit (AU) on its respective track. The tracks encapsulating the AAC bit stream and the LLE Main Elementary Stream are obligatory, while the presence and number of the tracks encapsulating the LLE Extension Stream(s) depend upon the desired level of large step scalability.

The system layer, i.e. the player for the audio signal, can resynchronize the AAC bit stream frames and the AAZ bit stream frames by using the index of the Access Unit. Frame synchronization is a more tedious issue when streaming the AAC bit stream and the AAZ bit stream over packet switched networks, as data packets may not arrive in sequence. An MPEG-4 synchronization mechanism that is based on time stamps can be exploited.
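As an illustration of resynchronization by Access Unit index, the following Python sketch pairs AAC and AAZ access units that may arrive out of order. The function name `resynchronize` and the `(au_index, payload)` tuple layout are assumptions made for this example and are not taken from the MP4 or MPEG-4 specifications. An AAC unit without a matching AAZ unit is kept, since the base layer remains playable on its own.

```python
def resynchronize(aac_units, aaz_units):
    """Pair AAC and AAZ access units that share the same AU index.

    Each argument is an iterable of (au_index, payload) tuples that may
    arrive in any order. Every AAC unit is returned in index order; the
    AAZ payload is None when no enhancement unit with that index arrived.
    """
    aac_by_index = dict(aac_units)
    aaz_by_index = dict(aaz_units)
    return [
        (index, aac_by_index[index], aaz_by_index.get(index))
        for index in sorted(aac_by_index)
    ]


# Hypothetical packets arriving out of sequence.
aac = [(2, b"aac-2"), (0, b"aac-0"), (1, b"aac-1")]
aaz = [(1, b"aaz-1"), (0, b"aaz-0")]
frames = resynchronize(aac, aaz)
```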

A DRM scheme for audio would encrypt the audio data and control access to the plaintext audio data through controlled dissemination of the decryption keys. Each quality layer of a scalably coded audio signal should be encrypted with a different key. This enables the content owner to restrict access to users based on quality. If the number of quality layers is large, then the number of keys needed to manage a single audio file (with multiple quality layers) will also be large.

Summary of the Invention

A method for encrypting a coded audio signal comprising a plurality of quality layers is provided, wherein a first cryptographic key is generated for the audio signal; for each of at least one of the plurality of quality layers, a second cryptographic key is generated by applying a concatenation of one-way functions to the first cryptographic key; and each of the at least one of the plurality of quality layers is encrypted using the second cryptographic key generated for the respective quality layer.

Further, a system for encrypting a coded audio signal, a method for decrypting an encrypted coded audio signal and a system for decrypting an encrypted coded audio signal according to the method for encrypting a coded audio signal described above are provided.

Further, a method for generating a set of keys for encrypting a coded audio signal comprising a plurality of quality layers is provided, wherein a first cryptographic key is generated for the audio signal; for each of at least one of the plurality of quality layers, a second cryptographic key is generated by applying a concatenation of one-way functions to the first cryptographic key; and each of the at least one of the plurality of quality layers is assigned to the second cryptographic key generated for the respective quality layer.

Further, a system for generating a set of keys for encrypting a coded audio signal according to the method for generating a set of keys for encrypting a coded audio signal described above is provided. Further, a computer program product is provided which, when executed by a computer, makes the computer perform one of the methods described above.

With the invention, an encrypting and decrypting scheme for scalably coded audio signals is provided in which only a small number of cryptographic keys needs to be managed compared to prior art methods.

Brief Description of the Drawings

Illustrative embodiments of the invention are explained below with reference to the drawings.

Figure 1 shows a system according to an embodiment of the invention.

Figure 2 shows an audio file according to an embodiment of the invention.

Figure 3 shows a hash tree according to an embodiment of the invention.

Detailed Description

Illustratively, keys for encrypting quality layers of an audio signal are generated based on the first cryptographic key by successively applying a one-way function to the first cryptographic key. This reduces the number of cryptographic keys to be managed significantly.

A one-way function is a function that is relatively simple to evaluate (i.e. the computational cost of applying the function is relatively small) but difficult to invert: the computational cost of calculating the original value from the result value is so high that inverting the function is practically impossible.

Embodiments of the invention emerge from the dependent claims. The embodiments which are described in the context of the method for encrypting a coded audio signal are analogously valid for the system for encrypting a coded audio signal, the method for decrypting an encrypted coded audio signal, the system for decrypting an encrypted coded audio signal, the method for generating a set of cryptographic keys, the system for generating a set of cryptographic keys and the computer program products.

In one embodiment, for the second cryptographic key, a third cryptographic key is generated which is specific for a user and the second cryptographic key is encrypted using the third cryptographic key.

The concatenation of one-way functions can be a hash chain.

The coded audio signal can be coded according to AAZ (Advanced Audio Zip). However, the coded audio signal can also be coded according to any other layered coding scheme.

The invention can not only be applied to audio signals, but for example also to video signals which are scalably coded.

In one embodiment, the coded audio signal comprises a plurality of time partitions and a further cryptographic key is generated for each quality layer and each time partition by applying a concatenation of one-way functions to the first cryptographic key and an index specifying the respective time partition.

In one embodiment, the coded audio signal comprises a plurality of time partitions and a further cryptographic key is generated for each quality layer and each time partition by using a hash tree.

The assignment of the second cryptographic key to the respective quality layer is for example stored in a table.

Fig. 1 shows a system 100 according to an embodiment of the invention.

The system 100 comprises a server unit 101 and a client unit 102. The server unit 101 comprises an audio signal database 103 which holds audio content, for example songs in digital format. It is assumed that an audio signal contained in the audio signal database 103 should be provided to the user of the client unit 102 in a certain quality. For example, the user has sent a message using the client unit 102 requesting a song that is available in the audio signal database.

The audio signal is supplied to a scalable audio encoder 104 of the server unit 101. The scalable audio encoder 104 encodes the audio signal into a layered structure that comprises a plurality of N layers (N being some positive integer). Each layer contains a plurality of bits and represents a different quality level of the original audio content. Let LN be the lowest quality layer and L1 the highest quality layer. Assume that the layers form a stack, with LN at the bottom and L1 at the top. To construct music at quality level j, all quality layers from j to N are required. The user of the client unit 102 gets access to certain quality layers of the scalably coded audio signal based on her access permission.
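The layer dependency can be stated compactly. The following Python sketch lists the layers needed for a given quality level; the function name `layers_required` is an assumption introduced for this illustration.

```python
def layers_required(j, n):
    """Return the indices of the layers needed to reconstruct quality level j.

    L1 is the highest quality layer and LN the lowest, so quality level j
    needs the layers Lj, Lj+1, ..., LN.
    """
    if not 1 <= j <= n:
        raise ValueError("j must satisfy 1 <= j <= n")
    return list(range(j, n + 1))


needed = layers_required(3, 5)
```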

To ensure that the user has access only to certain quality layers of the scalably coded audio signal, a hash chain is used. Let the audio signal (or the file in which the audio signal is contained) have an identification denoted by AudioFileID. Let K(AudioFileID) be the master key for that audio file.

A key generator 105 of the server unit 101 generates a master key K(AudioFileID) specifically for that audio signal. This master key is hence generated and controlled by the audio content owner. The master key can be randomly generated (and stored in a secure database) or can be derived from a super master key controlled by the content owner, e.g. according to

$$K(\mathrm{AudioFileID}) = \mathrm{keyDerivationFunction}(\mathrm{SuperMasterKey}, \mathrm{AudioFileID})$$

(Eq. 1)

For simplicity, K(AudioFileID) is also referred to as K in the following. All keys used in this embodiment are at least 128 bits (16 bytes) long.
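The description does not specify keyDerivationFunction. As one possible illustration, the following Python sketch assumes HMAC-SHA-256 keyed with the super master key and truncated to 128 bits; this choice, the function name `key_derivation_function` and the sample identifiers are assumptions of this example.

```python
import hashlib
import hmac


def key_derivation_function(super_master_key: bytes, audio_file_id: str) -> bytes:
    """Derive a 128-bit master key K(AudioFileID) from the super master key.

    Assumption: HMAC-SHA-256 stands in for the unspecified derivation function.
    """
    mac = hmac.new(super_master_key, audio_file_id.encode("utf-8"), hashlib.sha256)
    return mac.digest()[:16]


super_master_key = bytes(range(16))  # placeholder value, not a real secret
k_song_a = key_derivation_function(super_master_key, "song-A")
k_song_b = key_derivation_function(super_master_key, "song-B")
```

With such a derivation, the content owner only has to store the super master key, since each per-file master key can be recomputed from the file identification.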

The key generator 105 uses a hash chain based on K. Let h(x) be a hashing function (cryptographic hash, e.g. SHA-1, SHA-256) and H(i) be the value of the hash chain for i = 1 to N.

$$H(1) = K;\quad H(2) = h(K);\quad \ldots;\quad H(j) = h(H(j-1));\quad \ldots;\quad H(N) = h^{N-1}(K)$$

(Eq. 2)

Based on this hash chain, the key generator 105 generates N master keys (for that particular audio signal), one for each quality layer. These N master keys are supplied to an encrypting unit 106 of the server unit 101.
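The hash chain of Eq. 2 can be sketched in a few lines of Python. SHA-256 is used for h, as one of the hash functions named above; the function names and the placeholder master key are assumptions of this example.

```python
import hashlib


def h(x: bytes) -> bytes:
    """One-way function h; SHA-256 is assumed here."""
    return hashlib.sha256(x).digest()


def hash_chain(k: bytes, n: int) -> list:
    """Return [H(1), ..., H(N)] with H(1) = K and H(j) = h(H(j-1))."""
    if n < 1:
        raise ValueError("n must be at least 1")
    keys = [k]
    for _ in range(n - 1):
        keys.append(h(keys[-1]))
    return keys


K = bytes(16)          # placeholder 128-bit master key
H = hash_chain(K, 4)   # H[j - 1] holds H(j)
```

Only K has to be stored; every H(j) can be recomputed from it, while H(j) does not reveal H(j-1) as long as h is hard to invert.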

The encrypting unit 106 encrypts the scalably coded audio signal based on the N master keys. It uses a symmetric key block cipher (e.g. AES) or a properly constructed stream cipher (e.g. RC4) for encrypting data.

The encrypting unit 106 uses H(1) to encrypt the audio layer L1, H(2) to encrypt layer L2, and so on (generally H(j) to encrypt layer Lj) up to H(N), which is used to encrypt layer LN. This encryption can be done in advance, that is, before any user has requested the audio signal. In one embodiment, all audio signals of the audio signal database 103 are already stored scalably coded and encrypted.
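The per-layer encryption can be illustrated as follows. Because the Python standard library contains no AES implementation, this sketch substitutes a toy stream cipher built from SHA-256 in counter mode; that cipher, the function names and the sample layer contents are assumptions of this example and are not suitable for real use.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (stand-in for AES): XOR data with SHA-256(key || counter).

    Applying the function twice with the same key restores the input.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def encrypt_layers(layers, layer_keys):
    """Encrypt layer Lj with key H(j); both lists are ordered L1..LN."""
    if len(layers) != len(layer_keys):
        raise ValueError("one key per layer is required")
    return [keystream_xor(key, layer) for key, layer in zip(layer_keys, layers)]


# Hash chain H(1)..H(3) from a placeholder master key.
layer_keys = [bytes(16)]
for _ in range(2):
    layer_keys.append(hashlib.sha256(layer_keys[-1]).digest())

layers = [b"layer-1 bits" * 5, b"layer-2 bits" * 3, b"layer-3 bits"]
encrypted = encrypt_layers(layers, layer_keys)
```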

An audio file containing the scalably coded and encrypted audio signal is illustrated in Fig. 2.

Fig.2 shows an audio file 200 according to an embodiment of the invention.

The audio file 200 holds the scalably coded and encrypted audio signal.

The audio file 200 contains a first encrypted audio layer 201, which corresponds to the audio layer L1 of the scalably coded audio signal, encrypted using the key K = H(1). The audio file 200 contains a second encrypted audio layer 202, which corresponds to the audio layer L2 of the scalably coded audio signal, encrypted using the key h(K) = H(2). The audio file 200 contains further encrypted audio layers 203, which correspond to the further audio layers of the scalably coded audio signal, wherein the j-th layer is encrypted using the key H(j). Finally, the audio file 200 contains an N-th encrypted audio layer 204, which corresponds to the audio layer LN of the scalably coded audio signal, encrypted using the key H(N).

When the scalably coded and encrypted audio signal is to be provided to the user of the client unit 102, the key generator 105 generates a random key denoted by K(UserID, AudioFileID) which is specific for the user and the audio signal. K(UserID, AudioFileID) is also denoted by K_USER in the following.

Let H¹(i) be the value of the hash chain according to equation 3 for i = 1 to N.

$$H^{1}(1) = K_{\mathrm{USER}};\quad H^{1}(2) = h(K_{\mathrm{USER}});\quad \ldots;\quad H^{1}(j) = h(H^{1}(j-1));\quad \ldots;\quad H^{1}(N) = h^{N-1}(K_{\mathrm{USER}})$$

(Eq. 3)

According to the key management scheme of this embodiment, the encrypting unit encrypts the master keys for each level with the corresponding user key. Hence H¹(1) is used to encrypt H(1), H¹(2) to encrypt H(2), and so on up to H¹(j), which is used to encrypt H(j) (where j is an integer between 1 and N). j is assumed to be the layer corresponding to the highest quality the user is granted access to.
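The pairing of master keys and user keys can be sketched as follows. The key encryption is illustrated with a toy XOR wrap using a pad derived from the user key; this wrap, the function names and the placeholder seeds are assumptions of this example, and a real system would use a proper cipher such as AES.

```python
import hashlib


def hash_chain(seed: bytes, n: int) -> list:
    """Return n chain values: seed, h(seed), h(h(seed)), ... with h = SHA-256."""
    keys = [seed]
    for _ in range(n - 1):
        keys.append(hashlib.sha256(keys[-1]).digest())
    return keys


def wrap(key: bytes, user_key: bytes) -> bytes:
    """Toy key wrap (stand-in for a real cipher): XOR with a pad derived
    from the user key. Calling wrap again with the same user key unwraps."""
    pad = hashlib.sha256(b"wrap" + user_key).digest()
    return bytes(k ^ p for k, p in zip(key, pad))


N = 4
H = hash_chain(b"master-key-K....", N)    # H[i - 1] holds the master key H(i)
H1 = hash_chain(b"user-key-K_USER.", N)   # H1[i - 1] holds the user key H¹(i)

# Each master key H(i) is encrypted under the user key H¹(i) of the same layer.
wrapped = [wrap(H[i], H1[i]) for i in range(N)]
```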

The encrypted master keys (encrypted H(1) to encrypted H(j)) are then appended along with user related information to the audio file containing the encrypted scalably coded audio signal. This audio file is supplied to a transmitting unit 107 which transmits the audio file to the client unit 102 where it is received by a receiving unit 108.

The transmitting unit 107 for example transmits the audio file via a mobile communication network or the client unit 102 downloads the audio file from the server unit 101.

Since the encrypted master keys contained in the audio file have been encrypted using the random key K_USER which is specific for the user of the client unit 102, the audio signal transmitted to the user has been effectively personalized. Alternatively, the client software can append the master key (with user information) to the audio file once the user has downloaded it from a super-distribution center. The content owner can delegate access control to the master keys to a content distribution system.

A decrypting unit 109 receives the encrypted master key H(j) (according to the maximum quality level granted to the user) and uses the corresponding user key H¹(j) assigned to the user to decrypt the encrypted master key H(j), such that the master key H(j) is then known to the decrypting unit 109. By successively applying the function h, the decrypting unit 109 determines all master keys H(j+1), H(j+2), ..., H(N) according to Eq. 2. Using these master keys, the decrypting unit 109 decrypts the quality levels Lj, Lj+1, ..., LN of the encrypted scalably coded audio signal. The quality levels Lj, Lj+1, ..., LN of the scalably coded audio signal are then decoded by a scalable audio decoder 110 and supplied to outputting means 111, such that, e.g., the audio signal is played by means of a speaker system.

In other words, the encrypted audio file and associated key management using hash chains are used by a DRM scheme. A user first obtains the key for a particular quality level j. The key assigned to him will be H¹(j). Using this key, the user can find out the master key that was used to encrypt the quality level j. Using this master key, the user can find all the master keys that were used to encrypt the quality levels from j to N. Now the user has the plaintext of all quality levels from j to N and hence can play the audio file at quality level j.
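The client-side steps can be sketched in Python as follows: the client unwraps the master key for level j with its user key and then walks the hash chain down to level N. The toy XOR wrap, the function names and the placeholder keys are assumptions of this example.

```python
import hashlib


def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()


def xor_wrap(key: bytes, user_key: bytes) -> bytes:
    """Toy key wrap (stand-in for a real cipher); self-inverse."""
    pad = hashlib.sha256(b"wrap" + user_key).digest()
    return bytes(k ^ p for k, p in zip(key, pad))


def client_layer_keys(wrapped_master_key: bytes, user_key_j: bytes, j: int, n: int) -> dict:
    """Recover H(j) and derive H(j+1), ..., H(N) by repeatedly applying h."""
    key = xor_wrap(wrapped_master_key, user_key_j)
    keys = {j: key}
    for layer in range(j + 1, n + 1):
        key = h(key)
        keys[layer] = key
    return keys


# Server side, simulated: chains for N = 5 and a grant for level j = 3.
N, j = 5, 3
H = [b"\x01" * 16]
H1 = [b"\x02" * 16]
for _ in range(N - 1):
    H.append(h(H[-1]))
    H1.append(h(H1[-1]))
wrapped_hj = xor_wrap(H[j - 1], H1[j - 1])

# Client side: only the user key for level j and the wrapped key are known.
keys = client_layer_keys(wrapped_hj, H1[j - 1], j, N)
```

The resulting dictionary holds keys for layers 3 to 5 only; the keys of the higher quality layers 1 and 2 cannot be obtained without inverting h.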

A user (i.e. the client unit 102) needs to manage only one key (the highest quality layer he is granted access to) at any point in time. He can extract the plain text of all the quality layers using this key. If a user at a later time obtains the key that is used for encrypting a higher layer, the older key can be discarded and the new key stored since the older key (for a lower layer) can be generated using the new key (by means of the hash chain according to Eq. 3) .
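The reason the older key can be discarded is that it is reachable from the newer key by hashing. A minimal Python sketch, assuming SHA-256 for h and a hypothetical helper name `derive_lower_layer_key`:

```python
import hashlib


def derive_lower_layer_key(key: bytes, from_layer: int, to_layer: int) -> bytes:
    """Derive the chain value for to_layer from the value for from_layer.

    Only to_layer >= from_layer is possible (towards lower quality),
    because each step applies the one-way function h once.
    """
    if to_layer < from_layer:
        raise ValueError("a hash chain cannot be walked towards higher quality")
    for _ in range(to_layer - from_layer):
        key = hashlib.sha256(key).digest()
    return key


k_user = b"\x02" * 16                             # placeholder K_USER, i.e. the layer 1 key
old_key = derive_lower_layer_key(k_user, 1, 4)    # user first holds the layer 4 key
new_key = derive_lower_layer_key(k_user, 1, 2)    # later upgrade to layer 2
regenerated = derive_lower_layer_key(new_key, 2, 4)
```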

In the following, an embodiment is described when the audio signal requested by the user and stored in the audio signal database 103 (as it is typically the case) comprises multiple time partitions. This means that the audio signal (or the audio file containing the audio signal) is segmented into smaller, e.g. equal sized, time partitions.

As above, the audio signal is encoded by a scalable audio encoder 104 (or can be stored in the audio signal database already scalably coded), and each quality layer L1, ..., LN is segmented into M equal sized time partitions, each partition being indexed (from 1 to M). (If the time partitions are not equal sized, then indexing based on startTime and endTime is used.) The user of the client unit 102 requests access to (and is accordingly granted access to) one or more of these time partitions belonging to a higher quality layer (than the one she already possesses). Access to a time partition should not provide access to any other time partition in any other quality layer.

Therefore, each time partition in each quality layer is encrypted by the encrypting unit 106 using a different key. This enables the content owner to restrict access to users based on quality and time. With prior art techniques, the number of keys needed to manage this scheme would be large, especially if the number of quality layers as well as the number of time partitions is large.

In one embodiment, if the time partition of the audio signal that the user requests is pre-fixed, then the method described above with reference to Fig. 1 and Fig. 2 is used. The audio content of that pre-fixed partition (i.e. the requested time partition of the audio signal) is treated as a separate audio file and, using the hash chain method as described above, an access control mechanism is realized. This means that the time partition of the audio signal requested by the user is treated like the audio signal as described above.

Similarly, each time partition is in one embodiment treated as a separate audio signal. M keys are needed for controlling access to the audio signal (one for each time partition of the audio signal, since each time partition is treated as one audio signal in the method described above, i.e. there exists one master key K for each time partition). Therefore, M keys for each audio signal (e.g. each song) are needed instead of 1 key, since each time partition is treated as a separate audio file. If M is large, then managing a large number of keys can be cumbersome.

Therefore, in one embodiment, if the user of the client unit 102 can request any time partition of an audio signal, the solution described above is extended as follows. A hash chain (cryptographic hash, e.g. SHA-1, SHA-256) is used as well as a hash tree. All keys used in this embodiment are at least 128 bits (16 bytes) long. A symmetric key block cipher, e.g. AES (Advanced Encryption Standard), or a properly constructed stream cipher, e.g. RC4, is used for encrypting data.

The above solution (i.e. the method according to Eq. 1, Eq. 2, Eq. 3 and the description of Fig. 1 and Fig. 2) is used for controlling access to a particular quality layer j of the whole audio signal, i.e. to provide the user with the quality according to layer j (and all layers beneath layer j, i.e., Lj+1, ..., LN).

But since the user might want to have access to a particular time segment at a higher quality than according to Lj , for example for sampling the audio signal at that quality layer, each time partition in each quality layer is to be encrypted with a separate key. Access to one key will not reveal any other keys.

As above, the key generator 105 generates H(j) (master key for Lj) and H¹(j) (user key for Lj, which as described above is used by the encrypting unit 106 to encrypt H(j)). As mentioned, the audio signal is segmented into M time partitions.

Further, M keys are used, each one used for encrypting the appropriate time partition using equation 4.

$$H(j, t) = h(H(j), t), \quad t = 1, \ldots, M \quad (M \text{ time segments})$$

(Eq. 4)

Here, the time index t denotes an index of the time partitions.

Each time partition in each quality layer is to be encrypted using the corresponding key generated according to Eq. 4. This means that, to encrypt the t-th time partition of the j-th quality layer, H(j,t) is to be used.
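Eq. 4 does not state how the two arguments of h are combined. The following Python sketch assumes that the layer key is concatenated with the partition index encoded as a 4-byte big-endian integer and hashed with SHA-256; the encoding, the function names and the sample layer key are assumptions of this example.

```python
import hashlib


def partition_key(layer_key: bytes, t: int) -> bytes:
    """H(j, t) = h(H(j), t), assuming h(a, t) = SHA-256(a || t as 4 bytes)."""
    return hashlib.sha256(layer_key + t.to_bytes(4, "big")).digest()


def partition_keys(layer_key: bytes, m: int) -> dict:
    """Return {t: H(j, t)} for t = 1..M."""
    return {t: partition_key(layer_key, t) for t in range(1, m + 1)}


H_j = hashlib.sha256(b"example layer key").digest()  # placeholder for H(j)
keys = partition_keys(H_j, 8)
```

Because each H(j, t) is a hash output of H(j) and a different index, knowing one partition key does not reveal H(j) or any other partition key as long as h is hard to invert.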

Further, M keys for each quality layer using equation 5 are used in this embodiment:

$$H^{1}(j, t) = h(H^{1}(j), t), \quad t = 1, \ldots, M \quad (M \text{ time segments})$$

(Eq. 5)

where H¹(j) is the user key for the corresponding quality layer generated according to Eq. 3.

If the user of the client unit 102 requests a quality layer j (i.e. the content of the quality layer; depending on the context, "quality layer" also means the content of the quality layer), then the server unit 101 (illustratively, the content owner or distributor) sends H(j) encrypted with H¹(j) to the client unit 102. This is identical to the embodiment described above.

If, however, the user (who already possesses quality layer x) requests a particular time segment t_i of a quality layer y (y > x), then the key generator 105 generates H(j, t_i) and H¹(j, t_i) for each quality layer j from x+1 to y according to equations 4 and 5. The H values are encrypted by the encrypting unit 106 with the H¹ values (i.e. H(j,t) is encrypted using H¹(j,t)) and sent to the client unit 102. Hence the number of keys sent for each time partition is y-x. If more than one time partition is requested, this solution requires p * (y-x) keys, where p is the number of time partitions requested.
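The key count can be checked with a short enumeration. The following Python sketch lists the (layer, time partition) pairs for which a key is sent; the function name `keys_to_send` is an assumption of this example.

```python
def keys_to_send(x, y, partitions):
    """List the (layer j, partition t) pairs for which a key H(j, t) is sent
    when a user holding layer x requests the given partitions of layer y."""
    if y <= x:
        raise ValueError("y must be greater than x")
    return [(j, t) for t in partitions for j in range(x + 1, y + 1)]


# A user holding layer x = 2 requests p = 2 time partitions of layer y = 5.
pairs = keys_to_send(2, 5, [3, 4])
```

For this request the list contains p * (y-x) = 2 * 3 = 6 entries.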

The encrypted time partitions of the audio signal are sent to the client unit 102 which decrypts and decodes them. This is done analogously to the above embodiment. In particular, the decrypting unit decrypts the encrypted keys H(j,t) sent from the server unit 101 using the keys H¹(j,t).

In one embodiment, instead of using Eq. 4 and 5 to generate the keys (for encrypting the time partitions), a hash tree is used as shown in Fig. 3.

Fig. 3 shows a hash tree 300 according to an embodiment of the invention.

The leaves 301 of the hash tree 300 correspond to keys. The number of keys required to control access to a consecutive set of time partitions is much smaller in this embodiment. In this embodiment, the hash tree 300 is a balanced binary tree.

The hash tree 300 follows the rule that

value of a node = cryptographic hash (value of parent, 0 or 1)

wherein the value "0" is used if the node is the left child node of its parent node and the value "1" is used if the node is the right child node of its parent node.

The root 302 of the hash tree 300 corresponds to the key used for controlling the quality layer (i.e. H(j) or H¹(j)).

As an example, if the leaves 301 denoted 1 to 4 in Fig. 3 hold (i.e. correspond to) the keys used to encrypt time partitions 1 to 4, and if the user requests these time partitions, the number of keys needed is just 1 instead of 4.

If the user requests the time partitions 2 and 3, two keys are still needed since a binary tree is used in this embodiment.
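The node rule and the two examples can be reproduced with the following Python sketch. It assumes a tree with 8 leaves (depth 3), SHA-256 as the cryptographic hash, and a single byte 0 or 1 appended to the parent value; nodes are addressed by their bit path from the root. The tree size, the encoding and all function names are assumptions of this example.

```python
import hashlib


def node_value(root: bytes, path: str) -> bytes:
    """Value of the node reached from the root via path ('0' = left, '1' = right)."""
    value = root
    for bit in path:
        value = hashlib.sha256(value + bytes([int(bit)])).digest()
    return value


def leaf_key(root: bytes, depth: int, t: int) -> bytes:
    """Key of time partition t (1-based) in a tree with 2**depth leaves, depth >= 1."""
    return node_value(root, format(t - 1, "0{}b".format(depth)))


def cover(depth: int, first: int, last: int, path: str = "") -> list:
    """Paths of the fewest nodes whose subtrees hold exactly leaves first..last."""
    span = 2 ** (depth - len(path))              # leaves below this node
    lo = (int(path, 2) if path else 0) * span + 1
    hi = lo + span - 1
    if last < lo or hi < first:
        return []                                # subtree lies outside the range
    if first <= lo and hi <= last:
        return [path]                            # subtree lies fully inside the range
    return cover(depth, first, last, path + "0") + cover(depth, first, last, path + "1")


root = hashlib.sha256(b"example layer key").digest()  # placeholder for H(j)
one_key = cover(3, 1, 4)    # partitions 1 to 4
two_keys = cover(3, 2, 3)   # partitions 2 and 3
```

A client that receives the node value for path "0" can derive the leaf keys of partitions 1 to 4 by hashing downwards, but it cannot derive the keys of partitions 5 to 8.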

Instead of a binary tree, an n-ary tree can also be used, where each node has n children. Hence consecutive time segments can be managed with far fewer keys than are necessary with a straightforward scheme. The "n" of the n-ary tree can be determined based on the specific need of the application.

In this document, the following publication is cited:

[1] Study on w6792 (Study on PDAM5 scalable lossless coding) v2

Claims

1. A method for encrypting a coded audio signal comprising a plurality of quality layers, wherein

- a first cryptographic key is generated for the audio signal

- for each of at least one of the plurality of quality layers, a second cryptographic key is generated by applying a concatenation of one-way functions to the first cryptographic key

- each of the at least one of the plurality of quality layers is encrypted using the second cryptographic key generated for the respective quality layer.

2. Method according to claim 1, wherein for the second cryptographic key, a third cryptographic key is generated which is specific for a user and the second cryptographic key is encrypted using the third cryptographic key.

3. Method according to claim 1, wherein the concatenation of one-way functions is a hash chain.

4. Method according to claim 1, wherein the coded audio signal is coded according to AAZ.

5. Method according to claim 1, wherein the coded audio signal comprises a plurality of time partitions and a further cryptographic key is generated for each quality layer and each time partition by applying a concatenation of one-way functions to the first cryptographic key and an index specifying the respective time partition.

6. Method according to claim 1, wherein the coded audio signal comprises a plurality of time partitions and a further cryptographic key is generated for each quality layer and each time partition by using a hash tree.

7. A system for encrypting a coded audio signal comprising a plurality of quality layers, the system comprising

- a first key generating unit adapted to generate a first cryptographic key for the audio signal

- a second key generating unit adapted to generate for each of at least one of the plurality of quality layers a second cryptographic key by applying a concatenation of one-way functions to the first cryptographic key

- an encrypting unit adapted to encrypt each of the at least one of the plurality of quality layers using the second cryptographic key generated for the respective quality layer.

8. A method for decrypting an encrypted coded audio signal, which coded audio signal comprises a plurality of quality layers, wherein

- a first cryptographic key is received for the audio signal

- each of the at least one of the plurality of quality layers is decrypted using the second cryptographic key generated for the respective quality layer.

9. A system for decrypting an encrypted coded audio signal, which coded audio signal comprises a plurality of quality layers, the system comprising

- a receiving unit adapted to receive a first cryptographic key for the audio signal

- a key generating unit adapted to generate for each of at least one of the plurality of quality layers a second cryptographic key by applying a concatenation of one-way functions to the first cryptographic key

- a decrypting unit adapted to decrypt each of the at least one of the plurality of quality layers using the second cryptographic key generated for the respective quality layer.

10. A method for generating a set of keys for encrypting a coded audio signal comprising a plurality of quality layers, wherein

- a first cryptographic key is generated for the audio signal

- each of the at least one of the plurality of quality layers is assigned to the second cryptographic key generated for the respective quality layer.

11. A system for generating a set of keys for encrypting a coded audio signal comprising a plurality of quality layers, the system comprising

- an assigning unit adapted to assign each of the at least one of the plurality of quality layers to the second cryptographic key generated for the respective quality layer.

12. A computer program product, which, when executed by a computer, makes the computer perform a method for encrypting a coded audio signal comprising a plurality of quality layers, wherein

- a first cryptographic key is generated for the audio signal

13. A computer program product, which, when executed by a computer, makes the computer perform a method for decrypting an encrypted coded audio signal, which coded audio signal comprises a plurality of quality layers, wherein

- a first cryptographic key is received for the audio signal

14. A computer program product, which, when executed by a computer, makes the computer perform a method for generating a set of keys for encrypting a coded audio signal comprising a plurality of quality layers, wherein

- a first cryptographic key is generated for the audio signal