WO2023013019A1 - Speech feedback device, speech feedback method, and program - Google Patents
Speech feedback device, speech feedback method, and program
- Publication number
- WO2023013019A1 (PCT/JP2021/029278)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- speech
- feedback
- speaker
- sound signal
- evaluation value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R27/00—Public address systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/10—Transforming into visible information
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/20—Arrangements for obtaining desired frequency or directional characteristics
- H04R1/32—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
- H04R1/40—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
- H04R1/406—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2410/00—Microphones
- H04R2410/05—Noise reduction with a separate noise microphone
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/02—Circuits for transducers, loudspeakers or microphones for preventing acoustic reaction, i.e. acoustic oscillatory feedback
Definitions
- the present invention relates to an acoustic signal processing technology for preventing the voice of a speaker from annoying surrounding people.
- Patent Document 1 describes a technique for acoustic signal processing to prevent the voice of a speaker from disturbing the surrounding people.
- In Patent Document 1, an interference sound (hereinafter referred to as a masking sound) is used to mask the voice of the far-end speaker reproduced from the loudspeaker so that people nearby cannot make it out, which prevents the voice from leaking to the surroundings.
- It also prevents the masking sound from becoming excessively loud and disturbing the surrounding people.
- However, Patent Document 1 reproduces a masking sound so that surrounding people cannot hear the content of the speech. The speaker therefore cannot tell how loudly to speak so that the surrounding people cannot hear the content of the utterance.
- an object of the present invention is to provide a technique for feeding back the degree of speech volume to the speaker.
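- The speech feedback device of the present invention includes a speech volume evaluation unit that generates an evaluation value for the volume of the spoken voice (hereinafter referred to as the speech volume evaluation value) from the first collected sound signal output by a first microphone installed near the speaker and the second collected sound signal output by a second microphone installed at a position farther from the speaker than the first microphone, and a feedback sound signal generation unit that uses a feedback gain according to the speech volume evaluation value to generate, from the first collected sound signal, a signal for emitting from a loudspeaker a feedback sound that indicates the degree of the volume of the spoken voice to the speaker.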
- FIG. 1 is a block diagram showing the configuration of the speech feedback device 100.
- FIG. 2 is a flow chart showing the operation of the speech feedback device 100.
- FIG. 3 is a block diagram showing the configuration of the speech feedback device 200.
- FIG. 4 is a flow chart showing the operation of the speech feedback device 200.
- FIG. 5 is a block diagram showing the configuration of the speech feedback device 300.
- FIG. 6 is a flow chart showing the operation of the speech feedback device 300.
- FIG. 7 is a block diagram showing the configuration of the speech feedback device 301.
- FIG. 8 is a flow chart showing the operation of the speech feedback device 301.
- FIG. 9 is a block diagram showing the configuration of the speech feedback device 302.
- FIG. 10 is a flow chart showing the operation of the speech feedback device 302.
- FIG. 11 is a block diagram showing the configuration of the speech feedback device 400.
- FIG. 12 is a flow chart showing the operation of the speech feedback device 400.
- FIG. 13 is a block diagram showing the configuration of the speech evaluation unit 410.
- FIG. 14 is a flow chart showing the operation of the speech evaluation unit 410.
- FIG. 15 is a diagram showing an example of the functional configuration of a computer that implements each device.
- ^ (caret) represents a superscript. For example, x^y^z means that y^z is a superscript to x, and x_y^z means that y^z is a subscript to x.
- _ (underscore) represents a subscript. For example, x^y_z means that y_z is a superscript to x, and x_y_z means that y_z is a subscript to x.
- FIG. 1 is a block diagram showing the configuration of the speech feedback device 100.
- FIG. 2 is a flow chart showing the operation of the speech feedback device 100.
- speech feedback device 100 includes speech volume evaluation section 110 , feedback sound signal generation section 120 and recording section 190 .
- the recording unit 190 is a component that appropriately records information necessary for processing of the speech feedback device 100 .
- Speech feedback device 100 is also connected to microphone 910 and speaker 920 .
- a microphone 910 is installed near the speaker in order to pick up an uttered voice, which is the voice of the speaker.
- the speaker 920 is installed to emit a feedback sound that indicates the volume level of the uttered voice to the utterer. Headphones, earphones, or the like may be used instead of the speaker 920 .
- the speech volume evaluation unit 110 receives the picked-up sound signal output from the microphone 910, generates an evaluation value for the volume of the speech sound from the picked-up sound signal (hereinafter referred to as the speech volume evaluation value), and outputs it.
- the speech volume evaluation unit 110 generates a speech volume evaluation value by, for example, comparing the power of the collected sound signal with a predetermined threshold.
- the speech volume evaluation unit 110 may detect a speech section or suppress noise when calculating the power of the collected sound signal.
- the speech volume evaluation value may be a value indicating that the speech volume is high, a value indicating that the speech volume is low, or the like.
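The threshold comparison described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the dB scale, and the -20 dB threshold are all assumptions chosen for the example, and speech section detection and noise suppression are omitted.

```python
import math

def frame_power_db(frame):
    """Mean power of one frame of samples in dB (full scale = 1.0)."""
    if not frame:
        raise ValueError("frame must not be empty")
    mean_square = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(mean_square + 1e-12)  # small floor avoids log(0)

def speech_volume_evaluation(frame, threshold_db=-20.0):
    """Hypothetical evaluation value: frame power minus the threshold, in dB.

    Positive means louder than the threshold, negative means quieter.
    """
    return frame_power_db(frame) - threshold_db

# Synthetic frames: a 200 Hz tone at 16 kHz, loud and then attenuated.
loud = [0.5 * math.sin(2 * math.pi * 200 * n / 16000) for n in range(320)]
quiet = [0.01 * s for s in loud]
```

A signed dB difference is only one possible form; the text equally allows a coarse value such as "high" or "low".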
- the feedback sound signal generation unit 120 receives the collected sound signal output from the microphone 910 and the speech volume evaluation value generated in S110 and, using a feedback gain according to the speech volume evaluation value, generates from the collected sound signal a signal to be emitted from the speaker 920 (hereinafter referred to as a feedback sound signal) and outputs it.
- the speaker speaks while listening to the feedback sound generated from his or her own uttered voice. It is known that a feedback delay above 20 ms becomes annoying, and that above 50 ms the feedback sound interferes with speech and makes it difficult to speak. Therefore, the feedback sound signal generation unit 120 may generate the feedback sound signal so that the time from the utterance until the speaker hears the feedback sound is within 20 ms, for example.
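As a rough illustration of what the 20 ms target implies for block-based processing, the sketch below computes the largest processing block that fits in the delay budget. The function name and the `fixed_latency_ms` parameter (converter and acoustic delays outside the processing block) are assumptions for the example, not terms from the patent.

```python
def max_block_samples(sample_rate_hz, budget_ms=20.0, fixed_latency_ms=0.0):
    """Largest block size (in samples) that keeps total delay within budget_ms.

    fixed_latency_ms is a hypothetical allowance for delays that do not
    depend on the block size (converters, sound propagation, and so on).
    """
    available_ms = budget_ms - fixed_latency_ms
    if available_ms <= 0:
        raise ValueError("fixed latency already exceeds the delay budget")
    return int(sample_rate_hz * available_ms / 1000.0)
```

At 16 kHz, for example, the full 20 ms budget corresponds to 320 samples, so any fixed delay in the signal chain shrinks the usable block accordingly.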
- the feedback sound signal generation unit 120 may set the feedback gain to a larger value as the speech volume evaluation value is larger. For example, if the speech volume evaluation value indicates that the volume is excessive, a feedback sound signal may be generated using a feedback gain that causes temporary distortion. Whether the speech volume evaluation value indicates excessive volume may be determined by whether it exceeds a predetermined threshold.
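One way to picture this gain rule is the sketch below. It assumes the evaluation value is a dB excess over a threshold, and every constant (slope, maximum gain, overdrive factor, the 12 dB "excessive" level) is a placeholder; distortion is modeled simply as hard clipping at full scale, which is only one of many ways a gain could cause distortion.

```python
def feedback_gain(evaluation_db, base_gain=1.0, slope_per_db=0.05, max_gain=4.0):
    """Gain that grows with the speech volume evaluation value (assumed in dB)."""
    gain = base_gain + slope_per_db * max(evaluation_db, 0.0)
    return min(gain, max_gain)

def generate_feedback(frame, evaluation_db, excessive_db=12.0, overdrive=8.0):
    """Apply the gain; above excessive_db, overdrive the signal so it clips."""
    gain = feedback_gain(evaluation_db)
    if evaluation_db > excessive_db:
        gain *= overdrive  # deliberately large gain so that clipping occurs
    # Hard-limit to full scale [-1, 1]; this is where the distortion arises.
    return [max(-1.0, min(1.0, gain * s)) for s in frame]
```

With these placeholder values a speaker at or below the threshold hears an unmodified voice, while an excessively loud speaker hears a clipped one.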
- the feedback sound signal generation unit 120 may process the collected sound signal using, for example, noise suppression processing, speech clarification processing, or spectral processing that emphasizes the speech band, so that the feedback sound is easy for the speaker to hear. When active noise control (ANC) is used as the noise suppression processing, the feedback sound signal generation unit 120 increases the effect of active noise control as the speech volume evaluation value increases.
- According to the embodiment of the present invention, it is possible to feed back the degree of speech volume to the speaker. This allows the speaker to voluntarily adjust the speech volume.
- In addition, by using noise suppression processing when generating the feedback sound signal, the speech volume can be adjusted in a way that applies the Lombard effect, that is, loud speech in noisy environments can be suppressed.
- FIG. 3 is a block diagram showing the configuration of the speech feedback device 200.
- FIG. 4 is a flow chart showing the operation of the speech feedback device 200.
- speech feedback device 200 includes speech volume evaluation section 210 , feedback sound signal generation section 120 and recording section 190 .
- the recording unit 190 is a component that appropriately records information necessary for processing of the speech feedback device 200 .
- Speech feedback device 200 is also connected to first microphone 910 - 1 , second microphone 910 - 2 , and speaker 920 .
- the first microphone 910-1 is installed near the speaker in order to pick up the spoken voice, which is the voice of the speaker.
- the second microphone 910-2 is installed at a position farther from the speaker than the first microphone 910-1 in order to pick up the uttered voice. It is installed to measure how loudly the speaker's speech is heard by the surrounding people.
- the speaker 920 is installed to emit a feedback sound that indicates the volume level of the uttered voice to the utterer.
- a partition may be installed between the first microphone 910-1 and the second microphone 910-2. Specifically, with respect to the partition, the first microphone 910-1 is installed on the same side as the speaker, and the second microphone 910-2 is installed on the opposite side from the speaker. Headphones, earphones, or the like may be used instead of the speaker 920 .
- Speech feedback device 200 differs from speech feedback device 100 in that it includes speech volume evaluation section 210 instead of speech volume evaluation section 110 and in that it is connected to two microphones.
- speech volume evaluation section 210 receives as input the first collected sound signal output from first microphone 910-1 and the second collected sound signal output from second microphone 910-2, generates an evaluation value for the volume of the speech voice (hereinafter referred to as a speech volume evaluation value) from the first and second collected sound signals, and outputs it.
- the speech volume evaluation unit 210 generates a speech volume evaluation value by, for example, comparing the power of the second collected sound signal with a predetermined threshold.
- In doing so, the speech volume evaluation unit 210 may use the speech period detected from the first collected sound signal to eliminate the influence of noise.
- When a partition is installed, the speech volume evaluation unit 210 can generate the speech volume evaluation value in consideration of the speech attenuation effect of the partition.
- the feedback sound signal generation unit 120 receives the first collected sound signal output by the first microphone 910-1 and the speech volume evaluation value generated in S210 and, using the feedback gain corresponding to the speech volume evaluation value, generates from the first collected sound signal a signal to be emitted from the speaker 920 (hereinafter referred to as a feedback sound signal) and outputs it.
- A more accurate speech volume evaluation value can be generated by computing the power of the second collected sound signal over the speech period detected from the first collected sound signal, which contains mainly the speech and relatively little surrounding noise.
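The two-microphone evaluation can be sketched as below: speech frames are found on the near microphone and the power is measured on the far microphone only in those frames. The energy-based speech detector, its threshold, and the function names are assumptions for the example; the patent does not specify a detection method, and partition attenuation is not modeled here.

```python
import math

def power(frame):
    """Mean power of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)

def speech_frames(first_mic_frames, vad_threshold=1e-3):
    """Indices of frames in which the near microphone indicates speech.

    A plain energy threshold stands in for whatever detector is actually used.
    """
    return [i for i, f in enumerate(first_mic_frames) if power(f) > vad_threshold]

def far_field_volume_db(first_mic_frames, second_mic_frames):
    """Power (dB) at the far microphone, averaged over detected speech frames."""
    idx = speech_frames(first_mic_frames)
    if not idx:
        return None  # no speech detected, so there is nothing to evaluate
    mean_power = sum(power(second_mic_frames[i]) for i in idx) / len(idx)
    return 10.0 * math.log10(mean_power + 1e-12)

# Frame 0: no speech near the talker, but noise at the far microphone.
# Frame 1: speech near the talker, attenuated speech at the far microphone.
near = [[0.0] * 4, [0.5] * 4]
far = [[0.3] * 4, [0.1] * 4]
```

In this toy input the noise-only frame at the far microphone is ignored, so the result reflects only the attenuated speech.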
- FIG. 5 is a block diagram showing the configuration of the speech feedback device 300.
- FIG. 6 is a flow chart showing the operation of speech feedback device 300 .
- speech feedback device 300 includes speech volume evaluation section 110 , howling prevention section 310 , feedback sound signal generation section 320 , and recording section 190 .
- the recording unit 190 is a component that appropriately records information necessary for processing of the speech feedback device 300 .
- Speech feedback device 300 is also connected to microphone 910 and speaker 920 .
- Speech feedback device 300 differs from speech feedback device 100 in that it includes howling prevention section 310 and that it includes feedback sound signal generation section 320 instead of feedback sound signal generation section 120 .
- the operation of the speech feedback device 300 will be described according to FIG. 6. Here, only the operations of howling prevention section 310 and feedback sound signal generation section 320 will be described.
- the howling prevention unit 310 receives the sound pickup signal output by the microphone 910, generates from it a howling evaluation value indicating the possibility that howling will occur when the feedback sound is emitted from the speaker 920, and outputs it.
- the feedback sound signal generation unit 320 receives the sound pickup signal output by the microphone 910, the speech volume evaluation value generated in S110, and the howling evaluation value generated in S310 and, using a feedback gain corresponding to the speech volume evaluation value and the howling evaluation value, generates from the collected sound signal a signal to be emitted from the speaker 920 (hereinafter referred to as a feedback sound signal) and outputs it.
- Feedback sound signal generation section 320 sets the feedback gain to a smaller value as the howling evaluation value increases.
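A gain that depends on both evaluation values might look like the following sketch. It assumes the howling evaluation value is normalized to the range 0 to 1 and combines the two factors by multiplication; both choices, like the constants, are illustrative rather than taken from the patent.

```python
def combined_feedback_gain(volume_eval_db, howling_eval,
                           base_gain=1.0, slope_per_db=0.05, max_gain=4.0):
    """Gain that rises with speech volume and falls with howling risk.

    howling_eval is assumed to lie in [0, 1]; values outside are clamped.
    """
    howling_eval = min(max(howling_eval, 0.0), 1.0)
    volume_gain = min(base_gain + slope_per_db * max(volume_eval_db, 0.0), max_gain)
    return volume_gain * (1.0 - howling_eval)
```

With this form, a howling evaluation value of 1 mutes the feedback entirely, which is the safest behavior when the loop is about to oscillate.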
- the speech feedback device may be connected with two microphones.
- FIG. 7 is a block diagram showing the configuration of the speech feedback device 301.
- FIG. 8 is a flow chart showing the operation of speech feedback device 301 .
- speech feedback device 301 includes speech volume evaluation section 210 , howling prevention section 310 , feedback sound signal generation section 320 , and recording section 190 .
- the recording unit 190 is a component that appropriately records information necessary for processing of the speech feedback device 301 .
- Speech feedback device 301 is also connected to first microphone 910 - 1 , second microphone 910 - 2 and speaker 920 .
- Speech feedback device 301 differs from speech feedback device 300 in that it includes speech volume evaluation section 210 instead of speech volume evaluation section 110 and in that it is connected to two microphones.
- the operation of the speech feedback device 301 will be explained according to FIG. 8. Here, only the operations of howling prevention section 310 and feedback sound signal generation section 320 will be described.
- howling prevention unit 310 receives as input the first collected sound signal output from first microphone 910-1, generates from the first collected sound signal a howling evaluation value indicating the possibility that howling will occur when the feedback sound is emitted from the speaker 920, and outputs it.
- the feedback sound signal generation unit 320 receives the first collected sound signal output by the first microphone 910-1, the speech volume evaluation value generated in S110, and the howling evaluation value generated in S310 and, using the feedback gain corresponding to the speech volume evaluation value and the howling evaluation value, generates from the first collected sound signal a signal to be emitted from the speaker 920 (hereinafter referred to as a feedback sound signal) and outputs it.
- the speech feedback device may be connected to a microphone array and speaker array instead of the microphone and speaker.
- FIG. 9 is a block diagram showing the configuration of the speech feedback device 302.
- FIG. 10 is a flow chart showing the operation of speech feedback device 302 .
- the speech feedback device 302 includes a microphone array processing unit 305, a speech volume evaluation unit 110, a howling prevention unit 310, a feedback sound signal generation unit 320, a speaker array processing unit 325, and a recording unit 190.
- the recording unit 190 is a component that appropriately records information necessary for processing of the speech feedback device 302 .
- the speech feedback device 302 is also connected to a microphone array 911 including N (N is an integer of 2 or more) microphones and a speaker array 921 including M (M is an integer of 2 or more) speakers.
- the microphone array 911 is installed near the speaker in order to pick up the spoken voice, which is the voice of the speaker.
- the speaker array 921 is installed to emit a feedback sound indicating the volume level of the uttered voice to the utterer.
- Speech feedback device 302 differs from speech feedback device 300 in that microphone array processing section 305 and speaker array processing section 325 are included, and that microphone array 911 and speaker array 921 are connected instead of microphone 910 and speaker 920 .
- the operation of the speech feedback device 302 will be described according to FIG. 10. Only the operations of the microphone array processing unit 305 and the speaker array processing unit 325 will be described here.
- the microphone array processing unit 305 receives the N sound pickup signals output by the N microphones included in the microphone array 911, generates an integrated sound pickup signal from the N sound pickup signals, and outputs it.
- the microphone array processing unit 305 may, for example, use predetermined signal processing to form directivity in the direction of the speaker and blind spots in the direction of the speakers included in the speaker array 921 to generate an integrated sound pickup signal.
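The patent leaves the "predetermined signal processing" open. As one possible illustration, the sketch below designs narrowband weights for a linear array that pass the direction of the person speaking with unit gain and place a null (blind spot) toward a loudspeaker. The far-field plane-wave model, the single-frequency treatment, and all names are assumptions; a practical system would work per frequency band and account for room acoustics.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def steering_vector(mic_positions_m, angle_deg, freq_hz):
    """Far-field steering vector of a linear array for a plane wave."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    direction = math.cos(math.radians(angle_deg))
    return [cmath.exp(-1j * k * x * direction) for x in mic_positions_m]

def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def null_steering_weights(a_target, a_null):
    """Minimum-norm weights w with w^H a_target = 1 and w^H a_null = 0."""
    g00, g01 = inner(a_target, a_target), inner(a_target, a_null)
    g10, g11 = inner(a_null, a_target), inner(a_null, a_null)
    det = g00 * g11 - g01 * g10
    if abs(det) < 1e-9:
        raise ValueError("target and null directions are not separable")
    # Solve the 2x2 system (C^H C) c = [1, 0]^T, then w = C c.
    c_target, c_null = g11 / det, -g10 / det
    return [c_target * s + c_null * n for s, n in zip(a_target, a_null)]

# Four microphones 5 cm apart; talker at 90 degrees, loudspeaker at 30 degrees.
mics = [0.0, 0.05, 0.10, 0.15]
a_talker = steering_vector(mics, 90.0, 1000.0)
a_loudspeaker = steering_vector(mics, 30.0, 1000.0)
w = null_steering_weights(a_talker, a_loudspeaker)
```

The integrated signal at this frequency would then be the weighted sum `inner(w, x)` of the microphone signals `x`; the same construction applies in reverse to a loudspeaker array.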
- the speaker array processing unit 325 receives the feedback sound signal generated in S320, generates from it M individual feedback sound signals to be emitted from the M speakers included in the speaker array 921, and outputs them.
- the speaker array processing unit 325 may, for example, use predetermined signal processing to form directivity in the direction of the speaker and blind spots in the direction of the microphones included in the microphone array 911 when generating the M individual feedback sound signals.
- the directions of the speaker and of the microphones included in the microphone array 911 may be obtained by any method. For example, the direction of the speaker can be obtained by sound source direction estimation in the microphone array processing unit 305.
- Alternatively, information on the positions of the speaker and of the microphones included in the microphone array 911 may be obtained, for example, from a system (not shown) that estimates positions from images captured by a camera, or, if position information is available in advance, that information may be used. The directions of the speaker and of the microphones may then be obtained from this information.
- According to the embodiment of the present invention, it is possible to feed back the degree of speech volume to the speaker. By preventing howling, the speaker can more accurately and voluntarily adjust the speech volume.
- FIG. 11 is a block diagram showing the configuration of speech feedback device 400.
- FIG. 12 is a flow chart showing the operation of speech feedback device 400 .
- speech feedback device 400 includes speech evaluation section 410 , feedback sound signal generation section 420 and recording section 190 .
- the recording unit 190 is a component that appropriately records information necessary for processing of the speech feedback device 400 .
- Speech feedback device 400 is also connected to microphone 910 and speaker 920 . Headphones, earphones, or the like may be used instead of the speaker 920 .
- Speech feedback device 400 differs from speech feedback device 100 in that it includes speech evaluation section 410 instead of speech volume evaluation section 110 and feedback sound signal generation section 420 instead of feedback sound signal generation section 120 .
- the speech evaluation unit 410 receives the picked-up sound signal output from the microphone 910, generates an evaluation value for the speech sound from the picked-up sound signal (hereinafter referred to as the speech evaluation value), and outputs the evaluation value.
- FIG. 13 is a block diagram showing the configuration of the utterance evaluation unit 410.
- FIG. 14 is a flow chart showing the operation of the utterance evaluation unit 410.
- speech evaluation unit 410 includes speech volume evaluation unit 110 , speech clarity evaluation unit 412 , and speech evaluation value calculation unit 414 .
- the speech volume evaluation unit 110 receives the picked-up sound signal output from the microphone 910, generates an evaluation value for the volume of the speech sound from the picked-up sound signal (hereinafter referred to as the speech volume evaluation value), and outputs it.
- the speech clarity evaluation unit 412 receives the collected sound signal output from the microphone 910, generates from it an evaluation value for the clarity of the speech (hereinafter referred to as a speech clarity evaluation value), and outputs it.
- As the speech clarity evaluation value, for example, short-time objective intelligibility (STOI) or a speech recognition score can be used.
- the speech evaluation value calculation unit 414 receives the speech volume evaluation value generated in S110 and the speech clarity evaluation value generated in S412 as inputs, calculates the weighted sum of the speech volume evaluation value and the speech clarity evaluation value, and outputs the sum as the speech evaluation value.
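The weighted sum can be written directly, as in the sketch below. The equal default weights and the assumption that both inputs are already scaled to comparable ranges are illustrative; the patent does not give the weights.

```python
def speech_evaluation(volume_eval, clarity_eval, w_volume=0.5, w_clarity=0.5):
    """Weighted sum of the volume and clarity evaluation values.

    Both inputs are assumed to be normalized to comparable ranges (e.g. 0..1).
    """
    return w_volume * volume_eval + w_clarity * clarity_eval
```

Raising `w_clarity` makes quiet but clearly intelligible speech score higher, which matches the stated aim of catching utterances whose content carries even at low volume.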
- the feedback sound signal generation unit 420 receives as inputs the collected sound signal output by the microphone 910 and the speech evaluation value generated in S410 and, using the feedback gain according to the speech evaluation value, generates from the collected sound signal a signal to be emitted from the speaker 920 (hereinafter referred to as a feedback sound signal) and outputs it.
- the speech feedback device may provide feedback using visual information instead of feedback using sound.
- speech feedback device 400 includes feedback information generator 421 (not shown) instead of feedback sound signal generator 420 .
- the feedback information generation unit 421 receives the speech evaluation value generated in S410 as an input, and generates and outputs information indicating that the volume of the speech is loud when the speech evaluation value is greater than a predetermined threshold.
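A minimal sketch of this visual-feedback rule follows. The threshold value and the message text are placeholders; how the information is displayed is outside the scope of the example.

```python
def feedback_information(speech_eval, threshold=0.6):
    """Return a message when the evaluation value exceeds the threshold.

    Returns None when no feedback needs to be shown.
    """
    if speech_eval > threshold:
        return "Your voice is loud."
    return None
```

The returned string could drive an on-screen indicator, for example, in place of the feedback sound.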
- According to the embodiment of the present invention, it is possible to feed back to the speaker the degree of annoyance of speech based on the volume and clarity of the speech.
- By using a speech evaluation value that also takes the clarity of speech into account, feedback can be given even for utterances that may bother surrounding people because their content can be heard even though their volume is low.
- FIG. 15 is a diagram showing an example of the functional configuration of a computer 2000 that implements each of the devices described above.
- the processing in each device described above can be performed by causing the recording unit 2020 to read a program for causing the computer 2000 to function as each device described above, and causing the control unit 2010, the input unit 2030, the output unit 2040, and the like to operate.
- the apparatus of the present invention includes, for example, as a single hardware entity, an input unit to which a keyboard or the like can be connected, an output unit to which a liquid crystal display or the like can be connected, a communication unit to which a communication device (for example, a communication cable) capable of communicating with the outside of the hardware entity can be connected, a CPU (Central Processing Unit), RAM and ROM as memory, an external storage device such as a hard disk, and a bus that connects the input unit, output unit, communication unit, CPU, RAM, ROM, and external storage device so that data can be exchanged among them.
- the hardware entity may be provided with a device (drive) capable of reading and writing a recording medium such as a CD-ROM.
- a physical entity with such hardware resources includes a general purpose computer.
- the external storage device of the hardware entity stores the programs necessary for realizing the functions described above and the data required for the processing of these programs (the programs are not limited to the external storage device and may, for example, be stored in a ROM, which is a read-only storage device). Data obtained by the processing of these programs are stored as appropriate in the RAM, the external storage device, or the like.
- In the hardware entity, each program stored in the external storage device (or ROM, etc.) and the data necessary for processing each program are read into the memory as needed, and interpreted, executed, and processed by the CPU as appropriate.
- As a result, the CPU realizes predetermined functions (the components referred to above as ... unit, ... means, etc.).
- a program that describes this process can be recorded on a computer-readable recording medium.
- Any computer-readable recording medium may be used, for example, a magnetic recording device, an optical disk, a magneto-optical recording medium, a semiconductor memory, or the like.
- Specifically, for example, hard disk devices, flexible disks, magnetic tapes, etc. can be used as magnetic recording devices; DVD (Digital Versatile Disc), DVD-RAM (Random Access Memory), CD-ROM (Compact Disc Read Only Memory), CD-R (Recordable) / RW (ReWritable), etc. as optical discs; MO (Magneto-Optical disc), etc. as magneto-optical recording media; and EEP-ROM (Electronically Erasable and Programmable-Read Only Memory), etc. as semiconductor memory.
- Distribution of this program is carried out, for example, by selling, assigning, or lending portable recording media such as DVDs and CD-ROMs on which the program is recorded.
- the program may be distributed by storing the program in the storage device of the server computer and transferring the program from the server computer to other computers via the network.
- A computer that executes such a program, for example, first stores the program recorded on a portable recording medium or the program transferred from the server computer in its own storage device. When executing the process, this computer reads the program stored in its own storage device and executes the process according to the read program. As another execution form of this program, the computer may read the program directly from a portable recording medium and execute processing according to the program, or, each time the program is transferred from the server computer to this computer, it may sequentially execute processing according to the received program. The above-mentioned processing may also be executed by a so-called ASP (Application Service Provider) type service, which does not transfer the program from the server computer to this computer and realizes the processing function only through execution instructions and result acquisition. Note that the program in this embodiment includes information that is used for processing by a computer and that is equivalent to a program (data that is not a direct instruction to the computer but has the property of prescribing the processing of the computer, etc.).
- In this embodiment, the hardware entity is configured by executing a predetermined program on a computer, but at least part of the processing may be implemented in hardware.
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Otolaryngology (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
Description
以下、図1~図2を参照して発話フィードバック装置100を説明する。図1は、発話フィードバック装置100の構成を示すブロック図である。図2は、発話フィードバック装置100の動作を示すフローチャートである。図1に示すように発話フィードバック装置100は、発話音量評価部110と、フィードバック音信号生成部120と、記録部190を含む。記録部190は、発話フィードバック装置100の処理に必要な情報を適宜記録する構成部である。また、発話フィードバック装置100は、マイク910と、スピーカ920と接続している。マイク910は、発話者の音声である発話音声を収音するために発話者の近くに設置されるものである。スピーカ920は、発話者に発話音声の音量の程度を示すフィードバック音を放音するために設置されるものである。なお、スピーカ920の代わりに、ヘッドホンやイヤホンなどを用いてもよい。
以下、図3~図4を参照して発話フィードバック装置200を説明する。図3は、発話フィードバック装置200の構成を示すブロック図である。図4は、発話フィードバック装置200の動作を示すフローチャートである。図3に示すように発話フィードバック装置200は、発話音量評価部210と、フィードバック音信号生成部120と、記録部190を含む。記録部190は、発話フィードバック装置200の処理に必要な情報を適宜記録する構成部である。また、発話フィードバック装置200は、第1マイク910-1と、第2マイク910-2と、スピーカ920と接続している。第1マイク910-1は、発話者の音声である発話音声を収音するために発話者の近くに設置されるものである。第2マイク910-2は、発話音声を収音するために第1マイク910-1より発話者から遠い位置に設置されるものであり、発話者の発話が周囲の人にどの程度の音量で聞こえるかを測定するために設置されるものである。スピーカ920は、発話者に発話音声の音量の程度を示すフィードバック音を放音するために設置されるものである。なお、第1マイク910-1と第2マイク910-2の間にパーティションを設置してもよい。具体的には、パーティションを境に、第1マイク910-1は発話者と同じ側に、第2マイク910-2は発話者と反対側になるように設置する。また、スピーカ920の代わりに、ヘッドホンやイヤホンなどを用いてもよい。発話フィードバック装置200は、発話音量評価部110の代わりに発話音量評価部210を含む点と、2つのマイクと接続する点において発話フィードバック装置100と異なる。
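The utterance feedback apparatus 300 is described below with reference to FIGS. 5 and 6. FIG. 5 is a block diagram showing the configuration of the utterance feedback apparatus 300. FIG. 6 is a flowchart showing its operation. As shown in FIG. 5, the utterance feedback apparatus 300 includes an utterance volume evaluation unit 110, a howling prevention unit 310, a feedback sound signal generation unit 320, and a recording unit 190. The recording unit 190 records, as appropriate, information needed for the processing of the utterance feedback apparatus 300. The utterance feedback apparatus 300 is connected to a microphone 910 and a loudspeaker 920. The utterance feedback apparatus 300 differs from the utterance feedback apparatus 100 in that it includes the howling prevention unit 310 and in that it includes the feedback sound signal generation unit 320 in place of the feedback sound signal generation unit 120.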
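The interaction between the howling prevention unit and the feedback gain can be illustrated as follows. The passage does not specify how the howling evaluation value is computed; the spectral peak-to-average power ratio used here is a common heuristic chosen only for illustration, and the thresholds `safe_db` and `risky_db` are hypothetical. The gain reduction follows the rule in the claims that a larger howling evaluation value yields a smaller feedback gain.

```python
import cmath
import math


def howling_evaluation_value(frame):
    """Peak-to-average power ratio of the spectrum, in dB (hypothetical metric).

    A sustained narrowband tone, typical of acoustic feedback, concentrates
    power in one bin and raises this ratio. Requires len(frame) >= 4.
    """
    n = len(frame)
    powers = []
    for k in range(1, n // 2):  # skip DC, use positive frequencies only
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        powers.append(abs(coeff) ** 2)
    mean_power = sum(powers) / len(powers)
    if mean_power <= 0.0:
        return 0.0
    return 10.0 * math.log10(max(powers) / mean_power)


def limit_feedback_gain(base_gain, howling_db, safe_db=6.0, risky_db=12.0):
    """Reduce the gain linearly from base_gain to 0 as the howling value rises."""
    ratio = (howling_db - safe_db) / (risky_db - safe_db)
    return base_gain * (1.0 - min(max(ratio, 0.0), 1.0))
```

The naive DFT keeps the sketch dependency-free; a real-time implementation would more plausibly use an FFT and track the peak over several frames before lowering the gain.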
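The utterance feedback apparatus may be connected to two microphones.
The utterance feedback apparatus may be connected to a microphone array and a loudspeaker array instead of a microphone and a loudspeaker.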
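The utterance feedback apparatus 400 is described below with reference to FIGS. 11 and 12. FIG. 11 is a block diagram showing the configuration of the utterance feedback apparatus 400. FIG. 12 is a flowchart showing its operation. As shown in FIG. 11, the utterance feedback apparatus 400 includes an utterance evaluation unit 410, a feedback sound signal generation unit 420, and a recording unit 190. The recording unit 190 records, as appropriate, information needed for the processing of the utterance feedback apparatus 400. The utterance feedback apparatus 400 is connected to a microphone 910 and a loudspeaker 920. Headphones, earphones, or the like may be used in place of the loudspeaker 920. The utterance feedback apparatus 400 differs from the utterance feedback apparatus 100 in that it includes the utterance evaluation unit 410 in place of the utterance volume evaluation unit 110 and in that it includes the feedback sound signal generation unit 420 in place of the feedback sound signal generation unit 120.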
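Instead of giving feedback with sound, the utterance feedback apparatus may give feedback with visual information. In this case, the utterance feedback apparatus 400 includes a feedback information generation unit 421 (not shown) in place of the feedback sound signal generation unit 420. The feedback information generation unit 421 takes as input the utterance evaluation value generated in S410 and, when the utterance evaluation value is greater than a predetermined threshold, generates and outputs information indicating that the utterance is loud.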
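The threshold rule for visual feedback can be sketched in a few lines. The dictionary layout, the message text, and the choice to return `None` when the threshold is not exceeded are assumptions made for illustration; the passage only specifies the case in which the evaluation value is greater than the threshold.

```python
def generate_feedback_information(utterance_evaluation_value, threshold):
    """Return visual feedback information when the evaluation value exceeds the threshold.

    The returned structure is hypothetical; None means nothing is displayed.
    """
    if utterance_evaluation_value > threshold:
        return {"volume_too_loud": True, "message": "Your voice is loud."}
    return None
```

A display component could, for example, show the message or light an indicator whenever the returned value is not `None`.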
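FIG. 15 shows an example of the functional configuration of a computer 2000 that implements each of the apparatuses described above. The processing in each apparatus can be carried out by loading into the recording unit 2020 a program that causes the computer 2000 to function as that apparatus, and having the control unit 2010, the input unit 2030, the output unit 2040, and so on operate.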
Claims (6)
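- An utterance feedback apparatus comprising:
an utterance volume evaluation unit that generates an evaluation value for the volume of uttered speech (hereinafter, the utterance volume evaluation value) from a first picked-up sound signal output by a first microphone placed near a speaker to pick up the uttered speech, which is the speaker's voice, and a second picked-up sound signal output by a second microphone placed farther from the speaker than the first microphone to pick up the uttered speech; and
a feedback sound signal generation unit that, using a feedback gain corresponding to the utterance volume evaluation value, generates from the first picked-up sound signal a signal (hereinafter, the feedback sound signal) for emitting from a loudspeaker a feedback sound that indicates to the speaker how loud the uttered speech is.
- The utterance feedback apparatus according to claim 1,
wherein the feedback sound signal generation unit sets the feedback gain to a larger value the larger the utterance volume evaluation value indicates the volume to be.
- The utterance feedback apparatus according to claim 1,
wherein, when the utterance volume evaluation value exceeds a predetermined threshold, the feedback sound signal generation unit generates the feedback sound signal using a feedback gain that causes distortion.
- The utterance feedback apparatus according to any one of claims 1 to 3,
further comprising a howling prevention unit that uses the first picked-up sound signal to generate a howling evaluation value indicating the likelihood that howling occurs when the feedback sound is emitted from the loudspeaker,
wherein the feedback sound signal generation unit sets the feedback gain to a smaller value the larger the howling evaluation value indicates the likelihood to be.
- An utterance feedback method comprising:
an utterance volume evaluation step in which an utterance feedback apparatus generates an evaluation value for the volume of uttered speech (hereinafter, the utterance volume evaluation value) from a first picked-up sound signal output by a first microphone placed near a speaker to pick up the uttered speech, which is the speaker's voice, and a second picked-up sound signal output by a second microphone placed farther from the speaker than the first microphone to pick up the uttered speech; and
a feedback sound signal generation step in which the utterance feedback apparatus, using a feedback gain corresponding to the utterance volume evaluation value, generates from the first picked-up sound signal a signal (hereinafter, the feedback sound signal) for emitting from a loudspeaker a feedback sound that indicates to the speaker how loud the uttered speech is.
- A program for causing a computer to function as the utterance feedback apparatus according to any one of claims 1 to 4.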
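The gain rules in claims 2 and 3 can be illustrated with a short sketch. It is written under stated assumptions rather than taken from the application: the linear gain law below the threshold, the parameter values, and the use of hard clipping at full scale as the source of distortion are all hypothetical choices.

```python
def feedback_gain(volume_value, threshold, slope=0.1, base_gain=1.0, distortion_gain=20.0):
    """Gain that never decreases as the volume evaluation value increases.

    Above the threshold a large gain is returned so that the output distorts.
    All parameter values are illustrative.
    """
    if volume_value > threshold:
        return distortion_gain
    return max(0.0, base_gain + slope * (volume_value - threshold))


def generate_feedback_sound_signal(frame, gain, full_scale=1.0):
    """Apply the gain, then hard-clip at full scale; large gains therefore distort."""
    return [max(-full_scale, min(full_scale, gain * x)) for x in frame]
```

With these defaults, a frame evaluated 5 dB below the threshold is fed back at half amplitude, while a frame above the threshold is driven into clipping, which gives the speaker an unmistakable cue that they are too loud.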
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US18/293,991 US20250080905A1 (en) | 2021-08-06 | 2021-08-06 | Utterance feedback apparatus, utterance feedback method, and program |
| PCT/JP2021/029278 WO2023013019A1 (ja) | 2021-08-06 | 2021-08-06 | Utterance feedback apparatus, utterance feedback method, and program |
| JP2023539532A JP7677431B2 (ja) | 2021-08-06 | 2021-08-06 | Utterance feedback apparatus, utterance feedback method, and program |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2021/029278 WO2023013019A1 (ja) | 2021-08-06 | 2021-08-06 | Utterance feedback apparatus, utterance feedback method, and program |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023013019A1 (ja) | 2023-02-09 |
Family
ID=85155448
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2021/029278 Ceased WO2023013019A1 (ja) | 2021-08-06 | 2021-08-06 | Utterance feedback apparatus, utterance feedback method, and program |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US20250080905A1 (ja) |
| JP (1) | JP7677431B2 (ja) |
| WO (1) | WO2023013019A1 (ja) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
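| WO2017064839A1 (ja) * | 2015-10-16 | 2017-04-20 | Panasonic Intellectual Property Management Co., Ltd. | Bidirectional conversation assistance device and bidirectional conversation assistance method |
| JP2021022883A (ja) * | 2019-07-29 | 2021-02-18 | 大聖 今田 | Voice amplification device and program |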
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| DE10110258C1 (de) * | 2001-03-02 | 2002-08-29 | Siemens Audiologische Technik | Method for operating a hearing aid or hearing aid system, and hearing aid or hearing aid system |
| DE102009060094B4 (de) * | 2009-12-22 | 2013-03-14 | Siemens Medical Instruments Pte. Ltd. | Method and hearing aid for feedback detection and suppression with a directional microphone |
| US8630437B2 (en) * | 2010-02-23 | 2014-01-14 | University Of Utah Research Foundation | Offending frequency suppression in hearing aids |
| DE102014218672B3 (de) * | 2014-09-17 | 2016-03-10 | Sivantos Pte. Ltd. | Method and device for feedback suppression |
-
2021
- 2021-08-06 WO PCT/JP2021/029278 patent/WO2023013019A1/ja not_active Ceased
- 2021-08-06 US US18/293,991 patent/US20250080905A1/en active Pending
- 2021-08-06 JP JP2023539532A patent/JP7677431B2/ja active Active
Also Published As
| Publication number | Publication date |
|---|---|
| JPWO2023013019A1 (ja) | 2023-02-09 |
| US20250080905A1 (en) | 2025-03-06 |
| JP7677431B2 (ja) | 2025-05-15 |
Similar Documents
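| Publication | Title | Publication Date |
|---|---|---|
| JP2012155339A (ja) | Multi-sensor speech enhancement using a speech-state model | |
| CN100525101C (zh) | Method and apparatus for recording a signal using a beamforming algorithm | |
| US20250095668A1 (en) | Input selection for wind noise reduction on wearable devices | |
| CN111801951B (zh) | Howling suppression device, method therefor, and computer-readable recording medium | |
| CN103814584B (zh) | Sound processing device and sound processing method | |
| JP7639924B2 (ja) | Masking device, masking method, and program | |
| JP7677431B2 (ja) | Utterance feedback apparatus, utterance feedback method, and program | |
| US20240055011A1 (en) | Dynamic voice nullformer | |
| US12183317B2 (en) | Call environment generation method, call environment generation apparatus, and program | |
| US11894013B2 (en) | Sound collection loudspeaker apparatus, method and program for the same | |
| CN112544088B (zh) | Sound pickup and amplification device, method therefor, and recording medium | |
| JP4495704B2 (ja) | Sound image localization emphasizing reproduction method, and device, program, and storage medium therefor | |
| CN115278499B (zh) | Delay measurement method and apparatus, electronic device, and storage medium | |
| JP7636512B2 (ja) | Low-complexity howling suppression for portable karaoke | |
| JP6994221B2 (ja) | Extracted generated-sound correction device, extracted generated-sound correction method, and program | |
| JP6956929B2 (ja) | Information processing device, control method, and control program | |
| CN116419111A (zh) | Earphone control method, parameter generation method, apparatus, storage medium, and earphone | |
| JP6538002B2 (ja) | Target sound collection device, target sound collection method, program, and recording medium | |
| JP6639590B2 (ja) | Headset | |
| WO2025080244A1 (en) | Dynamic voice nullformer | |
| JP2023036332A (ja) | Acoustic system | |
| WO2021210120A1 (ja) | Cancellation filter coefficient generation method, cancellation filter coefficient generation device, and program | |
| JP2012195756A (ja) | Sound reproduction device | |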
Legal Events
| Code | Title | Description |
|---|---|---|
| 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21952842; Country of ref document: EP; Kind code of ref document: A1 |
| WWE | WIPO information: entry into national phase | Ref document number: 2023539532; Country of ref document: JP |
| WWE | WIPO information: entry into national phase | Ref document number: 18293991; Country of ref document: US |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | EP: PCT application non-entry in European phase | Ref document number: 21952842; Country of ref document: EP; Kind code of ref document: A1 |
| WWP | WIPO information: published in national office | Ref document number: 18293991; Country of ref document: US |