JPH086589A - Telephone line voice input system - Google Patents
Telephone line voice input system
- Publication number
- JPH086589A
- Authority
- JP
- Japan
- Prior art keywords
- information
- telephone line
- recognition
- information service
- vocabulary
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
Description
[0001]
[Field of Industrial Application] The present invention relates to a system for interacting by voice with an information service system over a telephone line network.
[0002]
[Prior Art] Conventional speech recognition has been based on methods that use the word as the unit of recognition. The recognition vocabulary of a speech recognition device was therefore fixed. Consequently, for a user to interact by voice with an information service system over a network, the speech recognition function had to reside on the information service system side and perform recognition according to the vocabulary and grammar information specific to that system.
[0003] This approach has the following problems. First, because the user's speech must be sent to the information service system over the telephone line network, the speech is distorted and recognition becomes difficult. When speech is sent over a telephone line, large distortions such as line distortion cannot be avoided. Second, because the information service system must receive and recognize speech from many users, it must perform speaker-independent recognition. With speaker-independent recognition, recognition performance drops markedly for some speakers. Performance can be improved by preparing reference patterns for each user, but preparing reference patterns for every individual user on the information service system side is very costly and difficult in practice.
[0004]
[Problems to be Solved by the Invention] To solve the above problems, the present invention provides the speech recognition function on the user terminal side and has the information service system send the vocabulary and grammar information needed for recognition over the telephone line network. The user terminal can then hold reference patterns dedicated to its user, which makes highly accurate speech recognition possible. For such recognition, the personal computer speech recognition software described in Reference 1 (Proceedings of the 47th National Convention of the Information Processing Society of Japan, pp. 2-375 to 2-376) can be used, for example. With this software, speech recognition with user-specific speech reference patterns can be realized on a personal computer.
[0005]
[Means for Solving the Problems] The telephone line voice input system of the present invention is a telephone line voice input system in which an information service system and a user terminal are connected by a telephone line and the user terminal gives instructions by voice to the information service system, characterized in that the information service system comprises a vocabulary grammar storage unit that stores a vocabulary grammar, and the user terminal comprises a vocabulary grammar buffer that receives and stores the vocabulary grammar from the information service system and a speech recognition unit that performs speech recognition using the vocabulary grammar.
[0006]
[Operation] The vocabulary and grammar used for recognition vary with the information service application and with the scene within that application. For example, a vocabulary grammar for choosing a service is used first; once the service has been chosen, the vocabulary grammar for input sentences changes according to that service. In addition to controlling service selection and the like, the information service system controls the switching of the recognition vocabulary grammar according to the scene and transmits the recognition vocabulary and grammar information to the user terminal. When the user terminal runs speech recognition, it controls the recognizer so that recognition follows the grammar and vocabulary transmitted for each scene.
[0007] The grammar and vocabulary information consists of word dictionary information, which represents the pronunciation of each word, and grammar information on word sequences, which represents which sequences of words may be input as a sentence. If voice input is restricted to isolated-word utterances and sentence utterances are not permitted, the grammar information on word sequences is unnecessary.
[0008] The speech recognition unit on the user terminal side must be based on a method whose recognition units, such as phonemes or syllables, can be combined to represent any word. The speech recognition described in Reference 1 is one example of such a method. With such a method, the terminal can receive vocabulary and grammar information from the information service system and perform recognition using it. The conventionally common method of registering a reference pattern for each word, with the word as the unit, cannot be used for this purpose. Because the speech recognition unit is installed on the user side, reference patterns can be prepared for that user. This is known to improve recognition performance, so higher-performance speech recognition is achieved than by installing a speaker-independent recognizer on the information service system side. It also avoids the high cost of providing per-user reference patterns in a recognizer on the information service system side.
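To illustrate why sub-word units let the terminal handle words it has never seen, the sketch below composes a word model from per-user syllable patterns. It is a simplification under stated assumptions: the kana splitting rule, the function names, and the string placeholders standing in for acoustic reference patterns are hypothetical and are not taken from Reference 1.

```python
from __future__ import annotations

# Small kana that combine with the preceding character into one syllable unit.
SMALL_KANA = set("ャュョァィゥェォ")


def split_syllables(reading: str) -> list[str]:
    """Split a katakana reading into syllable-like units (simplified rule)."""
    units: list[str] = []
    for ch in reading:
        if ch in SMALL_KANA and units:
            units[-1] += ch  # e.g. "キ" + "ョ" -> "キョ"
        else:
            units.append(ch)
    return units


def build_word_model(reading: str, user_patterns: dict[str, str]) -> list[str]:
    """Concatenate the user's unit patterns to form a model for any word."""
    units = split_syllables(reading)
    missing = [u for u in units if u not in user_patterns]
    if missing:
        raise KeyError(f"no reference pattern for units: {missing}")
    return [user_patterns[u] for u in units]


# Placeholder strings stand in for one user's trained reference patterns.
user_patterns = {u: f"pattern<{u}>" for u in ["ト", "ウ", "キョ", "オ", "サ", "カ"]}

tokyo_model = build_word_model("トウキョウ", user_patterns)
osaka_model = build_word_model("オオサカ", user_patterns)
```

A word-template recognizer would need a recorded pattern for each of these words in advance; here any word whose reading is covered by the unit inventory can be modelled as soon as its dictionary entry arrives.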
[0009] As shown in FIG. 2, the vocabulary and grammar information is given, for example, as follows: the word dictionary information gives each word in kana-kanji notation and in katakana notation, and the grammar information is given as a finite-state network whose arcs are word names.
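A finite-state network with word names as arcs can be sketched as a transition table plus an acceptance check. The contents of FIG. 2 are not reproduced in this text, so the states, words, and the names `ARCS`, `FINAL_STATES`, and `accepts` below are hypothetical examples rather than the patent's actual grammar.

```python
# Transition table: (current_state, word) -> next_state.
# Each entry is one arc of the network, labelled with a word name.
ARCS = {
    (0, "today"): 1,
    (0, "tomorrow"): 1,
    (1, "Tokyo"): 2,
    (1, "Osaka"): 2,
    (2, "weather"): 3,
}
FINAL_STATES = {3}


def accepts(words, start=0):
    """Return True if the word sequence follows the arcs to a final state."""
    state = start
    for word in words:
        key = (state, word)
        if key not in ARCS:
            return False  # no arc for this word from the current state
        state = ARCS[key]
    return state in FINAL_STATES
```

With this table, `accepts(["today", "Tokyo", "weather"])` holds, while a reordered or incomplete sequence is rejected. A recognizer would use the same network to restrict which words it considers at each point in the utterance.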
[0010]
[Embodiment] FIG. 1 shows an embodiment of the present invention. Reference numeral 1 denotes the information service system, which holds the vocabulary grammars it uses in the vocabulary grammar storage unit 12. Following the progress of the dialogue between the user and the system, the control unit 11 retrieves the vocabulary and grammar information used in each scene from the vocabulary grammar storage unit 12 and sends it to the user terminal 3 via the telephone line network 2. The user terminal 3 stores the received vocabulary and grammar information in the vocabulary grammar buffer 22. The speech recognition unit 23 reads the vocabulary and grammar information from the vocabulary grammar buffer 22, recognizes the user's speech signal input from the voice input unit 4, and sends the recognition result to the information service system 1 via the control unit 21 and the telephone line network 2. The information service system 1 processes and responds according to the recognition result and advances the scene. This processing is repeated until the service is complete.
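The exchange in the embodiment can be simulated in a few lines. This is a sketch under assumptions, not the patented implementation: the class and method names are hypothetical, the telephone line is replaced by direct method calls, and text strings stand in for audio so the "recognizer" is only a lookup in the buffered vocabulary.

```python
class InformationServiceSystem:
    """Stands in for system 1 (control unit 11 and vocabulary grammar storage unit 12)."""

    def __init__(self):
        # Storage unit 12: one vocabulary per scene (hypothetical contents).
        self.storage = {"menu": ["weather", "news"], "weather": ["Tokyo", "Osaka"]}
        self.scene = "menu"

    def current_grammar(self):
        """Control unit 11: fetch the vocabulary for the current scene."""
        return list(self.storage[self.scene])

    def handle_result(self, word):
        """Process a recognition result, respond, and advance the scene."""
        if self.scene == "menu" and word in self.storage:
            self.scene = word
            return f"selected {word}"
        if self.scene == "weather":
            self.scene = "done"
            return f"weather for {word}"
        return "please repeat"


class UserTerminal:
    """Stands in for terminal 3 (buffer 22 and speech recognition unit 23)."""

    def __init__(self):
        self.buffer = []

    def receive_grammar(self, grammar):
        self.buffer = grammar  # vocabulary grammar buffer 22

    def recognize(self, utterance):
        # Stub recognizer: accepts only words in the buffered vocabulary.
        return utterance if utterance in self.buffer else None


def run_dialogue(utterances):
    service, terminal = InformationServiceSystem(), UserTerminal()
    responses = []
    for utterance in utterances:
        if service.scene == "done":
            break
        terminal.receive_grammar(service.current_grammar())  # over network 2
        result = terminal.recognize(utterance)
        if result is None:
            responses.append("not recognized")
            continue
        responses.append(service.handle_result(result))
    return responses
```

Calling `run_dialogue(["weather", "Tokyo"])` walks through two scenes: the menu vocabulary is sent first, and the city vocabulary is sent only after "weather" has been recognized. Only vocabulary data and recognition results cross the line in this sketch; the speech itself stays on the terminal.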
[0011]
[Effects of the Invention] As described above, the present invention separates the vocabulary and grammar information for speech recognition from the speech recognition unit and keeps it on the remote information service system side, reached over the telephone line network. This makes highly accurate speech recognition possible and thereby improves the convenience of information services provided over a telephone line network.
[FIG. 1] Block diagram showing an embodiment of the present invention.
[FIG. 2] Example of a vocabulary grammar.
Reference Signs List
- 1: information service system
- 2: telephone line network
- 3: user terminal
- 4: voice input unit
- 11: control unit
- 12: vocabulary grammar storage unit
- 21: control unit
- 22: vocabulary grammar buffer
- 23: speech recognition unit
Claims (1)
1. A telephone line voice input system in which an information service system and a user terminal are connected by a telephone line and the user terminal gives instructions by voice to the information service system, wherein the information service system comprises a vocabulary grammar storage unit that stores a vocabulary grammar, and the user terminal comprises a vocabulary grammar buffer that receives and stores the vocabulary grammar from the information service system and a speech recognition unit that performs speech recognition using the vocabulary grammar.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP6138828A JP2655086B2 (en) | 1994-06-21 | 1994-06-21 | Telephone line voice input system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP6138828A JP2655086B2 (en) | 1994-06-21 | 1994-06-21 | Telephone line voice input system |
Publications (2)
Publication Number | Publication Date |
---|---|
JPH086589A (en) | 1996-01-12 |
JP2655086B2 JP2655086B2 (en) | 1997-09-17 |
Family
ID=15231179
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP6138828A Expired - Fee Related JP2655086B2 (en) | 1994-06-21 | 1994-06-21 | Telephone line voice input system |
Country Status (1)
Country | Link |
---|---|
JP (1) | JP2655086B2 (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002542501A (en) * | 1999-02-19 | 2002-12-10 | カスタム・スピーチ・ユーエスエイ・インコーポレーテッド | Automatic transcription system and method using two speech conversion instances and computer-assisted correction |
US6522725B2 (en) | 1997-12-05 | 2003-02-18 | Nec Corporation | Speech recognition system capable of flexibly changing speech recognizing function without deteriorating quality of recognition result |
JP2005284543A (en) * | 2004-03-29 | 2005-10-13 | Chugoku Electric Power Co Inc:The | Business support system and method |
JP2007072481A (en) * | 2006-11-20 | 2007-03-22 | Ricoh Co Ltd | Speech recognition system, speech recognizing method, and recording medium |
US7225134B2 (en) | 2000-06-20 | 2007-05-29 | Sharp Kabushiki Kaisha | Speech input communication system, user terminal and center system |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6522725B2 (en) | 1997-12-05 | 2003-02-18 | Nec Corporation | Speech recognition system capable of flexibly changing speech recognizing function without deteriorating quality of recognition result |
JP2002542501A (en) * | 1999-02-19 | 2002-12-10 | カスタム・スピーチ・ユーエスエイ・インコーポレーテッド | Automatic transcription system and method using two speech conversion instances and computer-assisted correction |
US7225134B2 (en) | 2000-06-20 | 2007-05-29 | Sharp Kabushiki Kaisha | Speech input communication system, user terminal and center system |
JP2005284543A (en) * | 2004-03-29 | 2005-10-13 | Chugoku Electric Power Co Inc:The | Business support system and method |
JP2007072481A (en) * | 2006-11-20 | 2007-03-22 | Ricoh Co Ltd | Speech recognition system, speech recognizing method, and recording medium |
JP4658022B2 (en) * | 2006-11-20 | 2011-03-23 | 株式会社リコー | Speech recognition system |
Also Published As
Publication number | Publication date |
---|---|
JP2655086B2 (en) | 1997-09-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5615296A (en) | Continuous speech recognition and voice response system and method to enable conversational dialogues with microprocessors | |
US7260529B1 (en) | Command insertion system and method for voice recognition applications | |
JPH06175682A (en) | Context changing system and method for speech recognition device | |
WO2001099096A1 (en) | Speech input communication system, user terminal and center system | |
JPH06214587A (en) | Predesignated word spotting subsystem and previous word spotting method | |
JP2010085536A (en) | Voice recognition system, voice recognition method, voice recognition client, and program | |
KR19980070329A (en) | Method and system for speaker independent recognition of user defined phrases | |
US8126703B2 (en) | Method, spoken dialog system, and telecommunications terminal device for multilingual speech output | |
US20060190268A1 (en) | Distributed language processing system and method of outputting intermediary signal thereof | |
JP2667408B2 (en) | Translation communication system | |
JP2655086B2 (en) | Telephone line voice input system | |
JPH07129594A (en) | Automatic interpretation system | |
JP2020113150A (en) | Voice translation interactive system | |
JP2009104047A (en) | Information processing method and information processing apparatus | |
JP3058125B2 (en) | Voice recognition device | |
JPH03132797A (en) | Voice recognition device | |
JP3526549B2 (en) | Speech recognition device, method and recording medium | |
JP3478171B2 (en) | Voice recognition device and voice recognition method | |
JP2000047684A (en) | Voice recognizing method and voice service device | |
Shimizu et al. | Development of client-server speech translation system on a multi-lingual speech communication platform | |
JP3136038B2 (en) | Interpreting device | |
JP4445371B2 (en) | Recognition vocabulary registration apparatus, speech recognition apparatus and method | |
Baggia | THE IMPACT OF STANDARDS ON TODAY’S SPEECH APPLICATIONS | |
JP3700743B2 (en) | Recording medium and character input device | |
JPH07219585A (en) | Processor and method for speech processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01 Effective date: 19970415 |
 | FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20090530 Year of fee payment: 12 |
 | FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20100530 Year of fee payment: 13 |
 | FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20110530 Year of fee payment: 14 |
 | FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20110530 Year of fee payment: 14 |
 | FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20120530 Year of fee payment: 15 |
 | FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20120530 Year of fee payment: 15 |
 | FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20130530 Year of fee payment: 16 |
 | FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20140530 Year of fee payment: 17 |
 | LAPS | Cancellation because of no payment of annual fees | |