WO2019194342A1

WO2019194342A1 - Mobile apparatus and method of providing similar word corresponding to input word

Info

Publication number: WO2019194342A1
Application number: PCT/KR2018/004622
Authority: WO
Inventors: Sang Hun Lee
Original assignee: PHILL IT CO Ltd
Current assignee: PHILL IT CO Ltd
Priority date: 2018-04-02
Filing date: 2018-04-20
Publication date: 2019-10-10
Anticipated expiration: 2020-10-02
Also published as: KR20190115320A

Abstract

Provided are a mobile apparatus and a method for rapidly retrieving and providing a similar word corresponding to an input word, the method comprising: determining a cluster including the input word among a plurality of clusters generated by clustering words on a vector space; retrieving a first similar word corresponding to the input word from the determined cluster based on a precalculated similarity between words included in the determined cluster; retrieving a second similar word corresponding to the input word from another cluster other than the determined cluster based on a similarity between the input word and words included in the another cluster; and providing a similar word corresponding to the input word based on the retrieved first similar word and the retrieved second similar word.

Description

MOBILE APPARATUS AND METHOD OF PROVIDING SIMILAR WORD CORRESPONDING TO INPUT WORD

One or more embodiments relate to mobile apparatuses and methods of providing a similar word corresponding to an input word.

As mobile apparatuses such as smart phones have become widespread, there is growing interest in methods that increase user convenience on mobile apparatuses.

A mobile apparatus may process a user's input in a predetermined manner to provide the output that the user desires, thereby increasing the user's satisfaction and making the product more user-friendly.

One or more embodiments include mobile apparatuses and methods of providing a similar word corresponding to an input word based on clustering of words on a vector space.

Additional aspects will be set forth in part in the description which follows and, in part, will be apparent from the description, or may be learned by practice of the presented embodiments.

According to one or more embodiments, a method of providing a similar word corresponding to an input word includes: determining a cluster including an input word among a plurality of clusters generated by clustering words on a vector space; retrieving a first similar word corresponding to the input word from the determined cluster based on a precalculated similarity between words included in the determined cluster; retrieving a second similar word corresponding to the input word from another cluster other than the determined cluster based on a similarity between the input word and words included in the another cluster; and providing a similar word corresponding to the input word based on the retrieved first similar word and the retrieved second similar word.

According to one or more embodiments, a mobile apparatus for providing a similar word corresponding to an input word includes: a user interface device; a memory storing a computer-executable instruction; and a processor executing the computer-executable instruction to determine a cluster including an input word input through the user interface device among a plurality of clusters generated by clustering words on a vector space, retrieve a first similar word corresponding to the input word from the determined cluster based on a precalculated similarity between words included in the determined cluster, retrieve a second similar word corresponding to the input word from another cluster other than the determined cluster based on a similarity between the input word and words included in the another cluster, and provide a similar word corresponding to the input word based on the retrieved first similar word and the retrieved second similar word.

According to one or more embodiments, a non-transitory computer-readable storage medium having stored therein processor-executable instructions includes: instructions for determining a cluster including an input word among a plurality of clusters generated by clustering words on a vector space; instructions for retrieving a first similar word corresponding to the input word from the determined cluster based on a precalculated similarity between words included in the determined cluster; instructions for retrieving a second similar word corresponding to the input word from another cluster other than the determined cluster based on a similarity between the input word and words included in the another cluster; and instructions for providing a similar word corresponding to the input word based on the retrieved first similar word and the retrieved second similar word.

These and/or other aspects will become apparent and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings in which:

FIG. 1 is a block diagram illustrating a configuration of a mobile apparatus for providing a similar word corresponding to an input word according to an embodiment;

FIG. 2 is a diagram illustrating a process of retrieving a similar word corresponding to an input word based on clustering of words on a vector space according to an embodiment;

FIG. 3 is a diagram illustrating a similarity matrix representing a precalculated similarity between words included in a cluster according to an embodiment;

FIG. 4 is a flowchart illustrating a method of providing a similar word corresponding to an input word according to an embodiment;

FIG. 5 is a detailed flowchart illustrating an operation of retrieving a first similar word and an operation of retrieving a second similar word, in case of retrieving a similar word based on a range corresponding to a similarity, according to an embodiment;

FIG. 6 is a diagram illustrating a process of retrieving a first similar word from a cluster including an input word, in case of retrieving a similar word based on a range corresponding to a similarity, according to an embodiment;

FIG. 7 is a diagram illustrating a process of retrieving a second similar word from another cluster other than a cluster including an input word, in case of retrieving a similar word based on a range corresponding to a similarity, according to an embodiment;

FIG. 8 is a detailed flowchart illustrating an operation of retrieving a first similar word and an operation of retrieving a second similar word, in case of retrieving a similar word based on a predetermined number according to a similarity order, according to an embodiment;

FIG. 9 is a diagram illustrating a process of retrieving a first similar word from a cluster including an input word, in case of retrieving a similar word based on a predetermined number according to a similarity order, according to an embodiment;

FIG. 10 is a diagram illustrating a process of retrieving a second similar word from another cluster other than a cluster including an input word, in case of retrieving a similar word based on a predetermined number according to a similarity order, according to an embodiment;

FIG. 11 is a diagram illustrating a program code of an algorithm for retrieving a similar word based on a range corresponding to a similarity, according to an embodiment; and

FIG. 12 is a diagram illustrating a program code of an algorithm for retrieving a similar word based on a predetermined number according to a similarity order, according to another embodiment.

Reference will now be made in detail to embodiments, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout. In this regard, the present embodiments may have different forms and should not be construed as being limited to the descriptions set forth herein. Accordingly, the embodiments are merely described below, by referring to the figures, to explain aspects of the present description. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items. Expressions such as "at least one of," when preceding a list of elements, modify the entire list of elements and do not modify the individual elements of the list.

Hereinafter, various embodiments will be described in detail with reference to the drawings. The embodiments described below may be modified and implemented in various different forms. In order to more clearly describe features of the embodiments, detailed descriptions of the details widely known to those of ordinary skill in the art will be omitted herein.

Meanwhile, in this specification, when something is referred to as "including" one or more components, it may further include one or more other components unless specified otherwise.

The present embodiments relate to mobile apparatuses and methods of providing a similar word corresponding to an input word, and detailed descriptions of the details widely known to those of ordinary skill in the art will be omitted herein.

FIG. 1 is a block diagram illustrating a configuration of a mobile apparatus 100 for providing a similar word corresponding to an input word according to an embodiment.

Referring to FIG. 1, the mobile apparatus 100 for providing a similar word corresponding to an input word according to an embodiment may include a memory 110, a processor 120, and a user interface device 130. Those of ordinary skill in the art may understand that other general-purpose components may be further included in addition to the components illustrated in FIG. 1.

The mobile apparatus 100 may be an electronic apparatus, such as a smart phone, a tablet PC, or a laptop computer, that runs an operating system (OS) and executes an installed application to display a processing result according to a user input. Here, "application" collectively refers to an application program or a mobile application. A user may select and execute any of the various types of applications installed in the mobile apparatus 100.

The mobile apparatus 100 may perform wired/wireless communication with another device or network. For this purpose, the mobile apparatus 100 may include a communication module that supports at least one of various wired/wireless communication methods. The mobile apparatus 100 may be connected to a server 200 through the communication module to transmit/receive signals or data thereto/therefrom. For example, the mobile apparatus 100 may receive a word embedding model from the server 200 according to an update period.

The memory 110 may store software and/or programs. For example, the memory 110 may store various types of data and programs such as an application and an application programming interface (API).

The processor 120 may access and use data stored in the memory 110 or may store new data in the memory 110. Also, the processor 120 may execute a program installed in the memory 110. Also, the processor 120 may install an application received from outside in the memory 110.

The processor 120 may include at least one processing module. The processor 120 may control other components included in the mobile apparatus 100 to perform an operation corresponding to a user input received through the user interface device 130.

The user interface device 130 may receive a user input or the like from the user. The user interface device 130 may display information such as an application execution result, a processing result corresponding to a user input, and a state of the mobile apparatus 100 in the mobile apparatus 100. The user interface device 130 may include hardware units for receiving an input from the user or providing an output from the mobile apparatus 100 and may include a dedicated software module for driving the hardware units. For example, the user interface device 130 may be a touch screen but is not limited thereto.

The memory 110 may store instructions executable by the processor 120. The processor 120 may execute the instructions stored in the memory 110. The processor 120 may execute the application installed in the mobile apparatus 100 according to a user input.

The processor 120 may determine a cluster including an input word input through the user interface device 130 among a plurality of clusters generated by clustering words on a vector space. The processor 120 may retrieve a first similar word corresponding to the input word from the cluster including the input word based on a precalculated similarity between words included in the cluster including the input word. The processor 120 may retrieve a second similar word corresponding to the input word from another cluster other than the cluster including the input word based on a similarity between the input word and words included in the another cluster other than the cluster including the input word. The processor 120 may provide a similar word corresponding to the input word based on the retrieved first similar word and the retrieved second similar word. The processor 120 may display the similar word corresponding to the input word through the user interface device 130 or may transmit the similar word to a processing module for performing another operation.

Meanwhile, the processor 120 may operate according to various methods of retrieving a similar word corresponding to an input word, which will be described below in detail.

FIG. 2 is a diagram illustrating a process of retrieving a similar word corresponding to an input word based on clustering of words on a vector space according to an embodiment.

Through word embedding, a process that represents each word as a vector, words may be treated as points on a vector space. In this case, a Word2Vec model may be used as the word embedding model. Meanwhile, vectors corresponding to predetermined word groups may be generated in advance by using the word embedding model and provided in a word-vector database.

Vectors of words similar to each other may be distributed at close positions on a vector space. In other words, words whose vectors are close to each other on a vector space may be similar words having a high similarity with respect to each other. Thus, a clustering operation may be performed to classify the vectors on the vector space into several groups based on the position distribution of the vectors or the similarity between words. By precalculating the similarities between the words included in each cluster, a similar word may be rapidly retrieved from the cluster, and memory usage may be greatly reduced.
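The grouping described above can be sketched as follows. This is a minimal illustration, not the method of the embodiments: the toy two-dimensional vectors, the Euclidean metric, the choice of "cat" and "car" as center words, and the function names are all assumptions introduced here, and the description does not specify a particular clustering algorithm.

```python
import math

# Toy 2-D word vectors (hypothetical); real embeddings have many more dimensions.
VECTORS = {
    "cat": (1.0, 1.0),
    "dog": (1.0, 2.0),
    "kitten": (2.0, 1.0),
    "car": (8.0, 8.0),
    "truck": (8.0, 9.0),
    "bus": (9.0, 8.0),
}

def assign_to_clusters(vectors, center_words):
    """Group every word under its nearest center word (assumed Euclidean metric)."""
    clusters = {center: [] for center in center_words}
    for word, vec in vectors.items():
        nearest = min(center_words, key=lambda c: math.dist(vec, vectors[c]))
        clusters[nearest].append(word)
    return clusters

clusters = assign_to_clusters(VECTORS, ["cat", "car"])
print(clusters)
# {'cat': ['cat', 'dog', 'kitten'], 'car': ['car', 'truck', 'bus']}
```

Words that lie close together end up in the same cluster, so the cluster of an input word is a natural first place to look for its similar words.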

Meanwhile, when there are a plurality of clusters generated by clustering words on a vector space, similar word retrieval for an input word may be performed in order, starting from the cluster including the input word and proceeding to the clusters other than the cluster including the input word, as illustrated in FIG. 2. Also, similar word retrieval may be skipped for a cluster expected to include words having an extremely low similarity with respect to the input word.

FIG. 3 is a diagram illustrating a similarity matrix representing a precalculated similarity between words included in a cluster according to an embodiment.

As described above with reference to FIG. 2, in order to retrieve a similar word with respect to an input word, a similarity with respect to the input word may be determined starting from the words included in a cluster including the input word. In this case, a similarity matrix may be used, which was generated in the process of clustering words on a vector space and represents a precalculated similarity between the words included in a cluster. FIG. 3 illustrates a similarity matrix having lower values for words with higher similarities and having higher values for words with lower similarities; however, the present disclosure is not limited thereto.
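A precalculated matrix of this kind might look like the sketch below. It assumes a Euclidean distance, so that lower values mean higher similarity as in FIG. 3; the function name `build_distance_matrix` and the sample vectors are hypothetical.

```python
import math

def build_distance_matrix(words, vectors):
    """Precompute pairwise distances for the words of one cluster.

    Lower values mean higher similarity, following the FIG. 3 convention.
    """
    return {
        w: {other: math.dist(vectors[w], vectors[other]) for other in words}
        for w in words
    }

# Hypothetical cluster of three words.
vectors = {"cat": (0.0, 0.0), "dog": (3.0, 4.0), "kitten": (0.0, 1.0)}
matrix = build_distance_matrix(list(vectors), vectors)

# At query time a similarity is a lookup, not a vector computation.
print(matrix["cat"]["kitten"])
```

Because the matrix is built once, when the clusters are generated, retrieval inside the input word's own cluster reduces to reading one row.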

FIG. 4 is a flowchart illustrating a method of providing a similar word corresponding to an input word according to an embodiment.

In operation S410, the mobile apparatus 100 may determine a cluster including an input word among a plurality of clusters generated by clustering words on a vector space.

In operation S420, the mobile apparatus 100 may retrieve a first similar word corresponding to the input word from the determined cluster based on a precalculated similarity between the words included in the cluster determined as the cluster including the input word.

In operation S430, the mobile apparatus 100 may retrieve a second similar word corresponding to the input word from another cluster, other than the cluster determined as the cluster including the input word, based on a similarity between the input word and the words included in the another cluster.

In operation S440, the mobile apparatus 100 may provide a similar word corresponding to the input word based on the retrieved first similar word and the retrieved second similar word.
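Operations S410 to S440 can be read as the skeleton below. It is only a sketch under assumptions: the dictionary layout of `clusters` and the two search callbacks `search_own` and `search_other` are hypothetical placeholders for the retrieval strategies detailed with reference to FIGS. 5 and 8.

```python
def provide_similar_words(input_word, clusters, search_own, search_other):
    """clusters maps a cluster id to its list of words (hypothetical layout)."""
    # S410: determine the cluster that includes the input word.
    own_id = next(cid for cid, words in clusters.items() if input_word in words)
    # S420: first similar words from that cluster.
    first = search_own(input_word, clusters[own_id])
    # S430: second similar words from every other cluster.
    second = []
    for cid, words in clusters.items():
        if cid != own_id:
            second.extend(search_other(input_word, words))
    # S440: provide the combined result.
    return first + second

# Usage with trivial stand-in search functions.
clusters = {"A": ["happy", "glad"], "B": ["cheerful", "car"]}
result = provide_similar_words(
    "happy",
    clusters,
    search_own=lambda w, words: [x for x in words if x != w],
    search_other=lambda w, words: [x for x in words if x == "cheerful"],
)
print(result)  # ['glad', 'cheerful']
```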

FIG. 5 is a detailed flowchart illustrating an operation of retrieving a first similar word and an operation of retrieving a second similar word, in case of retrieving a similar word based on a range corresponding to a similarity, according to an embodiment.

In operation S510, the mobile apparatus 100 may use a similarity matrix, which represents a precalculated similarity between the words included in a cluster, to retrieve the first similar word from among the words included in the cluster determined as the cluster including the input word. The first similar word is a word located within a predetermined range (allowance) from the input word, the range corresponding to a reference similarity for determining similarity.

FIG. 6 is a diagram illustrating a process of retrieving a first similar word from a cluster including an input word, in case of retrieving a similar word based on a range corresponding to a similarity, according to an embodiment.

Referring to FIG. 6, a word within a predetermined range corresponding to a reference similarity for determining similarity from an input word may be retrieved as a first similar word. In other words, a word that is in the same cluster as an input word and is within a predetermined range corresponding to a reference similarity for determining similarity may be determined as a first similar word corresponding to the input word.
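A range-based lookup in the input word's own cluster could be sketched as follows, assuming the precalculated matrix stores distances (lower is more similar). The matrix values, the word list, and the name `first_similar_in_range` are made up for illustration.

```python
def first_similar_in_range(input_word, matrix, allowance):
    """Return same-cluster words whose precalculated distance is within the allowance."""
    row = matrix[input_word]
    hits = [w for w, d in row.items() if w != input_word and d <= allowance]
    return sorted(hits, key=row.get)  # most similar first

# Hypothetical precalculated distances for one cluster.
matrix = {
    "happy":  {"happy": 0, "glad": 2, "joyful": 3, "calm": 6},
    "glad":   {"happy": 2, "glad": 0, "joyful": 4, "calm": 5},
    "joyful": {"happy": 3, "glad": 4, "joyful": 0, "calm": 7},
    "calm":   {"happy": 6, "glad": 5, "joyful": 7, "calm": 0},
}

first = first_similar_in_range("happy", matrix, allowance=4)
print(first)  # ['glad', 'joyful']
```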

Referring again to FIG. 5, in operation S520, the mobile apparatus 100 may retrieve the second similar word from another cluster other than the cluster determined as the cluster including the input word. A cluster is searched when the minimum distance from the input word to that cluster is within the predetermined range, and the second similar word is a word located within the predetermined range from the input word among the words included in that cluster.

FIG. 7 is a diagram illustrating a process of retrieving a second similar word from another cluster other than a cluster including an input word, in case of retrieving a similar word based on a range corresponding to a similarity, according to an embodiment.

FIG. 7 illustrates retrieving a second similar word according to whether a minimum distance from an input word to another cluster other than a cluster including the input word is within a predetermined range corresponding to a reference similarity for determining similarity. The minimum distance from the input word to another cluster other than the cluster including the input word may be calculated by using a triangle inequality, by subtracting the distance between the center word of another cluster and the word located at a maximum distance from the center word of another cluster from the distance between the input word and the center word of another cluster.
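The bound described above can be written out as follows. The symbols are notation introduced here for illustration: q is the input word, c the center word of the other cluster C, w any word in C, d the distance, and r_C the distance from c to the word of C farthest from it. The derivation assumes that d satisfies the triangle inequality.

```latex
d(q, c) \le d(q, w) + d(w, c) \le d(q, w) + r_C
\quad\Longrightarrow\quad
d(q, w) \ge d(q, c) - r_C \quad \text{for all } w \in C,
\qquad r_C = \max_{w \in C} d(c, w).
```

For instance, if d(q, c) = 10 and r_C = 3, no word of C can be closer to the input word than 7, so the cluster need not be searched whenever the allowed range is smaller than 7.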

When the minimum distance from the input word to another cluster other than the cluster including the input word exceeds a predetermined range corresponding to a reference similarity for determining similarity, the second similar word may not be retrieved from the corresponding cluster.

On the other hand, when the minimum distance from the input word to another cluster other than the cluster including the input word is within a predetermined range corresponding to a reference similarity for determining similarity, the second similar word may be retrieved from the corresponding cluster. In this case, when the distance between the input word and the center word of the another cluster is within the predetermined range, a similarity may be calculated with respect to all words included in the another cluster. When the distance between the input word and the center word of the another cluster exceeds the predetermined range, it is determined whether each word is located within the predetermined range, starting from the word located at a maximum distance from the center word of the another cluster (i.e., the word having a lowest similarity with respect to the center word) among the words included in the another cluster. When it is initially determined that all words located at a same distance from the center word are outside the predetermined range, the second similar word may not be retrieved with respect to all words located within the initially-determined distance from the center word.
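The pruning described above might be sketched as follows. Note the assumptions: the function name and data layout are hypothetical, distances are Euclidean, center distances are computed on the fly although a real system would read them from precalculated data, and the stopping test uses the triangle-inequality bound d(input, center) minus d(center, word). That bound is a conservative stand-in for the "all words at a same distance are outside the range" test of the description, not a literal transcription of it.

```python
import math

def search_other_cluster(query_vec, center_vec, members, allowance):
    """members: list of (word, vector) pairs of one cluster other than the input word's."""
    d_center = math.dist(query_vec, center_vec)
    # Order words from the farthest from the center word to the nearest.
    by_radius = sorted(
        ((math.dist(center_vec, vec), word, vec) for word, vec in members),
        reverse=True,
    )
    radius = by_radius[0][0] if by_radius else 0.0
    if d_center - radius > allowance:
        return []  # minimum distance to the cluster exceeds the range: skip it
    found = []
    for rho, word, vec in by_radius:
        if d_center - rho > allowance:
            break  # this word and every word nearer the center cannot qualify
        d = math.dist(query_vec, vec)
        if d <= allowance:
            found.append((word, d))
    return sorted(found, key=lambda item: item[1])

# Hypothetical cluster centered at (10, 0), queried from the origin.
members = [("a", (7.0, 0.0)), ("b", (10.0, 0.0)), ("c", (12.0, 0.0)), ("d", (10.0, 1.0))]
print(search_other_cluster((0.0, 0.0), (10.0, 0.0), members, allowance=8.0))
# [('a', 7.0)]
```

When the center word itself is within the range, the bound never triggers and every word of the cluster is checked, which matches the first case described above.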

FIG. 8 is a detailed flowchart illustrating an operation of retrieving a first similar word and an operation of retrieving a second similar word, in case of retrieving a similar word based on a predetermined number according to a similarity order, according to an embodiment.

In operation S810, the mobile apparatus 100 may retrieve the first similar word up to the order corresponding to a predetermined number in the similarity order with respect to the input word among the words included in the cluster determined as the cluster including the input word, by using a similarity matrix representing a precalculated similarity between the words included in a cluster. The mobile apparatus 100 may store a similarity corresponding to each of the first similar words in a priority queue.

FIG. 9 is a diagram illustrating a process of retrieving a first similar word from a cluster including an input word, in case of retrieving a similar word based on a predetermined number according to a similarity order, according to an embodiment.

Referring to FIG. 9, first similar words may be retrieved up to the order corresponding to a predetermined number in the similarity order with respect to the input word, from among the words included in the cluster including the input word. The predetermined number may be set by the user; in FIG. 9, it is '7'. Accordingly, the seven words having the highest similarity with respect to the input word among the words included in the cluster including the input word are determined as the first similar words. This result is obtained by repeating the following for each word included in the cluster: when the similarity of the current word with respect to the input word is higher than the lowest similarity in the priority queue, the similarity of the current word is inserted into the priority queue and the similarity of the word having the lowest similarity is deleted. The similarities corresponding respectively to the seven words determined as the first similar words may be stored in the priority queue in order from the highest-similarity word to the lowest-similarity word. Since the size of the priority queue is equal to the predetermined number, it may be '7'. In FIG. 9, the highest similarity stored in the priority queue is '2' and the lowest similarity is '8', which is consistent with the convention of FIG. 3 in which a lower value represents a higher similarity.
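The bounded priority queue described above can be sketched with Python's `heapq` module. This is an illustration under assumptions: distances are stored negated so that the least similar kept word sits at the top of the heap, and the row of precalculated values and the number 3 are made-up sample data, not the values of FIG. 9.

```python
import heapq

def top_k_in_cluster(input_word, row, k):
    """row maps each word of the cluster to its precalculated distance from input_word."""
    heap = []  # entries are (-distance, word); heap[0] is the least similar kept word
    for word, dist in row.items():
        if word == input_word:
            continue
        if len(heap) < k:
            heapq.heappush(heap, (-dist, word))
        elif dist < -heap[0][0]:
            # More similar than the current worst entry: replace that entry.
            heapq.heapreplace(heap, (-dist, word))
    return heap

# Hypothetical matrix row for the input word "happy" (lower is more similar).
row = {"happy": 0, "glad": 2, "joyful": 3, "calm": 6, "merry": 4, "sad": 9}
heap = top_k_in_cluster("happy", row, k=3)
print(sorted((-neg, word) for neg, word in heap))
# [(2, 'glad'), (3, 'joyful'), (4, 'merry')]
```

Keeping the worst retained entry at the top makes both the comparison and the replacement cheap, and that same entry later serves as the threshold for deciding whether other clusters are worth searching.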

Referring again to FIG. 8, in operation S820, the mobile apparatus 100 may retrieve the second similar word from another cluster whose minimum distance from the input word is within a range corresponding to the lowest similarity stored in the priority queue. The second similar word is a word that is included in the another cluster and is more similar to the input word than the lowest similarity stored in the current priority queue. The mobile apparatus 100 may update the current priority queue based on the retrieved second similar word.

FIG. 10 is a diagram illustrating a process of retrieving a second similar word from another cluster other than a cluster including an input word, in case of retrieving a similar word based on a predetermined number according to a similarity order, according to an embodiment.

When the minimum distance from the input word to another cluster other than the cluster including the input word exceeds a range corresponding to the lowest similarity stored in the priority queue, the second similar word may not be retrieved from the corresponding cluster. In this case, the minimum distance from the input word to another cluster other than the cluster including the input word may be calculated by using a triangle inequality.

On the other hand, when the minimum distance from the input word to another cluster other than the cluster including the input word is within a range corresponding to the lowest similarity stored in the priority queue, the second similar word may be retrieved from the corresponding cluster. In this case, the second similar word corresponding to the input word may be retrieved from a cluster having a small minimum distance from the input word. A word that is included in the another cluster and is more similar to the input word than the lowest similarity stored in the current priority queue may be retrieved as the second similar word, and the current priority queue may be updated based on the retrieved second similar word. To this end, it is determined whether each word is more similar to the input word than the lowest similarity stored in the current priority queue, starting from the word located at a maximum distance from the center word of the another cluster (i.e., the word having a lowest similarity with respect to the center word) among the words included in the another cluster. When it is initially determined that all words located at a same distance from the center word have a similarity lower than the lowest similarity stored in the current priority queue, the second similar word may not be retrieved with respect to all words located within the initially-determined distance from the center word.
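The queue update across other clusters might be sketched as below. The assumptions are the same as before and are worth restating: names and data are hypothetical, distances are Euclidean, the queue is a heap of negated distances that already holds the predetermined number of entries, clusters are visited in ascending order of their minimum distance, and the stopping tests use the triangle-inequality bound as a conservative stand-in for the rule in the description.

```python
import heapq
import math

def update_with_other_clusters(query_vec, heap, other_clusters):
    """heap: full queue of (-distance, word); other_clusters: list of (center_vec, members)."""
    bounds = []
    for center_vec, members in other_clusters:
        d_center = math.dist(query_vec, center_vec)
        ranked = sorted(
            ((math.dist(center_vec, vec), word, vec) for word, vec in members),
            reverse=True,  # farthest from the center word first
        )
        radius = ranked[0][0] if ranked else 0.0
        bounds.append((d_center - radius, d_center, ranked))
    bounds.sort(key=lambda b: b[0])  # nearest cluster first
    for min_dist, d_center, ranked in bounds:
        if min_dist >= -heap[0][0]:
            break  # this cluster and all later ones cannot beat the current worst entry
        for rho, word, vec in ranked:
            if d_center - rho >= -heap[0][0]:
                break  # words nearer the center word cannot beat the worst entry either
            d = math.dist(query_vec, vec)
            if d < -heap[0][0]:
                heapq.heapreplace(heap, (-d, word))
    return sorted((-neg, word) for neg, word in heap)

# Queue after searching the input word's own cluster (hypothetical values).
heap = [(-4.0, "merry"), (-2.0, "glad"), (-3.0, "joyful")]
heapq.heapify(heap)
near = ((5.0, 0.0), [("cheerful", (3.5, 0.0)), ("upbeat", (5.0, 0.0)), ("sunny", (6.0, 0.0))])
far = ((20.0, 0.0), [("car", (19.0, 0.0)), ("bus", (21.0, 0.0))])
result = update_with_other_clusters((0.0, 0.0), heap, [far, near])
print(result)  # [(2.0, 'glad'), (3.0, 'joyful'), (3.5, 'cheerful')]
```

In this example only one word of the nearer cluster has its distance to the input word computed, and the farther cluster is skipped entirely, because the threshold tightens as soon as "cheerful" replaces "merry".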

FIG. 11 is a diagram illustrating a program code of an algorithm for retrieving a similar word based on a range corresponding to a similarity, according to an embodiment.

The algorithm program code illustrated in each of FIGS. 11 and 12 is merely an example, and may be modified according to the above descriptions and is not limited thereto.

The above embodiments of the method of providing a similar word corresponding to an input word may be provided in the form of a computer program or application stored in a computer-readable storage medium to cause a computer to perform the method of providing a similar word corresponding to an input word.

The above embodiments may be implemented in the form of a computer-readable recording medium storing a computer-executable instruction and data. At least one of the instruction and the data may be stored in the form of program code and may, when executed by a processor, generate a predetermined program module to perform a predetermined operation. The computer-readable recording medium may be Read-Only Memory (ROM), Random-Access Memory (RAM), flash memories, Compact Disk Read-Only Memory (CD-ROM), Compact Disk Recordable (CD-R), CD+R, Compact Disk Rewritable (CD-RW), CD+RW, Digital Versatile Disk Read-Only Memory (DVD-ROM), Digital Versatile Disk Recordable (DVD-R), DVD+R, Digital Versatile Disk Rewritable (DVD-RW), DVD+RW, Digital Versatile Disk Random-Access Memory (DVD-RAM), Blu-ray Disk Read-Only Memory (BD-ROM), Blu-ray Disk Recordable (BD-R), Blu-ray Disk Recordable Low to High (BD-R LTH), Blu-ray Disk Recordable Erasable (BD-RE), magnetic tapes, floppy disks, magneto-optical data storages, optical data storages, hard disks, Solid-State Disk (SSD), or any device that may store instructions or software, related data, data files, and data structures and may provide instructions or software, related data, data files, and data structures to a processor or computer to enable the processor or computer to execute instructions.

The present disclosure has been described above with reference to the embodiments. However, those of ordinary skill in the art will understand that various changes in form and details may be made therein without departing from the spirit and scope of the present disclosure as defined by the appended claims. Therefore, the described embodiments should be considered in descriptive sense only and not for purposes of limitation. Thus, the scope of the present disclosure may be defined not by the above descriptions but by the appended claims, and all differences within the equivalent scope thereof will be construed as being included in the scope of the present disclosure.

It should be understood that embodiments described herein should be considered in a descriptive sense only and not for purposes of limitation. Descriptions of features or aspects within each embodiment should typically be considered as available for other similar features or aspects in other embodiments.

While one or more embodiments have been described with reference to the figures, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the following claims.

Claims

A method of providing a similar word corresponding to an input word, the method comprising:

determining a cluster including an input word among a plurality of clusters generated by clustering words on a vector space;

retrieving a first similar word corresponding to the input word from the determined cluster based on a precalculated similarity between words included in the determined cluster;

retrieving a second similar word corresponding to the input word from another cluster other than the determined cluster based on a similarity between the input word and words included in the another cluster; and

providing a similar word corresponding to the input word based on the retrieved first similar word and the retrieved second similar word.
The method of claim 1, wherein in case of retrieving a similar word based on a range corresponding to a similarity,

the retrieving of the first similar word retrieves the first similar word within a predetermined range corresponding to a reference similarity for determining similarity from the input word among the words included in the determined cluster, by using a similarity matrix representing a precalculated similarity between words included in a cluster, and

the retrieving of the second similar word retrieves the second similar word within the predetermined range from the input word among the words included in the another cluster within the predetermined range of a minimum distance from the input word to the another cluster.
The method of claim 2, wherein the retrieving of the second similar word determines, starting from a word located at a maximum distance from a center word of the another cluster among the words included in the another cluster, whether each word is located within the predetermined range, and when it is initially determined that all words located at a same distance from the center word are outside the predetermined range, does not retrieve the second similar word with respect to all words located within the initially-determined distance from the center word.
The method of claim 1, wherein, when a similar word is retrieved based on a predetermined number according to a similarity order,

the retrieving of the first similar word comprises retrieving, from among the words included in the determined cluster, first similar words up to the predetermined number in order of similarity to the input word, by using a similarity matrix representing a precalculated similarity between words included in a cluster, and storing a similarity corresponding to each of the first similar words in a priority queue, and

the retrieving of the second similar word comprises retrieving, from among the words included in the other cluster, the second similar word that is more similar to the input word than a lowest similarity stored in a current priority queue, when a minimum distance from the input word to the other cluster is within a range corresponding to the lowest similarity stored in the priority queue, and updating the current priority queue based on the retrieved second similar word.
The method of claim 4, wherein the retrieving of the second similar word comprises determining, starting from a word located at a maximum distance from a center word of the other cluster among the words included in the other cluster, whether each word is more similar to the input word than the lowest similarity stored in the current priority queue, and, when it is first determined that all words located at a same distance from the center word have a similarity lower than the lowest similarity stored in the current priority queue, not retrieving the second similar word from any word located within that distance from the center word.
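The retrieval of a predetermined number of similar words with a priority queue may be sketched as follows. This is an illustrative sketch only: it assumes Euclidean distance as the inverse of similarity, computes the in-cluster distances directly rather than reading them from a precalculated similarity matrix, assumes every cluster is non-empty, and omits the outermost-first scan. The names `top_k_similar`, `centers`, and `offer` are hypothetical.

```python
import heapq
import math


def top_k_similar(query_word, vectors, clusters, centers, k):
    """Return up to k words most similar to `query_word`, nearest first."""
    q = vectors[query_word]
    home = next(c for c, ws in clusters.items() if query_word in ws)

    # Entries are (-distance, word), so heap[0] is always the least
    # similar word currently kept (the "lowest similarity" in the queue).
    heap = []

    def offer(word):
        d = math.dist(q, vectors[word])
        if len(heap) < k:
            heapq.heappush(heap, (-d, word))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, word))  # evict the least similar

    # First similar words: candidates from the input word's own cluster.
    for w in clusters[home]:
        if w != query_word:
            offer(w)

    # Second similar words: candidates from the other clusters.
    for c, ws in clusters.items():
        if c == home:
            continue
        radius = max(math.dist(centers[c], vectors[w]) for w in ws)
        min_dist = max(0.0, math.dist(q, centers[c]) - radius)
        # Skip a cluster that cannot beat the lowest similarity in the queue.
        if len(heap) == k and min_dist >= -heap[0][0]:
            continue
        for w in ws:
            offer(w)

    return [w for _, w in sorted(heap, reverse=True)]


vectors = {"cat": (0, 0), "dog": (1, 0), "cow": (3, 0),
           "car": (2, 0), "bus": (4, 0), "sun": (20, 0), "moon": (22, 0)}
clusters = {0: ["cat", "dog", "cow"], 1: ["car", "bus"], 2: ["sun", "moon"]}
centers = {0: (1, 0), 1: (3, 0), 2: (21, 0)}
print(top_k_similar("cat", vectors, clusters, centers, 2))
```

In the usage example, the queue is first filled from the input word's own cluster, one entry is then replaced by a closer word from the second cluster, and the third cluster is skipped without examining any of its words.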
A mobile apparatus for providing a similar word corresponding to an input word, the mobile apparatus comprising:

a user interface device;

a memory storing a computer-executable instruction; and

a processor executing the computer-executable instruction to determine a cluster including the input word, which is input through the user interface device, among a plurality of clusters generated by clustering words on a vector space, retrieve a first similar word corresponding to the input word from the determined cluster based on a precalculated similarity between words included in the determined cluster, retrieve a second similar word corresponding to the input word from another cluster other than the determined cluster based on a similarity between the input word and words included in the other cluster, and provide a similar word corresponding to the input word based on the retrieved first similar word and the retrieved second similar word.
The mobile apparatus of claim 6, wherein, when a similar word is retrieved based on a range corresponding to a similarity, the processor

retrieves, from among the words included in the determined cluster, the first similar word located within a predetermined range of the input word, the predetermined range corresponding to a reference similarity for determining similarity, by using a similarity matrix representing a precalculated similarity between words included in a cluster, and

retrieves, from among the words included in the other cluster, the second similar word located within the predetermined range of the input word, when a minimum distance from the input word to the other cluster is within the predetermined range.
The mobile apparatus of claim 7, wherein the processor determines, starting from a word located at a maximum distance from a center word of the other cluster among the words included in the other cluster, whether each word is located within the predetermined range, and, when it is first determined that all words located at a same distance from the center word are outside the predetermined range, does not retrieve the second similar word from any word located within that distance from the center word.
The mobile apparatus of claim 6, wherein, when a similar word is retrieved based on a predetermined number according to a similarity order, the processor

retrieves, from among the words included in the determined cluster, first similar words up to the predetermined number in order of similarity to the input word, by using a similarity matrix representing a precalculated similarity between words included in a cluster, and stores a similarity corresponding to each of the first similar words in a priority queue, and

retrieves, from among the words included in the other cluster, the second similar word that is more similar to the input word than a lowest similarity stored in a current priority queue, when a minimum distance from the input word to the other cluster is within a range corresponding to the lowest similarity stored in the priority queue, and updates the current priority queue based on the retrieved second similar word.
The mobile apparatus of claim 9, wherein the processor determines, starting from a word located at a maximum distance from a center word of the other cluster among the words included in the other cluster, whether each word is more similar to the input word than the lowest similarity stored in the current priority queue, and, when it is first determined that all words located at a same distance from the center word have a similarity lower than the lowest similarity stored in the current priority queue, does not retrieve the second similar word from any word located within that distance from the center word.
A non-transitory computer-readable storage medium having stored therein processor-executable instructions comprising:

instructions for determining a cluster including an input word among a plurality of clusters generated by clustering words on a vector space;

instructions for retrieving a first similar word corresponding to the input word from the determined cluster based on a precalculated similarity between words included in the determined cluster;

instructions for retrieving a second similar word corresponding to the input word from another cluster other than the determined cluster based on a similarity between the input word and words included in the other cluster; and

instructions for providing a similar word corresponding to the input word based on the retrieved first similar word and the retrieved second similar word.