AU2005201818A1 - Method for Navigating and Displaying Complex Information - Google Patents
- Publication number
- AU2005201818A1
- Authority
- AU
- Australia
- Prior art keywords
- data
- visualization
- depth
- column
- attributes
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- User Interface Of Digital Computer (AREA)
Description
S&F Ref: 707044
AUSTRALIA
PATENTS ACT 1990
COMPLETE SPECIFICATION FOR A STANDARD PATENT
Name and Address of Applicant: Canon Kabushiki Kaisha, of 30-2, Shimomaruko 3-chome, Ohta-ku, Tokyo, 146, Japan
Actual Inventor(s): Geoffrey Charron Field Evelene Choi Chi Ma Alexander Will
Address for Service: Spruson Ferguson St Martins Tower Level 31 Market Street Sydney NSW 2000 (CCN 3710000177)
Invention Title: Method for Navigating and Displaying Complex Information
The following statement is a full description of this invention, including the best method of performing it known to me/us:
METHOD FOR NAVIGATING AND DISPLAYING COMPLEX INFORMATION
Field of the Invention
The present invention relates to the presentation of data in a computer environment and in particular to the presentation of data by means of a non-hierarchical visualization and the navigation through the non-hierarchical visualizations.
Background
Databases in use today are becoming more complicated, as both the amount and the complexity of the data being stored are rapidly increasing.
As such, when a user would like to browse through databases, the amount and complexity of the data involved is usually significant. This makes the task of presenting the entire data content to a user in an understandable and timely manner more difficult.
Existing arrangements exemplify that the presentation task can be made considerably easier by employing a navigation task to reduce the amount of data to be presented. The visualization presented to the user can be changed so that it displays less data. A means of navigation can also be introduced to access the data currently not displayed. There are currently a number of general approaches which have been employed in order to accomplish the presentation of complex data in this way.
The first approach used is the visualization of complex data by means of a hierarchical technique. This is mainly used in the visualization of data obtained from hierarchical databases, but can be extended to relational databases. The hierarchical techniques include basic tree visualizations, cone-tree visualizations, tree-map visualizations and hyperbolic tree visualizations, to mention a few. These types of visualization are very good at illustrating the structural relationships within the databases and between individual data items. However, the ultimate aim for the majority of users is to identify patterns within the data content of a group of items or associations between the data content of groups of items. This type of visualization fails to achieve this aim due to the following issues:
(1a) Users often find the hierarchical visualizations difficult to comprehend. Given their structure and complexity, users often find it difficult to quickly and easily obtain useful information about the data content of both items and groups of items.
Further, it is undesirable for the user to miss identifying associations between the data content of both items and groups of items because, if such has occurred, the user has incorrectly interpreted the visualization or perhaps worse, the user may be misled into believing associations exist where they do not.
(1b) Data sources are increasing in size, making it difficult to present an entire data source in a simple, GUI display-sized hierarchical visualization. Techniques have been adopted to address this need, such as fish-eye viewing techniques, hyperbolic space techniques, zooming, scrolling and so forth, or combinations thereof. However, these often just add to the level of complexity in the visualization, making it even harder for the average user to comprehend and thus make use of the data content held within the visualization.
(1c) Navigating around the data items within this visualization is often a tedious and time consuming process for the user. The way that the user navigates around the visualization is usually via a GUI which allows for the entire data content to be examined by simply clicking or double clicking on the item of interest, which changes the focus and visualization as appropriate. The main limitation with this navigation process is that finding the actual data content of the item of interest is tedious and time consuming, especially if the item being searched is hierarchically distant, for example if the present focus is upon a node at one side of an expression tree and the navigation takes the focus to the other side of the expression tree. This is due to the required steps of scrolling, zooming and so forth (visualization variant and data source dependent) and clicking on nodes until the data content in the item of interest can be sighted. The reason for this is that in all variants of this visualization, the data content can often not be seen until the item currently under scrutiny is close to, adjacent to or even the exact item of interest. The navigation process problem is compounded when attempting to find the data content of a group of items, as it is identical to finding the data content of a single item multiple times. To add further complication, a comparison of the data content for associated items or groups of items is equivalent to performing multiple finds, remembering the data content discovered for each, and mentally performing the comparison in question.
The second approach used is the visualization of complex data by means of a non-hierarchical technique such as a table, bar graphs, pie graphs, plots and so forth.
This is mainly used in the visualization of data obtained from relational databases.
Various attempts have also been made at applying this technique to data obtained from hierarchical databases. This solution is better to use in terms of presenting the information to an average user so that it can be easily interpreted by the user as they are more familiar with these types of visualization and can easily identify patterns within the data content of a group of items or associations between the data content of groups of items presented. However, there still exist a number of significant issues making conventional approaches to non-hierarchical visualizations of complex data problematic.
These issues include the following:
(2a) Navigation around the data items, sourced from relational or flattened hierarchical data sources, is not really treated within these visualization types. At best, for a tabular visualization, the entire data retrieved is displayed and then the user must zoom or scroll around the entire data content of the visualization. For another type of non-hierarchical visualization, at best, a selection of all the data sourced could be made to produce the relevant graph or chart.
(2b) This form of visualization tends to produce numerous repetitions of the data content in order to achieve the embedded 1:n (one to many) relationships. There is currently no way of limiting a non-hierarchical visualization so as to only display those repetitions necessary for the limits placed on the visualization. This not only results in numerous superfluous duplications of the data to be displayed, but further adds to the level of unnecessary complexity presented to the user.
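The repetition described in (2b) can be illustrated with a minimal sketch. The record layout and the names `flatten`, `customer`, `orders` and `item` are hypothetical and chosen only for illustration; they do not come from the specification.

```python
def flatten(parents, child_key):
    """Flatten parent records holding a one-to-many child list into table rows."""
    rows = []
    for parent in parents:
        # Copy every parent attribute except the nested child list.
        base = {k: v for k, v in parent.items() if k != child_key}
        for child in parent[child_key]:
            # Each child row carries its own copy of the parent attributes.
            rows.append({**base, **child})
    return rows


customers = [
    {"customer": "Ann", "orders": [{"item": "pen"}, {"item": "ink"}, {"item": "pad"}]},
    {"customer": "Bob", "orders": [{"item": "pen"}]},
]
rows = flatten(customers, "orders")
ann_repeats = sum(1 for row in rows if row["customer"] == "Ann")
```

Here the single parent value "Ann" is repeated once per child row, which is the kind of duplication a conventional tabular view of one-to-many data would display.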
A third approach used is to employ a combination of the two approaches above.
This is mainly used in the visualization of data obtained from hierarchical databases, where the visualization of complex data itself is achieved by means of a non-hierarchical technique such as a table, bar graphs, pie graphs, plots and so forth, with an extension to use a small hierarchical visualization for navigational purposes. The small hierarchical visualization allows the user to select data which limits the data content used in the non-hierarchical visualization. Here, like the second approach, the advantage is that it is a more ideal solution to use in terms of presenting the information to a novice user so that it can be easily interpreted. Novice users are more familiar with these types of visualization and can easily identify patterns within the data content of a group of items or associations between the data content of groups of items presented. However, there still exist a number of significant issues making this approach to non-hierarchical visualizations of complex data problematic. These issues include the following:
(3a) Navigation around the data items, sourced from relational or flattened hierarchical data sources, has been accomplished by means of a hierarchical navigation bar, but is still limited. At best, for a tabular visualization, the selected data groups from the data sourced are displayed, but they are not necessarily related and may still require some form of zoom or scroll. For another type of non-hierarchical visualization, at worst, empty graphs or charts may be produced by novice users.
(3b) This form of visualization and navigation tends to produce numerous repetitions of the data content in order to achieve the embedded 1 to n (one to many) relationships. There is currently no way of limiting a non-hierarchical visualization so as to only display those repetitions necessary for the limits placed on the visualization.
This not only results in numerous superfluous duplications of the data to be displayed, but further adds to the level of unnecessary complexity presented to the user.
Numerous extensions to this problem set have been touched on, but mostly in theoretical terms with the data abstraction space and visual abstraction space.
Consequently, there is a need for a way of presenting data that is easily understandable and easy to navigate.
Summary
It is an object of the present invention to substantially overcome or at least ameliorate one or more problems of existing arrangements.
In accordance with one aspect of the present disclosure, there is provided a method of displaying data from at least one data source, said method comprising the steps of:
(i) selecting data attributes desired to be duplicated from said data sources for display, wherein said data attributes have values in a one to many relationship;
(ii) determining unique values in the data from said data sources for each said selected data attribute by removing redundant representations of the data; and
(iii) displaying those data values determined to be unique.
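The three steps above can be sketched roughly as follows. This is an illustrative reading only, assuming the data has already been flattened into rows of attribute/value pairs; the function name `unique_values` and the sample attributes are hypothetical and are not the algorithm of Fig. 12.

```python
def unique_values(rows, selected_attributes):
    """For each selected attribute, collect its distinct values in first-seen order."""
    result = {}
    for attr in selected_attributes:
        seen = []
        for row in rows:
            value = row.get(attr)
            # Skip missing values and values already recorded (redundant repeats).
            if value is not None and value not in seen:
                seen.append(value)
        result[attr] = seen
    return result


# Flattened rows in which "customer" repeats because of a one-to-many relationship.
rows = [
    {"customer": "Ann", "item": "pen"},
    {"customer": "Ann", "item": "ink"},
    {"customer": "Bob", "item": "pen"},
]
unique = unique_values(rows, ["customer", "item"])

# "Displaying" is reduced here to building one text line per attribute.
display_lines = [f"{attr}: {', '.join(values)}" for attr, values in unique.items()]
```

A real display step would hand `unique` to a table or chart renderer rather than joining strings.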
In accordance with another aspect of the present disclosure, there is provided a method for displaying data from at least one data source, said method comprising the steps of:
analysing the data sources to identify attributes thereof that are related by a one-to-many relationship;
assigning a measure to each of the identified attributes according to the one-to-many relationship therebetween;
forming a display representation including at least one selectable indicator of the assigned measures; and
for at least one selected measure, forming a corresponding representation of the corresponding ones of said attributes.
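One way to picture this second method is sketched below, under the assumption that the "measure" is a nesting depth (the drawings refer to column depth data and to choices of depths 0, 1 and 2). The nested-dictionary input and the names `assign_depths` and `attributes_at` are hypothetical and are not taken from the flowcharts.

```python
def assign_depths(record, depth=0, depths=None):
    """Map each attribute name to the nesting depth at which it first occurs."""
    if depths is None:
        depths = {}
    for key, value in record.items():
        if isinstance(value, list):
            # A list of child records is treated as a one-to-many relationship,
            # so the children's attributes sit one level deeper.
            for child in value:
                assign_depths(child, depth + 1, depths)
        else:
            depths.setdefault(key, depth)
    return depths


def attributes_at(depths, selected):
    """Return the attributes whose depth is among the selected depths."""
    return [name for name, d in depths.items() if d in selected]


record = {
    "customer": "Ann",
    "orders": [{"item": "pen", "parts": [{"part": "cap"}, {"part": "nib"}]}],
}
depths = assign_depths(record)
shown_0_1 = attributes_at(depths, {0, 1})
shown_0_2 = attributes_at(depths, {0, 2})
```

In this sketch the distinct depth values would back the selectable indicators, and choosing a set of depths determines which attributes are represented.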
Specific implementations of these methods permit convenient navigation through data sources across different levels thereof without prior knowledge of the contents or structure of the data source.
Brief Description of the Drawings
At least one embodiment of the present invention will now be described with reference to the drawings in which:
Fig. 1 is a schematic block diagram of a general purpose computer upon which the arrangements described can be practiced;
Fig. 2 illustrates a configuration where heterogeneous data sources are available both from a local network, and over the Internet;
Fig. 3 shows components of a system for presenting complex data and navigating through the presentation;
Fig. 4 illustrates an example of a typical workflow the system may follow in order to present complex data and navigate through the presentation;
Fig. 5 illustrates an example of a typical workflow the system may follow in order to store incoming data;
Fig. 6 is a flowchart of an algorithm followed by the data storage component in order to populate the structure data;
Fig. 7A is a flowchart of the first part of an algorithm followed by the data storage component in order to populate the data store;
Fig. 7B is a flowchart of the second part of an algorithm followed by the data storage component in order to populate the data store;
Fig. 8 is a flowchart of an algorithm followed by the data storage component in order to determine the relationship between two structure nodes;
Fig. 9 illustrates an example of a typical workflow the system may follow in order to analyse the information stored by the data storage component;
Fig. 10 is a flowchart of an algorithm followed by the data analysis component in order to generate the entire flat data, the type data, the column depth data and the interim data;
Fig. 11 is a flowchart of an algorithm followed by the data analysis component in order to generate a row of the flat data, the type data, the column depth data and the interim data;
Fig. 12 is a flowchart of an algorithm followed by the data analysis component in order to generate the unique data, the intersect data and the row depth data;
Fig. 13 is a flowchart of an algorithm followed by the data analysis component in order to generate the proportion data;
Fig. 14 provides an example of hierarchical data (XML format) from which presentation and navigation information can be constructed;
Fig. 15 provides an illustration of the hierarchical data (XML result tree format) created from that in Fig. 14, from which presentation and navigation information can be constructed;
Fig. 16A provides an illustration of the structure data constructed by executing the algorithm shown in Fig. 6 on the XML result tree in Fig.
Fig. 16B provides an illustration of the data store constructed by executing the algorithm shown in Figs. 7A and 7B on the XML result tree and structure data shown in Figs. 15 and 16A;
Figs. 17A to 17D provide an illustration of the four stages of construction of the flat data by executing the algorithm shown in Figs. 10 and 11 on the data store shown in Fig. 16B;
Figs. 18A to 18D provide an illustration of the four stages of construction of the interim data by executing the algorithm shown in Figs. 10 and 11 on the data store shown in Fig. 16B;
Fig. 19A provides an illustration of the type data constructed by executing the algorithm shown in Figs. 10 and 11 on the data store shown in Fig. 16B;
Fig. 19B provides an illustration of the column depth data constructed by executing the algorithm shown in Figs. 10 and 11 on the data store shown in Fig. 16B;
Fig. 20 provides a slightly more complete example of hierarchical data (XML format) for which presentation and navigation information is constructed;
Fig. 21A provides an illustration of the flat data constructed by executing the algorithm shown in Figs. 10 and 11 on the data store created from the XML data shown in Fig.
Fig. 21B provides an illustration of the interim data constructed by executing the algorithm shown in Figs. 10 and 11 on the data store data created from the XML data shown in Fig.
Fig. 21C provides an illustration of the type data constructed by executing the algorithm shown in Figs. 10 and 11 on the data store data created from the XML data shown in Fig.
Fig. 21D provides an illustration of the column depth data constructed by executing the algorithm shown in Figs. 10 and 11 on the data store data created from the XML data shown in Fig.
Fig. 22A provides an illustration of the row depth data constructed by executing the algorithm shown in Fig. 12 on the interim data and column depth data shown in Figs. 21B and 21D respectively;
Fig. 22B provides an illustration of the unique data constructed by executing the algorithm shown in Fig. 12 on the interim data shown in Fig. 21B;
Fig. 22C provides an illustration of the intersect data constructed by executing the algorithm shown in Fig. 12 on the interim data shown in Fig. 21B;
Fig. 23 provides an illustration of the proportion data constructed using depths 0 and 1, and executing the algorithm shown in Fig. 13 on the column depth data, unique data and intersect data shown in Figs. 21D, 22B and 22C respectively;
Fig. 24 provides an illustration of the visualization order data constructed using depths 0 and 1, the type data and proportion data shown in Figs. 21C and 23 respectively;
Fig. 25 illustrates one possible graphical display used to change the type of non-hierarchical visualization used and its construction;
Fig. 26 illustrates another possible graphical display used to change the type of non-hierarchical visualization used and its construction;
Fig. 27 is intentionally absent;
Figs. 28A to 28E illustrate one possible graphical display used to control navigation, showing a choice of depths: 0 and 1; 0; 1 and 2; 0, 1 and 2; and 0 and 2 respectively, using the data from Fig. 21D;
Fig. 29 illustrates another possible graphical display used to control navigation, showing a choice of depths 0 and 1, and using the data from Fig. 21D;
Fig. 30 illustrates another possible graphical display used to control navigation, showing a choice of depths 0 and 1, and using the data from Fig. 21D;
Fig. 31 illustrates another possible graphical display used to control navigation, showing a choice of depths 0 and 1, and using the data from Fig. 21D;
Fig. 32 illustrates another possible graphical display used to control navigation, showing a choice of depths 0 and 1, and using the data from Fig. 21D;
Figs. 33A to 33E illustrate examples of tabular visualizations created from the flat data, column depth data and row depth data shown in Figs. 21A, 21D and 22A respectively, using the depths: 0 and 1; 0 only; 1 and 2; 0, 1 and 2; and 0 and 2 respectively;
Figs. 34A to 34C illustrate an example and two alternate examples of plot visualizations created from the flat data, column depth data and row depth data shown in Figs. 21A, 21D and 22A respectively, using the depths 0 and 1;
Fig. 35 illustrates an example of a bar visualization created from the flat data, column depth data and row depth data shown in Figs. 21A, 21D and 22A respectively, using the depths 0 and 1;
Figs. 36A and 36B illustrate an example and an alternate example of pie visualizations created from the flat data, column depth data and row depth data shown in Figs. 21A, 21D and 22A respectively, using the depths 0 and 1; and
Figs. 37A and 37B provide an example of using additional indicators to view information from a one-to-many data source.
Detailed Description including Best Mode
1. Overview
The arrangements described herein are intended to provide the user with a means of easily visualizing and navigating through the entire content of complex data. The content is displayed as a non-hierarchical visualization (e.g. tabular, bar, pie, plot etc.), with the system being capable of being extended to include other visualizations. The content is easily navigated by exploiting the relationships existing within the complex data, such that the amount of data viewed is limited by the number of relationships existing within the complex data to be viewed. This provides a powerful extension to previously existing applications of its kind.
Fig. 2 shows a system 200 for obtaining complex data to be presented and navigating through manipulation of a client computer system 202. The client computer 202 executes a software application which presents the complex data to a user through a graphical user interface via a display device (not seen in Fig.). The user can navigate through the presentation by manipulating the graphical user interface via the display device in order to view the entire contents of the complex data. Such complex data is obtained from the data sources, which may be available over a local network connection 212, shown as data sources 204 and 206, or from an external network, such as the Internet connection 214, shown as data sources 208 and 210. A data source may be either a data file, or a computer server capable of receiving requests and responding with the requested information. As a consequence of actions taken by the user, the software on the client computer system 202 issues a structured query to the data sources, from which the results are obtained. In the described implementations, the Extensible Markup Language (XML) format will be considered as a preferred format for the results, and the W3C XQuery language shall be considered as a preferred language for the structured query. The XQuery language provides constructs which permit the specification of the location of data sources, and the extraction and collation of parts of XML documents served by such data sources. The location of a data source is specified using the "doc" primitive, which takes as a parameter a Uniform Resource Identifier (URI). Those skilled in the art can see that this can easily be applied to any other form of data source with an appropriate means of querying the data source.
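As an illustration of how a query might name its data source with the "doc" primitive, the sketch below builds a simple XQuery string. The helper name `build_xquery`, the example URI and the path `//book` are hypothetical, and the sketch assumes the URI contains no quote or ampersand characters that would need escaping inside an XQuery string literal.

```python
def build_xquery(source_uri, path):
    """Build a simple XQuery selecting `path` from the document at `source_uri`."""
    # doc("...") identifies the data source by its URI; the path expression
    # that follows selects the parts of the XML document to return.
    return f'for $n in doc("{source_uri}"){path} return $n'


query = build_xquery("http://example.com/catalog.xml", "//book")
```

The resulting string would then be handed to whatever XQuery processor or data source the client uses; that step is outside this sketch.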
An exemplary implementation of the client computer system 202 is shown in Fig. 1 wherein the processes of Figs. 3 to 37C may be implemented as software, such as an application program 300 executing within the computer system 202. In particular, the steps of data source visualisation are effected by instructions in the software application 300 that are carried out by the computer. The instructions may be formed as one or more code modules, each for performing one or more particular tasks. The software application 300 may also be divided into separate parts, in which one part performs the visualisation methods and another part manages a user interface between the first part and the user. The software may be stored in a computer readable medium, including the storage devices described below, for example. The software is loaded into the computer from the computer readable medium, and then executed by the computer. A computer readable medium having such software or computer program recorded on it is a computer program product. The use of the computer program product in the computer preferably effects an advantageous apparatus for visualising data sources. Arrangements structurally and to a point functionally equivalent to the computer system 202 may be used, where appropriate, to implement one or more of the data sources 204-210, as would be understood by skilled persons. It will be understood that the software application 300 for data visualisation need not be implemented upon the data sources 204-210 unless those sources were clients for other sources.
The computer system 202 is formed by a computer module 101, input devices such as a keyboard 102 and mouse 103, output devices including a printer 115, a display device 114 and loudspeakers 117. A Modulator-Demodulator (Modem) transceiver device 116 is used by the computer module 101 for communicating to and from a communications network 212, 214, for example connectable via a telephone line 121 or other functional medium. The modem 116 can be used to obtain access to the Internet, and other network systems, such as a Local Area Network (LAN) or a Wide Area Network (WAN), and may be incorporated into the computer module 101 in some implementations.
The computer module 101 typically includes at least one processor unit 105, and a memory unit 106, for example formed from semiconductor random access memory (RAM) and read only memory (ROM). The module 101 also includes a number of input/output (I/O) interfaces including an audio-video interface 107 that couples to the video display 114 and loudspeakers 117, an I/O interface 113 for the keyboard 102 and mouse 103 and optionally a joystick (not illustrated), and an interface 108 for the modem 116 and printer 115. In some implementations, the modem 116 may be incorporated within the computer module 101, for example within the interface 108. A storage device 109 is provided and typically includes a hard disk drive 110 and a floppy disk drive 111. A magnetic tape drive (not illustrated) may also be used. A CD-ROM drive 112 is typically provided as a non-volatile source of data. The components 105 to 113 of the computer module 101 typically communicate via an interconnected bus 104 and in a manner which results in a conventional mode of operation of the computer system 202 known to those in the relevant art. Examples of computers on which the described arrangements can be practised include IBM-PCs and compatibles, Sun Sparcstations or alike computer systems evolved therefrom.
Typically, the application program is resident on the hard disk drive 110 and read and controlled in its execution by the processor 105. Intermediate storage of the program and any data fetched from the network 120 may be accomplished using the semiconductor memory 106, possibly in concert with the hard disk drive 110. In some instances, the application program may be supplied to the user encoded on a CD-ROM or floppy disk and read via the corresponding drive 112 or 111, or alternatively may be read by the user from the network 120 via the modem device 116. Still further, the software application 300 can also be loaded into the computer system 202 from other computer readable media. The term "computer readable medium" as used herein refers to any storage or transmission medium that participates in providing instructions and/or data to the computer system 202 for execution and/or processing. Examples of storage media include floppy disks, magnetic tape, CD-ROM, a hard disk drive, a ROM or integrated circuit, a magneto-optical disk, or a computer readable card such as a PCMCIA card and the like, whether or not such devices are internal or external of the computer module 101. Examples of transmission media include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the Internet or Intranets including e-mail transmissions and information recorded on Websites and the like.
Fig. 3 illustrates the components that constitute the software application 300 for presenting complex data and navigating through the presentation and which are executable on the computer system 202. A graphical user interface (GUI) component 302 is used to present information to the user and respond to actions taken by the user via an input device, such as the keyboard 102 or mouse 103. The GUI component 302 has two parts, being a graphical presentation component 304, which is responsible for the display of visualizations generated in the visualization components 308 or 316 upon the video display 114, and a graphical navigation component 306, which is responsible for the interaction with visualizations generated in visualization components 308 or 316, for the purpose of navigation through the data displayed and altering the way the data is displayed upon the video display 114. The visualization component 308 is involved in a specific implementation described below, and the alternate visualization component 316 is used to illustrate how alternate visualizations may be integrated with the software application 300. The visualization component 308 includes three parts, being a data analysis component 314, used to perform a number of analysis algorithms on the data generated by a data examination component 324, a visualization alteration component 312, used to manage any alterations made to the visualization such as those made through the GUI component 302, and more specifically the graphical navigation component 306, and a visualization generation component 310, which is used to manage the creation of the visualization which will be presented in the graphical presentation component 304.
The software application 300 also includes a data storage component 318 having a data parsing component 320, used to parse an XML result tree in the data cache component 326, and a data examination component 324, used to examine a parsed XML result node from the data parsing component 320 and store within a data cache component 326. The data cache component 326 is used as a place to store all data used by the software application 300. A data retrieval component 328 is also included and used to request data from the data sources, for example data sources 204 to 210.
For the purposes of this document, the manner of visualization may be considered as the way in which a type of visualization is presented to a user. For example, a type of visualization could be a plot or pie chart, whereas the manner in which a plot is visualized incorporates the data content presented on each of the axes of the plot and whether or not any data content is used for a data series. The manner is therefore a particular instance of a type.
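The distinction between type and manner can be pictured with a small data model. The following is only an illustrative sketch: the class name `Manner`, its fields, and the column names "NAME", "AGE" and "FAMILY" are assumptions made for the example and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Manner:
    """One concrete way of presenting a visualization type (hypothetical model)."""
    vis_type: str                               # the type, e.g. "plot", "pie", "table"
    axes: Tuple[Tuple[str, str], ...] = ()      # (axis name, data column) pairs
    series: Optional[str] = None                # column used as the data series, if any


# Two different manners of the same type "plot" (column names are assumed).
plain = Manner("plot", axes=(("x", "NAME"), ("y", "AGE")))
with_series = Manner("plot", axes=(("x", "NAME"), ("y", "AGE")), series="FAMILY")
```

Both objects share the type "plot" but differ in manner, which is the sense in which a manner is a particular instance of a type.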
Fig. 4 illustrates a method 400 including steps 402 to 428 followed in a typical implementation of the application 300 executed upon a computer system 202. The method 400 commences with an entry step 402, where the application 300 is loaded from the HDD 110 to the memory 106 for rapid access and execution by the processor 105 in concert with other components of the computer module 101. In step 404, in response to a user request for data, for example from a database, data is fetched from the data sources 204 to 210. The user request may be in the form of an XQuery provided to the application 300 by means of a user interface such as a file/open command or other similar approach known in the art. The XQuery is supplied to the data retrieval component 328, which then executes the XQuery. Execution of the XQuery typically involves transmitting the XQuery to the target data source 204-210, which then processes the XQuery to return result data to the client computer system 202. Alternatively, the XQuery may be processed locally if the remote data source does not support that format.
When results are available, they are stored by the client computer system 202 in a data cache component 326 in XML format. From these cached results, the data retrieval component 328 constructs an XML result tree in the data cache component 326.
In step 406, the data parsing component 320 is then used to traverse through each XML result node in the XML result tree whereupon the data examination component 324 stores both the configuration of the XML result tree and data content and relationships contained within the XML result tree in the data cache component 326. For any one XQuery, step 404 need not be completed prior to step 406 commencing, as the data storage component 318 may parse and examine data as soon as any information is made available from the data retrieval component 328.
In step 408, the data analysis component 314 then analyses the data created by the data examination component 324 at step 406, to determine how the relationships and patterns of content within the returned data can be used. Again, step 406 need not be completed prior to step 408 commencing, since the data analysis component 314 may complete some of the data analysis as soon as any information is made available from the data examination component 324. In step 410, once all the complex data is obtained, the step operates to initialize the setting identifying what portion of the complex data is to be visualized and the way the complex data is to be visualized or, more specifically, what type of visualization is to be used (e.g. tabular, bar, pie, plot etc.).
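The overlap between steps 404, 406 and 408 can be pictured as a chain of lazy stages, each of which starts work as soon as the stage before it yields something. The sketch below uses Python generators purely as an illustration; the function names and the string items are hypothetical and the specification does not prescribe this mechanism.

```python
def retrieve(results, trace):
    """Stand-in for step 404: hand over each result as it arrives."""
    for item in results:
        trace.append(f"404 fetch {item}")
        yield item


def examine(nodes, trace):
    """Stand-in for step 406: examine a node as soon as it is available."""
    for node in nodes:
        trace.append(f"406 examine {node}")
        yield node


def analyse(items, trace):
    """Stand-in for step 408: analyse an item as soon as it has been examined."""
    for item in items:
        trace.append(f"408 analyse {item}")
        yield item


def run_pipeline(results):
    """Drive the three stages and return the order in which work happened."""
    trace = []
    for _ in analyse(examine(retrieve(results, trace), trace), trace):
        pass
    return trace
```

Running `run_pipeline(["a", "b"])` records item "a" passing through all three stages before item "b" is fetched, which mirrors the statement that a later step need not wait for an earlier one to finish.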
In step 412, if the portion of the complex data to visualize has been changed or initialized, the visualization generation component 310 generates data used to aid in the choice of visualizations and manner in which the chosen visualization is presented from the portion of the complex data being used. The portion of complex data can be used to build and render a corresponding visualization of the type and manner selected in the visualization generation component 310 by the graphical presentation component 304 for presentation to the video display 114. This building and rendering of the visualization is described in greater detail in Section 4 below. As with the previous steps, it is not necessary to wait until the analysis has fully completed prior to the visualization of the complex data, as indicated by the steps 408, 410 and 412 of Fig. 4. A default visualization (e.g. tabular) may instead be used to progressively build and render the visualization as the data is being analysed. Alternatively, the default visualization (e.g. tabular) may be used to build and render the visualization in full by waiting until the first phase of data analysis, as detailed in Section 3 below, has completed and then continuing with the data analysis. Both of these approaches allow for a faster visualization of the complex data.
In step 413, the application 300 waits for user interaction, via one of the input devices 102, 103, with the GUI component 302 or, more specifically, the GUIs generated by the graphical navigation component 306 and the graphical presentation component 304. When such an action is received from the user, control passes to step 414, being the first of a series of steps that analyse the user's requested action. Step 414 checks whether the user issued a type of visualization change request (e.g. tabular, bar, pie, plot etc.). Typically the request received at step 413 is generated by means of the graphical user interface 302, which may provide an interface 2500 as shown in Fig. 25 or an interface 2600 shown in Fig. 26, or another similar interface. The graphical user interfaces of Figs. 25 and 26 are generated by the graphical presentation component 304, whereby a dropdown list 2502 of the GUI 2500 or a dropdown list 2602 of the GUI 2600 allows for the selection of one of the possible visualization types. In the example GUIs 2500 and 2600 shown, the current selection in each instance is "plot" for the visualization or chart type. This may be changed to "table" for example, causing control to be passed to step 416. The operation of the interfaces shown in Figs. 25 and 26 is described in more detail in Section 5 below. In step 416, the visualization type is then set to the new visualization type "table" and control passes back to step 412 to build and render the portion of complex data to visualize as a "table". Control then passes through step 413, where a user action is received, and on to step 414, where there is no change to the visualization type, resulting in the control passing to step 418.
In step 418, the user requested action may have been a manner of visualization change request (e.g. data to be presented in x-axis, y-axis, z-axis, data series as appropriate etc.), received at step 413 again via the GUIs 2500 or 2600. The GUI 2500 of Fig. 25 is generated by the graphical presentation component 304, whereby the columns of data in a list box 2512 can be dragged and dropped into various text boxes 2504, 2506, 2508 and 2510 corresponding with the x-axis, y-axis, z-axis and data series respectively, enabling the user to alter the manner in which the visualization type is presented. It will be appreciated that the text boxes illustrated in Fig. 25 correspond to the chart type "plot" presently selected. The text boxes accordingly change based upon the selected chart type. The GUI 2600 of Fig. 26 is generated by the graphical presentation component 304, whereby a dropdown list 2606 allows for the selection of the dimension of the graph, a check box 2604 allows for the choice of whether to use a data series in the graph, and alternate visualization selection buttons 2608 and 2610 automatically change the configuration of which columns of data to present in which axis and data series. The operation of the interfaces shown in Figs. 25 and 26 is described in more detail in Section 5 below.
If an alteration was made to the manner of visualization, control is passed to step 420. In step 420, the manner of visualization is then set as appropriate and control passes back to step 412 to build and render the portion of complex data to visualize.
Control then passes through step 413, where a (further) user action is received, on to step 414, where there is no change to the visualization type, and on to step 418, where there is no change to the manner of visualization, resulting in the control passing to step 422. In step 422, the user action may have been a navigation request or an alteration to the portion of complex data visualized, received at step 413, possibly by means of a graphical user interface which could comprise one of the interfaces shown for example in Figs. 28A, 29, 30, 31 or 32. The graphical user interfaces of Figs. 28A, 29, 31 and 32 are generated by the graphical navigation component 306, whereby a different portion of data to visualize or a navigation request can be achieved by resizing or dragging a window 2814 in Fig. 28A using the mouse 103, altering the settings of checkboxes 2904, 2906 or 2908 in Fig. 29, selecting ("pressing") one of the buttons 3002, 3006 or 3008 in Fig. 30 using the mouse 103, altering the text in the textbox 3110 of Fig. 31 to a valid selection, or altering the settings of the checkboxes 3202, 3204 or 3206 in Fig. 32. The operation of the interfaces shown in Figs. 28A, 29, 30, 31 and 32 is also described in more detail in Section 5 below.
If an alteration was made to the portion of complex data to visualize, control is passed to step 424. In step 424, the portion of complex data to visualize is then set as appropriate and control passes back to step 412 to build and render the visualization using the new portion of complex data. Control then passes through step 413, where a user action is received, and on to steps 414, 418 and 422, resulting in the control passing to step 426. In step 426, if the user action was not a termination request received at step 413, for example via an interface having a file/exit command or similar, control is passed back to step 413. If a termination request is received at step 413, control is passed through steps 414, 418, 422 and 426, where the termination request causes control to be passed to step 428. In step 428, the application 300 is terminated, which has the effect of releasing any cached information stored in the data cache component 326.
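The decision chain of steps 412 to 428 amounts to an event loop that re-renders after each recognised change. A minimal sketch follows, assuming that user actions arrive as hypothetical `(kind, value)` pairs and that rendering is replaced by a log entry; neither assumption comes from the specification.

```python
def run(actions):
    """Replay user actions through the decision chain of Fig. 4 (steps 412 to 428)."""
    state = {"type": "table", "manner": {}, "portion": "all"}   # assumed defaults
    log = []

    def render():
        # Step 412: build and render the current portion in the current type and manner.
        log.append(("render", state["type"], dict(state["manner"]), state["portion"]))

    render()
    for kind, value in actions:          # step 413: a user action is received
        if kind == "type":               # step 414 -> step 416
            state["type"] = value
        elif kind == "manner":           # step 418 -> step 420
            state["manner"] = value
        elif kind == "portion":          # step 422 -> step 424
            state["portion"] = value
        elif kind == "exit":             # step 426 -> step 428
            log.append(("terminate",))
            break
        else:
            continue                     # not a recognised request: back to step 413
        render()                         # control passes back to step 412
    return log
```

For example, `run([("type", "plot"), ("exit", None)])` renders the default view, renders again as a plot, and then terminates.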
2. Reading Complex Information and Data Storage
This section describes in detail the method by which the data retrieval component 328 fetches the complex information in XML format from the data sources 204 to 210 and stores the result as an XML result tree in the data cache component 326. The data parsing component 320 parses through a given XML result tree, from top to bottom and left to right, so that the data examination component 324 can store the examined data in the data cache component 326. Prior to proceeding, the meaning of a number of commonly used terms is clarified for use in this section.
"Structure Data" The structure data, held within the data cache component 326, is a data structure that is used to hold a summary of the structure contained within an XML result tree.
"Data Store" The data store, held within the data cache component 326, is a data structure that is used to hold the data content of individual XML result nodes and the relationships that exist between them.
"Structure Node" A structure node, held within the structure data, is a place o00 5 where data related to a set of similar XML result nodes can be collected. A structure
O
node may contain data like an instance counter, a maximum counter, a XML result node name, a reference to the structure group containing the structure node's children and a reference to the corresponding data component.
"Structure Group" A structure group, held within the structure data, is a place where one or more related structure nodes can be collected in an ordered fashion. A structure group may include data like a reference to the parent structure node, and also mechanisms like one for searching for a specific structure node contained therein and one for adding new structure nodes.
"Simple Node" A simple node is defined as either a structure node that is text or a structure node's referenced structure group only contains a single structure node that is text. The type of the structure node in this case is simple.
"Complex Node" A complex node is defined as a structure node that is not a simple node. This means that the structure node is not text and the structure node's referenced structure group contains something other than a single structure node that is text. The type of the structure node in this case is said to be complex.
"Representative Node" A representative node of a complex node is the first text node which can be parsed from the corresponding structure group in the structure data.
The parsing order is firstly down through the structure nodes in different structure groups, secondly right through the structure nodes of the same structure group and 707044.doc 23 r, thirdly back up through the structure nodes. The representative node is the structure node which is used for all processes occurring to the corresponding complex node. An example of this is taken from Fig. 16A where the representative node of the "MEMBERS" complex node 1610 is the first structure node that is text or structure node o00 5 1618. This is determined by parsing through the structure data from structure node 1610
O
N to structure group 1612 to structure node 1614 to structure group 1616 and to structure node 1618 which is text.
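The definitions of structure node, structure group, simple node, complex node and representative node can be summarised in a short sketch. It assumes that the "down, then right, then back up" parsing order is equivalent to a depth-first, left-to-right search; the class and function names are hypothetical, and the parent and data component references mentioned in the definitions are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StructureNode:
    name: str
    is_text: bool = False
    instance_counter: int = 0
    maximum_counter: int = 0
    children: Optional["StructureGroup"] = None   # referenced structure group


@dataclass
class StructureGroup:
    nodes: List[StructureNode] = field(default_factory=list)   # ordered members


def is_simple(node: StructureNode) -> bool:
    """Text node, or a node whose group holds exactly one text node."""
    if node.is_text:
        return True
    group = node.children
    return group is not None and len(group.nodes) == 1 and group.nodes[0].is_text


def representative(node: StructureNode) -> Optional[StructureNode]:
    """First text node reached going down first, then right, then back up."""
    if node.is_text:
        return node
    if node.children is None:
        return None
    for child in node.children.nodes:
        found = representative(child)
        if found is not None:
            return found
    return None
```

Building the part of Fig. 16A described above (node 1610 over group 1612, node 1614 over group 1616, text node 1618) and calling `representative` on node 1610 returns node 1618, while `is_simple` is false for node 1610 and true for node 1614.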
"Record Collection" A record collection, held within the data store, is used for the storage of references to sequential records taken from the XML result tree. In the example taken from Fig. 15, record collection will refer to the level of the "FAMILY" XML result nodes. There will be one record as there is only a single "FAMILY" XML result node 1504.
"Component Collection" A component collection, held within the data store, is a place where data components can be collected in an ordered fashion. In the example taken from Fig. 16B, the structure indicated by 1666 is the component collection which in this case contains seven data components indicated by 1670 to 1676.
"Data Component" A data component, held within the component collection of the data store, is equivalent to a column in a tabular visualization. Each data component is used to reference a pool of data in the data cache component 326 to be used for the storage of data blocks derived from the XML result tree. Further, the data component will contain a reference to the last group of data blocks added to its pool of data in the data cache component 326.
"Data Block" A data block, held within the pool of data referenced by a data component in the data store, is a data structure that defines an item of data by its content 707044.doc 24- N and the relationships that it has with the other data blocks in the data store. A data block in the present implementation includes two counters: one for simple relationships; one for complex relationships; and three references: one reference for simple relationships; one reference for complex relationships; and one reference for the XML result tree data.
"Sub-Group of Data Blocks" A sub-group of data blocks, held within the pool of data referenced by a data component in the data store, is a collection of one or more data blocks. The number of data blocks held within a sub-group depends on the value of the simple counter in the first contained data block.
"Group of Data Blocks" A group of data blocks, held within the pool of data referenced by a data component in the data store, is a collection of one or more subgroups of data blocks. The number of sub-groups of data blocks held within the group depends on the value of the complex counter in the first data block contained in the first sub-group. An example of this would be there where five data blocks were arbitrarily labelled Block A, Block B, Block C, Block D, and Block E. Further assume the complex and simple counters of Block A are 3 and 2 respectively, the simple counter of Block C is 1 and the simple counter of Block D is 2. This effectively means the group of data blocks is Blocks A through E. The three sub-groups into which the group is split are a sub-group containing Block A and Block B, a sub-group containing Block C only and a sub-group containing Block D and Block E.
In the application 300, as mentioned in Section 1 above, the preferred format for data retrieval in the data retrieval component 328 is XML format. An XQuery is supplied to the data retrieval component 328, which then executes the XQuery and requests the appropriate data from the corresponding data sources, for example data sources 204 to 210. When results are available, they are stored in a data cache component 326, for example "FamilyDB.xml" as shown in Fig. 14. The data retrieval component 328 then constructs an XML result tree consisting only of a root XML result node, and stores this XML result tree in the data cache component 326. The XML result tree is a programmatic construct which is used to represent the hierarchical relationships conveyed by an XML document. Since the tree does not contain all the required XML result nodes, it is considered to be a partial XML result tree. The partial XML result tree is gradually filled in as the data arrives from the data sources 204-210. When the XML result tree is complete, the data cache component 326 will contain all XML result nodes corresponding to the example "FamilyDB.xml" as shown in Fig. 15. This example relates to the progression of the data retrieval component 328 as outlined in step 404 of Fig. 4. Traditional methods by which an XQuery is used to fetch XML data and generate an XML result tree from the XML data are similar to that described above. Whilst the present specification makes reference to specified XML related technologies, the principles described herein can be extended not only to other hierarchical databases but also to other types of database.
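A partial result tree that fills in as data arrives can be illustrated with an incremental parser. The sketch below uses `XMLPullParser` from Python's standard `xml.etree.ElementTree` module as a stand-in; the specification does not name a parser, and the sample document, although it reuses the names "FAMILY", "MEMBERS", "PARENT" and "Bob" from the example, is an assumed fragment rather than the actual content of "FamilyDB.xml".

```python
import xml.etree.ElementTree as ET


def parse_incrementally(chunks):
    """Feed XML text chunk by chunk and collect (event, tag) pairs in arrival order."""
    parser = ET.XMLPullParser(events=("start", "end"))
    seen = []
    for chunk in chunks:
        parser.feed(chunk)                       # data arrives from the data source
        for event, element in parser.read_events():
            seen.append((event, element.tag))    # the tree so far is still partial
    parser.close()                               # no more data: the tree is complete
    for event, element in parser.read_events():  # drain anything reported at close
        seen.append((event, element.tag))
    return seen


# Chunk boundaries deliberately fall in the middle of a tag.
chunks = ["<FAMILY><MEMBERS><PAR", "ENT>Bob</PARENT>", "</MEMBERS></FAMILY>"]
events = parse_incrementally(chunks)
```

Each "start" event corresponds to a node being added to the partial tree and each "end" event to that node being completed, so a consumer such as the data parsing component could begin work before the final chunk has arrived.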
An example of an XML result tree is shown in Fig. 15, from which the data examination component 324 creates the structure data 1600 and the data store 1665 as shown in Figs. 16A and 16B respectively. These examples relate to the progression of the data storage component 318 outlined in step 406 of Fig. 4, which is further detailed by the workflow described below. In the application 300, the XML result tree data of Fig. 15 stored in the data cache component 326 is used as an input for the data parsing component 320, starting from the root XML result node 1502 and proceeding to the first XML result node 1504 as illustrated at step 504 of Fig. 5. At step 506 it is determined that there is a valid XML result node and control passes to step 508. Step 508 is further detailed by the flowchart of Fig. 6, where at an initial step 604 the direction information or parse direction "down" and the XML result node information from 1504 are used. At step 606, control is passed to step 612 as the direction is "down". Similarly, at step 612, control is passed to step 614 as the direction is "down". At step 614 the structure data is at 1602 and a check is performed to see if there is a structure group down a level. As no structure group 1604 exists down a level, the control is passed to step 616. At step 616 the structure group 1604 is created and set to reference structure node 1602, and the reference in structure node 1602 is set to reference group 1604. At step 618 the level is changed to that of the new group 1604. At step 622, a new structure node 1606 is created with an appropriate name from the XML result node, a maximum counter of 0 and an instance counter of 1. This structure node 1606 is then added to the structure group 1604, passing control back to step 510.
Step 510 then checks if the XML result node 1504 is a text node, which in the present example it is not, passing control back to step 504. At step 504 the next XML result node to be parsed is the node 1506, causing control to pass through steps 506, 508, 604, 606, 612, 614, 616, 618, 622, 510 and 504, as described previously. This results in the new structure group 1608 being created and the new structure node 1610 being created and added to the structure group 1608.
At step 504 the next XML result node to be parsed is node 1508, causing control to pass through steps 506, 508, 604, 606, 612, 614, 616, 618, 622, 510 and 504, as described previously. This results in the new structure group 1612 being created and the new structure node 1614 being created and added to the structure group 1612. Control then passes through steps 506, 508 and 604, with the next parsed XML result node being node 1510 and the direction being "down". At step 606, control is passed to step 612 as the direction is "down". Similarly, at step 612, control is passed to step 614. At step 614 the structure data is at 1614 in Fig. 16A, and a check is performed to see if there is a structure group down a level. As no structure group presently exists down a level from 1614, the control is passed to step 616 where a structure group 1616 is created and set to reference structure node 1614, and the reference in structure node 1614 is set to reference group 1616. At step 618 the level is changed to that of the new group 1616. At step 622, a new structure node 1618 is created and initialized and added to the structure group 1616. Step 510 then checks if the XML result node 1510 is a text node, which it is in this example, resulting in a passing of control to step 512, which is further detailed in Fig. 7A.
Prior to commencing with the process of Fig. 7A for the first time, it is assumed that the data store will be initialized to contain an empty component collection 1666 (Fig. 16B) for storage of each data component to be added and an empty record collection 1667 for storage of each record to be added. At step 702 in Fig. 7A, the previous structure node reference will be non-existent, the current structure node reference will be the reference 1618 and the XML result node will be the node 1510 (Bob). In step 702, the types of the structure nodes are determined according to the method shown in Fig. 8, beginning at step 804. At step 804 it is determined there was no previous structure node, resulting in a passing of control onto step 824, which designates the types for the previous and current structure node as simple and complex respectively, and then passes control on to step 704.
At step 704, the current structure type is complex, which causes control to be passed on to step 706. At step 706, the current structure node 1618 is checked to see if it is the representative node, which it is, passing control to step 708, which accesses a sub-routine 740, the detail of which is seen in Fig. 7B. At step 742 in Fig. 7B, the "PARENT" text node 1618 of the node 1614 is checked to see if a reference exists to a corresponding data component, in this case 1670 (Fig. 16B), which does not exist, causing control to be passed onto step 744. At step 744, the component 1670 is added along with pool 1678 and the structure node 1618 has the reference to its corresponding data component set to the data component 1670. At step 746, the previous structure node is checked to see if it exists, which it does not in this example because the current structure node 1618 is the first text node in the structure data 1600. This causes control to be passed to step 748. At step 748 a new record 1668 is added to the record collection 1667, and the block reference to be used, which is local data stored in the data cache component 326, is set to that of the new record 1668. Control is then returned to step 710, where a new block 1679 is created in the pool 1678, setting the counters of the new block 1679 to 1, initializing its references and setting its content to that of the XML result node 1510. The block reference to be used, previously set to the record 1668, is set to reference the newly created data block 1679.
At this point control is passed back to step 504, where the next parsed XML result node is 1508. This causes the control to pass to steps 506, 508 and 604 with the direction "up". At step 606, as the direction is "up", control is passed to step 608. At step 608 all structure nodes in the current structure group 1616 have their maximum counter set to the value in the instance counter if the maximum counter is less than the instance counter, and the instance counter is set to 0. In this case, the current structure group 1616 has one structure node 1618, which has its maximum counter set to the value in the instance counter, since the maximum counter is less than the instance counter, and its instance counter set to 0. At step 610, the current structure node is shifted up a level by following the reference in structure group 1616 to structure node 1614. Step 510 then checks if the corresponding XML result node 1508 is a text node, which it is not, resulting in a passing of control back to step 504 to parse the next XML result node 1512. This causes the control to pass to steps 506, 508 and 604 with the direction "right".
At step 606, as the direction is "right", control is passed to step 612 and similarly onto step 620. At step 620, the current structure node 1614, which has the corresponding structure group 1612, is checked for a structure node related to the node 1512. No corresponding structure node exists, and so control is passed to step 622. At step 622 a structure node 1620 is created and appended to the structure group 1612, and step 508 ends. Step 510 then checks if the XML result node 1512 is a text node, which it is not, passing control back to step 504. Step 504 then parses the next XML result node 1514. This causes the control to pass to steps 506, 508 and 604 with the direction "down". Control then passes through steps 606, 612, 614, 616, 618 and 622 as previously described, resulting in the structure group 1622 being added with the structure node 1624. Step 510 then checks if the XML result node 1514 is a text node, which it is, passing control to steps 512 and 702.
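The way the structure data grows under the three parse directions (steps 606 to 624 of Fig. 6) can be sketched as a single update function. This is an interpretation of the walkthrough above, not the implementation of the specification: the class names, the function `step`, the node name "CHILD" for the second kind of member and the placeholder name "#text" for text nodes are all assumptions.

```python
class Group:
    """Structure group: an ordered set of structure nodes under one parent node."""
    def __init__(self, parent):
        self.parent = parent
        self.nodes = []

    def find(self, name):
        return next((node for node in self.nodes if node.name == name), None)


class Node:
    """Structure node with the counter values assigned at step 622."""
    def __init__(self, name, group):
        self.name = name
        self.group = group          # structure group that owns this node
        self.child_group = None     # structure group one level down, if any
        self.instance = 1
        self.maximum = 0


def step(current, direction, name=None):
    """Apply one parse move to the structure data and return the new current node."""
    if direction == "up":
        for node in current.group.nodes:              # step 608
            node.maximum = max(node.maximum, node.instance)
            node.instance = 0
        return current.group.parent                   # step 610
    if direction == "down":
        if current.child_group is None:               # steps 614 and 616
            current.child_group = Group(current)
        group = current.child_group                   # step 618
    else:                                             # "right": stay in the same group
        group = current.group
    node = group.find(name)                           # step 620
    if node is None:
        node = Node(name, group)                      # step 622
        group.nodes.append(node)
    else:
        node.instance += 1                            # step 624
    return node
```

Starting from a root node and applying "down" four times, "up", "right" and "down" reproduces the creation of the structure groups and nodes in the order described for Fig. 16A, and a second "right" onto an existing name increments that node's instance counter to 2 rather than creating a new node.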
At step 702, the references to the previous and current structure nodes will be 1618 and 1624 respectively and the XML result node will be 1514. Control passes to the detail of Fig. 8 for step 702 to determine the types. At step 804, it is determined there was a previous structure node 1618, which causes control to pass to step 806. At step 806, the previous and current structure nodes 1618 and 1624 respectively are different, which means the previous and current structure node references should be moved up the structure data 1600 until they are both at the same level and they are structure nodes belonging to the same group. In this case they are moved to their structure groups 1616 and 1622 respectively, which are at the same level but not the same group, so they are moved up to their parent level, which in this case are structure nodes 1614 and 1620 respectively, which are both members of the same structure group 1612. At step 808, the moved previous and current structure nodes are 1614 and 1620 respectively, which are not the same. This results in step 808 passing control to step 809. At step 809 the moved structure nodes 1614 and 1620 are different, and the instance counters are both 1, causing control to be passed to step 812. At step 812, the previous moved structure node 1614 only owns a text node, which causes control to be passed to step 816. At step 816 the previous structure type is simple, so control is passed to step 818. At step 818, the current moved structure node 1620 only owns a text node, which causes control to be passed to step 822. At step 822 the current structure type is simple, thereby ending step 702 and resulting in control being passed to step 704. At step 704, as the current structure type is simple, control is passed to step 718.
At step 718 the instance counter of the current structure node 1624 is 1, causing control to be passed to step 720. At step 720, there is no reference to a valid data component, and hence control is passed to step 708 and hence step 742 in the method 740. As described previously, steps 742 and 744 will then create a new data component 1671 with pool 1681 in the component collection 1666, and the structure node 1624 has the reference to its corresponding data component set to 1671. At step 746, the previous structure node is checked to see if it exists, which it does (i.e. the node 1618), causing control to be passed to step 750. At step 750 the block reference to be used is reset and control passed to step 752. At step 752, the passed structure node, which is the node 1624, has an instance counter of 1, causing control to be passed to step 754. At step 754, the previous structure node type is simple so the control is passed to step 756. At step 756 the block reference to be used is set to the sibling reference of the last data block 1679 of the last sub-group of the group which is referred to by the data component 1670 which is referred to by the structure node 1618. Control is then returned to step 710, where a new data block 1683 is created in the pool 1681, setting 1683's counters to 1, initializing its references and setting its content to that of the XML result node 1514. The block reference to be used, previously set to that of the simple reference in 1679, is then set to reference the newly created data block 1683.
At this point control is passed back to step 504, where the next parsed XML result node is the node 1512. This causes the control to pass through steps 506, 508 and 604 with the direction "up". Control then passes through steps 606, 608 and 610 as previously described, resulting in the current structure node 1624 having its maximum counter set to 1 and its instance counter set to 0, and the current structure node being changed to structure node 1620. Step 510 then checks if the XML result node 1512 is a text node, which it is not, passing control back to step 504 to parse the next XML result node 1516. This causes the control to pass through steps 506, 508 and 604 with the direction "right". Control then passes through steps 606 and 612 as previously described, and into step 620. At step 620 the current structure node 1620 uses the corresponding group 1612 to check for a structure node 1620 related to 1516, which exists, so control is passed to step 624. At step 624 the structure node 1620 has its instance counter incremented to 2. Step 510 then checks if the XML result node 1516 is a text node, which it is not, passing control back to step 504 to parse the next XML result node 1518. This causes the control to pass through steps 506, 508 and 604 with the direction "down". Control is then passed through steps 606, 612, 614, 620 and 624 as previously described, resulting in the instance counter of the structure node 1624 in the structure group 1622 incrementing to 1. Step 510 then checks if the XML result node 1518 is a text node, which it is, passing control to steps 512 and 702.
At step 702, the references to the previous and current structure nodes in the structure data will be 1624 and 1624 respectively and the XML result node will be 1518.
Control passes to the method of Fig. 8 where step 804 operates to determine the types.
As described previously, control passes through steps 804, 806, 808 and 809, resulting in the structure groups for both being 1622 where the instance counter is 1, causing control to pass to step 810. At step 810 the group is moved up a level to structure node 1620.
At step 808, the current moved structure node 1620 has an instance counter of 2 causing control to pass to step 812. As described previously, control passes through steps 812, 816, 818 and 822, resulting in the current structure node types of simple and simple respectively being determined. At step 704, as the current structure type is simple, control is passed to steps 718 and 720, as described previously, where the component 1683 is not empty causing control to be passed to steps 708 and 742 in the method 740.
At step 742, as the component exists, control is passed to step 746. At step 746, the previous structure node is checked to see if it exists, which it does causing control to be passed to step 750. At step 750 the block reference to be used is reset and control passed to step 752. At step 752, the passed structure node 1624 has an instance counter 2, causing control to be passed to step 760. At step 760, the previous structure node type is simple so the control is passed to step 762. At step 762 the simple counter of the first data block 1683 of the last sub-group of the group which is referred to by the data component 1671 which is referred to by the structure node 1624, is incremented to 2.
Control is then returned to step 710, where a new data block 1682 is created in the pool 1681, setting 1682's counters to 1, initializing its references and setting its content to that of the XML result node 1518. The block reference to be used has not been set, so no references require adjusting.
At this point control is passed back to step 504, where the next parsed XML result node is 1516. This causes the control to pass to steps 506, 508 and 604 with the direction "up". At step 606, as the direction is "up", control is passed to step 608. At step 608 the structure group 1622 only has one structure node 1624, which has its maximum counter set to 2 and its instance counter set to 0. At step 610, the structure node is shifted up a level by following the reference in structure group 1622 to structure node 1620. Step 510 then checks if the XML result node 1516 is a text node, which it is not, passing control back to step 504 to parse the next XML result node 1506. The control then passes through steps 506, 508, 604, 606, 608, 610 and 510 as described previously. Control is then passed back to step 504 to parse the next XML result node 1520. The control then passes through steps 506, 508, 604, 606, 612, 620 and 622 as described previously, resulting in a structure node 1626 being created and added to structure group 1608. Step 510 then checks if the XML result node 1520 is a text node, which it is not, passing control back to step 504 to parse the next XML result node 1522.
The control then passes through steps 506, 508, 604, 606, 612, 614, 616, 618 and 622 as described previously, resulting in the structure group 1628 being created and added with the structure node 1630. Step 510 then checks if the XML result node 1522 is a text node, which it is not, passing control back to step 504 to parse the next XML result node 1524. The control passes through steps 506, 508, 604, 606, 612, 614, 616, 618 and 622 as described previously, resulting in the structure group 1632 being created and added with the structure node 1634. Step 510 then checks if the XML result node 1524 is a text node, which it is, passing control to steps 512 and on to step 702.
At step 702, the references to the previous and current structure nodes in the structure data will be 1624 and 1634 respectively and the XML result node will be 1524.
Control passes to Fig. 8 and step 804 to determine the types. The control passes through steps 804, 806, 808, 809 and 812 as described previously, resulting in the final moved previous and current structure nodes being 1610 and 1626 respectively of structure group 1608. At step 812, the previous moved structure node 1610 has more than just a text node, which causes control to be passed to step 814. At step 814 the previous structure type is complex, causing control to be passed to step 818. At step 818, the current moved structure node 1624 has more than just a text node, which causes control to be passed to step 820. At step 820 the previous structure type is complex, causing control to be passed to step 704. At step 704 the previous and current structure nodes of 1624 and 1634 are used with types complex and complex respectively to pass control through steps 706, 708, 742, 744, 746, 750 and 752 as described previously. This results in the creation of a new data component 1672 with pool 1685 in the component collection 1666, the reference in the structure node 1634 being set to its corresponding data component 1672 and the block reference to be used being reset. At step 752, the current moved structure node 1634 has an instance of 1, causing control to pass to step 754. At step 754 the previous structure node type is complex, causing the control to be passed to step 758. At step 758 the block reference to be used is set to that of the complex reference of the first data block 1679 of the group which is referred to by the data component 1670 which is referred to by the structure node 1618. Control is then returned to step 710, where a new data block 1687 is created in the pool 1685, setting the counters of data block 1687 to 1, initializing its references and setting its content to that of the XML result node 1524. The block reference to be used, previously set to that of the complex reference in 1679, is set to reference the newly created data block 1687.
At this point control is passed back to step 504, where the next parsed XML result node is 1522, causing control to be passed through steps 506, 508, 604, 606, 608, 610 and 510 and then back to step 504 as described previously, resulting in the next XML result node to parse being 1526. The control then passes through steps 506, 508, 604, 606, 612, 620, 624 and 510 and then back to step 504 as described previously, resulting in the structure node 1630 having its instance counter incremented to 2 and the next XML result node to parse being determined as 1528. The control then passes through steps 506, 508, 604, 606, 612, 614, 620, 622 and 510 and then back to 504 as described previously, resulting in the structure node 1636 being created and appended to the structure group 1630 and the next XML result node to parse being 1530. The control then passes through steps 506, 508, 604, 606, 612, 614, 616, 618 and 622 and on to step 510 as described previously, resulting in the structure group 1638 being created and added and the structure node 1640 being created and added to the structure group 1638.
At step 510, the XML result node 1530 is checked to see whether it is a text node, which it is, causing control to be passed to steps 512 and 702.
At step 702, the references to the previous and current structure nodes are 1634 and 1640 respectively and the XML result node is 1530. Step 702 then performs the method of Fig. 8 to determine the types as described previously, wherein steps 804, 806, 808, 809, 810, 808, 812, 814, 818 and 820 are passed, resulting in the current and previous moved structure nodes being 1630 and 1630 respectively and the current and previous structure types being complex and complex respectively. The previous and current structure nodes of 1634 and 1640 respectively are used with the types complex and complex respectively to pass control through steps 704, 706 and 711, resulting in the current structure node 1640 being used. This will cause control to pass through steps 742, 744, 746, 750, 752, 754 and 758 as described previously, resulting in a new data component 1673 with pool 1689 being created in the component collection 1666, the reference in the structure node 1640 being set to its corresponding data component 1673 and the block reference to be used being set to the complex reference within the data block 1687. Control is then returned to step 712, where the representative node is calculated to be 1634 by determining the first text node which can be visited in all contained groups of the current moved structure node 1630, which is used to pass control through steps 742, 746, 750, 752, 760 and 764 as described previously. In this way the complex counter of the first data block 1687 of the group which is referred to by the data component 1672 which is referred to by the structure node 1634 is incremented to 2.
Control is then returned to step 714, where a new data block 1686 is created in the pool 1685, setting the counters of data block 1686 to 1, initializing its references and leaving its content empty. The block reference to be used has not been set, so no references require adjusting. At step 716, a new data block 1690 is created in the pool 1689, with the counters of the block 1690 being set to 1, initializing its references and setting its content to that of the XML result node 1530. The simple reference in the previously created block 1686 is then set to reference the data block 1690.
At this point control is passed back to step 504, where the next parsed XML result node is 1528. This causes control to be passed through steps 506, 508, 604, 606, 608, 610 and 510 and then back to step 504 as described previously, where the next XML result node to parse will be 1532. This causes the control to be passed through steps 506, 508, 604, 606, 612, 620, 624 and 510 and then back to 504 as described previously, which results in the structure node 1634 having its instance counter incremented to 1. Step 510 then checks if the XML result node 1532 is a text node, which it is, passing control to step 512 and on to step 702.
At step 702, the references to the previous and current structure nodes are 1640 and 1634 respectively and the XML result node is 1532. Control passes again to the detail of Fig. 8 to determine the types as described previously, wherein steps 804, 806, 808, 809, 812, 816, 818 and 822 are passed, resulting in the current and previous moved structure nodes being 1636 and 1634 respectively and the current and previous structure types being simple and simple respectively. Since the previous and current structure nodes of 1640 and 1634 are used with types simple and simple respectively, control will pass through steps 704, 718 and 720, where the data component referenced from the current structure node 1634 is 1672, which has no content in the last data block 1686 of the last sub-group of the group, causing control to be passed to step 722. At step 722, the data block 1686 has its content set to that of the XML result node 1532.
The above method is then continued in such a way that structure groups 1644, 1648, 1654, 1660 and the structure nodes 1642, 1646, 1650, 1652, 1656, 1658, 1662 are formed from the parsed XML result nodes 1526, 1534, 1536, 1538, 1536, 1540, 1542, 1540, 1534, 1544, 1546, 1544, 1520, 1504 and 1502 to produce components 1674, 1675, 1676 and the pools 1692, 1695 and 1698 respectively, which will contain the associated data blocks 1693, 1696 and 1699 appropriately referenced.
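The role of the instance and maximum counters in the walkthrough above can be illustrated with a much simplified sketch. It assumes a plain element tree rather than the structure groups, data components and pools of the described method, and the `StructureNode` class, the `scan` function and the sample XML are hypothetical names introduced only for illustration. Each repeated element increments its instance counter, and moving "up" folds that count into the maximum counter and resets the instance counter to 0.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class StructureNode:
    """Hypothetical, simplified stand-in for a structure node."""
    name: str
    instance: int = 0   # occurrences seen under the current parent
    maximum: int = 0    # largest occurrence count seen under any parent
    children: dict = field(default_factory=dict)


def scan(element, node):
    """Walk an element tree, maintaining per-name counters."""
    for child in element:
        if child.tag not in node.children:
            # No matching structure node yet, so create one.
            node.children[child.tag] = StructureNode(child.tag)
        child_node = node.children[child.tag]
        child_node.instance += 1          # a further instance was found
        scan(child, child_node)
    # Moving "up": keep the worst case and reset the running count.
    for child_node in node.children.values():
        child_node.maximum = max(child_node.maximum, child_node.instance)
        child_node.instance = 0
    return node


# Hypothetical input, loosely modelled on the PARENT/CHILD columns used later.
SAMPLE = "<PEOPLE><PARENT><NAME>Bob</NAME><CHILD>Sam</CHILD><CHILD>Dan</CHILD></PARENT></PEOPLE>"
root = ET.fromstring(SAMPLE)
top = scan(root, StructureNode(root.tag))
parent = top.children["PARENT"]
```

In this sketch a maximum counter above 1 (here for `CHILD`) signals a one-to-many relationship, which is the information the later analysis relies on.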
The present implementation need not be limited to gathering the XML result tree information in the data store in this way. There are a number of restrictions the current embodiment places on the complexity of XML result tree information that can be placed in the data store and as such, additions to the described method could be used to contend with XML result tree information with higher levels of complexity. One such addition may include the use of an additional level of references or mechanism by which a level of nested references could be applied. This is intended to cope with cases where a structure node in the hierarchical data can potentially be one of a number of representative nodes. This would be required where the hierarchical data is very deep.
Another such addition may include the use of an additional level of reference (sub-simple reference) which could be applied to cope with cases where a structure node has more than one sibling applied at the same level in the hierarchical data. This may be required where the order of the nodes changes in separate records or similar. Further, there are many optimisations and variations, for both space and speed, which may be made thereto without departing from the fundamental purpose and implementation of the data simplification and storage method described.
3. Data Analysis
This section describes the method by which the data analysis component 314 analyses the data content and relationships generated from the data store, an example 1665 of which is shown in Fig. 16B, as created by the data examination component 324 and stored in the data cache component 326. The analysis is carried out in two phases, whereby seven collections of data, examples of which are shown in Figs. 21A, 21B, 21C, 21D, 22A, 22B and 22C, are generated and stored in the data cache component 326.
Those collections of data are named herein as the "flat data" 2110; the "interim data" 2120; the "type data" 2130; "column depth data" 2140; the "row depth data" 2210; the "unique data" 2220; and the "intersect data" 2230.
The flat data 2110 is a "flattened" or tabular version of the data items obtained from the data source.
The interim data 2120, so called because of its nature, identifies the elements of the flat data 2110 which are duplicated data content, unique data content or no data content. The interim data 2120 is only required during the full execution of data analysis and can be destroyed when the analysis has been completed.
The type data 2130 identifies the worst case type for all the data items in a column of the flat data 2110, where the type may include some of the following:
(i) "integer" means all data items in the column of flat data 2110 are integers;
(ii) "double" means at least one data item in the column of flat data 2110 is a double and the rest are integers;
(iii) "date" means all data items in the column of flat data 2110 are dates of any acceptable date or time or combination format;
(iv) "currency" means all data items in the column of flat data 2110 are the same currency of any acceptable currency format;
(v) "string" is used where none of the above is suitable.
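The worst case type rule listed above can be sketched as follows. This is an illustrative reading only: the function names `classify_item` and `column_type` are hypothetical, only integers and doubles are detected from text, and recognition of dates and currencies is assumed to be supplied by the caller as ready-made type names.

```python
def classify_item(text):
    """Classify one data item as "integer", "double" or "string".

    Date and currency detection is deliberately left out of this sketch.
    """
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "double"
    except ValueError:
        return "string"


def column_type(item_types):
    """Return the worst case type for a column, given the type of each item."""
    kinds = set(item_types)
    if not kinds:
        return "integer"      # assumed default for a column with no items yet
    if len(kinds) == 1:
        return kinds.pop()    # all integers, all dates, all one currency, ...
    if kinds <= {"integer", "double"}:
        return "double"       # at least one double, the rest integers
    return "string"           # nothing narrower accommodates every item


phone_type = column_type([classify_item(v) for v in ["1234", "12345"]])
parent_type = column_type([classify_item(v) for v in ["Bob"]])
mixed_type = column_type([classify_item(v) for v in ["1", "2.5"]])
```

With the values used later in the worked example, a column holding "1234" and "12345" stays an integer column, while a column holding "Bob" becomes a string column.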
The column depth data 2140 identifies the worst case "depth" for each data item in a column of the flat data 2110, the depth being a measure of how many 1:n (one-to-many) relationships exist between the XML result node under test and the related node at one level below the root. The depth in the present example implementation is always described as a value greater than or equal to 0. Depth may alternatively be defined as the number of 1:n relationships, end-to-end, between nodes.
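As a minimal sketch of how a worst case depth per column might be tracked, the following assumes that each data item is written at some current depth and that the column simply keeps the largest depth seen, starting from 0. The function name `record_depth` and the list of observations are hypothetical.

```python
def record_depth(column_depth, column, current_depth):
    """Keep the largest depth at which any item of the column was written."""
    column_depth[column] = max(column_depth.get(column, 0), current_depth)
    return column_depth


# Hypothetical (column, depth) observations in the order items are written.
observations = [("PARENT", 0), ("STREET", 0), ("CHILD", 1), ("CHILD", 1), ("PHONE", 1)]
depths = {}
for column, depth in observations:
    record_depth(depths, column, depth)
```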
The row depth data 2210 identifies the range of worst case depths within each row of non-duplicated flat data 2110 as described by the interim data 2120.
The unique data 2220 identifies how many unique data items there are in each column.
The intersect data 2230 identifies the number of times each of the data items of any one column in the flat data 2110 has a corresponding data item in any other column.
An example of the progression of the first phase of the data analysis can now be described. In this example, the data store shown in Fig. 16B is analysed and the flat data, interim data, type data and column depth data, as shown in Figs. 17D, 18D, 19A and 19B respectively, are generated and stored in the data cache component 326. The flat data, interim data, type data and column depth data of Figs. 17D, 18D, 19A and 19B respectively are different to those mentioned at the start of the section relating to Figs. 21A, 21B, 21C and 21D respectively, as a simpler example is used to better enable the illustration of the first phase of data analysis. This first phase of data analysis takes place in the data analysis component 314 as described by step 408 of Fig. 4, which is further detailed by the workflow shown in steps 902, 904, 908, 910 and 912 of Fig. 9, with step 912 being further detailed by the procedure shown in Fig. 10. In the application 300, beginning with step 902 in Fig. 9, the component collection 1666 is retrieved from the data store 1665 of Fig. 16B, which is stored in the data cache component 326. At step 904 the component collection information is used to create the flat data and interim data shown in Figs. 17A and 18A respectively, create the type data shown in Fig. 19A with all values set to Integer and create the column depth data shown in Fig. 19B with all values set to 0. At step 908, the data store 1665 is scanned, beginning with the first record 1668, which in this case is the only record. At step 910, as the record 1668 exists, control passes to step 912, which is further detailed in Fig. 10, beginning with step 1002.
At step 1004 in Fig. 10, the data block being referred to by record 1668 is obtained, which in the present case is the block 1679 of the data component 1670. A temporary data structure is created for the storage of the tables found in this procedure. This data structure is referred to as table data and is set to empty. The depth data used within this procedure tracks the current depth and is initially set to 0. Control is then passed to step 1006, where a new row is created in the flat data and interim data, as further detailed in the procedure 1100 of Fig. 11.
In Fig. 11, at step 1104, using the data block 1679 and the current depth of 0, each of a simple flag and a complex flag is set to true. Step 1106 then examines the complex count, and in the present example the complex counter of data block 1679 is 1, which causes control to pass from step 1106 to step 1114, which examines the corresponding simple counter. In this example control then proceeds from step 1114 to step 1118. At step 1118, the flat data is updated, with the data content of data block 1679, being "Bob", stored in the "PARENT" column shown in Fig. 17B. At step 1120, the "PARENT" column of the interim data in Fig. 18B is marked with a unique entry marker, the reason for which will be explained later. At step 1122, if the current depth is greater than the depth in the "PARENT" column of the column depth data in Fig. 19B, the depth of the "PARENT" column is changed to that of the current depth. In this case the current depth is 0, which is the same as the depth in the "PARENT" column of Fig. 19B, so no change is made. At step 1124, the type data as shown in the "PARENT" column of Fig. 19A is changed to whichever type supports both that of the data content of data block 1679 or "Bob", and the type in the "PARENT" column of Fig. 19A, which are string and integer respectively, resulting in a change to the type string, being the worst case type to accommodate these alternatives. Control proceeds to step 1126, where the simple flag and simple reference are checked. Here, the simple flag is set and the simple pointer of data block 1679 is valid, so control passes to step 1128, where the simple pointer is reset and the original value, being a pointer to data block 1683, and the current depth 0, are used in a recursive call of procedure 1100.
In this recursive call, the data block is 1683 and the current depth is 0. Control will similarly pass through steps 1104, 1106 and 1114, where the simple counter is 2, passing control to step 1116. At step 1116, as there are no simple pointers in any of the data blocks other than the first, the simple flag is left set. At step 1112, the "CHILD" column of the flat data of Fig. 17B is marked as a table with a reference to the data block 1683, TCHILD being used in this case for reference purposes. Control then passes through steps 1126 and 1130, as there is no complex pointer or simple pointer in the data block 1683. The recursive call for data block 1683 then completes and passes control back to step 1130 for the recursive call in respect of data block 1679.
At step 1130 the complex flag and complex reference are checked. The complex flag is set and there is a complex pointer in data block 1679, thereby causing control to pass to step 1132. At step 1132, the complex pointer is reset and the original value, a pointer to data block 1687, and current depth 0 are used in a recursive call of the procedure 1100.
In this recursive call, the data block is 1687 and the current depth is 0. Control will similarly pass through steps 1104 and 1106, where the complex counter is 2, causing control to pass to step 1108. At step 1108, as there are no complex pointers in any of the data blocks other than the first, the complex flag is left set. At step 1110, the simple flag is reset. At step 1112, the "PHONE" column of the flat data of Fig. 17B is marked as a table with a reference to the data block 1687, TPHONE being used in this case for reference purposes. Control then passes to step 1126 and on to step 1130, where the control passes to step 1132 since the complex flag is set and the complex pointer is valid. At step 1132, the complex pointer is reset and the original value, a pointer to data block 1693, and current depth 0 are used in a recursive call of the procedure 1100.
In this recursive call, the data block is 1693 and the current depth is 0. Control will similarly pass through steps 1104, 1106, 1114, 1118, 1120, 1122, 1124, 1126 and 1128, resulting in the data content of data block 1693 or "One St" being stored in the flat data shown in the "STREET" column of Fig. 17B, the "STREET" column of the interim data in Fig. 18B being marked with a unique entry marker, the depth in the "STREET" column of the column depth data in Fig. 19B remaining unchanged and the type in the "STREET" column of the type data in Fig. 19A being changed to string. At step 1128, the simple pointer is reset and the original value, a pointer to data block 1696, and current depth 0 are used in a recursive call of the procedure 1100.
In this recursive call, the data block is 1696 and the current depth is 0. Control will similarly pass through steps 1104, 1106, 1114, 1118, 1120, 1122, 1124, 1126 and 1130, resulting in the data content of data block 1696 or "ABC" being stored in the flat data shown in the "STATE" column of Fig. 17B, the "STATE" column of the interim data in Fig. 18B being marked with a unique entry marker, the depth in the "STATE" column of the column depth data in Fig. 19B remaining unchanged and the type in the "STATE" column of the type data in Fig. 19A being changed to string. The recursive call for data block 1696 then completes and passes control back to step 1130 for data block 1693.
At step 1130 the complex flag is set and there is a complex pointer in 1696, causing control to pass to step 1132. At step 1132, the complex pointer is reset and the original value, a pointer to data block 1699, and current depth 0 are used in a recursive call of the procedure 1100.
In this recursive call, the data block is 1699 and the current depth is 0. Control will similarly pass through steps 1104, 1106, 1114, 1118, 1120, 1122, 1124, 1126 and 1130, resulting in the data content of data block 1699 or "54321" being stored in the flat data shown in the "INCOME" column of Fig. 17B, the "INCOME" column of the interim data in Fig. 18B being marked with a unique entry marker, the depth in the "INCOME" column of the column depth data in Fig. 19B remaining unchanged and the type in the "INCOME" column of the type data in Fig. 19A remaining unchanged.
The recursive call for data block 1699 then completes and passes the control back to caller for data block 1696.
The recursive call for data block 1696 then completes and passes control back to the recursive call for data block 1693. The recursive call for data block 1693 then completes and passes control back to the recursive call for data block 1687. The recursive call for data block 1687 then completes and passes control back to the recursive call for data block 1679. The recursive call for data block 1679 then completes and passes control back from step 1006 to step 1008.
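The recursive traversal just walked through can be sketched as below. This is a simplified illustration, not the procedure 1100 itself: it omits the interim data, depth and type updates and the resetting of pointers, and the `DataBlock` fields, the `TableMarker` class, the `fill_row` function and the way the example blocks are linked together are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataBlock:
    column: str                       # hypothetical: flat-data column fed by this block
    content: str = ""
    simple_count: int = 1
    complex_count: int = 1
    simple_ref: Optional["DataBlock"] = None
    complex_ref: Optional["DataBlock"] = None


@dataclass
class TableMarker:
    """Placeholder stored in a cell whose block holds several items."""
    block: DataBlock


def fill_row(block, row):
    """Fill one flat-data row, leaving table markers for repeated items."""
    follow_simple = follow_complex = True           # cf. step 1104
    if block.complex_count > 1:                     # cf. step 1106
        follow_simple = False                       # cf. step 1110
        row[block.column] = TableMarker(block)      # cf. step 1112
    elif block.simple_count > 1:                    # cf. step 1114
        row[block.column] = TableMarker(block)      # cf. step 1112
    else:
        row[block.column] = block.content           # cf. step 1118
    if follow_simple and block.simple_ref is not None:    # cf. steps 1126, 1128
        fill_row(block.simple_ref, row)
    if follow_complex and block.complex_ref is not None:  # cf. steps 1130, 1132
        fill_row(block.complex_ref, row)
    return row


# Example blocks using the values from the text; the linkage is assumed.
income = DataBlock("INCOME", "54321")
state = DataBlock("STATE", "ABC", complex_ref=income)
street = DataBlock("STREET", "One St", simple_ref=state)
phone = DataBlock("PHONE", "1234", complex_count=2, complex_ref=street)
child = DataBlock("CHILD", "Sam", simple_count=2)
parent = DataBlock("PARENT", "Bob", simple_ref=child, complex_ref=phone)
row = fill_row(parent, {})
```

The resulting row holds plain values for the single-valued columns and table markers for "CHILD" and "PHONE", which a later pass would expand into additional rows.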
At step 1008, the table data is cleared of content and the current depth is incremented to give a depth of 1. At step 1010 the rows in the flat data associated with the current record as shown in Fig. 17B are searched, from the leftmost column to the rightmost, first row to last, to extract any tables, of which there are two, marked as TCHILD and TPHONE, which are stored sequentially in the table data. At step 1012, the table data is checked to see if there are any table entries, which there are, passing control to step 1014. At step 1014, the first table entry TCHILD is indexed, 1683 being the related data block. At step 1016, the complex counter is checked to see if it is greater than 1, which it is not, so control is passed to step 1020. At step 1020, the simple counter is set to 1 and, for each of the originally counted data blocks, the two being data blocks 1683 and 1682, step 1020 performs the same tasks as step 1006, implementing the procedure 1100 of Fig. 11.
The execution for data block 1683 and current depth 1 causes control to be passed through steps 1104, 1106, 1114, 1118, 1120, 1122, 1124, 1126 and 1130, resulting in the data content of data block 1683 or "Sam" being stored in the first row of the "CHILD" column of the flat data shown in Fig. 17C, the first row of the "CHILD" column of the interim data in Fig. 18C being marked with a unique entry marker, the depth in the "CHILD" column of the column depth data in Fig. 19B being changed to the value of current depth or 1 and the type in the "CHILD" column of the type data in Fig. 19A being changed to string.
The execution for data block 1682 and current depth 1 causes control to be passed through steps 1104, 1106, 1114, 1118, 1120, 1122, 1124, 1126 and 1130, resulting in the data content of data block 1682 or "Dan" being stored in the second row of the "CHILD" column of the flat data shown in Fig. 17C, the second row of the "CHILD" column of the interim data in Fig. 18C being marked with a unique entry marker, the depth in the "CHILD" column of the column depth data in Fig. 19B remaining unchanged and the type in the "CHILD" column of the type data in Fig. 19A remaining unchanged.
At step 1022, there is one table TPHONE left in the table data, with data block 1687 being the related data block, so control passes to step 1024, which moves the table index to TPHONE. At step 1016, the complex counter of data block 1687 is 2, causing control to be passed to step 1018. At step 1018, the complex counter is set to 1 and for each of the originally counted data blocks, the two being data blocks 1687 and 1686, the procedure 1100 of Fig. 11 is performed. It will be appreciated that the difference between steps 1018 and 1020 is that the complex and simple counters, respectively, are set to 1.
Alternatively, this process could be separated into 3 steps instead of 2, with one recursive call.
The execution for data block 1687 and current depth 1 causes control to be passed through steps 1104, 1106, 1114, 1118, 1120, 1122, 1124, 1126 and 1130, resulting in the data content of data block 1687 or "1234" being stored in the first row of the "PHONE" column of the flat data shown in Fig. 17C, the first row of the "PHONE" column of the interim data in Fig. 18C being marked with a unique entry marker, the depth in the "PHONE" column of the column depth data in Fig. 19B being changed to the value of current depth or 1, and the type in the "PHONE" column of the type data in Fig. 19A remaining unchanged.
The execution for data block 1686 and current depth 1 causes control to be passed through steps 1104, 1106, 1114, 1118, 1120, 1122, 1124, 1126 and 1128, resulting in the data content of data block 1686 or "12345" being stored in the second row of the "PHONE" column of the flat data shown in Fig. 17C, the second row of the "PHONE" column of the interim data in Fig. 18C being marked with a unique entry marker, the depth in the "PHONE" column of the column depth data in Fig. 19B remaining unchanged, the type in the "PHONE" column of the type data in Fig. 19A remaining unchanged, and the procedure 1100 being called recursively for data block 1690 and current depth 1. The recursive call for data block 1690 and current depth 1 causes control to be passed through steps 1104, 1106, 1114, 1118, 1120, 1122, 1124, 1126 and 1130 and back to step 1130 for data block 1686, resulting in the data content of data block 1690 or "Mbl" being stored in the second row of the "TYPE" column of the flat data shown in Fig. 17C, the second row of the "TYPE" column of the interim data in Fig. 18C being marked with a unique entry marker, the depth in the "TYPE" column of the column depth data in Fig. 19B being changed to the value of the current depth or 1, and the type in the "TYPE" column of the type data in Fig. 19A being changed to string.
At step 1022, there are no tables left in the table data so control passes to step 1026. At step 1026 it is determined that a number of rows were inserted in the flat data, causing control to pass to step 1028. At step 1028, one row was inserted between the data of Fig. 17B and the data of Fig. 17C, so the data in Fig. 17B not containing tables is duplicated in the inserted rows of Fig. 17C as shown in Fig. 17D. Similarly the interim data has these items of duplicated data marked with a duplicated entry marker as shown in Fig. 18D. At step 1008 the table data is reset and the current depth incremented to 2. At step 1010 the flat data of Fig. 17D is searched for any remaining tables, of which there are none. At step 1012 there are no tables left in the table data, completing the first phase (steps 902-912) of data analysis 408.
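The duplication performed at step 1028 can be sketched as a simple fill-down of the single-valued columns into the inserted rows, with the interim data recording which cells are copies. The marker symbols "U", "D" and "N", the function name `duplicate_into_inserted_rows` and the list-of-dictionaries layout are assumptions for illustration; only a few of the example columns are shown.

```python
UNIQUE, DUPLICATE, NO_ENTRY = "U", "D", "N"   # hypothetical marker symbols


def duplicate_into_inserted_rows(flat, interim, scalar_columns):
    """Copy the first row's single-valued cells into every inserted row."""
    first = flat[0]
    for r in range(1, len(flat)):
        for column in scalar_columns:
            flat[r][column] = first[column]
            interim[r][column] = DUPLICATE     # mark the copy as duplicated
    return flat, interim


# State after the tables were expanded: one inserted row, scalars still empty.
flat = [
    {"PARENT": "Bob", "CHILD": "Sam", "PHONE": "1234", "TYPE": None},
    {"PARENT": None, "CHILD": "Dan", "PHONE": "12345", "TYPE": "Mbl"},
]
interim = [
    {"PARENT": UNIQUE, "CHILD": UNIQUE, "PHONE": UNIQUE, "TYPE": NO_ENTRY},
    {"PARENT": NO_ENTRY, "CHILD": UNIQUE, "PHONE": UNIQUE, "TYPE": UNIQUE},
]
duplicate_into_inserted_rows(flat, interim, ["PARENT"])
```

Keeping the duplicate markers separate from the values lets later counting steps ignore copies while the flat data still presents complete rows.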
The first phase of data analysis of the data store 1665 provides for the formation of relatively compact data. A number of variations may alternatively be applied in the production of the flat data and interim data. Those variations may allow the formation of sparser data, wherein a more accurate expression of the hierarchical nature of the information is reflected in the presentation to the user. A disadvantage of this approach is that the analysed data is not as flexible for the presentation in non-hierarchical visualizations. Another variation may be to use cross products of all related data at the same depth, wherein a more flexible expression of the information is reflected in the presentation to the user. A disadvantage of this approach is that the analysed data is usually more bulky.
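The cross product variation mentioned above can be illustrated with a short sketch. It assumes the single-valued cells and the per-column tables of one record are already available as plain Python values; the function name `cross_product_rows` and the argument layout are hypothetical.

```python
from itertools import product


def cross_product_rows(scalars, tables):
    """Build one row per combination of the items of each table column."""
    names = list(tables)
    rows = []
    for combo in product(*(tables[name] for name in names)):
        row = dict(scalars)                 # single-valued cells repeat in every row
        row.update(zip(names, combo))       # one item from each table column
        rows.append(row)
    return rows


rows = cross_product_rows(
    {"PARENT": "Bob"},
    {"CHILD": ["Sam", "Dan"], "PHONE": ["1234", "12345"]},
)
```

For two children and two phone numbers this yields four rows rather than the two rows of the compact form, which illustrates why the cross product form is described as more bulky.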
An example of the interim data 2120 and column depth data 2140 and their corresponding row depth data 2210, unique data 2220 and intersect data 2230, are shown in Figs. 21B, 21D, 22A, 22B and 22C respectively. These examples relate to the progression of the second phase of the data analysis, where the interim data 2120 and column depth data 2140 are analysed in the data analysis component 314 and the row depth data 2210, unique data 2220 and intersect data 2230 are generated and stored in the data cache component 326. This data analysis takes place in step 408 of Fig. 4, which is further detailed by the workflow shown in Fig. 9 steps 913, 914, 916 and 918, with step 918 being further detailed by the process shown in Fig. 12.
In the application 300, the interim data 2120 and column depth data 2140 shown in Figs. 21B and 21D respectively, and stored in the data cache component 326, are used as the input for the second phase of the data analysis in the data analysis component 314.
Starting at step 913 of Fig. 9, the component collection information, retrieved in step 902, is used to create the three sets of output data:
(i) the row depth data 2210 as shown in Fig. 22A, where one row is created for each row of interim data 2120 in Fig. 21B, with the two entries in each row of row depth data 2210 set such that the minimum depth is equal to the smallest depth from column depth data 2140, or 0 from Fig. 21D, and the maximum depth is equal to the largest depth from column depth data 2140, or 2 from Fig. 21D. The data is taken from Fig. 20, and the same method 406, being that described in steps 504, 506, 508, 510 and 512, is used for treating the data store of Fig. 20. The data store created from Fig. 20 is then processed with the same method 408, being steps 902, 904, 908, 910 and 912 as described above;
(ii) the unique data 2220 entries seen in Fig. 22B are all set to 0; and
(iii) the intersect data 2230 entries (Fig. 22C) are all set to 0.
At step 914 the rows in the interim data 2120 of Fig. 21B are parsed starting with the first row. At step 916, it is determined whether a new row of data was found in the previous step, which it was, so control proceeds to step 918 which is further detailed in Fig. 12.
In Fig. 12, processing commences at step 1206 using the content of a row of interim data 2120 from Fig. 21B, starting at the first row 2122 of interim data 2120 and the column depth data 2140 from Fig. 21D. A first cell index is set to reference the cell at the intersection of row 2122 and the first column "Name", of the interim data 2120.
At step 1208, the cell content of the first cell index within row 2122 is checked for marker (ie. No Entry) which in this case it is not, so control proceeds to step 1214.
At step 1214, the counter in the intersect data 2230 in Fig. 22C corresponding with entry [first cell index, first cell index] (cell at intersection of row "Name" and column "Name") is incremented. At step 1216, the column depth data 2140 entry in Fig. 21D corresponding to the first cell index (or column "Name"), which is 0, is used to adjust the minimum and maximum row depths stored in Fig. 22A of the corresponding row.
This is done by performing two checks and actions as required. Firstly, if the row depth data entry in the minimum depth column in the current row (Fig. 22A) is greater than the column depth data entry at the first cell index, then the row depth data entry in the minimum depth column in the current row of Fig. 22A is set to be equal to the column depth data entry at the first cell index. In the present case 2 > 0 is true, and so the row depth data entry in the minimum depth column in the current row is set to 0. Secondly, if the row depth data entry in the maximum depth column in the current row (Fig. 22A) is less than the column depth data entry at the first cell index, then the row depth data entry in the maximum depth column in the current row is set to be equal to the column depth data entry at the first cell index. In the present case 0 < 0 is false, so no action is taken.
At step 1218, the cell content of the first cell index within row 2122 is checked for marker (unique), which in this case it is, so control is passed to step 1220. At step 1220, the counter in the unique data corresponding with entry [first cell index] (entry in column "Name") is incremented. At step 1221, a check is performed to see if there are any cells left in row 2122, which there are, so control is passed to step 1222. At step 1222, the second cell index is set to the cell immediately after that of the first cell index, which will be the cell in column "Age". At step 1224, the cell content of the second cell index within row 2122 is checked for marker, which in this case it is not, so control is passed to step 1226. At step 1226, the counter in the intersect data 2230 corresponding with entry [first cell index, second cell index] (cell at intersection of row "Name" and column "Age") is incremented. At step 1228, a check is performed to see if there are any cells left in row 2122, which there are, so control is passed to step 1230. At step 1230, the second cell index is incremented to reference the cell in column "Dept". Control is passed through steps 1224, 1226, 1228, 1230, 1224, 1228, 1230, 1224, 1228, 1230, 1224, 1228, 1230, 1224, 1228, 1230, 1224, 1228, 1230, 1224 and 1228 as described previously, where step 1228 detects no further cells in row 2122, passing control to step 1212.
At step 1212, the first cell index is incremented to reference the cell in column "Age". Control is passed through steps 1208, 1214, 1216, 1218, 1220, 1221, 1222, 1224, 1228, 1230, 1224, 1228, 1230, 1224, 1228, 1230, 1224, 1228, 1230, 1224, 1228, 1230, 1224, 1228 and 1212 as described previously, where the first cell index is now referencing the cell in column "Dept" of row 2122. At step 1208, the cell content of the first cell index within row 2122 is checked for marker which in this case is true, so control is passed to step 1210. At step 1210, check to see if there are any cells left in row 2122, which there are, so control is passed to step 1212. Control is then passed through steps 1212, 1208, 1210, 1212, 1208, 1210, 1212, 1208, 1210, 1212, 1208, 1210, 1212, 1208, 1210, 1212, 1208 and 1210 as described previously, where there are no further cells to be indexed by the first cell index for row 2122, causing the control to pass to step 914 of Fig. 9.
At step 914 the next row of interim data 2120 is fetched from Fig. 21B. At step 916, control passes to step 918 as a row of interim data 2120 was found. At step 918, the row of interim data 2120 is processed in the same way as described above and the appropriate entries in the row depth data 2210, unique data 2220 and intersect data 2230 are updated. This process repeats for all of the interim data 2120 from row 2122 through to row 2124, to produce the completed row depth data 2210, unique data 2220 and intersect data 2230 as shown in Figs. 22A, 22B and 22C respectively. Note that unique data is simply a count of the unique entries. Further, each column is stand-alone, so if any content is wanted in a column, a entry is required, remembering that is a duplicate of a entry.
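The row-by-row processing of Fig. 12 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the described implementation: the cell encoding (None for a No Entry marker, otherwise a value paired with a flag standing for the unique marker), the function name analyse_rows and the toy rows are all hypothetical, and each row's depth range is assumed to start with its minimum at the largest column depth and its maximum at the smallest, as the comparisons in the worked example suggest.

```python
from typing import List, Optional, Tuple

# Assumed cell encoding: None stands for a "No Entry" marker; otherwise
# (value, is_unique), where is_unique stands for the "unique" marker.
Cell = Optional[Tuple[str, bool]]


def analyse_rows(rows: List[List[Cell]], column_depth: List[int]):
    """Return (row_depth, unique, intersect) for interim-style rows."""
    n = len(column_depth)
    deepest, shallowest = max(column_depth), min(column_depth)
    # Assumed initialisation: [minimum, maximum] starts as [deepest, shallowest].
    row_depth = [[deepest, shallowest] for _ in rows]
    unique = [0] * n
    intersect = [[0] * n for _ in range(n)]

    for r, row in enumerate(rows):
        for first, cell in enumerate(row):
            if cell is None:                        # steps 1208/1210
                continue
            intersect[first][first] += 1            # step 1214
            depth = column_depth[first]             # step 1216
            if row_depth[r][0] > depth:
                row_depth[r][0] = depth
            if row_depth[r][1] < depth:
                row_depth[r][1] = depth
            if cell[1]:                             # steps 1218/1220
                unique[first] += 1
            for second in range(first + 1, n):      # steps 1222 to 1230
                if row[second] is not None:
                    intersect[first][second] += 1   # step 1226
    return row_depth, unique, intersect


# Toy input (hypothetical, not the data of Fig. 21B).
toy_rows = [
    [("a", True), ("1", True), None],
    [("a", False), None, ("x", True)],
]
toy_depth = [0, 0, 1]
toy_row_depth, toy_unique, toy_intersect = analyse_rows(toy_rows, toy_depth)
```

Only the upper triangle of the intersect matrix is filled, because the second cell index always starts after the first cell index.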
4. Build and Render Visualization
This section describes the method by which the visualization generation component 310 takes the data stored by the data analysis component 314 in the data cache component 326 to build and render a non-hierarchical visualization in the graphical presentation component 304, according to visualization type, visualization manner and navigation selection information stored in the data cache component 326.
The navigation selection information is an indicator of the portion of the complex data to visualize by means of the depth. This built and rendered visualization of the graphical presentation component 304 is then presented to the user via the display 114. Initially the visualization type, visualization manner and navigation selection information or depths selected are set to the default values. For example, the visualisation type could be defined as tabular, and the navigation selection information may be set to depths of 0 and 1 respectively, assuming that depth 1 is a valid depth for the given data; otherwise just depth 0 could be used. Other default values may be used, and the visualisation manner would not be required as the visualisation type is tabular. For example, the defaults may be based on user preference, or alternatively based on the data being presented, such as the number of columns to be presented and the intersection of those columns with each other. Other means for deciding the defaults may be chosen.
There are two sets of data generated by visualization generation component 310, those being proportion data 2300 and visualization order data 2400. The proportion data 2300 such as that shown in Fig. 23, indicates the combinations of data columns that are meaningful in terms of intersecting data content for the given navigation selection information. This is effectively correlation of the unique data values. The visualization order data 2400 such as that shown in Fig. 24, indicates what types of visualization are possible for the respective entries in the proportion data 2300. The proportion data 2300 and visualization order data 2400 are constructed from the type data 2130, column depth data 2140, unique data 2220 and intersect data 2230 such as those shown in Figs. 21C, 21D, 22B and 22C respectively, which are taken from the data stored by the data analysis component 314 in the data cache component 326. The method by which this is done is described in detail later in this section. From the proportion data 2300 and visualization order data 2400, the visualization types that are possible can be deduced, as well as the manner in which these visualizations can be presented to the user. As such, the proportion data 2300 and visualization order data 2400 are required when a new visualization type is selected, or modifications are made to the manner in which the visualization is presented. The proportion data 2300 and visualization order data 2400 are therefore constructed whenever the navigation selection information is changed or initialized. In the described arrangements, the proportion data 2300 and visualization order data 2400 are always constructed, if required, prior to the build and render.
However it is not essential to perform the construction first. For instance, since tabular visualizations do not require this information, it is possible to build and render a tabular visualization prior to the construction of the proportion data 2300 and visualization order data 2400, which may allow for faster presentations.
In order to build and render a non-hierarchical visualization, the visualization generation component 310, takes the data stored by the data analysis component 314 in the data cache component 326, namely the flat data 2110, column depth data 2140, row depth data 2210, visualization order data 2400 and the navigation selection information such as those shown in Figs. 21A, 21D, 22A and 24 and indicated by the selected depths.
This visualization can then be used by the visualization presentation component 304 for presentation to the graphical user interface 302. Once again the method in which this is done is described in detail later in this section. The relevant data indicated by the flat data 2110, column depth data 2140 and row depth data 2210 is built strictly according to the navigation selection information for the selected visualization type and presented in the manner indicated. As previously mentioned, if a tabular visualization is required, the visualization order data is not required.
An example of using the type data 2130, column depth data 2140, unique data 2220 and intersect data 2230 to construct the proportion data 2300 and visualization order data 2400 as shown in Figs. 21C, 21D, 22B, 22C, 23 and 24 respectively, with the navigation selection information of depths 0 and 1 follows. This data construction phase takes place in the visualization generation component 310 at step 412 of Fig. 4, with the construction of the proportion data 2300 being further detailed by the workflow 1300 shown in Fig. 13.
In the application 300, the column depth data 2140, unique data 2220 and intersect data 2230 as shown in Figs. 21D, 22B and 22C and stored in the data cache component 326 are used as input data along with the visualization type, visualization manner and navigation selection information for the visualization generation component 310. As previously mentioned, the proportion data 2300 and related visualization order data 2400 are only generated if the navigation selection data is initialized or altered and non-tabular visualizations are required. So, the first time step 412 is encountered results in the generation of the proportion data 2300 according to the method 1300 of Fig. 13, with the navigation selection information of depths 0 and 1. Before continuing, a few terms are defined for use in this section:
"Attribute Index": An attribute index is the index of the column in the flat data 2110. For example, in Fig. 21A, an attribute index of 2 refers to all entries in the "Dept" column. The described implementation uses a tabular index, starting at 0, as a means of referencing the entries, but is not limited to the use of this kind of reference.
"Pair": A pair is a sequence of two unique attribute indexes, or $(Idx_1, Idx_2)$. Here: $0 \le Idx_1 < Idx_2 < NumberOfAttributes$.
"Pair Relationship": A pair relationship refers to a relationship which exists between the two specified columns of data as indicated by the attribute indexes which make up the pair.
"Triplet": A triplet is a sequence of three unique attribute indexes, or $(Idx_1, Idx_2, Idx_3)$. Here: $0 \le Idx_1 < Idx_2 < Idx_3 < NumberOfAttributes$.
"Triplet Relationship": A triplet relationship refers to three pair relationships which exist between the three specified columns of data as indicated by the attribute indexes which make up the triplet. The order in which the three pair relationships are expressed is $(Idx_1, Idx_2)$, $(Idx_1, Idx_3)$ and $(Idx_2, Idx_3)$.
"Quartet": A quartet is a sequence of four unique attribute indexes, or $(Idx_1, Idx_2, Idx_3, Idx_4)$. Here: $0 \le Idx_1 < Idx_2 < Idx_3 < Idx_4 < NumberOfAttributes$.
"Quartet Relationship": A quartet relationship refers to six pair relationships which exist between the four specified columns of data as indicated by the attribute indexes which make up the quartet. The order in which the six pair relationships are expressed is $(Idx_1, Idx_2)$, $(Idx_1, Idx_3)$, $(Idx_1, Idx_4)$, $(Idx_2, Idx_3)$, $(Idx_2, Idx_4)$ and $(Idx_3, Idx_4)$.
To generate the proportion data 2300, Fig. 13 is used starting at step 1304. At step 1304 two columns of dimension data entries are completed according to the following methods:
Column: Idx
Description: All attribute indexes possible for the given navigation selection information in which there is more than one unique data entry.
Method: This is done by listing all attribute indexes in the proportion data 2300 of Fig. 23 for the given depths selected where there is more than one unique data entry. The depths for those attribute indexes are found in the column depth data 2140 of Fig. 21D and the numbers of unique data entries for those attribute indexes are found in the unique data in Fig. 22B.
Example: So, for a depth selection of 0 and 1, the attribute indexes 0, 1, 2, 3, 4 and 7 ("Name", "Age", "Dept", "Product", "Price" and "TotalItems" columns) are found to be within the depth selected according to the column depth data 2140 of Fig. 21D and have 2, 2, 2, 4, 4 and 4 unique data entries according to the unique data 2220 of Fig. 22B, so all these attribute indexes are valid and are therefore recorded in the Idx column of the proportion data 2300 in Fig. 23 as shown.
Column: Uniq
Description: The number of unique data entries.
Method: This is done by finding the unique data 2220 in Fig. 22B corresponding with that of each attribute index indicated by Idx and recording it in the corresponding entry of the Uniq column in the proportion data 2300 of Fig. 23.
Example: So, for the attribute indexes 0, 1, 2, 3, 4 and 7 taken from the Idx column of the proportion data 2300 in Fig. 23, the corresponding values for Uniq will be 2, 2, 2, 4, 4 and 4 from the data obtained by using the attribute indexes to reference the unique data 2220 of Fig. 22B. This is shown in the Uniq column of the proportion data 2300 in Fig. 23.
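Step 1304 amounts to a filter over the attribute indexes. The sketch below is illustrative only: the function name idx_and_uniq is hypothetical, and the per-column depths and the unique counts for attribute indexes 5 and 6 are invented placeholder values, since the contents of Figs. 21D and 22B are not reproduced here. Only the selected indexes and their unique counts (2, 2, 2, 4, 4 and 4) follow the example.

```python
def idx_and_uniq(column_depth, unique, selected_depths):
    """Return the Idx and Uniq columns for the selected depths.

    An attribute index is kept when its column depth is one of the
    selected depths and it has more than one unique data entry.
    """
    idx = [
        i
        for i, depth in enumerate(column_depth)
        if depth in selected_depths and unique[i] > 1
    ]
    uniq = [unique[i] for i in idx]
    return idx, uniq


# Placeholder inputs (assumed values, chosen to match the worked example).
example_column_depth = [0, 0, 0, 1, 1, 2, 2, 1]
example_unique = [2, 2, 2, 4, 4, 3, 3, 4]
example_idx, example_uniq = idx_and_uniq(
    example_column_depth, example_unique, selected_depths={0, 1}
)
```

If the Idx column comes back empty, the method 1300 would end at step 1306, as described.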
At step 1306, the number of entries recorded in the first two columns is calculated and if larger than 0, which it is in this case, control passes to step 1308. If there were no entries recorded in the first two columns, the method 1300 would be complete and end. At step 1308 three columns of dimension data entries are completed according to the following methods:
Column: Pair
Description: All pairs of attribute indexes possible for the given navigation selection information.
Method: This is done by listing all combinations of attribute indexes listed in the Idx column of the proportion data 2300 in Fig. 23 in the Pair column of the proportion data in Fig. 23, excluding all values where:
$$\mathrm{intersect}[Idx_1, Idx_2] \times \max\left(\frac{\mathrm{unique}[Idx_1]}{\mathrm{intersect}[Idx_1, Idx_1]}, \frac{\mathrm{unique}[Idx_2]}{\mathrm{intersect}[Idx_2, Idx_2]}\right) < 1$$
Here unique[a] is the data at the entry in column a of the unique data 2220 in Fig. 22B, and intersect[a, b] is the data at the entry in row a and column b of the intersect data 2230 in Fig. 22C.
Example: So, for the attribute indexes 0, 1, 2, 3, 4 and 7 taken from the Idx column of the proportion data 2300 in Fig. 23, all possible combinations will be (0, 1), (0, 2), (0, 3), (0, 4), (0, 7), (1, 2), (1, 3), (1, 4), (1, 7), (2, 3), (2, 4), (2, 7), (3, 4), (3, 7) and (4, 7). Now for the combination (0, 1) the calculation above results in:
$$\mathrm{intersect}[0,1] \times \max\left(\frac{\mathrm{unique}[0]}{\mathrm{intersect}[0,0]}, \frac{\mathrm{unique}[1]}{\mathrm{intersect}[1,1]}\right) = 2 \times \max\left(\frac{2}{2}, \frac{2}{2}\right) = 2$$
This is greater than or equal to 1, making this combination a valid pair, so it is recorded in the Pair column of the proportion data 2300 in Fig. 23. Similarly, for the combination (0, 2) the calculation above results in:
$$\mathrm{intersect}[0,2] \times \max\left(\frac{\mathrm{unique}[0]}{\mathrm{intersect}[0,0]}, \frac{\mathrm{unique}[2]}{\mathrm{intersect}[2,2]}\right) = 0$$
because intersect[0,2] is 0. This is not greater than or equal to 1, which means this combination should be ignored. This is repeated for all combinations above, resulting in the determination that the combinations (2, 3), (2, 4), (2, 7), (3, 4), (3, 7) and (4, 7) are valid pairs, which are added to the Pair column of the proportion data 2300 in Fig. 23.
Column: Overlap
Description: The portion of all overlapping data entry combinations indicated by the pair.
Method: This is done by calculating an Overlap value in the proportion data 2300 of Fig. 23, according to the corresponding attribute indexes in the Pair column of the proportion data 2300, where:
$$\mathrm{Overlap} = \frac{100 \times \mathrm{intersect}[Idx_1, Idx_2]}{\mathrm{intersect}[Idx_1, Idx_1] + \mathrm{intersect}[Idx_2, Idx_2] - \mathrm{intersect}[Idx_1, Idx_2]}$$
Example: For the first pair (0, 1), which is taken from the Pair column in the proportion data of Fig. 23, the corresponding value of Overlap will be calculated as:
$$\frac{100 \times \mathrm{intersect}[0,1]}{\mathrm{intersect}[0,0] + \mathrm{intersect}[1,1] - \mathrm{intersect}[0,1]} = \frac{100 \times 2}{2 + 2 - 2} = 100$$
This is recorded at the corresponding entry of the Overlap column in the proportion data of Fig. 23. Similarly, for the other values in the Pair column in the proportion data of Fig. 23, (2, 3), (2, 4), (2, 7), (3, 4), (3, 7) and (4, 7), the corresponding overlap is calculated as 100, 100, 100, 100, 100 and 100, which are also recorded at the corresponding entries of the Overlap column in the proportion data of Fig. 23.
Column: Nbr
Description: The number of intersecting data items in a pair.
Method: This is done by calculating a Nbr value in the proportion data 2300 according to the corresponding attribute indexes in the Pair column of the proportion data 2300, where:
$$\mathrm{Nbr} = \mathrm{intersect}[Idx_1, Idx_2] \times \max\left(\frac{\mathrm{unique}[Idx_1]}{\mathrm{intersect}[Idx_1, Idx_1]}, \frac{\mathrm{unique}[Idx_2]}{\mathrm{intersect}[Idx_2, Idx_2]}\right)$$
Example: For the first pair (0, 1), which is taken from the Pair column in the proportion data 2300, the corresponding value for Nbr will be calculated as:
$$\mathrm{intersect}[0,1] \times \max\left(\frac{\mathrm{unique}[0]}{\mathrm{intersect}[0,0]}, \frac{\mathrm{unique}[1]}{\mathrm{intersect}[1,1]}\right) = 2 \times \max\left(\frac{2}{2}, \frac{2}{2}\right) = 2$$
This is recorded at the corresponding entry of the Nbr column in the proportion data 2300. Similarly, for the other values in the Pair column in the proportion data 2300, (2, 3), (2, 4), (2, 7), (3, 4), (3, 7) and (4, 7), the corresponding entries of the Nbr column in the proportion data 2300 will be calculated as 4, 4, 4, 4, 4 and 4, which are also recorded.
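The three columns completed at step 1308 can be sketched together. This is an illustration under assumptions, not the described implementation: the function names are hypothetical, the guards against a zero denominator are an added safety measure that the text does not mention, and the small matrix below is toy data in which only the entries for attribute indexes 0 and 1 (and the zero at [0][2]) follow the worked example.

```python
from itertools import combinations


def nbr(intersect, unique, a, b):
    """intersect[a][b] scaled by the larger unique/intersect ratio."""
    # Zero guards are an assumption; the text does not cover empty columns.
    ratio_a = unique[a] / intersect[a][a] if intersect[a][a] else 0.0
    ratio_b = unique[b] / intersect[b][b] if intersect[b][b] else 0.0
    return intersect[a][b] * max(ratio_a, ratio_b)


def overlap(intersect, a, b):
    """Percentage of rows populated in either column that are populated in both."""
    total = intersect[a][a] + intersect[b][b] - intersect[a][b]
    return 100 * intersect[a][b] / total if total else 0.0


def valid_pairs(idx, intersect, unique):
    """Keep the combinations of attribute indexes whose Nbr value is at least 1."""
    return [
        (a, b)
        for a, b in combinations(idx, 2)
        if nbr(intersect, unique, a, b) >= 1
    ]


# Toy data: upper-triangular intersect counts for three attributes.
toy_unique = [2, 2, 2]
toy_intersect = [
    [2, 2, 0],
    [0, 2, 0],
    [0, 0, 2],
]
toy_pairs = valid_pairs([0, 1, 2], toy_intersect, toy_unique)
```

Because the pair test and the Nbr column use the same expression, an implementation could compute it once per combination and reuse the value.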
At step 1310, the number of entries recorded in the above three columns is calculated and if larger than 0, which it is in this case, control passes to step 1312. If there were no entries recorded in the above three columns, the method 1300 would be complete and end. At step 1312, two columns of dimension data entries in the proportion data 2300 are completed according to the following methods:
Column: Triplet
Description: All triplets of attribute indexes possible for the given navigation selection information.
Method: This is done by finding all possible combinations of the pairs taken from the Pair column in the proportion data 2300, such that they form a valid triplet, each of which is recorded in a separate entry in the Triplet column of the proportion data 2300.
Example: A primary pair is designated from the first pair (0, 1) in the Pair column, with a secondary pair starting at the pair after the primary, which is (2, 3), and the two are compared, checking for an identical first attribute index in the two pairs. The secondary would move to the next pair and so on while there is an identical first attribute index in the two pairs, until there are no more secondary pairs to move on to. But in this case, the secondary stops moving after the primary and secondary are compared, since the first attribute indexes are not the same. The primary pair then moves to the next pair (2, 3) and the secondary pair moves to start after the new primary, or (2, 4), which are compared checking for an identical first attribute index in the two pairs, which is the case for attribute index 2. An exact match for the final triplet pair (3, 4) is searched for in the Pair column, starting after the secondary pair, and found, thereby making (2, 3, 4) a valid triplet. The secondary moves to the next pair (2, 7), which is compared with the primary (2, 3) for an identical first attribute index in the two pairs, which is the case for attribute index 2, so an exact match for the final triplet pair (3, 7) is searched for in the Pair column, starting at the secondary pair, and found, making (2, 3, 7) a valid triplet. The secondary moves to the next pair (3, 4), which is compared with the primary (2, 3) for an identical first attribute index in the two pairs, which is not the case, so the primary is moved to the next pair (2, 4) and the secondary is set to (2, 7) accordingly. The secondary (2, 7) is compared with the primary (2, 4) for an identical first attribute index in the two pairs, which is the case for attribute index 2, so the final triplet pair (4, 7) is searched for as above and found, making (2, 4, 7) a valid triplet. This continues until the primary reaches the last pair, making the valid list of triplets (2, 3, 4), (2, 3, 7), (2, 4, 7) and (3, 4, 7), which are recorded in the Triplet column of the proportion data 2300 in Fig. 23.
Column: Triplet Relationship
Description: The portion and number of data items intersecting is indicated by referencing or indexing the three applicable pairs from which the triplet is composed.
Method: This is done by finding the three indexes to the pairs identifying the three parts of the relationship corresponding to the Triplet column of the proportion data 2300 in Fig. 23. Where $(Idx_1, Idx_2, Idx_3)$ is the triplet, the three Pair indexes will be in the following order: $(Idx_1, Idx_2)$, $(Idx_1, Idx_3)$ and $(Idx_2, Idx_3)$.
Example: So, starting with the first triplet from the Triplet column of the proportion data 2300, or (2, 3, 4), the three relationships to be identified are (2, 3), (2, 4) and (3, 4) which, using the 0 indexed rows from the Pair column, give indexes 1, 2 and 4, or (1, 2, 4), which is recorded in the corresponding entry of the Triplet Relationship column of the proportion data 2300. This is repeated for all triplets of the proportion data 2300. This will result in (1, 2, 4), (1, 3, 5), (2, 3, 6) and (4, 5, 6) being added to the Triplet Relationship column of the proportion data 2300.
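The triplet search and the triplet relationship lookup of step 1312 can be sketched as below. This is a hedged illustration: the function names are hypothetical, and the pair list is the one implied by the worked example (it is partly reconstructed from the pair indexes quoted there, so treat it as an assumption). The sketch relies on the Pair column being sorted, so that the secondary pair can stop at the first pair whose first attribute index differs from that of the primary.

```python
def build_triplets(pairs):
    """Combine pairs (a, b) and (a, c) into (a, b, c) when (b, c) is also a pair."""
    triplets = []
    for p, (a, b) in enumerate(pairs):
        for s in range(p + 1, len(pairs)):
            first, c = pairs[s]
            if first != a:
                break  # sorted pairs: no later secondary can share index a
            # Search for the final pair after the secondary pair.
            if (b, c) in pairs[s + 1:]:
                triplets.append((a, b, c))
    return triplets


def triplet_relationship(triplet, pairs):
    """Return the 0-based Pair-column rows for (a, b), (a, c) and (b, c)."""
    a, b, c = triplet
    return (pairs.index((a, b)), pairs.index((a, c)), pairs.index((b, c)))


# Pair column as implied by the worked example (an assumption).
example_pairs = [(0, 1), (2, 3), (2, 4), (2, 7), (3, 4), (3, 7), (4, 7)]
example_triplets = build_triplets(example_pairs)
example_relationships = [
    triplet_relationship(t, example_pairs) for t in example_triplets
]
```

Storing row indexes rather than copies of the Overlap and Nbr values keeps the triplet entries small, at the cost of one extra lookup when the values are needed.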
At step 1314, the number of entries recorded in the triplet and triplet relationship columns is calculated and if larger than 0, which it is in this case, control passes to step 1316. If there were no entries recorded in those two columns, the method 1300 would be complete and end. At step 1316 two final columns of dimension data entries are completed for the proportion data 2300 according to the following methods:
Column: Quartet
Description: All quartets of attribute indexes possible for the given navigation selection information.
Method: This is done by finding all possible combinations of the pairs taken from the Pair column in the proportion data 2300, such that they form a valid quartet, each of which is recorded in a separate entry in the Quartet column of the proportion data 2300.
Example: Starting by designating a primary triplet from the first triplet (2, 3, 4) and a secondary triplet from the triplet after the primary, which is (2, 3, 7), the two triplets are compared, checking if the second and third attribute indexes from the primary triplet and the first and second attribute indexes from the secondary triplet are identical. As (3, 4) and (2, 3) do not match, the secondary moves to the next triplet (2, 4, 7). Similarly here (3, 4) and (2, 4) do not match, so the secondary moves to the next triplet (3, 4, 7). Here (3, 4) matches (3, 4), so an exact match for the pair (2, 7), formed from the first attribute index of the primary and the third attribute index of the secondary, is searched for in the Pair column and found, making (2, 3, 4, 7) a valid quartet. As there are no more triplets for the secondary to move to, the primary moves to the next triplet, or (2, 3, 7), and the secondary is set to the triplet after the primary, or (2, 4, 7), which are compared checking whether the second and third attribute indexes of the primary triplet and the first and second attribute indexes of the secondary triplet are identical. As (3, 7) and (2, 4) do not match, the secondary is moved on until there is a match or the end is reached, when the primary will be advanced until it reaches the last triplet, making the valid list of quartets (2, 3, 4, 7).
Column: Quartet Relationship
Description: The portion and number of data items intersecting is indicated by referencing or indexing the six applicable pairs from which the quartet is composed.
Method: This is done by finding the six indexes to the pairs identifying the six parts of the relationship corresponding to the Quartet column of the proportion data 2300 in Fig. 23. Where $(Idx_1, Idx_2, Idx_3, Idx_4)$ is the quartet, the six Pair indexes will be in the following order: $(Idx_1, Idx_2)$, $(Idx_1, Idx_3)$, $(Idx_1, Idx_4)$, $(Idx_2, Idx_3)$, $(Idx_2, Idx_4)$ and $(Idx_3, Idx_4)$.
Example: So, starting at the first quartet from the Quartet column of the proportion data in Fig. 23, or (2, 3, 4, 7), the relationships to be identified are (2, 3), (2, 4), (2, 7), (3, 4), (3, 7) and (4, 7) which, using the 0 indexed rows from the Pair column, give indexes 1, 2, 3, 4, 5 and 6, or (1, 2, 3, 4, 5, 6), which is recorded in the corresponding entry of the Quartet Relationship column of the proportion data 2300 in Fig. 23. This is repeated for all quartets of the proportion data in Fig. 23. In this case there are no more, so only (1, 2, 3, 4, 5, 6) will be added to the Quartet Relationship column of the proportion data 2300.
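Step 1316 follows the same pattern one level up, and can be sketched as below. As before, this is only an illustration: the function names are hypothetical, and the pair and triplet lists are those implied by the worked example (partly reconstructed, so an assumption). Only the pair formed by the first index of the primary and the last index of the secondary is checked, as in the example.

```python
from itertools import combinations


def build_quartets(triplets, pairs):
    """Join triplets (a, b, c) and (b, c, d) into (a, b, c, d) when (a, d) is a pair."""
    quartets = []
    for p, (a, b, c) in enumerate(triplets):
        for b2, c2, d in triplets[p + 1:]:
            if (b2, c2) == (b, c) and (a, d) in pairs:
                quartets.append((a, b, c, d))
    return quartets


def quartet_relationship(quartet, pairs):
    """Return the six 0-based Pair-column rows in the stated order."""
    # combinations() yields (1,2), (1,3), (1,4), (2,3), (2,4), (3,4) by position.
    return tuple(pairs.index(pair) for pair in combinations(quartet, 2))


# Pair and Triplet columns as implied by the worked example (an assumption).
example_pairs = [(0, 1), (2, 3), (2, 4), (2, 7), (3, 4), (3, 7), (4, 7)]
example_triplets = [(2, 3, 4), (2, 3, 7), (2, 4, 7), (3, 4, 7)]
example_quartets = build_quartets(example_triplets, example_pairs)
example_quartet_rel = [
    quartet_relationship(q, example_pairs) for q in example_quartets
]
```

The same joining step could in principle be repeated to build larger relationships, if more attribute indexes had to be combined in one visualization.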
The proportion calculations need not be carried out in the exact order or by the means described above. For example, improvements can be made to the basic calculations themselves, and optimizations can be made to the order in which they are carried out.
For example, closer inspection of the calculations described above shows a number of repeated calculations which could be performed only once. In addition, there is a certain bias in the given calculations which may be tailored for the specific applications or data expectations. Further, the described implementation only uses a Quartet Relationship due to the maximum number of spatial dimensions used in the visualizations described. The calculations may be readily expanded to encompass cases where more data is to be visualized in more extensive non-hierarchical visualizations, examples of which include a plot visualization with 3 axes X, Y and Z, a data series and the addition of wall data, floor data and coloration data, which would require a septet relationship to be defined, and so on.
The proportion data 2300 generated above, as shown in Fig. 23, can be used in many useful ways. One of these is to form a statistical feedback for each of the non-tabular visualizations built and displayed. This will give the user an indication of how much of the specified data content has been visualized and/or where in the utilized visualization there is no data presented. This can be done by using the percentage in the Overlap column of the proportion data 2300 and presenting it for each pair in the relationships identified by the visualization. For example, if a plot visualization with two axes X and Y were to be used with "Price" and "TotalItems" respectively, as per Fig. 34B, there is only one relevant relationship pair, that of the X and Y axes. This is equivalent to the pair (4, 7), which in the proportion data 2300 of Fig. 23 has a corresponding overlap of 100%. This could be displayed in a gauge or graph, or using other means, to let the user know that all of the selected data is being presented to them. As indicated previously in this section, there are situations where a pair will not be included in the proportion data 2300 of Fig. 23; in this case the overlap would have to be produced using the calculation specified previously. Similarly, for more complex visualizations, there will be many pair relationships which may each be displayed in the way specified above for each pair, or alternately, the overlap for each pair may be combined to one value for presentation in a single gauge as described above.
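A small sketch of this feedback lookup follows. It is illustrative only: the function names and the dictionary layout are hypothetical, the fallback applies the Overlap calculation given earlier for pairs that were not recorded, and combining several pairs by taking the smallest overlap is purely an assumption, since the text does not say how the values would be combined. Apart from the 100% overlap of pair (4, 7), the numbers are toy values.

```python
def pair_overlap(pair, recorded, intersect):
    """Return the Overlap percentage for a pair, computing it if not recorded."""
    if pair in recorded:
        return recorded[pair]
    a, b = pair
    shared = intersect[(a, b)]
    total = intersect[(a, a)] + intersect[(b, b)] - shared
    return 100 * shared / total if total else 0.0


def combined_overlap(pairs, recorded, intersect):
    """Collapse several pair overlaps into one gauge value (assumed: the minimum)."""
    return min(pair_overlap(pair, recorded, intersect) for pair in pairs)


# Overlap recorded for the ("Price", "TotalItems") pair in the example.
recorded_overlap = {(4, 7): 100.0}
# Toy intersect counts for a pair that was excluded from the Pair column.
toy_intersect = {(0, 0): 2, (2, 2): 2, (0, 2): 0}

xy_feedback = pair_overlap((4, 7), recorded_overlap, toy_intersect)
missing_feedback = pair_overlap((0, 2), recorded_overlap, toy_intersect)
```

A mean or a product would be equally plausible ways of combining the values; the minimum simply highlights the weakest relationship in the visualization.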
An example of some type data 2130 and proportion data 2300 and their corresponding visualization order data 2400 are shown in Figs. 21C, 23 and 24 respectively, for the navigation selection information of depths 0 and 1. The second part of this data construction phase takes place in the visualization generation component 310 also at step 412 of Fig. 4.
In the application 300, the type data 2130 and proportion data 2300 are stored in the data cache component 326 and are used as input data along with the visualization rules for visualization generation component 310. The following rules and preferences may be applied:
1. For tabular visualizations:
Proportionality: Not applicable.
Type Order: Not applicable.
Range: Not applicable.
Preferences: Not applicable.
2. For plot visualizations:
Proportionality:
    2D (X-Axis, Y-Axis)
    3D (X-Axis, Y-Axis, Z-Axis)
    2D+ (Data Series, X-Axis, Y-Axis)
    3D+ (Data Series, X-Axis, Y-Axis, Z-Axis)
Type Order:
    X-Axis: Number | Currency | Date-Time, String
    Y-Axis: Number | Currency, Date-Time | String
    Z-Axis: Number | Currency | Date-Time, String
    Data Series: String, Number | Currency | Date-Time
Range:
    1 ≤ X-Axis entries ≤ ∞
    1 ≤ Y-Axis entries ≤ ∞
    1 ≤ Z-Axis entries ≤ ∞
    1 ≤ Data Series entries ≤ 12
Preferences:
    Data Series attribute index at the highest depth
    Data Series unique entries ≤ Z-Axis unique entries
    Z-Axis unique entries ≤ X-Axis unique entries
    X-Axis unique entries ≤ Y-Axis unique entries
3. For bar visualizations:
Proportionality:
    2D (X-Axis, Y-Axis)
    3D (X-Axis, Y-Axis, Z-Axis)
    2D+ (Data Series, X-Axis, Y-Axis)
    3D+ (Data Series, X-Axis, Y-Axis, Z-Axis)
Type Order:
    X-Axis: Number | Currency | Date-Time, String
    Y-Axis: Number | Currency, Date-Time
    Z-Axis: Number | Currency | Date-Time, String
    Data Series: String, Number | Currency | Date-Time
Range:
    1 ≤ X-Axis entries ≤ 12
    1 ≤ Y-Axis entries ≤ ∞
    1 ≤ Z-Axis entries ≤ ∞
    1 ≤ Data Series entries ≤ 12
Preferences:
    Data Series attribute index at the highest depth
    Data Series unique entries ≤ Z-Axis unique entries
    Z-Axis unique entries ≤ X-Axis unique entries
    X-Axis unique entries ≤ Y-Axis unique entries
4. For pie visualizations:
Proportionality:
    2D (Data Label, Data Value)
    2D+ (Data Series, Data Label, Data Value)
Type Order:
    Data Label: String, Number | Currency | Date-Time
    Data Value: Number | Currency
    Data Series: String, Number | Currency | Date-Time
Range:
    1 ≤ Data Label entries ≤ 12
    1 ≤ Data Value entries ≤ 12
    1 ≤ Data Series entries ≤ 12
Preferences:
    Data Series unique entries ≤ Data Label unique entries
    Data Series unique entries ≤ Data Value unique entries
These rules and preferences indicated above are merely exemplary and indicative of the types of things that may be checked in the non-hierarchical visualizations of which the system may be capable. Different implementations may permit the tailoring of existing rules and preferences, or the addition of more rules and preferences for new visualization types.
If the proportionality rules identified above are applied to the proportion data 2300, the visualization order data 2400 is generated such that the dimensioning information is identified for each of the possible visualizations. This means that if the tabular, plot, bar and pie visualization examples detailed were to be used, there would be ten sets of data possible: none for tabular; four for the plot visualization, as there are two different spatial dimensions available, "2D" and "3D", and the ability for both to have a data series as shown in the plot proportionality rules; four for the bar visualization, as there are two different spatial dimensions available, "2D" and "3D", and the ability for both to have a data series as shown in the bar proportionality rules; and two for the pie visualization, as there are only two spatial dimensions available, "2D", and the ability for it to have a data series as shown in the pie proportionality rules noted above.
The visualization order data 2400 resulting from the proportion data 2300 in Fig. 23, with the application of the rules using the type data 2130 in Fig. 21C, is shown in Fig. 24. This is generated by copying the entries of the Pair column of the proportion data 2300 to the "2D" columns of the visualization order data, the Triplet column of the proportion data to the "2D+" and "3D" columns of the visualization order data, and the Quartet column of the proportion data to the "3D+" columns of the visualization order data. The indexes are then validated for order and range by applying the rules and ensuring that the type of each attribute index is supported by the visualization, which can be determined from the type data 2130 in Fig. 21C and the corresponding Nbr column of the proportion data 2300 for the attribute index in Fig. 23. If the check fails, the attribute indexes in the "2D", "2D+", "3D" and "3D+" columns of the visualization order data 2400 can be reordered so that the check can be carried out again, until no further reordering is possible. When this is the case, the item is removed from the visualization order data 2400, indicating that it is not possible to visualize that set of data.
An example of this is taken from the "2D" column of the visualization order data 2400 in the bar section thereof. The first item (0, 1) will be copied from the first pair in the proportion data 2300. This relates to the bar visualization for (X-Axis, Y-Axis), where the type of the attribute index 0 is a "String" from the type data 2130 and the Nbr is 2 from the proportion data 2300, which is a possible X-Axis type and within the possible X-Axis range according to the proportionality rules for a bar visualization. Similarly, the attribute index 1 is a "Number" from the type data and the Nbr is 2 from the proportion data 2300, which is a possible Y-Axis type and within the possible Y-Axis range, making this order valid. The next item (2, 3) will be copied from the second pair in the proportion data 2300. Here the type of the attribute index 2 is a "String" from the type data and the Nbr is 2 from the proportion data 2300, which is a possible X-Axis type and within the possible X-Axis range according to the rules. The attribute index 3, however, is a "String" from the type data 2130 with an Nbr of 2 from the proportion data 2300, which is not a possible Y-Axis type, so a different order, (3, 2), is tried. This once again fails on the Y-Axis type check and there are no other possible orders to try, so the entry is removed from the visualization order data in Fig. 24. The remainder of the "2D" column is completed this way, remembering that if an alternate order were possible where the original was not, the entry in the visualization order data would be updated to indicate the new order.
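The validate-and-reorder step of the bar example above can be sketched as follows. This is a minimal illustration only: the rule table is the bar "2D" rule as read from the proportionality rules, while the function name `first_valid_order`, the dictionaries `types` and `nbr`, and the treatment of Nbr as a per-index value are assumptions made for the sketch and are not taken from the specification.

```python
from itertools import permutations

# Bar rules for the "2D" (X-Axis, Y-Axis) order: allowed attribute types
# and the maximum number of entries, as read from the rules above.
BAR_2D_RULES = [
    {"role": "X-Axis", "types": {"Number", "Currency", "Date-Time", "String"}, "max": 12},
    {"role": "Y-Axis", "types": {"Number", "Currency", "Date-Time"}, "max": float("inf")},
]

def first_valid_order(indexes, types, nbr, rules):
    """Return the first ordering of `indexes` that passes the type and range
    checks, or None when no reordering works. The original order is tried first."""
    for order in permutations(indexes):
        if all(
            types[i] in rule["types"] and 1 <= nbr[i] <= rule["max"]
            for i, rule in zip(order, rules)
        ):
            return order
    return None

# Hypothetical stand-ins for the type data 2130 and the Nbr column of 2300.
types = {0: "String", 1: "Number", 2: "String", 3: "String"}
nbr = {0: 2, 1: 2, 2: 2, 3: 2}

pairs = [(0, 1), (2, 3)]  # copied from the Pair column
bar_2d = [o for p in pairs if (o := first_valid_order(p, types, nbr, BAR_2D_RULES))]
print(bar_2d)  # [(0, 1)]
```

The pair (2, 3) is dropped because both of its orderings place a "String" attribute on the Y-Axis, which mirrors the removal described in the example.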
In the application 300, the flat data 2110, column depth data 2140 and row depth data 2210 as shown in Figs. 21A, 21D and 22A, stored in the data cache component 326, are used as input data along with navigation selection information, visualization type and visualization manner for the visualization generation component 310 to then build the appropriate visualization to be rendered by the visualization presentation component 304. To do this the flat data 2110 in Fig. 21A is limited by the data from the column depth data 2140 and row depth data 2210 in Figs. 21D and 22A when the navigation selection information is applied.
To illustrate this concept, tabular visualizations are used, as they are simply a reflection of the applicable flat data 2110. If the navigation selection information contained the depths 0 and 1, the intersection of these two depths with the column depth data 2140 in Fig. 21D results in the columns "Name", "Age", "Dept", "Product", "Price" and "TotalItems", which have a depth of 0 or 1. The intersection of the depths 0 and 1 with the row depth data 2210 in Fig. 22A results in the rows 1, 2, 3, 7, 11 and 14 (indexed from 1) being identified. This is determined by checking whether, in a row of the row depth data 2210, the selected depths lie within the minimum and maximum depths specified by the "Minimum" and "Maximum" columns respectively of the row depth data. This means that, of the flat data 2110 in Fig. 21A, only the columns "Name", "Age", "Dept", "Product", "Price" and "TotalItems" have applicable data on the rows 1, 2, 3, 7, 11 and 14 for the depths 0 and 1. This can be seen in the tabular visualization for the depth selection 0 and 1 in Fig. 33A, which would be built and rendered for presentation to the user. In the same way, tabular visualizations for other depth selections, such as a depth of 0 only in Fig. 33B, depths of 1 and 2 in Fig. 33C, depths of 0, 1 and 2 in Fig. 33D and depths of 0 and 2 in Fig. 33E, could be built and rendered for presentation to the user. The described arrangements are not limited to producing the tabular visualization as a simple table. For example, Figs. 33A, 33B, 33C and 33E each include the addition of indicators, in the form of a dashed edge and/or column separators to the represented table, which are intended to indicate that there is data hidden from view and which may be realised, should the appropriate levels be selected.
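The depth intersection described above can be sketched as a pair of filters. The column depths below follow the depths stated for each column; the row depth values, the function names, and the reading that a row applies when at least one selected depth falls inside its Minimum to Maximum span are all assumptions for illustration, and the rows do not reproduce Fig. 22A.

```python
# Column depths as described for the column depth data 2140.
COLUMN_DEPTH = {
    "Name": 0, "Age": 0, "Dept": 0,
    "Product": 1, "Price": 1, "TotalItems": 1,
    "Sales": 2, "Quarter": 2,
}

# Hypothetical row depth data: row number -> (Minimum, Maximum).
ROW_DEPTH = {1: (0, 2), 2: (1, 2), 3: (2, 2), 4: (2, 2)}

def select_columns(column_depth, depths):
    """Columns whose depth is one of the selected depths."""
    return [name for name, depth in column_depth.items() if depth in depths]

def select_rows(row_depth, depths):
    """Rows for which some selected depth lies within [Minimum, Maximum].
    This 'any depth' reading is an assumption of the sketch."""
    return [
        row
        for row, (lo, hi) in sorted(row_depth.items())
        if any(lo <= d <= hi for d in depths)
    ]

selected = {0, 1}
print(select_columns(COLUMN_DEPTH, selected))
print(select_rows(ROW_DEPTH, selected))
```

Only the cells at the intersection of the returned columns and rows would then be taken from the flat data to build the tabular visualization.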
The data as identified for the tabular visualization may instead be used for building a plot, bar or pie visualization. If a plot visualization were to be produced for the navigation selection information of depths 0 and 1, it can be built from the columns and rows identified above, where the columns are "Name", "Age", "Dept", "Product", "Price" and "TotalItems" and the rows are 1, 2, 3, 7, 11 and 14. The mechanism by which a user can determine the manner in which this data is visualized in a plot is either via manual selection of the columns to display in the axes and data series of the plot, as described in detail in Section 5, or semi-automatically, also described in detail in Section 5. As an example, if the plot were to have two axes, an X and a Y, the "Product" and "TotalItems" column data could be used to build and render the plot visualization such as that shown in Fig. 34A, which would be presented to the user. As another example, if the plot were to have two axes, an X and a Y, the "Price" and "TotalItems" column data could be used to build and render the plot visualization such as that shown in Fig. 34B, which would be presented to the user. As another example, if the plot were to have a data series and two axes, an X and a Y, the "Product", "Price" and "TotalItems" column data could be used to build and render the plot visualization such as that shown in Fig. 34C, which would be presented to the user.
In a similar way, if a bar visualization were to be produced for the navigation selection information of depths 0 and 1, it can be built from the columns and rows identified above, where the columns are "Name", "Age", "Dept", "Product", "Price" and "TotalItems" and the rows are 1, 2, 3, 7, 11 and 14. Once again, the mechanism by which a user can determine the manner in which this data is visualized in a bar is either via manual selection of the columns to display in the axes and data series of the bar, as described in detail in Section 5, or semi-automatically, also described in detail in Section 5. As an example, if the bar were to have two axes, an X and a Y, the "Product" and "TotalItems" column data could be used to build and render the bar visualization such as that shown in Fig. 35, which would be presented to the user.
In a similar way, if a pie visualization were to be produced for the navigation selection information of depths 0 and 1, it can be built from the columns and rows identified above, where the columns are "Name", "Age", "Dept", "Product", "Price" and "TotalItems" and the rows are 1, 2, 3, 7, 11 and 14. Once again, the mechanism by which a user can determine the manner in which this data is visualized in a pie is either via manual selection of the columns to display in the data value and data series of the pie, as described in detail in Section 5, or semi-automatically, also described in detail in Section 5. As an example, if the pie were to have a data label and a data value, the "Product" and "TotalItems" column data could be used to build and render the pie visualization such as that shown in Fig. 36A, which would be presented to the user.
As another example, if the pie were to have a data series, a data label and a data value, the "Dept", "Product" and "TotalItems" column data could be used to build and render the pie visualization such as that shown in Fig. 36B, which would be presented to the user.
Changing Visualization
This section describes the ways in which the visualization alteration component 312 is capable of accepting user input by manipulating the visualization interface provided by the graphical navigational component 306. Three distinct varieties of graphical user interface are generated for the user to aid in the execution of the following actions: navigation through the data content presented to the user; changing the type of non-hierarchical visualization used to present the data content to the user; and altering the manner in which the data content of the non-hierarchical visualization is presented to the user.
Examples of graphical user interfaces which may be presented to the user to aid in the navigation of the data content presented are shown in Figs. 28A, 29, 30, 31 and 32, where the selected depths depicted are 0 and 1. The examples of Figs. 28B, 28C, 28D and 28E show the GUI of Fig. 28A after progressive manipulation, each in a different state reflecting a different portion of the complex data to be visualized or a different valid selection of depths. The data used to generate the possible interfaces depicted by Figs. 28A, 28B, 28C, 28D, 28E, 29, 30, 31 and 32 is the column depth data 2140 as shown in Fig. 21D and the navigation selection information, both sets of data being stored in the data cache component 326. The data used to control the acceptable behaviour of the possible interfaces depicted by Figs. 28A, 28B, 28C, 28D, 28E, 29, 30, 31 and 32 is also taken from the column depth data 2140 shown in Fig. 21D, stored in the data cache component 326.
Fig. 28A shows a dialog-style graphical user interface 2800, where the selection of depths or navigation selection information is indicated as 0 and 1 by means of the window 2814 used to envelop one or more of the possible depths 0, 1 and 2, indicated at 2808, 2810 and 2812 respectively and arranged on a depth line 2806, being a linearly structured representation of the various depths available for display. Three text boxes 2820, 2818 and 2816 relate to the possible depths 0 (2808), 1 (2810) and 2 (2812) respectively. Inside the text boxes 2820, 2818 and 2816 the data columns from the column depth data 2140 of Fig. 21D are reflected by indicating which column relates to which depth, i.e. the text box 2820 related to depth 0 (2808) has the items "Name", "Age" and "Dept", which correspond with the data items in Fig. 21D with a depth of 0, the text box 2818 related to depth 1 (2810) has the items "Product", "Price" and "TotalItems", which correspond with the data items in Fig. 21D with a depth of 1, and the text box 2816 related to depth 2 (2812) has the items "Sales" and "Quarter", which correspond with the data items in Fig. 21D with a depth of 2. Since depth 2 (2812) is not selected, the corresponding text box 2816 is disabled, ghosted or greyed out. A split label 2802 and button 2804 are also disabled or greyed out (ie. ghosted) as the range of depths selected is not greater than or equal to 2.
The user navigates by changing the depths selected. This is done by manipulating the window 2814 or selecting the split button 2804 when enabled, using the keyboard 102 or mouse 103, or a combination of both. In Fig. 28A, if the right edge of the window 2814 were dragged one position to the left, the modified GUI 2830 of Fig. 28B would result, where the window 2814 has become smaller and the text box 2818 has been disabled or greyed out. This corresponds to a selection of depth 0 (2808) only. In Fig. 28A, if the entire window 2814 were dragged one position to the right, the modified GUI 2840 of Fig. 28C would result, where the window 2814 would be shifted one position to the right, the text box 2820 has become disabled or greyed out and the text box 2816 has become enabled or not greyed out. This corresponds to a selection of depths of 1 and 2. In the GUI 2840 of Fig. 28C, if the left edge of the window 2814 were dragged one position to the left, the GUI 2850 of Fig. 28D would result, where the window 2814 would encompass all depths, the text box 2820 would become enabled or not greyed out and the split label 2802 and button 2804 would become enabled or not greyed out as the range of depths selected is greater than or equal to 2. This corresponds to a selection of depths of 0, 1 and 2. In Fig. 28D, if the split button 2804 were pressed, for example by selection using the mouse 103, the modified GUI 2860 of Fig. 28E would result, where two windows 2814 and 2866 would be separated by one depth in the middle of the original window of Fig. 28D, the text box 2818 would become disabled or greyed out and the split label 2802 and button 2804 would become a merge label 2882 and button 2884. This corresponds to a selection of depths of 0 and 2. The merge button 2884 of Fig. 28E operates in the opposite manner to the split button 2804: if it were selected, the two split windows 2814 and 2866 would merge, becoming one, resulting in a return to the GUI of Fig. 28D.
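The window manipulations of Figs. 28A to 28E can be modelled as operations on a set of selected depths. The sketch below is illustrative only: the function names are hypothetical, and removing the midpoint depth on a split is an assumption that matches the three-depth example but is not stated for wider ranges.

```python
def window(lo, hi):
    """Depths covered by a single window from lo to hi inclusive."""
    return set(range(lo, hi + 1))

def can_split(selected):
    """Split is enabled for one contiguous window whose range is 2 or more."""
    lo, hi = min(selected), max(selected)
    return selected == window(lo, hi) and hi - lo >= 2

def split(selected):
    """Drop the middle depth, leaving two windows (midpoint choice is assumed)."""
    if not can_split(selected):
        raise ValueError("split is disabled for this selection")
    return selected - {(min(selected) + max(selected)) // 2}

def merge(selected):
    """Rejoin two split windows into one contiguous window."""
    return window(min(selected), max(selected))

def drag(selected, left=0, right=0, max_depth=2):
    """Move the left and/or right edge of a single window by whole positions."""
    lo, hi = min(selected) + left, max(selected) + right
    if not 0 <= lo <= hi <= max_depth:
        raise ValueError("window cannot be dragged there")
    return window(lo, hi)

fig_28a = window(0, 1)
fig_28b = drag(fig_28a, right=-1)         # right edge one position left
fig_28c = drag(fig_28a, left=1, right=1)  # whole window one position right
fig_28d = drag(fig_28c, left=-1)          # left edge one position left
fig_28e = split(fig_28d)                  # split button pressed
print(fig_28b, fig_28c, fig_28d, fig_28e, merge(fig_28e))
```

Each text box would then be enabled exactly when its depth is a member of the current set.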
Other possible interfaces that could be used to navigate through the data content or choose alternative depth selections are shown in Figs. 29, 30, 31 and 32, each of which illustrates the depth selection of 0 and 1. Fig. 29 shows another dialog-style graphical user interface 2900, similar to that of Fig. 28A, where the depths selectable are indicated by numerical icons 2902 and the text boxes related to the depths 0, 1 and 2 are 2910, 2912 and 2914 respectively. Instead of using a sliding window 2814 as depicted in Figs. 28A through 28E, check boxes 2904, 2906 and 2908 are related to the depths 0, 1 and 2, respectively. The check boxes 2904, 2906 and 2908 are used to select the appropriate depths for navigation through the data content. In the example shown, the check boxes related to depths 0 and 1, which are 2904 and 2906, are shown checked.
Fig. 30 shows a toolbar-style graphical user interface 3000 where the depths are hidden but the emphasis lies on the directions in which the data can be navigated, expanded or shrunk. In terms of the dialog-style GUI in Fig. 28A, the direction in which a window can be dragged indicates which buttons of Fig. 30 will be enabled. As a depth of 0 is the smallest depth selected and a depth of 1 is the largest depth selected, the left side of a window may be dragged right to a depth of 1 but not left past a depth of 0, hence a button 3002 is enabled and a button 3004 is disabled. Similarly, the right side of a window may be dragged left to a depth of 0 and right to a depth of 2, hence buttons 3006 and 3008 are both enabled. As the range of data selected is not 2 or more, various buttons 3010, 3012, 3014, 3016 and 3018 related to a split or merge are disabled or greyed out. If the range of data selected were to be 2 or more, the button related to a split (ie. button 3010) would be enabled, similar to the situation described in Fig. 28D above, with the buttons 3012, 3014, 3016 and 3018 disabled. If the split button 3010 were to be pressed, it would then transform to a merge button and the buttons 3012, 3014, 3016 and 3018 would be enabled or disabled depending on whether the GUI window as in Fig. 28E could be dragged in the corresponding direction.
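The enabling logic for the toolbar of Fig. 30 can be derived from the smallest and largest selected depths. The following is a minimal sketch for the single-window case; the function name and dictionary keys are hypothetical, and only the buttons 3002, 3004 and 3010 are tied to specific actions in the description.

```python
def toolbar_state(lo, hi, max_depth):
    """Which drag directions and the split action are available for a
    single window covering depths lo..hi, with depths 0..max_depth on offer."""
    return {
        "left_edge_right": lo < hi,          # button 3002 in the example
        "left_edge_left": lo > 0,            # button 3004 in the example
        "right_edge_left": hi > lo,
        "right_edge_right": hi < max_depth,
        "split": hi - lo >= 2,               # button 3010
    }

state = toolbar_state(lo=0, hi=1, max_depth=2)
print(state)
```

For the depicted selection of depths 0 and 1, every drag direction except moving the left edge further left is available, and the split action is disabled.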
Fig. 31 shows a toolbar-style graphical user interface 3100, similar to that of the dialog shown in Fig. 29, where the possible depths which can be navigated through are indicated by labels 3104 and 3106, and those depths that have been chosen are indicated by a label 3108 and a window 3110. In this case, depths 0 and 1 in the window 3110 could be indicated by or 1".
Fig. 32 shows a toolbar-style graphical user interface 3200, similar to that of the dialog shown in Fig. 29, where the possible depths which can be selected are indicated by labels 3208 and related check boxes 3202, 3204 and 3206. The depths which have been selected are indicated by those check boxes corresponding with depths 0 and 1, being check boxes 3202 and 3204.
Examples of possible user interfaces 2500 and 2600 which could be presented to the user to aid in the alteration of the visualization type or manner are shown in Figs. 25 and 26 respectively. The data which may be used to generate these interfaces includes the visualization order data 2400, such as that shown in Fig. 24, the column depth data 2140 such as that shown in Fig. 21D, the currently selected visualization type and the currently chosen visualization manner, each stored in the data cache component 326, as well as the visualization proportionality rules from Section 4. The currently selected visualization type, as described previously, is used to indicate what type of non-hierarchical visualization will be presented to the user, such as tabular, pie, bar, plot and so forth. The currently selected visualization manner, as described previously, is used to indicate how the data forms the visualization type specified, such as what data is visualized in the x-axis, y-axis, z-axis, data series etc. as required by the visualization type chosen. The data used to control the acceptable behaviours of the possible interfaces depicted by Figs. 25 and 26 is also taken from the visualization order data, such as that shown in the example of Fig. 24, as stored in the data cache component 326, and the visualization proportionality rules from Section 4.
Under closer examination, it can be seen that the dialog style user interface 2500 shown in Fig. 25 allows for the manual configuration of the data content to be visualized. There are three main parts to the interface: a visualization type selection part 2502; a data components part 2512; and manner definition parts 2504, 2506, 2508 and 2510. In the visualization type selection part the user is able to change the visualization type to one of those possible, an example being tabular, plot, pie, bar etc., by means of a drop-down list box 2502. In the data components part, given the current selection of depths, all data components which can be visualized are displayed to the user in the list box 2512. In the manner definition part the user is able to define the manner in which the data is to be visualized given the visualization type as chosen by type selection 2502. This is done with the aid of the text boxes for the representation of the data components in the x-axis, y-axis, z-axis and data series, which correspond to manner definitions 2504, 2506, 2508 and 2510 respectively and allow single data components from 2512 to be dragged and copied into the corresponding destination text box 2504, 2506, 2508 or 2510. Further, the data components selected in boxes 2504, 2506, 2508 and 2510 can be cleared by pressing the button 2514. When a different visualization type is selected in box 2502, the appropriate data component text boxes 2504, 2506, 2508 and 2510 may need to have their visibility status changed, since not all visualization types are capable of displaying data components in all axes and the data series, which is described by the proportionality rules in Section 4. In some cases, such as for the pie visualization type, the data value as described by the proportionality rules in Section 4 requires defining instead of any axes. For this case, the text 2503, which currently reads "X-axis", would be altered to "Data Value" and the text box 2504 would then represent the data component to be displayed as the data value in the pie chart. The user interface 2500 of Fig. 25 is not limited to the items 2502, 2503, 2504, 2506, 2508, 2510, 2512 and 2514 as described above, and other means of implementing the three parts of the interface could be used.
Fig. 25 as shown is a possible user interface which may be presented to the user to aid in the alteration of the visualization type or manner, where the given depths are 0 and 1, the current visualization type is a plot 2502, and the current manner is to use the data component "Product" in the X-Axis 2504, "Price" in the Y-Axis 2506, no data component in the Z-Axis 2508, and "Dept" in the Data Series 2510. This means that the plot visualization produced will have two axes, an X and a Y, with the data components "Product" vs. "Price" respectively. It also means the plot will have a Data Series with the data component "Dept". The content of the list box 2512 is determined by the column depth data 2140 in Fig. 21D and the currently selected depths, which are 0 and 1. The data components with a depth of 0 are "Name", "Age" and "Dept", which correspond with the data items in Fig. 21D with a depth of 0, and the data components with a depth of 1 are "Product", "Price" and "TotalItems", which correspond with the data items in Fig. 21D with a depth of 1; all of these are placed in 2512. The visualization types which are possible are all those with a set of proportionality rules in Section 4. This means that the selection of tabular, plot, bar and pie should be possible from 2502.
If bar were selected in box 2502, according to the proportionality rules of Section 4 ("Type Order" for bar visualizations), the items required for the manner definition part are the X-Axis, Y-Axis, Z-Axis and Data Series, which would be reflected by boxes 2504, 2506, 2508 and 2510 respectively being visible to the user. The current data components in boxes 2504, 2506, 2508 and 2510 would contain the data as per the previous visualization type, so the data components "Product" and "Price" are displayed in boxes 2504 and 2506, box 2508 remains empty, and "Dept" is displayed in box 2510. The user may then change the data components to graph by the drag and copy action from the list box 2512 to 2504, 2506, 2508 or 2510 as appropriate. In this way, user tailored bar visualizations can be created.
If pie were selected in box 2502, according to the proportionality rules of Section 4 ("Type Order" for pie visualizations), the items required for the manner definition part are the Data Value and Data Series, which would be reflected by boxes 2504 and 2510 respectively being visible to the user and the text 2503 being displayed as "Data Value" instead of "X-Axis". The current data components in boxes 2504 and 2510 would contain the data as per the previous visualization type, so the data components "Product" and "Dept" are displayed in boxes 2504 and 2510 respectively. The user may then clear all data components, which in this case are represented in boxes 2504 and 2510, by pressing a clear icon 2514. The user may then select desired data components to graph by the drag and copy action from the list box 2512 to 2504 or 2510 as appropriate. In this way user tailored pie visualizations can be created.
If tabular were selected in box 2502, according to the proportionality rules of Section 4 ("Type Order" for tabular visualizations), the "All" indicates that all data components are taken by default irrespective of their type. This just means that boxes 2504, 2506, 2508 and 2510 as well as 2514 are made non-visible as there are no data component choices to make in the tabular visualization.
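The visibility changes described for the interface 2500 amount to a lookup from visualization type to the manner boxes shown. The sketch below keys the boxes by their reference numerals for readability; the dictionary layout and the function name `on_type_change` are illustrative assumptions, not part of the described arrangement.

```python
# Manner definition boxes visible for each visualization type, and the
# label shown beside each, following the description of Fig. 25.
MANNER_BOXES = {
    "plot": {2504: "X-Axis", 2506: "Y-Axis", 2508: "Z-Axis", 2510: "Data Series"},
    "bar": {2504: "X-Axis", 2506: "Y-Axis", 2508: "Z-Axis", 2510: "Data Series"},
    "pie": {2504: "Data Value", 2510: "Data Series"},
    "tabular": {},  # all data components are taken; nothing to choose
}

def on_type_change(new_type, current):
    """Return {box: (label, component)} for the boxes that stay visible,
    carrying over the components chosen under the previous type."""
    return {
        box: (label, current.get(box))
        for box, label in MANNER_BOXES[new_type].items()
    }

# Manner currently shown in Fig. 25 (plot): Z-Axis is empty.
current = {2504: "Product", 2506: "Price", 2508: None, 2510: "Dept"}

print(on_type_change("pie", current))
print(on_type_change("tabular", current))
```

Switching to pie keeps "Product" under the relabelled "Data Value" box and "Dept" under "Data Series", while switching to tabular leaves no manner boxes visible.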
The interface 2500 allows for any possible combination of the data components or data columns taken from 2512. Specific implementations may place a limitation upon what combinations of data components are possible, if so preferred, such that the visualizations presented to the user always have some data content to visualize. The visualization order data 2400 shown in Fig. 24 could be used to decide what combinations have data content. An alternative could be to provide the user with a visual gauge or preview of the amount of intersecting data for each pair of data components in the specified visualization. The proportion data 2300 as shown in Fig. 23 could be used to generate such a gauge or preview.
The toolbar style user interface 2600 shown in Fig. 26 allows for a semi-automatic configuration of the data content to be visualized. Here, only two of the three main parts described above are required by this interface 2600, those being a visualization type selection part 2602 and a manner definition part 2604, 2606, 2608 and 2610. In the visualization type selection part 2602, the user is able to change the visualization type to one of those possible, an example being tabular, plot, pie, bar etc., by means of a drop-down list box 2602. In the manner definition part 2604-2610, the user is able to define the manner in which the data is to be visualized given the visualization type as chosen by the selection part 2602. This is done with the aid of the check box 2604 for the representation of "Data Series" use, a drop-down list box 2606 for the selection of the number of axes or sets of data values to use, and the alternate manner selection by the two buttons 2608 and 2610 for the previous and next defined manner respectively. When a different visualization type is selected from the list 2602, the enabled or disabled status of 2604, 2606, 2608 and 2610 may have to be changed depending on whether the activities represented by 2604, 2606, 2608 and 2610 are possible, since not all visualization types are capable of displaying data components in multiple axes or data values. The user interface 2600 of Fig. 26 is not limited to the items 2602, 2604, 2606, 2608 and 2610 as described above, and other means of implementing the two parts of the interface may be used.
Fig. 26 as shown is a possible user interface which could be presented to the user to aid in the alteration of the visualization type or manner, where the given depths are 0 and 1, the current visualization type is a plot as can be seen in selectable list 2602, and the current manner is to use a data series as shown by 2604 and 2 axes as shown by "2D" in 2606. By default, from the proportionality rules of Section 4 (plot with a data series and two axes, which corresponds with the third item [Data Series, X-Axis, Y-Axis]), the related third column, or "2D+" column, in the plot section of the visualization order data, which at depths 0 and 1 will be that shown in Fig. 24, will use the previously selected entry or, by default, the first entry in the "2D+" column of the plot section in the visualization order data. Assuming that no previous selection was made, the entry containing the indices "2, 3, 4" will be used. This means a plot produced will have a Data Series containing the data component "Dept" (index 2) and two axes, an X and a Y, containing the data components "Product" (index 3) and "Price" (index 4) respectively. The visualization types selectable are tabular and those with content in the visualization order data of Fig. 24, where a plot, a bar and a pie visualization all have content and are therefore possible. This means that the selection of tabular, plot, bar and pie should be possible from the list 2602.
If bar were selected from the list 2602, according to the proportionality rules of Section 4 ("Type Order" for bar visualizations), the items required for the manner definition part are the Data Series, which means the check box 2604 is enabled, and the three axes X, Y and Z, which means the list 2606 is enabled with a choice of "2D" or "3D" (the "D" indicating a spatial dimension). The current values in the check box 2604 and the list 2606 would be as per the previous visualization where possible; in this case the previously used indices combination "2, 3, 4" has a corresponding entry in the "2D+" column of the bar section in the visualization order data, which is used as the initial selection. The alternate selection buttons 2608 and 2610 are then enabled since multiple entries exist in the bar section: "2D+" column of the visualization order data.
The user could however prevent the visualization from using a Data Series by un-checking the box 2604. This would change the visualization order data being used from the bar section: "2D+" column: "2, 3, 4" indices to the bar section: "2D" column: "3, 4" indices (the indices corresponding with the axes and not the data series). In this way, the data components visualized in the X and Y axes are the same with or without the data series. A change in the box 2604 may, in some instances, cause a change in the number of axes or sets of data values possible in the list 2606 and in whether there are alternate selections possible, as indicated by 2608 and 2610. This means that the selectors 2606, 2608 and 2610 are checked and re-generated if required after a change occurs to the check box 2604. In this case there are entries in both the "2D" and "3D" columns of the bar section in the visualization order data, so "2D" and "3D" are valid selections at selector 2606, and since "2D" is selected in selector 2606 and the "2D" column of the bar section in the visualization order data has more than one entry, 2608 and 2610 are enabled.
If the user were to change the selection of axes used at selector 2606 from "2D" to "3D", either the first entry starting with the two indices "3, 4", or the first entry containing the two indices "3, 4", or the default entry, which is the first, would be used. In this case the bar section: "3D" column contains a set of indices "3, 4, 7", which would be used for the data components in the X, Y and Z axes. The buttons 2608 and 2610 are checked and enabled or disabled if required after a change occurs to 2606. The selection has changed to the "3D" column of the bar section in the visualization order data, which has more than one entry, thus 2608 and 2610 should be enabled. The buttons 2608 and 2610 operate by selecting the previous and next entry in the current column of the visualization order data. Hence if the button 2610 were to be pressed, the selection would move from "3, 4, 7" to the next set of indices (ending in "4, 3") in the "3D" column of the bar section in the visualization order data. In this way a user could quickly obtain useful visualizations for the data content being observed.
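The entry selection used when the column changes, and the previous/next stepping of buttons 2608 and 2610, can be sketched as below. The column contents are hypothetical apart from the entry "3, 4, 7", the function names are illustrative, and wrapping around at the ends of the column is an assumption of the sketch.

```python
def pick_entry(column, previous_axes):
    """Choose the entry closest to the axes used before the column changed."""
    n = len(previous_axes)
    for entry in column:                      # 1) entry starting with the axes
        if entry[:n] == previous_axes:
            return entry
    for entry in column:                      # 2) entry containing the axes
        if set(previous_axes) <= set(entry):
            return entry
    return column[0]                          # 3) default: the first entry

def step(column, current, direction):
    """Previous (-1) or next (+1) entry; wrap-around is assumed here."""
    return column[(column.index(current) + direction) % len(column)]

# Hypothetical "3D" column of the bar section; only (3, 4, 7) is from the text.
bar_3d = [(3, 4, 7), (5, 4, 3), (3, 5, 7)]

chosen = pick_entry(bar_3d, (3, 4))
print(chosen)                    # (3, 4, 7)
print(step(bar_3d, chosen, +1))  # (5, 4, 3)
```

The previous and next buttons would be enabled whenever the current column holds more than one entry.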
If pie were selected in the drop down selector list 2602, according to the proportionality rules of Section 4 ("Type Order" for pie visualizations), the items required for the manner definition part are the Data Series which means the check box 2604 is enabled and the single set of data values which means the selector list 2606 is disabled. The use of the data series is taken from the previous visualization's settings and the data label and data value to use is taken from the previous axes choices or default. In this case the pie visualization, with no data series it is represented by the "2D" column in the pie section of the visualization order data and the value as close as possible to the previous would be any pair combination, irrespective of order, from the three previously used indices 4, 7".
Firstly 4) would be checked against those in the "2D" column of the bar section in the visualization order data, which exists so it is used. The alternate selection buttons 2608 and 2610 are then enabled since multiple entries exist in the "2D" column of the visualization order data. The user may however make the visualization use a Data Series by checking the box 2604. This would change the visualization order data being used from the index 4" in the "2D" column of the pie section to the index 3, 4" in the column of the pie section, which is the first entry in the [...]. In this way, the data component visualized as the data value is the same with or without the data series. There are multiple entries in the column of the pie section in the visualization order data so the buttons 2608 and 2610 are enabled. If the user were to press the button 2608 the data visualized would change from the indices 3, 4" to 3, 7" as shown in the column of the pie section in the visualization order data. Alternatively, if the user were instead to press the button 2610 the data visualized would change from the indices 3, 4" to 4, 7" as shown in the column of the pie section in the visualization order data. In this way a user may quickly obtain useful visualizations for the data content being observed.
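The "as close as possible to the previous" rule used when the visualization type changes can be sketched in Python. The sketch assumes that an entry matches when all of its indices were among those previously used, irrespective of order, and that the first (default) entry is used when nothing matches. The function name `closest_entry` and the sample index values are hypothetical and do not come from the specification's tables.

```python
from typing import List, Sequence, Tuple

IndexSet = Tuple[int, ...]


def closest_entry(column: List[IndexSet], previous: Sequence[int]) -> IndexSet:
    """Pick the entry of `column` closest to the previously used indices."""
    if not column:
        raise ValueError("column of visualization order data is empty")
    used = set(previous)
    for entry in column:
        # Order within the entry is ignored: only membership matters.
        if set(entry) <= used:
            return entry
    return column[0]  # assumption: fall back to the default (first) entry


# Placeholder data: a two-component column and three previously used indices.
pie_2d = [(1, 5), (2, 4), (0, 2)]
chosen = closest_entry(pie_2d, previous=(4, 0, 2))
fallback = closest_entry(pie_2d, previous=(7, 8, 9))
```

Here `chosen` is the first pair drawn entirely from the previously used indices, while `fallback` shows the default entry being used when no such pair exists.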
If tabular were selected in the drop down selector list 2602, according to the proportionality rules of Section 4 ("Type Order" for tabular visualizations), the "All" indicates that all data components are taken by default irrespective of their type. This just means that the selectors 2604, 2606, 2608 and 2610 are disabled as there are no choices to make in the tabular visualization.
Figs. 37A and 37B show an arrangement of using indicators to condense information from a data source within a particular view, and thus represent an alternative to the dashed line of Figs. 33A to 33E discussed above. Figs. 37A and 37B show an arrangement of information for Figs. 33B and 33A, for depth 0 and depth 0 1 respectively. An indicator is provided (subscripted and bracketed) to represent the number of duplicate representations hidden from the user. Specifically, in this implementation in Fig. 37A, the value A(8) means that there are 9 replications: 1 being unique, being that represented, and 8 duplicates hidden from view. This results in a display of the number of representations while removing redundant display information. This form of representation may be varied. For example, the user may desire an indication of what an expansion to one depth will be, or alternatively for a full expansion. The difference between the two is that at depth 0, would be illustrated as "A( 2 or respectively. Similarly, at depth 0 1, could appear as "A(4):A4) or the latter being more accurate but the former may be more user friendly.
It follows that the above provides a method of displaying data from a data source, where data attributes for display are selected by the user. The data attributes can have values in a one to many relationship. The method is able to determine unique values in the data for the selected data attribute, and then display those data values determined to be unique. For example, in Fig. 37C, each of the given names listed is unique, regardless of the indicator referring to the number of instances each name appears (for the same family name). This functionality is also depicted in Figs. 28A to 32, discussed above.
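The determination of unique values, together with a bracketed indicator of how many duplicates are hidden, can be sketched in Python. This is an illustrative sketch only: the function name `unique_with_indicators`, the label format, and the choice to omit the indicator when nothing is hidden are assumptions, and the sample rows are invented.

```python
from collections import Counter
from typing import Iterable, List


def unique_with_indicators(values: Iterable[str]) -> List[str]:
    """Return each distinct value once, labelled with its hidden-duplicate count."""
    counts = Counter(values)  # keeps first-seen order (Python 3.7+)
    labels = []
    for value, total in counts.items():
        hidden = total - 1  # one occurrence is displayed, the rest are hidden
        # Assumption: no indicator is shown when there are no hidden duplicates.
        labels.append(f"{value}({hidden})" if hidden else value)
    return labels


# Invented sample: nine replications of "A", one "B", two of "C".
rows = ["A"] * 9 + ["B", "C", "C"]
labels = unique_with_indicators(rows)
```

With nine replications of a value, one is displayed and eight are counted as hidden, which matches the way the indicator is described for Fig. 37A.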
The described arrangements offer a significant navigational advantage with structured databases over traditional approaches. For example, with traditional "drilldown" methods, each user interaction permits only a 1 traverse through the structure.
That is, you can select one attribute and drill down to the next level within that attribute.
In the presently described arrangements, the "depth" is absolute in the sense that the user can jump from depth 0 to depth 3, for example. Thus progressive or sequential traversal is not a constraint on navigation.
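The difference between sequential drill-down and absolute depth selection can be sketched in Python. The attribute names, the depth measures assigned to them, and the function name `attributes_at` are hypothetical; the sketch only assumes that every attribute carries a depth measure and that the user may select any depth, or group of depths, directly.

```python
from typing import Dict, Iterable, List

# Hypothetical attributes and depth measures; not taken from the specification.
DEPTH_OF: Dict[str, int] = {
    "family_name": 0,
    "given_name": 1,
    "order_id": 2,
    "order_item": 3,
}


def attributes_at(depths: Iterable[int], depth_of: Dict[str, int] = DEPTH_OF) -> List[str]:
    """Return the attributes whose depth measure is among the selected depths."""
    wanted = set(depths)
    return [name for name, depth in depth_of.items() if depth in wanted]


jump = attributes_at([3])          # go straight to depth 3 without visiting 1 and 2
span = attributes_at(range(0, 2))  # select depths 0 and 1 together
```

Because the lookup is keyed on the depth value itself, reaching depth 3 takes one selection rather than three successive drill-down steps.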
Industrial Applicability
The arrangements described are applicable to the computer and data processing industries and particularly where data is desired to be viewed from data sources.
The foregoing describes only some embodiments of the present invention, and modifications and/or changes can be made thereto without departing from the scope and spirit of the invention, the embodiments being illustrative and not restrictive.
(Australia Only) In the context of this specification, the word "comprising" means "including principally but not necessarily solely" or "having" or "including", and not "consisting only of". Variations of the word "comprising", such as "comprise" and "comprises" have correspondingly varied meanings.
Claims (2)
The claims defining the invention are as follows:

1. A method of displaying data from at least one data source, said method comprising the steps of:
(i) selecting data attributes desired to be duplicated from said data sources for display, wherein said data attributes have values in a one to many relationship;
(ii) determining unique values in the data from said data sources for each said selected data attribute by removing redundant representations of the data; and
(iii) displaying those data values determined to be unique.

2. A method according to claim 1 wherein the display of the unique data values further comprises displaying at least one indicator related to information available to be represented in an alternate display of the data based upon selection of alternate data attributes.

3. A method according to claim 2 wherein said indicator is selected from the group consisting of:
(a) a numerical value;
(b) a modified edge to a representation of the data;
(c) a ghosting of at least one section of a representation of the data; and
(d) a mark within a single column representation;
(e) summary data within at least one column representation, said summary data being selected from the group consisting of:
(ea) non-selected ones of said data attributes;
(eb) depth values associated with the data attributes;
(ec) the number of unique values not displayed;
(ed) the number of unique values not displayed per data attribute;
(ee) the number of unique values not displayed per displayed data value per data attribute; and
(ef) the number of unique values not displayed per displayed data value; and
(f) a visible depth indication representative of missing information.

4. A method according to claim 1 wherein said at least one data source comprises plural attributes associated with plural data records, and is selected from the group consisting of a hierarchical data base and a relational data base.

5. A method according to claim 1 wherein step (ii) comprises the sub-step of:
(ii-a) determining unique correlations of values of data within the one to many relationships to facilitate display for each said unique correlation.

6. A method according to claim 5 wherein step (iii) comprises the sub-step of:
(iii-a) determining displayable data from numbers of said unique correlations, a value associated with each said unique correlation, and a number of said attributes associated with each said correlation;
(iii-b) interpreting said displayable data to identify at least one display type by which said displayable data may be represented; and
(iii-c) displaying said displayable data according to a selected one of said display types.

7. A method according to claim 6 wherein step (iii-c) comprises the steps of:
(iii-ca) displaying a graphical user interface incorporating at least a list of each said display type; and
(iii-cb) selecting one said display type from said list.

8. A method according to claim 7 wherein said list comprises a single said display type and said selecting comprises automatically selecting said single display type.

9. A method according to claim 7 wherein said list comprises plural display types and step (iii-cb) comprises detecting user selection via said graphical user interface of one of said display types.

10. A method according to claim 6 wherein said display types are selected from the group consisting of a table, a plot, a bar chart, and a pie chart.

11. A method according to claim 1 wherein step (i) comprises the sub-steps of:
displaying a graphical user interface representing unique correlations of the data attributes; and
detecting a selection of one of said unique correlations of data attributes.

12. A method according to claim 11 wherein said graphical user interface displays groups of said attributes, each said group being associated with a depth of data with said data source, such that each said depth is selectable to define the corresponding group of attributes for determining according to step (ii).

13. A method according to claim 12 wherein said depths are selectable individually or as groups of depths, such that a display type for presentation of data according to step (iii) varies according to the available depths of attributes available for selection.

14. A method according to claim 1 further comprising the step, preceding step (i), of:
calculating a depth measure of said data attributes according to the corresponding one to many relationship; and
step (i) comprises selecting at least one of the data attributes based upon the depth measures.

15. A method for displaying data from at least one data source, said method comprising the steps of:
analysing the data sources to identify attributes thereof that are related by a one-to-many relationship;
assigning a measure to each of the identified attributes according to the one-to-many relationship therebetween;
forming a display representation including at least one selectable indicator of the assigned measures; and
for at least one selected measure, forming a corresponding representation of the corresponding ones of said attributes.

16. A method according to claim 15 wherein steps and comprise representing said selectable indicator and representation in a first graphical user interface.

17. A method according to claim 16 wherein step comprises forming a structured representation of said measures and using one said selectable indicator to envelop at least one said measure, and step comprises modifying the indicator to alter an enveloping of said structurally represented measures.

18. A method according to claim 17 wherein said structured representation comprises a linear representation of said measures and said indicator comprises a variable sized window manipulable via said first graphical user interface to envelop one or more of the linearly represented measure values.

19. A method according to claim 17 wherein said modifying comprises splitting one said indicator into plural said indicators each enveloping at least one said measure.

20. A method according to claim 17 wherein said modifying comprises merging plural said indicators to form a single said indicator enveloping plural of said measures.

21. A method according to claim 16 wherein each said measure is associated with a corresponding said selectable indicator.

22. A method according to claim 16 wherein the representation of the attributes comprises a text box listing the attributes for the corresponding one of the measures.

23. A method according to claim 16, said method further comprising the steps of:
presenting a second graphical user interface by which the identified attributes are represented associated with at least one display type by which unique values associated with said attributes are displayable;
detecting selection of one said display type; and
displaying the unique values according to the selected display type.

24. A method according to claim 23 wherein said display types are selected from the group consisting of a table, a graph, a bar chart, and a pie chart.

25. A method according to claim 15 wherein said measure is a depth value associated with a number of one-to-many relationships associated with a particular attribute.

26. A method of displaying data from at least one data source substantially as described herein with reference to any one of the embodiments as that embodiment is illustrated in the drawings.

27. A computer readable medium having a computer program recorded thereon and executable to make a computer apparatus display data from at least one data source according to the method of any one of claims 1 to 26.

28. Computer apparatus adapted to perform the method of any one of claims 1 to 26.

Dated this TWENTY-NINTH day of APRIL, 2005
CANON KABUSHIKI KAISHA
Patent Attorneys for the Applicant
Spruson & Ferguson
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AU2005201818A AU2005201818A1 (en) | 2005-04-29 | 2005-04-29 | Method for Navigating and Displaying Complex Information |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AU2005201818A AU2005201818A1 (en) | 2005-04-29 | 2005-04-29 | Method for Navigating and Displaying Complex Information |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| AU2005201818A1 (en) | 2006-11-16 |
Family
ID=37461251
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| AU2005201818A Abandoned AU2005201818A1 (en) | 2005-04-29 | 2005-04-29 | Method for Navigating and Displaying Complex Information |
Country Status (1)
| Country | Link |
|---|---|
| AU (1) | AU2005201818A1 (en) |
Similar Documents
| Publication | Title |
|---|---|
| US7428705B2 (en) | Web map tool |
| US8010910B2 (en) | Breadcrumb list supplementing for hierarchical data sets |
| Dadzie et al. | Approaches to visualising linked data: A survey |
| KR101402444B1 (en) | Method and system for navigating in a database of a computer system |
| US8452776B2 (en) | Spatial data portal |
| JP4945708B2 (en) | Computer input control to specify ranges by explicit exclusion |
| US20150113023A1 (en) | Web application for debate maps |
| US8990717B2 (en) | Context-aware charting |
| US9342908B2 (en) | Information retrieval and presentation methods and systems |
| JP2013510378A (en) | System, method, and computer program for generating and manipulating data structures using an interactive graphical interface |
| KR20080002815A (en) | Explore, navigate and search electronic information |
| US20090158178A1 (en) | Graphically navigating tree structures |
| US8259114B2 (en) | System and method for visualizing parameter effective data sets |
| AU2005201818A1 (en) | Method for Navigating and Displaying Complex Information |
| KR20070091351A (en) | How to display the topology using visual objects |
| Sheth et al. | Treemap, radial tree, and 3d tree visualizations |
| US11550805B2 (en) | Compact display of matching results |
| Rahman et al. | Extending spreadsheets to support seamless navigation at scale |
| US11409762B2 (en) | Interactively constructing a query against a dataset |
| De Greef | Generating Web Query Interfaces Based on Conceptual Schemas |
| Kristiansen | A Visual Language for Nested Visualization Design |
| Plaisant et al. | Interactive Information Visualization of a Million Items |
| Tillett | Unpackable Treemaps as Web History Graphs |
| ihar Sheth et al. | Visualizing MeSH Dataset using Radial Tree Layout |
| Bergström | Augmenting digital libraries using web-based visualizations |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | MK1 | Application lapsed section 142(2)(a) - no request for examination in relevant period | |