WO2015017886A1 - Method and system for managing and sharing working files in a document management system

Info

Publication number
WO2015017886A1
WO2015017886A1 PCT/AU2014/000788 AU2014000788W WO2015017886A1 WO 2015017886 A1 WO2015017886 A1 WO 2015017886A1 AU 2014000788 W AU2014000788 W AU 2014000788W WO 2015017886 A1 WO2015017886 A1 WO 2015017886A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
working
sidecar
files
metadata
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/AU2014/000788
Other languages
French (fr)
Inventor
Jonathan Robert Burnett
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU2013902993A
Application filed by Individual
Publication of WO2015017886A1
Anticipated expiration
Ceased

Classifications

    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 - File systems; File servers
    • G06F16/17 - Details of further file system functions
    • G06F16/176 - Support for shared access to files; File sharing support

Definitions

  • the present disclosure is generally related to document management systems and, more particularly, to methods and systems for managing metadata in shared access document management systems.
  • DMS document management system
  • Metadata is data defining data in a digital file and typically includes fields such as file type, date created, date modified, author, artist, and/or title, for example.
  • the DMS determines the file type and compares a previous modified date with a new modified date to determine any changes in a document. Further, based on one or more metadata fields, the DMS classifies the documents for representation on a user interface.
  • a method for managing a working file to permit sharing of the working file includes the steps of retrieving the working file from a source, extracting metadata associated with the retrieved working file, creating at least one sidecar file for the retrieved working file based on the extracted metadata, linking the at least one sidecar file with the retrieved working file, and storing the at least one sidecar file in association with the retrieved working file.
  • a system for managing working files includes a retrieval unit configured to retrieve the working files from one or more sources, an extractor configured to extract metadata associated with the retrieved working files, and an authentication unit configured to authenticate a user.
  • the system further includes a processing unit configured to classify the extracted metadata of each working file from the retrieved working files into at least one of private metadata, public metadata, or shared metadata, create at least one sidecar file for each of the retrieved working files based on the classified metadata, the at least one sidecar file comprising at least one of a private sidecar file, a public sidecar file, or a shared sidecar file, link the at least one sidecar file with each of the corresponding retrieved working files, store the at least one sidecar file in association with each of the corresponding working files, access one or more of the working files and corresponding private sidecar files if the authenticated user is an author of the working files, access one or more of the working files and corresponding shared sidecar files if the authenticated user is part of an authorized group, access one or more of the working files and corresponding public sidecar files for any user, and classify the accessed working files based on the corresponding accessed sidecar files.
  • FIG. 1 is a block diagram illustrating a computing environment where the presently disclosed arrangements are employed.
  • FIG. 2 is a block diagram illustrating an exemplary document management system.
  • FIG. 3 is a flowchart illustrating an exemplary method for sharing working files on a shared network.
  • a system and a method for managing and sharing documents in a document management system are presented.
  • the system and method are configured to generate one or more forms of metadata associated with a working file, and store the one or more forms of metadata in a sidecar file associated with the working file in a distributed arrangement.
  • the methods and systems allow a user to access the working file and one or more of the associated metadata sidecar files based on an access level granted to the user. It is also anticipated that the present disclosure could be used to create distributed databases that are not intended to be used as the basis for a document management system. In this situation, the sidecar becomes the equivalent of a record in a conventional relational database.
  • the techniques disclosed herein allow management of sidecar files even if a working file is not associated with the sidecar file.
  • the metadata stored in a sidecar file is generated contemporaneously with or after the creation of the corresponding source file. However, metadata can be defined before the existence of a working file or in the absence of a working file.
  • the working file could be an image of the label. In this example, it is likely that the wine enthusiast would catalogue his/her collection (i.e., create metadata) before the images (i.e., the working files) are collected. Similar scenarios can also exist in a legal review context, where evidence is identified and catalogued before the corresponding documents have been located and imaged.
  • FIG. 1 is a block diagram of a distributed computing environment 100 where an exemplary document management system (DMS) 102 is implemented.
  • the environment 100 includes the DMS 102 installed and executing on a server computer within a communications network 106.
  • the network 106 interconnects and operatively couples one or more user devices 104a, 104b, 104c, 104d ...
  • the environment 100 further includes one or more storage devices 110, servers 112, and application platforms 114 operatively coupled to the network 106 and/or the user devices 104.
  • the network 106 can be a private network, such as a local area network (LAN) or a virtual private network (VPN), hosted by a single organization; a community network, such as a wide area network (WAN), hosted by multiple organizations; a public network, such as the Internet; or a hybrid combination of these networks.
  • the user devices 104 are operatively coupled to the network 106 through wired or wireless means.
  • Wired connections include Ethernet cables, LAN cables, and the like, while wireless connections include Bluetooth, 802.11 standard WLAN, Wi-Fi, cellular telecommunications, and the like.
  • the user devices 104 can be any electronic device that is capable of exchanging data over the network 106. Examples of such electronic devices include personal computers, laptops, tablets, or handheld devices. Moreover, as referred to in the present disclosure, a 'user' 108 of the environment 100 includes an individual person or an organization including multiple persons.
  • the storage devices 110 include storage servers on the network 106, such as shared databases maintained by one or more people or an organisation.
  • the storage devices 110 also include user-maintained databases on the network 106 such as email accounts, and cloud storage devices such as Google Drive accounts, Dropbox accounts and the like.
  • Cloud storage devices refer to a network of multiple physical devices that each store and maintain a copy of a working file with an associated mechanism for replicating and synchronizing the various copies of the working files on all the physical devices present in the cloud network.
  • Users 108 typically store data, such as text files, music files, videos, images, program files, system files, or executable files on one or more of these storage devices 110. Collectively, such data in any known format is termed 'working files' throughout this disclosure. Over time, as users continue to create and store working files, the number of working files distributed across the network 106 becomes gargantuan. Often, users face difficulty in managing and/or tracking such a large number of files. To aid users 108 in efficient file management, the DMS 102 may be employed. The DMS 102 as described in the present disclosure enables users 108 to view, track, search, access, modify and potentially share their documents from multiple storage devices 110 in a fast and efficient manner.
  • the DMS 102 retrieves working files and their associated metadata from one or more sources. Further, the DMS customizes and classifies the metadata and creates one or more sidecar files for the metadata. Thereafter, the DMS 102 links the sidecar metadata files to each of the corresponding working files and stores both the working file and the sidecar files in a repository.
  • a sidecar file is therefore configured to exist alongside and be communicated together with an associated working file and, according to the present disclosure, act as a repository for the metadata of the associated working file. Because the metadata is stored in sidecar files associated with the working files, rather than an aggregated metadata catalogue, the metadata can be replicated and synchronised using the same mechanism used to replicate and synchronise the working files. This is particularly beneficial in situations where the storage devices 110 are cloud based. Detailed functionality of the DMS will be described with reference to FIGS. 2 and 3.
  • the DMS 102 is executed within the network 106. Because of this arrangement, the DMS 102 can be accessed from any user device 104 that is capable of connecting to the network 106, irrespective of the device's configuration. However, because the DMS is executed through the network 106, it is inaccessible without a network connection. In other implementations, instances of the DMS platform 102 can be installed on each of the user devices 104. In this case, users can access the DMS 102 without a network connection. However, the DMS 102 is inaccessible from user devices 104 on which the DMS 102 is not installed. Alternately, the DMS 102 is executed on one or more of the user devices 104 and on the network 106.
  • FIG. 2 is a block diagram 200 of the exemplary DMS 102, which includes a retrieval unit 202 coupled to one or more sources to retrieve working files 214, and an extractor 204 operably coupled to the retrieval unit 202 and to one or more metadata repositories 218 to extract metadata associated with the working files 214.
  • the DMS 102 further includes a processing unit 206 operably coupled to the extractor 204 and a user authentication unit 208.
  • the processing unit 206 is configured to classify the metadata, create sidecar files 220, and link the sidecar files 220 with the working files 214.
  • a user interface 210 is provided to request inputs from the users 108, or display outputs to the users 108.
  • the DMS 102 is also associated with at least one database 212 in which the linked working files 214 and sidecar files 220 are stored. Functionality of these units and modules will be described in detail in the following sections.
  • the retrieval unit 202 is configured to retrieve one or more working files 214 from one or more sources.
  • the sources include storage devices 110, network servers 112 such as email or fax servers, or one or more application platforms 114 such as document editors or viewers.
  • the DMS 102 may be employed by organizations to manage, track, and share their working files at an organizational level or by individuals to manage and share their personal working files.
  • the DMS 102 is integrated into the organization's network 106 and coupled to the organization's servers and applications. Accordingly, in this application, each time a file is created or received by the network 106, the retrieval unit 202 retrieves the working file 214.
  • the DMS 102 can be installed on one or more user devices 104 or executed as a network application.
  • the retrieval unit 202 is configured to retrieve the working files 214 from one or more storage devices 110, servers 112 or application platforms 114 at predetermined times.
  • the retrieval may be periodic (e.g., every day, every week, or every 6 hours) or at specific times (e.g., on activation of the DMS 102 application, or before shutting down).
  • the retrieval can be automated or manual. In the automated implementation, based on the predetermined retrieval time, the DMS 102 scans the sources to retrieve any working files 214 that have been added to the sources since the last retrieval.
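The automated retrieval described above can be illustrated with a minimal sketch. It assumes that the sources are local folders and that a file's modification time is an acceptable proxy for "added since the last retrieval"; the function name `scan_new_files` and both of these choices are illustrative assumptions, not part of the disclosure.

```python
from pathlib import Path


def scan_new_files(source_dirs, last_retrieval):
    """Return files changed after last_retrieval (seconds since the epoch).

    Assumption: each source is a local directory, and the modification
    time stands in for the time a file was added to the source.
    """
    found = []
    for source in source_dirs:
        for path in Path(source).rglob("*"):
            # Only regular files newer than the previous scan are retrieved.
            if path.is_file() and path.stat().st_mtime > last_retrieval:
                found.append(path)
    return sorted(found)
```

A scheduler would call this at each predetermined retrieval time and then record the current time as the new `last_retrieval` value.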
  • the retrieval unit 202 displays a list of the available sources on the user interface 210 and requests the user 108 to select one or more sources or one or more working files 214 from that list for retrieval. Subsequently, the retrieval unit 202 retrieves the selected working files 214.
  • the extractor 204 is operatively coupled to at least one of metadata repositories 218 and the retrieval unit 202 to retrieve metadata associated with the working files 214.
  • metadata typically includes values associated with fields such as name, creation date, author, access rights, file format, and so on.
  • applications may regularly update the metadata as and when the working files 214 are updated. For instance, every time a document is modified, the metadata field 'date modified' is updated.
  • metadata is either embedded within a working file, or aggregated and stored in a metadata repository 218. Accordingly, based on the location of the metadata, the extractor 204 retrieves the metadata either from the working files 214 or from the metadata repository 218.
  • the extracted metadata can be insufficient. For instance, organizations often desire customized metadata fields such as 'department', 'supervisor', 'internal reference number', and so on, which are not populated by generic applications. Similarly, individual users often desire customized metadata fields such as 'genre', 'locations where images were captured', 'image tags', 'type of study material', and so on. In both these cases, the extractor 204 is configured to request the user 108 to manually update, add or remove one or more of the metadata fields via operation of the user interface 210. To this end, the extractor 204 displays the retrieved working files 214 and their extracted metadata on the user interface 210. In addition, the extractor 204 presents the user 108 with options to modify, add, or remove metadata fields and their values at this stage.
  • the processing unit 206 may be configured to generate metadata. Occasionally, the relationship between a working file and its associated metadata may be altered or destroyed while modifying or saving the working file 214. For example, time stamp information will be updated when a working file 214 is modified. Similarly, file path metadata can be changed if a working file 214 is moved or copied. Derived metadata, such as MD5 hash values, can also change when a file is modified. There are also situations where some applications overwrite or corrupt metadata previously inserted into a working file. In these cases, it can be difficult for the extractor 204 to locate the original metadata associated with a particular working file 214. In other cases, the metadata may not include values for one or more fields, such as the customized fields.
  • the processing unit 206 may be configured to automatically generate metadata for the working files 214 based on the content of the working files 214. For instance, the processing unit 206 can generate the case reference number from the subject line of an email, or generate a document number from a header or footer of the working file, and so on.
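As a rough illustration of content-derived metadata, the sketch below pulls a case reference out of an email subject line and computes an MD5 hash of the file content. The reference pattern (two to five capital letters, a hyphen, then at least three digits) and the function name `generate_metadata` are hypothetical; a real deployment would use whatever numbering scheme the organization follows.

```python
import hashlib
import re

# Hypothetical case-reference format, e.g. "ABC-12345".
CASE_REF = re.compile(r"\b([A-Z]{2,5}-\d{3,})\b")


def generate_metadata(subject, content):
    """Derive metadata fields from an email subject and raw file bytes."""
    match = CASE_REF.search(subject)
    return {
        # None when the subject line carries no recognisable reference.
        "case_reference": match.group(1) if match else None,
        # Derived metadata: changes whenever the content changes.
        "md5": hashlib.md5(content).hexdigest(),
    }
```

Because the hash is derived from the content, recomputing it after a modification yields a different value, which matches the observation above that derived metadata changes when a file is modified.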
  • the processing unit 206 is configured to classify the metadata into different classes. Users 108 may desire some metadata to be private, while they might not mind sharing other metadata with the public at large. Accordingly, based on a set of predetermined rules, the processing unit 206 is configured to classify the metadata associated with working files 214. In one example, for each retrieved working file, the processing unit 206 is configured to classify the metadata into one or more of three categories - private metadata, shared metadata, and public metadata. Private metadata is the metadata that only the author of the working file can access. This metadata typically includes fields such as track changes, edits, or comments in a working file, review time of the working file, and so on.
  • Shared metadata includes metadata that can be accessed by an authorized working group, such as a team working on a project, people in a study group, and so on. As the author is typically a part of the group, the author is also allowed access to the shared metadata. This metadata typically includes version number, edits, and comments that may be shared within the group, and so on. Public metadata, on the other hand, includes metadata that can be accessed by anyone. This metadata often includes creation date, author, file type, and the like.
  • These metadata classes are merely exemplary, and the processing unit 206 may be configured to classify the metadata according to any other classification or rules without departing from the scope of the present disclosure.
  • the metadata may be classified as 'open to public', 'private', 'internal', 'privileged', 'confidential', or 'external'.
  • the processing unit 206 is configured to classify the metadata based on predetermined rules. These rules can be configured for a set of working files 214 based on certain parameters associated with the working files 214. For instance, according to one rule, the metadata for all music files is classified as public metadata. Similarly, according to another rule, for text files, some metadata fields such as filename, file type, file size, and creation date are classified as public metadata, while other fields such as edits, comments, and review time are classified as private metadata.
  • the predetermined rules may be configured by an authorized user. For instance, a user with administrator rights is allowed to configure the classification rules. Alternatively, an author of a working file 214 is allowed to configure the classification rules for that working file.
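The rule-based classification described above could be sketched as a lookup table keyed by file type and metadata field. The table layout, the `"*"` wildcard, and the choice to treat unlisted fields as private are assumptions made for illustration; the two example rules (music files fully public, text files split between public and private) follow the text.

```python
PRIVATE, SHARED, PUBLIC = "private", "shared", "public"

# Hypothetical rule table: file type -> {field name: class}.
# "*" is an assumed wildcard for fields not listed explicitly.
RULES = {
    "music": {"*": PUBLIC},
    "text": {
        "filename": PUBLIC,
        "file_type": PUBLIC,
        "file_size": PUBLIC,
        "creation_date": PUBLIC,
        "edits": PRIVATE,
        "comments": PRIVATE,
        "review_time": PRIVATE,
        "*": PRIVATE,
    },
}


def classify_metadata(file_type, metadata, rules=RULES):
    """Split a metadata dict into private, shared and public groups."""
    # Assumption: unknown file types fall back to fully private.
    type_rules = rules.get(file_type, {"*": PRIVATE})
    classified = {PRIVATE: {}, SHARED: {}, PUBLIC: {}}
    for field, value in metadata.items():
        cls = type_rules.get(field, type_rules.get("*", PRIVATE))
        classified[cls][field] = value
    return classified
```

Keeping the rules in data rather than code means an authorized user can change them without touching the classifier itself.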
  • the DMS 102 includes the authentication module 208. This module 208 retains authorization information of all the users 108 of the DMS 102. Moreover, the authentication module 208 is configured to request users to enter their authentication information (such as a user name, a password, a secret answer and so on) and based on a lookup of this information with the retained authorization information, the authentication module 208 determines the authorization level of a user 108. This authentication information is then communicated to the processing unit 206.
  • if the authenticated user is the author of one or more working files 214, the user 108 may be permitted to configure the classification rules for those working files 214.
  • if the authenticated user is an administrator, the user 108 may be allowed to configure the classification rules of any working file 214.
  • if the authenticated user is not the administrator or the author of any of the working files, the user 108 is prohibited from setting up classification rules.
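The three cases above reduce to a small permission check. The `User` type and the function `may_configure_rules` are hypothetical names used only to make the logic concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """Hypothetical authenticated user as reported by the authentication module."""
    name: str
    is_admin: bool = False


def may_configure_rules(user, file_author):
    """Administrators may configure rules for any file; authors only for their own."""
    return user.is_admin or user.name == file_author
```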
  • the processing unit 206 is configured to create sidecar files 220 for the metadata.
  • sidecar files, also known as buddy files or connected files, are files that store data (usually metadata) which is not supported by the source file format. Commonly, these files employ the Extensible Markup Language (XML). However, other formats may be utilized for sidecar files.
  • the processing unit 206 creates one or more sidecar files 220 for each retrieved working file 214. Moreover, in most cases, the relationship between the working file 214 and the sidecar file 220 is based on the file name.
  • the working file 214 and the sidecar file 220 may have the same name, but different extensions. In some arrangements, the processing unit 206 may store the sidecar files 220 in the same folder as the working file 214. However, in other arrangements, the sidecar files may be stored in any other location different from the storage location of the working file 214.
  • the processing unit 206 creates individual sidecar files for the classified metadata. For instance, a private sidecar file is created for the private metadata, a shared sidecar file is created for the shared metadata, and a public sidecar file is created for the public metadata. Moreover, the processing unit 206 may link these sidecar files 220 with their associated working files 214.
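A minimal sketch of sidecar creation follows, assuming XML sidecars stored next to the working file and linked to it by sharing its base name. The extension scheme (`.private.xml`, `.shared.xml`, `.public.xml`), the element names, and the function `write_sidecars` are illustrative assumptions; the disclosure only requires the same name with a different extension.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def write_sidecars(working_file, classified):
    """Write one XML sidecar per non-empty metadata class.

    `classified` is assumed to map a class name ("private", "shared",
    "public") to a dict of metadata fields for that class.
    """
    working_file = Path(working_file)
    created = []
    for cls, fields in classified.items():
        if not fields:
            continue  # no sidecar is created for an empty class
        root = ET.Element("sidecar", {"class": cls, "working-file": working_file.name})
        for name, value in fields.items():
            ET.SubElement(root, "field", {"name": name}).text = str(value)
        # Same base name as the working file, different extension.
        sidecar = working_file.with_name(f"{working_file.stem}.{cls}.xml")
        ET.ElementTree(root).write(str(sidecar), encoding="utf-8", xml_declaration=True)
        created.append(sidecar)
    return created
```

For a working file `report.docx` with private and public metadata, this would produce `report.private.xml` and `report.public.xml` in the same folder.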
  • Metadata from multiple working files is aggregated and stored in a separate metadata repository.
  • Storing metadata in this fashion can cause difficulties when different systems are utilized to manage the metadata and the working files.
  • a centralised metadata repository can face difficulty in monitoring changes in working files that are replicated across multiple physical devices and that are not accessible to the centralised metadata repository.
  • the metadata is classified and stored in multiple sidecar files 220, which are directly linked with their associated working file 214 and stored in the same database 212. Because of this direct association, when the working file 214 is shared over the network 106, the metadata is also accessed with the working file 214. Moreover, if the working file 214 is simultaneously shared by two or more users 108 based on the authentication of the two or more users 108, and the two or more users 108 simultaneously modify the shared working file, the metadata sidecar file 220 accessible to the users 108 is automatically updated as well. In one arrangement, for instance, the processing unit 206 may monitor the working file 214 accessed by multiple users 108 and detect when two or more users simultaneously modify the working file 214.
  • the processing unit 206 may be configured to notify the users of the simultaneous modification, and resolve the simultaneous modification such that the resolved modification is captured or updated in the associated sidecar file as well.
  • the processing unit 206 may be configured to "check out" and lock the "modify" properties of a sidecar file 220 while updates are being made, thereby preventing another user from making updates at the same time.
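One common way to realise a check-out and lock step is a lock file created atomically beside the sidecar. The sketch below takes that approach as an assumption; the `.lock` suffix, the `SidecarLocked` exception and the `check_out` helper are hypothetical, and the disclosure does not prescribe a locking mechanism.

```python
import os
from contextlib import contextmanager


class SidecarLocked(Exception):
    """Raised when another user already holds the check-out."""


@contextmanager
def check_out(sidecar_path, user):
    """Hold an exclusive check-out on a sidecar file for the duration of a block."""
    lock_path = f"{sidecar_path}.lock"
    try:
        # O_CREAT | O_EXCL fails if the lock file already exists, so only
        # one caller can create it.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise SidecarLocked(f"{sidecar_path} is already checked out") from None
    try:
        with os.fdopen(fd, "w") as lock_file:
            lock_file.write(user)  # record who holds the check-out
        yield sidecar_path
    finally:
        os.remove(lock_path)  # check the sidecar back in
```

Updates to the sidecar would be made inside a `with check_out(path, user):` block; a second user attempting the same check-out during that time receives `SidecarLocked`.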
  • the processing unit 206 is also configured to encrypt the working files and/or the sidecar files. Any encryption format may be utilized without departing from the scope of the present disclosure. Further, in some arrangements, the processing unit is also configured to manage different versions of the working files and their sidecar files and also resolve any conflicts between different versions of the working files.
  • the linked working files 214 and their associated sidecar files 220 are stored in the database 212.
  • this database 212 is illustrated as part of the DMS 102.
  • this database 212 may be present on one or more of the user devices 104 or on the network 106.
  • the DMS stores the working files 214 with their sidecar files 220 in the respective storage devices 110 from which the working files 214 were retrieved.
  • the DMS 102 allows users 108 to manage working files 214 by collating working files 214 from a number of disparate formats and storage devices into a central location for users to view and modify.
  • the processing unit 206 may be configured to generate hierarchical structures for the working files based on the classifications of these working files.
  • FIG. 3 is a flowchart illustrating an exemplary method 300 for sharing working files on a network. This method 300 will be described with reference to FIGS. 1 and 2.
  • the method 300 begins at step 302 where one or more working files, such as the working files 214, are retrieved from one or more sources. In one arrangement, the retrieval unit 202 retrieves the working files 214 from one or more storage devices 110, servers 112, or application platforms 114 each time a working file 214 is received or created.
  • the retrieval unit 202 may retrieve working files 214 at predetermined times from the sources or request a user 108 to manually select one or more working files 214 for retrieval.
  • at step 304, metadata associated with the one or more retrieved working files 214 is extracted.
  • metadata is typically stored within the working file 214 or in a metadata repository 218.
  • the extractor 204 is configured to extract this metadata from either the retrieved working file 214 or the metadata repository 218. It will be noted that the extractor 204 is further configured to communicate with the user 108 through the user interface 210 to display the extracted metadata, and/or request the user to modify, add or remove metadata fields from the extracted metadata.
  • the extracted metadata for each of the retrieved working files is classified.
  • the processing unit 206 classifies the extracted metadata based on one or more predetermined rules.
  • the rules may be configured by an authorized user or automatically set by the DMS 102.
  • the rules may vary for different working file formats. For instance, the rules may be different for text files and music files. Alternatively, the same rules may be applied to all the working files 214.
  • the metadata is classified as private metadata, public metadata, or shared metadata. These classifications are merely exemplary and any other classification system may be employed without departing from the scope of the present disclosure.
  • one or more sidecar files 220 are created for each of the retrieved working files 214 based on the classified metadata. For instance, if the metadata for a working file 214 is classified as private metadata and public metadata, two sidecar files 220 are created for that particular file - a private sidecar file and a public sidecar file. Alternatively, if the metadata for a working file 214 is classified as private metadata, shared metadata, and public metadata, three sidecar files 220 are created for that working file 214 - a private sidecar file, a shared sidecar file, and a public sidecar file.
  • the processing unit 206 is configured to link the one or more sidecar files 220 with their associated working file 214 at step 310.
  • the sidecar files are linked to the associated working file by adopting the same file name as that of the working file, but with different extensions.
  • the sidecar files 220 are stored in association with their respective working file 214 in a repository, at step 312.
  • the repository may be an on-board database 212 of the DMS 102, a database 212 present on the user device 104 or a database coupled to the network 106.
  • the sidecar files 220 are stored in association with their respective working files 214 in the storage devices 110 from which the working files 214 were retrieved.
  • the sidecar files 220 can be stored in the same folder as the working file 214, in one arrangement. Alternatively, the sidecar files 220 can be stored in a different folder or repository from the working file 214.
  • the various sidecar files 220 associated with a particular working file 214 are stored in three designated locations, which may be different or may overlap to a degree.
  • the DMS 102 can be deployed in an environment where working files are distributed across multiple shared folders, with each shared folder having a different level of access control.
  • Various combinations of shared folders for both the working documents and the metadata sidecar files can then be defined to cater for a broad range of document sharing scenarios (See example scenarios).
  • the DMS 102 is configured to manipulate and/or manage the working files 214 based on the sidecar files 220.
  • the user 108 may input a search query to retrieve specific working files.
  • the processing unit 206 determines the authorization level of the user 108 at step 314.
  • the authentication unit 208 is employed. In one arrangement, the authentication unit 208 may determine whether the user is the author of the requested working file 214, a part of an authorized group associated with the working file 214, or just any other user. In case the user is the author, the user is allowed to access the working file 214 and the corresponding private sidecar file.
  • if the user is part of the authorized group, the user is allowed access to the working file 214 and the associated shared sidecar file.
  • otherwise, the user is classified as any user and is allowed access to the working file and the associated public sidecar file.
  • processing unit 206 is configured to access the sidecar files 220 that the user is authorized to access, to retrieve specific working files 214 (step 316). For instance, a user may wish to access all the music files created by him/her from the year 1950 that are stored on the storage devices 110. In this case, as the user is the author, the processing unit 206 accesses all the metadata sidecar files 220 associated with music files to determine the release year of the music files. Subsequently, all the music files from the year 1950 will be displayed for the user 108 on the user interface 210.
  • the DMS 102 would be unable to search for the release year and consequently the DMS 102 would not be able to return accurate search results.
  • the DMS 102 may display the metadata fields available to the user based on the user's authorization level. Thereafter, the user 108 can filter or search working files 214 based on the metadata fields available to him/her, thus saving time.
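The access-controlled search described above can be sketched as follows. The record layout (a dict with `name`, `author`, `group` and `sidecars` keys) and the function names are hypothetical. The sketch also assumes that the author belongs to the authorized group and that public metadata is visible to every user, which follows the class descriptions earlier in the text.

```python
def accessible_classes(user, author, group):
    """Return the sidecar classes a user may read for one working file."""
    if user == author:
        # Assumption: the author is part of the authorized group.
        return {"private", "shared", "public"}
    if user in group:
        return {"shared", "public"}
    return {"public"}


def visible_metadata(user, record):
    """Merge the metadata of every sidecar the user is authorized to read."""
    classes = accessible_classes(user, record["author"], record["group"])
    merged = {}
    for cls in ("public", "shared", "private"):
        if cls in classes:
            merged.update(record["sidecars"].get(cls, {}))
    return merged


def search(user, records, **criteria):
    """Return names of working files whose visible metadata matches all criteria."""
    hits = []
    for record in records:
        meta = visible_metadata(user, record)
        if all(meta.get(key) == value for key, value in criteria.items()):
            hits.append(record["name"])
    return hits
```

A field stored only in a private or shared sidecar never matches for a user outside the corresponding audience, so such a user's search on that field returns no results.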
  • the DMS 102 can resolve any simultaneous modifications to working files shared by two or more users 108.
  • the method proceeds by notifying the two or more users 108 of the simultaneous modification, and resolving the simultaneous modification such that the resolved modifications may be updated/captured in the associated sidecar file as well.
  • the DMS 102 may retrieve a list of the working files at step 302.
  • working files are stored in their native location, such as any logical or physical device associated with the DMS 102.
  • the DMS 102 maintains a list of the working files that are stored across multiple logical and physical devices.
  • the DMS 102 may detect changes such as addition, deletion or updates to the working files.
  • the DMS 102 may also be able to maintain an audit trail of such changes.
  • the DMS 102 may be unable to recover deleted files in this arrangement.
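Change detection over a maintained list of working files can be illustrated by comparing two snapshots of that list. The snapshot format (a dict mapping a file path to some fingerprint such as a hash or modification time) and the function names are assumptions made for this sketch.

```python
def diff_snapshots(previous, current):
    """Compare two {path: fingerprint} snapshots of the working-file list."""
    prev_paths, curr_paths = set(previous), set(current)
    return {
        "added": sorted(curr_paths - prev_paths),
        "deleted": sorted(prev_paths - curr_paths),
        # Present in both snapshots but with a different fingerprint.
        "updated": sorted(
            p for p in prev_paths & curr_paths if previous[p] != current[p]
        ),
    }


def append_audit(trail, changes, timestamp):
    """Append one (timestamp, kind, path) entry per detected change."""
    for kind in ("added", "deleted", "updated"):
        for path in changes[kind]:
            trail.append((timestamp, kind, path))
    return trail
```

Because only fingerprints are kept, the sketch can report that a file was deleted but cannot restore it, which is consistent with the limitation noted for this arrangement.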
  • when the DMS 102 retrieves the working files from their respective locations at step 302, the DMS 102 can store a copy of the working files in one or more folders that are essentially part of the DMS 102. Maintaining copies of the working files is desirable when a snapshot of the working files in a particular location at a particular time is required. It is then relatively easy to synchronise the working files in a native location against the copies retained in the DMS folder. Multiple versions of a working file can also be retained and managed, making it possible to recover files or restore an earlier version. Changes made to a working file by multiple users at the same time can be detected and any conflicts resolved. If modification of working files is made through the DMS then it is also possible to use a check out and lock mechanism to prevent concurrent access to the same working file.
  • the working files may be stored on remote servers that are not directly accessible by the DMS 102.
  • technologies such as FTP and web services are utilized to transfer the working files 214 from the remote server to a local database.
  • the DMS 102 may store a copy of the working files in an associated database.
  • the working files can be transformed and combined with their associated metadata sidecar files into a single physical file using filing formats such as Zip*. Such combination of the working file with the metadata sidecar files prevents separation of the working files and their associated metadata.
  • the files may be compressed and/or encrypted.
  • users may wish to manage their own personal data present on various remote, local and virtual physical devices. Accordingly, users may utilise the DMS 102 platform described above. Further, the DMS 102 may retrieve working files from multiple locations and store copies of these working files in a cloud database. Further, as the user wishes to manage their personal documents, the DMS 102 classifies all the working files as private documents. Further, for these personal documents, the DMS 102 creates private metadata sidecar files and in some instances public metadata sidecar files. These sidecar files are then stored with the working files in the cloud database. The metadata may be classified as public and private metadata so that people other than the authorized user may access some of the metadata while accessing a working file. Table 1 summarizes the location of the working files and their associated sidecar files for this case.
  • the DMS 102 may store the working files in two separate cloud folders/databases. For instance, personal working files may be stored in a personal cloud folder, while shared working files may be stored in another folder/database (e.g., shared cloud folder).
  • the DMS 102 creates metadata sidecar files for the working files.
  • metadata for private files may include private sidecar files.
  • the private sidecar metadata files are stored in a personal cloud folder for metadata. This folder may or may not be the same folder as the personal cloud folder for documents. If the folders are the same, the metadata sidecar files are stored with the private working files.
  • the DMS 102 may generate public, shared and private metadata sidecar files. Moreover, the public and shared metadata sidecar files may be stored with the documents in the shared cloud folder, while the private metadata sidecar file may be stored in the personal cloud folder for documents or metadata. Table 2 summarizes the location of various working files and their metadata for this case.
  • the DMS 102 classifies documents as either shared documents or public documents.
  • the shared documents may be stored in a shared cloud for documents, while the public working files may be stored in a public cloud folder for documents.
  • the DMS 102 may generate public, shared and private metadata for the working files.
  • the public metadata sidecar files for shared documents may be stored in the shared cloud folder, while the public metadata sidecar file for public documents may be stored in the public cloud folder.
  • the shared metadata for shared and public working files may be stored in the shared cloud folder, while the private metadata for shared and public working files may be stored in a personal cloud folder. Table 3 summarizes the location of the working files and their metadata for this case.
  • the DMS 102 retrieves the working files and generates associated metadata for the working files.
  • Working files related to all the bid teams may be treated as public documents. However, they may be stored in a shared cloud folder.
  • the metadata may be classified as public, shared and private and the metadata sidecar files may be stored with the working file, in the shared network folder, or in a personal folder respectively. Table 5 illustrates a summary of this information.
  • the DMS 102 is configured to extract existing metadata or create metadata for the working files, derive a hierarchical structure, such as classification trees, from the metadata to facilitate browsing of the working files, define additional metadata fields through the user interface or by loading data dictionaries. Furthermore, the DMS 102 is configured to classify the metadata into different categories, apply permissions to the metadata fields, allocate metadata to distinct sidecar files based on the applied permissions, and store metadata in the DMS 102 or any other databases associated with the DMS 102. Moreover, the DMS 102 is configured to prevent and/or resolve concurrent updates to metadata sidecar files, and monitor updates to working files.
  • the DMS 102 updates the metadata immediately to reflect any changes in the working file 214. This is because, unlike the traditional document management systems, the DMS 102 of the present disclosure links and stores the metadata associated with a working file in a sidecar file. This file is always activated when the working file is accessed and accordingly any changes applied to the working file are also applied to this sidecar file.
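
The list above mentions combining a working file and its metadata sidecar files into a single physical file in a format such as Zip so that the two cannot be separated. A minimal sketch of that idea, using Python's standard `zipfile` module, might look as follows. The function names `bundle` and `unbundle` and the example file names are hypothetical and are not taken from the disclosure.

```python
import io
import zipfile


def bundle(working_name: str, working_bytes: bytes, sidecars: dict[str, bytes]) -> bytes:
    """Pack a working file and its sidecar files into one in-memory Zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(working_name, working_bytes)
        for sidecar_name, sidecar_bytes in sidecars.items():
            archive.writestr(sidecar_name, sidecar_bytes)
    return buffer.getvalue()


def unbundle(archive_bytes: bytes) -> dict[str, bytes]:
    """Return every member of the archive, keyed by its name."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}
```

Because the working file and its sidecars travel as one archive, any mechanism that copies or synchronises the archive carries the metadata along with it. Encryption, which the list also mentions, is outside the scope of this sketch.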

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Storage Device Security (AREA)

Abstract

A method for managing a working file to permit sharing of the working file is presented. The method includes the steps of retrieving the working file from a source, extracting metadata associated with the retrieved working file, creating at least one sidecar file for the retrieved working file based on the extracted metadata, linking the at least one sidecar file with the retrieved working file, and storing the at least one sidecar file in association with the retrieved working file.

Description

METHOD AND SYSTEM FOR MANAGING AND SHARING WORKING FILES IN A DOCUMENT MANAGEMENT SYSTEM:
CROSS-REFERENCE TO RELATED APPLICATION
The present application claims priority from Australian Provisional Patent Application No. 2013902993 filed on 9 August 2013, the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
[0001] The present disclosure is generally related to document management systems and, more particularly, to methods and systems for managing metadata in shared access document management systems.
BACKGROUND
[0002] With the advent of digital data, digital storage, and the Internet, more and more users are creating digital data and storing this data on their electronic devices. For instance, users may create and store multiple text files, images, audio files, videos, and the like. With such large volumes of disparate file types, managing a user's or an organization's data becomes cumbersome.
[0003] One platform that allows users to manage their disparate data is a document management system (DMS). Document management systems are often computer systems or a set of computer programs used to track and store electronic documents. The DMS is usually also capable of tracking different versions of a document modified by different users (i.e., history tracking). Further, some DMS integrate document management directly into other applications. This way, users retrieve existing documents directly from a repository associated with the DMS, edit the documents, and save the edited document back in the repository as a new version without leaving the DMS application. To perform these operations, the DMS relies on metadata associated with the documents. Metadata is data defining data in a digital file and typically includes fields such as file type, date created, date modified, author, artist, and/or title, for example. By reviewing the metadata, the DMS determines the file type and compares a previous modified date with a new modified date to determine any changes in a document. Further, based on one or more metadata fields, the DMS classifies the documents for representation on a user interface.
[0004] Typically, applications that create documents generate some metadata associated with the document. The DMS system extracts this metadata and, in some applications, adds its own metadata and stores the metadata in an associated repository. Often, documents are stored in one database, while their metadata is stored in a separate metadata repository. Particularly, metadata for multiple documents is aggregated into a master metadata file, which is stored in a centralized metadata repository. When a user accesses a document from the repository and modifies the document, the modified document is saved in the repository and the DMS alters the metadata associated with that document in the metadata repository.
[0005] Such an operation works efficiently when a single user manages his/her data from a single electronic device. However, difficulties arise when users share their documents with other users on a shared network, such as the Internet, a local area network (LAN), or the like. Moreover, the difficulties increase when the documents are shared using cloud computing.
[0006] With the emergence of cloud computing, new document management systems, such as Dropbox* and Google Drive*, were introduced. These systems allow users to replicate and synchronize multiple copies of the same document across different physical devices. However, their ability to synchronize documents is generally limited because these systems allow the creation of "conflicting copies" of a particular document when multiple users access the document at the same instant. For example, if two or more users concurrently modify a document on a cloud-computing document management system, the system may fail to correctly update all the modifications in the metadata. This is because the metadata stored in the metadata repository may not be synchronized for all the user modifications until the document is closed and stored back on the cloud.
SUMMARY
[0007] According to one aspect of the present disclosure, a method for managing a working file to permit sharing of the working file is presented. The method includes the steps of retrieving the working file from a source, extracting metadata associated with the retrieved working file, creating at least one sidecar file for the retrieved working file based on the extracted metadata, linking the at least one sidecar file with the retrieved working file, and storing the at least one sidecar file in association with the retrieved working file.
[0008] According to another aspect of the present disclosure, a system for managing working files is presented. The system includes a retrieval unit configured to retrieve the working files from one or more sources, an extractor configured to extract metadata associated with the retrieved working files, and an authentication unit configured to authenticate a user. The system further includes a processing unit configured to classify the extracted metadata of each working file from the retrieved working files into at least one of private metadata, public metadata, or shared metadata, create at least one sidecar file for each of the retrieved working files based on the classified metadata, the at least one sidecar file comprising at least one of a private sidecar file, a public sidecar file, or a shared sidecar file, link the at least one sidecar file with each of the corresponding retrieved working files, store the at least one sidecar file in association with each of the corresponding working files, access one or more of the working files and corresponding private sidecar files if the authenticated user is an author of the working files, access one or more of the working files and corresponding shared sidecar files if the authenticated user is part of an authorized group, access one or more of the working files and corresponding public sidecar files for any user, and classify the accessed working files based on the corresponding accessed sidecar files.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] At least one embodiment of the present invention will now be described with reference to the drawings.
[0010] FIG. 1 is a block diagram illustrating a computing environment where the presently disclosed arrangements are employed.
[0011] FIG. 2 is a block diagram illustrating an exemplary document management system.
[0012] FIG. 3 is a flowchart illustrating an exemplary method for sharing working files on a shared network.
[0013] While the systems and methods described herein are amenable to various modifications and alternative forms, specific implementations are shown by way of example in the drawings and are described in detail. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the systems and methods described herein to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present invention as defined by the appended claims.
DETAILED DESCRIPTION
[0014] A system and a method for managing and sharing documents in a document management system (DMS) are presented. The system and method are configured to generate one or more forms of metadata associated with a working file, and store the one or more forms of metadata in a sidecar file associated with the working file in a distributed arrangement. Moreover, the methods and systems allow a user to access the working file and one or more of the associated metadata sidecar files based on an access level granted to the user. It is also anticipated that the present disclosure could be used to create distributed databases that are not intended to be used as the basis for a document management system. In this situation, the sidecar becomes the equivalent of a record in a conventional relational database.
[0015] Furthermore, the techniques disclosed herein allow management of sidecar files even if a working file is not associated with the sidecar file. Generally, the metadata stored in a sidecar file is generated contemporaneously with or after the creation of the corresponding source file. However, in certain situations, metadata can be defined before the existence of a working file or in the absence of a working file. For example, if a wine enthusiast creates a database of his/her wine collection, the working file could be an image of the label. In this example, it is likely that the wine enthusiast would catalogue his/her collection (i.e., create metadata) before the images (i.e., the working files) are collected. Similar scenarios can also exist in a legal review context, where evidence is identified and catalogued before the corresponding documents have been located and imaged.
[0016] The DMS as described herein finds utility in various applications. In one example, the DMS is utilized by organizations to manage, track, and share their organizational data. Alternatively, the DMS can be utilized by individual users to manage and share their personal data. These applications are merely illustrative and the DMS may be utilized in numerous other applications.
[0017] FIG. 1 is a block diagram of a distributed computing environment 100 where an exemplary document management system (DMS) 102 is implemented. The environment 100 includes the DMS 102 installed and executing on a server computer within a communications network 106. The network 106 interconnects and operatively couples one or more user devices 104a, 104b, 104c, 104d ... 104n (collectively referred to as user devices 104) to the DMS 102. Further, one or more users 108a, 108b ... 108c (collectively referred to as users 108) operate the corresponding user devices 104. The environment 100 further includes one or more storage devices 110, servers 112, and application platforms 114 operatively coupled to the network 106 and/or the user devices 104.
[0018] The network 106 can be a private network, such as a local area network (LAN) or a virtual private network (VPN), hosted by a single organization; a community network, such as a wide area network (WAN), hosted by multiple organizations; a public network, such as the Internet; or a hybrid combination of these networks. Furthermore, the user devices 104 are operatively coupled to the network 106 through wired or wireless means. Wired connections include Ethernet cables, LAN cables, and the like, while wireless connections include Bluetooth®, 802.11 standard WLAN, Wi-Fi, cellular telecommunications, and the like.
[0019] The user devices 104 can be any electronic device that is capable of exchanging data over the network 106. Examples of such electronic devices include personal computers, laptops, tablets, or handheld devices. Moreover, as referred to in the present disclosure, a 'user' 108 of the environment 100 includes an individual person or an organization including multiple persons. In addition, the storage devices 110 include storage servers on the network 106, such as shared databases maintained by one or more people or an organisation. The storage devices 110 also include user-maintained databases on the network 106 such as email accounts, and cloud storage devices such as Google* Drive accounts, Dropbox* accounts and the like. Cloud storage devices refer to a network of multiple physical devices that each store and maintain a copy of a working file with an associated mechanism for replicating and synchronizing the various copies of the working files on all the physical devices present in the cloud network.
[0020] Users 108 typically store data, such as text files, music files, videos, images, program files, system files, or executable files on one or more of these storage devices 110. Collectively, such data in any known format is termed 'working files' throughout this disclosure. Over time, as users continue to create and store working files, the number of working files distributed across the network 106 becomes gargantuan. Often, users face difficulty in managing and/or tracking such a large number of files. To aid users 108 in efficient file management, the DMS 102 may be employed. The DMS 102 as described in the present disclosure enables users 108 to view, track, search, access, modify and potentially share their documents from multiple storage devices 110 in a fast and efficient manner.
[0021] To this end, the DMS 102 retrieves working files and their associated metadata from one or more sources. Further, the DMS customizes and classifies the metadata and creates one or more sidecar files for the metadata. Thereafter, the DMS 102 links the sidecar metadata files to each of the corresponding working files and stores both the working file and the sidecar files in a repository. A sidecar file is therefore configured to exist alongside and be communicated together with an associated working file and, according to the present disclosure, act as a repository for the metadata of the associated working file. Because the metadata is stored in sidecar files associated with the working files, rather than an aggregated metadata catalogue, the metadata can be replicated and synchronised using the same mechanism used to replicate and synchronise the working files. This is particularly beneficial in situations where the storage devices 110 are cloud based. Detailed functionality of the DMS will be described with reference to FIGS. 2 and 3.
[0022] In the implementation illustrated in FIG. 1, the DMS 102 is executed within the network 106. Because of this arrangement, the DMS 102 can be accessed from any user device 104 that is capable of connecting to the network 106, irrespective of the device's configuration. However, because the DMS is executed through the network 106, it is inaccessible without a network connection. In other implementations, instances of the DMS platform 102 can be installed on each of the user devices 104. In this case, users can access the DMS 102 without a network connection. However, the DMS 102 is inaccessible from user devices 104 on which the DMS 102 is not installed. Alternatively, the DMS 102 is executed on one or more of the user devices 104 and on the network 106. In such a case, the DMS 102 can be accessed from the network 106 when the user device 104 is connected to the network 106 and from the user device itself when the user device 104 is disconnected from the network 106. Based on individual requirements, any of these computing network implementations may be selected.
[0023] FIG. 2 is a block diagram 200 of the exemplary DMS 102, which includes a retrieval unit 202 coupled to one or more sources to retrieve working files 214, and an extractor 204 operably coupled to the retrieval unit 202 and to one or more metadata repositories 218 to extract metadata associated with the working files 214. The DMS 102 further includes a processing unit 206 operably coupled to the extractor 204 and a user authentication unit 208. The processing unit 206 is configured to classify the metadata, create sidecar files 220, and link the sidecar files 220 with the working files 214. A user interface 210 is provided to request inputs from the users 108, or display outputs to the users 108. The DMS 102 is also associated with at least one database 212 in which the linked working files 214 and sidecar files 220 are stored. Functionality of these units and modules will be described in detail in the following sections.
[0024] The retrieval unit 202 is configured to retrieve one or more working files 214 from one or more sources. The sources include storage devices 110, network servers 112 such as email or fax servers, or one or more application platforms 114 such as document editors or viewers. As described previously, the DMS 102 may be employed by organizations to manage, track, and share their working files at an organizational level or by individuals to manage and share their personal working files. In case the DMS 102 is employed by an organization, the DMS 102 is integrated into the organization's network 106 and coupled to the organization's servers and applications. Accordingly, in this application, each time a file is created or received by the network 106, the retrieval unit 202 retrieves the working file 214.
[0025] In case the DMS 102 is utilized to manage personal working files, the DMS 102 can be installed on one or more user devices 104 or executed as a network application. In this case, the retrieval unit 202 is configured to retrieve the working files 214 from one or more storage devices 110, servers 112 or application platforms 114 at predetermined times. For instance, the retrieval may be periodic (e.g., every day, every week, or every 6 hours) or at specific times (e.g., on activation of the DMS 102 application, or before shutting down). Further, the retrieval can be automated or manual. In the automated implementation, based on the predetermined retrieval time, the DMS 102 scans the sources to retrieve any working files 214 that have been added to the sources since the last retrieval.
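
As an illustration only, the automated scan described above could be sketched as follows for a source that is a local folder. The function name `files_added_since` is hypothetical, and using the file modification time as a stand-in for "added since the last retrieval" is an assumption of this sketch; the disclosure does not specify how new files are detected.

```python
from pathlib import Path


def files_added_since(source_dir: Path, last_retrieval: float) -> list[Path]:
    """Return files under source_dir whose modification time is later than last_retrieval.

    last_retrieval is a POSIX timestamp recorded at the previous scan (assumption).
    """
    return sorted(
        path
        for path in source_dir.rglob("*")
        if path.is_file() and path.stat().st_mtime > last_retrieval
    )
```

A scheduler would call this function at each predetermined retrieval time and then record the current time as the new `last_retrieval` value.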
[0026] In the manual implementation, whenever retrieval is required, the user 108 manually selects sources and the working files 214 to be retrieved from these sources. In this case, the retrieval unit 202 displays a list of the available sources on the user interface 210 and requests the user 108 to select one or more sources or one or more working files 214 from that list for retrieval. Subsequently, the retrieval unit 202 retrieves the selected working files 214.
[0027] The extractor 204 is operatively coupled to at least one of the metadata repositories 218 and the retrieval unit 202 to retrieve metadata associated with the working files 214. As described previously, most applications generate metadata associated with working files 214 when the files are created. This metadata typically includes values associated with fields such as name, creation date, author, access rights, file format, and so on. Moreover, applications may regularly update the metadata as and when the working files 214 are updated. For instance, every time a document is modified, the metadata field 'date modified' is updated. Traditionally, metadata is either embedded within a working file, or aggregated and stored in a metadata repository 218. Accordingly, based on the location of the metadata, the extractor 204 retrieves the metadata either from the working files 214 or from the metadata repository 218.
[0028] Often, the extracted metadata can be insufficient. For instance, organizations often desire customized metadata fields such as 'department', 'supervisor', 'internal reference number', and so on, which are not populated by generic applications. Similarly, individual users often desire customized metadata fields such as 'genre', 'locations where images were captured', 'image tags', 'type of study material', and so on. In both these cases, the extractor 204 is configured to request the user 108 to manually update, add or remove one or more of the metadata fields via operation of the user interface 210. To this end, the extractor 204 displays the retrieved working files 214 and their extracted metadata on the user interface 210. In addition, the extractor 204 presents the user 108 with options to modify, add, or remove metadata fields and their values at this stage.
[0029] Further, the processing unit 206 may be configured to generate metadata. Occasionally, the relationship between a working file and its associated metadata may be altered or destroyed while modifying or saving the working file 214. For example, time stamp information will be updated when a working file 214 is modified. Similarly, file path metadata can be changed if a working file 214 is moved or copied. Derived metadata, such as MD5 hash values, can also change when a file is modified. There are also situations where some applications overwrite or corrupt metadata previously inserted into a working file. In these cases, it can be difficult for the extractor 204 to locate the original metadata associated with a particular working file 214. In other cases, the metadata may not include values for one or more fields, such as the customized fields. Accordingly, when all or a portion of the metadata is missing, the processing unit 206 may be configured to automatically generate metadata for the working files 214 based on the content of the working files 214. For instance, the processing unit 206 can generate the case reference number from the subject line of an email, or generate a document number from a header or footer of the working file, and so on.
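
A minimal sketch of generating metadata from file content, assuming the derived fields are an MD5 hash and a case reference number taken from an email subject line, is given below. The function name `derive_metadata`, the field names, and the subject-line pattern "Case <digits>" are hypothetical choices made for illustration; the disclosure does not define a subject-line format.

```python
import hashlib
import re


def derive_metadata(content: bytes, subject: str = "") -> dict[str, str]:
    """Generate derived metadata for a working file from its content."""
    # The MD5 digest changes whenever the content changes, as noted above.
    metadata = {"md5": hashlib.md5(content).hexdigest()}
    # Hypothetical convention: the subject line contains "Case <digits>".
    match = re.search(r"\bCase\s+(\d+)", subject, flags=re.IGNORECASE)
    if match:
        metadata["case_reference"] = match.group(1)
    return metadata
```

When no case reference can be found, the field is simply left out, so a user could still supply it manually through the user interface 210.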
[0030] Moreover, the processing unit 206 is configured to classify the metadata into different classes. Users 108 may desire some metadata to be private, while they might not mind sharing other metadata with the public at large. Accordingly, based on a set of predetermined rules, the processing unit 206 is configured to classify the metadata associated with working files 214. In one example, for each retrieved working file, the processing unit 206 is configured to classify the metadata into one or more of three categories - private metadata, shared metadata, and public metadata. Private metadata is the metadata that only the author of the working file can access. This metadata typically includes fields such as track changes, edits, or comments in a working file, review time of the working file, and so on. Shared metadata includes metadata that can be accessed by an authorized working group, such as a team working on a project, people in a study group, and so on. As the author is typically a part of the group, the author is also allowed access to the shared metadata. This metadata typically includes version number, edits, and comments that may be shared within the group, and so on. Public metadata, on the other hand, includes metadata that can be accessed by anyone. This metadata often includes creation date, author, file type, and the like.
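
The access rules for the three categories can be sketched as a small lookup, shown here as an illustration rather than as the disclosed implementation. The `WorkingFile` record, its field names, and the function `accessible_classes` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WorkingFile:
    name: str
    author: str
    group: set[str] = field(default_factory=set)  # authorized working group


def accessible_classes(user: str, working_file: WorkingFile) -> list[str]:
    """Return the metadata classes a user may read for a working file."""
    classes = ["public"]  # public metadata is open to anyone
    if user == working_file.author or user in working_file.group:
        classes.append("shared")  # the author is treated as part of the group
    if user == working_file.author:
        classes.append("private")  # private metadata is for the author only
    return classes
```

For a file authored by "alice" with "bob" in its group, "alice" would see all three classes, "bob" would see public and shared metadata, and any other user would see public metadata only.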
[0031] It will be understood that these metadata classes are merely exemplary and that the processing unit 206 may be configured to classify the metadata according to any other classification or rules without departing from the scope of the present disclosure. For instance, the metadata may be classified as 'open to public', 'private', 'internal', 'privileged', 'confidential', or 'external'.
[0032] As described previously, the processing unit 206 is configured to classify the metadata based on predetermined rules. These rules can be configured for a set of working files 214 based on certain parameters associated with the working files 214. For instance, according to one rule, the metadata for all music files is classified as public metadata. Similarly, according to another rule, for text files, some metadata fields such as filename, file type, file size, and creation date are classified as public metadata, while other fields such as edits, comments, and review time are classified as private metadata.
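
The two example rules above can be expressed as a short sketch. The function name `classify`, the field spellings, and the choice to treat any field not named by a rule as private are assumptions of this sketch, not requirements of the disclosure.

```python
# Fields that the text-file rule above classifies as public metadata.
PUBLIC_TEXT_FIELDS = {"filename", "file_type", "file_size", "creation_date"}


def classify(file_kind: str, metadata: dict[str, str]) -> dict[str, dict[str, str]]:
    """Split metadata into public and private classes using the two example rules."""
    classified: dict[str, dict[str, str]] = {"public": {}, "private": {}}
    for name, value in metadata.items():
        if file_kind == "music" or (file_kind == "text" and name in PUBLIC_TEXT_FIELDS):
            classified["public"][name] = value
        else:
            # Assumption: anything not covered by a rule defaults to private.
            classified["private"][name] = value
    return classified
```

Defaulting to the most restrictive class means that a field nobody wrote a rule for is never exposed by accident.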
[0033] The predetermined rules may be configured by an authorized user. For instance, a user with administrator rights is allowed to configure the classification rules. Alternatively, an author of a working file 214 is allowed to configure the classification rules for that working file. To determine the authorization level of a user 108, the DMS 102 includes the authentication module 208. This module 208 retains authorization information of all the users 108 of the DMS 102. Moreover, the authentication module 208 is configured to request users to enter their authentication information (such as a user name, a password, a secret answer and so on) and based on a lookup of this information with the retained authorization information, the authentication module 208 determines the authorization level of a user 108. This authentication information is then communicated to the processing unit 206.
[0034] If the authenticated user is the author of one or more working files 214, the user 108 may be permitted to configure the classification rules for those working files 214. Alternatively, if the authenticated user is an administrator, the user 108 may be allowed to configure the classification rules of any working file 214. In case the authenticated user is not the administrator or the author of any of the working files, the user 108 is prohibited from setting up classification rules.
[0035] In addition to generating and classifying metadata, the processing unit 206 is configured to create sidecar files 220 for the metadata. Typically, sidecar files, also known as buddy files or connected files, are files that store data (usually metadata) which is not supported by the source file format. Commonly, these files employ the Extensible Markup Language (XML). However, other formats may be utilized for sidecar files. The processing unit 206 creates one or more sidecar files 220 for each retrieved working file 214. Moreover, in most cases, the relationship between the working file 214 and the sidecar file 220 is based on the file name. Accordingly, the working file 214 and the sidecar file 220 may have the same name, but different extensions. In some arrangements, the processing unit 206 may store the sidecar files 220 in the same folder as the working file 214. However, in other arrangements, the sidecar files may be stored in any other location different from the storage location of the working file 214.
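
The name-based link between a working file and an XML sidecar file can be sketched as follows. The `.xml` extension, the element names `metadata` and `field`, and both function names are hypothetical; the disclosure states only that the two files may share a name and differ in extension.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def sidecar_path(working_file: Path) -> Path:
    """Derive the sidecar location: same folder and name, different extension."""
    return working_file.with_suffix(".xml")


def sidecar_xml(metadata: dict[str, str]) -> str:
    """Serialise metadata fields as a small XML document (hypothetical schema)."""
    root = ET.Element("metadata")
    for name, value in metadata.items():
        ET.SubElement(root, "field", name=name).text = value
    return ET.tostring(root, encoding="unicode")
```

One consequence of relying on the file name alone is that two working files with the same name but different extensions in one folder would map to the same sidecar path, so a real system would need an additional disambiguating convention.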
[0036] The processing unit 206 creates individual sidecar files for the classified metadata. For instance, a private sidecar file is created for the private metadata, a shared sidecar file is created for the shared metadata, and a public sidecar file is created for the public metadata. Moreover, the processing unit 206 may link these sidecar files 220 with their associated working files 214.
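
One way to keep a separate sidecar file per metadata class while preserving the link to the working file is to encode the class in the sidecar name. The naming pattern `<name>.<class>.xml` and the function `sidecar_names` are assumptions made for this sketch.

```python
from pathlib import Path


def sidecar_names(working_file: Path, classified: dict[str, dict[str, str]]) -> dict[str, Path]:
    """Map each non-empty metadata class to its own sidecar path next to the working file."""
    return {
        metadata_class: working_file.with_name(f"{working_file.stem}.{metadata_class}.xml")
        for metadata_class, fields in classified.items()
        if fields  # no sidecar is created for a class without metadata
    }
```

Keeping one file per class lets each sidecar be stored in, and synchronised from, a location whose access permissions match that class.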
[0037] In traditional document management systems, metadata from multiple working files is aggregated and stored in a separate metadata repository. By storing metadata in this fashion, some difficulties may arise when different systems are utilized to manage the metadata and the working files. For example, a centralized metadata repository can face difficulty in monitoring changes in working files that are replicated across multiple physical devices and that are not accessible to the centralized metadata repository.
[0038] In accordance with the present disclosure, however, the metadata is classified and stored in multiple sidecar files 220, which are directly linked with their associated working file 214 and stored in the same database 212. Because of this direct association, when the working file 214 is shared over the network 106, the metadata is also accessed with the working file 214. Moreover, if the working file 214 is simultaneously shared by two or more users 108 based on the authentication of the two or more users 108, and the two or more users 108 simultaneously modify the shared working file, the metadata sidecar file 220 accessible to the users 108 is automatically updated as well. In one arrangement, for instance, the processing unit 206 may monitor the working file 214 accessed by multiple users 108 and detect when two or more users simultaneously modify the working file 214. Subsequently, the processing unit 206 may be configured to notify the users of the simultaneous modification, and resolve the simultaneous modification such that the resolved modification is captured or updated in the associated sidecar file as well. In another arrangement, the processing unit 206 may be configured to "check out" and lock the "modify" properties of a sidecar file 220 while updates are being made, thereby preventing another user from making updates at the same time.
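The "check out" and lock arrangement can be illustrated with a small in-memory registry. This is a sketch under stated assumptions, not the disclosed implementation: the class `SidecarCheckout`, its method names and the use of a single process-wide registry are all hypothetical.

```python
import threading


class SidecarCheckout:
    """Tracks which user currently holds the 'modify' lock on each sidecar."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the registry itself
        self._holders = {}              # sidecar name -> user holding the lock

    def check_out(self, sidecar, user):
        """Try to lock `sidecar` for `user`; return False if already locked."""
        with self._guard:
            if sidecar in self._holders:
                return False
            self._holders[sidecar] = user
            return True

    def check_in(self, sidecar, user):
        """Release the lock; only the user who checked out may check in."""
        with self._guard:
            if self._holders.get(sidecar) != user:
                raise PermissionError(f"{user} does not hold {sidecar}")
            del self._holders[sidecar]
```

While one user holds the lock, a second `check_out` on the same sidecar returns `False`, so that user can be told to wait or be notified rather than overwrite the update in progress.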
[0039] The processing unit 206 is also configured to encrypt the working files and/or the sidecar files. Any encryption format may be utilized without departing from the scope of the present disclosure. Further, in some arrangements, the processing unit is also configured to manage different versions of the working files and their sidecar files and also resolve any conflicts between different versions of the working files.
[0040] In one implementation, the linked working files 214 and their associated sidecar files 220 are stored in the database 212. In FIG. 2, this database 212 is illustrated as part of the DMS 102. Alternatively, this database 212 may be present on one or more of the user devices 104 or on the network 106. In yet another implementation, the DMS stores the working files 214 with their sidecar files 220 in the respective storage devices 110 from which the working files 214 were retrieved.
[0041] Once the working files 214 and their sidecar files 220 are stored in the repository, users 108 can access the working files 214 along with the metadata sidecar files 220 they have access to, classify the working files 214, or modify the working files 214 directly from the DMS 102. Thus, the DMS 102 allows users 108 to manage working files 214 by collating working files 214 from a number of disparate formats and storage devices into a central location for users to view and modify. Moreover, as the metadata for the working files 214 is readily available, the working files 214 may be classified, sorted or filtered based on any of the metadata fields accessible to a user. Moreover, the processing unit 206 may be configured to generate hierarchical structures for the working files based on the classifications of these working files.
[0042] FIG. 3 is a flowchart illustrating an exemplary method 300 for sharing working files on a network. This method 300 will be described with reference to FIGS. 1 and 2. The method 300 begins at step 302 where one or more working files, such as the working files 214, are retrieved from one or more sources. In one arrangement, the retrieval unit 202 retrieves the working files 214 from one or more storage devices 110, servers 112, or application platforms 114 each time a working file 214 is received or created. Alternatively, the retrieval unit 202 may retrieve working files 214 at predetermined times from the sources or request a user 108 to manually select one or more working files 214 for retrieval.
[0043] Subsequently, at step 304, metadata associated with the one or more retrieved working files 214 is extracted. As described previously, most applications generate metadata when a working file 214 is created. This metadata is typically stored within the working file 214 or in a metadata repository 218. The extractor 204 is configured to extract this metadata from either the retrieved working file 214 or the metadata repository 218. It will be noted that the extractor 204 is further configured to communicate with the user 108 through the user interface 210 to display the extracted metadata, and/or request the user to modify, add or remove metadata fields from the extracted metadata.
[0044] At step 306, the extracted metadata for each of the retrieved working files is classified. Particularly, the processing unit 206 classifies the extracted metadata based on one or more predetermined rules. The rules may be configured by an authorized user or automatically set by the DMS 102. Moreover, the rules may vary for different working file formats. For instance, the rules may be different for text files and music files. Alternatively, the same rules may be applied to all the working files 214.
[0045] In one example, the metadata is classified as private metadata, public metadata, or shared metadata. These classifications are merely exemplary and any other classification system may be employed without departing from the scope of the present disclosure.
[0046] Next, one or more sidecar files 220 are created for each of the retrieved working files 214 based on the classified metadata. For instance, if the metadata for a working file 214 is classified as private metadata and public metadata, two sidecar files 220 are created for that particular file - a private sidecar file and a public sidecar file. Alternatively, if the metadata for a working file 214 is classified as private metadata, shared metadata, and public metadata, three sidecar files 220 are created for that working file 214 - a private sidecar file, a shared sidecar file, and a public sidecar file.
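As an illustration of the rule-based classification and the resulting one-sidecar-per-class split, consider the following Python sketch. The rule table `RULES`, the field names, and the choice to treat unlisted fields as private are assumptions made for the example; the disclosure leaves the rule format open.

```python
# Hypothetical rule table: metadata field name -> classification.
RULES = {
    "title": "public",
    "release_year": "public",
    "reviewers": "shared",
    "notes": "private",
}


def classify_metadata(metadata, rules, default="private"):
    """Group metadata fields by classification.

    Fields without a rule fall back to `default` (an assumption of this
    sketch). Only non-empty classes are returned, so the number of keys in
    the result equals the number of sidecar files that would be created.
    """
    classified = {"private": {}, "shared": {}, "public": {}}
    for field, value in metadata.items():
        classified[rules.get(field, default)][field] = value
    return {level: fields for level, fields in classified.items() if fields}
```

Metadata containing only `title` and `notes` yields two groups (public and private) and therefore two sidecar files, whereas metadata that also contains `reviewers` yields three.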
[0047] The processing unit 206 is configured to link the one or more sidecar files 220 with their associated working file 214 at step 310. In one example, the sidecar files are linked to the associated working file by adopting the same file name as that of the working file, but with different extensions.
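A name-based link of this kind might look as follows in Python. The extension scheme (`.private.xml`, `.shared.xml`, `.public.xml`) and the XML element and attribute names are hypothetical; the disclosure only requires the same file name with a different extension and notes that XML is a common sidecar format.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def sidecar_path(working_file: Path, level: str) -> Path:
    """Derive a sidecar path that shares the working file's base name.

    Example: report.docx -> report.private.xml (hypothetical scheme).
    """
    return working_file.with_name(f"{working_file.stem}.{level}.xml")


def render_sidecar(working_file: Path, level: str, fields: dict) -> bytes:
    """Serialize one class of metadata as a small XML document."""
    root = ET.Element(
        "metadata", {"working_file": working_file.name, "level": level}
    )
    for name, value in fields.items():
        ET.SubElement(root, "field", {"name": name}).text = str(value)
    return ET.tostring(root, encoding="utf-8")
```

One consequence of linking by base name is that two working files differing only in extension (for example `report.docx` and `report.pdf`) would map to the same sidecar name under this scheme, so a real deployment would need a rule for that case.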
[0048] Furthermore, the sidecar files 220 are stored in association with their respective working file 214 in a repository, at step 312. The repository may be an on-board database 212 of the DMS 102, a database 212 present on the user device 104 or a database coupled to the network 106. In one implementation, the sidecar files 220 are stored in association with their respective working files 214 in the storage devices 110 from which the working files 214 were retrieved. Moreover, the sidecar files 220 can be stored in the same folder as the working file 214, in one arrangement. Alternatively, the sidecar files 220 can be stored in a different folder or repository from the working file 214. Further, the various sidecar files 220 associated with a particular working file 214 are stored in three designated locations, which may be different or may overlap to a degree. Furthermore, the DMS 102 can be deployed in an environment where working files are distributed across multiple shared folders, with each shared folder having a different level of access control. Various combinations of shared folders for both the working documents and the metadata sidecar files can then be defined to cater for a broad range of document sharing scenarios (see example scenarios).
[0049] Once the working files are stored with their associated sidecar files, the DMS 102 is configured to manipulate and/or manage the working files 214 based on the sidecar files 220. To this end, the user 108 may input a search query to retrieve specific working files. The processing unit 206 then determines the authorization level of the user 108 at step 314. For this determination, the authentication unit 208 is employed. In one arrangement, the authentication unit 208 may determine whether the user is the author of the requested working file 214, a part of an authorized group associated with the working file 214, or just any other user. In case the user is the author, the user is allowed to access the working file 214 and the corresponding private sidecar file. Instead, if the user is part of an authorized group, the user is allowed access to the working file 214 and the associated shared sidecar file. Alternatively, if the user is neither the author nor part of an authorized group, the user is classified as any user and is allowed access to the working file and the associated public sidecar file.
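The mapping from authorization level to sidecar files can be sketched as below. The function name and argument shapes are hypothetical. The sketch also assumes that access is cumulative (an author can read the shared and public sidecars as well, and a group member can read the public one); the paragraph above does not state this explicitly.

```python
def accessible_sidecars(user, author, authorized_group):
    """Return the sidecar classes `user` may read for one working file.

    Assumption of this sketch: access is cumulative, so higher
    authorization levels include the lower ones.
    """
    if user == author:
        return ["private", "shared", "public"]
    if user in authorized_group:
        return ["shared", "public"]
    return ["public"]
```

With `author="alice"` and `authorized_group={"bob"}`, Alice sees all three sidecar classes, Bob sees the shared and public ones, and any other user sees only the public sidecar.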
[0050] Once the user is authenticated, the processing unit 206 is configured to access the sidecar files 220 that the user is authorized to access, to retrieve specific working files 214 (step 316). For instance, a user may wish to access all the music files created by him/her from the year 1950 that are stored on the storage devices 110. In this case, as the user is the author, the processing unit 206 accesses all the metadata sidecar files 220 associated with music files to determine the release year of the music files. Subsequently, all the music files from the year 1950 will be displayed for the user 108 on the user interface 210.
[0051] It will be understood that in this example if the user was not the author or an authorized member of a group, and the metadata corresponding to the field 'release year' was stored in a private or shared sidecar file, the DMS 102 would be unable to search for the release year and consequently the DMS 102 would not be able to return accurate search results. To overcome this issue, once the user is authenticated, the DMS 102 may display the metadata fields available to the user based on the user's authorization level. Thereafter, the user 108 can filter or search working files 214 based on the metadata fields available to him/her, thus saving time.
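The release-year example can be made concrete with a short Python sketch. The in-memory `LIBRARY`, the file names, and the helper names are illustrative assumptions, and for simplicity a single list of readable sidecar classes is applied to the whole library rather than being computed per file.

```python
# Hypothetical library: working file name -> {sidecar class: metadata fields}.
LIBRARY = {
    "song_a.mp3": {"public": {"title": "Song A"}, "private": {"release_year": 1950}},
    "song_b.mp3": {"public": {"title": "Song B"}, "private": {"release_year": 1950}},
    "song_c.mp3": {"public": {"title": "Song C"}, "private": {"release_year": 1962}},
}


def visible_metadata(sidecars, levels):
    """Merge the fields of every sidecar class the user may read."""
    merged = {}
    for level in levels:
        merged.update(sidecars.get(level, {}))
    return merged


def available_fields(library, levels):
    """List the metadata fields the user can filter or search on."""
    return sorted(
        {field for sc in library.values() for field in visible_metadata(sc, levels)}
    )


def search(library, levels, field, value):
    """Return working files whose visible metadata has field == value."""
    return sorted(
        name
        for name, sc in library.items()
        if visible_metadata(sc, levels).get(field) == value
    )
```

Here the author, who can read the private sidecars, finds `song_a.mp3` and `song_b.mp3` when searching for `release_year == 1950`, while a user limited to public sidecars gets no results and is offered only `title` as a searchable field.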
[0052] Furthermore, as described previously, the DMS 102 can resolve any simultaneous modifications to working files shared by two or more users 108. To this end, the method proceeds by notifying the two or more users 108 of the simultaneous modification, and resolving the simultaneous modification such that the resolved modifications may be updated/captured in the associated sidecar file as well.
[0053] In another arrangement, instead of retrieving the working files at step 302, the DMS 102 may retrieve a list of the working files at step 302. In this arrangement, working files are stored in their native location, such as any logical or physical device associated with the DMS 102. In this case, the DMS 102 maintains a list of the working files that are stored across multiple logical and physical devices. Moreover, the DMS 102 may detect changes such as addition, deletion or updates to the working files. The DMS 102 may also be able to maintain an audit trail of such changes. However, the DMS 102 may be unable to recover deleted files in this arrangement.
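Change detection over a maintained list of working files can be sketched as a comparison of two snapshots. The snapshot format (path mapped to a content fingerprint such as a hash or modification time) and the function name `diff_listing` are assumptions of this example.

```python
def diff_listing(old, new):
    """Compare two snapshots mapping file path -> fingerprint.

    The fingerprint may be a content hash or a modification time; any
    value that changes when the file changes will do for this sketch.
    """
    return {
        "added": sorted(new.keys() - old.keys()),
        "deleted": sorted(old.keys() - new.keys()),
        "updated": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }
```

Appending each result with a timestamp to a log would give a simple audit trail. As the paragraph above notes, a list-only arrangement records that a file was deleted but holds no copy from which to restore it.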
[0054] In yet another arrangement, when the DMS 102 retrieves the working files from their respective locations at step 302, the DMS 102 can store a copy of the working files in one or more folders that are essentially part of the DMS 102. Maintaining copies of the working files is desirable when a snapshot of the working files in a particular location at a particular time is required. It is then relatively easy to synchronise the working files in a native location against the copies retained in the DMS folder. Multiple versions of a working file can also be retained and managed, making it possible to recover files or restore an earlier version. Changes made to a working file by multiple users at the same time can be detected and any conflicts resolved. If modification of working files is made through the DMS then it is also possible to use a check out and lock mechanism to prevent concurrent access to the same working file.
[0055] In a further implementation, the working files may be stored on remote servers that are not directly accessible by the DMS 102. In such scenarios, technologies such as FTP and web services are utilized to transfer the working files 214 from the remote server to a local database. Subsequently, the DMS 102 may store a copy of the working files in an associated database.
[0056] Moreover, in case the DMS 102 maintains a copy of the working files, the working files can be transformed and combined with their associated metadata sidecar files into a single physical file using filing formats such as Zip. Such combination of the working file with the metadata sidecar files prevents separation of the working files and their associated metadata. Moreover, during combination, the files may be compressed and/or encrypted.
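Combining a working file with its sidecar files into a single Zip archive can be sketched with Python's standard `zipfile` module. The function name `bundle` and the member names are hypothetical. The sketch compresses the members but does not encrypt them, since the standard-library module cannot write encrypted archives; encryption would need a separate step or library.

```python
import io
import zipfile


def bundle(working_name, working_bytes, sidecars):
    """Pack a working file and its sidecars into one compressed Zip archive.

    `sidecars` maps sidecar file name -> file content (bytes).
    Returns the archive as bytes.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(working_name, working_bytes)
        for name, data in sidecars.items():
            archive.writestr(name, data)
    return buffer.getvalue()
```

Because the working file and its metadata travel as one physical file, copying or moving the archive cannot separate them.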
Example Access Scenarios
[0057] The following illustrate various exemplary situations and scenarios where the above described arrangements may be implemented. It will be understood that these examples are not exhaustive and that variations may be utilized in multiple other situations and scenarios without departing from the scope of the present disclosure.
Personal Data Management
[0058] In this case, users may wish to manage their own personal data present on various remote, local and virtual physical devices. Accordingly, users may utilise the DMS 102 platform described above. Further, the DMS 102 may retrieve working files from multiple locations and store copies of these working files in a cloud database. Further, as the user wishes to manage their personal documents, the DMS 102 classifies all the working files as private documents. Further, for these personal documents, the DMS 102 creates private metadata sidecar files and in some instances public metadata sidecar files. These sidecar files are then stored with the working files in the cloud database. The metadata may be classified as public and private metadata so that people other than the authorized user may access some of the metadata while accessing a working file. Table 1 summarizes the location of the working files and their associated sidecar files for this case.
Table 1
Shared Data Management
[0059] In this case, an organization may wish to share some of their internal documents with their employees or a user may wish to share some working files in their personal library with friends. In this case, the DMS 102 may store the working files in two separate cloud folders/databases. For instance, personal working files may be stored in a personal cloud folder, while shared working files may be stored in another folder/database (e.g., shared cloud folder).
[0060] Moreover, the DMS 102 creates metadata sidecar files for the working files. Particularly, metadata for private files may include private sidecar files. The private sidecar metadata files are stored in a personal cloud folder for metadata. This folder may or may not be the same folder as the personal cloud folder for documents. If the folders are the same, the metadata sidecar files are stored with the private working files.
[0061] For the shared working files, the DMS 102 may generate public, shared and private metadata sidecar files. Moreover, the public and shared metadata sidecar files may be stored with the documents in the shared cloud folder, while the private metadata sidecar file may be stored in the personal cloud folder for documents or metadata. Table 2 summarizes the location of various working files and their metadata for this case.
Table 2
Public Data Management
[0062] In this case, organizations may create a document management system for sharing some documents with all their employees and clients (business reports, updates, memos and the like) and some documents with a set of employees (e.g., HR related files, financial files, employee records, and so on). To this end, the DMS 102 classifies documents as either shared documents or public documents. The shared documents may be stored in a shared cloud folder for documents, while the public working files may be stored in a public cloud folder for documents. Furthermore, the DMS 102 may generate public, shared and private metadata for the working files. The public metadata sidecar files for shared documents may be stored in the shared cloud folder, while the public metadata sidecar file for public documents may be stored in the public cloud folder. Further, the shared metadata for shared and public working files may be stored in the shared cloud folder, while the private metadata for shared and public working files may be stored in a personal cloud folder. Table 3 summarizes the location of the working files and their metadata for this case.
| | Document Repository | Public Metadata | Shared Metadata | Private Metadata |
| Private Documents | (None) | (None) | (None) | (None) |
| Shared Documents | Shared Cloud Folder for Documents | Shared Cloud Folder for metadata | Shared Cloud Folder for metadata | Personal Cloud Folder for metadata |
| Public Documents | Public Cloud Folder for Documents | With Documents | Shared Cloud Folder for metadata | Personal Cloud Folder for metadata |
Table 3
Litigation Review
[0063] During litigation, large numbers of documents are reviewed to ascertain whether they are relevant for a particular case. Typically a team of reviewers review portions of the total number of documents. In this case, all the documents are retrieved by the DMS 102 and classified as shared documents. Moreover, the DMS 102 may generate shared, public, and private metadata sidecar files for the working files. The shared and public metadata files may be stored along with the working file in the shared network folder, while private sidecar files may be stored in a personal folder or database. Table 4 summarizes the location of working files and their associated metadata in this case.
| | Document Repository | Public Metadata | Shared Metadata | Private Metadata |
| Private Documents | (None) | (None) | (None) | (None) |
| Shared Documents | Shared Network Folder for Documents | With Documents | With Documents | In personal Library |
| Public Documents | (None) | (None) | (None) | (None) |
Table 4
Data Room/Due Diligence review
[0064] Similarly, in due diligence reviews, a large number of working files related to an organization are retrieved and reviewed by multiple reviewers. Accordingly, in this case the DMS 102 retrieves the working files and generates associated metadata for the working files. Working files related to all the bid teams, for example, may be treated as public documents. However, they may be stored in a shared cloud folder. Furthermore, the metadata may be classified as public, shared and private and the metadata sidecar files may be stored with the working file, in the shared network folder, or in a personal folder respectively. Table 5 illustrates a summary of this information.
Table 5
[0065] Systems and methods described in the present disclosure allow multiple users to view or modify a working file. To this end, the DMS 102 is configured to extract existing metadata or create metadata for the working files, derive a hierarchical structure, such as classification trees, from the metadata to facilitate browsing of the working files, and define additional metadata fields through the user interface or by loading data dictionaries. Furthermore, the DMS 102 is configured to classify the metadata into different categories, apply permissions to the metadata fields, allocate metadata to distinct sidecar files based on the applied permissions, and store metadata in the DMS 102 or any other databases associated with the DMS 102. Moreover, the DMS 102 is configured to prevent and/or resolve concurrent updates to metadata sidecar files, and monitor updates to working files.
[0066] Furthermore, the DMS 102 updates the metadata immediately to reflect any changes in the working file 214. This is because, unlike traditional document management systems, the DMS 102 of the present disclosure links and stores the metadata associated with a working file in a sidecar file. This file is always activated when the working file is accessed and accordingly any changes applied to the working file are also applied to this sidecar file.
[0067] It will be understood that although at least one embodiment of the present disclosure is described with reference to a DMS 102, the methods and systems described herein may be utilized in other systems, platforms, or applications as well without departing from the scope of the present disclosure. For instance, the methods and systems described herein may be utilized in contract management systems, e-discovery systems, and so on.

Claims

1. A method for managing a working file to permit sharing of the working file, the method comprising:
retrieving the working file from a source;
extracting metadata associated with the retrieved working file;
creating at least one sidecar file for the retrieved working file based on the extracted metadata;
linking the at least one sidecar file with the retrieved working file; and
storing the at least one sidecar file in association with the retrieved working file.
2. The method of claim 1, further comprising classifying the extracted metadata into at least one of private metadata, shared metadata, or public metadata.
3. The method of claim 2, wherein creating at least one sidecar file comprises creating at least one of a private sidecar file, a shared sidecar file, or a public sidecar file.
4. The method of claim 3, further comprising accessing the at least one sidecar file associated with the retrieved file based on an authentication of a user.
5. The method of claim 4, wherein the method comprises:
accessing the working file and a corresponding private sidecar file if the authenticated user is an author of the working file;
accessing the working file and a corresponding shared sidecar file if the authenticated user is part of an authorized group; and
accessing the working file and a corresponding public sidecar file for any user.
6. The method of claim 5, further comprising classifying one or more working files based on their corresponding accessed sidecar file.
7. The method of claim 6, further comprising generating a hierarchical structure of the one or more working files based on the classification.
8. The method of claim 1, further comprising managing and resolving conflicts between two or more versions of the working file.
9. The method of claim 1, further comprising encrypting the working file and the at least one sidecar file.
10. The method of claim 1, further comprising generating one or more fields of the metadata associated with the working file.
11. A method of sharing a working file, the working file being stored in association with at least one corresponding sidecar file formed according to the method of claim 4, the sharing method comprising:
sharing the working file and the associated sidecar file with two or more users based on the authentication of the two or more users;
simultaneously modifying the working file by the two or more users;
notifying the two or more users of the simultaneous modifications in the shared working file; and
resolving the modifications to the shared working file such that resolved modifications are captured in the associated sidecar file.
12. A system for managing working files, the system comprising:
a retrieval unit configured to retrieve the working files from one or more sources;
an extractor configured to extract metadata associated with the retrieved working files;
an authentication unit configured to authenticate a user; and
a processing unit configured to:
classify the extracted metadata of each working file from the retrieved working files into at least one of private metadata, public metadata, or shared metadata;
create at least one sidecar file for each of the retrieved working files based on the classified metadata, the at least one sidecar file comprising at least one of a private sidecar file, a public sidecar file, and a shared sidecar file;
link the at least one sidecar file with each of the corresponding retrieved working files;
store the at least one sidecar file in association with each of the corresponding working files;
access one or more of the working files and corresponding private sidecar files where the authenticated user is an author of the working files;
access one or more of the working files and corresponding shared sidecar files where the authenticated user is part of an authorized group;
access one or more of the working files and corresponding public sidecar files for any user; and
classify the accessed working files based on the corresponding accessed sidecar files.
13. The system of claim 12, wherein the processing unit is configured to generate a hierarchical structure of the one or more working files based on the classification.
14. The system of claim 12, wherein the processing unit is configured to manage and resolve conflicts between two or more versions of the working file.
15. The system of claim 12, wherein the processing unit is configured to encrypt the working file and/or the at least one sidecar file.
16. The system of claim 12, further comprising a user interface coupled to the extractor, wherein the extractor in combination with the user interface is configured to request a user to input one or more fields of the metadata associated with the working file.
17. The system of claim 12, wherein the processing unit is further configured to automatically generate metadata associated with the working file based on the content of the working file.
18. The system of claim 12, wherein the processing unit is further configured to:
share the working file and the associated sidecar file with two or more users based on the authentication of the two or more users;
notify the two or more users of any simultaneous modifications to the shared working file; and
resolve the simultaneous modifications to the shared working file such that resolved modifications are captured in the associated sidecar file.
19. A computer readable medium having a computer program stored on the medium for managing a working file to permit sharing of the working file, the program comprising:
code for retrieving the working file from a source;
code for extracting metadata associated with the retrieved working file;
code for creating at least one sidecar file for the retrieved working file based on the extracted metadata;
code for linking the at least one sidecar file with the retrieved working file;
code for storing the at least one sidecar file in association with the retrieved working file; and
code for accessing the at least one sidecar file associated with the retrieved file based on an authentication of a user.
20. A computer readable medium having a computer program stored on the medium for sharing a working file, the working file being stored in association with at least one corresponding sidecar file formed according to claim 19, the program comprising:
code for sharing the working file and the associated sidecar file with two or more users based on the authentication of the two or more users;
code for notifying the two or more users of simultaneous modifications in the shared working file by the two or more users; and
code for resolving the modifications to the shared working file such that the resolved modifications are captured in the associated sidecar file.
PCT/AU2014/000788 2013-08-09 2014-08-08 Method and system for managing and sharing working files in a document management system: Ceased WO2015017886A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AU2013902993 2013-08-09
AU2013902993A AU2013902993A0 (en) 2013-08-09 Method and system for managing and sharing working files in a document mangement system

Publications (1)

Publication Number Publication Date
WO2015017886A1 true WO2015017886A1 (en) 2015-02-12

Family

ID=52460423

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2014/000788 Ceased WO2015017886A1 (en) 2013-08-09 2014-08-08 Method and system for managing and sharing working files in a document management system:

Country Status (1)

Country Link
WO (1) WO2015017886A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109766320A (en) * 2018-12-04 2019-05-17 深圳供电局有限公司 Method and system for displaying shared label in network file
US11743218B2 (en) * 2021-12-21 2023-08-29 LeapXpert Limited Message capture in a multi channel communication environment
US12050705B2 (en) 2021-12-29 2024-07-30 Microsoft Technology Licensing, Llc Enhanced security features for controlling access to shared content and private content of a shared document

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050138034A1 (en) * 2003-12-17 2005-06-23 International Business Machines Corporation System and method for sharing resource properties in a multi-user environment
US20060004699A1 (en) * 2004-06-30 2006-01-05 Nokia Corporation Method and system for managing metadata
US20070073776A1 (en) * 2005-09-19 2007-03-29 Kalalian Steven P Digital file management
US20070124319A1 (en) * 2005-11-28 2007-05-31 Microsoft Corporation Metadata generation for rich media
US20100278453A1 (en) * 2006-09-15 2010-11-04 King Martin T Capture and display of annotations in paper and electronic documents
US20120060082A1 (en) * 2010-09-02 2012-03-08 Lexisnexis, A Division Of Reed Elsevier Inc. Methods and systems for annotating electronic documents

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109766320A (en) * 2018-12-04 2019-05-17 深圳供电局有限公司 Method and system for displaying shared label in network file
US11743218B2 (en) * 2021-12-21 2023-08-29 LeapXpert Limited Message capture in a multi channel communication environment
US20230353522A1 (en) * 2021-12-21 2023-11-02 LeapXpert Limited Message capture in a multi channel communication environment
US12363058B2 (en) * 2021-12-21 2025-07-15 LeapXpert Limited Message capture in a multi channel communication environment
US12050705B2 (en) 2021-12-29 2024-07-30 Microsoft Technology Licensing, Llc Enhanced security features for controlling access to shared content and private content of a shared document

Similar Documents

Publication Publication Date Title
US20240143551A1 (en) Suggesting content items to be accessed by a user
US10567484B2 (en) Identifying content items for inclusion in a shared collection
US10075527B2 (en) Information management of data associated with multiple cloud services
US9122750B2 (en) Classifying objects
US9710502B2 (en) Document management
US9942121B2 (en) Systems and methods for ephemeral eventing
DE69902749T2 (en) ENCAPSULATION, DATA DISPLAY AND TRANSMISSION OF CONTENT-ADDRESSABLE DATA
US20170200122A1 (en) Information organization, management, and processing system and methods
US9465856B2 (en) Cloud-based document suggestion service
US20140279893A1 (en) Document and user metadata storage
US20140282901A1 (en) Managing shared content with a content management system
US20140281875A1 (en) Multi-user layer annotation
US20100306180A1 (en) File revision management
US20080270462A1 (en) System and Method of Uniformly Classifying Information Objects with Metadata Across Heterogeneous Data Stores
US9298797B2 (en) Preserving content item collection data across interfaces
US20140358868A1 (en) Life cycle management of metadata
WO2020111197A1 (en) Document arrangement support system
Khan et al. Document management system: An explicit knowledge management system
WO2015017886A1 (en) Method and system for managing and sharing working files in a document management system:
Seymour The modern records management program: an overview of electronic records management standards
US20130212118A1 (en) System for managing litigation history and methods thereof
Ochoa et al. A transformational model for Organizational Memory Systems management with privacy concerns
Emery Document and records management: Understanding the differences and embracing integration
CN120849357A (en) An enterprise intelligent data platform and its file search method and device
KR20080009107A (en) Personalizable Information Network

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 14834513; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase
Ref country code: DE

122 EP: PCT application non-entry in European phase
Ref document number: 14834513; Country of ref document: EP; Kind code of ref document: A1