RU2527201C1 - Data conversion method, data conversion device and data conversion system

Info

Publication number: RU2527201C1
Application number: RU2013103176/08A
Authority: RU
Inventors: Алексей Эрнстович Ананьин; Владислав Николаевич Бабин
Original assignee: Общество с ограниченной ответственностью "Системное моделирование и анализ"
Priority date: 2013-01-24
Filing date: 2013-01-24
Publication date: 2014-08-27
Also published as: RU2013103176A

Abstract

FIELD: information technology.

SUBSTANCE: the method includes the steps of: 1) identifying data having a first tabular structure; 2) identifying a data traversal region; 3) performing a pass for each data cell in said data traversal region; 4) creating, in an internal database, a buffer for the identified context of each data cell; 5) identifying the context for a first data cell; 6) identifying the context for the next data cell; 7) iteratively performing steps 5 and 6 until a pass has been performed for each data cell in said data traversal region; 8) serialising the resulting database into an internal XML data format; 9) applying a taxonomy template to the resulting XML file, wherein the taxonomy template is selected depending on a second tabular data structure; 10) converting the resulting XML file into data having the second tabular structure in accordance with the applied taxonomy template.

EFFECT: high accuracy, enabling use of the present method for various data structures.

34 cl, 4 dwg

Description

The invention relates to the conversion (transformation) of first data having a first structure into second data having a second structure. In particular, the claimed group of inventions relates to a data conversion method, a data conversion device and a data conversion system, wherein the first data have a first tabular structure and the second data have a second tabular structure.

BACKGROUND

Various methods of performing conversion for various data structures are known in the art.

WO 2005/076155 A2, G06F 17/30, 18.08.2005, is known; it describes a data processing system with a graphical user interface that converts data having a first structure into data having a second structure.

The disadvantage of this solution is that the data conversion it describes is not applicable specifically to converting between different tabular data structures, and it does not provide sufficient conversion accuracy in the event of unauthorized interference with the source data.

Also known are various converters of MS Excel documents into formats supported by various databases, depending on the file extension they recognize (for example, the MS Excel data converter for ABDD Titul-2005, http://www.titul2005.ru/imaqes/titulimg/manuals/Konverter iz Excel v BD Titul-2005.pdf). However, these applications do not allow data conversion to be applied widely enough, since they are generally tied to one or a few specific formats, and they likewise do not provide sufficient conversion accuracy in the event of unauthorized interference with the source data.

Also known are various ways of applying a mapping procedure, which maps the items contained in the structure of one catalog (the source catalog) onto the structure of another catalog (the target catalog). Such solutions are disclosed, for example, in US 2006/0184539 A1, G06F 17/30, 17.08.2006, which describes a method of creating an XBRL Instance Document in which one or more XBRL taxonomy attributes are associated with one or more attributes of a business document (in particular, an MS Excel document). The XBRL Instance Document is created by converting data from the source structure into the target structure, transferring data from the relevant cells of the source structure into the corresponding assigned cells of the target structure.

The disadvantage of this solution is that the document resulting from the conversion may be rendered inaccurately, because at the stage of preparing business documents for subsequent extraction and conversion, improper and bad-faith changes may be made to the cells of the tabular structure of the business documents, and these changes violate the taxonomy structure when the final document is generated.
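
The fixed cell-to-cell mapping used by such prior-art solutions, and its sensitivity to changes in the source layout, can be illustrated with a minimal sketch. The cell addresses, the target names `Revenue` and `Expenses`, and the helper `map_cells` are hypothetical and are not taken from the cited documents.

```python
def map_cells(source, mapping):
    """Copy each mapped source cell into its assigned target cell."""
    return {target: source[src] for src, target in mapping.items() if src in source}


# Hypothetical mapping from source cell addresses to target items.
mapping = {"B2": "Revenue", "B3": "Expenses"}

original = {"B2": 100, "B3": 60}
# The same sheet after a row was inserted above the data.
shifted = {"B2": "inserted row", "B3": 100, "B4": 60}

print(map_cells(original, mapping))
print(map_cells(shifted, mapping))
```

Because the mapping refers to fixed addresses, the second call silently assigns the wrong values to both target items, which is the kind of inaccuracy the paragraph above attributes to this approach.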

Also known is a solution developed by Oracle, in particular for managing the creation of an XBRL taxonomy (http://docs.oracle.com/cd/E17236 01/epm.1112/disclosure mgmt admin.pdf). Like the previous known solution, it describes the use of mapping to convert a data format, in particular MS Excel, into the XBRL data format.

However, like the previous known solution, this solution does not sufficiently protect the target data structure against unauthorized interference with the data structure of the source document, which leads to an insufficiently accurate and often unusable conversion result.

SUMMARY OF THE INVENTION

Thus, an object of the present invention is to provide a method, a device and a system that make it possible to convert, with high accuracy, data having a first tabular data structure into data having a second tabular data structure. Another object of the present invention is to provide a method, a device and a system that protect the data conversion procedure against interference with the source data.

The technical result at which the claimed solution is directed is an increase in conversion accuracy, as well as the possibility of applying the method to many different data structures.

Embodiments of the present invention relate to methods, devices, systems and a computer-readable storage medium for converting (transforming) data having a first tabular structure into data having a second tabular structure by performing successive steps in which:

1) data having a first tabular structure are identified;

2) a traversal region of the data having the first tabular structure is identified;

3) a pass is made for each data cell in said data traversal region, wherein the following steps are performed:

4) a buffer for the identified context is created in an internal database for each data cell;

5) the context is identified for the first data cell for which a pass has been made, wherein, before the pass for the next data cell begins, the identified context is written to said buffer, and, when the pass for said next data cell begins, the context identified for said first data cell is transferred into said buffer for the next data cell;

6) the context is identified for said next data cell, wherein, if the context identified for this cell differs from said identified context written to said buffer for this data cell, the context identified for this data cell replaces the identified context in the buffer of this data cell, and the actions specified in step 5) are performed on it as on the identified context;

7) steps 5) and 6) are performed iteratively until a pass has been made for each said data cell in said data region;

8) the resulting database is serialized into an internal XML data format;

9) a taxonomy template is applied to the XML file obtained after serialization, wherein the taxonomy template is selected depending on the second tabular data structure;

10) the resulting XML file is converted into data having the second tabular structure, in accordance with the taxonomy template applied to it.
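
Steps 3) to 7) above can be sketched as follows. This is only an illustration under stated assumptions: the text does not define what a context consists of, so the sketch models it as a key-value dictionary, and the rule in `identify_context` (a text cell of the form `key: value` declares context) is hypothetical, as are the names `CellRecord` and `traverse`.

```python
from dataclasses import dataclass, field


@dataclass
class CellRecord:
    """One entry of the internal database: a cell and its context buffer."""
    position: tuple
    value: object
    context: dict = field(default_factory=dict)


def identify_context(value):
    """Hypothetical rule: a text cell of the form 'key: value' declares context."""
    if isinstance(value, str) and ":" in value:
        key, _, val = value.partition(":")
        return {key.strip(): val.strip()}
    return {}


def traverse(cells):
    """Visit cells in order, carrying the identified context forward."""
    database = []
    carried = {}
    for position, value in cells:
        # Step 5: the context of the previous cell is moved into this cell's buffer.
        record = CellRecord(position, value, dict(carried))
        # Step 6: identify this cell's own context and overwrite the buffer
        # only where it differs from what was carried over.
        for key, val in identify_context(value).items():
            if record.context.get(key) != val:
                record.context[key] = val
        carried = record.context
        database.append(record)
    return database


cells = [((0, 0), "period: 2012"), ((0, 1), 100),
         ((1, 0), "period: 2013"), ((1, 1), 200)]
for record in traverse(cells):
    print(record.position, record.value, record.context)
```

In this sketch each value cell ends up tagged with the most recent context seen before it, so the meaning of a cell is derived during the traversal rather than from a fixed cell address.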

This summary is provided to introduce, in simplified form, a number of concepts that are further described in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.

BRIEF DESCRIPTION OF THE DRAWINGS

Illustrative embodiments of the present invention are described in detail below with reference to the accompanying drawings, which are incorporated herein by reference and in which:

Figs. 1 and 2 graphically depict an exemplary method of converting data having a first structure into data having a second structure, in accordance with one embodiment of the invention;

Fig. 3 graphically depicts an exemplary computing device suitable for implementing embodiments of the present invention;

Fig. 4 graphically depicts an exemplary system suitable for implementing embodiments of the present invention.

DISCLOSURE OF ASPECTS OF THE INVENTION

The subject matter of embodiments of the present invention is described herein with specificity to meet statutory requirements. However, the description itself is not intended to limit the scope of this patent. Rather, the claimed subject matter may also be implemented in other ways, to include different steps or combinations of steps similar to those described herein, in conjunction with other existing or future technologies.

In a first aspect, the present invention provides a method of converting (transforming) data having a first tabular structure into data having a second tabular structure by performing successive steps in which:

10) the resulting XML file is converted into data having the second tabular structure, in accordance with the taxonomy template applied to it. The data having the first tabular structure may be, without limitation: MS Excel data; tabular documents, or documents containing tabular data, in MS Word, MS PowerPoint or Open Document Format for Office Applications (ODF) formats; PDF files containing tabular forms that can be recognized accurately; any web forms; scanned printed documents with tabular forms; and any other tabular forms of presenting information. The data having the second tabular structure may be, without limitation, data described by the eXtensible Business Reporting Language (XBRL), for example. In addition, more than one data traversal region may be identified at the step of identifying the data traversal region. The pass may be made not for every data cell in said data traversal region, but only for those cells that are not hidden. The number of traversals of said data region can be set, in case not all the required attributes of the context being identified can be identified in a single traversal. During the pass for said data cell, it is also possible to identify context relating to cells that adjoin and/or are located at a distance from said current data cell. During traversal of the identified data region, context may also be identified for cells in data regions located outside said identified data traversal region. The actions performed can be recorded in a log file. In addition, the direction of traversal of said data region can be set: by columns, by rows, or by both columns and rows within a single traversal.
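
Two of the options described above, the traversal direction and the skipping of hidden cells, can be sketched as a generator of cell positions. The function name `traversal_order`, the representation of a region as a pair of corner coordinates and the `hidden` set are assumptions made for illustration; the combined traversal by both columns and rows in a single pass is not modeled.

```python
def traversal_order(region, direction="rows", hidden=frozenset()):
    """Yield (row, col) positions of a rectangular region, skipping hidden cells.

    region is ((first_row, first_col), (last_row, last_col)), inclusive.
    """
    (r0, c0), (r1, c1) = region
    rows = range(r0, r1 + 1)
    cols = range(c0, c1 + 1)
    if direction == "rows":
        order = ((r, c) for r in rows for c in cols)
    elif direction == "columns":
        order = ((r, c) for c in cols for r in rows)
    else:
        raise ValueError("direction must be 'rows' or 'columns'")
    for position in order:
        if position not in hidden:
            yield position


region = ((0, 0), (1, 1))
print(list(traversal_order(region, "rows")))
print(list(traversal_order(region, "columns")))
print(list(traversal_order(region, "rows", hidden={(0, 1)})))
```

Several traversal regions, or several traversals of the same region, could be handled by calling the generator once per region or per pass.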

Во втором аспекте настоящее изобретение обеспечивает устройство для конверсии данных, имеющих первую табличную структуру, в данные, имеющие вторую табличную структуру, которое может представлять собой, но не ограничиваться: персональный компьютер, портативный компьютер, планшетный компьютер, карманный компьютер, смартфон и тому подобное. Устройство обязательно содержит один или более процессоров, машиночитаемый носитель данных (память) и модули ввода/вывода (I/O). В качестве примера, а не ограничения машиночитаемый носитель данных может включать в себя оперативную память (RAM); постоянное запоминающее устройство (ROM); электрически стираемое программируемое постоянное запоминающее устройство (EEPROM); флэш-память или другие технологии памяти; CDROM, цифровой универсальный диск (DVD) или другие оптические или голографические носители данных; магнитные кассеты, магнитную пленку, запоминающее устройство на магнитных дисках или другие магнитные запоминающие устройства, несущие волны или другой носитель данных, который может быть использован для кодирования требуемой информации и к которому может быть осуществлен доступ посредством описываемого устройства. Память включает в себя носитель данных на основе запоминающего устройства компьютера в форме энергозависимой или энергонезависимой памяти или их комбинации. Примерные аппаратные устройства включают в себя твердотельную память, накопители на жестких дисках, накопители на оптических дисках и т.д. В памяти хранится примерная среда, в которой при помощи компьютерных команд или кодов, хранящихся в памяти устройства, может быть осуществлена процедура конверсии. Устройство содержит один или более процессоров, которые предназначены для выполнения компьютерных команд или кодов, хранящихся в памяти устройства с целью обеспечения выполнения процедуры конверсии. 
Модули I/O представляют собой, но не ограничиваются типичные и известные из уровня техники средства управления устройством: манипулятор типа «мышь», клавиатура, джойстик, тачпад, трекбол, электронное перо, стилус, сенсорный дисплей и тому подобное. Также модули I/O представляют собой, но не ограничиваются типичные и известные из уровня техники средства демонстрирования информации: монитор, проектор, принтер, графопостроитель и тому подобное. Компьютерные команды или коды, хранящиеся в памяти, предназначены для выполнения способа конверсии данных, имеющих первую табличную структуру, в данные, имеющие вторую табличную структуру, и представляют собой, по меньшей мере, команды идентификации данных, имеющих первую табличную структуру; команды идентификации области обхода данных, имеющих первую табличную структуру; команды совершения прохода для каждой ячейки данных из упомянутой области обхода данных; команды формирования во внутренней базе данных для каждой ячейки данных буфера для идентифицированного контекста; команды идентификации контекста для первой ячейки данных, для которой была совершена команда прохода, причем до начала выполнения команд прохода для следующей ячейки данных выполняются команды записи идентифицированного контекста в упомянутый буфер, причем с началом выполнения команд прохода для упомянутой следующей ячейки данных выполняются команды переноса идентифицированного для упомянутой первой ячейки данных контекста в упомянутый буфер для следующей ячейки данных; команды идентификации контекста для упомянутой следующей ячейки данных, команды перезаписи для случаев, когда идентифицированный контекст для данной ячейки отличается от упомянутого идентифицированного контекста, записанного в упомянутый буфер для данной ячейки данных, и при выполнении этих команд идентифицированный контекст для данной ячейки данных заменяет идентифицированный контекст в буфере данной ячейки данных; команды итеративного выполнения команд идентификации контекста до тех пор, пока 
не будут выполнены команды прохода для каждой упомянутой ячейки данных из упомянутой области данных; команды формирования внутренней базы данных; команды сериализации полученной базы данных во внутренний формат данных XML; команды применения к полученному после процесса сериализации файлу XML шаблона таксономии, причем применяется команда выбора шаблона таксономии в зависимости от второй табличной структуры данных. При этом данные, имеющие первую табличную структуру, могут быть, но не ограничиваться данными: данные формата MS Excel, табличные и содержащие табличные данные документы форматов MS Word, MS PowerPoint, Open Document Format for Office Applications (ODF), файлы формата PDF, содержащие поддающиеся точному распознаванию табличные формы, любые веб-формы, сканируемые печатные документы, имеющие табличные формы и любые другие табличные формы предоставления информации. Данные, имеющие вторую табличную структуру, могут быть, но не ограничиваться данными, например, описываемыми языком деловой отчетности extensible Business Reporting Language (XBRL). Кроме того, могут быть команды, осуществляющие при выполнении команд идентификации области обхода данных идентификацию более чем одной области обхода данных. При этом команды прохода могут совершаться не для каждой ячейки данных из упомянутой области обхода данных, а только для тех ячеек, которые не являются скрытыми. При этом могут быть команды, которые осуществляют возможность задать количество команд обхода упомянутой области данных, на тот случай, если за выполнение одной команды обхода невозможно идентифицировать все необходимые атрибуты идентифицируемого контекста. При этом при выполнении команд прохода для упомянутой ячейки данных существуют команды идентификации контекста, относящегося также и к ячейкам, граничащим и/или расположенным на расстоянии от упомянутой текущей ячейки данных. 
При этом при выполнении команд обхода идентифицированной области данных команды идентификации контекста могут быть осуществлены также и для ячеек из областей данных, расположенных за пределами упомянутой идентифицированной области обхода данных. При этом существуют команды записи выполненных команд в файл регистрации (журнал, лог-файл). Кроме того, существуют команды, которые задают направление обхода указанной области данных: по столбцам, по строкам; как по столбцам, так и по строкам в течение выполнения одной команды обхода.In a second aspect, the present invention provides an apparatus for converting data having a first tabular structure into data having a second tabular structure, which may be, but not limited to: a personal computer, a laptop computer, a tablet computer, a palmtop computer, a smartphone, and the like. The device necessarily contains one or more processors, a computer-readable storage medium (memory) and input / output modules (I / O). By way of example, and not limitation, a computer-readable storage medium may include random access memory (RAM); read-only memory device (ROM); Electrically Erasable Programmable Read-Only Memory (EEPROM); flash memory or other memory technologies; CDROM, digital versatile disc (DVD) or other optical or holographic storage media; magnetic cassettes, magnetic tape, magnetic disk storage device or other magnetic storage devices, wave carriers or other storage medium that can be used to encode the required information and which can be accessed by the described device. The memory includes a storage medium based on a computer storage device in the form of volatile or non-volatile memory, or a combination thereof. Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, etc. An example environment is stored in the memory in which a conversion procedure can be carried out using computer instructions or codes stored in the device’s memory. 
The device contains one or more processors that execute computer instructions or code stored in the device's memory in order to carry out the conversion procedure. The I/O modules include, but are not limited to, typical device controls known from the prior art: a mouse, keyboard, joystick, touchpad, trackball, electronic pen, stylus, touch screen, and the like. The I/O modules also include, but are not limited to, typical means of displaying information known from the prior art: a monitor, projector, printer, plotter, and the like.

The computer instructions or code stored in the memory are intended to perform the method of converting data having a first tabular structure into data having a second tabular structure, and comprise at least: instructions for identifying data having a first tabular structure; instructions for identifying a data traversal area within the data having the first tabular structure; instructions for making a pass over each data cell in said data traversal area; instructions for creating, in an internal database, a buffer for the identified context for each data cell; instructions for identifying the context of the first data cell for which the pass was made, wherein, before the pass for the next data cell begins, instructions for writing the identified context to said buffer are executed, and, when the pass for the next data cell begins, instructions are executed for transferring the context identified for said first data cell into the buffer for the next data cell; instructions for identifying the context of said next data cell; overwrite instructions for the case where the context identified for a given cell differs from the context written to the buffer of that data cell, such that, when they are executed, the context identified for the cell replaces the context in the cell's buffer; instructions for iteratively executing the context identification instructions until the pass has been made for each data cell in said data area; instructions for forming the internal database; instructions for serializing the resulting database into an internal XML data format; and instructions for applying a taxonomy template to the XML file obtained by serialization, wherein an instruction for selecting the taxonomy template depending on the second tabular data structure is applied.

Data having the first tabular structure may be, but is not limited to: MS Excel data; tables and documents containing tabular data in MS Word, MS PowerPoint, and Open Document Format for Office Applications (ODF) formats; PDF files containing recognizable tabular forms; any web forms; scanned printed documents containing tabular forms; and any other tabular forms of presenting information. Data having the second tabular structure may be, but is not limited to, data described by the eXtensible Business Reporting Language (XBRL). In addition, instructions may be present that identify more than one traversal area when the traversal area identification instructions are executed. The pass instructions may be executed not for every data cell in said data traversal area but only for those cells that are not hidden. Instructions may also be present that allow the number of traversals of said data area to be set, in case not all required attributes of the identified context can be identified in a single traversal. When the pass instructions are executed for a data cell, instructions are present for identifying context that also relates to cells adjacent to and/or located at a distance from the current data cell. When the traversal instructions for the identified data area are executed, context identification instructions may also be executed for cells in data areas outside said identified traversal area. Instructions are also present for recording the executed instructions in a log file (journal). In addition, instructions are present that specify the direction of traversal of said data area: by columns, by rows, or by both columns and rows within a single traversal.
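The traversal options listed above (direction by rows or by columns, and skipping hidden cells) can be sketched as follows. This is a minimal illustration in Python rather than the language of the exemplary embodiment, and it is not taken from the patent: the `Cell` record, its `hidden` flag, and the `traverse` function are hypothetical names, and the combined row-and-column traversal is not modelled.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class Cell:
    """Hypothetical cell record: position, raw value, and a hidden flag."""
    row: int
    col: int
    value: str
    hidden: bool = False


def traverse(cells: List[Cell], direction: str = "rows",
             skip_hidden: bool = True) -> Iterator[Cell]:
    """Yield the cells of one traversal area in the requested order.

    direction="rows" visits row by row, left to right;
    direction="columns" visits column by column, top to bottom.
    """
    if direction == "rows":
        key = lambda c: (c.row, c.col)
    elif direction == "columns":
        key = lambda c: (c.col, c.row)
    else:
        raise ValueError(f"unknown direction: {direction!r}")
    for cell in sorted(cells, key=key):
        if skip_hidden and cell.hidden:
            continue  # hidden cells are not visited
        yield cell


# A 2x2 area whose bottom-right cell is hidden.
area = [
    Cell(0, 0, "a"), Cell(0, 1, "b"),
    Cell(1, 0, "c"), Cell(1, 1, "d", hidden=True),
]
by_rows = [c.value for c in traverse(area, "rows")]
by_columns = [c.value for c in traverse(area, "columns")]
```

Running several traversals over the same area, as the text allows when one pass cannot identify every context attribute, would amount to calling `traverse` more than once with different directions.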

Although in an exemplary embodiment of the invention the listed computer instructions are written in Java, this embodiment should not be taken to limit the listed computer instructions to that programming language. In fact, the listed instructions may be written in any known or newly created programming language.

In a third aspect, the present invention provides a device for converting data having a first tabular structure into data having a second tabular structure, the device comprising one or more processors, memory, and I/O modules, and characterized in that it comprises an identification unit, which identifies data having a first tabular structure and identifies a data traversal area, wherein more than one data traversal area may be identified, and a data collection and analysis unit. The data collection and analysis unit makes a pass over each data cell in said data area, or only over those data cells in said data area that are not hidden; creates, in an internal database, a buffer for the identified context for each data cell; and identifies the context of the first data cell for which a pass was made. Before the pass for the next data cell begins, the data collection and analysis unit writes the identified context to said buffer, and when the pass for said next data cell begins, it transfers the context identified for said first data cell into the buffer for the next data cell. The unit then identifies the context of said next data cell; where the context identified for that cell differs from the context written to the buffer for that cell, the unit replaces the context in the cell's buffer with the context identified for the cell and performs the listed actions for that cell. The unit iteratively performs the listed actions with the identified context until a pass has been made over each data cell in said data traversal area. The data collection and analysis unit may additionally identify context that also relates to cells adjacent to and/or located at a distance from the current data cell; identify context for cells in data areas outside said identified data traversal area; record the actions performed in a log file (journal); and specify the direction of traversal of said data traversal area: by columns, by rows, or by both columns and rows. The device also comprises a conversion unit, which serializes the database obtained by the data collection and analysis unit into an internal XML data format, selects a taxonomy template depending on the second tabular data structure, and applies the taxonomy template to the XML file obtained by serialization.
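The work of the conversion unit (serializing the collected database to an internal XML format, selecting a template by target structure, and applying it to the XML) might look roughly like the sketch below. Everything here is an assumption made for illustration: the record fields, the `<cells>`/`<cell>` layout, the `TEMPLATES` dictionary standing in for a taxonomy template, and the element name `HeatEnergySupplied` are hypothetical, and the output is not valid XBRL.

```python
import xml.etree.ElementTree as ET

# Hypothetical internal database: one record per visited cell,
# holding the cell value together with its identified context.
records = [
    {"value": "1200", "indicator": "heat_supplied", "region": "North", "unit": "Gcal"},
    {"value": "950", "indicator": "heat_supplied", "region": "South", "unit": "Gcal"},
]


def serialize(records):
    """Serialize records into an illustrative internal XML format."""
    root = ET.Element("cells")
    for rec in records:
        cell = ET.SubElement(root, "cell")
        for name, text in rec.items():
            if name != "value":
                cell.set(name, text)  # context attributes become XML attributes
        cell.text = rec["value"]
    return root


# Toy stand-in for taxonomy templates, keyed by target structure:
# each one names the target element for every indicator it knows.
TEMPLATES = {
    "energy_report": {"heat_supplied": "HeatEnergySupplied"},
}


def apply_template(internal_xml, target_structure):
    """Build the target XML by matching cell attributes against the template."""
    template = TEMPLATES[target_structure]  # selection depends on the target
    out = ET.Element("report")
    for cell in internal_xml.iter("cell"):
        fact = ET.SubElement(out, template[cell.get("indicator")])
        fact.set("region", cell.get("region"))
        fact.set("unit", cell.get("unit"))
        fact.text = cell.text
    return out


internal = serialize(records)
result = apply_template(internal, "energy_report")
facts = [(f.tag, f.get("region"), f.text) for f in result]
```

The point of the sketch is the ordering: the template is applied to the already serialized XML, using the attributes stored with each value, rather than to the source spreadsheet.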

In a fourth aspect, the present invention provides a system for converting data having a first tabular structure into data having a second tabular structure. An exemplary data conversion system includes a network. The network may include, but is not limited to, one or more local area networks (LANs) and/or wide area networks (WANs). Such networking environments are commonly used in offices, enterprise computer networks, intranets, and the Internet. Accordingly, the network is not described further. The exemplary data conversion system further includes a database and a plurality of data conversion devices, each being one of the devices described in the second and/or third aspects of the present invention, or a combination thereof; these are accordingly not described further. The exemplary data conversion system further comprises a server computing device that stores, and facilitates the manipulation of, the computer instructions or code described in the second aspect of the present invention, which are accordingly not described further.

In a fifth aspect, the present invention provides a computer-readable storage medium containing program code that causes a processor and/or processors to perform the steps of the method described in the first aspect of the present invention, which is accordingly not described further. By way of example, and not limitation, the computer-readable storage medium may include random access memory (RAM); read-only memory (ROM); electrically erasable programmable read-only memory (EEPROM); flash memory or other memory technologies; CD-ROM, digital versatile disc (DVD), or other optical or holographic storage media; magnetic cassettes, magnetic tape, magnetic disk storage, or other magnetic storage devices; carrier waves; or any other storage medium that can be used to encode the required information and that can be accessed by the device described in the second, third, and fourth aspects of the present invention, which is accordingly not described further.

DETAILED DESCRIPTION OF THE DRAWINGS

The possible implementation of aspects of the present invention described in this section is illustrated by the conversion of Microsoft Excel data into XBRL Instance Document data. Although the following detailed description of the drawings is given with respect to these two formats and to the method, device, and system for converting them, it should be noted that no embodiment of the aspects of the present invention is in fact limited to converting Microsoft Excel data into XBRL Instance Document data. Instead of Microsoft Excel data, any data having a tabular structure may be used, including but not limited to: MS Excel data; tables and documents containing tabular data in MS Word, MS PowerPoint, and Open Document Format for Office Applications (ODF) formats; PDF files containing tabular forms that can be accurately recognized; any web forms; scanned printed documents containing tabular forms; and any other tabular forms of presenting information. Instead of target data in XBRL Instance Document format, any data having a tabular structure may be used, including but not limited to data described by the eXtensible Business Reporting Language (XBRL). The data cells of a Microsoft Excel document prepared for reporting, for example on an energy company, typically contain, but are not limited to, data representing indicators and analytics. Indicators are, for example, the volume of heat energy supplied in the reporting month (year, period), the cost of heat energy supplied in the reporting month (year, period), the degree of wear, and the payroll fund for core personnel. Analytics, that is, the dimensions by which indicators are broken down, are, for example, regions, consumer categories, fuel types, and voltage levels. During conversion, these data must be placed in the corresponding data cells of the XBRL format.

FIG. 1 shows an exemplary embodiment of the method for converting data having a first tabular structure into data having a second tabular structure. By way of example, and not limitation, a method for converting Microsoft Excel data into XBRL Instance Document data is described. At data identification step 110, data having a first tabular structure is identified, namely Microsoft Excel data. At data traversal area identification step 120, a data area is identified (selected) from the Microsoft Excel data. In this exemplary embodiment of the method, the data traversal area may be specified as individual cells, rows, or columns. In addition, multiple data traversal areas may be selected, along with the direction in which each area is traversed: by rows, by columns, or by rows and columns. Data collection and analysis step 130 is shown in more detail in FIG. 2.
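As a rough illustration of step 120, the sketch below expands a traversal area given as individual cells, whole rows, and whole columns into a list of cell coordinates. The function name `resolve_area`, its parameters, and the zero-based `(row, column)` convention are assumptions introduced here, not details from the patent.

```python
from typing import Iterable, List, Set, Tuple

Coord = Tuple[int, int]  # (row, column), zero-based


def resolve_area(n_rows: int, n_cols: int,
                 cells: Iterable[Coord] = (),
                 rows: Iterable[int] = (),
                 cols: Iterable[int] = ()) -> List[Coord]:
    """Expand one traversal area given as cells, whole rows, and whole columns."""
    selected: Set[Coord] = set(cells)
    for r in rows:
        selected.update((r, c) for c in range(n_cols))
    for c in cols:
        selected.update((r, c) for r in range(n_rows))
    # Keep only coordinates that exist on the sheet, in row-major order.
    return sorted(rc for rc in selected
                  if 0 <= rc[0] < n_rows and 0 <= rc[1] < n_cols)


# Two independent traversal areas on a 3x3 sheet.
header_area = resolve_area(3, 3, rows=[0])
body_area = resolve_area(3, 3, cells=[(1, 1)], cols=[2])
```

Keeping each area as its own list makes it straightforward to select several areas and to give each one its own traversal direction.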

FIG. 2 shows an exemplary embodiment of data collection and analysis step 130 of the data conversion method. At buffer creation step 131, a buffer is created for each data cell in the data traversal area identified at step 120. The buffer is created in the internal database of a computer application executed by the data conversion device or system described in the second, third, and fourth aspects of the present invention and described in more detail below. During step 132, in which the context of the current cell is identified, and before step 133, in which the context of the next cell is identified, begins, the context identified at step 132 is written to this buffer. When step 133 begins, the context identified at step 132 is transferred to the buffer created for the next data cell. Then, at context identification step 133, the context of the next data cell is identified. If the context identified at step 133 differs from the context written to the current buffer before step 133 began, the newly identified context replaces the previously written one, and the same actions are then performed with it as were performed with the context identified at step 132; that is, each next cell becomes the current cell and the procedure repeats. At iteration step 134, steps 132 and 133 are repeated until a pass has been made over each data cell.

Additionally, during data collection and analysis step 130, a pass need not be made over every data cell in the data area specified at step 120 (see FIG. 1). For example, cells that are "hidden" may be excluded from the cells whose context is to be identified. Additionally, at context identification steps 132 and 133, context may be identified that also relates to cells adjacent to and/or located at a distance from the current data cell, as well as to cells in data areas outside said identified data traversal area. Writing a changed context at step 133 in place of the previous context stored in the buffer of the current cell simplifies the XML description of the converter: if a unit of measure, an analytic, or an indicator applies to an entire row of a report, it needs to be specified only once for that row. This in turn permits more accurate and efficient conversion of the data into the required format, because less data has to be converted.

After data collection and analysis step 130 is complete (see FIG. 1), the method proceeds to serialization step 140, at which the resulting database is serialized into an XML data format, producing an XML file. At taxonomy step 150, a taxonomy template corresponding to the target tabular data structure is selected. In the described exemplary embodiment of one aspect of the present invention, this taxonomy template is a typical XBRL taxonomy template suitable for application to the resulting XML file. Unlike a mapping template, the taxonomy template is applied to the generated XML file, which ensures high accuracy of data conversion. At conversion step 160, the XML file to which the taxonomy template was applied at taxonomy step 150 is converted: the data in the XML file that have the corresponding attributes are written to the corresponding cells of the XBRL template, thereby ensuring sufficient conversion accuracy even if the source data has been modified without authorization.
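Steps 131 to 134 can be read as a carry-forward loop over per-cell buffers, sketched below in Python. This is one interpretation under stated assumptions, not the patent's implementation: the `declares` field, the `identify_context` rule (carried context plus whatever the cell itself declares), and the sample row are all hypothetical.

```python
from typing import Dict, List

Context = Dict[str, str]

# Hypothetical report row: each cell has a value and may declare context
# attributes (indicator, unit, analytic) that apply from that cell onward.
row = [
    {"value": "Heat supplied", "declares": {"indicator": "heat_supplied", "unit": "Gcal"}},
    {"value": "1200", "declares": {"region": "North"}},
    {"value": "950", "declares": {"region": "South"}},
    {"value": "930", "declares": {}},
]


def identify_context(cell: dict, carried: Context) -> Context:
    """Assumed identification rule: carried context plus what the cell declares."""
    return {**carried, **cell["declares"]}


def collect(cells: List[dict]) -> List[Context]:
    # Step 131: one buffer per cell in the traversal area.
    buffers: List[Context] = [{} for _ in cells]
    carried: Context = {}
    for i, cell in enumerate(cells):
        # Start of the pass: move the previous cell's context into this buffer.
        buffers[i] = dict(carried)
        # Steps 132/133: identify the context of the current cell.
        identified = identify_context(cell, carried)
        # Overwrite the buffer only when the identified context differs.
        if identified != buffers[i]:
            buffers[i] = identified
        # Step 134: the current cell's context is carried to the next cell.
        carried = buffers[i]
    return buffers


contexts = collect(row)
```

In this sketch the unit `Gcal` is declared once in the first cell and reaches every later buffer, which mirrors the remark that a unit, analytic, or indicator valid for a whole row needs to be specified only once.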

FIG. 3 shows an exemplary embodiment of the second or third aspect of the present invention, namely a device 200 for converting data having a first tabular structure into data having a second tabular structure, which may be, but is not limited to, a personal computer, laptop computer, tablet computer, handheld computer, smartphone, and the like. The device is configured for network access and necessarily comprises one or more processors 210, a computer-readable storage medium (memory) 220, input/output (I/O) modules 230, and input/output (I/O) ports 240. By way of example, and not limitation, the computer-readable storage medium 220 may include random access memory (RAM); read-only memory (ROM); electrically erasable programmable read-only memory (EEPROM); flash memory or other memory technologies; CD-ROM, digital versatile disc (DVD), or other optical or holographic storage media; magnetic cassettes, magnetic tape, magnetic disk storage, or other magnetic storage devices; carrier waves; or any other storage medium that can be used to encode the required information and that can be accessed by the described device. Memory 220 includes computer storage media in the form of volatile or non-volatile memory, or a combination thereof. Exemplary hardware memory devices include solid-state memory, hard disk drives, optical disc drives, and so on. The memory stores an exemplary environment 250 in which the conversion procedure can be carried out by means of the computer instructions or code stored in memory 220 of the device.

The device comprises one or more processors 210, which execute the computer instructions or code stored in the device's memory in order to carry out the conversion procedure. I/O modules 230 include, but are not limited to, typical device controls known from the prior art: a mouse, keyboard, joystick, touchpad, trackball, electronic pen, stylus, touch screen, and the like. I/O modules 230 also include, but are not limited to, typical means of presenting, displaying, and reproducing information known from the prior art: a monitor, projector, printer, plotter, and the like. I/O ports 240 allow computing device 200 to be logically connected to other devices, including I/O modules 230, which may be built in or external. The computer instructions are those described in the second aspect of the present invention and are accordingly not described further.

In one aspect of the present invention, device 200 is, by way of example and not limitation, a user's personal computer. This personal computer has an operating environment 250, which, by way of example and not limitation, is an operating system of the Windows family. In addition, the device comprises memory, implemented in one of the listed variants, that stores a computer application consisting of the set of computer instructions or code listed in the second aspect of the present invention, which the user can manipulate. The conversion procedure is carried out by means of this application by selecting source data having the first structure; by way of example and not limitation, these data may be Microsoft Excel files. In this exemplary embodiment of the invention, the user specifies the data traversal area to be converted, the number of traversals, the traversal directions, and the parameters characterizing the data collection and analysis process. It should be noted, however, that the user can also create a conversion template (conversion scenario), or one can be generated automatically, from specified parameters, by the computer application or by a conversion scenario generation unit.

FIG. 4 shows an exemplary embodiment of a system 300 for converting data having a first tabular structure into data having a second tabular structure, the system comprising a device 200 for converting data having a first tabular structure into data having a second tabular structure, a network 310, one or more servers 320, and a database 330. Network 310 may include, but is not limited to, one or more local area networks (LANs) and/or wide area networks (WANs); it may be the Internet or an intranet; and it may also be a virtual private network (VPN) and the like. System 300 includes said data conversion device 200. As indicated above, data conversion device 200 is used to convert data having a first tabular structure into data having a second tabular structure. In the exemplary embodiment, data conversion device 200 is the data conversion device 200 described earlier with reference to FIG. 3. Additionally, system 300 includes a server 320, which, like data conversion device 200, may be a personal computer, laptop computer, tablet computer, handheld computer, smartphone, and the like.

The server 320, like the device 200, may be, but is not limited to, a supercomputer, a personal computer, a laptop computer, a tablet computer, a PDA, a smartphone, and the like. The server 320 regulates data exchange in the system 300. It also performs the data processing when more than one data conversion device 200 is connected to it, or when the data conversion device 200 is a thin client and all the computing power needed for the data conversion procedure resides on the server 320. In that case, the server 320 performs the conversion procedure described with reference to FIGS. 1-3. The server 320 can also provide a virtual computing environment (virtual machine) for interaction between the conversion device 200 and the database 330 (DB). The database 330 may be, but is not limited to, a hierarchical, network, relational, object, object-oriented, object-relational, or spatial database, and the like.

The database 330 stores data in memory, which may be, but is not limited to: read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory, CD-ROM, digital versatile disk (DVD) or other optical or holographic storage media; magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices; carrier waves; or any other storage medium that can store the required information and can be accessed by the data conversion device 200 and the server 320. The database 330 stores: the source data of the first tabular structure; taxonomy templates; data collected during the conversion procedure at the data collection or analysis step 130, described in detail with reference to FIG. 1, or by the data collection and analysis unit described in the third aspect of the present invention; log file data; data obtained after the conversion procedure that correspond to the second tabular data structure; and the like.

Claims

1. A method of converting data having a first tabular structure into data having a second tabular structure, comprising the steps of:
1) identifying data having a first tabular structure;
2) identifying a crawl area of the data having the first tabular structure;
3) making a pass for each data cell in said data crawl area, during which the following steps are performed:
4) forming, in an internal database, a buffer for the identified context for each data cell;
5) identifying the context for the first data cell for which the pass is made, writing the identified context into the buffer before the pass for the next data cell begins, and, when the pass for the next data cell begins, transferring the context identified for the first data cell to the buffer of the next data cell;
6) identifying the context for said next data cell, and, where the context identified for this cell differs from said identified context written in said buffer of this data cell, replacing the context in the buffer of this data cell with the context identified for this data cell, after which the same actions are performed on it as on the identified context in step 5);
7) iteratively performing steps 5) and 6) until a pass has been made for each said data cell in said data crawl area;
8) serializing the resulting database into an internal Extensible Markup Language (XML) data format;
9) applying a taxonomy template to the XML file obtained by the serialization, the taxonomy template being selected depending on the second tabular data structure;
10) converting the resulting XML file into data having a second tabular structure in accordance with the taxonomy template applied to it.
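The context-buffer steps 3) to 10) of claim 1 can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the rule in `identify_context` (a text cell is a label that defines a context), the list of dictionaries standing in for the internal database, the XML layout, and the `ex:` names in the template. The claim itself does not prescribe any of them.

```python
import xml.etree.ElementTree as ET


def identify_context(cell):
    # Hypothetical rule: a text cell is a label that defines a context;
    # any other cell identifies no context of its own.
    return cell if isinstance(cell, str) else None


def crawl(cells):
    """Steps 3-7: pass over each cell, carrying the context buffer forward."""
    database = []   # stand-in for the internal database
    carried = None  # context transferred from the previous cell
    for index, cell in enumerate(cells):
        buffer = carried                 # step 5: inherit the previous context
        found = identify_context(cell)   # step 6: identify this cell's context
        if found is not None and found != buffer:
            buffer = found               # step 6: replace when it differs
        database.append({"index": index, "value": cell, "context": buffer})
        carried = buffer
    return database


def serialize(database):
    """Step 8: serialize the records into a simple XML document."""
    root = ET.Element("cells")
    for record in database:
        node = ET.SubElement(
            root, "cell",
            index=str(record["index"]),
            context=record["context"] or "",
        )
        node.text = str(record["value"])
    return ET.tostring(root, encoding="unicode")


def apply_template(xml_text, template):
    """Steps 9-10: map each non-label cell to the name its context selects."""
    rows = []
    for node in ET.fromstring(xml_text):
        context = node.get("context")
        if node.text != context and context in template:
            rows.append((template[context], node.text))
    return rows


records = crawl(["Revenue", 100, 120, "Costs", 70])
xml_text = serialize(records)
template = {"Revenue": "ex:Revenue", "Costs": "ex:Costs"}  # hypothetical names
rows = apply_template(xml_text, template)
print(rows)
```

In this sketch the cells 100 and 120 inherit the context "Revenue" from the label cell before them, and the cell "Costs" replaces the buffered context, so 70 is recorded under "Costs".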

2. The method according to claim 1, characterized in that the data having a first tabular structure are, for example, MS Excel documents, tables and tabular data of MS Word, MS PowerPoint, Open Document Format for Office Applications (ODF) and PDF documents, and web forms.

3. The method according to claim 1, characterized in that the data having a second tabular structure are data in Extensible Business Reporting Language (XBRL) format.

4. The method according to claim 1, characterized in that, at the step of identifying a data crawl area, more than one data crawl area can be identified.

5. The method according to claim 1, characterized in that a pass is not made for every data cell in the data crawl area.

6. The method according to claim 1, characterized in that the number of crawls of said data area is specified.

7. The method according to claim 1, characterized in that, during the pass for said data cell, a context is identified that also applies to cells adjacent to and/or located at a distance from said current data cell.

8. The method according to claim 1, characterized in that, during the crawl of the identified data area, context is also identified for cells in data areas located outside the identified data crawl area.

9. The method according to claim 1, characterized in that the steps are recorded in a log file.

10. The method according to claim 1, characterized in that the crawl direction for said data area is specified.

11. The method according to claim 10, characterized in that the crawl is performed by columns.

12. The method according to claim 10, characterized in that the crawl is performed by rows.

13. The method according to claim 10, characterized in that the crawl is performed both by columns and by rows.
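The crawl directions of claims 11 to 13 can be sketched as different cell visiting orders. This is an illustrative assumption only: the function name `crawl_order`, the zero-based coordinates, and the reading of "both" as a row-wise crawl followed by a column-wise crawl are not taken from the claims.

```python
def crawl_order(n_rows, n_cols, direction):
    """Yield (row, col) pairs for a crawl area of n_rows x n_cols cells."""
    if direction not in ("rows", "columns", "both"):
        raise ValueError(f"unknown direction: {direction}")
    if direction in ("rows", "both"):
        # Row-wise crawl: finish each row before moving to the next one.
        for row in range(n_rows):
            for col in range(n_cols):
                yield row, col
    if direction in ("columns", "both"):
        # Column-wise crawl: finish each column before moving to the next one.
        for col in range(n_cols):
            for row in range(n_rows):
                yield row, col


by_rows = list(crawl_order(2, 3, "rows"))
by_columns = list(crawl_order(2, 3, "columns"))
print(by_rows)
print(by_columns)
```

Because the context of a cell is carried over from the cell visited just before it, the chosen order determines whether a cell inherits its context from the row header to its left or from the column header above it.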

14. A device for converting data having a first tabular structure into data having a second tabular structure, comprising at least:
one or more processors;
input/output (I/O) modules;
I/O ports; and
a memory containing program code which, when executed, causes the processor and/or processors to perform the steps of:
1) identifying data having a first tabular structure;
2) identifying a crawl area of the data having the first tabular structure;
3) making a pass for each data cell in said data crawl area, during which the following steps are performed:
4) forming, in an internal database, a buffer for the identified context for each data cell;
5) identifying the context for the first data cell for which the pass is made, writing the identified context into the buffer before the pass for the next data cell begins, and, when the pass for the next data cell begins, transferring the context identified for the first data cell to the buffer of the next data cell;
6) identifying the context for said next data cell, and, where the context identified for this cell differs from said identified context written in said buffer of this data cell, replacing the context in the buffer of this data cell with the context identified for this data cell, after which the same actions are performed on it as on the identified context in step 5);
7) iteratively performing steps 5) and 6) until a pass has been made for each said data cell in said data crawl area;
8) serializing the resulting database into an internal XML data format;
9) applying a taxonomy template to the XML file obtained by the serialization, the taxonomy template being selected depending on the second tabular data structure;
10) converting the resulting XML file into data having a second tabular structure in accordance with the taxonomy template applied to it.

15. The device according to claim 14, characterized in that the data having a first tabular structure are, for example, MS Excel documents, tables and tabular data of MS Word, MS PowerPoint, ODF and PDF documents, and web forms.

16. The device according to claim 14, characterized in that the data having a second tabular structure are data in XBRL format.

17. The device according to claim 14, characterized in that, at the step of identifying a data crawl area, more than one data crawl area can be identified.

18. The device according to claim 14, characterized in that, at the pass step, a pass is not made for every data cell in said data crawl area.

19. The device according to claim 14, characterized in that it is configured to specify the number of crawls of said data area.

20. The device according to claim 14, characterized in that, during the pass for said data cell, a context is identified that also applies to cells adjacent to and/or located at a distance from said current data cell.

21. The device according to claim 14, characterized in that, during the crawl of the identified data area, context is also identified for cells in data areas located outside the identified data crawl area.

22. The device according to claim 14, characterized in that the steps are recorded in a log file.

23. The device according to claim 14, characterized in that it is configured to specify the crawl direction for said data area.

24. The device according to claim 23, characterized in that the crawl is performed by columns.

25. The device according to claim 23, characterized in that the crawl is performed by rows.

26. The device according to claim 23, characterized in that the crawl is performed both by columns and by rows.

27. A system for converting data having a first tabular structure into data having a second tabular structure, comprising at least:
a plurality of devices for converting data having a first tabular structure into data having a second tabular structure, each being a device according to any one of claims 14-26;
one or more servers that regulate data exchange in the system;
a database for storing data, configured to interact with said conversion devices and the one or more servers;
a network that enables interaction among said conversion devices, the one or more servers, and the database.

28. The system according to claim 27, characterized in that the data conversion is performed by the one or more servers, and said conversion devices are thin clients.

29. The system according to claim 27, characterized in that the data of the first tabular structure and of the second tabular structure are stored in said database and are requested by said conversion devices before the conversion starts.

30. The system according to claim 28, characterized in that the data of the first tabular structure and of the second tabular structure are stored in said database and are requested by said servers before the conversion starts.

31. The system according to claim 27, characterized in that said servers control data exchange with said conversion devices through a virtual machine.

32. The system according to claim 27, characterized in that the database is organized as one of: hierarchical, network, relational, object, object-oriented, object-relational, spatial.

33. The system according to claim 27, characterized in that said network is one of: a local area network (LAN), a wide area network (WAN), the Internet, an intranet, a virtual private network (VPN).

34. A computer-readable storage medium containing program code which, when executed, causes a processor and/or processors to perform the steps of the method according to any one of claims 1 to 13.