EA201991908A1 - Method and apparatus for the compact representation of bioinformatics data using multiple genomic descriptors
- Publication number
- EA201991908A1
- Authority
- EA
- Eurasian Patent Office
- Prior art keywords
- data
- compact representation
- multiple genomic
- descriptors
- reads
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/50—Compression of genetic data
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/58—Random or pseudo-random number generators
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
- G16B30/10—Sequence alignment; Homology search
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B45/00—ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/10—Ontologies; Annotations
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/30—Data warehousing; Computing architectures
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B99/00—Subject matter not provided for in other groups of this subclass
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/08—Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
- H04L9/0861—Generation of secret information including derivation or calculation of cryptographic keys or passwords
- H04L9/0866—Generation of secret information including derivation or calculation of cryptographic keys or passwords involving user or device identifiers, e.g. serial number, physical or biometrical information, DNA, hand-signature or measurable physical characteristics
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/30—Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy
- H04L9/3066—Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy involving algebraic varieties, e.g. elliptic or hyper-elliptic curves
- H04L9/3073—Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy involving algebraic varieties, e.g. elliptic or hyper-elliptic curves involving pairings, e.g. identity based encryption [IBE], bilinear mappings or bilinear pairings, e.g. Weil or Tate pairing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L2209/00—Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
- H04L2209/30—Compression, e.g. Merkle-Damgard construction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L2209/00—Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
- H04L2209/34—Encoding or coding, e.g. Huffman coding or error correction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L2209/00—Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
- H04L2209/88—Medical equipments
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biotechnology (AREA)
- Evolutionary Biology (AREA)
- General Health & Medical Sciences (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Biophysics (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Bioethics (AREA)
- General Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Analytical Chemistry (AREA)
- Data Mining & Analysis (AREA)
- Epidemiology (AREA)
- Public Health (AREA)
- Software Systems (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Computer Security & Cryptography (AREA)
- Mathematical Optimization (AREA)
- Mathematical Analysis (AREA)
- Computer Networks & Wireless Communication (AREA)
- Pure & Applied Mathematics (AREA)
- Signal Processing (AREA)
- Genetics & Genomics (AREA)
- Mathematical Physics (AREA)
- Computational Mathematics (AREA)
- Human Computer Interaction (AREA)
- Algebra (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
Abstract
A method and apparatus for compressing genomic sequence data produced by genome sequencing machines. Sequence reads are encoded by aligning them against pre-existing or constructed reference sequences. The encoding process consists of classifying the reads into data classes and then encoding each class as a plurality of descriptor blocks. Specific source models and entropy coders are used for each data class into which the data is partitioned and for each associated descriptor block.
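The classify-then-encode flow described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The class labels (`P`, `M`, `U`), the descriptor names (`pos`, `len`, `mm_count`, `mm_offset`, `mm_base`, `bases`), the toy `REFERENCE` string, and the helper functions are all hypothetical, and `zlib` (DEFLATE) is used only as a stand-in for a dedicated per-descriptor entropy coder. Reads are assumed to fit inside the reference and to differ from it by substitutions only.

```python
import zlib
from collections import defaultdict

# Toy reference sequence (assumption, for illustration only).
REFERENCE = "ACGTACGTTAGCCGATAGCTAGGCTA"


def classify(read, pos):
    """Assign a hypothetical data class to a read aligned at `pos`.

    "U" = unmapped, "P" = matches the reference exactly,
    "M" = same length as the reference window but with substitutions.
    """
    if pos is None:
        return "U"
    window = REFERENCE[pos:pos + len(read)]
    return "P" if read == window else "M"


def build_descriptor_streams(aligned_reads):
    """Split (read, position) pairs into per-class descriptor streams."""
    streams = defaultdict(lambda: defaultdict(list))
    last_pos = defaultdict(int)  # previous position seen in each class
    for read, pos in aligned_reads:
        cls = classify(read, pos)
        if cls == "U":
            # Unmapped reads cannot use the reference; keep the raw bases.
            streams[cls]["bases"].append(read)
            continue
        # Positions are stored as deltas from the previous read of the class.
        streams[cls]["pos"].append(pos - last_pos[cls])
        last_pos[cls] = pos
        streams[cls]["len"].append(len(read))
        if cls == "M":
            mismatches = [(i, base) for i, base in enumerate(read)
                          if base != REFERENCE[pos + i]]
            streams[cls]["mm_count"].append(len(mismatches))
            for offset, base in mismatches:
                streams[cls]["mm_offset"].append(offset)
                streams[cls]["mm_base"].append(base)
    return streams


def encode(aligned_reads):
    """Compress every descriptor stream of every class independently."""
    streams = build_descriptor_streams(aligned_reads)
    return {
        cls: {
            name: zlib.compress(",".join(map(str, values)).encode("ascii"))
            for name, values in descriptors.items()
        }
        for cls, descriptors in streams.items()
    }
```

Keeping each descriptor in its own stream means each compressor sees only one kind of symbol (position deltas, mismatch offsets, substituted bases, and so on). That is the intuition behind pairing each class and descriptor block with its own source model, though a real codec would use far more specialised models than this sketch.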
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2017/017842 WO2018071055A1 (en) | 2016-10-11 | 2017-02-14 | Method and apparatus for the compact representation of bioinformatics data |
| PCT/US2017/041591 WO2018071080A2 (en) | 2016-10-11 | 2017-07-11 | Method and systems for the representation and processing of bioinformatics data using reference sequences |
| PCT/US2018/018092 WO2018152143A1 (en) | 2017-02-14 | 2018-02-14 | Method and apparatus for the compact representation of bioinformatics data using multiple genomic descriptors |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| EA201991908A1 (ru) | 2020-01-21 |
Family
ID=68609803
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EA201991908A EA201991908A1 (ru) | 2017-02-14 | 2018-02-14 | Method and apparatus for the compact representation of bioinformatics data using multiple genomic descriptors |
Country Status (10)
| Country | Link |
|---|---|
| EP (1) | EP3583500A4 (ru) |
| KR (1) | KR102733786B1 (ru) |
| AU (1) | AU2018221458B2 (ru) |
| CA (1) | CA3052824A1 (ru) |
| EA (1) | EA201991908A1 (ru) |
| IL (1) | IL268651A (ru) |
| MX (1) | MX2019009680A (ru) |
| SG (1) | SG11201907418YA (ru) |
| WO (1) | WO2018152143A1 (ru) |
| ZA (1) | ZA201905921B (ru) |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110189830B (zh) * | 2019-05-24 | 2021-06-08 | 杭州火树科技有限公司 | Machine learning-based method for training an electronic medical record lexicon |
| EP3896698A1 (en) | 2020-04-15 | 2021-10-20 | Genomsys SA | Method and system for the efficient data compression in MPEG-G |
| KR102497634B1 (ko) * | 2020-12-21 | 2023-02-08 | 부산대학교 산학협력단 | Method and apparatus for FASTQ data compression through character frequency-based sequence reordering |
| CN116206687A (zh) * | 2022-12-30 | 2023-06-02 | 深圳百人科技有限公司 | A fuzzy-matching k-mer encoding scheme |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP1383911A4 (en) * | 2001-04-02 | 2004-12-15 | Cytoprint Inc | METHOD AND APPARATUS FOR DISCOVERING, IDENTIFYING AND COMPARING BIOLOGICAL ACTIVITY MECHANISMS |
| US7698067B2 (en) * | 2002-02-12 | 2010-04-13 | International Business Machines Corporation | Sequence pattern descriptors for transmembrane structural details |
| US7809765B2 (en) * | 2007-08-24 | 2010-10-05 | General Electric Company | Sequence identification and analysis |
| KR101922129B1 (ko) * | 2011-12-05 | 2018-11-26 | 삼성전자주식회사 | Method and apparatus for compressing and decompressing genetic information obtained using next-generation sequencing |
| US9679104B2 (en) * | 2013-01-17 | 2017-06-13 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
| CN103336916B (zh) * | 2013-07-05 | 2016-04-06 | 中国科学院数学与系统科学研究院 | Sequencing read mapping method and system |
| US10902937B2 (en) * | 2014-02-12 | 2021-01-26 | International Business Machines Corporation | Lossless compression of DNA sequences |
2018
- 2018-02-14: MX, application MX2019009680A, publication MX2019009680A (es), status unknown
- 2018-02-14: AU, application AU2018221458A, publication AU2018221458B2 (en), status Active
- 2018-02-14: EA, application EA201991908A, publication EA201991908A1 (ru), status unknown
- 2018-02-14: WO, application PCT/US2018/018092, publication WO2018152143A1 (en), status Ceased
- 2018-02-14: CA, application CA3052824A, publication CA3052824A1 (en), status Pending
- 2018-02-14: KR, application KR1020197026877A, publication KR102733786B1 (ko), status Active
- 2018-02-14: SG, application SG11201907418YA, publication SG11201907418YA (en), status unknown
- 2018-02-14: EP, application EP18753700.6A, publication EP3583500A4 (en), status Pending
2019
- 2019-08-12: IL, application IL26865119A, publication IL268651A (en), status unknown
- 2019-09-09: ZA, application ZA2019/05921A, publication ZA201905921B (en), status unknown
Also Published As
| Publication number | Publication date |
|---|---|
| KR102733786B1 (ko) | 2024-11-26 |
| AU2018221458A1 (en) | 2019-10-03 |
| EP3583500A1 (en) | 2019-12-25 |
| AU2018221458B2 (en) | 2022-12-08 |
| KR20190113971A (ko) | 2019-10-08 |
| EP3583500A4 (en) | 2020-12-16 |
| CA3052824A1 (en) | 2018-08-23 |
| SG11201907418YA (en) | 2019-09-27 |
| WO2018152143A1 (en) | 2018-08-23 |
| ZA201905921B (en) | 2021-05-26 |
| IL268651A (en) | 2019-10-31 |
| MX2019009680A (es) | 2019-10-09 |
| NZ757185A (en) | 2021-05-28 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| PH12019501879A1 (en) | Method and apparatus for the compact representation of bioinformatics data using multiple genomic descriptors | |
| EA201991908A1 (ru) | Method and apparatus for the compact representation of bioinformatics data using multiple genomic descriptors | |
| EP3944195A4 (en) | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device | |
| MX2017012060A (es) | Motion information derivation for sub-blocks in video coding | |
| MX2024009494A (es) | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device | |
| MY189223A (en) | Apparatus and method for encoding or decoding a multi-channel signal using a broadband alignment parameter and a plurality of narrowband alignment parameters | |
| ZA201906992B (en) | A communication method and apparatus | |
| MX2019004125A (es) | Efficient data structures for the representation of bioinformatics information | |
| EP3985613A4 (en) | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device | |
| EP4325727A3 (en) | Data processing method and device | |
| SA519401514B1 (ar) | Method and apparatus for the compact representation of bioinformatics data | |
| MX364028B (es) | Image processing apparatus and method | |
| MY190014A (en) | Data compression | |
| EA201991906A1 (ru) | Method and systems for the reconstruction of genomic reference sequences from compressed genomic sequence reads | |
| MX2019004131A (es) | Method and apparatus for access to bioinformatics data structured in access units | |
| SG11201906107QA (en) | Data processing method, and terminal device, and network device | |
| EA201991907A1 (ru) | Method and systems for the efficient compression of genomic sequence reads | |
| RU2014145618A (ru) | Image encoding device, image encoding method, and image encoding program, and image decoding device, image decoding method, and image decoding program | |
| TW201612895A (en) | Method and apparatus for coding or decoding subband configuration data for subband groups | |
| AR110436A1 (es) | Video encoding method, video decoding method, video encoding device, and video decoding device | |
| AR107411A1 (es) | Apparatus and method for encoding or decoding a multi-channel signal using spectral-domain resampling | |
| MY189399A (en) | Method and device for encoding video having block size set for each block shape, and method and device for decoding video | |
| TH1901007951A (th) | Polar encoding method and apparatus, wireless device, and computer-readable medium | |
| TH1701007730A (th) | Apparatus and methods for encoding or decoding a multi-channel audio signal using spectral-domain resampling | |
| PE20191228A1 (es) | Method and apparatus for the compact representation of bioinformatics data | |