WO2002056197A1 - System and method for handling electronic documents
- Publication number
- WO2002056197A1 (application PCT/NL2001/000013)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- text
- document
- predetermined
- variables
- sep
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/NL2001/000013 WO2002056197A1 (fr) | 2001-01-10 | 2001-01-10 | System and method for handling electronic documents |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/NL2001/000013 WO2002056197A1 (fr) | 2001-01-10 | 2001-01-10 | System and method for handling electronic documents |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2002056197A1 (fr) | 2002-07-18 |
Family
ID=19760733
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/NL2001/000013 WO2002056197A1 (fr) | 2001-01-10 | 2001-01-10 | System and method for handling electronic documents |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2002056197A1 (fr) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| GB2336698A (en) * | 1998-04-24 | 1999-10-27 | Dialog Corp Plc The | Automatic content categorisation of text data files using subdivision to reduce false classification |
| US5983170A (en) * | 1996-06-25 | 1999-11-09 | Continuum Software, Inc | System and method for generating semantic analysis of textual information |
| WO2000026795A1 (fr) * | 1998-10-30 | 2000-05-11 | Justsystem Pittsburgh Research Center, Inc. | Method for content-based filtering of messages by analyzing term characteristics within a message |
- 2001-01-10: WO application PCT/NL2001/000013 (WO2002056197A1), active, Application Filing
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2008028018A1 (fr) | 2006-08-30 | 2008-03-06 | Amazon Technologies, Inc. | Automated classification of document pages |
| JP2010503075A (ja) * | 2006-08-30 | 2010-01-28 | Amazon Technologies, Inc. | Automatic classification of document pages |
| US8306326B2 (en) | 2006-08-30 | 2012-11-06 | Amazon Technologies, Inc. | Method and system for automatically classifying page images |
| US9594833B2 (en) | 2006-08-30 | 2017-03-14 | Amazon Technologies, Inc. | Automatically classifying page images |
Similar Documents
| Publication | Title |
|---|---|
| Weiss et al. | Text mining: predictive methods for analyzing unstructured information |
| Weiss et al. | Fundamentals of predictive text mining |
| Chy et al. | Bangla news classification using naive Bayes classifier |
| Witten | Text Mining. |
| Wang et al. | A machine learning based approach for table detection on the web |
| Duwairi | Machine learning for Arabic text categorization |
| Khusro et al. | On methods and tools of table detection, extraction and annotation in PDF documents |
| US7469251B2 (en) | Extraction of information from documents |
| EP1736901B1 (fr) | Method for classifying sub-trees in semi-structured documents |
| CN114254653A (zh) | Method for semantic extraction and representation analysis of scientific and technological project texts |
| US20120109949A1 (en) | Two stage search |
| Hadni et al. | Word sense disambiguation for Arabic text categorization. |
| JP4911599B2 (ja) | Reputation information extraction device and reputation information extraction method |
| JP2008165598A (ja) | Reputation information extraction device and reputation information extraction method |
| WO2009154570A1 (fr) | System and method for aligning and indexing multilingual documents |
| Kanaris et al. | Learning to recognize webpage genres |
| US7877383B2 (en) | Ranking and accessing definitions of terms |
| Scharkow | Content analysis, automatic |
| EP1745396B1 (fr) | Tool for extracting information from documents |
| Hull | Information retrieval using statistical classification |
| Yurtsever et al. | Figure search by text in large scale digital document collections |
| CN112199960B (zh) | Standard knowledge element granularity analysis system |
| Bia et al. | The Miguel de Cervantes digital library: the Hispanic voice on the web |
| Pembe et al. | A tree-based learning approach for document structure analysis and its application to web search |
| Lama | Clustering system based on text mining using the K-means algorithm: news headlines clustering |
Legal Events
| Code | Title | Description |
|---|---|---|
| AK | Designated states | Kind code of ref document: A1. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
| AL | Designated countries for regional patents | Kind code of ref document: A1. Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
| 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
| REG | Reference to national code | Ref country code: DE. Ref legal event code: 8642 |
| 122 | EP: PCT application non-entry in European phase | |
| NENP | Non-entry into the national phase | Ref country code: JP |