Writing style recognition and sentence extraction
Halteren, 2002
- Document ID
- 18133780363665572753
- Author
- Halteren H
- Publication year
- 2002
Snippet
This paper examines whether feature sets which have been developed for authorship attribution can also be used for the sentence extraction task. Experiments show that the feature sets distinguish significantly better between extract and non-extract sentences than a …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
- G06F17/3069—Query execution using vector based model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/271—Syntactic parsing, e.g. based on context-free grammar [CFG], unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30716—Browsing or visualization
- G06F17/30719—Summarization for human users
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/2715—Statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30705—Clustering or classification
- G06F17/30707—Clustering or classification into predefined classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30705—Clustering or classification
- G06F17/3071—Clustering or classification including class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/274—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2795—Thesaurus; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
Similar Documents
| Publication | Title |
|---|---|
| Stamatatos et al. | Automatic text categorization in terms of genre and author |
| Stamatatos et al. | Computer-based authorship attribution without lexical measures |
| Zesch et al. | Wisdom of crowds versus wisdom of linguists–measuring the semantic relatedness of words |
| Clement et al. | Ngram and Bayesian Classification of Documents for Topic and Authorship |
| Corley et al. | Measuring the semantic similarity of texts |
| Islam et al. | Semantic similarity of short texts |
| Feng et al. | Characterizing stylistic elements in syntactic structure |
| El-Shishtawy et al. | Arabic keyphrase extraction using linguistic knowledge and machine learning techniques |
| Buscaldi et al. | LIPN-CORE: Semantic text similarity using n-grams, WordNet, syntactic analysis, ESA and information retrieval based features |
| Widyantoro et al. | Citation sentence identification and classification for related work summarization |
| Halteren | Writing style recognition and sentence extraction |
| Mekala et al. | A survey on authorship attribution approaches |
| Yapinus et al. | Automatic multi-document summarization for Indonesian documents using hybrid abstractive-extractive summarization technique |
| Gupta et al. | Automatic keywords extraction for Punjabi language |
| Conrado et al. | Exploration of a rich feature set for automatic term extraction |
| Awajan | Unsupervised approach for automatic keyword extraction from Arabic documents |
| Boukobza et al. | Multi-word expression identification using sentence surface features |
| Kiyomarsi et al. | Optimizing Persian text summarization based on fuzzy logic approach |
| Ljubešić et al. | Collocation ranking: frequency vs semantics |
| Islam et al. | Automatic authorship detection from Bengali text using stylometric approach |
| Manne et al. | An extensive empirical study of feature terms selection for text summarization and categorization |
| Pinzhakova et al. | Feature Similarity-based Regression Models for Authorship Verification |
| Biggins et al. | University_of_Sheffield: two approaches to semantic text similarity |
| van Halteren | New feature sets for summarization by sentence extraction |
| Bartelds et al. | Improving Cross-domain Authorship Attribution by Combining Lexical and Syntactic Features |