Yang et al., 2019 - Google Patents

Deepening hidden representations from pre-trained language models

Document ID: 8560994787966742267
Authors: Yang J; Zhao H
Publication year: 2019
Publication venue: arXiv preprint arXiv:1911.01940

Snippet

Transformer-based pre-trained language models have proven to be effective for learning contextualized language representation. However, current approaches only take advantage of the output of the encoder's final layer when fine-tuning the downstream tasks. We argue …

Classifications

- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
        - G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
          - G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
            - G06F17/30634—Querying
              - G06F17/30657—Query processing
                - G06F17/30675—Query execution
            - G06F17/30705—Clustering or classification
              - G06F17/3071—Clustering or classification including class or cluster creation or modification
        - G06F17/20—Handling natural language data
          - G06F17/27—Automatic analysis, e.g. parsing
    - G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
      - G06N3/00—Computer systems based on biological models
        - G06N3/02—Computer systems based on biological models using neural network models
          - G06N3/08—Learning methods
            - G06N3/082—Learning methods modifying the architecture, e.g. adding or deleting nodes or connections, pruning
          - G06N3/04—Architectures, e.g. interconnection topology
      - G06N99/00—Subject matter not provided for in other groups of this subclass
        - G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
      - G06N5/00—Computer systems utilising knowledge based models
        - G06N5/02—Knowledge representation
          - G06N5/022—Knowledge engineering, knowledge acquisition
        - G06N5/04—Inference methods or devices
      - G06N7/00—Computer systems based on specific mathematical models
        - G06N7/005—Probabilistic networks
    - G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
      - G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
        - G06K9/62—Methods or arrangements for recognition using electronic means
          - G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
          - G06K9/6267—Classification techniques
            - G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
              - G06K9/627—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches based on distances between the pattern to be recognised and training or reference patterns
          - G06K9/68—Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
            - G06K9/6807—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
              - G06K9/6842—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
        - G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
          - G06K9/46—Extraction of features or characteristics of the image

Similar Documents

Publication	Publication Date	Title
Alyafeai et al.	2020	A survey on transfer learning in natural language processing
Shahi et al.	2022	A hybrid feature extraction method for Nepali COVID‐19‐related tweets classification
Das et al.	2021	A heuristic-driven ensemble framework for COVID-19 fake news detection
Yin et al.	2016	Multichannel variable-size convolution for sentence classification
Vlad et al.	2019	Sentence-level propaganda detection in news articles with transfer learning and bert-bilstm-capsule model
Asr et al.	2016	Comparing Predictive and Co-occurrence Based Models of Lexical Semantics Trained on Child-directed Speech
Lai et al.	2020	Exploiting the matching information in the support set for few shot event classification
Cheng et al.	2021	Data-efficient language-supervised zero-shot learning with self-distillation
Rashid et al.	2020	Towards zero-shot knowledge distillation for natural language processing
Yang et al.	2019	Deepening hidden representations from pre-trained language models
Washio et al.	2018	Neural latent relational analysis to capture lexical semantic relations in a vector space
Learning	2020	Hybrid model for Twitter data sentiment analysis based on ensemble of dictionary based classifier and stacked machine learning classifiers-SVM, KNN and c50
Zarandi et al.	2024	A survey of aspect-based sentiment analysis classification with a focus on graph neural network methods
Kasri et al.	2022	Refining word embeddings with sentiment information for sentiment analysis
Gasmi et al.	2019	Cold-start cybersecurity ontology population using information extraction with LSTM
Zhang et al.	2024	SenticVec: Toward robust and human-centric neurosymbolic sentiment analysis
Tao et al.	2019	News text classification based on an improved convolutional neural network
Li et al.	2023	Make text unlearnable: Exploiting effective patterns to protect personal data
Abdalsalam et al.	2024	Terrorism Attack Classification Using Machine Learning: The Effectiveness of Using Textual Features Extracted from GTD Dataset
Xu et al.	2017	Convolutional neural network using a threshold predictor for multi-label speech act classification
Prabhakar et al.	2021	Performance analysis of hybrid deep learning models with attention mechanism positioning and focal loss for text classification
Mehta et al.	2019	Low rank factorization for compact multi-head self-attention
Chen et al.	2020	Fine-tuning language models for semi-supervised text mining
Yue et al.	2023	Sentiment analysis using a CNN-BiLSTM deep model based on attention classification
Al Azhar et al.	2021	Identifying author in bengali literature by bi-lstm with attention mechanism