Demos
The LIT team maintains a number of hosted demos, as well as pre-built launchers for some common tasks and model types.
For publicly-visible demos hosted on Google Cloud, see https://pair-code.github.io/lit/demos/.
Classification
Sentiment and NLI
Hosted instance: https://pair-code.github.io/lit/demos/glue.html 
Code: examples/glue/demo.py
- Multi-task demo:
  - Sentiment analysis as a binary classification task (SST-2) on single sentences.
  - Natural Language Inference (NLI) using MultiNLI, as a three-way classification task with two-segment input (premise, hypothesis).
  - STS-B textual similarity task (see Regression / Scoring below).
  - Switch tasks using the Settings (⚙️) menu.
- BERT models of different sizes, built on HuggingFace TF2 (Keras). 
- Supports the widest range of LIT interpretability features:
  - Model output probabilities, custom thresholds, and multiclass metrics.
  - Jitter plot of output scores, to find confident examples or ones near the margin.
  - Embedding projector to find clusters in representation space.
  - Integrated Gradients, LIME, and other salience methods.
  - Counterfactual generators, including HotFlip for targeted adversarial perturbations.
Tip: check out a case study for this demo on the public LIT website: https://pair-code.github.io/lit/tutorials/sentiment
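The custom-threshold behavior above can be illustrated with a small standalone sketch (plain Python, not LIT's API; function names and data here are illustrative assumptions):

```python
# Hypothetical sketch: applying a custom decision threshold to binary
# sentiment probabilities, as LIT's threshold slider does for SST-2.
def predict_with_threshold(pos_probs, threshold=0.5):
    """Map P(positive) scores to labels 1/0 under a chosen threshold."""
    return [1 if p >= threshold else 0 for p in pos_probs]

def accuracy(preds, labels):
    """Fraction of predictions matching the gold labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

probs = [0.92, 0.55, 0.48, 0.10]   # model P(positive) per example
labels = [1, 1, 0, 0]

# Default 0.5 threshold vs. a stricter one near the margin.
print(accuracy(predict_with_threshold(probs, 0.5), labels))   # → 1.0
print(accuracy(predict_with_threshold(probs, 0.6), labels))   # → 0.75
```

Sliding the threshold in the UI re-labels the examples near the margin, which is why the metrics table updates live.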
Regression / Scoring
Textual Similarity (STS-B)
Hosted instance: https://pair-code.github.io/lit/demos/glue.html?models=stsb&dataset=stsb_dev 
Code: examples/glue/demo.py
- STS-B textual similarity task, predicting scores on a range from 0 (unrelated) to 5 (very similar). 
- BERT models built on HuggingFace TF2 (Keras). 
- Supports a wide range of LIT interpretability features:
  - Model output scores and metrics.
  - Scatter plot of scores and error, and jitter plot of true labels for quick filtering.
  - Embedding projector to find clusters in representation space.
  - Integrated Gradients, LIME, and other salience methods.
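The score-vs-label comparison behind the scatter plot can be sketched in plain Python (not LIT's API; the data is illustrative). Pearson correlation is a standard STS-B metric:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

preds  = [4.8, 3.1, 0.4, 2.2]   # model similarity scores (0-5 scale)
labels = [5.0, 3.0, 0.0, 2.5]   # gold STS-B annotations

# Per-example error, the quantity plotted against the score in the UI.
errors = [p - y for p, y in zip(preds, labels)]
print(round(pearson(preds, labels), 3))
```

Filtering by error in the scatter plot surfaces the examples the model scores far from the gold annotation.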
Sequence-to-Sequence
Gemma
Code: examples/prompt_debugging/server.py
- Supports Gemma 2B and 7B models using KerasNLP (with TensorFlow or PyTorch) and Transformers (with PyTorch). 
- Interactively debug LLM prompts using sequence salience. 
- Multiple salience methods (grad-l2 and grad-dot-input), at multiple granularities: token-, word-, line-, sentence-, and paragraph-level. 
Tip: check out the in-depth walkthrough at https://ai.google.dev/responsible/model_behavior, part of the Responsible Generative AI Toolkit.
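The multi-granularity roll-up can be sketched standalone (plain Python, not LIT's API). This assumes HuggingFace-style "##" subword markers and grad-l2-style magnitudes, which sum within each unit; both are illustrative assumptions:

```python
# Hypothetical sketch: aggregating per-token salience scores to word level,
# the kind of roll-up sequence salience uses for coarser views.
def aggregate_salience(tokens, scores):
    """Merge '##'-prefixed subword tokens into words, summing salience."""
    words, word_scores = [], []
    for tok, s in zip(tokens, scores):
        if tok.startswith("##") and words:
            words[-1] += tok[2:]          # continue the previous word
            word_scores[-1] += s          # accumulate its salience
        else:
            words.append(tok)
            word_scores.append(s)
    return words, word_scores

tokens = ["The", "un", "##bel", "##ievable", "plot"]
scores = [0.1, 0.4, 0.3, 0.2, 0.5]
print(aggregate_salience(tokens, scores))
```

Sentence- and paragraph-level views apply the same accumulation over larger spans, which makes long-prompt debugging tractable.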
Multimodal
Tabular Data: Penguin Classification
Hosted instance: https://pair-code.github.io/lit/demos/penguins.html 
Code: examples/penguin/demo.py
- Binary classification on the penguin dataset. 
- Shows how LIT can be used on non-text data (numeric and categorical features). 
- Use partial-dependence plots to understand feature importance on individual examples, selections, or the entire evaluation dataset. 
- Use binary classifier threshold setters to find the best thresholds for slices of examples to achieve specific fairness constraints, such as demographic parity.
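The per-slice threshold search can be sketched standalone (plain Python, not LIT's API; the group names and scores are illustrative assumptions):

```python
# Hypothetical sketch: searching a per-group decision threshold so that
# positive-prediction rates match across two slices (demographic parity).
def positive_rate(probs, threshold):
    """Fraction of examples predicted positive at the given threshold."""
    return sum(p >= threshold for p in probs) / len(probs)

def parity_gap(probs_a, probs_b, t_a, t_b):
    """Absolute difference in positive-prediction rate between slices."""
    return abs(positive_rate(probs_a, t_a) - positive_rate(probs_b, t_b))

group_a = [0.9, 0.7, 0.4, 0.2]   # model scores for slice A
group_b = [0.8, 0.5, 0.3, 0.1]   # model scores for slice B

# Grid-search slice B's threshold while holding slice A's at 0.5.
candidates = [i / 100 for i in range(1, 100)]
best = min(candidates, key=lambda t: parity_gap(group_a, group_b, 0.5, t))
print(best, parity_gap(group_a, group_b, 0.5, best))
```

LIT's threshold setters do this interactively: adjusting one slice's slider shows how the fairness metrics move relative to the others.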